Rash thoughts about .NET, C#, F# and Dynamics NAV.


"Every solution will only lead to new problems."

Monday, 21. March 2016


Property based testing in the real world – or how I made my package manager suck less

Filed under: C#,F#,Informatik — Steffen Forkmann at 16:57 Uhr

A couple of days ago I tried to fix a bug in the .NET/mono dependency manager “Paket“. The bug was a really strange edge case in Paket’s resolver algorithm that resulted in a false positive conflict (an overview about the algorithm can be found here).

In other words: Paket reported a version conflict where it should have reported a valid package resolution.

After some investigation I found that the bug was in some optimization code that tries to cut parts of the search tree. When I disabled that part, the resolver found a valid resolution, but was significantly slower. So after some further testing I came up with a fix that was fast and solved the issue.

But why didn’t we find that edge case before? And most importantly: do we have other bugs in the resolver?

Isaac Abraham suggested to investigate the resolver optimizations further and proposed to use property based testing for this (see Scott Wlaschin’s blog for amazing introduction material).

Problem formulation

In this post I want to evaluate if Paket’s package resolver works correct. Let’s start by clarifying some terms.

  • A “package” consists of a name, a version and a (possibly empty) list of dependencies. In this post we are only talking about meta data. The package contents are irrelevant.
  • A “dependency” is always referencing another package name and defines a version requirement (e.g. >= 2.0 or < 4.1).
  • The “package graph” is a list of packages and acts as a stub for a possible configuration of a package feed like nuget.org. It’s basically a collection of the meta data of all published packages.
  • A “resolver puzzle” consists of a package graph and a list of dependencies. In other words it defines a possible configuration of the package feed and a specific list of package requirements (basically what you would define in paket.dependencies).
  • A “resolution” is a list of packages that “solves” the resolver puzzle. This means all package requirements are satisfied and we don’t have more than one package for a every package name.

Our goal is to show that Paket’s resolver is returning a version conflict if and only if we can not find a package resolution via a brute-force algorithm. We do not compare resolutions, we just want to know if the optimization is missing some valid resolutions.

Generating test data

Property based testing is a stochastical test approach. The test framework (in this case we use FsCheck) generates lots of test cases with random data and tries to automatically falsify a given assumption about the algorithm. 

Usually FsCheck will just generate random, uniformly distributed data for whatever data type it sees. This is a very good default, but in our case this would result in 99% package graphs that have no resolution.

In order to create more sensible data and increasing the likelihood of finding real bugs we need to write our own generator functions.

Let’s consider a very small example and think about package names. If we would use the standard FsCheck data generator then it would generate completely random names including empty string, null and names that contain very special characters. While this is great for testing the validation part of Paket, we don’t really need this here. We just generate some valid package names:

As you can see we just generate non-negative integers and create the package name by prefix a “P”. Similarly we can create random but valid version numbers like that:

Here we just map a triplet of non-negative integers to a version of form major.minor.patch.

Now it’s easy to generate a list of packages with corresponding versions. Actually FsCheck can automatically generate a list of PackageName and Version pairs, but since it’s completely random it would generate a list where most packages have only one or two versions. The following custom generator creates a list of different versions for every package and flattens the result. This has a much bigger chance of creating interesting cases:

With a small helper function that creates random version requirements we can create a full package graph:

The last step is to create some package requirements to that graph that we want to resolve. This is basically what you would write into your paket.dependencies file.

Testing against brute-force algorithm

The following defines a FsCheck property that calls Paket’s resolver with a random puzzle. If the resolver finds a resolution, then we verify that this resolution is correct. If it can’t find a resolution we use a brute-force algorithm to verify that the puzzle indeed has no valid resolution:

This is testing the real resolver against a very naïve brute-force implementation. The brute-force version was relatively easy to write and has no smart optimizations:

And this point I reverted the fix in the optimization code and started the test to see what happens. FsCheck needed about 3s to come up with a random example where a valid resolution exists but Paket’s resolver didn’t find it. That was an amazing moment. Unfortunately the example was HUGE and very complicated to understand.

Shrinking

Most property based testing frameworks have a second important feature called “shrinkers”. While generators create random test data, shrinkers are used to simplify counter-examples and have type ‘a -> seq<‘a>. Given a value (our counter-example), it produces a sequence of new examples that are “smaller” than the given value. These new values are tested if they still falsify the given property. If they do a new shrinking process will be started that tries to reduce the new example even further.

There are two easy ways to create a new “smaller” graph. The first optios is to select a package from the graph and remove one of it’s dependencies and second option is to remove a package completely. The following code shows this:

In a similar way we can shrink a puzzle:

After running the test again FsCheck reported:

The ouput shows that FsCheck generated a random puzzle with 104 packages and reduced it to a sample with only 3 packages and 2 package requirements. This made it pretty easy to analyze what was going on in the optimization part of the resolver. The sample was even smaller than the one that was reported from the wild.

With the fix reapplied I rerun the tests and FsCheck reported another error:

So now FsCheck discovered a new, previously unknown bug in Paket’s resolver. The sample shows 2 packages and one of the packages is requiring a non-existing package.
This is trivial case and the brute-force algorihm is instantly finding a solution. So what is wrong in Paket’s resolver?
Again the answer is hidden in some performance optimization and the fix can be found here.

But this was not the end – the property based test spotted yet another counter-example so I fixed that as well.

Conclusion

It took me a while to build up property based testing for this complex scenario. Most of the time was needed to build custom generators and shrinkers, but I was able to reproduce a bug from the wild and also found two new bugs. I think this is a big success.

Monday, 21. December 2015


FsAdvent: Automatic re-build and background tasks for suave.io websites

Filed under: .NET,F#,FAKE - F# Make,Visual Studio — Steffen Forkmann at 14:12 Uhr

[This post my second entry of the F# advent calendar 2015 series. You can also read the first post about “Using Async.Choice in Paket“]

Recently I was asked to help with a website that was based on suave.io. The task was to transform the existing F# script based suave.io website to something that can be run in the enterprise infrastructure and also performs some background tasks like creating Excel reports recurringly.

Suave as a (windows) service

A lot of the suave.io samples work with simple .fsx script files. This approach is indeed simple and allows fast prototyping with the use of the REPL. Even more complex websites like fssnip.net are using this approach with great success (code here). One of the benefits of script based suave solutions is that they are often equipped with a FAKE build script that rebuilds the website automatically if someone changes a file. One prominent example is FsReveal:

But sometimes you want to have a “real Visual Studio project” with debugging support and your enterprise might want to install your website as a Windows service. So the first step was to convert the F# script based approach into a project and adding TopShelf layer to make it runnable as Windows service. Thanks to a short blog post by Tomas Jansson this was super easy. It’s basically just installing TopShelf.FSharp via Paket and adding a single file with the config. Look into Tomas’s blog post to see the details.

After moving to the TopShelf model we still wanted to have automatic website rebuild whenever someone touched the source code. This little FAKE script makes this possible:

Running background jobs

The website already allowed to generate Excel reports by clicking somewhere in the UI. In addition to that we wanted to create the Excel reports recurringly in background jobs. Scott Hanselman has a blog post about “How to run Background Tasks in ASP.NET” and shows a couple of solutions that while optimized for ASP.NET would probably work with suave.io as well. But in this case we wanted to have something that syncronizes with the IO of the website so we decided to use F#’s MailboxProcessor feature a.k.a. “agents”. (As always: Scott Wlaschin has a nice introduction to agents.)

As a small intro into background jobs we start with F# counter agent sample by Tomas Petricek:

This little snippet shows an agent that calculates averages of the messages that it receives. Now let’s add a background job that recurringly sends messages to the counter agent:

In order to use this we need a bit of ugly infrastructure code that we will hide in a library:

So let’s try this out in the F# interactive:

Since this works as expected we can use it in our website:

Since all the user triggered reporting will also go through this taskAgent we ensure that IO is synchronized.

Sample project

In https://github.com/forki/backgroundjobs you can find a sample project which does exactly the two points from above.

As you can see the website performs background jobs and is rebuilding/restarting automatically when a source code file is saved.

Tags: , ,

Monday, 30. November 2015


F# advent calendar: Using Async.Choice in Paket

Filed under: .NET,C#,Diverses,F#,PLINQ — Steffen Forkmann at 0:50 Uhr

[This post is part of the F# advent calendar 2015 series.]

Prologue: What is Paket?

Paket is a dependency manager for .NET with support for NuGet packages and GitHub repositories. It enables precise and predictable control over what packages the projects within your application reference. If you want to learn how to use Paket then read the “Getting started” tutorial and take a look at the FAQs.

Paket file structure

Async computations in F#

Asynchronous Workflows are a very old F# feature (they already shipped in 2007) and they are used in many places in Paket. In this article I want to highlight one of the nice applications that F#’s async model allows you.

Many readers are probably familiar with C#’s async and await keywords (IIRC released with C# in 2012), but F#’s async feature works a little different. You can read more about some of “Asynchronous gotchas in C#” in Tomas Petricek’s excellent blog post. Most of these gotchas come from the fact that C# is starting all async tasks automatically while F# wraps the logic in data and allows you to run it explicitly. For a really good introduction to asynchronous programming with F# I can recommend Scott Wlaschin’s blog post.

The issue: retrieving version numbers from NuGet feeds

Paket’s package resolution algorithm (if you are interested in algorithms then read more) needs to know which versions are avalaible for a given package. Usually users specify more than one NuGet feed and different NuGet feeds support different protocols. The following code shows how older Paket versions retrieved all version numbers for a given package across all configured NuGet feeds:

As you can see getAllVersions tries 4 different NuGet protocols per source and returns the first result that is not None. Every getVersionsViaProtocol call performs async web request. In GetAllVersions we run this function for every NuGet source in parallel and combine the results. This code is very similar to the “async web downloader” sample in Scott Wlaschin’s async blog post and basically the “hello world” of asynchronous programming. 

Code cleanup

The getAllVersions is deeply nested and it’s hard to understand what’s going on. With the help of List.tryPick we can rewrite the code as:

List.tryPick returns the first result that is not None. So instead of nesting multiple match and let! expressions we use a higher-order-function to encapsulate the same pattern.

Introducing Async.Choice

After using this code for a while in Paket we noticed that different NuGet server implementations each implement a different subset of the 4 protocols and differ very much in the response times. So there was no order of the protocols that would work good for all server implementations. But what if we could query all 4 protocols in parallel and just take the fastest response?

The change is relatively easy and we can rewrite the GetAllVersions as:Instead of running the web requests synchronously and in order we run the four web request per NuGet feed in parallel and take the first response that is not None.

Implementation of Async.Choice

Unfortunately Async.Choice is not part of the standard FSharp.Core library and it turns out it is not easy to implement. There are many different implementations floating around on the internet. The one we use in Paket is by Eirik Tsarpalis and taken from fssnip. One of the advantages of this version is that the automatic cancellation of tasks works nicely. Since we are running lots and lots of web requests in parallel and always take the first response we want to cancel all the other pending web requests. Otherwise we would basically DDOS the feeds with useless requests (and yes that happened to us ;-)). Since Async.Choice is very useful we sent a pull request to the Visual F# project and hope that one day it will be in the box. 

Epilogue

Treating computations as data has even more advantages. The m-brace project created a new version of computation workflows called “cloud”:

cloud is a computation workflow builder and allows you to run your computations in the cloud. In contrast to async we don’t only control when something gets executed but also where. They even have a Cloud.Choice, which allows you to run tasks asynchronous in the cloud and take the first succeeding result.

Btw: async and cloud aren’t language keywords like C#’s async and await, but implemented as normal F# code in libraries. If you want to define your own computation expression builders, then the MSDN docs are a good starting point.

Tags: , , ,

Wednesday, 14. October 2015


Crazy stuff people do with FAKE and Paket – 1 of n

Filed under: .NET,C#,F#,FAKE - F# Make,Visual Studio — Steffen Forkmann at 11:33 Uhr

As many readers already know I’m maintaing the open source projects FAKE and Paket. These projects are used in many companies and open source projects to make continuous integration work on .NET and mono.

In this article series I want to highlight some of the more unsual use cases. Today I want to start and highlight the first 6 amazing projects.

FsReveal

FsReveal allows you to write beautiful slides in Markdown and brings C# and F# to the reveal.js web presentation framework.” [project site]

FsReveal uses Paket’s GitHub file dependencies feature to download reveal.js:

FsReveal GitHub references

It also uses a FAKE build script which converts Markdown files to html slides (via FSharp.Formatting) and runs these slides in a suave.io web server. FAKE’s file system watcher + suave’s socket implementation even allows you to edit your slides with automatic preview:

FSReveal

Stanford.NLP.NET

Stanford.NLP for .NET is a port of Stanford NLP distributions to .NET.

This project contains build scripts that recompile Stanford NLP .jar packages to .NET assemblies using IKVM.NET, tests that help to be sure that recompiled packages are workable and Stanford.NLP for .NET documentation site that hosts samples for all packages. All recompiled packages are available on NuGet.” [project site]

This project downloads .jar packages via Paket’s HTTP dependencies feature, recompiles everything to .NET via IKVM.NET in a FAKE build script and republishes it on NuGet. Automatic Java to .NET compilation – how cool is that?

Stanford NLP HTTP reference

Paket.VisualStudio

“Manage your Paket dependencies from Visual Studio!” [project site]

Paket.VisualStudio is a VisualStudio addin that you can download from Microsoft’s VisualStudio gallery. Unfortunately even in 2015 Microsoft doesn’t provide an API for automating the upload to its gallery. Therefore Paket.VisualStudio uses Paket’s NuGet dependencies feature to download canopy and Selenium:

Paket.VS

It then uses canopy + Selenium in a FAKE script to upload the addin via browser automation:

VS Gallery

Ionide-F#

“It’s part of Ionide plugin suite. F# IDE-like possibilities in Atom editor” [project site]

Ionide is an Atom plugin written in F#. It’s using Paket and FAKE to automate the build and release process.

First it uses FunScript to transpile F# to Javascript and then automates npm to resolve Javascript dependencies and apm to upload the plugin to the Atom gallery.

Akka.Net

Akka.NET is a community-driven port of the popular Java/Scala framework Akka to .NET.” [project site]

Akka.Net is a big (mainly C#) open source project that uses FAKE and Paket. One interesting observation is that it needs to create different xUnit addins and therefore uses Paket’s groups feature to maintain the xUnit versions.

Akka.Net

FSharp.Data

“The F# Data library implements everything you need to access data in your F# applications and scripts. It implements F# type providers for working with structured file formats (CSV, HTML, JSON and XML) and for accessing the WorldBank data. It also includes helpers for parsing CSV, HTML and JSON files and for sending HTTP requests.” [project site]

One of the many interesting things in FSharp.Data’s build is that it uses Paket to retrieve the FSharp.TypeProviders.StarterPack. These files need to be included in any F# type provider project and Paket allows you to manage this easily.

FSharp.Data

Tags: ,

Monday, 12. October 2015


Developer Open Space 2015

Filed under: Diverses — Steffen Forkmann at 13:08 Uhr

Vor acht Jahren gab es die erste Ausgabe der (Un-)Konferenz mit dem Namen Developer Open Space und sie findet dieses Jahr vom 16.–18. Oktober 2015 in Leipzig statt. https://frmedicamentsenligne.com
Für mich nach wie vor die beste technische Konferenz in Deutschland.Die Fotos von der Konferenz sprechen für sich.AnmeldungDie Anmeldung ist seit einiger Zeit möglich.Die Plätze sind begrenzt.Nimm teil!

Tuesday, 13. January 2015


SlideShow: Paket Introduction

Filed under: Diverses — Steffen Forkmann at 11:17 Uhr

I created a short presentation for our new .NET dependency manager Paket.
It’s created with FsReveal and MIT licensed. Please feel free to use/modify it for your own presentations.

http://forki.github.io/PaketIntro

Wednesday, 12. November 2014


Introducing FsReveal – Beautiful, version controlled presentations in the browser

Filed under: Diverses — Steffen Forkmann at 13:14 Uhr

This FsReveal tool stack allows you to write beautiful slides in Markdown and brings F# to the wonderful reveal.js web presentation framework.

You write your slides in Markdown like the following and the rest is magic*:

will turn into:

FsReveal slide 1

FsReveal slide 2

FsReveal slide 3

Features
  • Syntax highlighting for most programming languages including C#, F# and LaTeX
  • Speaker notes – shows the current slide, next slide, elapsed time and current time
  • Built in themes
  • Built in slide transitions using CSS 3D transforms
  • Slide overview
  • Works on Windows, Linux and Mac
  • Slides work on mobile browsers. Swipe your way through the presentation.
Getting Started

Just follow the getting started guide at http://fsprojects.github.io/FsReveal/.

Examples

Check out what others have created:

The magic
  • reveal.js – Framework for easily creating beautiful presentations using HTML and JavaScript.
  • Paket – Dependency manager which retrieves the referenced NuGet packages and reveal.js.
  • FAKE – Build tool which allows to orchestrate the slide building process.
  • FSharp.Formatting – Used to render markdown slides and for F# syntax highlighting.
  • Suave.io – Simple web development framework/ webserver which allows to host the slides.
  • git – Allows you to bring your slides under version control.

All of this is bundled in FsReveal, which was originally created by the amazing @kimsk (Karlkim Suwanmongkol) and is now part of fsprojects.

Tuesday, 14. October 2014


Retrieving GitHub download counts

Filed under: Diverses — Steffen Forkmann at 13:41 Uhr

A couple of weeks ago we started the Paket project which is released on GitHub. Unfortunately GitHub doesn’t provide any download stats in the UI.

A quick look at the REST API docs shows that we can retrieve the numbers via JSON.

So let’s open Visual Studio reference FSharp.Data and FSharp.Charting and hack a bit with the JSON type provider:

and voilà:

 

Paket download stats Paket download stats

Saturday, 5. July 2014


Microsoft, Open Source development and Codeplex

Filed under: .NET,C#,Diverses,F# — Steffen Forkmann at 12:37 Uhr

Recently Microsoft released all their major programming languages as open source and even started to accept pull requests. I didn’t think this would ever be possible at Microsoft, but they opened up and people like me started to contribute:

Personally I sent 29 (mostly small) pull requests to the Visual F# project. 5 pull requests already got accepted and 15 are still under evaluation.  In principle the development process seems to be working very well. Especially in the Visual F# project Don Syme and the Visual F# team are doing an excellent job to encourage the community. They are marking user voice issues as “approved in principle” and even provide detailed documents for implementation tasks (See CoreLibraryFunctions). This makes it very easy to get started and to hack a bit on the F# compiler. A big thanks to you guys.

A very important part of the open source is the review process. The F# community is awesome in this regards. On issues like a new “compareWith” function I got comments with remarks about coding style, test cases, documentation and lots of new ideas about possible performance improvements. It’s really exciting to be part of such an active and welcoming community. There is only one “but” and this “but” is the choice of the development platform. I really think CodePlex is hindering these projects to become even more successful. In this post I want to show some of my experiences with Codeplex. Remember I already sent 29 pull requests so I think it’s fair to say I tried!

Overall usability

The first impression on the Codeplex site is that every click feels so frustrating slow. Waiting 4s and more for a site to load doesn’t exactly feel like 2014.

If you want to comment on something then there is an “interesting” distinction between issues and pull requests. On issues you get a preview box and some buttons to make formatting easier, but you can’t edit your comments later:

Issue comment editor

On pull request you can edit your comments later but your don’t have the formatting buttons:

Pull request comment editor

 

E-Mail notifications

Like most online portals you can enable E-Mail notifications on Codeplex. Unfortunately it doesn’t really work. I tried to set all notification sliders I found to the max. and still don’t get notifcations if someone sends a new pull request to the Visual F# projects. Instead I’m getting annoying E-Mail notifications on my own activities:

E-Mail notifications on own comments

 

Things like this make you wonder if they actually dogfood their own stuff.

Code reviews on pull requests

As described above good code reviews are a very important part of any software development project, for open source programming language projects even more so. Unfortunately Codeplex has a really bad code review tool. I mean it is possible to comment on diffs:

Comments on diff

But if you update the pull request with a fix then the comment is displayed on the new fixed code and makes no sense any more:

Wrong code comments on diff

The whole code review is broken when you add commits to a pull request and it get’s even messier when you rebase your pull request on the current master. Rebasing pull requests is a very common operation in open source projects since it allows to move the merge effort from the project maintainer to the contributor. But unfortunately Codeplex gets confused by the rebase and now shows a wrong diff:

Wrong diff on rebased pull requests

A code platform shows a wrong diff – yep I couldn’t believe it myself so I tried again. #7040, #7045, #7060, always the same.

Finding the needle in the hashstack

A couple of days ago I wanted to tweet a link to a cool performance trick and knew it was the last commit on a pull request. Now try to find the url:

Alphabetical order by hash

Yes that’s right – the commits are ordered in alphabetical order BY HASH or as I call it in “least useful order”.

Reporting Codeplex issues

Of course you may ask: “Steffen why didn’t you report these issues to the Codeplex team?” and that’s a valid question. Actually I went to codeplex.codeplex.com, which I believe is the home of the Codeplex project, and looked at their issue list. This is what I got:

Codeplex has quite some issues

 

Not a single issue on the first page is related to Codeplex and it seems they don’t even care to close this spam. So why should I care to log issues there?

What now?

It’s more than obvious – use the “move source code to github” strategy and people already created an issue on the TypeScript project. Unfortunately it got closed:

Business needs

After reading this I really doubt it was the decision of the teams to use Codeplex. Fortunately the ASP.NET team seems to be an exception to this. Somehow they managed to move to github.com/aspnet.

There are also other ways. Microsoft could really invest in Codeplex and make it a usable platform. But I don’t see this happing, because it will cost A LOT of money. Even if they would open source Codeplex I don’t see a community which is willing to improve this site.

So I appeal to the people in charge at Microsoft please answer the following questions:

  • What are the reasons for putting the Visual F# project on Codeplex, especially when the majority of the existing F# projects and community already operate on Github?
  • Do you think it’s more important to support Codeplex or to grow a community around the programming language projects?
  • If the F# community voted for the project to be moved, would you consider moving it?
  • If you insist on Codeplex how and when do you plan to fix these usability issues?

So please let your OSS teams and their community pick the open source platform they want!

Monday, 16. June 2014


FAKE 2.18 released – RoundhousE kick edition

Filed under: Diverses — Steffen Forkmann at 14:08 Uhr

With the FAKE 2.18.1 we now have 80 contributors.A big thanks to all of you.Here are the most important changes since my last blog post about FAKE 2.14.

What’s new?

More details about minor bug fixes can be found in the release notes.

Getting started

If you new to FAKE you should read the getting started guide or clone the F# ProjectScaffold.

Go and grab the bits

Feel free to contact me if you need help for the upgrade.