Rash thoughts about .NET, C#, F# and Dynamics NAV.


"Every solution will only lead to new problems."

Monday, 30. November 2015


F# advent calendar: Using Async.Choice in Paket

Filed under: .NET, C#, Diverses, F#, PLINQ. Posted by Steffen Forkmann at 0:50.

[This post is part of the F# advent calendar 2015 series.]

Prologue: What is Paket?

Paket is a dependency manager for .NET with support for NuGet packages and GitHub repositories. It enables precise and predictable control over what packages the projects within your application reference. If you want to learn how to use Paket then read the “Getting started” tutorial and take a look at the FAQs.

[Figure: Paket file structure]

Async computations in F#

Asynchronous workflows are a very old F# feature (they already shipped in 2007) and they are used in many places in Paket. In this article I want to highlight one of the nice applications that F#’s async model makes possible.

Many readers are probably familiar with C#’s async and await keywords (IIRC released with C# 5.0 in 2012), but F#’s async feature works a little differently. You can read more about some of the “Asynchronous gotchas in C#” in Tomas Petricek’s excellent blog post. Most of these gotchas come from the fact that C# starts async tasks automatically, while F# wraps the logic in data and lets you run it explicitly. For a really good introduction to asynchronous programming with F# I can recommend Scott Wlaschin’s blog post.
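To make the "logic as data" point concrete, here is a minimal sketch (the names `events` and `work` are made up for illustration). Defining the workflow records nothing; its body only runs once it is started explicitly.

```fsharp
// Records the order in which things happen.
let events = System.Collections.Generic.List<string>()

// Building the workflow runs nothing: `work` is only a description.
let work : Async<int> =
    async {
        events.Add "workflow body ran"
        return 42
    }

events.Add "workflow defined"

// The body executes only when we start the workflow explicitly.
let result = Async.RunSynchronously work

printfn "%d %A" result (List.ofSeq events)
```

Running `work` a second time would execute the body again, because the value describes the computation rather than caching its result.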

The issue: retrieving version numbers from NuGet feeds

Paket’s package resolution algorithm (if you are interested in algorithms then read more) needs to know which versions are available for a given package. Usually users specify more than one NuGet feed, and different NuGet feeds support different protocols. The following code shows how older Paket versions retrieved all version numbers for a given package across all configured NuGet feeds:

As you can see, getAllVersions tries 4 different NuGet protocols per source and returns the first result that is not None. Every getVersionsViaProtocol call performs an async web request. In GetAllVersions we run this function for every NuGet source in parallel and combine the results. This code is very similar to the “async web downloader” sample in Scott Wlaschin’s async blog post and basically the “hello world” of asynchronous programming.
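The original listing is not reproduced here, so the following is only a rough sketch of the pattern described, not Paket's actual code. The feed names, the protocol names and the simulated `getVersionsViaProtocol` are assumptions, and only three protocols are tried instead of four to keep it short.

```fsharp
/// Hypothetical stand-in for a real web request: a feed answers only on the
/// protocol it supports; every other protocol yields None.
let getVersionsViaProtocol (protocol: string) (source: string) : Async<string list option> =
    async {
        do! Async.Sleep 10 // simulated network latency
        match source, protocol with
        | "feedA", "v3" -> return Some [ "1.0.0"; "1.1.0" ]
        | "feedB", "odata" -> return Some [ "1.1.0"; "2.0.0" ]
        | _ -> return None
    }

/// Tries the protocols one after another; every fallback adds a nesting level.
let getVersions (source: string) : Async<string list> =
    async {
        let! v3 = getVersionsViaProtocol "v3" source
        match v3 with
        | Some versions -> return versions
        | None ->
            let! v2 = getVersionsViaProtocol "v2" source
            match v2 with
            | Some versions -> return versions
            | None ->
                let! odata = getVersionsViaProtocol "odata" source
                match odata with
                | Some versions -> return versions
                | None -> return []
    }

/// Queries all sources in parallel and merges the results.
let getAllVersions (sources: string list) : string list =
    sources
    |> List.map getVersions
    |> Async.Parallel
    |> Async.RunSynchronously
    |> Seq.concat
    |> Seq.distinct
    |> Seq.sort
    |> List.ofSeq

let versions = getAllVersions [ "feedA"; "feedB" ]
printfn "%A" versions
```

The sources run in parallel, but within one source the protocols are still tried strictly in sequence.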

Code cleanup

The getAllVersions function is deeply nested and it’s hard to understand what’s going on. With the help of List.tryPick we can rewrite the code as:

List.tryPick applies a function to each element in turn and returns the first result that is not None. So instead of nesting multiple match and let! expressions we use a higher-order function to encapsulate the same pattern.
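As a sketch of this cleanup (again with a simulated, hypothetical `getVersionsViaProtocol` and made-up feed and protocol names), the nested fallbacks collapse into a single `List.tryPick` over the protocol list. The `tried` list is only there to show that protocols after the first hit are never queried.

```fsharp
// Records which protocols were actually queried.
let tried = ResizeArray<string>()

/// Hypothetical stand-in for a real web request.
let getVersionsViaProtocol (protocol: string) (source: string) : Async<string list option> =
    async {
        tried.Add protocol
        do! Async.Sleep 10 // simulated network latency
        match source, protocol with
        | "feedA", "v3" -> return Some [ "1.0.0"; "1.1.0" ]
        | "feedB", "odata" -> return Some [ "1.1.0"; "2.0.0" ]
        | _ -> return None
    }

let protocols = [ "v3"; "v2"; "odata" ]

/// Runs the requests synchronously and in order, stopping at the first Some.
let getVersions (source: string) : string list =
    protocols
    |> List.tryPick (fun protocol ->
        getVersionsViaProtocol protocol source |> Async.RunSynchronously)
    |> Option.defaultValue []

let versionsA = getVersions "feedA"
printfn "%A after trying %A" versionsA (List.ofSeq tried)
```

Adding a protocol now means adding one list element instead of one more nesting level.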

Introducing Async.Choice

After using this code for a while in Paket we noticed that different NuGet server implementations each implement a different subset of the 4 protocols and differ widely in their response times. So there was no order of the protocols that would work well for all server implementations. But what if we could query all 4 protocols in parallel and just take the fastest response?

The change is relatively easy and we can rewrite GetAllVersions as:

Instead of running the web requests synchronously and in order, we run the four web requests per NuGet feed in parallel and take the first response that is not None.
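A hedged sketch of the parallel variant follows. It assumes a FSharp.Core version that ships `Async.Choice` (releases newer than this post do, with the signature `seq<Async<'T option>> -> Async<'T option>`); on older versions a helper such as the fssnip one mentioned below would take its place. Feed names, protocol names and latencies are invented for illustration.

```fsharp
/// Hypothetical stand-in for a real web request with different latencies.
let getVersionsViaProtocol (protocol: string) (source: string) : Async<string list option> =
    async {
        match source, protocol with
        | "feedA", "v2" ->
            do! Async.Sleep 2000 // slow but working endpoint
            return Some [ "1.0.0"; "1.1.0" ]
        | "feedA", "v3" ->
            do! Async.Sleep 20 // fast endpoint
            return Some [ "1.0.0"; "1.1.0" ]
        | "feedB", "odata" ->
            do! Async.Sleep 20
            return Some [ "1.1.0"; "2.0.0" ]
        | _ -> return None
    }

// The order no longer matters: the slow protocol comes first on purpose.
let protocols = [ "v2"; "v3"; "odata" ]

/// Starts all protocol requests at once and keeps the first Some result.
let getVersions (source: string) : Async<string list> =
    async {
        let! first =
            protocols
            |> List.map (fun protocol -> getVersionsViaProtocol protocol source)
            |> Async.Choice
        return defaultArg first []
    }

/// Queries all sources in parallel and merges the results.
let getAllVersions (sources: string list) : string list =
    sources
    |> List.map getVersions
    |> Async.Parallel
    |> Async.RunSynchronously
    |> Seq.concat
    |> Seq.distinct
    |> Seq.sort
    |> List.ofSeq

let versions = getAllVersions [ "feedA"; "feedB" ]
printfn "%A" versions
```

For "feedA" the fast "v3" answer wins, so the overall call should not have to wait for the slow "v2" endpoint.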

Implementation of Async.Choice

Unfortunately Async.Choice is not part of the standard FSharp.Core library at the time of writing, and it turns out it is not easy to implement. There are many different implementations floating around on the internet. The one we use in Paket is by Eirik Tsarpalis and taken from fssnip. One of the advantages of this version is that the automatic cancellation of tasks works nicely. Since we are running lots and lots of web requests in parallel and always take the first response, we want to cancel all the other pending web requests. Otherwise we would basically DDoS the feeds with useless requests (and yes, that happened to us ;-)). Since Async.Choice is very useful we sent a pull request to the Visual F# project and hope that one day it will be in the box.
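The cancellation behaviour can be observed with a small experiment. This sketch uses the `Async.Choice` that ships with FSharp.Core releases newer than this post, not the fssnip version Paket used, and the `request` helper with its names and delays is purely hypothetical. `Async.OnCancel` registers a handler that runs if the workflow is cancelled while it is still pending.

```fsharp
// Names of the simulated requests that were cancelled while pending.
let cancelledRequests = System.Collections.Concurrent.ConcurrentBag<string>()

/// Hypothetical request: waits `delayMs`, then yields `result`.
let request (name: string) (delayMs: int) (result: string option) : Async<string option> =
    async {
        // The handler is unregistered again once the request completes.
        use! _handler = Async.OnCancel(fun () -> cancelledRequests.Add name)
        do! Async.Sleep delayMs
        return result
    }

let winner =
    [ request "slow" 5000 (Some "slow answer")
      request "fast" 50 (Some "fast answer") ]
    |> Async.Choice
    |> Async.RunSynchronously

printfn "winner: %A" winner
printfn "cancelled: %A" (List.ofSeq cancelledRequests)
```

With a real HTTP client the same cancellation signal is what would abort the outstanding web requests, provided the client honours the workflow's cancellation token.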

Epilogue

Treating computations as data has even more advantages. The MBrace project created a new kind of computation workflow called “cloud”:

cloud is a computation workflow builder that allows you to run your computations in the cloud. In contrast to async, we control not only when something gets executed but also where. They even have a Cloud.Choice, which allows you to run tasks asynchronously in the cloud and take the first succeeding result.

Btw: async and cloud aren’t language keywords like C#’s async and await, but are implemented as normal F# code in libraries. If you want to define your own computation expression builders, then the MSDN docs are a good starting point.
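To illustrate that a builder is ordinary library code, here is a minimal sketch of a custom computation expression for optional values. The `maybe` builder and the helper functions are made up for this example and are not part of Paket or FSharp.Core.

```fsharp
/// A tiny builder: `let!` unwraps Some and short-circuits on None.
type MaybeBuilder() =
    member _.Bind(value: 'a option, binder: 'a -> 'b option) : 'b option =
        Option.bind binder value
    member _.Return(value: 'a) : 'a option = Some value

let maybe = MaybeBuilder()

let tryParseInt (text: string) : int option =
    match System.Int32.TryParse text with
    | true, number -> Some number
    | _ -> None

/// Adds two numbers given as strings; None if either one fails to parse.
let addStrings (a: string) (b: string) : int option =
    maybe {
        let! x = tryParseInt a
        let! y = tryParseInt b
        return x + y
    }

printfn "%A" (addStrings "2" "40")
printfn "%A" (addStrings "2" "forty")
```

The compiler translates `let!` into calls to `Bind` and `return` into `Return`, which is the same mechanism that `async` and `cloud` build on.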
