# Rash thoughts about .NET, C#, F# and Dynamics NAV.

## Friday, 24. October 2008

### Using PLINQ in F# – Parallel Map and Reduce (Fold) functions – part 2

Filed under: .NET 3.0,English posts,F#,Informatik,PLINQ — Steffen Forkmann at 18:00 Uhr

Last time I showed how it is possible to use parallel map and fold functions to compute the sum of all factorials between 1 and 3000. The result was a nearly perfect load balancing for this task on a two processor machine. This time I will derive a generic function that computes partial results in parallel and folds them to a final result.

LetÔÇÖs consider our F# example:

`let add a b = a + b  `
`let fac (x:bigint) = `
`  [1I..x] |> List.fold_left (*) 1I`
`let sequential() =`
`  [1I..3000I]`
`   |> List.map fac`
`   |> List.fold_left add 0I`

This is the same as:

`let calcFactorialSum min max =`
`  [min..max] `
`   |> List.map fac`
`   |> List.fold_left add 0I  `
` `
`let f1() = calcFactorialSum    1I 2000I`
`let f2() = calcFactorialSum 2001I 2200I`
`let f3() = calcFactorialSum 2201I 2400I`
`let f4() = calcFactorialSum 2401I 2600I`
`let f5() = calcFactorialSum 2601I 2800I`
`let f6() = calcFactorialSum 2801I 3000I`
` `
`let sequential2() =`
`  f1() + f2() + f3() + f4() + f5() + f6()`

We spitted the summation into 6 independent tasks and computed the sum of the partial results. This has nearly no bearing on the runtime.

But with the help of PLINQ we can compute each task in parallel:

`let asParallel (list: 'a list) = `
`  list.AsParallel<'a>()`

`let runParallel functions = `
`    ParallelEnumerable.Select(`
`      asParallel functions, (fun f ->  f() ) )`
` `
`let pFold foldF seed (data:IParallelEnumerable<'a>)=`
`  ParallelEnumerable.Aggregate<'a,'b>(`
`    data, seed, new Func<'b,'a,'b>(foldF))`
` `

`let calcFactorialsParallel() =`
`  [f1; f2; f3; f4; f5; f6]`
`    |> runParallel`
`    |> pFold add 0I`

This time we build a list of functions (f1, f2, f3, f4, f5, f6) and run them in parallel. "runParallelÔÇŁ gives us back a list of the partial results, which we can fold with the function ÔÇťaddÔÇŁ to get the final result.

On my Core 2 Duo E6550 with 2.33 GHz and 3.5 GB RAM I get the following results:

Time Normal: 26.576s

Time Sequential2: 26.205s (Ratio: 0.99)

Time ÔÇťParallel FunctionsÔÇŁ: 18.426s (Ratio: 0.69)

Time PLINQ: 14.990s (Ratio: 0.56) (Last post)

Same Results: true

We can see that the parallel computation of the functions f1 ÔÇô f6 is much faster than the sequential.

But why is the PLINQ-version (see last post) still faster? We can easily see that each partial function needs a different runtime (e.g. itÔÇÖs much harder to calculate the factorials between 2800 and 3000 than between 2000 and 2200). On my machine I get:

Time F1: 8.738s

Time F2: 2.663s

Time F3: 3.119s

Time F4: 3.492s

Time F5: 3.889s

Time F6: 4.442s

The problem is that the Parallel Framework can only guess each runtime amount in advance. So the load balancing for 2 processors will not be optimal in every case. In the original PLINQ-version there are only small tasks, and the difference between each runtime is smaller. So it is easier to compute the load balancing.

But of course we can do better if we split f1 into two functions f7 and f8:

`let f7() = calcFactorialSum    1I 1500I`
`let f8() = calcFactorialSum 1501I 2000I`

So we can get a better load balancing:

Time F1: 8.721s

Time F7: 4.753s

Time F8: 4.829s

Time Normal: 26.137s

Time ÔÇťParallel FunctionsÔÇŁ: 16.138s (Ratio: 0.62)

Same Results: true

Tags: , , , , , ,

## Thursday, 23. October 2008

### Using PLINQ in F# ÔÇô Parallel Map and Reduce (Fold) functions – part 1

Filed under: .NET 3.0,C#,F# — Steffen Forkmann at 18:25 Uhr

If your wondering how Google computes query results in such a short time you have to read the famous ÔÇťMapReduceÔÇŁ-Paper by Jeffrey Dean and Sanjay Ghemawat (2004). It shows how one can split large tasks into a mapping and a reduce step which could then be processed in parallel.

With PLINQ (part of the Parallel Extensions to the .NET Framework) you can easily use ÔÇťMapReduceÔÇŁ-pattern in .NET and especially F#. PLINQ will take care of all the MultiThreading and load balancing stuff. You only have to give PLINQ a map and a reduce (or fold) function.

Lets consider a small example. Someone wants to compute the sum of the factorials of all integers from 1 to 3000. With List.map and List.fold_left this is a very easy task in F#:

`#light`
`open System`

`let add a b = a + b`
`let fac (x:bigint) = [1I..x] |> List.fold_left (*) 1I`

`let sum =`
`┬á [1I..3000I]`
`┬á┬á┬á |> List.map fac`
`┬á┬á┬á |> List.fold_left add 0I`

`printfn "Sum of Factorials: %A" sum`

Of course you could do much much better if you donÔÇÖt compute every factorial on its own (I will show this in one of the next parts) – but for this time I need an easy function that is time consuming.

This simple Task needs 27 sec. on my Core 2 Duo E6550 with 2.33 GHz and 3.5 GB RAM.

But we can do better if we use parallel map and fold functions with help of PLINQ:

`let pMap (mapF:'a -> 'b) (data:IParallelEnumerable<'a>) =`
`┬á ParallelEnumerable.Select(data, mapF)`

`let pFold foldF seed (data:IParallelEnumerable<'a>)=`
`┬á ParallelEnumerable.Aggregate<'a,'b>(`
`    data, seed, new Func<'b,'a,'b>(foldF))`

Now we can easily transform our calculation to a parallel version:

`let sum =`
`┬á [1I..3000I].AsParallel<bigint>()`
`┬á┬á┬á |> pMap fac `
`┬á┬á┬á |> pFold add 0I`

Putting all together we can write a small test application:

`#light `
`open System`
`open System.Linq`
`open System.Diagnostics`

`let testRuntime f =`
`┬á let watch = new Stopwatch()`
`┬á watch.Start()`
`┬á (f(),watch.Elapsed)`

`let add a b = a + b`
`let fac (x:bigint) = [1I..x] |> List.fold_left (*) 1I`

`let list = [1I..3000I]`

`let pMap (mapF:'a -> 'b) (data:IParallelEnumerable<'a>)=`
`┬á ParallelEnumerable.Select(data, mapF)`

`let pFold foldF seed (data:IParallelEnumerable<'a>)=`
`┬á ParallelEnumerable.Aggregate<'a,'b>(`
`┬á┬á┬á data, seed, new Func<'b,'a,'b>(foldF))`

`let PLINQ() =`
`┬á list.AsParallel<bigint>()`
`┬á┬á┬á |> pMap fac`
`┬á┬á┬á |> pFold add 0I`

`let sequential() =`
`┬á list`
`┬á┬á |> List.map fac`
`┬á┬á |> List.fold_left add 0I`

`let (sumSequential,timeSequential) =`
`┬á testRuntime sequential`
`printfn "Time Normal: %.3fs" timeSequential.TotalSeconds`

`let (sumPLINQ,timePLINQ) =`
`┬á testRuntime PLINQ`
`printfn "Time PLINQ: %.3fs" timePLINQ.TotalSeconds`

`timePLINQ.TotalSeconds / timeSequential.TotalSeconds`
`┬á |> printfn "Ratio: %.2f"`

`sumSequential = sumPLINQ`
`┬á |> printfn "Same Results: %A"`

On my machine I get the following results:

Time Normal: 27.955s

Time PLINQ: 15.505s

Ratio: 0.55

Same Results: true

This means I get nearly a perfect load balancing on my two processors for this task.

In part II I describe how one can compute a series of functions in parallel.

Tags: , , , , , , ,

## Sunday, 12. October 2008

### ParallelFX wird Kernkomponente vom .NET Framework 4.0

Filed under: .NET 3.0,F# — Steffen Forkmann at 11:32 Uhr

Wie das ParallelFX-Team bekannt gegeben hat, werden die ÔÇťParallel Extensions für das .NET FrameworkÔÇŁ nun zur Kernkomponente vom .NET-Framework 4.0 befördert.

ÔÇťParallel Extensions will indeed be a part of the .NET Framework 4.0.  Not only will it be a part of it, it will be a core part of it.ÔÇŁ

Das bedeutet, dass dann vermutlich eine ganze Reihe an Basisfunktionen schon von Hause aus parallel verarbeitet werden kann. Anwenden kann man die Bibliotheken auch jetzt schon ganz einfach ÔÇô allerdings immer mit zusätzlichem Installationsaufwand. Aber es lohnt sich wirklich.

Ein weiterer interessanter Aspekt ist übrigens, dass das F#-Team bereits angekündigt hat demnächst seine asynchronen Workflows auf ParallelFX umzustellen.

Weitere Informationen zu ParallelFX:

Tags: , , , ,