Parallel reduce: Hopac, Asyncs, Tasks and Scala's Futures

Tuomas Hietanen posted a parallel reduce function that uses TPL Tasks. I found it interesting to compare performance of this function with analogues implemented using F# Asyncs, Hopac Jobs and Scala Futures.
The author uses noop long-running reduce function to show that it's really run in parallel. In this blog post we are benchmarking another aspect of the implementations: how much extra cost is introduced by a particular parallezation mechanism (library) itself.

We translate the original code almost as is to Tasks and Hopac:


And Scala's Futures:


The results (Core i5, 4 cores):

  • Sequential List.reduce: Real: 00:00:00.014, CPU: 00:00:00.015, GC gen0: 0, gen1: 0, gen2: 0 
  • Tasks: Real: 00:00:01.790, CPU: 00:00:05.678, GC gen0: 36, gen1: 10, gen2: 1 
  • Hopac: Real: 00:00:00.514, CPU: 00:00:01.482, GC gen0: 27, gen1: 2, gen2: 1 
  • Asyncs: Real: 00:00:37.872, CPU: 00:01:48.405, GC gen0: 90, gen1: 29, gen2: 4
  • Scala Futures: 4.8 seconds

(Hopac - 3.4 times faster, Asyncs - 21.1 times slower, Scala - 1.8 times slower)

Hopac is ~3.5 times faster than TPL. What's wrong with Asyncs? I don't know. Maybe they are not intended for highly concurrent scenarios. Or my code may not be the most efficient. Any ideas, guys?

Let's test the leaders on larger arrays:


(Hopac is 3.37 times faster, Scala is 1.5 times slower)


(Hopac is 5.25 times faster, Scala is 1.05 times slower)

Comments

Thorium said…
This comment has been removed by the author.
Thorium said…
.NET Tasks and Async are more for IO-bound operations than for CPU-bound; for concurrency rather than parallelism. So maybe it is not the fastest thing, while it is quite easy. If you open async-tasks dll method with ILDASM / ILSpy / Reflector, you notice that it will cause some overhead.

For high-parallelism use e.g. Array.Parallel.
Thorium said…
Hmm... With thread-blocking Sleep(30) in the aggregate function I tried to simulate that what if your combinator is more complex than just (+)
E.g. for array with just 5000 items:

Array.reduce (fun c -> fun x -> System.Threading.Thread.Sleep(30);x+c) a
Real: 00:02:36.232

reduceParallelTasks (fun c -> fun x -> System.Threading.Thread.Sleep(30);x+c) a
Real: 00:00:06.730

reduceParallelAsync (fun c -> fun x -> System.Threading.Thread.Sleep(30);x+c) a
Real: 00:00:05.361

reduceParallelHopac (fun c -> fun x -> System.Threading.Thread.Sleep(30);x+c) a
Real: 00:00:20.032
Vesa Karvonen said…
The main reason for the existence of the Job monad is to make it possible to not block a native thread. So what your test does is that explicitly does the one thing, blocks the thread running the job, that you should not do.
Vasily said…
Try reduceParallelHopac (fun c -> fun x -> Timer.Global.timeOutMillis 30;x+c) a

Popular posts from this blog

Upcoming F# struct tuples: are they always faster?

Computing cryptography hashes: Rust, F#, D and Scala

Composing custom error types in F#