Testing Akka Performance

A few weeks ago I attended a workshop called “Understanding Mechanical Sympathy” run by Martin Thompson. During that workshop we wrote and tested a few concurrent programming techniques, and as a first exercise we wrote a simple Ping-Pong program.

The program has 2 threads that need to synchronize on a single variable. Each thread checks the value of the variable and continues only when it matches the expected value. It’s worth noting that both threads are busy-spinning (inside while loops).
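A minimal sketch of this busy-spin handoff in Java might look like the following. The class name SpinPingPong, the variable name turn, and the iteration count are my assumptions for illustration, not the actual workshop code.

```java
final class SpinPingPong {
    // The single shared variable; volatile so each thread sees the other's writes.
    private static volatile long turn = 0;

    // Runs `iterations` ping-pong round trips and returns the final value of `turn`.
    static long run(long iterations) {
        turn = 0;
        // "Ping" waits for an even value (2*i), then hands over by writing 2*i + 1.
        Thread ping = new Thread(() -> {
            for (long i = 0; i < iterations; i++) {
                while (turn != 2 * i) { /* busy spin */ }
                turn = 2 * i + 1;
            }
        });
        // "Pong" waits for an odd value (2*i + 1), then hands back by writing 2*i + 2.
        Thread pong = new Thread(() -> {
            for (long i = 0; i < iterations; i++) {
                while (turn != 2 * i + 1) { /* busy spin */ }
                turn = 2 * i + 2;
            }
        });
        ping.start();
        pong.start();
        try {
            ping.join();
            pong.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return turn;
    }

    public static void main(String[] args) {
        long iterations = 1_000_000L; // assumed value, pick whatever suits your machine
        long start = System.nanoTime();
        long result = run(iterations);
        long elapsed = System.nanoTime() - start;
        System.out.println("final value: " + result);
        System.out.println("ns per round trip: " + (elapsed / iterations));
    }
}
```

Only one thread writes at any given moment, so a plain volatile field is enough here and no locks or compare-and-set operations are needed.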

Testing Original Code

After running the code for a few minutes, the test results stabilize, in my case (i7-3610QM), around these values:

After pinning the threads to specific CPU cores, the results are slightly better:

This pretty trivial optimization yielded a performance gain of about 7%, so not much overall (I suspect that my Linux system already performs some sort of optimization at this level).

Akka version

After playing around with the pure Java version of the code, I decided to see how I could design a close enough Akka version of this exercise and what optimizations I could use to improve its original performance.

This is my code:

In this case, instead of explicitly synchronizing on a variable, we use 2 Akka actors that send messages to each other. The PingActor just replies with Ping to each Pong it receives, while the PongActor keeps the counter and stops the actor system when it reaches the expected number of iterations. We also need a little more boilerplate code to set up the actor system and wait for termination.
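To make the message flow concrete without pulling in the Akka dependency, here is a rough plain-Java analogue in which each "actor" is a thread with a queue as its mailbox. This is not Akka code and not the code from the repository; the class name MailboxPingPong, the Msg enum, and the queue-based mailboxes are all assumptions made for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class MailboxPingPong {
    enum Msg { PING, PONG }

    // Exchanges messages until the "pong" side has counted `iterations` pings
    // (iterations must be >= 1). Returns the final counter value.
    static long run(long iterations) {
        BlockingQueue<Msg> pingMailbox = new LinkedBlockingQueue<>();
        BlockingQueue<Msg> pongMailbox = new LinkedBlockingQueue<>();
        long[] counter = {0};

        // Stand-in for PingActor: replies with PING to every PONG it receives.
        Thread ping = new Thread(() -> {
            try {
                while (true) {
                    pingMailbox.take();
                    pongMailbox.put(Msg.PING);
                }
            } catch (InterruptedException e) {
                // Interrupted on shutdown: just exit the loop.
            }
        });
        ping.setDaemon(true);

        // Stand-in for PongActor: keeps the counter and stops once the target is reached.
        Thread pong = new Thread(() -> {
            try {
                while (true) {
                    pongMailbox.take();
                    counter[0]++;
                    if (counter[0] >= iterations) {
                        break;
                    }
                    pingMailbox.put(Msg.PONG);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        ping.start();
        pong.start();
        pingMailbox.add(Msg.PONG); // initial message that kicks off the exchange
        try {
            pong.join(); // plays the role of waiting for actor system termination
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        ping.interrupt();
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println("messages counted: " + run(100_000L));
    }
}
```

As in the Akka version, neither side touches shared state directly: the counter is owned by the pong side only, and all coordination happens through messages.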

Testing Akka Version

Let’s look at the performance results after running this code without any additional configuration:

After running the test for a couple of minutes, the results stabilized around these values:

As you can see, it’s roughly 20x slower than the Java version, which is quite bad.

During test execution I noticed that it was using all my CPU cores at 100%, which was undesirable, so I decided to add 2 things:

Set up a proper Akka configuration; in this case something like this was enough:

Additionally, I’m pinning the Java process to use only 2 CPU cores:

This gave much better test results:

Summary

As you can see, the overhead is now 6x compared to the Java version, which is not bad considering all the additional work Akka does to provide features we don’t have when using plain threads.

I’m sure it’s possible to go even further with optimizations, and I suspect that this time could be cut by an additional 50% given enough experimentation, but I’ll leave that as an exercise 🙂

Here is the git repository with full code: https://github.com/wlk/akka-ping-pong