Testing Akka Performance

A few weeks ago I attended a workshop called “Understanding Mechanical Sympathy”, run by Martin Thompson. During the workshop we wrote and tested a few concurrent programming techniques. As a first exercise we wrote a simple Ping-Pong program:


package uk.co.real_logic;

import static java.lang.System.out;

/*
Original exercise done during "Lock Free Workshop" by Martin Thompson: http://www.real-logic.co.uk/training.html
 */
public final class PingPong {
    private static final int REPETITIONS = 100_000_000;

    private static volatile long pingValue = -1;
    private static volatile long pongValue = -1;

    public static void main(final String[] args) throws Exception {
        final Thread pongThread = new Thread(new PongRunner());
        final Thread pingThread = new Thread(new PingRunner());
        pongThread.setName("pong-thread");
        pingThread.setName("ping-thread");
        pongThread.start();
        pingThread.start();

        final long start = System.nanoTime();

        pingThread.join();
        pongThread.join();

        final long duration = System.nanoTime() - start;

        out.printf("duration %,d (ns)%n", duration);
        out.printf("%,d ns/op%n", duration / (REPETITIONS * 2L));
        out.printf("%,d ops/s%n", (REPETITIONS * 2L * 1_000_000_000L) / duration);
        out.println("pingValue = " + pingValue + ", pongValue = " + pongValue);

        main(args); // run the test again; repeated runs let the results stabilize
    }

    public static class PingRunner implements Runnable {
        public void run() {
            for(int i = 0; i < REPETITIONS; ++i){
                pingValue = i;
                while(i != pongValue){
                }
            }
        }
    }

    public static class PongRunner implements Runnable {
        public void run() {
            for(int i = 0; i < REPETITIONS; ++i) {
                while (i != pingValue) {
                }
                pongValue = i;
            }
        }
    }
}


As you can see, we have 2 threads that coordinate through a pair of volatile variables: each thread checks the value written by the other and continues only once it matches the expected value. It’s worth noting that both threads are busy spinning (inside the while loops) rather than blocking.
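The busy-spin handoff described above can be tried in isolation with a scaled-down sketch. This is not the workshop code: the class name `SpinHandoff`, the `run` helper and the use of `Thread.onSpinWait()` (available from Java 9) are my own assumptions, added only to make the spinning explicit.

```java
public final class SpinHandoff {
    private static volatile long pingValue = -1;
    private static volatile long pongValue = -1;

    /** Runs a scaled-down ping-pong and returns the last value both sides agreed on. */
    public static long run(final int repetitions) {
        pingValue = -1;
        pongValue = -1;

        final Thread pong = new Thread(() -> {
            for (int i = 0; i < repetitions; ++i) {
                while (i != pingValue) {
                    Thread.onSpinWait(); // hint that this thread is busy-waiting (Java 9+)
                }
                pongValue = i; // answer the ping
            }
        });
        final Thread ping = new Thread(() -> {
            for (int i = 0; i < repetitions; ++i) {
                pingValue = i; // publish the next ping
                while (i != pongValue) {
                    Thread.onSpinWait();
                }
            }
        });

        pong.start();
        ping.start();
        try {
            ping.join();
            pong.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Both counters should have reached the same final value.
        return pingValue == pongValue ? pongValue : -1;
    }

    public static void main(final String[] args) {
        System.out.println("last value = " + run(100_000));
    }
}
```

Neither thread ever blocks or parks, so each one keeps a core fully busy while it waits for the other.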

Testing Original Code

After running the code for a few minutes, the test results on my machine (i7-3610QM) stabilize around these values:

duration 10,756,457,024 (ns)
53 ns/op
18,593,482 ops/s

After pinning threads to specific CPU cores the results are slightly better:

duration 10,147,866,952 (ns)
50 ns/op
19,708,575 ops/s

This fairly trivial optimization improved throughput by about 6%, so overall not much (I suspect that my Linux system already performs some sort of optimization at this level).
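For reference, the reported figures can be re-derived from the raw durations. The sketch below repeats the arithmetic of the two printf lines in the program and compares the throughput before and after pinning. The class and method names are made up for illustration; the durations are the ones quoted above.

```java
public final class PingPongMetrics {
    // REPETITIONS * 2: every iteration is one ping plus one pong.
    static final long OPS = 100_000_000L * 2L;

    static long nsPerOp(final long durationNs) {
        return durationNs / OPS;
    }

    static long opsPerSecond(final long durationNs) {
        return OPS * 1_000_000_000L / durationNs;
    }

    /** Relative throughput change in percent between two runs. */
    static double gainPercent(final long opsBefore, final long opsAfter) {
        return 100.0 * (opsAfter - opsBefore) / opsBefore;
    }

    public static void main(final String[] args) {
        final long unpinned = 10_756_457_024L; // duration without pinning (ns)
        final long pinned = 10_147_866_952L;   // duration with pinned threads (ns)

        System.out.printf("unpinned: %,d ns/op, %,d ops/s%n",
                nsPerOp(unpinned), opsPerSecond(unpinned));
        System.out.printf("pinned:   %,d ns/op, %,d ops/s%n",
                nsPerOp(pinned), opsPerSecond(pinned));
        System.out.printf("gain:     %.1f%%%n",
                gainPercent(opsPerSecond(unpinned), opsPerSecond(pinned)));
    }
}
```

With these two durations the relative throughput gain works out to roughly 6%.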

Akka version

After playing around with the pure Java version of the code, I decided to see how I could design a close enough Akka version of this exercise and which optimizations I could use to improve its original performance.

This is my code:


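import java.util.concurrent.atomic.AtomicInteger

import akka.actor.{Actor, ActorSystem, Props}

import scala.concurrent.Await
import scala.concurrent.duration.Duration

case object Ping
case object Pong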

object PingPongAkkaApp extends App {
  override def main(args: Array[String]): Unit = {
    val t = new Tester
    t.run()
    main(args)
  }
}

class Tester {
  val REPETITIONS: Int = 100000000

  val startTime: Long = System.nanoTime
  val pongValue: AtomicInteger = new AtomicInteger(0)

  def run() = {
    val system = ActorSystem("PingPongSystem")
    val pongActor = system.actorOf(Props(new PongActor(REPETITIONS, pongValue)), name = "Pong")
    val pingActor = system.actorOf(Props(new PingActor(REPETITIONS)), name = "Ping")

    system.registerOnTermination {
      val duration: Long = System.nanoTime - startTime
      printf("duration %,d (ns)%n", duration)
      printf("%,d ns/op%n", duration / (REPETITIONS * 2L))
      printf("%,d ops/s%n", (REPETITIONS * 2L * 1000000000L) / duration)
      println("pongValue = " + pongValue)
    }

    pongActor.tell(Ping, pingActor)

    Await.result(system.whenTerminated, Duration.Inf)
  }
}

class PingActor(repetitions: Int) extends Actor {

  override def receive = {
    case Pong =>
      sender ! Ping
  }
}

class PongActor(repetitions: Int, pongValue: AtomicInteger) extends Actor {
  var counter: Int = 0

  override def receive = {
    case Ping =>
      counter = counter + 1
      if (counter >= repetitions) {
        pongValue.set(counter)
        context.system.terminate()
      } else {
        sender ! Pong
      }
  }
}

In this case, instead of explicitly synchronizing on a variable, we use 2 Akka actors that send messages to each other. The PingActor just replies with Ping to each Pong it receives, while the PongActor keeps the counter and stops the actor system when it reaches the expected number of iterations. We also need a little more boilerplate code to set up the actor system and wait for its termination.
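To make the message flow easier to follow without an Akka dependency, here is a plain-Java analogy, not Akka itself: each "actor" is a thread with its own mailbox (a `LinkedBlockingQueue`). The class name, the `Msg` enum and the extra `STOP` message are assumptions of this sketch; in the Akka version, terminating the actor system plays the role of `STOP`.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public final class MailboxPingPong {
    enum Msg { PING, PONG, STOP }

    /** Runs the exchange and returns how many Ping messages the pong side handled. */
    public static int run(final int repetitions) {
        final BlockingQueue<Msg> pingMailbox = new LinkedBlockingQueue<>();
        final BlockingQueue<Msg> pongMailbox = new LinkedBlockingQueue<>();
        final int[] counter = {0}; // only touched by the pong thread until it finishes

        final Thread ping = new Thread(() -> {
            try {
                // Reply with PING to every PONG; any other message stops the loop.
                while (pingMailbox.take() == Msg.PONG) {
                    pongMailbox.put(Msg.PING);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        final Thread pong = new Thread(() -> {
            try {
                while (true) {
                    pongMailbox.take(); // wait for the next PING
                    counter[0]++;
                    if (counter[0] >= repetitions) {
                        pingMailbox.put(Msg.STOP); // stands in for system.terminate()
                        return;
                    }
                    pingMailbox.put(Msg.PONG);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        ping.start();
        pong.start();
        pongMailbox.add(Msg.PING); // kick-off, like pongActor.tell(Ping, pingActor)
        try {
            ping.join();
            pong.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter[0];
    }

    public static void main(final String[] args) {
        System.out.println("pings handled = " + run(100_000));
    }
}
```

Unlike the volatile version, every exchange here goes through a queue and may park a thread, which gives a rough intuition for where message-passing overhead comes from.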

Testing Akka Version

Let’s look at the performance of this code without any additional configuration. After running the test for a couple of minutes, the results stabilized around these values:


duration 222,328,154,119 (ns)
1,111 ns/op
899,571 ops/s


As you can see, it’s roughly 21x slower than the Java version (1,111 ns/op versus 53 ns/op), which is quite bad.

During test execution I noticed that it was using all my CPU cores at 100%, which was undesirable, so I decided to change 2 things.

First, I set up a proper Akka configuration; in this case something like this was enough:

akka {
  actor {
    default-dispatcher {
      type = Dispatcher
      executor = "thread-pool-executor"
      throughput = 1
      # Note: Akka reads this block only when executor = "fork-join-executor"
      fork-join-executor {
        parallelism-min = 2
        parallelism-factor = 0.5
        parallelism-max = 3
      }
    }
  }
}

Additionally, I pin the Java process so that it uses only 2 CPU cores:

taskset -c 1,2 java -server -jar target/scala-2.11/akka-ping-pong-assembly-1.0.jar
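One way to check from inside the JVM that the pinning took effect is to print the number of processors the runtime reports. This is a sketch under the assumption of a HotSpot JVM on Linux, which typically derives this number from the CPU affinity mask of the process; the class name is made up for illustration.

```java
public final class VisibleCores {
    /** Number of processors the JVM believes it may use. */
    static int visibleCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(final String[] args) {
        // Compare the output with and without the taskset prefix.
        System.out.println("available processors = " + visibleCores());
    }
}
```

Launching it once directly and once with `taskset -c 1,2` in front of the `java` command lets you compare the two numbers.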

This gave much better test results:

duration 60,581,436,734 (ns)
302 ns/op
3,301,341 ops/s

Summary

As you can see, the overhead is now about 6x compared to the Java version (302 ns/op versus 50 ns/op), which is not bad considering all the additional work Akka does to provide features we don’t have when using plain threads.

I’m sure it’s possible to go even further with optimizations, and I suspect this time could be cut by an additional 50% given enough experimentation, but I’ll leave that as an exercise 🙂

Here is the git repository with full code: https://github.com/wlk/akka-ping-pong

How To Setup Garmin 310XT To Work With Linux

In this post I intend to provide an overview of the steps needed to set up the Garmin 310XT GPS sports watch to work with Linux (Ubuntu).

(This tutorial should also apply to other similar Garmin GPS watches: Garmin Forerunner 60, 405CX, 310XT, 610 and 910XT.)

Install Required Packages

sudo apt-get install python-pip python-qt

sudo pip install pyusb

Install GFrun

GFrun is the program you can use to download recorded workouts from your watch. It has its own installation script:

wget -N https://github.com/xonel/GFrun/raw/GFrun/GFrun/GFrun.sh && chmod a+x GFrun.sh && sudo bash ./GFrun.sh


But I have discovered that running it as root is not required.

Configure udev Rules

First, plug in your ANT+ stick and run lsusb | grep ANTUSB

In my case this was the result:

Bus 001 Device 031: ID 0fcf:1009 Dynastream Innovations, Inc. ANTUSB-m Stick

Now you just need to create the file `/etc/udev/rules.d/51-garmin.rules` with the following content, using the vendor and product IDs from the lsusb output:

ATTRS{idVendor}=="0fcf", ATTRS{idProduct}=="1009", MODE="666"

After that, restart udev by running:

/etc/init.d/udev restart

Then re-plug your ANT+ stick.
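If your stick reports different IDs, the rule has to be adapted. As a small illustration of how the two values map from the lsusb line into the rule, here is a sketch that extracts the `vendor:product` pair and formats the rule line. The class and method names are hypothetical, and the `MODE="666"` part is simply copied from the rule above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class UdevRule {
    // Matches the "ID vvvv:pppp" part of an lsusb line (four hex digits each).
    private static final Pattern USB_ID =
            Pattern.compile("ID ([0-9a-fA-F]{4}):([0-9a-fA-F]{4})");

    /** Builds a udev rule line from one line of lsusb output, if it contains a USB ID. */
    static Optional<String> fromLsusbLine(final String line) {
        final Matcher m = USB_ID.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(String.format(
                "ATTRS{idVendor}==\"%s\", ATTRS{idProduct}==\"%s\", MODE=\"666\"",
                m.group(1), m.group(2)));
    }

    public static void main(final String[] args) {
        final String line =
                "Bus 001 Device 031: ID 0fcf:1009 Dynastream Innovations, Inc. ANTUSB-m Stick";
        System.out.println(fromLsusbLine(line).orElse("no USB ID found"));
    }
}
```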

Running GFrun

To extract workouts from the device, I only need to run this command:

/home/w/GFrun/GFrun.sh -el

This will download FIT files from the device to

/home/w/GFrun/forerunners/{ID}/activities

The downloaded files are ready to be uploaded anywhere you like (for example Endomondo or Strava – both services accept them without issues).
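Before uploading, it can be handy to see which files were actually downloaded. The sketch below lists the FIT files in a directory passed as the first argument, for example the activities directory mentioned above. It assumes the files carry a `.fit` extension, which I have not verified for GFrun, and the class name is made up.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public final class FitFiles {
    /** Assumption: downloaded activities use the .fit extension (any letter case). */
    static boolean isFitFile(final String fileName) {
        return fileName.toLowerCase(Locale.ROOT).endsWith(".fit");
    }

    /** Returns the sorted names of FIT files directly inside the given directory. */
    static List<String> list(final Path activitiesDir) {
        if (!Files.isDirectory(activitiesDir)) {
            return Collections.emptyList();
        }
        try (Stream<Path> entries = Files.list(activitiesDir)) {
            return entries.map(p -> p.getFileName().toString())
                    .filter(FitFiles::isFitFile)
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(final String[] args) {
        // Pass the activities directory as the first argument.
        final Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        list(dir).forEach(System.out::println);
    }
}
```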

How To Setup CI Build Pipeline With Travis CI, Heroku and sbt

This post covers all the steps required to set up a Continuous Integration (CI) build pipeline, using Travis CI as the main driver for deploying a Play application written in Scala to the Heroku cloud.

Expected end result:

After each commit to the master branch of my Github project, I’d like to run the full test suite. Each successful build should then trigger a deployment to Heroku. If the tests fail, the application should not be deployed.
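The gating rule described above (deploy only when the tests pass) can be sketched in a few lines. This is only an illustration of the logic that Travis CI applies for us, not something the pipeline needs. The class name `DeployGate` and the pluggable `runner` function are assumptions made so the rule can be exercised without sbt installed.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

public final class DeployGate {
    /** Runs a command, streaming its output, and returns its exit code (-1 if it could not run). */
    static int exec(final List<String> command) {
        try {
            return new ProcessBuilder(command).inheritIO().start().waitFor();
        } catch (IOException e) {
            return -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
    }

    /** Runs the tests first and deploys only if they exit with code 0. */
    static boolean testThenDeploy(final ToIntFunction<List<String>> runner) {
        if (runner.applyAsInt(Arrays.asList("sbt", "test")) != 0) {
            return false; // tests failed: never reach the deployment step
        }
        return runner.applyAsInt(Arrays.asList("sbt", "stage", "deployHeroku")) == 0;
    }

    public static void main(final String[] args) {
        System.exit(testThenDeploy(DeployGate::exec) ? 0 : 1);
    }
}
```

Passing the runner in as a function keeps the rule itself testable: a fake runner that returns a non-zero code for the test step shows that the deploy command is never attempted.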

I’ll use my own project as an example, but this tutorial is suitable for any other project that uses sbt as a build tool. I’ll cover the following aspects of the work:

  1. Deploy Play application to Heroku
  2. Integrate Travis CI with Play app
  3. Setup Travis to deploy to Heroku

Deploy Play application to Heroku

Deploying a Play application (built with Activator or sbt) can be done very simply with the help of the sbt-heroku plugin.

These are the steps I had to perform:

  1. Install the Heroku Toolbelt and log in with your Heroku account
  2. Run heroku create, which will generate a new application name for you
  3. Add the sbt-heroku plugin and configure it with the application name from the previous step
  4. Run sbt stage deployHeroku
  5. The application will be deployed in around 1 or 2 minutes
  6. If you go to your app activity log (in my case here: https://dashboard.heroku.com/apps/warm-hamlet-57324/activity) you should see all actions that were performed, including your latest deployment

Integrate Travis CI with Play app

This step is even simpler: it’s enough to log in to Travis CI with your Github account and then select the project you are interested in from the list of discovered repositories.

Travis CI will do everything else automatically, and that’s enough to get started. Travis will be notified by Github on each commit you make and will run your tests for you.

To allow for more customization and control over the build and test process, it’s recommended to add a .travis.yml configuration file. This is a simple one to get started with:


language: scala

jdk:
  - oraclejdk8

scala:
  - 2.11.8

cache:
  directories:
    - $HOME/.m2/repository
    - $HOME/.sbt
    - $HOME/.ivy2

Setup Travis to deploy to Heroku

The next step combines the work we have done so far.

To get started you need your Heroku API key, which can be found on your account page.

This API key needs to be configured on the Travis CI project settings page (in my case: https://travis-ci.org/wlk/game-arena/settings). There you need to add a new environment variable called HEROKU_API_KEY and set the API key as its value.

Additionally, we need to update the .travis.yml file to describe the deployment step:

language: scala

jdk:
  - oraclejdk8

scala:
  - 2.11.8

cache:
  directories:
    - $HOME/.m2/repository
    - $HOME/.sbt
    - $HOME/.ivy2

deploy:
  provider: script
  script: sbt stage deployHeroku

As you can see, I have decided to configure Travis to run my own deployment script, which is sbt stage deployHeroku. It’s exactly the same command I used when deploying from localhost (this time the Heroku API key is not read from the Heroku Toolbelt, but from the environment variable we configured one step above).

Note: Travis comes with built-in Heroku deployment capabilities, but I decided not to use them, because I wanted to be able to reuse the same deployment code for both automated and manual deployments.
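Since the deployment now depends on the HEROKU_API_KEY environment variable, a quick sanity check before deploying can make a missing key easier to diagnose. The sketch below only checks that the variable is present and never prints its value. The class name is made up and the check is not part of my pipeline.

```java
import java.util.Map;
import java.util.Optional;

public final class HerokuKeyCheck {
    /** Returns the API key from the given environment if it is set and not blank. */
    static Optional<String> apiKey(final Map<String, String> env) {
        final String key = env.get("HEROKU_API_KEY");
        if (key == null || key.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(key.trim());
    }

    public static void main(final String[] args) {
        if (!apiKey(System.getenv()).isPresent()) {
            System.err.println("HEROKU_API_KEY is not set in the environment");
            System.exit(1);
        }
        // Deliberately do not print the key itself.
        System.out.println("HEROKU_API_KEY is set");
    }
}
```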

Summary

As you can see, setting up a simple CI build pipeline is quite straightforward. Once it is in place, the whole process of testing and deploying happens automatically, and a new version of your app can be live within a few minutes of your last commit.

I have been using Travis CI for all my Github projects with good results, but this is the first time I have deployed an application automatically to Heroku, so there is still much more to learn about doing this effectively.
