Why You Should Adapt Gitflow

GitFlow is a convention that says how you should structure your development process. It tells you when you should branch and merge your changes and helps you with hotfixes. By organizing branches effectively it unblocks developers allowing them to stay effective

Key characteristics

Work happens in parallel

Each developer is free to work on his individual features, all is required to branch off develop branch and once the feature is done create a PR back to develop
At later point in time current state of develop is picked up and new release is created.

Focus on releases

GitFlow focuses around releases (see comparison to GitHub Flow below), this means that it’s best suited when you release your code often, but at the same time not too often.
There is some (small) overhead related to organizing a release, so in most cases you would be releasing at most once per day (just a rule of thumb)

Clear separation of ongoing work vs hotfixes

Hotfixes that affect live application don’t interfere with ongoing work.

Most teams struggle with this in simpler workflows (or when there’s no workflow), especially when each PR after review is merged back to master and release of the master happens after a while.

In case of emergency they need to figure out what exactly version is running live, and branch off at that specific moment in the master branch, this is messy and always ad-hoc.

GitFlow deals with this cleanly because there’s one more level of separation between ongoing work, staging area (release branches) and production branch (master)

No more ad-hoc decisions

GitFlow is simple but at the same time quite comprehensive model/convention to how you and your team should structure your development process.

It can be compared to the application framework which tells you in which folders you should place specific files, GitFlow tells you when you should branch and merge.

Reading material

There’s already a good amount of reading material about it, so I’m not going to repeat those resources:
* https://datasift.github.io/gitflow/IntroducingGitFlow.html – good and short introduction to GitFlow
* http://nvie.com/posts/a-successful-git-branching-model/ – article that introduced GitFlow to the public
* https://githubflow.github.io/ – simplified version of the model: GitHub Flow
* https://www.atlassian.com/git/tutorials/comparing-workflows – a comparison of few Git workflows

Note about GitHub Flow

GitHub flow in general is very similar to what most people do out of the box with git, but with one exception. GitHub Flow requires you to deploy master to production after each merge. In my experience most teams are not ready to adapt it, or feel uneasy about it, therefore GitFlow might be a better convention for them.

Tooling

There is a whole tool dedicated to GitFlow https://github.com/nvie/gitflow which might be useful for some people. In my experience I’m happy with just using hub to help with creation of branches and PRs: https://github.com/github/hub

Summary

I think that once a development team reaches certain size you should introduce a workflow to organize your development.

GitFlow seems like a good choice, it’s well documented, widely used, easy to understand.
It brings along a small overhead but there’s always a price to pay for having a structure. I think it’s definitively worth to have some structure versus ad-hoc decisions, especially made at stressful moments, like hotfixes on production.

Remote Work – 5 Years Later

This month it will be my 5 year anniversary as a remote worker. During that time I have been working for different companies, some were fully remote, some were partially remote. In some cases I have worked with teams with almost no time overlap (9 hours difference), in some cases all of us were in the same time zone.

This post is an attempt to summarize my thoughts from the 5 year perspective, but also I’d like to share tips for both people who are remote members in their teams as well as for people who are working exclusively on-site while some of their peers are remote.

Remote work is here to stay

I think right now remote working is one of the biggest trends shifting the IT industry, more and more companies are open to hire people working remotely. 5 years ago the situation was very different, remote working was something a lot of people didn’t hear about and as potential hire it was nearly impossible to convince your potential employer to hire our like that.

Right now you can apply to many companies that don’t explicitly advertise that they are hiring remotely and very likely they won’t reject you right away.

Over that time we have also seen a massive amount of new tools aimed at distributed teams, for example: Slack, HipChat, Status Hero, etc; without them working remotely wouldn’t be as effective as it can be.

Communication is the key

Over the 5 years are remote worker, I have spent around 2-3 months working in the offices with people on my teams, in most cases I was invited to visit the office and meet with everyone in person.
Experience like that has helped me to adapt to remote environment, here are the most important observations I have for remote workers:

Over-communicate

Make sure that everyone is in the loop about the things you are working on, daily standups definitely help – this is a chance to synchronize whole team, but as remote worker you need to take this further.

Ideally you are using issue tracker to track your progress, I use it a lot to write down ideas or comments as I work on the related issue, this creates a sort of log of my progress through the issue – which makes your work visible to others, people who are subscribed or stumble on this in any other way can add comments while you are working on stuff.
What’s more: It’s very beneficial to setup filters in your issue tracking tool and (periodically, for example daily) go through the issues other are working on and offer your input where you think it would be beneficial

It also helps to write a more or less detailed plan of your work before starting to work on individual issues.

Be responsive to others

Make sure to periodically go through all communication tools your team uses. I usually try to check email 3 times per day (morning, lunch, before going offline), but I keep HipChat open all the time and scroll through it during short breaks when my project is compiling 🙂 . I’ll also get notified when someone mentions me or sends me direct message.

Additionally I think you should treat PR review requests with very high priority, often someone might be blocked waiting on your review.

Proactive communication

This point is especially useful when there’s little time overlap between time members or similar situations. To avoid blocking others or being blocked yourself try to be proactive when asking questions or answering them, for example:

If you are about to ask someone how you should design particular piece of code, instead of asking question right away, try to come up with few approaches you might take, write down shortly what pros and cons they have and suggest which one you think is right.

This helps to greatly reduce the need to go back and forth when communicating – remember that it might take more than 10 hours to get the response.

Advice for on-site workers

All the points above are also very relevant to on-site workers, and I think there are few additional things that on-site workers should pay more attention to.

I’m sure you are already familiar with chat tools like HipChat or Slack, what you might be missing is the opportunity to set your statuses to match your availability, for example make sure to set auto-away to 5 or 10 minutes of inactivity, this way when you leave your desk, remote people will notice your status and won’t keep pinging you to reply. Also enable “away” status to set when you lock your device.

If your team has regular standup calls when everyone gathers around single person in the office and all remote people are dialing in, a good quality microphone and camera is a must. Ideally get a microphone that’s designed to pick up human speech and cancels noise.

It might be very useful to experience being a remote worker yourself, I think it’s most effective if for example your team decides on having “Remote Fridays” (or any other day of the week).

Summary

As you can see, majority of my focus was on efficient communication, I believe that this is the most important point to focus on when building or living in a remote team.

Over last 5 years I think I have made great progress with respect to my communication habits and I’m convinced that there is still a lot to learn and grow.

I’m looking forward to next 5 years as remote worker.

ETags in Akka HTTP

I have recently been involved in implementing ETags and Last-Modified header support in one of the services based on Akka-http.

I have prepared a quite comprehensive example project that shows how to implement those capabilities in Akka-http based projects.

In this post I’ll describe in a practical manner what ETags are and how to support them in your own projects.

Side note: I’ll focus on ETags and have a section on Last-Modified header at the end.

Quick introduction to ETags

ETag is basically a additional HTTP header that’s returned by the server that can be treated like a checksum of the response.
The client can later use this value when sending consecutive requests to the same endpoint to indicate what version of the resource it has seen before.

Based on the value of ETag provided by the client, the server can decide not to return the HTTP body, and indicate this by returning HTTP code 304 – Not Modified.

When client receives back a 304 response this means that the resource that client has received previously is still up-to-date and there is no need to send it again by the server.

Note that this approach requires the HTTP client (or library) to keep cached responses on it’s side, and in case of 304 response, the data should be read from there.

Wikipedia has a very good article on ETags

Note also that the server sets value of the ETag header, and clients should use If-None-Match

First request
curl -v http://localhost:8080/books-etags/1
...
> GET /books-etags/1 HTTP/1.1
> User-Agent: curl/7.35.0
> Host: localhost:8080
> Accept: */*
> 
< HTTP/1.1 200 OK
< ETag: W/"1e8ec132952ddfe628c6e2ff6a66d843"
< Last-Modified: Tue, 25 Oct 2016 12:45:00 GMT
* Server akka-http/10.0.0 is not blacklisted
< Server: akka-http/10.0.0
< Date: Fri, 25 Nov 2016 10:43:43 GMT
< Content-Type: application/json
< Content-Length: 197
...

And the body follows

Second Request
curl -v -H "If-None-Match: W/\"1e8ec132952ddfe628c6e2ff6a66d843\""  http://localhost:8080/books-etags/1
...
> GET /books-etags/1 HTTP/1.1
> User-Agent: curl/7.35.0
> Host: localhost:8080
> Accept: */*
> If-None-Match: W/"1e8ec132952ddfe628c6e2ff6a66d843"
> 
< HTTP/1.1 304 Not Modified
< ETag: W/"1e8ec132952ddfe628c6e2ff6a66d843"
< Last-Modified: Tue, 25 Oct 2016 12:45:00 GMT
* Server akka-http/10.0.0 is not blacklisted
< Server: akka-http/10.0.0
< Date: Fri, 25 Nov 2016 10:47:26 GMT
< 

Second response doesn’t have any body

Third request

After the resource on the server was updated:

curl -v -H "If-None-Match: W/\"1e8ec132952ddfe628c6e2ff6a66d843\""  http://localhost:8080/books-etags/1
...
> GET /books-etags/1 HTTP/1.1
> User-Agent: curl/7.35.0
> Host: localhost:8080
> Accept: */*
> If-None-Match: W/"1e8ec132952ddfe628c6e2ff6a66d843"
> 
< HTTP/1.1 200 OK
< ETag: W/"3049afc59dbff41160acfcbb0f3273ec"
< Last-Modified: Tue, 25 Oct 2016 12:45:00 GMT
* Server akka-http/10.0.0 is not blacklisted
< Server: akka-http/10.0.0
< Date: Fri, 25 Nov 2016 10:55:29 GMT
< Content-Type: application/json
< Content-Length: 197
< 
...

And the new body follows

I encourage you to download the sample project I have prepared: https://github.com/wlk/akka-http-etag-example which will allow you to try out those commands yourself, I have also added more debug logging to see how your request flows through on the server side.

Implementing ETags support

Akka-http has already implemented a conditional directive that allows us to use ETags quite effectively

So what’s left for us is to properly include it in our routing and pass correct arguments.

In my sample project there is a class BooksApi

This is most important snippet – I have written comments inline:

// There are 3 cases to consider
path("books-etags" / IntNumber) { id =>
  optionalHeaderValueByName("If-None-Match") {
    case Some(_) =>
      // First case, we get request with some value of "If-None-Match" header, right now we are unable to say if it's valid ETag
      booksService.getBookLastUpdatedById(id) match {
        case Some(lastUpdated) =>
          // "conditional" directive receives the ETag extracted from the request, and compares it to "lightweightBookETag(lastUpdated)"
          // which was calculated only based on the date of the book - we didn't have to fetch full object from the DB
          conditional(lightweightBookETag(lastUpdated), lastUpdated) {
            // "conditional" directive will return 304 if ETag or "If-Modified-Since" were valid
            // in this case we don't need to fetch anything more from DB
            complete {
              // If ETag was invalid (for example outdated), we continue, this time fetching full object from DB
              // Fetching full object from memory is more time consuming
              booksService.findById(id) match {
                case Some(book) => book
                case None       => throw new RuntimeException("This shouldn't happen")
              }
            }
          }
        case None => complete {
          // If resource doesn't exist we don't set any headers
          HttpResponse(NotFound, entity = "Not found")
        }
      }
    case None =>
      // Second case, request doesn't contain "If-None-Match" header
      // we know that we have to return 200 with full body, so we do that (alternatively we return 404 if resource wasn't found)
      booksService.findById(id) match {
        case Some(book) =>
          conditional(bookETag(book), book.lastUpdated) {
            complete {
              book
            }
          }
        case None =>
          complete {
            HttpResponse(NotFound, entity = "Not found")
          }
      }
  }
}

A word about Last-Modified header

In some cases ETag and Last-Modified value could serve the same purposes, even in my project I calculate ETag based on lastUpdated date, because I know that each time a resource changes, it will also update lastUpdated date.

Last-Modified is sometimes simpler to understand, but it’s not as universal as ETags, here are some cases were it won’t work but ETags would:

  • Collections
    When collection has multiple resources, we can calculate combined ETag as concatenation of all ETags of individual resources, and then hash them with MD5 (or any other hashing algorithm)
    In case of Last-Modified we don’t have a way to do this, because if we look at Max(Last-Modified) for all elements, we won’t notice for example removal of elements from collection

  • Modification date is not available
    There are resources which don’t have information about modification date, they can change any time as well. In those cases ETags are the only option

Summary

Support for ETags and Last-Modified header is quite easy to add for Akka-http projects.

I have shown how to add this to a single endpoint, the drawback is that it makes code much more verbose as there are 3 cases that require handling, each requires different path the request needs to go through.

This probably could be eliminated by defining more generic function that can encapsulate all the logic, but I decided to leave this out for now.