Notes on creating microservices-based applications

This post is a collection of tips and notes I gathered while working on microservices-based applications over the last couple of months.

The notes are divided into sections, each covering a different area of developing and running your services.

I decided to keep these notes low-level and focused on specific problems. For a higher-level overview, see The Twelve-Factor App.

Project Setup

  • Each service should be a self-contained project, hosted in a separate repository.
  • The microservices shouldn’t have any code-level dependencies on each other
    • For example, they shouldn’t depend on each other at build time
  • All dependencies should be factored into separate libraries
    • Also keep them as small as possible
  • Ideally, the only dependencies you have should be the open source libraries that you use
    • As a workaround, you can also open source your own libraries
  • The README.md should briefly describe what the project does and list the steps needed to start developing it
  • Ideally you should have instructions on how to run the project inside a Docker container
    • This helps other developers, and it pays off later if you use something like Kubernetes
  • Once you adopt Docker as the main tool to deploy the code, create an appropriate repository in ECR or Docker Hub to host your images

Specification

  • Apply API-first principles
  • Use widely supported tools like RAML or Swagger to design your API endpoints and schemas first
  • Iteratively implement new endpoints, replacing static examples of the responses with live endpoints
  • Set up infrastructure to validate your schemas
    • Integration testing is a good place for this: responses can be checked against your schemas by a validating “proxy” while the tests run
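
The validating “proxy” idea can be sketched as a thin wrapper around whatever function your integration tests use to call the service. This is only an illustration: the SCHEMAS table, the route name and the field-to-type check are hypothetical stand-ins for a real RAML or Swagger schema and a real schema validator.

```python
# Hypothetical schema table: (method, route) -> required fields and their types.
# In a real project this would be derived from your RAML/Swagger definition.
SCHEMAS = {
    ("GET", "/users/{id}"): {"id": int, "email": str},
}


def validate(body, schema):
    """Return a list of human-readable schema violations (empty when valid)."""
    errors = []
    for field, expected_type in schema.items():
        if field not in body:
            errors.append(f"missing field: {field}")
        elif not isinstance(body[field], expected_type):
            actual = type(body[field]).__name__
            errors.append(f"{field}: expected {expected_type.__name__}, got {actual}")
    return errors


def validating_proxy(send, schemas):
    """Wrap a `send(method, route)` callable so every response is checked."""

    def proxied(method, route):
        body = send(method, route)
        errors = validate(body, schemas[(method, route)])
        if errors:
            raise AssertionError(f"{method} {route} violates schema: {errors}")
        return body

    return proxied
```

In a test suite you would wrap the test client once, for example client = validating_proxy(send, SCHEMAS), so that every existing integration test also acts as a schema check.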

Implementation

  • Make sure that you handle error responses from other services or applications you depend on
  • Make sure that you set the correct response type in the Content-Type HTTP header
  • You should also handle API versioning; ideally this should be done at a higher level as well
  • Add support for X-Trace-Token, and pass it along on any further HTTP requests you make to other services
  • Also add X-Trace-Token to all log messages.
  • Ideally you could run a Zipkin-like tracing service to help with that
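
A minimal sketch of the X-Trace-Token handling described above, using only the Python standard library. The function names (begin_request, outgoing_headers, build_logger) and the log format are assumptions made for illustration; the idea is simply to store the token once per request, attach it to every outgoing call and stamp it on every log line.

```python
import logging
import uuid
from contextvars import ContextVar

TRACE_HEADER = "X-Trace-Token"

# Holds the token of the request currently being handled.
_trace_token = ContextVar("trace_token", default="-")


def begin_request(incoming_headers):
    """Reuse the caller's token, or mint a new one if none was sent."""
    # HTTP header names are case-insensitive, so compare in lower case.
    lowered = {key.lower(): value for key, value in incoming_headers.items()}
    token = lowered.get(TRACE_HEADER.lower()) or uuid.uuid4().hex
    _trace_token.set(token)
    return token


def outgoing_headers(extra=None):
    """Headers to attach to every HTTP request made to another service."""
    headers = dict(extra or {})
    headers[TRACE_HEADER] = _trace_token.get()
    return headers


class TraceTokenFilter(logging.Filter):
    """Copies the current token onto every log record."""

    def filter(self, record):
        record.trace_token = _trace_token.get()
        return True


def build_logger(name="service"):
    """Logger whose lines all carry the current trace token."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(levelname)s trace=%(trace_token)s %(message)s")
    )
    handler.addFilter(TraceTokenFilter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

Calling begin_request at the start of each request handler is enough; everything logged or sent downstream afterwards picks up the same token.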

Monitoring

  • Your services should have a standard health check endpoint
    • You should standardize on what data is shown there
    • Format should be readable by the monitoring infrastructure
    • During health-checking, the service should send ping requests to all services it depends on and report status of those connections
  • You should also have tools for instrumentation and metric collection
    • Tools like Prometheus, NewRelic, Grafana or similar can be very helpful here
  • Logging should be written to standard output
  • Error logs should be written to standard error
  • Those logs should be captured by the tooling around Docker containers (like Kubernetes) and forwarded to Kibana or a similar tool
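
As a rough sketch of the health check described above: each dependency is represented by a hypothetical "ping" callable that raises on failure (and is expected to enforce its own timeout), and the result is a JSON document a monitoring system can parse. The field names and the choice to answer with HTTP 200 even when a dependency is down are assumptions, not a standard.

```python
import json
import time


def check_dependencies(pings):
    """Ping every dependency and build a machine-readable report.

    `pings` maps a dependency name to a zero-argument callable that raises
    an exception when the dependency cannot be reached.
    """
    report = {}
    for name, ping in pings.items():
        started = time.monotonic()
        try:
            ping()
            entry = {"status": "ok"}
        except Exception as exc:
            entry = {"status": "down", "error": str(exc)}
        entry["latency_ms"] = round((time.monotonic() - started) * 1000, 1)
        report[name] = entry
    all_ok = all(entry["status"] == "ok" for entry in report.values())
    return {"status": "ok" if all_ok else "degraded", "dependencies": report}


def health_response(pings):
    """Return (status code, headers, body) for the health endpoint.

    Always answers 200: the service itself is up, and the body says which
    dependencies are not. This is a design assumption of the sketch.
    """
    body = json.dumps(check_dependencies(pings))
    return 200, {"Content-Type": "application/json"}, body
```

Whatever format you pick, the point is that every service exposes the same one, so the monitoring infrastructure needs only a single parser.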

Configuration

  • Make sure that you set sensible defaults for all configurable parameters
    • For example, the defaults should allow you to run the service on localhost for development
  • Configuration that changes in each environment (for example testing and production) should be read through environment variables
    • These variables can be set by Kubernetes or by an alternative tool
  • Configuration shouldn’t change while your service is running
    • It’s better to design applications that can restart quickly and apply new configuration than to have long-running processes that can change their config
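
The configuration rules above can be sketched as a single immutable settings object that is read from the environment once at startup. The variable names (SERVICE_HOST, SERVICE_PORT and so on) and the default values are made up for this example.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: configuration cannot change after startup
class Settings:
    # Defaults are chosen so the service runs on localhost for development.
    host: str = "127.0.0.1"
    port: int = 8080
    database_url: str = "postgres://localhost:5432/dev"
    request_timeout_s: float = 2.0


def load_settings(env=None):
    """Build Settings from environment variables, falling back to defaults."""
    env = os.environ if env is None else env
    defaults = Settings()
    return Settings(
        host=env.get("SERVICE_HOST", defaults.host),
        port=int(env.get("SERVICE_PORT", defaults.port)),
        database_url=env.get("DATABASE_URL", defaults.database_url),
        request_timeout_s=float(
            env.get("REQUEST_TIMEOUT_S", defaults.request_timeout_s)
        ),
    )
```

Applying a new configuration then means restarting the process with different environment variables, not mutating a running one.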

Resiliency

  • Set a reasonable timeout for all outgoing calls you make
    • Also consider implementing circuit breakers like Hystrix to improve resiliency even more by avoiding cascading failures
  • Make sure your application can continue running while services it depends on are down
    • Make sure your application doesn’t require any manual administration when dependencies go down and later come back up
  • Your service should start up even when dependencies are not available
    • For example, you shouldn’t make any pre-startup checks that require the database to be reachable
  • Make sure that increased rates or complexities of incoming requests won’t kill your application
    • Implement measures to protect your service from abuse
    • For example set a maximum page[limit] to avoid making heavy database calls or to limit response size
  • Set up an error reporting service
    • Services like Airbrake or Rollbar will notify you of any errors that your service generates
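
Two of the points above lend themselves to a short sketch: a circuit breaker around outgoing calls and a cap on page[limit]. This is a deliberately simplified illustration, not how Hystrix is implemented; the class name, thresholds and defaults are all assumptions.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected without contacting the dependency."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                # Fail fast instead of piling more load on a failing service.
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: let one trial call through (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


def clamp_page_limit(raw, default=20, maximum=100):
    """Turn a client-supplied page[limit] into a safe value."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return default
    if value < 1:
        return default
    return min(value, maximum)
```

The function passed to call should still carry its own timeout; the breaker only decides whether the call is attempted at all.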

Scaling

  • Services should follow shared-nothing practices
    • You shouldn’t directly modify state of other services or databases that you don’t ‘own’
    • You also shouldn’t allow other services to modify your internal state
  • Services should be effectively stateless
    • All durable state should exist in the database
    • Caching is OK, but your service should function correctly without it
  • It should be possible to start more copies of your service without modifying existing ones
  • Prefer horizontal scalability over vertical scalability
  • Don’t use mechanisms like sticky sessions
    • These can prevent you from spreading load evenly among instances of your service
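
To illustrate "caching is OK, but your service should function correctly without it", here is a small sketch in which the cache is purely an optimization. The cache object (anything with get and set methods) and the load_from_db callable are hypothetical placeholders for your own cache client and database access code.

```python
def fetch_with_optional_cache(key, cache, load_from_db):
    """Read through a cache, treating any cache failure as a miss.

    The database stays the source of truth, so a missing, empty or broken
    cache only makes the call slower, never incorrect.
    """
    try:
        cached = cache.get(key)
    except Exception:
        cached = None  # cache outage: fall through to the database
    if cached is not None:
        return cached

    value = load_from_db(key)
    try:
        cache.set(key, value)
    except Exception:
        pass  # failing to populate the cache must not fail the request
    return value
```

Because no instance relies on what is in its cache, any copy of the service can handle any request, which is what makes adding more copies safe.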

Other

  • The gap between testing and production environments should be as small as possible
    • Ideally these environments should differ only by environment variables and scaling
  • Set up a traffic mirroring service
    • A portion of your live production traffic could be sent over to testing environments
    • This will allow you to spot bugs more easily
  • One-off admin processes that need to be run during deployment should ideally be automated
    • Or at least those scripts should be bundled with your application
    • For example: database schema migrations