Docker healthchecks – things to consider

While working on a recent docker project I encountered an issue when it comes to health checks. You may define a health check for a container from within your Dockerfile, such as

HEALTHCHECK --interval=5s --timeout=5s --start-period=20s
         --retries=5 CMD wget --output-document=- 
         --quiet --tries=1 http://127.0.0.1

The syntax is quite simple to understand, you basically define how and when the health check should be executed and which command(s) to use. Breaking things down from the example

  • –interval=5s – run the check every 5 seconds
  • –timeout=5s – maximum time to wait for the health check to complete
  • –start-period=20s – allow some time for the process to be monitored to come to a functioning state, avoids false alarms during startup, in this case we allow 20 seconds
  • –retries=5 – only trigger an alarm after 5 unsuccessful tries (non zero exit code) – this helps avoiding false alarms in case the process is under heavy load
  • After the CMD follows the command to execute, this is done from inside the container, in the given example we check for the webserver to deliver something indicating the server is working properly

However every one should consider on how flexible the health check should be to match as many use cases as possible. This is especially true if you are dealing with web-based endpoints as pointed out in the example.

Continue reading