There are many cases where users want to monitor Docker running in their data center. Monitoring your infrastructure alerts your DevOps or Agile teams when problems arise, which is essential to any continuous integration or continuous delivery solution.
However, not every organization has the budget for enterprise software that provides this service. That’s where Nagios Core comes in. Nagios Core is free, open source software. It does not offer Docker monitoring out of the box, however, so that’s where this blog post can help. I will show how to create a custom plugin to monitor Docker with Nagios Core.
Prerequisites
- A working Nagios Core monitoring system with NRPE set up
- A running Docker stack
Steps for the client side
- Create a bash script, check_docker.sh, that collects information about the Docker status on the client server.
The script gathers information about the Docker environment. The variable INCIDENT holds any Docker services that are currently not up and running. The variable NUM_SERVICES counts all the services that have been created.
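The script itself is not reproduced here, so the following is only a minimal sketch of what check_docker.sh might look like, not the original. It assumes the stack runs as Docker swarm services (so that `docker service ls` lists them) and treats a service as down when its running replica count differs from the desired count. The function name check_docker, the output messages, and the UNKNOWN branch for a Docker daemon that cannot be queried are assumptions added for illustration.

```shell
#!/usr/bin/env bash
# check_docker.sh: hypothetical sketch of a Nagios plugin for Docker swarm services.
# Assumption: the stack runs as swarm services, so `docker service ls` can list them.

check_docker() {
    local listing name replicas _rest num_down=0

    # Ask Docker for one "<name> <running>/<desired>" line per service.
    if ! listing="$(docker service ls --format '{{.Name}} {{.Replicas}}' 2>/dev/null)"; then
        # Not in the original description: report UNKNOWN when Docker cannot be queried.
        echo "UNKNOWN - unable to query Docker services"
        return 3
    fi

    NUM_SERVICES=0   # number of services that have been created
    INCIDENT=""      # names of services whose running count differs from the desired count

    while read -r name replicas _rest; do
        [ -z "$name" ] && continue
        NUM_SERVICES=$((NUM_SERVICES + 1))
        if [ "${replicas%%/*}" != "${replicas##*/}" ]; then
            INCIDENT="${INCIDENT:+$INCIDENT }$name"
            num_down=$((num_down + 1))
        fi
    done <<< "$listing"

    if [ -n "$INCIDENT" ]; then
        echo "WARNING - $num_down of $NUM_SERVICES Docker services not fully running: $INCIDENT"
        return 1
    fi
    echo "OK - all $NUM_SERVICES Docker services running"
    return 0
}

# NRPE reads the script's exit status, which is the status of this last command.
check_docker
```

The first line of output is what Nagios displays as the status text, so the sketch keeps it to a single human-readable line.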
Nagios determines the status of a host or service by evaluating the return code from plugins. Here’s a list of valid return codes:
- 0 – Service is OK.
- 1 – Service has a WARNING.
- 2 – Service is in a CRITICAL status.
- 3 – Service status is UNKNOWN.
In the script above, exit code 1 (WARNING) or 0 (OK) is returned depending on how the condition evaluates.
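When testing a plugin by hand, it can help to translate its exit status into the state name Nagios would report. The helper below is a small sketch for that purpose; the function name nagios_state is hypothetical and not part of Nagios.

```shell
# Hypothetical helper: translate a plugin exit status into its Nagios state name.
nagios_state() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "INVALID (not a valid plugin return code)" ;;
    esac
}
```

For example, running the plugin and then calling `nagios_state $?` prints the state that corresponds to the plugin's return code.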
- Save check_docker.sh in /usr/lib/nagios/plugins (this directory may be different depending on your configuration).
- Modify the file's permissions to allow execution:
#> sudo chmod +x /usr/lib/nagios/plugins/check_docker.sh
- Add the command definition to the /etc/nagios/nrpe.cfg file:
#> echo "command[docker_services_check]=/usr/lib/nagios/plugins/check_docker.sh" >> /etc/nagios/nrpe.cfg
- Restart the NRPE listener:
#> sudo systemctl restart nagios-nrpe-server.service
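Appending to nrpe.cfg with echo adds a duplicate line every time the step is repeated. As a hedged sketch, the helper below appends the definition only when no line for that command name exists yet; the function name add_nrpe_command and its argument order are assumptions made for illustration.

```shell
# Hypothetical helper: append an NRPE command definition only if it is not already present.
# Arguments: <nrpe.cfg path> <command name> <plugin path>
add_nrpe_command() {
    local cfg="$1" name="$2" plugin="$3"
    # Look for an existing, uncommented definition of this command name.
    if grep -q "^command\[${name}\]=" "$cfg" 2>/dev/null; then
        echo "command[${name}] already defined in ${cfg}"
    else
        echo "command[${name}]=${plugin}" >> "$cfg"
    fi
}
```

It could be called as `add_nrpe_command /etc/nagios/nrpe.cfg docker_services_check /usr/lib/nagios/plugins/check_docker.sh`, run with sufficient privileges to write to the file.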
Steps for the server side
1. In the /usr/local/nagios/etc/objects/commands.cfg file, add the following lines:
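The exact lines are not reproduced in this post. As a sketch, a command definition for commands.cfg could look like the one below. It assumes that check_nrpe is installed in the plugin directory that the $USER1$ macro points to, and that the NRPE command is named docker_services_check, as registered in nrpe.cfg on the client.

```cfg
# Assumed definition: ask the NRPE agent on the monitored host to run docker_services_check.
define command {
    command_name    docker_services_check
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c docker_services_check
}
```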
2. In the /usr/local/nagios/etc/objects/linuxbox.cfg file, add the following lines:
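The exact lines are likewise not reproduced in this post. As a sketch, a service definition for linuxbox.cfg could look like the one below. The host name linuxbox and the service description are placeholders, generic-service is assumed to be the template from the sample Nagios configuration, and check_command must match a command name defined in commands.cfg.

```cfg
# Assumed definition: attach the Docker check to one monitored host.
define service {
    use                     generic-service
    host_name               linuxbox
    service_description     Docker Services
    check_command           docker_services_check
}
```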
Note: modify the host_name value based on your configuration.
3. Check the configuration and, if there are no errors or warnings, reload the service:
#> /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
#> sudo systemctl reload-or-restart nagios.service
These steps will help you move forward with monitoring Docker, letting you supervise hosts and their services in a much simpler way.