Monitor Consul using Prometheus and Grafana

Irene W
2 min read · Jun 29, 2021

Consul is a service discovery tool that lets services or VMs register themselves, and then provides DNS and HTTP interfaces for querying the state of the registered services/VMs.

Using Consul as DNS reduces the latency from client to server. Reference: https://medium.com/containers-on-aws/how-to-setup-service-discovery-in-elastic-container-service-3d18479959e6

Prometheus is a pull-based monitoring tool: it collects metrics by querying each target defined in its configuration.

Start Consul

touch /etc/consul/server/config.json

{
  "telemetry": {
    "prometheus_retention_time": "480h",
    "disable_hostname": true
  }
}

This enables Prometheus metrics through telemetry. Without it, requests to the metrics endpoint fail with errors like these:

415 Unsupported Media Type

Prometheus is not enabled since its retention time is not positive
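
The second error points at the retention setting: the Prometheus endpoint only works when prometheus_retention_time is a positive duration. As a rough illustration, here is a minimal Python sketch of a pre-flight check for the config file. The helper names (parse_duration_seconds, prometheus_enabled) are hypothetical, and the duration parser only handles simple h/m/s forms such as "480h", not every format Consul accepts.

```python
import json
import re

# Seconds per unit for the simple duration forms handled by this sketch.
_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_duration_seconds(text):
    """Parse durations such as '480h' or '1h30m' into seconds (simplified)."""
    parts = re.findall(r"(\d+)([hms])", text)
    # Reject strings that contain anything besides number+unit pairs.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    return sum(int(n) * _UNIT_SECONDS[u] for n, u in parts)


def prometheus_enabled(config_text):
    """Return True if the config sets a positive prometheus_retention_time."""
    telemetry = json.loads(config_text).get("telemetry", {})
    # Assumption for this sketch: a missing key is treated as zero retention.
    retention = telemetry.get("prometheus_retention_time", "0s")
    return parse_duration_seconds(retention) > 0


config = '{"telemetry": {"prometheus_retention_time": "480h", "disable_hostname": true}}'
print(prometheus_enabled(config))  # True
```

Running a check like this before restarting the agent catches a missing or zero retention time early, instead of finding out from the error at scrape time.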

If the configuration is correct, the endpoint returns metrics in Prometheus format:

curl http://127.0.0.1:8500/v1/agent/metrics\?format\=prometheus
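
The response uses the Prometheus text exposition format: comment lines starting with # (HELP and TYPE), followed by sample lines of the form name{labels} value. To make that concrete, here is a small Python sketch of a parser for it. It is deliberately simplified (no escaped quotes inside label values, timestamps ignored), and the metric names in the sample text are invented for illustration, not actual Consul output.

```python
import re

# Metric name, optional {labels}, then the value; a trailing timestamp is ignored.
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if match:
            name, raw_labels, value = match.groups()
            labels = dict(_LABEL.findall(raw_labels or ""))
            samples.append((name, labels, float(value)))
    return samples


# Invented sample text, only to show the shape of the format.
sample = """\
# HELP example_gauge An illustrative gauge
# TYPE example_gauge gauge
example_gauge 3
example_requests_total{method="GET",code="200"} 42
"""
for name, labels, value in parse_metrics(sample):
    print(name, labels, value)
```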

Configuration in Prometheus

Add a job under scrape_configs:

- job_name: consul
  honor_timestamps: true
  scrape_interval: 15s
  scrape_timeout: 10s
  metrics_path: '/v1/agent/metrics'
  scheme: http
  params:
    format: ["prometheus"]
  static_configs:
    - targets:
        - <consulserver1>:8500
        - <consulserver2>:8500
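
For each target, Prometheus combines the scheme, the target address, and metrics_path, and appends the entries under params as URL query parameters, which yields the same URL as the curl command earlier. The Python sketch below mimics that assembly so you can see which URLs the job will hit. It is an illustration, not Prometheus code, and the hostnames consul1.example and consul2.example are placeholders.

```python
from urllib.parse import urlencode


def scrape_urls(job):
    """Build one scrape URL per static target from a scrape job dict."""
    query = urlencode(job.get("params", {}), doseq=True)
    path = job.get("metrics_path", "/metrics")  # Prometheus default path
    scheme = job.get("scheme", "http")          # Prometheus default scheme
    urls = []
    for group in job.get("static_configs", []):
        for target in group.get("targets", []):
            url = f"{scheme}://{target}{path}"
            urls.append(f"{url}?{query}" if query else url)
    return urls


# The job above, expressed as a dict with placeholder hostnames.
job = {
    "job_name": "consul",
    "metrics_path": "/v1/agent/metrics",
    "scheme": "http",
    "params": {"format": ["prometheus"]},
    "static_configs": [
        {"targets": ["consul1.example:8500", "consul2.example:8500"]}
    ],
}
for url in scrape_urls(job):
    print(url)
```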

In Grafana, import dashboard ID 10642 to get a dashboard for Consul:

However, the native metrics are not intuitive for writing Prometheus alerting rules, so we expose Consul's state through consul_exporter instead:

docker pull prom/consul-exporter

docker run -d -p 9107:9107 prom/consul-exporter --consul.server=consulserver:8500

## test

curl -s http://localhost:9107/metrics | grep consul_catalog_service_node_healthy

Add the Prometheus rules

groups:
- name: consul.rules
  rules:
  - alert: ConsulServiceHealthcheckFailed
    expr: consul_catalog_service_node_healthy == 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: Consul service healthcheck failed (instance {{ $labels.instance }})
      description: "Service: `{{ $labels.service_name }}` Healthcheck: `{{ $labels.service_id }}`\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
  - alert: ConsulMissingMasterNode
    expr: consul_raft_peers < 3
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: Consul missing master node (instance {{ $labels.instance }})
      description: "Numbers of consul raft peers should be 3, in order to preserve quorum.\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
  - alert: ConsulAgentUnhealthy
    expr: consul_health_node_status{status="critical"} == 1
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: Consul agent unhealthy (instance {{ $labels.instance }})
      description: "A Consul agent is down\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
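
Each expr is a PromQL comparison that keeps only the series whose value satisfies it, and any series that remains raises the alert once it has held for the "for" duration. The Python sketch below imitates that filtering on hand-written samples to show when each of the three rules would match. It is not a PromQL engine, it ignores the "for" duration, and the sample values and label values are invented.

```python
import operator

# (metric name, required labels, comparison, threshold), mirroring the three exprs.
RULES = {
    "ConsulServiceHealthcheckFailed": (
        "consul_catalog_service_node_healthy", {}, operator.eq, 0),
    "ConsulMissingMasterNode": (
        "consul_raft_peers", {}, operator.lt, 3),
    "ConsulAgentUnhealthy": (
        "consul_health_node_status", {"status": "critical"}, operator.eq, 1),
}


def firing(samples):
    """Return {alert name: [labels of matching series]} for rules that match."""
    result = {}
    for alert, (metric, required, compare, threshold) in RULES.items():
        matches = [
            labels
            for name, labels, value in samples
            if name == metric
            and all(labels.get(k) == v for k, v in required.items())
            and compare(value, threshold)
        ]
        if matches:
            result[alert] = matches
    return result


# Invented samples: one failing service check, a full set of 3 raft peers,
# and a node whose "critical" status series is 0 (so not critical).
samples = [
    ("consul_catalog_service_node_healthy",
     {"service_name": "web", "service_id": "web-1"}, 0.0),
    ("consul_catalog_service_node_healthy",
     {"service_name": "db", "service_id": "db-1"}, 1.0),
    ("consul_raft_peers", {}, 3.0),
    ("consul_health_node_status", {"node": "n1", "status": "critical"}, 0.0),
]
print(firing(samples))
```

With these samples only ConsulServiceHealthcheckFailed matches, for the web-1 service instance. Dropping consul_raft_peers to 2 would also match ConsulMissingMasterNode.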

Add the exporter job to prometheus.yml, and we are ready to receive alerts from Consul!
