Spring Boot Monitoring. Actuator, Prometheus, Grafana

Oleksii Dushenin
May 29, 2021

In a previous post, Swagger was used to provide API documentation for a Spring Boot application. In this post, we will introduce Spring Boot monitoring in the form of Spring Boot Actuator, Prometheus, and Grafana. This stack allows you to monitor the state of the application based on a predefined set of metrics, and dashboards help you present the data in a more meaningful and convenient way.

Spring Boot Actuator

Let’s start with Spring Boot Actuator. It brings a number of features for monitoring and managing a Spring Boot application, such as the health, info, and metrics endpoints. The full list of provided endpoints can be found at Spring Boot Actuator Endpoints.

As usual, the corresponding Spring Boot starter has to be added.

implementation 'org.springframework.boot:spring-boot-starter-actuator'

Let’s also change application.yml. We want to expose all available Spring Boot Actuator endpoints.

management:
  endpoints:
    web:
      exposure:
        include: '*'

That is all we have to do to enable Spring Boot Actuator endpoints.
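Exposing everything with '*' is convenient for local development, but endpoints such as heapdump and env can leak sensitive information. If you need a more restrictive setup, the exposure list also accepts specific endpoint ids, for example (the prometheus endpoint is added later in this post):

management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus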

The base endpoint for Spring Boot Actuator is:

http://localhost:8080/actuator

It responds with available actuator endpoints.

{
  "_links": {
    "self": {
      "href": "http://localhost:8080/actuator",
      "templated": false
    },
    "beans": {
      "href": "http://localhost:8080/actuator/beans",
      "templated": false
    },
    "caches-cache": {
      "href": "http://localhost:8080/actuator/caches/{cache}",
      "templated": true
    },
    "caches": {
      "href": "http://localhost:8080/actuator/caches",
      "templated": false
    },
    "health": {
      "href": "http://localhost:8080/actuator/health",
      "templated": false
    },
    "health-path": {
      "href": "http://localhost:8080/actuator/health/{*path}",
      "templated": true
    },
    "info": {
      "href": "http://localhost:8080/actuator/info",
      "templated": false
    },
    "conditions": {
      "href": "http://localhost:8080/actuator/conditions",
      "templated": false
    },
    "configprops": {
      "href": "http://localhost:8080/actuator/configprops",
      "templated": false
    },
    "env-toMatch": {
      "href": "http://localhost:8080/actuator/env/{toMatch}",
      "templated": true
    },
    "env": {
      "href": "http://localhost:8080/actuator/env",
      "templated": false
    },
    "liquibase": {
      "href": "http://localhost:8080/actuator/liquibase",
      "templated": false
    },
    "loggers": {
      "href": "http://localhost:8080/actuator/loggers",
      "templated": false
    },
    "loggers-name": {
      "href": "http://localhost:8080/actuator/loggers/{name}",
      "templated": true
    },
    "heapdump": {
      "href": "http://localhost:8080/actuator/heapdump",
      "templated": false
    },
    "threaddump": {
      "href": "http://localhost:8080/actuator/threaddump",
      "templated": false
    },
    "metrics-requiredMetricName": {
      "href": "http://localhost:8080/actuator/metrics/{requiredMetricName}",
      "templated": true
    },
    "metrics": {
      "href": "http://localhost:8080/actuator/metrics",
      "templated": false
    },
    "scheduledtasks": {
      "href": "http://localhost:8080/actuator/scheduledtasks",
      "templated": false
    },
    "mappings": {
      "href": "http://localhost:8080/actuator/mappings",
      "templated": false
    }
  }
}

The health endpoint reports whether the application is healthy.
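By default the response body only contains the overall status; assuming all checks pass, it looks like this:

{
  "status": "UP"
}

Setting management.endpoint.health.show-details: always additionally lists the individual components (database, disk space, and so on).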

Custom application health checks can be added as well, for example by registering a HealthIndicator bean.
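A minimal sketch of such a check, assuming a hypothetical external service we want to verify:

import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component
public class ExternalServiceHealthIndicator implements HealthIndicator {

    @Override
    public Health health() {
        boolean reachable = pingExternalService(); // hypothetical check
        if (reachable) {
            return Health.up().withDetail("externalService", "reachable").build();
        }
        return Health.down().withDetail("externalService", "unreachable").build();
    }

    private boolean pingExternalService() {
        // placeholder: a real check would call the dependency here
        return true;
    }
}

The indicator is picked up automatically and its status is aggregated into /actuator/health.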

Let’s use it as the app service health check in docker-compose.yml instead of http://localhost:8080/items.

healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:8080/actuator/health"]
  interval: 10s
  timeout: 10s
  retries: 10

The metrics endpoint returns the following data. Custom metrics can also be added; see the Spring Boot Actuator documentation and the short sketch below the list.

{
  "names": [
    "hikaricp.connections",
    "hikaricp.connections.acquire",
    "hikaricp.connections.active",
    "hikaricp.connections.creation",
    "hikaricp.connections.idle",
    "hikaricp.connections.max",
    "hikaricp.connections.min",
    "hikaricp.connections.pending",
    "hikaricp.connections.timeout",
    "hikaricp.connections.usage",
    "http.server.requests",
    "jdbc.connections.active",
    "jdbc.connections.idle",
    "jdbc.connections.max",
    "jdbc.connections.min",
    "jvm.buffer.count",
    "jvm.buffer.memory.used",
    "jvm.buffer.total.capacity",
    "jvm.classes.loaded",
    "jvm.classes.unloaded",
    "jvm.gc.live.data.size",
    "jvm.gc.max.data.size",
    "jvm.gc.memory.allocated",
    "jvm.gc.memory.promoted",
    "jvm.gc.pause",
    "jvm.memory.committed",
    "jvm.memory.max",
    "jvm.memory.used",
    "jvm.threads.daemon",
    "jvm.threads.live",
    "jvm.threads.peak",
    "jvm.threads.states",
    "logback.events",
    "process.cpu.usage",
    "process.files.max",
    "process.files.open",
    "process.start.time",
    "process.uptime",
    "system.cpu.count",
    "system.cpu.usage",
    "system.load.average.1m",
    "tomcat.sessions.active.current",
    "tomcat.sessions.active.max",
    "tomcat.sessions.alive.max",
    "tomcat.sessions.created",
    "tomcat.sessions.expired",
    "tomcat.sessions.rejected"
  ]
}

As you can see, Hikari connection pool, JDBC, and JVM metrics, among others, are automatically added to the metrics endpoint.
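As mentioned above, custom metrics can be registered through Micrometer’s MeterRegistry, which Spring Boot auto-configures. A minimal sketch, assuming a hypothetical counter for created items:

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.stereotype.Service;

@Service
public class ItemMetrics {

    private final Counter itemsCreated;

    public ItemMetrics(MeterRegistry registry) {
        // appears as items.created in /actuator/metrics
        // and as items_created_total in the Prometheus endpoint introduced below
        this.itemsCreated = Counter.builder("items.created")
                .description("Number of items created")
                .register(registry);
    }

    public void recordItemCreated() {
        itemsCreated.increment();
    }
}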

Let’s check one of the automatically provided metrics, e.g. hikaricp.connections (the number of connections in the pool).

http://localhost:8080/actuator/metrics/hikaricp.connections

It responds with values at the moment of request execution.

{
  "name": "hikaricp.connections",
  "description": "Total connections",
  "baseUnit": null,
  "measurements": [
    {
      "statistic": "VALUE",
      "value": 10.0
    }
  ],
  "availableTags": [
    {
      "tag": "pool",
      "values": [
        "HikariPool-1"
      ]
    }
  ]
}

This metric shows us that there are ten connections in the pool.
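A metric can also be drilled down by the tags listed under availableTags. For example, using the auto-generated pool name HikariPool-1 shown above, the tag is passed as a query parameter:

http://localhost:8080/actuator/metrics/hikaricp.connections?tag=pool:HikariPool-1

The response then contains the measurements for that pool only.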

Prometheus

In the previous chapter, we were able to retrieve a metric value at the moment of request execution. However, what we are interested in is the dynamics of the metric over time. We can use Prometheus for this purpose.

Prometheus is a monitoring tool that works with time series data and uses a pull model: it periodically scrapes metrics from an endpoint exposed by the application. Prometheus also expects a specific text format, so we have to expose our metrics in that format.

Micrometer can expose the Actuator metrics in a format Prometheus understands. It is done by adding the following dependency.

implementation 'io.micrometer:micrometer-registry-prometheus:1.7.0'

It results in a new endpoint:

http://localhost:8080/actuator/prometheus

The data is in a format Prometheus can process, and this is the endpoint Prometheus will scrape.

# HELP jvm_gc_memory_promoted_bytes_total Count of positive increases in the size of the old generation memory pool before GC to after GC
# TYPE jvm_gc_memory_promoted_bytes_total counter
jvm_gc_memory_promoted_bytes_total 1.2264144E7
# HELP hikaricp_connections_usage_seconds Connection usage time
# TYPE hikaricp_connections_usage_seconds summary
hikaricp_connections_usage_seconds_count{pool="HikariPool-1",} 2.0
hikaricp_connections_usage_seconds_sum{pool="HikariPool-1",} 0.006
# HELP hikaricp_connections_usage_seconds_max Connection usage time
# TYPE hikaricp_connections_usage_seconds_max gauge
hikaricp_connections_usage_seconds_max{pool="HikariPool-1",} 0.006
# HELP jvm_threads_states_threads The current number of threads having NEW state
# TYPE jvm_threads_states_threads gauge
jvm_threads_states_threads{state="runnable",} 12.0
jvm_threads_states_threads{state="blocked",} 0.0
jvm_threads_states_threads{state="waiting",} 14.0
jvm_threads_states_threads{state="timed-waiting",} 12.0
jvm_threads_states_threads{state="new",} 0.0
jvm_threads_states_threads{state="terminated",} 0.0
# HELP jvm_buffer_memory_used_bytes An estimate of the memory that the Java virtual machine is using for this buffer pool
# TYPE jvm_buffer_memory_used_bytes gauge
jvm_buffer_memory_used_bytes{id="mapped",} 0.0
jvm_buffer_memory_used_bytes{id="direct",} 1.6785409E7
# HELP system_load_average_1m The sum of the number of runnable entities queued to available processors and the number of runnable entities running on the available processors averaged over a period of time
# TYPE system_load_average_1m gauge
system_load_average_1m 2.25
# HELP jdbc_connections_min Minimum number of idle connections in the pool.
# TYPE jdbc_connections_min gauge
jdbc_connections_min{name="dataSource",} 10.0
# HELP tomcat_sessions_expired_sessions_total
# TYPE tomcat_sessions_expired_sessions_total counter
tomcat_sessions_expired_sessions_total 0.0
# HELP jvm_buffer_total_capacity_bytes An estimate of the total capacity of the buffers in this pool
# TYPE jvm_buffer_total_capacity_bytes gauge
jvm_buffer_total_capacity_bytes{id="mapped",} 0.0
jvm_buffer_total_capacity_bytes{id="direct",} 1.6785408E7
# HELP jvm_threads_peak_threads The peak live thread count since the Java virtual machine started or peak was reset
# TYPE jvm_threads_peak_threads gauge
jvm_threads_peak_threads 40.0
# HELP process_files_open_files The open file descriptor count
# TYPE process_files_open_files gauge
process_files_open_files 159.0
# HELP hikaricp_connections_creation_seconds_max Connection creation time
# TYPE hikaricp_connections_creation_seconds_max gauge
hikaricp_connections_creation_seconds_max{pool="HikariPool-1",} 0.0
# HELP hikaricp_connections_creation_seconds Connection creation time
# TYPE hikaricp_connections_creation_seconds summary
hikaricp_connections_creation_seconds_count{pool="HikariPool-1",} 0.0
hikaricp_connections_creation_seconds_sum{pool="HikariPool-1",} 0.0
# HELP jvm_gc_pause_seconds Time spent in GC pause
# TYPE jvm_gc_pause_seconds summary
jvm_gc_pause_seconds_count{action="end of minor GC",cause="Metadata GC Threshold",} 1.0
jvm_gc_pause_seconds_sum{action="end of minor GC",cause="Metadata GC Threshold",} 0.019
jvm_gc_pause_seconds_count{action="end of minor GC",cause="G1 Evacuation Pause",} 2.0
jvm_gc_pause_seconds_sum{action="end of minor GC",cause="G1 Evacuation Pause",} 0.045
# HELP jvm_gc_pause_seconds_max Time spent in GC pause
# TYPE jvm_gc_pause_seconds_max gauge
jvm_gc_pause_seconds_max{action="end of minor GC",cause="Metadata GC Threshold",} 0.019
jvm_gc_pause_seconds_max{action="end of minor GC",cause="G1 Evacuation Pause",} 0.023
# HELP hikaricp_connections_timeout_total Connection timeout total count
# TYPE hikaricp_connections_timeout_total counter
hikaricp_connections_timeout_total{pool="HikariPool-1",} 0.0
# HELP hikaricp_connections_idle Idle connections
# TYPE hikaricp_connections_idle gauge
hikaricp_connections_idle{pool="HikariPool-1",} 10.0
# HELP jvm_memory_committed_bytes The amount of memory in bytes that is committed for the Java virtual machine to use
# TYPE jvm_memory_committed_bytes gauge
jvm_memory_committed_bytes{area="heap",id="G1 Survivor Space",} 2.097152E7
jvm_memory_committed_bytes{area="heap",id="G1 Old Gen",} 2.7787264E8
jvm_memory_committed_bytes{area="nonheap",id="Metaspace",} 8.6331392E7
jvm_memory_committed_bytes{area="nonheap",id="CodeHeap 'non-nmethods'",} 2555904.0
jvm_memory_committed_bytes{area="heap",id="G1 Eden Space",} 2.26492416E8
jvm_memory_committed_bytes{area="nonheap",id="Compressed Class Space",} 1.245184E7
jvm_memory_committed_bytes{area="nonheap",id="CodeHeap 'non-profiled nmethods'",} 1.1337728E7
# HELP jvm_gc_live_data_size_bytes Size of long-lived heap memory pool after reclamation
# TYPE jvm_gc_live_data_size_bytes gauge
jvm_gc_live_data_size_bytes 0.0
# HELP hikaricp_connections_pending Pending threads
# TYPE hikaricp_connections_pending gauge
hikaricp_connections_pending{pool="HikariPool-1",} 0.0
# HELP jvm_threads_daemon_threads The current number of live daemon threads
# TYPE jvm_threads_daemon_threads gauge
jvm_threads_daemon_threads 34.0
# HELP jdbc_connections_idle Number of established but idle connections.
# TYPE jdbc_connections_idle gauge
jdbc_connections_idle{name="dataSource",} 10.0
# HELP tomcat_sessions_alive_max_seconds
# TYPE tomcat_sessions_alive_max_seconds gauge
tomcat_sessions_alive_max_seconds 0.0
# HELP hikaricp_connections_min Min connections
# TYPE hikaricp_connections_min gauge
hikaricp_connections_min{pool="HikariPool-1",} 10.0
# HELP jvm_classes_unloaded_classes_total The total number of classes unloaded since the Java virtual machine has started execution
# TYPE jvm_classes_unloaded_classes_total counter
jvm_classes_unloaded_classes_total 0.0
# HELP process_uptime_seconds The uptime of the Java virtual machine
# TYPE process_uptime_seconds gauge
process_uptime_seconds 35.926
# HELP jdbc_connections_active Current number of active connections that have been allocated from the data source.
# TYPE jdbc_connections_active gauge
jdbc_connections_active{name="dataSource",} 0.0
# HELP process_cpu_usage The "recent cpu usage" for the Java Virtual Machine process
# TYPE process_cpu_usage gauge
process_cpu_usage 0.0
# HELP hikaricp_connections Total connections
# TYPE hikaricp_connections gauge
hikaricp_connections{pool="HikariPool-1",} 10.0
# HELP hikaricp_connections_active Active connections
# TYPE hikaricp_connections_active gauge
hikaricp_connections_active{pool="HikariPool-1",} 0.0
# HELP process_files_max_files The maximum file descriptor count
# TYPE process_files_max_files gauge
process_files_max_files 1048576.0
# HELP jvm_memory_used_bytes The amount of used memory
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{area="heap",id="G1 Survivor Space",} 2.097152E7
jvm_memory_used_bytes{area="heap",id="G1 Old Gen",} 1.6268288E7
jvm_memory_used_bytes{area="nonheap",id="Metaspace",} 8.3057392E7
jvm_memory_used_bytes{area="nonheap",id="CodeHeap 'non-nmethods'",} 1297920.0
jvm_memory_used_bytes{area="heap",id="G1 Eden Space",} 8.388608E7
jvm_memory_used_bytes{area="nonheap",id="Compressed Class Space",} 1.1147696E7
jvm_memory_used_bytes{area="nonheap",id="CodeHeap 'non-profiled nmethods'",} 1.1308544E7
# HELP jvm_memory_max_bytes The maximum amount of memory in bytes that can be used for memory management
# TYPE jvm_memory_max_bytes gauge
jvm_memory_max_bytes{area="heap",id="G1 Survivor Space",} -1.0
jvm_memory_max_bytes{area="heap",id="G1 Old Gen",} 4.169138176E9
jvm_memory_max_bytes{area="nonheap",id="Metaspace",} -1.0
jvm_memory_max_bytes{area="nonheap",id="CodeHeap 'non-nmethods'",} 6975488.0
jvm_memory_max_bytes{area="heap",id="G1 Eden Space",} -1.0
jvm_memory_max_bytes{area="nonheap",id="Compressed Class Space",} 1.073741824E9
jvm_memory_max_bytes{area="nonheap",id="CodeHeap 'non-profiled nmethods'",} 2.44682752E8
# HELP tomcat_sessions_active_current_sessions
# TYPE tomcat_sessions_active_current_sessions gauge
tomcat_sessions_active_current_sessions 0.0
# HELP jvm_threads_live_threads The current number of live threads including both daemon and non-daemon threads
# TYPE jvm_threads_live_threads gauge
jvm_threads_live_threads 38.0
# HELP jvm_classes_loaded_classes The number of classes that are currently loaded in the Java virtual machine
# TYPE jvm_classes_loaded_classes gauge
jvm_classes_loaded_classes 15686.0
# HELP jdbc_connections_max Maximum number of active connections that can be allocated at the same time.
# TYPE jdbc_connections_max gauge
jdbc_connections_max{name="dataSource",} 10.0
# HELP tomcat_sessions_active_max_sessions
# TYPE tomcat_sessions_active_max_sessions gauge
tomcat_sessions_active_max_sessions 0.0
# HELP jvm_gc_memory_allocated_bytes_total Incremented for an increase in the size of the (young) heap memory pool after one GC to before the next
# TYPE jvm_gc_memory_allocated_bytes_total counter
jvm_gc_memory_allocated_bytes_total 4.0894464E8
# HELP jvm_gc_max_data_size_bytes Max size of long-lived heap memory pool
# TYPE jvm_gc_max_data_size_bytes gauge
jvm_gc_max_data_size_bytes 4.169138176E9
# HELP system_cpu_usage The "recent cpu usage" for the whole system
# TYPE system_cpu_usage gauge
system_cpu_usage 0.0
# HELP jvm_buffer_count_buffers An estimate of the number of buffers in the pool
# TYPE jvm_buffer_count_buffers gauge
jvm_buffer_count_buffers{id="mapped",} 0.0
jvm_buffer_count_buffers{id="direct",} 3.0
# HELP tomcat_sessions_rejected_sessions_total
# TYPE tomcat_sessions_rejected_sessions_total counter
tomcat_sessions_rejected_sessions_total 0.0
# HELP hikaricp_connections_max Max connections
# TYPE hikaricp_connections_max gauge
hikaricp_connections_max{pool="HikariPool-1",} 10.0
# HELP process_start_time_seconds Start time of the process since unix epoch.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.62229726525E9
# HELP logback_events_total Number of error level events that made it to the logs
# TYPE logback_events_total counter
logback_events_total{level="warn",} 1.0
logback_events_total{level="debug",} 0.0
logback_events_total{level="error",} 0.0
logback_events_total{level="trace",} 0.0
logback_events_total{level="info",} 20.0
# HELP system_cpu_count The number of processors available to the Java virtual machine
# TYPE system_cpu_count gauge
system_cpu_count 4.0
# HELP hikaricp_connections_acquire_seconds Connection acquire time
# TYPE hikaricp_connections_acquire_seconds summary
hikaricp_connections_acquire_seconds_count{pool="HikariPool-1",} 2.0
hikaricp_connections_acquire_seconds_sum{pool="HikariPool-1",} 0.019018259
# HELP hikaricp_connections_acquire_seconds_max Connection acquire time
# TYPE hikaricp_connections_acquire_seconds_max gauge
hikaricp_connections_acquire_seconds_max{pool="HikariPool-1",} 0.019007214
# HELP tomcat_sessions_created_sessions_total
# TYPE tomcat_sessions_created_sessions_total counter
tomcat_sessions_created_sessions_total 0.0

As already mentioned, Prometheus fetches data from a specified endpoint, so we have to provide some configuration.

The basic configuration (check the Prometheus documentation) is created in the root of the application in the prometheus.yml file.

global:
  scrape_interval: 5s

scrape_configs:
  - job_name: 'spring_boot_prometheus'
    metrics_path: '/actuator/prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['app:8080']

We simply create a job that fetches data from http://app:8080/actuator/prometheus every 5 seconds. As a result, a metric at a single point in time turns into a time series. app is the name of our application service in docker-compose.yml.

Let’s add Prometheus to docker-compose.yml. It is done with the following service configuration.

prometheus:
  container_name: sbs_prometheus
  image: prom/prometheus:v2.27.1
  ports:
    - 9090:9090
  networks:
    - spring-boot-simple
  healthcheck:
    test: [ "CMD", "curl", "-f", "http://localhost:9090" ]
    interval: 10s
    timeout: 10s
    retries: 10
  volumes:
    - ./prometheus.yml:/etc/prometheus/prometheus.yml

As you can see, we mount the prometheus.yml file created earlier as the Prometheus configuration. Prometheus uses port 9090, so we add it to the port mapping.

Let’s start the application with:

docker-compose up --build

After the application starts, go to

http://localhost:9090/

You will see the Prometheus home screen.

Let’s check that our configuration was applied successfully. Use the menu to go to Status > Targets.

As you can see, the configuration was applied successfully. If the state is not UP, please wait a little bit.

Go to Graph, enter e.g. jvm_memory_used_bytes in the expression field, and press Graph again. We could also use hikaricp_connections, but its value would always be 10.

We now have a JVM memory usage graph for our Spring Boot application.
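The expression field accepts full PromQL, so individual series can be combined. A couple of illustrative queries based on the metrics shown earlier (the second one assumes the application has already served some HTTP requests):

# total heap usage across all heap memory pools
sum(jvm_memory_used_bytes{area="heap"})

# per-second request rate over the last minute, one series per uri/status combination
rate(http_server_requests_seconds_count[1m])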

Grafana

Grafana provides rich monitoring visualization, while Prometheus offers only basic graphing. There are a lot of publicly available dashboards that can be easily imported. Grafana can pull data from various data sources such as Prometheus, ElasticSearch, PostgreSQL, etc. Grafana is the next step in Spring Boot Monitoring.

Let’s add Grafana to docker-compose.yml.

grafana:
  container_name: sbs_grafana
  image: grafana/grafana:7.5.7
  ports:
    - 3000:3000
  networks:
    - spring-boot-simple
  healthcheck:
    test: [ "CMD", "curl", "-f", "http://localhost:3000" ]
    interval: 10s
    timeout: 10s
    retries: 10

It will be available at port 3000.

Restart the application again.

docker-compose up --build

Go to

http://localhost:3000/

Use admin/admin credentials. You will be asked to enter a new password.

After that, you will see the Grafana home screen.

We need to add Prometheus as a data source. Press DATA SOURCES and choose Prometheus.

We need to specify the Prometheus URL. We can’t use localhost here because Grafana itself runs inside a Docker container, so localhost would point to the Grafana container rather than the host. We need the host’s IP address instead; you can use ifconfig or a similar tool to find it out. In my case, it is 192.168.0.106.
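Note that, since both containers are attached to the same spring-boot-simple network in docker-compose.yml, the compose service name should also resolve from inside the Grafana container, so http://prometheus:9090 is likely to work as the data source URL as well.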

Press Save & Test. After that, go to the main page and press DASHBOARDS. Hover over the + sign and press Import.

Choose an existing dashboard. We will use two of them: JVM Micrometer and Spring Boot HikariCP.

Enter the id of the dashboard (it can be found in the dashboard URL): JVM Micrometer is 4701, Spring Boot HikariCP is 6083.

Press Load. After that, choose our Prometheus data source.

Press Import. Do the same for Spring Boot HikariCP.

As a result, we will have two dashboards (you can set a smaller time range for your dashboards to get a better view).

JVM Micrometer

Spring Boot HikariCP

For the full set of features, please check the documentation. For example, it is possible to set up alerts.

Summary

To sum up, this post introduced Spring Boot monitoring based on Spring Boot Actuator, Micrometer, Prometheus, and Grafana.

The whole code can be found on GitHub in the v7.0.0-monitoring branch.

Originally published at https://datamify.com on May 29, 2021.
