Introduction to Prometheus and Grafana

Prometheus is an open source, metrics-based monitoring system. It collects data from services and hosts by sending HTTP requests to their metrics endpoints. The results are stored in a time series database and made available for analysis and alerting.

Why monitor?

Enables alerts when a problem occurs, or preferably before it occurs, so that someone can look into it.
Provides insights that enable you to analyze, debug, and resolve issues.
Shows trends and changes over time, such as the number of active sessions at a given time. This helps with design decisions and capacity planning.

Monitoring is usually related to events. Events include receiving an HTTP request, sending a response, reading from disk, logging in a user, etc. System monitoring includes profiling, logging, tracing, metrics, alerts, and visualization.

Black box and white box monitoring

Monitoring falls into two main categories:

Black box monitoring

Black box monitoring observes a system from the outside, at the application or host level, without any knowledge of its internals. This can be quite limiting.

White box monitoring

White box monitoring means monitoring the internals of a service. Data about the health and performance of internal components may be exposed.

The four golden signals

According to Google, if you can only measure four metrics in your user-facing system, focus on these four, known as the "Four Golden Signals."

#1.Latency

The time it takes to process a request, whether it succeeds or fails. It is important to track the latency of failed requests as well as successful ones.

#2.Traffic

A measure of how much demand is placed on the system. For web services, this is typically HTTP requests per second.

#3.Errors

The rate of requests that fail.

#4.Saturation

How full your service is. Increased latency is often a leading indicator of saturation. Many systems experience degraded performance long before they reach 100% utilization.
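As a rough illustration, the first three signals can be derived from a window of request records. This is a minimal sketch, not part of Prometheus: the `Request` record, the `golden_signals` function, and the choice to treat HTTP 5xx responses as failures are all assumptions made for the example. Saturation is left out because it depends on resource limits rather than on request records.

```python
from dataclasses import dataclass


@dataclass
class Request:
    duration_s: float  # time taken to serve the request, in seconds
    status: int        # HTTP status code of the response


def golden_signals(requests, window_s):
    """Summarize latency, traffic, and errors for one time window."""
    total = len(requests)
    # Assumption: only 5xx responses count as failed requests.
    failed = sum(1 for r in requests if r.status >= 500)
    return {
        "avg_latency_s": sum(r.duration_s for r in requests) / total if total else 0.0,
        "traffic_rps": total / window_s,
        "error_ratio": failed / total if total else 0.0,
    }


# Four requests observed during a two-second window.
sample = [Request(0.1, 200), Request(0.3, 200), Request(0.2, 500), Request(0.2, 200)]
signals = golden_signals(sample, window_s=2.0)
```

Here `signals["traffic_rps"]` is 2.0 and `signals["error_ratio"]` is 0.25, since one request in four failed.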

Types of Prometheus metrics

There are four main types of Prometheus metrics:

#1.Counter

The value of a counter only ever increases. It cannot decrease, but it resets to zero when the process restarts. Because the value is cumulative, a failed scrape only means that one data point is lost; the increase accumulated in the meantime is still visible on the next scrape. Examples:

Total number of HTTP requests received
Number of exceptions.
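The sketch below shows why a missed scrape does not lose any of a counter's increase, and how a reset to zero can be tolerated. It is a simplified stand-in for what PromQL's `increase()` does (without its extrapolation); the function name `counter_increase` and the sample values are made up for the example.

```python
def counter_increase(samples):
    """Total increase across scraped counter values, tolerating resets.

    A drop between two consecutive samples is treated as a reset to zero,
    so the whole new value counts as increase since the reset.
    """
    increase = 0.0
    for prev, cur in zip(samples, samples[1:]):
        increase += (cur - prev) if cur >= prev else cur
    return increase


# A scrape was missed between 10 and 40, yet the total increase is intact.
steady = counter_increase([0, 10, 40, 55])

# The process restarted after 55, so the counter started again from zero.
with_reset = counter_increase([40, 55, 5, 12])
```

`steady` is 55.0, and `with_reset` is 27.0 (15 before the restart, then 5 and 7 after it).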

#2.Gauge

A gauge is a snapshot of a value at a point in time. It can increase or decrease. If a scrape fails, that sample is lost, and the next scrape may show a different value. Examples include disk space and memory usage.

#3.Histogram

A histogram samples observations and counts them in configurable buckets. Histograms are used for things like request durations and response sizes. For example, you can measure how long a particular HTTP request takes. The histogram has a series of buckets with upper bounds such as 1 ms, 10 ms, and 25 ms. Rather than storing every duration for every request, Prometheus stores the number of requests that fall into each bucket.
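A toy version of the bucket counting might look like the following. Prometheus histogram buckets are cumulative: each bucket counts every observation less than or equal to its upper bound, and a final `+Inf` bucket counts everything. The `Histogram` class here is a hypothetical illustration, not the client library's implementation.

```python
import math


class Histogram:
    """Toy histogram with cumulative buckets keyed by upper bound (seconds)."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds) + [math.inf]  # +Inf bucket catches all
        self.counts = [0] * len(self.bounds)
        self.total = 0.0  # sum of all observed values

    def observe(self, value):
        self.total += value
        # Cumulative: every bucket whose bound is >= value is incremented.
        for i, bound in enumerate(self.bounds):
            if value <= bound:
                self.counts[i] += 1


# Buckets at 1 ms, 10 ms, and 25 ms, as in the text above.
h = Histogram([0.001, 0.010, 0.025])
for duration in (0.0005, 0.004, 0.008, 0.020, 0.300):
    h.observe(duration)
```

After these five observations `h.counts` is `[1, 3, 4, 5]`: one request finished within 1 ms, three within 10 ms, four within 25 ms, and all five fall under `+Inf`.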

#4.Summary

Like a histogram, a summary samples observations, typically request durations or response sizes. It exposes the total count of observations and the sum of all observed values, which lets you calculate the average. For example, suppose three requests arrive in one minute and take 2, 3, and 4 seconds. The sum is 9 and the count is 3, so the average latency is 3 seconds.
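The arithmetic in that example can be written out as a small sketch. The `Summary` class below is a hypothetical illustration that tracks only the count and the sum; it is not the client library's summary type.

```python
class Summary:
    """Toy summary that tracks only the count and sum of observations."""

    def __init__(self):
        self.count = 0
        self.sum = 0.0

    def observe(self, value):
        self.count += 1
        self.sum += value

    def average(self):
        # Average = sum of observations / number of observations.
        return self.sum / self.count if self.count else 0.0


latency = Summary()
for seconds in (2, 3, 4):
    latency.observe(seconds)
```

With the three observations above, `latency.sum` is 9.0, `latency.count` is 3, and `latency.average()` returns 3.0.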

Components of the Prometheus Ecosystem

Prometheus server

The Prometheus server collects and stores metrics, makes them available for queries, and sends alerts based on the metrics it collects.

Scraping

Prometheus is a pull-based system. To retrieve metrics, Prometheus sends an HTTP request called a scrape. It sends scrapes to targets based on its configuration.

Each target (statically defined or dynamically discovered) is scraped at a regular interval (the scrape interval). Each scrape reads the /metrics HTTP endpoint to obtain the current state of the client's metrics and saves the values to the Prometheus time series database.
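To make the scrape result concrete, the sketch below parses a small response in the Prometheus text exposition format into series and values. It is deliberately simplified and is not how the server itself parses: it skips `# HELP` and `# TYPE` comment lines and assumes samples carry no timestamps. The `parse_metrics` function and the sample metrics are made up for the example.

```python
def parse_metrics(text):
    """Parse exposition-format text into a {series: value} dict.

    Simplified: ignores comment lines and assumes no timestamps.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated field on the line.
        series, value = line.rsplit(None, 1)
        samples[series] = float(value)
    return samples


# What the body of a /metrics response might look like.
scraped = """# HELP http_requests_total Total number of HTTP requests received.
# TYPE http_requests_total counter
http_requests_total{method="get",code="200"} 1027
http_requests_total{method="post",code="500"} 3
# TYPE process_open_fds gauge
process_open_fds 12
"""
samples = parse_metrics(scraped)
```

Each distinct combination of metric name and labels becomes its own time series, so this scrape yields three samples.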

There are other time series databases for monitoring solutions that you may want to consider.

Client libraries

To monitor your service, you need to add instrumentation to your code. Client libraries are available for all popular languages and runtimes, and they let your code start producing metrics with just a few lines. This is called direct instrumentation. The libraries allow you to define internal metrics and expose them via an HTTP endpoint. When Prometheus scrapes that endpoint, the client library returns the current state of the metrics to the server.
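To show what a client library does on your behalf, here is a hand-rolled stand-in that uses only the Python standard library. In a real service you would use the Prometheus client library for your language instead. The metric name `app_requests_total`, the `REQUESTS_TOTAL` counter, the `serve` helper, and port 8000 are all assumptions made for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUESTS_TOTAL = {"value": 0}  # hypothetical in-process counter


def render_metrics():
    """Render the current metric state in the text exposition format."""
    return (
        "# HELP app_requests_total Total requests handled.\n"
        "# TYPE app_requests_total counter\n"
        f"app_requests_total {REQUESTS_TOTAL['value']}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/metrics":
            # A scrape: return the current state of all metrics.
            body = render_metrics().encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            # Ordinary application traffic: count it and reply with no body.
            REQUESTS_TOTAL["value"] += 1
            self.send_response(204)
            self.end_headers()


def serve(port=8000):
    """Run the example server until interrupted."""
    HTTPServer(("127.0.0.1", port), MetricsHandler).serve_forever()
```

Calling `serve()` and requesting `http://127.0.0.1:8000/metrics` would return the counter's current value. Note that the application never pushes anything; it only answers when it is scraped.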

Official client libraries are provided by Prometheus for Go, Java, Python, and Ruby. Prometheus has an open ecosystem. There are also community-built client libraries available for C, PHP, Node.js, C#/.NET, and more.

Exporters

Many applications expose metrics in non-Prometheus formats. For these applications, and for applications you don't own or whose code you cannot access, you can't add instrumentation directly. Examples include MySQL, Kafka, JMX, HAProxy, and NGINX servers. For these scenarios, use exporters.

An exporter is a tool that you deploy alongside your application to retrieve metrics. The exporter acts like a proxy between your application and Prometheus. It receives requests from the Prometheus server, collects the required data from the application, converts it into the correct format, and finally returns it to the Prometheus server.
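The translation step at the heart of an exporter can be sketched as follows. This example assumes a hypothetical application that reports its statistics as a flat JSON document; the `export` function, the `myapp` prefix, and the field names are invented for illustration and do not correspond to any real exporter.

```python
import json


def export(stats_json, prefix="myapp"):
    """Translate a flat JSON stats document into exposition-format lines."""
    stats = json.loads(stats_json)
    lines = []
    for key, value in sorted(stats.items()):
        # Only numeric values can become samples; skip strings and booleans.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        lines.append(f"{prefix}_{key} {value}")
    return "\n".join(lines) + "\n"


# Statistics in the application's own (non-Prometheus) format.
native = '{"connections": 42, "uptime_seconds": 3600, "version": "1.2.3"}'
exposition = export(native)
```

A real exporter would fetch the native statistics each time it is scraped and serve the translated text over HTTP, so Prometheus always sees fresh values.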

Popular exporters include:

Windows – for Windows server metrics
Node – for Linux server metrics
Blackbox – for DNS and website performance metrics
JMX – for Java-based application metrics

Once your application is instrumented or your exporter is in place, you need to tell Prometheus where to find it. This can be done with static configuration. In a dynamic environment where targets come and go, a static list is impractical, so service discovery is used instead.

Alerting

Alerting with Prometheus consists of two parts.

Alerting rules in the Prometheus server send alerts to Alertmanager.

Alertmanager then manages those alerts. It sends notifications by email, Slack, HipChat, PagerDuty, and many other out-of-the-box integrations. Alertmanager can also silence or aggregate alerts to reduce the number of notifications.

This is a guide to monitoring Linux servers using Prometheus and Dashboard.

Visualization with dashboards

Prometheus exposes a number of HTTP APIs through which PromQL queries return raw data for visualization.
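As a sketch of how a visualization tool might use the HTTP API, the code below builds an instant-query URL for the `/api/v1/query` endpoint and extracts values from a response of the documented vector shape. The server address `http://localhost:9090`, the helper names, and the sample response are assumptions; no request is actually sent.

```python
import json
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build the URL for an instant PromQL query."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def vector_values(response_text):
    """Map each series' label set to its value in an instant-vector result."""
    payload = json.loads(response_text)
    if payload["status"] != "success":
        raise RuntimeError(payload.get("error", "query failed"))
    return {
        tuple(sorted(item["metric"].items())): float(item["value"][1])
        for item in payload["data"]["result"]
    }


url = instant_query_url("http://localhost:9090", "rate(http_requests_total[5m])")

# A hand-written response in the shape the API returns for a vector result.
sample_response = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [{"metric": {"job": "api"}, "value": [1700000000, "2.5"]}],
    },
})
values = vector_values(sample_response)
```

Sample values arrive as strings paired with a timestamp, which is why the sketch converts the second element of each `value` pair to a float.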

Prometheus includes an expression browser that can be used for ad hoc queries, but for dashboards a widely used choice is Grafana. Grafana supports Prometheus as a data source out of the box and allows you to create a variety of dashboards.

You need to configure Prometheus as a data source for Grafana.

Dashboards can be added in the following ways:

Importing community-created dashboards
Building your own
Using predefined dashboards

The predefined node exporter dashboard looks like this:

Grafana has a worldPing module that allows you to monitor performance metrics for sites and DNS around the world.

Summary

Prometheus has few requirements. It is a single binary with a configuration file, so it is very easy to run. It can handle thousands of targets and ingest millions of samples per second. Prometheus is designed to track the overall health and behavior of a system.

Grafana is a widely used tool for visualizing metrics and integrates seamlessly with Prometheus.