Timescale recently published Promscale — an open source long-term remote storage for Prometheus built on top of TimescaleDB. According to the announcement, it should be fast and resource-efficient. Let’s compare the performance and resource usage of Promscale and VictoriaMetrics on a production workload.
The following setup was used for the benchmark:
                                   /--> VictoriaMetrics
2000 x node_exporter <-- vmagent --|
                                   \--> Promscale
node_exporter v1.0.1 was installed on a single e2-standard-4 instance in GCP. It exports real-world resource usage metrics such as CPU usage, memory usage, disk IO usage, network usage, etc. These metrics are typically collected in production workloads.
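For reference, a vmagent invocation along the following lines can replicate the scraped data to both remote storages (the hostnames are assumptions; the ports and URL paths follow the defaults for VictoriaMetrics and Promscale as far as I know):

/path/to/vmagent \
  -promscrape.config=prometheus.yml \
  -remoteWrite.url=http://victoriametrics:8428/api/v1/write \
  -remoteWrite.url=http://promscale:9201/write

Every sample scraped from the 2000 node_exporter targets is then written to each -remoteWrite.url destination in parallel.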
Recently single-node VictoriaMetrics gained support for scraping Prometheus targets. This made it possible to run an apples-to-apples benchmark comparing resource usage of Prometheus and VictoriaMetrics when scraping a big number of real node_exporter targets.
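As a sketch, single-node VictoriaMetrics can scrape targets directly when started with a regular Prometheus scrape config via the -promscrape.config flag (the file name is an assumption):

/path/to/victoria-metrics-prod -promscrape.config=prometheus.yml

With this flag it reads scrape_configs from the given file and scrapes the listed targets itself, with no separate Prometheus or vmagent in between.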
The benchmark was run in Google Compute Engine on four machines (instances):
node_exporter cannot process more than a few hundred requests per second. Prometheus and VictoriaMetrics were generating a much higher load on node_exporter during the tests. So it has…
Prometheus supports relabeling, which allows performing the following tasks:
Let’s look at how to perform each of these tasks.
A new label can be added with the following relabeling rule:
- target_label: "foo"
  replacement: "bar"
This relabeling rule adds the {foo="bar"} label to all the incoming metrics. For example, metric{job="aa"} will be converted to metric{job="aa",foo="bar"}.
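For context, here is a minimal sketch of where such a rule could live in prometheus.yml (the job name follows the metric{job="aa"} example above; the target address is an assumption):

scrape_configs:
- job_name: "aa"
  static_configs:
  - targets: ["localhost:9100"]
  # metric_relabel_configs are applied to scraped metrics before ingestion.
  metric_relabel_configs:
  - target_label: "foo"
    replacement: "bar"

The same rule also works under relabel_configs if the label should be attached to the target before scraping.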
An existing label can be updated with the relabeling rule mentioned above:
…
Recently ScyllaDB published an interesting article, How Scylla scaled to one billion rows per second. They conducted a benchmark (named Billy) for a typical time series workload, which simulates a million temperature sensors reporting every minute for a year’s worth of data. This translates to 1M*60*24*365=525.6 billion data points. The benchmark was run on beefy servers from Packet:
The ScyllaDB cluster achieved a scan speed of more than 1 billion data points per second on this setup. Later, ClickHouse showed good results for a slightly modified Billy benchmark…
Many technical terms are used when referring to Prometheus storage — either local storage or remote storage. New users may be unfamiliar with these terms, which can result in misunderstandings. Let’s explain the most commonly used technical terms in this article.
A time series is a series of (timestamp, value) pairs sorted by timestamp. The number of pairs per time series can be arbitrary — from one to hundreds of billions. Timestamps have millisecond precision, while values are 64-bit floating point numbers. Each time series has a name. For instance:
node_cpu_seconds_total — the total number of CPU seconds…
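To make these terms concrete, here is a minimal sketch in Go (illustrative types, not the actual Prometheus or VictoriaMetrics data structures):

// Sample is a single (timestamp, value) pair.
type Sample struct {
    TimestampMs int64   // timestamps have millisecond precision
    Value       float64 // values are 64-bit floating point numbers
}

// TimeSeries is a named series of samples sorted by timestamp.
type TimeSeries struct {
    Name    string   // e.g. "node_cpu_seconds_total"
    Samples []Sample // from one pair to hundreds of billions
}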
It looks like histogram support is great in the Prometheus ecosystem:
But why do Prometheus users keep complaining about issues with histograms? Let’s look at the most annoying issues.
Suppose you decided to cover response sizes with Prometheus histograms and defined the following histogram according to the docs:
h := prometheus.NewHistogram(prometheus.HistogramOpts{
    Name: "response_size_bytes",
    Help: "The size of the response",
    // LinearBuckets(100, 100, 3) creates bucket upper bounds 100, 200 and 300.
    Buckets: prometheus.LinearBuckets(100, 100, 3),
})
This histogram has 4 buckets with the following response size ranges (aka le label values): le="100" covers sizes up to 100 bytes, le="200" up to 200 bytes, le="300" up to 300 bytes, and le="+Inf" covers all sizes. Note that Prometheus buckets are cumulative, so every bucket also counts all the observations from the smaller buckets.
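As a sketch of how this histogram could be wired up with client_golang (the HTTP handler and response body are assumptions for illustration):

package main

import (
    "log"
    "net/http"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

var h = prometheus.NewHistogram(prometheus.HistogramOpts{
    Name:    "response_size_bytes",
    Help:    "The size of the response",
    Buckets: prometheus.LinearBuckets(100, 100, 3),
})

func main() {
    prometheus.MustRegister(h)
    http.HandleFunc("/hello", func(w http.ResponseWriter, r *http.Request) {
        body := []byte("hello, world")
        w.Write(body)
        // A 12-byte response lands in the le="100" bucket
        // (and, cumulatively, in all the bigger ones).
        h.Observe(float64(len(body)))
    })
    http.Handle("/metrics", promhttp.Handler())
    log.Fatal(http.ListenAndServe(":8080", nil))
}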
Recently the Evaluating Performance and Correctness article was published by the Prometheus author. The article points to a few data model discrepancies between VictoriaMetrics and Prometheus. It also contains benchmark results showing a poor compression ratio and poor performance for VictoriaMetrics compared to Prometheus. Unfortunately, the original article doesn’t support comments where a response could be left, so let’s discuss all these issues in the post below.
This code has been used for generating time series for the benchmark. The code generates series of floating-point values with 9 random decimal digits after the point. Such series cannot be compressed well, because 9 random…
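The benchmark’s actual generator isn’t reproduced here, but a minimal sketch of the idea (values with 9 random decimal digits after the point) could look like this:

package main

import (
    "fmt"
    "math/rand"
)

func main() {
    // Each value carries 9 random decimal digits after the point.
    // The random low-order mantissa bits leave Gorilla-style XOR
    // compression almost nothing to squeeze.
    for i := 0; i < 10; i++ {
        v := 100 + float64(rand.Intn(1e9))/1e9
        fmt.Printf("%.9f\n", v)
    }
}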
Suppose you have a time series database containing terabytes of data. How do you manage backups for this data? Do you think it is too big to back up and blindly rely on database replication for data safety? Then you are in trouble.
Replication is the process of creating multiple copies of the same data on distinct hardware resources and maintaining this data in a consistent state. Replication protects against hardware failures — if a certain node or disk goes out of service, your data shouldn’t be lost or corrupted, since at least one copy of the data should remain. Are we…
Thanos is known as a long-term storage for Prometheus, while the cluster version of VictoriaMetrics was open-sourced recently. Both solutions provide the following features:
Let’s compare different aspects of Thanos and VictoriaMetrics, starting with their architecture and then comparing the insert and select paths by the following properties:
High Availability setups and hosting costs are highlighted at the end of the article.
Thanos consists of the following components:
We are happy to announce that VictoriaMetrics enters the open source world under the Apache 2.0 license!
VictoriaMetrics is a high-performance, resource-efficient time series database with the following features:
Founder and core developer at VictoriaMetrics