Prometheus Subqueries in VictoriaMetrics
- Percentage of time with more than 10 errors per second for the last hour:
avg_over_time((rate(errors_total[5m]) > bool 10)[1h:1m])
- 95th percentile of network bandwidth for the last hour:
- The minimum number of requests per second for the last 30 minutes:
- The maximum car acceleration during the last hour:
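The last three bullets follow the same pattern as the first query: an inner instant query is wrapped in a subquery and passed to an aggregation-over-time function. A minimal sketch, assuming hypothetical counter names (network_bytes_total, requests_total, distance_covered_meters_total) and illustrative rate windows and steps; each query is meant to be run on its own:

```promql
# 95th percentile of network bandwidth for the last hour
# (network_bytes_total is an assumed counter name)
quantile_over_time(0.95, rate(network_bytes_total[5m])[1h:1m])

# minimum number of requests per second for the last 30 minutes
min_over_time(rate(requests_total[5m])[30m:1m])

# maximum car acceleration during the last hour:
# rate() turns the distance counter into speed, deriv() turns speed into acceleration
max_over_time(deriv(rate(distance_covered_meters_total[1m])[5m:1m])[1h:1m])
```

In each case the part before the square brackets is evaluated at every subquery step (1m here), and the outer function then aggregates those points over the subquery range.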
Previously, such queries couldn’t be written in one go. They required recording rules for the inner queries, followed by queries over the time series produced by those recording rules.
Extending Prometheus subqueries
VictoriaMetrics supports Prometheus subqueries and extends them a bit starting from v1.8.0. VictoriaMetrics provides the following extensions:
- Offsets may be added anywhere in the query. The following query returns the rate of cache requests as it was one day ago:
(rate(hits_total[5m]) + rate(miss_total[5m])) offset 1d
This is equivalent to the following PromQL query:
rate(hits_total[5m] offset 1d) + rate(miss_total[5m] offset 1d)
This is especially useful when building multiple graphs with different offsets. For instance, the following query returns rps graphs for “today”, “yesterday” and “week ago”, so it becomes obvious whether the rps increases or decreases over time:
WITH (
    rps = rate(requests_total[5m]),
)
union(
    label_set(rps, "graph", "today"),
    label_set(rps offset 1d, "graph", "yesterday"),
    label_set(rps offset 7d, "graph", "week_ago"),
)
The query uses PromQL extensions from VictoriaMetrics: WITH expressions (aka Common Table Expressions, or CTE) for factoring out common expressions, the union function for returning results from multiple queries, and the label_set function for setting additional labels on time series. The full list of PromQL extensions supported by VictoriaMetrics is available in its documentation.
- [range:] in the outer query may be written as [range] without the trailing colon:
- Square brackets may be omitted for both outer and inner queries:
It is equivalent to a PromQL query in which the omitted values are set to the step value obtained from the query_range API. step is also known as interval and equals the duration between two adjacent points on a graph in Grafana.
VictoriaMetrics automatically adjusts a range that is too small: if it becomes smaller than the interval between two adjacent points of a time series, it is increased so the graph remains visible (and usable) at small zoom levels or with large scrape intervals. This works around the corresponding Prometheus issue.
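The two shorthand forms described above can be sketched as follows. This is an illustration only: requests_total is an assumed counter name, and the effective windows of the second query depend on the step passed to the query_range API.

```promql
# [range] without the trailing colon in the outer query;
# standard PromQL requires the subquery form [1h:] here
max_over_time(rate(requests_total[5m])[1h])

# square brackets omitted for both the outer and the inner query;
# the omitted windows are derived from the query_range step
max_over_time(rate(requests_total))
```

The second form is convenient for Grafana dashboards, where the step changes with the zoom level and hard-coded windows would have to be adjusted by hand.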
Prometheus subquery pitfalls
While subqueries are powerful, they are easy to misuse. For instance, the following query would return incorrect results:
The query sums all the requests_total counters and then calculates rate for the sum. The problem is that the query handles counter resets incorrectly: if certain requests_total time series are reset (for instance, due to a micro-service restart), then the sum may go down a bit, so rate returns an incorrect result. Prometheus doesn’t provide functionality for fixing such queries. The only approach is to swap rate and sum, so that rate is applied to each counter before summing. VictoriaMetrics provides the remove_resets function, which can be used for fixing the original query:
The remove_resets function removes counter resets from requests_total, returning always-increasing time series, which can be safely summed and passed to rate.
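The three variants discussed in this section might look like this. It is a sketch: the 5m range is an assumption, and the last query relies on the remove_resets extension, so it works only in VictoriaMetrics.

```promql
# problematic: sum first, then rate; a reset in any single counter
# makes the sum drop, and rate() treats that drop as a reset of the whole sum
rate(sum(requests_total)[5m:])

# PromQL fix: swap the order, so rate() handles resets per series before summing
sum(rate(requests_total[5m]))

# VictoriaMetrics fix: strip resets from each series, then sum and take the rate
rate(sum(remove_resets(requests_total))[5m:])
```

The swapped form is usually the simplest choice; remove_resets is useful when the sum itself has to be computed before rate is applied.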
Subqueries resolve the long-standing feature request and increase PromQL power. VictoriaMetrics goes further by cutting sharp corners and simplifying subqueries’ usage.
Try using single-node VictoriaMetrics as a long-term remote storage for Prometheus.
Update: VictoriaMetrics is open source now!