Prometheus Subqueries in VictoriaMetrics

Aliaksandr Valialkin
Feb 24, 2019

Prometheus added support for subqueries in v2.7.0. This is quite a useful concept that simplifies graphing and alerting in cases such as the following:

  • Fraction of time with more than 10 errors per second over the last hour:

avg_over_time((rate(errors_total[5m]) > bool 10)[1h:1m])

  • 95th percentile of network bandwidth for the last hour:

quantile_over_time(0.95, rate(node_network_receive_bytes_total[5m])[1h:1m])

  • The minimum number of requests per second for the last 30 minutes:

min_over_time(rate(requests_total[5m])[30m:])

  • The maximum car acceleration during the last hour:

max_over_time(deriv(rate(traveled_meters_total[1m])[5m:])[1h:])

Previously, such queries couldn’t be implemented in one go. They required writing recording rules for the inner queries and then querying the time series produced by those rules.
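To make the [range:step] notation concrete, here is a minimal Python sketch of how the first query above could be evaluated. It is an illustration, not the Prometheus implementation: the errors_per_second function is a hypothetical stand-in for rate(errors_total[5m]), and the exact boundary handling of the range (left-open here) is an assumption.

```python
from typing import Callable, List


def eval_subquery(inner: Callable[[int], float], end: int,
                  range_s: int, step_s: int) -> List[float]:
    """Evaluate inner(t) at every step-aligned timestamp in (end - range_s, end]."""
    # Subquery timestamps are aligned to multiples of the step.
    first = ((end - range_s) // step_s + 1) * step_s
    return [inner(t) for t in range(first, end + 1, step_s)]


def errors_per_second(t: int) -> float:
    # Hypothetical inner-query result: a 15-minute error burst inside the hour.
    return 12.0 if 1800 <= t < 2700 else 3.0


# (rate(errors_total[5m]) > bool 10)[1h:1m]
points = eval_subquery(errors_per_second, end=3600, range_s=3600, step_s=60)
above = [1.0 if v > 10 else 0.0 for v in points]  # models `> bool 10`

# avg_over_time(...) over the 0/1 samples
share = sum(above) / len(above)
print(len(points), share)
```

The inner expression is evaluated once per step, so [1h:1m] yields 60 samples here, and the outer avg_over_time reduces them to a single value per evaluation time.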

Extending Prometheus subqueries

VictoriaMetrics supports Prometheus subqueries and extends them a bit starting from v1.8.0. VictoriaMetrics provides the following extensions:

  • Offsets may be added anywhere in the query. The following query returns the cache request rate for the previous day:

(rate(hits_total[5m]) + rate(miss_total[5m])) offset 1d

This is equivalent to the following PromQL query:

rate(hits_total[5m] offset 1d) + rate(miss_total[5m] offset 1d)

This is especially useful when building multiple graphs with different offsets. For instance, the following query returns rps graphs for “today”, “yesterday” and “week ago”, so it becomes obvious whether the rps increases or decreases over time:

with (
    rps = rate(requests_total[5m]),
)
union(
    label_set(rps, "graph", "today"),
    label_set(rps offset 1d, "graph", "yesterday"),
    label_set(rps offset 7d, "graph", "week_ago"),
)

The query uses VictoriaMetrics extensions to PromQL: WITH expressions (also known as common table expressions, or CTEs) for factoring out common expressions, the union function for returning results from multiple queries, and the label_set function for setting additional labels on time series. The full list of PromQL extensions supported by VictoriaMetrics is available in the VictoriaMetrics documentation.
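The semantics of the query can be modelled in a few lines of Python. This is only a sketch of the idea: the offset, label_set and union helpers below are simplified stand-ins written for this illustration (label_set always sets the "graph" label here), and the rps function is hypothetical traffic data.

```python
from typing import Callable, Dict

Series = Callable[[int], float]  # maps a Unix timestamp to a value

DAY = 24 * 3600


def offset(series: Series, seconds: int) -> Series:
    """Model `expr offset d`: read the series `seconds` in the past."""
    return lambda t: series(t - seconds)


def label_set(series: Series, graph: str) -> Dict[str, Series]:
    """Model label_set(expr, "graph", value) as a mapping keyed by the label value."""
    return {graph: series}


def union(*groups: Dict[str, Series]) -> Dict[str, Series]:
    """Model union(...): return all the series from all the arguments."""
    merged: Dict[str, Series] = {}
    for group in groups:
        merged.update(group)
    return merged


def rps(t: int) -> float:
    # Hypothetical traffic that grows by 10 requests per second every day.
    return 100.0 + 10.0 * (t / DAY)


graphs = union(
    label_set(rps, "today"),
    label_set(offset(rps, DAY), "yesterday"),
    label_set(offset(rps, 7 * DAY), "week_ago"),
)

now = 30 * DAY
snapshot = {name: series(now) for name, series in graphs.items()}
print(snapshot)
```

Each labelled series reads the same rps expression at a different point in the past, which is why the three lines can be drawn on one graph and compared directly.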

  • [range:] in the outer query may be written as [range] without the trailing colon:

min_over_time(rate(requests_total[5m])[1h])

  • Square brackets may be omitted for both outer and inner queries:

deriv(rate(requests_total))

It is equivalent to the following PromQL query, where the step value is taken from the query_range API request:

deriv(rate(requests_total[step])[step:step])

The step is also known as the interval; it equals the duration between two adjacent points on a Grafana graph.
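As a rough illustration of where the step comes from, the Python sketch below builds a query_range request URL and shows the expanded form of the query. The step calculation (time range divided by the number of points) is a simplification of what a dashboard does, and the base URL and max_points value are placeholders assumed for the example.

```python
from typing import Tuple
from urllib.parse import urlencode


def query_range_url(base_url: str, query: str, start: int, end: int,
                    max_points: int) -> Tuple[str, int]:
    """Build a /api/v1/query_range URL; return it together with the step in seconds."""
    # Simplified: one point per (time range / max_points), at least 1 second.
    step = max(1, (end - start) // max_points)
    params = urlencode({"query": query, "start": start, "end": end, "step": step})
    return f"{base_url}/api/v1/query_range?{params}", step


# Placeholder address; adjust it to the actual server.
url, step = query_range_url("http://localhost:8428",
                            "deriv(rate(requests_total))",
                            start=0, end=3600, max_points=60)

# What the shorthand query stands for with this step:
expanded = f"deriv(rate(requests_total[{step}s])[{step}s:{step}s])"
print(url)
print(expanded)
```

Zooming the graph in or out changes start and end, hence the step, and the omitted square brackets follow it automatically.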

VictoriaMetrics automatically widens a range that is smaller than the interval between two adjacent points of the time series, so the graph remains visible (and usable) at small zoom levels or with large scrape intervals. This works around the corresponding Prometheus issue.

Prometheus subquery pitfalls

While subqueries are powerful, they are easy to misuse. For instance, the following query would return incorrect results:

rate(sum(requests_total)[5m:])

The query sums all the requests_total counters and then calculates the rate of the sum. The problem is that the query handles counter resets incorrectly: if some requests_total time series are reset (for instance, due to a microservice restart), the sum goes down, and rate treats that drop as a reset of the whole sum, so it returns an incorrect result. Prometheus doesn’t provide functionality for fixing such queries. The only approach is to swap sum with rate:

sum(rate(requests_total[5m]))
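The difference between the two orderings can be reproduced with a small Python sketch. The increase function below is a simplified model of reset handling (any decrease is treated as a restart from zero, with no extrapolation), and the two counters are made-up sample data taken every 60 seconds.

```python
from typing import List


def increase(samples: List[int]) -> int:
    """Simplified counter increase: a drop is treated as a reset to zero."""
    total = 0
    for prev, cur in zip(samples, samples[1:]):
        total += cur - prev if cur >= prev else cur
    return total


# Two hypothetical requests_total series; the second one restarts mid-window.
a = [5000, 5060, 5120, 5180, 5240, 5300]
b = [1000, 1060, 1120, 0, 60, 120]

# sum(rate(...)): resets are handled per series, before summing.
per_series = increase(a) + increase(b)

# rate(sum(...)): the drop in the sum looks like a reset of the whole sum.
summed = [x + y for x, y in zip(a, b)]
sum_first = increase(summed)

window = 300  # seconds covered by the samples
print(per_series / window, sum_first / window)
```

With this data the per-series ordering reports 540 requests over the window, while the sum-first ordering reports 5660, because the whole value of the sum after the drop is counted as new requests.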

VictoriaMetrics provides the remove_resets function, which can be used to fix the original query:

rate(sum(remove_resets(requests_total))[5m])

The remove_resets function removes counter resets from requests_total and returns time series that never decrease, which can be safely summed and passed to rate.
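The idea behind removing resets can be sketched in Python as follows. This is an assumption about the general technique, not the VictoriaMetrics source code: every time a sample drops below the previous one, the previous value is added to a running correction. The counters are made-up sample data.

```python
from typing import List


def remove_resets(samples: List[int]) -> List[int]:
    """Return a non-decreasing copy of a counter by compensating for resets."""
    out: List[int] = []
    correction = 0
    prev = None
    for value in samples:
        if prev is not None and value < prev:
            # The counter restarted: carry over everything counted so far.
            correction += prev
        out.append(value + correction)
        prev = value
    return out


# Two hypothetical requests_total series; the second one restarts mid-window.
a = [5000, 5060, 5120, 5180, 5240, 5300]
b = [1000, 1060, 1120, 0, 60, 120]

fixed_b = remove_resets(b)
summed = [x + y for x, y in zip(remove_resets(a), fixed_b)]

# The sum never goes down, so its increase is simply last minus first.
print(fixed_b, summed[-1] - summed[0])
```

After the correction the summed series grows steadily, so calculating the rate over the sum no longer sees a spurious reset.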

Wrapping up

Subqueries resolve a long-standing feature request and make PromQL more powerful. VictoriaMetrics goes further by smoothing out the sharp corners and simplifying subquery usage.

Try using single-node VictoriaMetrics as a long-term remote storage for Prometheus.

Update: VictoriaMetrics is open source now!
