How do open source solutions for logs work: Elasticsearch, Loki and VictoriaLogs
If you use Elasticsearch, OpenSearch, Loki or VictoriaLogs and are curious why your system requires a lot of RAM or why full-text search queries run slowly, then this article might be interesting to you.
The definition of logs
Let’s suppose typical Kubernetes logs with many log fields (such logs are usually shipped by Fluentbit) are ingested into the centralized database for logs:
{
"@timestamp": "2024-10-18T21:11:52.237412Z",
"message": "error details: name = ErrorInfo reason = IAM_PERMISSION_DENIED domain = iam.googleapis.com metadata = map[permission:logging.logEntries.create]",
"kubernetes_annotations.EnableNodeJournal": "false",
"kubernetes_annotations.EnablePodSecurityPolicy": "false",
"kubernetes_annotations.SystemOnlyLogging": "false",
"kubernetes_annotations.components.gke.io/component-name": "fluentbit",
"kubernetes_annotations.components.gke.io/component-version": "1.30.2-gke.3",
"kubernetes_annotations.monitoring.gke.io/path": "/api/v1/metrics/prometheus",
"kubernetes_container_hash": "gke.gcr.io/fluent-bit-gke-exporter@sha256:0ef2fab2719444d5b5b0817cf4512f24a347c521d9db9c5e3f85eb4fdcf9a187",
"kubernetes_container_image": "sha256:81082ddf27934f981642f2d8e615f763cc15c08414baa0e908a674ccb116dfcb",
"kubernetes_container_name": "fluentbit",
"kubernetes_docker_id": "f39c349b5368b3abde7c22f97f3f8547c202228e725bb5ef620f399e2a5e67af",
"kubernetes_host": "gke-sandbox-sandbox-pool-4cpu-b-4dc82194-l0an",
"kubernetes_labels.component": "fluentbit-gke",
"kubernetes_labels.controller-revision-hash": "68cfcc69c",
"kubernetes_labels.k8s-app": "fluentbit-gke",
"kubernetes_labels.kubernetes.io/cluster-service": "true",
"kubernetes_labels.pod-template-generation": "24",
"kubernetes_namespace_name": "kube-system",
"kubernetes_pod_id": "7d75e660-9fcf-4b6a-b860-210293b5eda6",
"kubernetes_pod_name": "fluentbit-gke-jt7wb",
"stream": "stderr"
}
These logs contain the following fields:
- `@timestamp` field with the time the log was generated
- `message` field with the plaintext log message
- various Kubernetes-specific fields, which identify the source container for the given log entry

The typical length of such a log entry is 1KiB-2KiB.
How does Elasticsearch store and query logs?
Elasticsearch assigns a unique ID to every ingested log entry (this may be an offset in the file where plain log entries are stored). Then it splits every log field into words. For example, the `message` field with the value `error details: name = ErrorInfo` is split into the words (aka tokens) `error`, `details`, `name` and `ErrorInfo`. Then it persists these tokens in the inverted index, which is a mapping from `(field_name; token)` to log entry `ID`. For example, the `message` field above is transformed into four entries in the inverted index:

(message; error) -> ID
(message; details) -> ID
(message; name) -> ID
(message; ErrorInfo) -> ID
The typical token length is around 5–10 bytes, so we can estimate that Elasticsearch needs to create around 125 entries in the inverted index per each ingested log entry of 1KiB length. This means Elasticsearch creates 125 billion entries in the inverted index for a billion logs of 1KiB each. Inverted index entries for the same `(field_name; token)` pair are usually stored in a compact form known as postings:

(field_name; token) -> [ID_1, ID_2, … ID_N]

For example, all the tokens for the field `kubernetes_container_name=fluentbit` are eventually compacted into a single inverted index entry across all the logs with this field:

(kubernetes_container_name; fluentbit) -> [ID_1, ID_2, … ID_N]

If the number of such logs equals 125 million, then the [ID_1, ID_2, … ID_N] list contains 125 million entries. Every ID is usually a 64-bit integer, so 125 million entries occupy 125M*8 = 1GB.

So the size of the inverted index for storing a billion 1KiB logs with 125 tokens each equals at least 1B*125*8 = 1TB (not counting the storage needed for the `(field_name; token)` pairs). Elasticsearch also needs to store the original logs, so they can be shown in query results. A billion 1KiB logs needs 1B*1KiB = 1TiB of storage space, so the total needed storage space equals 1TB + 1TiB = 2TiB. Elasticsearch may apply some compression to the inverted index via roaring bitmaps. It may also compress the original logs. This may reduce the required disk space by a few times, but it is still too big :(
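The same back-of-the-envelope estimate, written out with the rough per-entry numbers assumed above (illustrative figures, not measurements):

```go
package main

import "fmt"

func main() {
	const (
		logEntries     = 1e9  // one billion ingested log entries
		tokensPerEntry = 125  // ~1KiB entry split into ~8-byte tokens
		bytesPerID     = 8    // 64-bit log entry ID stored in postings
		bytesPerEntry  = 1024 // raw size of a single log entry (1KiB)
	)
	indexBytes := logEntries * tokensPerEntry * bytesPerID // inverted index size
	rawBytes := logEntries * bytesPerEntry                 // original logs size
	fmt.Printf("inverted index: %.2f TB\n", indexBytes/1e12)          // 1.00 TB
	fmt.Printf("original logs:  %.2f TB\n", rawBytes/1e12)            // 1.02 TB
	fmt.Printf("total:          %.2f TB\n", (indexBytes+rawBytes)/1e12)
}
```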
Elasticsearch uses the inverted index for fast full-text search. When you search for some word (aka token) in some field, it instantly locates the postings for the `(field_name; token)` pair in the inverted index using binary search over postings sorted by `(field_name; token)`, then the original log entries are located by their IDs and read from storage one by one. That’s why Elasticsearch shows such an outstanding query performance for full-text search!
Are there downsides to querying logs in Elasticsearch? Yes:
- When you search for some field value which occurs in a big share of logs, Elasticsearch needs to read huge postings during queries. For example, if you search for logs with the field `kubernetes_container_name=fluentbit`, which exists in 125 million logs, then Elasticsearch needs to read 125M*8 = 1GB of 8-byte log IDs from the corresponding inverted index postings. Such queries may be very slow. Alternatively, they may require a lot of RAM in order to improve performance a bit, when all the needed postings are cached in RAM. Unfortunately, this is quite a common case for typical production setups of Elasticsearch :(
- When the query returns too many logs, Elasticsearch may need to read these logs from random places in the storage for the original logs. This can be very slow on low-IOPS storage systems. For example, a typical HDD provides 100–200 random read operations per second. This means that Elasticsearch may need 10K logs / 100 iops = 100 seconds for reading and returning 10K matching logs if these logs aren’t cached in RAM.
Let’s recap:
- Elasticsearch provides outstanding performance for full-text search across all the log fields thanks to inverted indexes.
- Elasticsearch requires huge amounts of storage space for moderate and large volumes of logs (e.g. more than a terabyte).
- Elasticsearch requires huge amounts of RAM for querying moderate and large volumes of logs (e.g. more than a terabyte) at a decent speed.
How does Grafana Loki store and query logs?
Loki takes all the log fields except `message` and `@timestamp`, sorts them by field name, and then constructs a log stream labelset from them:
{
kubernetes_annotations.EnableNodeJournal="false",
kubernetes_annotations.EnablePodSecurityPolicy="false",
kubernetes_annotations.SystemOnlyLogging="false",
kubernetes_annotations.components.gke.io/component-name="fluentbit",
kubernetes_annotations.components.gke.io/component-version="1.30.2-gke.3",
kubernetes_annotations.monitoring.gke.io/path="/api/v1/metrics/prometheus",
kubernetes_container_hash="gke.gcr.io/fluent-bit-gke-exporter@sha256:0ef2fab2719444d5b5b0817cf4512f24a347c521d9db9c5e3f85eb4fdcf9a187",
kubernetes_container_image="sha256:81082ddf27934f981642f2d8e615f763cc15c08414baa0e908a674ccb116dfcb",
kubernetes_container_name="fluentbit",
kubernetes_docker_id="f39c349b5368b3abde7c22f97f3f8547c202228e725bb5ef620f399e2a5e67af",
kubernetes_host="gke-sandbox-sandbox-pool-4cpu-b-4dc82194-l0an",
kubernetes_labels.component="fluentbit-gke",
kubernetes_labels.controller-revision-hash="68cfcc69c",
kubernetes_labels.k8s-app="fluentbit-gke",
kubernetes_labels.kubernetes.io/cluster-service="true",
kubernetes_labels.pod-template-generation="24",
kubernetes_namespace_name="kube-system",
kubernetes_pod_id="7d75e660-9fcf-4b6a-b860-210293b5eda6",
kubernetes_pod_name="fluentbit-gke-jt7wb",
stream="stderr"
}
This labelset uniquely identifies a stream of logs received from a single source (a Kubernetes container in this case). Loki stores this labelset only once per log stream. Loki puts this labelset into an inverted index for fast locating of the matching log streams with the help of log stream filters. In contrast to Elasticsearch, the inverted index for log stream labelsets is very small, since it contains entries per log stream instead of per log entry, and the number of log streams is usually small (e.g. a few million log streams in the worst case vs billions of log entries).
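A minimal sketch of how such a labelset could be derived from the ingested fields (a simplified model, not Loki’s actual ingester code):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// streamLabelset builds the canonical {name="value", ...} string that identifies
// a log stream: every field except message and @timestamp, sorted by field name.
func streamLabelset(fields map[string]string) string {
	names := make([]string, 0, len(fields))
	for name := range fields {
		if name == "message" || name == "@timestamp" {
			continue
		}
		names = append(names, name)
	}
	sort.Strings(names)

	pairs := make([]string, 0, len(names))
	for _, name := range names {
		pairs = append(pairs, fmt.Sprintf("%s=%q", name, fields[name]))
	}
	return "{" + strings.Join(pairs, ", ") + "}"
}

func main() {
	fields := map[string]string{
		"@timestamp":                "2024-10-18T21:11:52.237412Z",
		"message":                   "error details: name = ErrorInfo ...",
		"kubernetes_container_name": "fluentbit",
		"kubernetes_namespace_name": "kube-system",
		"stream":                    "stderr",
	}
	fmt.Println(streamLabelset(fields))
	// {kubernetes_container_name="fluentbit", kubernetes_namespace_name="kube-system", stream="stderr"}
}
```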
Loki groups all the `message` fields by log stream, sorts the logs in every stream by `@timestamp`, and then puts them into persistent storage in compressed form. Grouping log messages by stream improves the compression ratio, since every log stream usually contains similar logs. This allows Loki to achieve a 5x-10x compression ratio in typical production cases. For example, a billion logs of 1KiB each may occupy only 1B*1KiB/10 = 100GiB of storage space. The storage space needed for the inverted index over log stream labelsets can be ignored, since it is usually much smaller than 100GiB.
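A toy illustration of this storage layout, with gzip standing in for whatever compression codec is actually used (a sketch, not Loki’s chunk format):

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"sort"
)

type logEntry struct {
	timestampNs int64
	message     string
}

// compressStream sorts one stream's entries by timestamp and compresses the messages.
// Lines within a single stream tend to be similar, which is where the high
// compression ratio comes from.
func compressStream(entries []logEntry) []byte {
	sort.Slice(entries, func(i, j int) bool {
		return entries[i].timestampNs < entries[j].timestampNs
	})
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	for _, e := range entries {
		fmt.Fprintf(zw, "%d %s\n", e.timestampNs, e.message)
	}
	zw.Close()
	return buf.Bytes()
}

func main() {
	// Entries are grouped by their stream labelset before compression.
	byStream := map[string][]logEntry{
		`{kubernetes_container_name="fluentbit", stream="stderr"}`: {
			{2, "error details: name = ErrorInfo reason = IAM_PERMISSION_DENIED"},
			{1, "error details: name = ErrorInfo reason = IAM_PERMISSION_DENIED"},
		},
	}
	for labels, entries := range byStream {
		fmt.Printf("%s -> %d compressed bytes\n", labels, len(compressStream(entries)))
	}
}
```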
As you can see, Loki requires much less storage space (up to 10x less) than Elasticsearch for the same amount of logs. This is good for heavy analytical queries, which need to scan a big share of the stored logs, since Loki reads less data from storage than Elasticsearch. Loki also needs much less RAM (up to 10x less) because of the much smaller inverted index, which needs to be kept in RAM for decent query performance.
Are there downsides to Loki? Yes:
- It provides very poor performance for “needle in the haystack” queries, which search for some unique word or phrase over a large volume of logs. This is because it needs to read, unpack and then scan all the log messages for the given word or phrase. For example, if you are searching for some unique `trace_id=7d75e660-9fcf-4b6a-b860-210293b5eda6` across a billion log messages of 1KiB each, then Loki needs to scan 1B*1KiB = 1TiB of data. Of course, it may need to read only 100GiB of data from storage because of the good compression ratio, but this still doesn’t save it from slow query times, even on fast SSDs and NVMe disks.
- It has very poor support for structured logs with high-cardinality fields, which contain a big number of unique values such as `user_id`, `trace_id`, `ip`, etc. If you try storing such fields in log stream labelsets, then Loki will eat all the RAM because of the blown-up inverted index size. It will also significantly increase disk IO and slow down query performance, because it isn’t optimized for a big number of log streams.
Let’s recap:
- Loki needs much lower amounts of storage space and RAM than Elasticsearch.
- Full-text search queries in Loki are usually much slower (1000x slower) than in Elasticsearch.
- Loki has very poor support for structured logs with high-cardinality fields.
How does VictoriaLogs store and query logs?
VictoriaLogs splits every log field into words (aka tokens) in a way similar to Elasticsearch. But it doesn’t create inverted indexes from these tokens. Instead, it creates bloom filters from the tokens. These bloom filters are used for quickly skipping data blocks that don’t contain the words provided in the query. For example, if some unique phrase such as `trace_id=7d75e660-9fcf-4b6a-b860-210293b5eda6` is searched in logs, then most of the data blocks are skipped without reading them, and only a few data blocks are read, unpacked and inspected for the given phrase. This improves performance for the “needle in the haystack” type of queries.
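A rough sketch of the idea behind per-block bloom filters used for skipping blocks (the hash functions and sizing below are illustrative assumptions, not VictoriaLogs’ actual format):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloomFilter is a tiny bloom filter over the unique tokens of one data block.
type bloomFilter struct {
	bits []uint64
}

func newBloomFilter(sizeBits uint32) *bloomFilter {
	return &bloomFilter{bits: make([]uint64, (sizeBits+63)/64)}
}

// positions derives two bit positions from a single 64-bit hash of the token.
func (bf *bloomFilter) positions(token string) [2]uint32 {
	h := fnv.New64a()
	h.Write([]byte(token))
	v := h.Sum64()
	n := uint32(len(bf.bits) * 64)
	return [2]uint32{uint32(v) % n, uint32(v>>32) % n}
}

func (bf *bloomFilter) add(token string) {
	for _, p := range bf.positions(token) {
		bf.bits[p/64] |= 1 << (p % 64)
	}
}

// mayContain returns false only if the block definitely has no such token,
// which lets a query skip reading and unpacking that block entirely.
func (bf *bloomFilter) mayContain(token string) bool {
	for _, p := range bf.positions(token) {
		if bf.bits[p/64]&(1<<(p%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	bf := newBloomFilter(1024)
	for _, tok := range []string{"error", "details", "ErrorInfo", "fluentbit"} {
		bf.add(tok)
	}
	fmt.Println(bf.mayContain("ErrorInfo"))                            // true -> the block must be read
	fmt.Println(bf.mayContain("7d75e660-9fcf-4b6a-b860-210293b5eda6")) // almost certainly false -> skip the block
}
```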
Bloom filters in VictoriaLogs need only 2 bytes per unique token seen in logs, while inverted indexes in Elasticsearch need at least 8 bytes per every token seen in logs. The number of unique tokens is usually much smaller than the total number of tokens: a typical log entry contains up to 5 unique tokens, while the rest of the tokens are shared among log entries. So a billion log entries contains 1B*5 = 5 billion unique tokens, which are stored in 5B * 2 bytes/token = 10GB of bloom filters.
So bloom filters are usually 10x-100x smaller than inverted indexes for the same sets of tokens. This lowers both the storage space and the RAM size required for efficient data ingestion and querying in VictoriaLogs compared to Elasticsearch. It also reduces disk read usage during heavy queries.
VictoriaLogs also has the concept of log streams, similar to Loki. The difference is that VictoriaLogs doesn’t put log fields into log stream labelsets by default. Instead, it relies on the set of log stream fields provided by the log shipper via the `_stream_fields` query arg or via the `VL-Stream-Fields` HTTP request header according to these docs. This allows efficiently storing and querying structured logs with high-cardinality fields such as `user_id`, `trace_id` or `ip`.
VictoriaLogs groups and stores the data for each log field in a physically separate storage area (aka column-oriented storage, similar to ClickHouse). This minimizes the amount of data read during queries, since only the data for the requested fields is read from storage. It also improves the compression ratio of the per-field data, which, in turn, reduces storage space requirements.
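Conceptually, a column-oriented block keeps the values of each log field in its own contiguous column, so a query that filters only on `kubernetes_container_name` never touches the `message` column. A simplified in-memory model (not the actual on-disk format) might look like this:

```go
package main

import "fmt"

// columnBlock stores one column (slice of values) per log field for a block of
// log entries; row i across all columns reconstructs log entry i. For simplicity
// it assumes every entry carries the same set of fields.
type columnBlock struct {
	columns map[string][]string
}

func newColumnBlock() *columnBlock {
	return &columnBlock{columns: map[string][]string{}}
}

func (b *columnBlock) addEntry(fields map[string]string) {
	for name, value := range fields {
		b.columns[name] = append(b.columns[name], value)
	}
}

func main() {
	b := newColumnBlock()
	b.addEntry(map[string]string{"message": "error details: ...", "kubernetes_container_name": "fluentbit"})
	b.addEntry(map[string]string{"message": "request completed", "kubernetes_container_name": "fluentbit"})

	// A query filtering on kubernetes_container_name reads only this column,
	// leaving the (much larger) message column untouched.
	fmt.Println(b.columns["kubernetes_container_name"]) // [fluentbit fluentbit]
}
```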
Are there downsides to VictoriaLogs? Yes:
- VictoriaLogs provides slower full-text search than Elasticsearch for some simple queries, which select a few log entries. This is because VictoriaLogs needs to read more data from bloom filters than Elasticsearch needs to read from inverted indexes for simple queries. As for heavy queries with multiple filters over different log fields, VictoriaLogs usually beats Elasticsearch in query performance, because the total size of the inverted index postings that Elasticsearch needs to read starts exceeding the total size of the bloom filters that VictoriaLogs needs to read.
- VictoriaLogs doesn’t support facets yet.
Let’s recap:
- VictoriaLogs uses bloom filters for improving the performance of full-text search, while keeping storage space usage low (up to 15x less than Elasticsearch) and RAM requirements low (up to 30x less than Elasticsearch). Simple queries may still be slower than in Elasticsearch though :(
- VictoriaLogs supports log streams similar to Grafana Loki. This provides fast querying over log streams.
- VictoriaLogs uses column-oriented storage for reducing storage space usage further. This also reduces storage read IO bandwidth usage during heavy queries over large volumes of logs.
Conclusion
All the open source solutions for logs, Elasticsearch, Loki and VictoriaLogs, have their own pros and cons. I tried to explain in a clear way how these solutions store and query logs. I hope this information helps you choose the right solution for your needs. If in doubt, try running multiple solutions side by side on your particular workload and then choose the one that fits best.
The article didn’t cover non-technical aspects such as operational complexity (configuration, setup and maintenance), infrastructure costs, query language usability, integrations with other solutions, the quality of documentation, etc. I’d recommend starting from the quick start docs for each solution.
Full disclosure: I’m the core developer of VictoriaLogs, but I tried to be fair when writing this article.