
Why VictoriaLogs is a better alternative to Grafana Loki

Aliaksandr Valialkin

Loki is an open-source log management system from Grafana. It is advertised as a better alternative to other open-source log databases such as Elasticsearch and OpenSearch. Loki has the following advantages over Elasticsearch:

  • It uses less RAM, because it doesn’t index the ingested logs for fast full-text search. Instead, it indexes only a small subset of log fields known as log stream labels, which allows quickly locating log streams by their labels. See these docs for details.
  • It needs less storage space because of better compression of the ingested logs.

These advantages reduce the infrastructure costs (CPU, RAM, storage space) of storing and analyzing large volumes of logs.

But Loki has its own downsides compared to Elasticsearch:

  • Loki is harder to configure and operate than Elasticsearch, since a typical Loki setup consists of many interconnected services with non-trivial configs. See these docs for details.
  • A typical production Loki setup needs S3-compatible storage for persisting the ingested logs. This is an additional dependency, which needs to be configured and managed properly. S3-compatible storage in public clouds also has hidden costs related to paid requests and paid data retrieval. For example, a million PUT requests at S3 Standard cost $5, while reading a terabyte of data from S3 Glacier costs $30, so a single heavy query that scans 10 terabytes of Glacier-stored logs incurs $300 in retrieval fees alone. Heavy queries over terabytes of logs in Loki may therefore cost a lot. See these docs for details.
  • Loki is harder to upgrade to new releases than Elasticsearch, since new Loki releases frequently break old config options and introduce new ones. This means you need to adjust Loki configs when upgrading to new releases.
  • Loki starts using more RAM than Elasticsearch if the ingested logs contain fields with a large number of unique values (such as trace_id, user_id, ip, etc.). Loki recommends encoding high-cardinality fields into JSON in the form of {"trace_id":"AAAA-BBBB-CCCC-DDDD","user_id":"12345678","ip":"12.34.56.78"} and storing this JSON as a plaintext log message. Such a message can later be parsed with the json pipe at query time, e.g. with something like {job="app"} | json | user_id="12345678" in LogQL, in order to apply filters to the parsed log fields. The problem is that this is very slow compared to the approach used in Elasticsearch, which quickly locates the needed log entries via an inverted index on the given log field.

Side note: Loki recently added the structured metadata feature, aimed at solving the issues with high-cardinality log fields. The problem is that this is still an experimental feature that is hard to configure properly and can break with new releases. See the detailed post about structured metadata issues from a Loki user.

  • Loki doesn’t support fast full-text search over plaintext logs. It reads all the log messages from storage when you search for logs containing a given word or phrase. Loki recommends narrowing down the search with time range filters and with filters on log stream labels. This allows skipping logs outside the selected time range and outside the selected log streams. See these docs for details. But sometimes you cannot narrow down the search with time range filters and label filters. For example, you may need to find all the logs with the "user_id":"12345678" phrase across all the log streams, i.e. the logs associated with the given user_id. Loki can be extremely slow for such queries compared to Elasticsearch (hours vs seconds). Loki can also be very expensive for such queries if the ingested logs are stored on S3-compatible storage with paid data retrieval such as S3 Glacier, which costs $30 per terabyte of data read.

As you can see, Loki may be more cost-efficient than Elasticsearch in some cases because it uses less RAM and storage space, while being hard to configure, very slow and expensive in other cases. Is there an alternative open-source solution for logs which solves these issues while keeping Loki’s advantages? Yes: VictoriaLogs, an open-source database for plaintext logs, structured logs and wide events.

Log stream

VictoriaLogs uses a log stream concept similar to Loki’s. A log stream is a time-ordered stream of logs generated by a single application instance. Adjacent logs in a log stream may be related to each other, so it is sometimes useful to look at the surrounding logs in a single log stream when investigating production issues. Logs in a log stream usually have similar structure and a relatively small set of fields with a finite set of values. These properties suggest that it is a good idea to store and process the logs of a single log stream as a single physical unit (data block), as illustrated by the sketch after the following list:

  • This improves compression ratio and reduces the needed storage space for the ingested logs, since similar logs from a single log stream compress better than intermixed logs from many log streams. This also improves query performance, since less data needs to be read from disk.
  • This simplifies and speeds up locating the surrounding log entries for the given log, since they are located physically close to each other in the storage.
  • This speeds up locating all the logs for the given log stream, since they are stored tightly in a small number of physical blocks.
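
To make this concrete, here is a minimal Go sketch of the per-stream grouping idea (the types and names are hypothetical illustrations, not VictoriaLogs internals): a stream ID is derived from the sorted stream labels, and every log entry is appended to the block of its stream, so similar logs end up stored together:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// LogEntry is a hypothetical ingested log record.
type LogEntry struct {
	StreamLabels map[string]string // e.g. {"job": "nginx", "instance": "host-123:4567"}
	Message      string
}

// streamID deterministically hashes the sorted label pairs,
// so all logs with the same label set map to the same stream.
func streamID(labels map[string]string) uint64 {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	h := fnv.New64a()
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte("="))
		h.Write([]byte(labels[k]))
		h.Write([]byte(","))
	}
	return h.Sum64()
}

func main() {
	// blocks groups log messages per stream, so similar logs are stored
	// (and later compressed) together instead of being intermixed.
	blocks := make(map[uint64][]string)
	logs := []LogEntry{
		{map[string]string{"job": "nginx", "instance": "host-1"}, "GET /index.html 200"},
		{map[string]string{"job": "nginx", "instance": "host-2"}, "GET /style.css 200"},
		{map[string]string{"job": "nginx", "instance": "host-1"}, "GET /favicon.ico 404"},
	}
	for _, e := range logs {
		id := streamID(e.StreamLabels)
		blocks[id] = append(blocks[id], e.Message)
	}
	fmt.Println(blocks) // two streams: host-1 holds two adjacent logs, host-2 holds one
}
```

The two host-1 logs land next to each other in one block, which is exactly what makes per-stream compression and "show me the surrounding logs" lookups cheap.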

The definition of a log stream as “a stream of time-ordered logs generated by a single application instance” covers the majority of production cases, where logs are generated by long-running backend services / microservices. But this definition isn’t good for cases where logs are generated by a large number of clients or by short-running tasks (or cron jobs). In such cases the following definition is better from the storage efficiency and query performance point of view: “a log stream is a stream of time-ordered logs with similar structure and the same set of log fields”.

Logs in a single log stream may belong to different application instances. For example, if an application generates only a few logs between restarts and is restarted frequently, it may be a good idea to store the logs from all of its instances in a single log stream. This helps keep the number of log streams under control.

A single application instance can also generate multiple log streams. For example, if an ad server generates logs for requests, views and clicks, these logs likely have different structures and different sets of fields. So it may be a good idea to store the logs for requests, views and clicks in distinct log streams.

How to identify and locate the needed log streams? The best approach is to attach a unique set of key=value labels to every log stream. For example, {job="nginx", instance="host-123:4567"}. Then an inverted index can be built over these labels and used to quickly find the log streams whose labels match the given filters. For example, the {job="nginx"} filter finds all the log streams with the job="nginx" label; a minimal sketch of such an index follows below. VictoriaLogs re-uses the label filters concept from Loki, which, in turn, re-uses label filters from Prometheus. See these docs for details on Prometheus label filters.
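
For illustration, here is a toy inverted index over stream labels in Go (hypothetical code, not VictoriaLogs internals): each key=value pair maps to the set of stream IDs carrying it, so a {job="nginx"} filter turns into a single map lookup:

```go
package main

import "fmt"

// labelIndex is a toy inverted index: "key=value" -> set of stream IDs.
type labelIndex map[string]map[uint64]bool

// add registers every label of the stream in the index.
func (idx labelIndex) add(streamID uint64, labels map[string]string) {
	for k, v := range labels {
		key := k + "=" + v
		if idx[key] == nil {
			idx[key] = make(map[uint64]bool)
		}
		idx[key][streamID] = true
	}
}

// lookup returns the stream IDs matching a single label filter,
// e.g. lookup("job", "nginx") for the {job="nginx"} filter.
func (idx labelIndex) lookup(name, value string) map[uint64]bool {
	return idx[name+"="+value]
}

func main() {
	idx := make(labelIndex)
	idx.add(1, map[string]string{"job": "nginx", "instance": "host-1"})
	idx.add(2, map[string]string{"job": "nginx", "instance": "host-2"})
	idx.add(3, map[string]string{"job": "postgres", "instance": "host-3"})
	fmt.Println(idx.lookup("job", "nginx")) // streams 1 and 2
}
```

Because only the small set of stream labels is indexed (instead of every word in every log line, as in Elasticsearch), this index stays tiny even for huge log volumes.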

So VictoriaLogs uses the same log stream concept as Loki and the same filters on log stream labels. Why is it better than Loki, then?

VictoriaLogs is a better alternative to Loki

VictoriaLogs is free from the Loki downsides outlined above:

  • VictoriaLogs is easy to configure and manage. It consists of a single small executable, which runs optimally on any hardware with the default configs (i.e. it is zero-config). Its capacity and performance automatically scale with the available CPU and RAM. It can run efficiently both on a Raspberry Pi and on a computer with hundreds of CPU cores and terabytes of RAM. It doesn’t need any configs for database schemas and indexes, since it is a schemaless database. See the VictoriaLogs data model for details.
  • A production setup of VictoriaLogs doesn’t need S3-compatible storage. It stores the ingested logs in a single local directory. You can mount any block device with any filesystem at this directory if needed. For example, you can store the data on a durable network-attached block device such as a Google Cloud persistent disk.
  • VictoriaLogs doesn’t break existing configs in new releases, so the upgrade path is very simple: just stop the currently running VictoriaLogs instance and start the new release. Keeping backwards compatibility between releases is a policy and a promise of the core VictoriaMetrics products; see these docs.
  • VictoriaLogs supports high-cardinality log fields such as user_id, trace_id and ip out of the box. It automatically indexes all the log fields and provides fast full-text search over all of them. It is optimized for logs with hundreds of fields (aka wide events). There is no need to store high-cardinality log fields in a plaintext JSON message and then parse the JSON at query time (the recommended workaround for Loki); just store all the log fields as is and get fast full-text search over all of them. VictoriaLogs stores log fields in distinct columns (i.e. it uses column-oriented storage for log fields). This reduces disk space usage, since per-column values usually have a good compression ratio. It also increases query performance when only a small subset of log fields is referenced in the query, since VictoriaLogs reads only the data for the referenced log fields, while skipping the data for the rest.
  • VictoriaLogs provides fast full-text search over plaintext logs. It splits plaintext logs into tokens (words), builds bloom filters for these tokens and then uses the bloom filters for fast skipping of data blocks that do not contain the words / phrases specified in the query (see the simplified sketch after this list). Bloom filters significantly improve the performance of “needle in the haystack” searches such as “find all the logs containing the given trace_id”. Bloom filters are more efficient than the inverted indexes from Elasticsearch when searching over large volumes of logs that do not fit in the available RAM. See this article for technical details.
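
Here is a simplified Go sketch of the bloom-filter idea (real implementations tune the filter size and the number of hash functions per block; this toy version hard-codes both): a per-block filter answers “may this block contain the token?”, and blocks with a negative answer are skipped without reading them:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

const filterBits = 1 << 16 // toy size; real filters are sized per block

// bloom is a toy bloom filter with two hash functions.
type bloom [filterBits / 64]uint64

func hashes(token string) (uint64, uint64) {
	h := fnv.New64a()
	h.Write([]byte(token))
	h1 := h.Sum64()
	h.Write([]byte{0xff}) // derive a second, independent-ish hash
	return h1, h.Sum64()
}

func (b *bloom) add(token string) {
	h1, h2 := hashes(token)
	for _, h := range []uint64{h1, h2} {
		bit := h % filterBits
		b[bit/64] |= 1 << (bit % 64)
	}
}

// mayContain returns false only if the token is definitely absent.
func (b *bloom) mayContain(token string) bool {
	h1, h2 := hashes(token)
	for _, h := range []uint64{h1, h2} {
		bit := h % filterBits
		if b[bit/64]&(1<<(bit%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	blocks := [][]string{
		{"user logged in", "payment accepted"},
		{"trace_id=AAAA-BBBB request failed"},
	}
	// Build one bloom filter per block from the tokens of its log messages.
	filters := make([]bloom, len(blocks))
	for i, block := range blocks {
		for _, msg := range block {
			for _, tok := range strings.Fields(msg) {
				filters[i].add(tok)
			}
		}
	}
	// A "needle in the haystack" search reads only the blocks whose filter
	// says the token may be present; block 0 is skipped without any disk read.
	for i := range blocks {
		fmt.Printf("block %d may contain the token: %v\n", i, filters[i].mayContain("trace_id=AAAA-BBBB"))
	}
}
```

A bloom filter can return false positives but never false negatives, so skipped blocks are guaranteed not to contain the token, while an occasional false positive only costs one extra block read.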

On top of this, VictoriaLogs usually needs less RAM and disk space than Loki, and it executes typical queries over logs much faster. See these benchmark results.
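
As a quick end-to-end illustration, the following Go sketch ingests two JSON log lines into a locally running single-node VictoriaLogs instance and queries them back with LogsQL. It assumes the default listen address localhost:9428 and the /insert/jsonline and /select/logsql/query HTTP endpoints from the VictoriaLogs docs; double-check the exact parameter names and LogsQL syntax against the docs for the version you run:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strings"
	"time"
)

func main() {
	base := "http://localhost:9428" // default single-node VictoriaLogs address

	// Two JSON log lines; _time and _msg are the default timestamp and
	// message fields, while _stream_fields selects the stream labels.
	now := time.Now().UTC().Format(time.RFC3339)
	lines := fmt.Sprintf(`{"_time":%q,"_msg":"user logged in","job":"app","instance":"host-1","user_id":"12345678"}
{"_time":%q,"_msg":"request failed","job":"app","instance":"host-1","trace_id":"AAAA-BBBB-CCCC-DDDD"}
`, now, now)
	resp, err := http.Post(base+"/insert/jsonline?_stream_fields=job,instance",
		"application/stream+json", strings.NewReader(lines))
	if err != nil {
		panic(err)
	}
	resp.Body.Close()

	// Full-text search over a high-cardinality field across all streams:
	// no JSON-encoding workarounds and no schema setup required.
	q := url.QueryEscape(`user_id:="12345678"`)
	resp, err = http.Get(base + "/select/logsql/query?query=" + q)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body)) // matching logs are returned as JSON lines
}
```

Note that the high-cardinality user_id field is queried directly, with no json pipe and no structured metadata configuration.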

Conclusion

VictoriaLogs is the best alternative to Loki (and Elasticsearch):

  • It is zero-config and schemaless.
  • It doesn’t break with upgrades to new releases.
  • It supports high-cardinality fields out of the box.
  • It uses less RAM and less disk space than Loki and Elasticsearch.
  • It executes typical queries at much faster speed than Loki.
  • It provides a much better query language for logs than Loki; see the LogsQL docs.

Try VictoriaLogs right now on your production logs and then decide whether it is worth switching from Loki to VictoriaLogs! I bet you will be more than happy after the switch.

Written by Aliaksandr Valialkin

Founder and core developer at VictoriaMetrics
