How VictoriaMetrics makes instant snapshots for multi-terabyte time series data

  • High insert performance on high-cardinality data. See this article for details.
  • High select performance when large amounts of data are analyzed. See this article and this spreadsheet for details.
  • High compression rate for typical time series data.
  • Online instant snapshots without degrading database operations.

A few words about ClickHouse

VictoriaMetrics stores data in data structures similar to the MergeTree table engine from ClickHouse. Let’s refresh our memories about ClickHouse. It is the fastest database for analytical data and for other event streams. It outperforms conventional databases such as PostgreSQL and MySQL by 10x-1000x on typical analytical queries. This means a single ClickHouse server may substitute for a big cluster of conventional databases. Hurry up to give it a try and reduce operational costs :)

What is MergeTree?

MergeTree is a column-oriented table engine built on a data structure similar to a Log Structured Merge (LSM) tree. MergeTree properties:

  • Data for each column is stored separately. This reduces overhead during column scans, since there is no need to spend resources on reading and skipping data for other columns. This also improves the per-column compression ratio, since individual columns usually contain similar data.
  • Rows are sorted by a “primary key”, which may span multiple columns. There is no unique constraint on a primary key, so multiple rows may have an identical primary key. Sorting allows for quick row lookups and range scans by a primary key or by its prefix. Additionally this improves the compression ratio, since consecutive sorted rows usually contain similar data.
  • Rows are split into moderately sized blocks. Each block consists of per-column sub-blocks. Each block is processed independently. This means close-to-perfect scalability on multi-CPU systems: just feed all the available CPU cores with independent blocks. Block size may be configured, but it is advisable to use sub-blocks with sizes in the range of 64KB-2MB, so they fit CPU caches. This improves performance, since CPU cache access is much faster than RAM access. Additionally this reduces overhead when only a few rows must be accessed out of a block with many rows.
  • Blocks are merged into “parts”. These parts are similar to SSTables from a Log Structured Merge (LSM) tree. ClickHouse merges smaller parts into bigger parts in the background. Unlike canonical LSM, MergeTree doesn’t have strict levels with similarly-sized parts. The merge process improves query performance, since fewer parts are inspected with each query. Additionally the merge process reduces the number of data files, since each part contains a fixed number of files proportional to the number of columns. Merging parts has yet another benefit: a better compression rate, since it brings column data for sorted rows closer together.
  • Parts are grouped into partitions by a “partitioning key”. Initially ClickHouse allowed creating per-month partitions on a Date column. Now arbitrary expressions may be used for building the partitioning key. Distinct values of the partitioning key result in separate partitions. This allows fast and easy per-partition data archiving / removal.
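
To make the sorted-rows and part-merging properties above a bit more concrete, here is a minimal in-memory sketch in Go. It is not ClickHouse or VictoriaMetrics code: the `Row`, `Part`, `NewPart`, `MergeParts` and `RangeScan` names are hypothetical, rows are stored row-wise rather than per column, and nothing touches disk. It only illustrates that parts are sorted runs with a non-unique key, that two parts can be merged into a new part without modifying the inputs, and that sorting makes range scans by key prefix cheap.

```go
package main

import (
	"fmt"
	"sort"
)

// Row is a hypothetical row; Key plays the role of the primary key.
type Row struct {
	Key   string
	Value float64
}

// Part is a sorted run of rows. It is treated as immutable once built.
type Part []Row

// NewPart copies rows and sorts the copy by key.
func NewPart(rows []Row) Part {
	p := append(Part(nil), rows...)
	sort.SliceStable(p, func(i, j int) bool { return p[i].Key < p[j].Key })
	return p
}

// MergeParts merges two sorted parts into a new sorted part.
// The inputs are left untouched. Rows with equal keys are all kept,
// since the primary key carries no unique constraint.
func MergeParts(a, b Part) Part {
	out := make(Part, 0, len(a)+len(b))
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		if a[i].Key <= b[j].Key {
			out = append(out, a[i])
			i++
		} else {
			out = append(out, b[j])
			j++
		}
	}
	out = append(out, a[i:]...)
	out = append(out, b[j:]...)
	return out
}

// RangeScan returns rows with lo <= Key < hi using binary search,
// which is possible only because the part is sorted by key.
func (p Part) RangeScan(lo, hi string) []Row {
	start := sort.Search(len(p), func(i int) bool { return p[i].Key >= lo })
	end := sort.Search(len(p), func(i int) bool { return p[i].Key >= hi })
	return p[start:end]
}

func main() {
	small := NewPart([]Row{{"mem", 0.3}, {"cpu", 0.7}})
	big := NewPart([]Row{{"disk", 0.5}, {"cpu", 0.9}})

	merged := MergeParts(small, big)
	fmt.Println("merged part:", merged)
	fmt.Println("rows with keys in [cpu, d):", merged.RangeScan("cpu", "d"))
}
```

A query against the merged part inspects one sorted run instead of two, which is the reason background merges help query performance.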

Instant snapshots in VictoriaMetrics

VictoriaMetrics stores time series data in MergeTree-like tables, so it benefits from the features mentioned above. Additionally it stores an inverted index for fast lookups by the given time series selectors. The inverted index is stored in a mergeset, a data structure built on top of MergeTree ideas and optimized for inverted index lookups.

  • Newly added parts either appear in the MergeTree or fail to appear. MergeTree never contains partially created parts. The same applies to the merge process: parts are either fully merged into a new part or fail to merge. There are no partially merged parts in MergeTree. Does this mean that parts appear out of the blue? No. Parts are assembled in temporary directories and then atomically moved to MergeTree. The same applies to merges: old parts are atomically swapped with the new part when it is ready.
  • Part contents in MergeTree never change. Never ever. Parts are immutable. They may only be deleted after being merged into a bigger part.

Conclusion

VictoriaMetrics built instant snapshots on a brilliant idea from the MergeTree table engine in ClickHouse. These snapshots may be created at any time without downtime or disruption to normal VictoriaMetrics operations.

  • /snapshot/list — lists available snapshots
  • /snapshot/create — creates a new snapshot
  • /snapshot/delete?snapshot=… — deletes the given snapshot
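
As a rough illustration of calling the endpoints listed above, here is a small Go client for `/snapshot/create`. Only the endpoint path comes from this article. The response shape (a JSON object with `status` and `snapshot` fields) is an assumption made for this sketch, and the `CreateSnapshot` and `newStubServer` helpers are hypothetical. The example runs against a local stub server so that it is self-contained; to try it against a real instance, pass that instance's base URL instead.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// snapshotResponse mirrors the JSON shape assumed in this sketch.
type snapshotResponse struct {
	Status   string `json:"status"`
	Snapshot string `json:"snapshot"`
}

// CreateSnapshot calls <baseURL>/snapshot/create and returns the snapshot name.
func CreateSnapshot(client *http.Client, baseURL string) (string, error) {
	resp, err := client.Get(baseURL + "/snapshot/create")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected HTTP status: %s", resp.Status)
	}
	var sr snapshotResponse
	if err := json.NewDecoder(resp.Body).Decode(&sr); err != nil {
		return "", err
	}
	if sr.Status != "ok" {
		return "", fmt.Errorf("snapshot not created: status=%q", sr.Status)
	}
	return sr.Snapshot, nil
}

// newStubServer returns a stand-in server with a made-up response,
// used here only so the example runs without a real database.
func newStubServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/snapshot/create", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"status":"ok","snapshot":"example-snapshot"}`)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newStubServer()
	defer srv.Close()

	name, err := CreateSnapshot(srv.Client(), srv.URL)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("created snapshot:", name)
}
```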

Aliaksandr Valialkin

Founder and core developer at VictoriaMetrics