Stripping dependency bloat in VictoriaMetrics Docker image

Aliaksandr Valialkin
Mar 21, 2019

Let’s compare docker image sizes for popular time series database solutions:

The Docker image for VictoriaMetrics is the smallest: it occupies only 5MB. This is:

  • 6.8 times smaller than the TimescaleDB image
  • 8.6 times smaller than the Prometheus image
  • 10.2 times smaller than the InfluxDB image
  • 31.8 times smaller than the ClickHouse image
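
The ratios above can be turned back into rough image sizes. The sketch below multiplies the rounded 5MB figure by each ratio, so the results are only estimates derived from the numbers in this article, not measured sizes. The function name impliedSizeMB is made up for illustration.

```go
package main

import "fmt"

// impliedSizeMB estimates another image's size from the VictoriaMetrics
// image size and the "N times smaller" ratio quoted in the article.
func impliedSizeMB(baseMB, ratio float64) float64 {
	return baseMB * ratio
}

func main() {
	// 5MB is the rounded figure from the text, so the output is approximate.
	const victoriaMetricsMB = 5.0

	ratios := []struct {
		name  string
		ratio float64
	}{
		{"TimescaleDB", 6.8},
		{"Prometheus", 8.6},
		{"InfluxDB", 10.2},
		{"ClickHouse", 31.8},
	}
	for _, r := range ratios {
		fmt.Printf("%-12s ~%.0f MB\n", r.name, impliedSizeMB(victoriaMetricsMB, r.ratio))
	}
}
```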

Let’s see how VictoriaMetrics achieves such a small image size compared to other TSDB solutions.

Step 1: creating a statically linked binary for the scratch image

VictoriaMetrics is written in Go. The language is known for being able to build statically linked binaries without any dependencies. Such binaries can run on the scratch image in Docker.

By default, though, go build produces a dynamically linked binary for VictoriaMetrics:

$ go build ./app/victoria-metrics/
$ ldd ./victoria-metrics
linux-vdso.so.1 (0x00007ffcec9b8000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f7369714000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7369323000)
/lib64/ld-linux-x86-64.so.2 (0x00007f7369933000)

The built binary depends on system libraries (libpthread and libc), which are missing in the scratch image. To build a statically linked binary, -ldflags "-extldflags '-static'" must be passed to go build:

$ go build -ldflags "-extldflags '-static'" ./app/victoria-metrics/
# github.com/valyala/VictoriaMetrics/app/victoria-metrics
/tmp/go-link-905380395/000004.o: In function `_cgo_7e1b3c2abc8d_C2func_getaddrinfo':
/tmp/go-build/cgo-gcc-prolog:57: warning: Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking

WTF? Well, by default Go uses the system library for DNS resolution if the program uses C libraries via cgo. VictoriaMetrics uses gozstd, which depends on the upstream C library. Passing -tags netgo to go build forces the Go-native DNS resolver in this case. Let’s try it:

$ go build -ldflags "-extldflags '-static'" -tags netgo ./app/victoria-metrics/
$ ldd ./victoria-metrics
not a dynamic executable
$ ./victoria-metrics --help
Usage of ./victoria-metrics:
  -httpListenAddr string
        TCP address to listen for http connections (default ":8428")
... skip ...
  -retentionPeriod int
        Retention period in months (default 1)
... skip ...
  -storageDataPath string
        Path to storage data (default "victoria-metrics-data")
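
The netgo tag picks the resolver at build time. For comparison, the standard library also lets a program ask for the Go-native resolver at run time through the PreferGo field of net.Resolver. The sketch below only illustrates that knob; it is not something VictoriaMetrics is stated to do, and the helper name newGoResolver is made up.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// newGoResolver returns a resolver that asks the standard library to use
// its built-in Go DNS client instead of the system (libc) resolver.
func newGoResolver() *net.Resolver {
	return &net.Resolver{PreferGo: true}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	// "localhost" is used only because it usually resolves without network access.
	addrs, err := newGoResolver().LookupHost(ctx, "localhost")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("localhost resolves to", addrs)
}
```

The build tag remains the simpler choice for a scratch image, since it applies to every lookup in the program, including those made by the default resolver.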

Great! Now we have a working statically linked binary that can run in a scratch Docker image. Here is the complete Dockerfile for building the VictoriaMetrics image:

FROM scratch
COPY --from=local/certs:1.0.1 /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
COPY bin/victoria-metrics-prod .
EXPOSE 8428
ENTRYPOINT ["/victoria-metrics-prod"]

Here bin/victoria-metrics-prod is a statically linked binary built as shown in the previous step.
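
Before copying a binary into a scratch image, it may be worth checking that it really is static, since a dynamically linked binary fails to start there. The sketch below does roughly what the ldd check did, using the standard debug/elf package: it looks for a dynamic loader entry (PT_INTERP) and for DT_NEEDED libraries. The function name isStaticELF is made up, and the check assumes a Linux ELF binary.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// isStaticELF reports whether the ELF file at path has no dynamic loader
// (PT_INTERP) and no DT_NEEDED shared-library entries.
func isStaticELF(path string) (bool, error) {
	f, err := elf.Open(path)
	if err != nil {
		return false, err
	}
	defer f.Close()

	for _, p := range f.Progs {
		if p.Type == elf.PT_INTERP {
			return false, nil
		}
	}
	// ImportedLibraries returns an empty list when there is no dynamic section.
	libs, err := f.ImportedLibraries()
	if err != nil {
		return false, err
	}
	return len(libs) == 0, nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: pass the path to a binary, e.g. bin/victoria-metrics-prod")
		return
	}
	static, err := isStaticELF(os.Args[1])
	if err != nil {
		fmt.Println("cannot inspect binary:", err)
		return
	}
	fmt.Println("statically linked:", static)
}
```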

The following line puts root certificates into the Docker image, so VictoriaMetrics can talk to the outside world over HTTPS:

COPY --from=local/certs:1.0.1 /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
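
To see why this file matters: on Linux, Go's crypto/x509 package looks for a CA bundle in a few well-known locations, and /etc/ssl/certs/ca-certificates.crt is one of them. A scratch image contains no such file unless it is copied in. The sketch below, with the made-up helper loadRootCAs, just checks that a bundle at that path can be parsed. It is an illustration, not part of VictoriaMetrics.

```go
package main

import (
	"crypto/x509"
	"errors"
	"fmt"
	"os"
)

// loadRootCAs reads a PEM bundle and returns a certificate pool built from it.
func loadRootCAs(bundlePath string) (*x509.CertPool, error) {
	pem, err := os.ReadFile(bundlePath)
	if err != nil {
		return nil, err
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(pem) {
		return nil, errors.New("no certificates found in " + bundlePath)
	}
	return pool, nil
}

func main() {
	// Same destination path as in the COPY line above.
	const bundle = "/etc/ssl/certs/ca-certificates.crt"

	if _, err := loadRootCAs(bundle); err != nil {
		fmt.Println("CA bundle not usable:", err)
		return
	}
	fmt.Println("root certificates loaded from", bundle)
}
```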

Now we have a small Docker image for VictoriaMetrics. But its size can be reduced further.

Step 2: removing unneeded Go dependencies

We use Go modules for building VictoriaMetrics. Initially it had a big go.mod file with a ton of external dependencies, much like the go.mod from Prometheus. The majority of these dependencies were transitive and weren’t used by VictoriaMetrics directly. They hurt both build times and the resulting binary size.

We investigated how to remove unneeded dependencies and settled on the following approach:

  • Use small self-contained packages without big transitive dependencies.
  • Extract the required functionality from bloated packages or packages with big transitive dependencies.

Sometimes it was hard to extract the required functionality from a bloated package. In such cases we implemented the functionality from scratch. For example, we removed the github.com/prometheus/client_golang dependency for exposing metrics in Prometheus format, since it was bloated, and created a small self-contained package with the required functionality: github.com/VictoriaMetrics/metrics. Go developers can now choose between the comprehensive but bloated github.com/prometheus/client_golang and the lightweight package from VictoriaMetrics for exposing metrics in Prometheus format :)
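
To give a feel for how little code the core of such a package needs, here is a standard-library-only sketch of a counter rendered as a line of Prometheus text format. The counter type and its methods are hypothetical names for illustration. This is not the API of github.com/VictoriaMetrics/metrics, which offers more than this.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"sync/atomic"
)

// counter is a hypothetical minimal metric type used only for illustration.
type counter struct {
	name string
	n    uint64
}

// Inc atomically increments the counter, so it is safe for concurrent use.
func (c *counter) Inc() { atomic.AddUint64(&c.n, 1) }

// line renders the counter as a single "name value" line.
func (c *counter) line() string {
	return fmt.Sprintf("%s %d\n", c.name, atomic.LoadUint64(&c.n))
}

// writePrometheus writes the rendered line to w, e.g. an HTTP response.
func (c *counter) writePrometheus(w io.Writer) {
	io.WriteString(w, c.line())
}

func main() {
	requests := &counter{name: "requests_total"}
	requests.Inc()
	requests.Inc()
	requests.writePrometheus(os.Stdout) // prints "requests_total 2"
}
```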

Now the go.mod file for VictoriaMetrics contains only essential small third-party packages:

module github.com/valyala/VictoriaMetrics

require (
github.com/VictoriaMetrics/fastcache v1.4.6
github.com/cespare/xxhash v1.1.0
github.com/golang/snappy v0.0.1
github.com/valyala/fastjson v1.4.1
github.com/valyala/fastrand v0.0.0-20170531153657-19dd0f0bf014
github.com/valyala/gozstd v1.3.0
github.com/valyala/quicktemplate v1.0.2
golang.org/x/sys v0.0.0-20190318195719-6c81ef8f67ca
)

This reduced VictoriaMetrics build times from 5 seconds to 1.5 seconds. The resulting statically linked binary shrank from 23MB to 11MB. The binary is compressed to 5MB when put into the scratch Docker image.

Conclusions

It is easy to create small Docker images using the following rules:

  • Build a statically linked binary and put it into a scratch image.
  • Remove unneeded dependencies from the application.

And try single-node VictoriaMetrics. It can substitute for a moderately sized cluster built with competing solutions such as Thanos, Uber M3, Cortex, InfluxDB or TimescaleDB.

Update: we open-sourced the github.com/VictoriaMetrics/metrics package mentioned in the article.

Update 2: we open-sourced the single-node and cluster versions of VictoriaMetrics, so you can inspect their dependencies and Dockerfiles.

Written by Aliaksandr Valialkin

Founder and core developer at VictoriaMetrics
