I’ve previously written about Whisker in my post on migrating from Flannel to Calico. While I like being able to debug network traffic in real time, Whisker does not give me a historical overview that answers questions such as:

  • Which services have the most traffic, measured by packets or bandwidth, in the last 30 minutes?
  • What is the distribution of traffic by protocol over the last 30 minutes?

Version 3.30 of Calico OSS also introduced Goldmane, the gRPC API powering Whisker. Goldmane uses Protobuf and exposes a streaming endpoint for receiving flow logs in real time.

I use Loki to store logs for everything I run in my homelab, including firewall logs from OPNsense. It only makes sense to ingest flow logs into Loki as well. The protobuf payload from Goldmane contains the number of bytes and packets for each network flow. I can use this information with Loki's Metric Queries to turn the flow logs into metrics.
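To make the idea of turning flow logs into metrics concrete, here is a minimal Go sketch of the kind of aggregation such a metric query performs: summing bytes per protocol over the last 30 minutes. The `Flow` struct, its field names, and the sample data are assumptions made up for illustration; they do not mirror Goldmane's protobuf schema, and in the real setup the aggregation runs inside Loki rather than in application code.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Flow is a hypothetical, simplified flow record for illustration only.
// It does not mirror Goldmane's protobuf schema.
type Flow struct {
	Time     time.Time
	Protocol string
	Bytes    int64
}

// bytesByProtocol sums the bytes of all flows seen at or after `since`,
// grouped by protocol.
func bytesByProtocol(flows []Flow, since time.Time) map[string]int64 {
	totals := make(map[string]int64)
	for _, f := range flows {
		if f.Time.Before(since) {
			continue // older than the window, skip it
		}
		totals[f.Protocol] += f.Bytes
	}
	return totals
}

// exampleTotals runs the aggregation over made-up sample data
// using a 30-minute window.
func exampleTotals() map[string]int64 {
	now := time.Now()
	flows := []Flow{
		{Time: now.Add(-5 * time.Minute), Protocol: "tcp", Bytes: 1200},
		{Time: now.Add(-10 * time.Minute), Protocol: "udp", Bytes: 300},
		{Time: now.Add(-20 * time.Minute), Protocol: "tcp", Bytes: 800},
		{Time: now.Add(-45 * time.Minute), Protocol: "tcp", Bytes: 5000}, // outside the window
	}
	return bytesByProtocol(flows, now.Add(-30*time.Minute))
}

func main() {
	totals := exampleTotals()

	// Sort the protocol names so the output order is stable.
	protocols := make([]string, 0, len(totals))
	for p := range totals {
		protocols = append(protocols, p)
	}
	sort.Strings(protocols)

	for _, p := range protocols {
		fmt.Printf("%s: %d bytes\n", p, totals[p])
	}
}
```

With the sample data above, the 45-minute-old flow falls outside the window, so the sketch reports 2000 bytes for tcp and 300 bytes for udp.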

While researching the HTTP API for ingesting logs into Loki, I stumbled upon the documentation for the OTLP endpoint. Having no experience with OTLP, I went down the rabbit hole and learned about OTLP/HTTP, Direct to collector logging, Logs SDK and the log/slog Logging bridge. Using an open standard like OTLP/HTTP instead of targeting Loki specifically sounds like a much better idea.
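For readers equally new to OTLP, the following Go sketch shows roughly what a single log record looks like on the wire when sent over OTLP/HTTP with JSON encoding. It deliberately uses only the standard library instead of the Logs SDK or the log/slog bridge, so the request shape stays visible. The attribute names (`protocol`, `bytes_in`), the `service.name` value, and the helper functions are hypothetical; this is not how calico-flow-logs-otlphttp-exporter is implemented.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"strconv"
)

// Minimal subset of the OTLP JSON logs model, just enough for one record.
type anyValue struct {
	StringValue *string `json:"stringValue,omitempty"`
	IntValue    *string `json:"intValue,omitempty"` // 64-bit integers are strings in OTLP JSON
}

type keyValue struct {
	Key   string   `json:"key"`
	Value anyValue `json:"value"`
}

type logRecord struct {
	TimeUnixNano string     `json:"timeUnixNano"`
	SeverityText string     `json:"severityText"`
	Body         anyValue   `json:"body"`
	Attributes   []keyValue `json:"attributes"`
}

type scopeLogs struct {
	LogRecords []logRecord `json:"logRecords"`
}

type resource struct {
	Attributes []keyValue `json:"attributes"`
}

type resourceLogs struct {
	Resource  resource    `json:"resource"`
	ScopeLogs []scopeLogs `json:"scopeLogs"`
}

type logsRequest struct {
	ResourceLogs []resourceLogs `json:"resourceLogs"`
}

func strAttr(key, v string) keyValue {
	return keyValue{Key: key, Value: anyValue{StringValue: &v}}
}

func intAttr(key string, v int64) keyValue {
	s := strconv.FormatInt(v, 10)
	return keyValue{Key: key, Value: anyValue{IntValue: &s}}
}

// buildPayload wraps one log record in the resourceLogs/scopeLogs envelope.
func buildPayload(unixNano int64, body string, attrs ...keyValue) ([]byte, error) {
	req := logsRequest{ResourceLogs: []resourceLogs{{
		// "flow-log-sketch" is a placeholder service name.
		Resource: resource{Attributes: []keyValue{strAttr("service.name", "flow-log-sketch")}},
		ScopeLogs: []scopeLogs{{LogRecords: []logRecord{{
			TimeUnixNano: strconv.FormatInt(unixNano, 10),
			SeverityText: "INFO",
			Body:         anyValue{StringValue: &body},
			Attributes:   attrs,
		}}}},
	}}}
	return json.Marshal(req)
}

// send POSTs the payload to an OTLP/HTTP logs endpoint that accepts JSON.
func send(endpoint string, payload []byte) error {
	resp, err := http.Post(endpoint, "application/json", bytes.NewReader(payload))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode/100 != 2 {
		return fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return nil
}

func main() {
	// Fixed example timestamp; a real sender would use time.Now().UnixNano().
	payload, err := buildPayload(1700000000000000000, "flow",
		strAttr("protocol", "tcp"), intAttr("bytes_in", 1200))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(payload))

	// Only send when an endpoint is configured, e.g. http://localhost:4318/v1/logs.
	if endpoint := os.Getenv("OTEL_EXPORTER_OTLP_LOGS_ENDPOINT"); endpoint != "" {
		if err := send(endpoint, payload); err != nil {
			fmt.Fprintln(os.Stderr, "send failed:", err)
		}
	}
}
```

Because the request is plain HTTP with a documented payload, anything that implements an OTLP/HTTP receiver with JSON support, such as the OpenTelemetry Collector, can accept it. That is the appeal of targeting the open standard rather than one backend's ingestion API.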

This led me to write calico-flow-logs-otlphttp-exporter, which streams flow logs from Goldmane to anything that's compatible with OTLP/HTTP.

Note: calico-flow-logs-otlphttp-exporter is still in development; use at your own risk.

This is my setup to ingest flow logs from Goldmane into Loki in the homelab:

The OpenTelemetry Collector is not strictly necessary; it's just there in case I want to add processors later.

graph TB
  goldmane[Goldmane]
  exporter[calico-flow-logs-otlphttp-exporter]
  otel-collector[OpenTelemetry Collector]
  loki[Loki]
  grafana[Grafana]
  exporter-->|Subscribe to flow logs streaming endpoint|goldmane
  goldmane-->|Stream flow logs|exporter
  exporter-->|Push logs using OTLP/HTTP|otel-collector
  otel-collector-->|Push logs using OTLP/HTTP|loki
  grafana-->|Consume Datasource|loki

Using Grafana to query flow logs ingested into Loki:

I also created a Grafana dashboard to see the distribution of traffic based on protocol/bandwidth over a specific time period:

I now have historical data for all network flows in the K3s cluster. The flow logs can be analyzed and visualized, and all the data stays in my own infrastructure.
