Configuring Prometheus Monitoring
This guide explains how to install and configure the Prometheus metrics reporter plugin to expose server-side metrics on Ververica Platform Fluss pods. It also details how to set up target scraping and deploy the fluss-grafana chart to visualize cluster data using curated dashboards.
Overview
Fluss exposes server-side metrics through a pluggable reporter framework. The Prometheus reporter publishes a text-format /metrics endpoint on every coordinator and tablet pod. A Prometheus instance then scrapes this endpoint, and you can visualize the data in Grafana.
Configuring monitoring involves three layers:
- Plugin: Install the metrics-prometheus reporter into the Fluss pods so the reporter classes are on the classpath.
- Reporter: Enable the reporter in the Fluss Helm values. The upstream chart then renders the metrics block in server.yaml and creates a headless metrics Service per role (coordinator and tablet).
- Scrape and Visualize: Point a Prometheus instance at the metrics services, either through annotation-based scraping or using a ServiceMonitor managed by the Prometheus Operator. You can then install the fluss-grafana chart to deploy a curated set of dashboards.
This documentation assumes that you already have a Prometheus stack running in your Kubernetes cluster. Setting up the Prometheus stack is out of scope for this guide.
Prerequisites
Before you configure monitoring, ensure that your environment meets the following requirements:
- A Running Fluss Cluster: Deploy a Fluss cluster using the fluss-bundle chart. For details, see Deploying Fluss on Kubernetes.
- A Prometheus Instance: Set up a Prometheus instance that scrapes the cluster. You can either configure the instance for annotation-based service discovery, or run it with the Prometheus Operator to discover ServiceMonitor resources. Both methods are supported, so choose the option that matches your existing setup.
- Grafana Access: Ensure that Grafana is reachable from the cluster, and connect the Prometheus instance as a data source.
- Registry Access: Verify that you have access to the Ververica registry to pull the fluss-grafana chart. For details, see Obtaining Registry Access.
Installing the Prometheus Reporter Plugin
The Prometheus metrics reporter ships as a plugin, and you must install it into the Fluss pods at startup. The fluss-setup component runs as an init container before the Fluss server starts. This container writes the plugin JAR files to a shared volume that the main container mounts at /opt/fluss/plugins/prometheus/.
Apply the init container and volumes to both fluss.coordinator and fluss.tablet, because the Prometheus reporter runs inside the coordinator server and every tablet server.
The pattern below installs only metrics-prometheus. If you also need filesystem or lake plugins, such as fs-s3, fs-azure, or lake-iceberg-*, add them to the same init container. For details on the multi-plugin pattern, see Installing Fluss.
For a complete reference, including available plugins, command syntax, private Maven repository setup, pre-baking plugins into a custom image, and troubleshooting, see Installing Fluss.
fluss:
  coordinator:
    extraVolumes:
      - name: fluss-plugins
        emptyDir: {}
    extraVolumeMounts:
      - name: fluss-plugins
        mountPath: /opt/fluss/plugins/prometheus
        subPath: prometheus
    initContainers:
      - name: install-plugins
        image: registry.ververica.cloud/platform-images/fluss:0.9.1-vv-2
        command:
          - /bin/sh
          - -c
          - |
            set -e
            /opt/fluss/bin/setup/install.sh fluss \
              metrics-prometheus \
              --force \
              -- \
              -s /path/to/settings.xml -q
        volumeMounts:
          - name: fluss-plugins
            mountPath: /opt/fluss/plugins
  tablet:
    extraVolumes:
      - name: fluss-plugins
        emptyDir: {}
    extraVolumeMounts:
      - name: fluss-plugins
        mountPath: /opt/fluss/plugins/prometheus
        subPath: prometheus
    initContainers:
      - name: install-plugins
        image: registry.ververica.cloud/platform-images/fluss:0.9.1-vv-2
        command:
          - /bin/sh
          - -c
          - |
            set -e
            /opt/fluss/bin/setup/install.sh fluss \
              metrics-prometheus \
              --force \
              -- \
              -s /path/to/settings.xml -q
        volumeMounts:
          - name: fluss-plugins
            mountPath: /opt/fluss/plugins

The -s /path/to/settings.xml flag is optional. You only need this flag when you pull the plugin JAR files from a private Maven repository. If you use the default public Maven Central resolution, you can drop this flag.
To learn how to mount a settings.xml file from a ConfigMap or a Secret, see Installing Fluss.
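As a rough illustration of the Secret variant, the fragment below adds a second volume to the coordinator and points the -s flag at the mounted file. It is a sketch under assumptions, not the procedure from Installing Fluss: the Secret name maven-settings, its settings.xml key, and the /etc/maven mount path are all hypothetical. The extraVolumeMounts block stays as in the example above, and the same change would be repeated under fluss.tablet.

```yaml
fluss:
  coordinator:
    extraVolumes:
      - name: fluss-plugins
        emptyDir: {}
      # Assumption: a Secret named "maven-settings" with a "settings.xml" key
      # already exists in the Fluss namespace.
      - name: maven-settings
        secret:
          secretName: maven-settings
    initContainers:
      - name: install-plugins
        image: registry.ververica.cloud/platform-images/fluss:0.9.1-vv-2
        command:
          - /bin/sh
          - -c
          - |
            set -e
            /opt/fluss/bin/setup/install.sh fluss \
              metrics-prometheus \
              --force \
              -- \
              -s /etc/maven/settings.xml -q
        volumeMounts:
          - name: fluss-plugins
            mountPath: /opt/fluss/plugins
          # Hypothetical mount path; it only has to match the -s argument.
          - name: maven-settings
            mountPath: /etc/maven
            readOnly: true
```

A Secret is used here because a settings.xml file for a private repository usually carries credentials. A ConfigMap volume would follow the same shape.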
For the full set of configurable fields under fluss:, refer to the Fluss Helm Chart documentation.
Enabling the Metrics Reporter in Fluss
Enable the Prometheus reporter under the fluss.metrics block instead of configurationOverrides. The upstream chart explicitly rejects setting metrics.reporters or metrics.reporter.<name>.port through configurationOverrides and fails the Helm install with a validation error.
Add the following to your values.yaml file:
fluss:
  metrics:
    reporters: prometheus
    prometheus:
      port: 9249

Behind the scenes, the chart:
- Renders metrics.reporters: prometheus and metrics.reporter.prometheus.port: 9249 into /opt/fluss/conf/server.yaml on the pod.
- Creates two headless ClusterIP Service resources (<FLUSS_RELEASE>-coordinator-server-metrics-hs and <FLUSS_RELEASE>-tablet-server-metrics-hs), each exposing the configured port.
The next section explains how to add either annotations or labels to those metrics services so a Prometheus instance can discover them.
For the full set of configurable fields under fluss:, refer to the Fluss Helm chart documentation.
Exposing the Metrics Endpoint to Prometheus
Two scraping mechanisms work with the metrics services that the chart creates: annotation-based scraping (where you configure a Prometheus instance to discover services by annotation) and ServiceMonitor-based scraping (which uses the Prometheus Operator). Choose the option that matches your existing Prometheus setup.
The upstream Fluss reference for both methods is Metrics and Monitoring.
Annotation-based scraping
Add Prometheus scrape annotations to the metrics services through metrics.prometheus.service.annotations. The underlying HTTP server for the reporter accepts any path, including / and /metrics. However, the standard Prometheus convention is /metrics, which the upstream fluss-bundle chart annotation test asserts.
fluss:
  metrics:
    reporters: prometheus
    prometheus:
      port: 9249
      service:
        annotations:
          prometheus.io/scrape: "true"
          prometheus.io/path: "/metrics"
          prometheus.io/port: "9249"

Your Prometheus configuration must include a scrape job that targets Kubernetes services using these annotations.
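The annotations only take effect if a scrape job honors them. The following is a minimal sketch of such a job for a plain (non-Operator) Prometheus configuration, modeled on the widely used community relabeling pattern. The job name is an arbitrary choice, and your existing configuration may already contain an equivalent job.

```yaml
scrape_configs:
  - job_name: kubernetes-service-endpoints   # arbitrary name
    kubernetes_sd_configs:
      # Discover the endpoints (pods) behind every Service.
      - role: endpoints
    relabel_configs:
      # Keep only targets whose Service has prometheus.io/scrape: "true".
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Use prometheus.io/path as the metrics path when it is set.
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # Rewrite the target address to use the prometheus.io/port value.
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      # Attach the Kubernetes namespace as a "namespace" label.
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: namespace
```

The last rule matters for the dashboards described later in this guide, which filter every query on a namespace label.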
ServiceMonitor-based scraping (Prometheus Operator)
Tag the metrics services with a label and create a ServiceMonitor resource that matches it. Because the Helm chart does not ship a ServiceMonitor, you must create one yourself.
fluss:
  metrics:
    reporters: prometheus
    prometheus:
      port: 9249
      service:
        portName: metrics
        labels:
          monitoring: enabled

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: fluss-metrics
  namespace: <FLUSS_NAMESPACE>
spec:
  selector:
    matchLabels:
      monitoring: enabled
  endpoints:
    - port: metrics
      path: /

The Prometheus Operator picks up this ServiceMonitor only if its serviceMonitorSelector and serviceMonitorNamespaceSelector allow it. Set both to {} to enable cluster-wide discovery.
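For orientation, the two selectors live on the Prometheus custom resource managed by the Operator. The fragment below is a sketch that shows only those fields. The resource name is a placeholder, and the rest of the spec is omitted and must stay as it is in your deployment.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: <PROMETHEUS_NAME>              # placeholder for your existing resource
  namespace: <PROMETHEUS_NAMESPACE>
spec:
  # An empty selector matches every ServiceMonitor.
  serviceMonitorSelector: {}
  # An empty selector matches every namespace.
  serviceMonitorNamespaceSelector: {}
```

If cluster-wide discovery is too broad for your environment, a matchLabels selector on either field narrows it. In that case the ServiceMonitor or its namespace must carry the matching label.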
Deploying the fluss-grafana Chart
The fluss-grafana chart bundles 2 curated Grafana dashboards (fluss-overview and fluss-detail) for Fluss server metrics. Ververica Platform packages these dashboards as a single ConfigMap that the Grafana dashboard sidecar picks up automatically.
The PromQL expressions in both dashboards use a namespace variable as a parameter. This allows a single Grafana instance to serve multiple Fluss deployments.
To customize the dashboards (such as adding panels, changing thresholds, or dropping charts), execute the following steps:
- Pull the chart locally using helm pull.
- Edit the dashboard JSON files located under the templates/ directory.
- Install your modified copy instead of the stock release.
Prerequisites
- Grafana running in the cluster, configured with the Grafana sidecar for dashboards (kube-prometheus-stack enables this by default).
- A Prometheus datasource registered in Grafana that scrapes the Fluss metrics services.
Configurable values
Install
Log in to the Ververica registry following Obtaining Registry Access, then install the chart, pointing it at your Prometheus datasource:
helm install fluss-grafana \
  oci://registry.ververica.cloud/platform-charts/fluss-grafana \
  --version 0.9.1-vv-2 \
  --namespace <GRAFANA_NAMESPACE> \
  --set datasource=<PROMETHEUS_DATASOURCE_UID>

The datasource value is the UID of a Grafana data source, not its display name. For kube-prometheus-stack, the default Prometheus data source UID is prometheus. To list your data source UIDs, go to Connections → Data sources in the Grafana UI, or query the Grafana API.
To discover the correct chart version, see Obtaining Registry Access.
To override the discovery label for a non-default sidecar setup and pin the dashboards to a dedicated Grafana folder, use the following configuration:
helm install fluss-grafana \
  oci://registry.ververica.cloud/platform-charts/fluss-grafana \
  --version 0.9.1-vv-2 \
  --namespace <GRAFANA_NAMESPACE> \
  --set datasource=<PROMETHEUS_DATASOURCE_UID> \
  --set-string labels.grafana_dashboard=1 \
  --set-string annotations.'k8s-sidecar-target-directory'=/tmp/dashboards/Fluss

After installation, the Grafana dashboard sidecar imports the 2 dashboards into the configured folder, or into the default folder if you did not set a k8s-sidecar-target-directory annotation. Each dashboard exposes a namespace template variable. You must set this variable to the namespace where your Fluss release runs.
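If you prefer a values file over --set flags, the same three settings can be written as follows. This is a direct transcription of the flags from the command above. The file name is up to you, for example grafana-values.yaml passed with -f.

```yaml
# Equivalent of the --set / --set-string flags shown above.
datasource: <PROMETHEUS_DATASOURCE_UID>
labels:
  # Quoted so the value stays a string, as --set-string would make it.
  grafana_dashboard: "1"
annotations:
  k8s-sidecar-target-directory: /tmp/dashboards/Fluss
```

A values file avoids the shell quoting around the annotation key and is easier to keep under version control.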

Verifying End-to-End
1. Confirm Fluss Is Exposing Metrics
Hit the Prometheus reporter directly on a pod:
kubectl exec -n <FLUSS_NAMESPACE> <FLUSS_RELEASE>-tablet-server-0 -- \
  wget -qO- http://localhost:9249/
kubectl exec -n <FLUSS_NAMESPACE> <FLUSS_RELEASE>-coordinator-server-0 -- \
  wget -qO- http://localhost:9249/

Expected output is raw Prometheus text:

# HELP fluss_tabletserver_messagesInPerSecond ...
# TYPE fluss_tabletserver_messagesInPerSecond gauge
fluss_tabletserver_messagesInPerSecond 0.0

2. Confirm the Metrics Services Route to the Pods
kubectl run curl --image=curlimages/curl -it --rm --restart=Never -- \
  curl http://<FLUSS_RELEASE>-tablet-server-metrics-hs.<FLUSS_NAMESPACE>.svc.cluster.local:9249/
kubectl run curl --image=curlimages/curl -it --rm --restart=Never -- \
  curl http://<FLUSS_RELEASE>-coordinator-server-metrics-hs.<FLUSS_NAMESPACE>.svc.cluster.local:9249/

3. Confirm Prometheus Is Scraping Fluss
Port-forward Prometheus and check Status → Targets for entries matching the Fluss ServiceMonitor:
kubectl port-forward -n <PROMETHEUS_NAMESPACE> \
  svc/<PROMETHEUS_SERVICE> 9090:9090

Then open http://localhost:9090/targets?search=fluss. Every Fluss pod should appear with the state UP.
A quick instant query also works:
curl -s "http://localhost:9090/api/v1/query?query=fluss_coordinator_activeTabletServerCount" | jq .
curl -s "http://localhost:9090/api/v1/label/__name__/values" | \
  jq '.data[] | select(startswith("fluss"))'

4. Confirm Dashboards Are Loaded in Grafana
Port-forward Grafana, log in, and search for fluss under Dashboards. The fluss-overview and fluss-detail dashboards should be present. Set the namespace template variable to your Fluss namespace.
Troubleshooting
Reporter returns connection refused
The reporter port is closed. Verify that the reporter is configured by inspecting the rendered server.yaml file on a pod:
kubectl exec -n <FLUSS_NAMESPACE> <FLUSS_RELEASE>-tablet-server-0 -- \
  cat /opt/fluss/conf/server.yaml | grep metrics

Expected output:

metrics.reporters: prometheus
metrics.reporter.prometheus.port: 9249

If those lines are missing, the fluss.metrics.reporters value did not take effect. Re-check your Helm values to ensure that you set the reporter under fluss.metrics instead of fluss.configurationOverrides.
If the lines are present but the port is still closed, the metrics-prometheus plugin is not on the classpath. Re-check the init container logs and inspect the /opt/fluss/plugins/prometheus directory inside the pod.
Metrics services have no metrics port
Confirm that the chart created the metrics services with a named port:
kubectl describe svc -n <FLUSS_NAMESPACE> \
  <FLUSS_RELEASE>-coordinator-server-metrics-hs
kubectl describe svc -n <FLUSS_NAMESPACE> \
  <FLUSS_RELEASE>-tablet-server-metrics-hs

Look for the following output:

Port: metrics 9249/TCP

If the port is unnamed, your fluss.metrics.prometheus.service.portName value did not propagate. If the services are missing altogether, fluss.metrics.reporters might be empty: without the reporter enabled, the chart skips creating the metrics services entirely. A ServiceMonitor configured with port: metrics will not resolve until the services exist and expose a port named metrics.
ServiceMonitor is not discovered by Prometheus
The metrics services resolve, but Prometheus cannot scrape them. This issue might occur due to the following common causes:
- Wrong Endpoint Path: The Fluss Prometheus reporter serves metrics at /, not /metrics. Set endpoints[].path: / on your ServiceMonitor.
- Network Restrictions: A network policy or service mesh might deny traffic from the Prometheus pod namespace to <FLUSS_NAMESPACE> on port 9249.
- Mismatched Port: The ServiceMonitor uses port: metrics, but the metrics service exposes a differently named port. For details, see the previous section.
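For the network-restriction case, the following NetworkPolicy is one possible sketch of an allow rule for scrape traffic. The policy name is hypothetical, and the empty podSelector applies the rule to every pod in the Fluss namespace because this guide does not specify the pod labels. The namespace match relies on the kubernetes.io/metadata.name label that recent Kubernetes versions set automatically.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-prometheus-scrape        # hypothetical name
  namespace: <FLUSS_NAMESPACE>
spec:
  # Applies to all pods in the namespace; narrow this if you know the pod labels.
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        # Allow traffic originating from the Prometheus namespace.
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: <PROMETHEUS_NAMESPACE>
      ports:
        - protocol: TCP
          port: 9249
```

Apply a rule like this only where a restrictive policy already selects the Fluss pods. In a namespace that has no NetworkPolicy at all, adding it would itself limit ingress to the listed port and source. A service mesh needs its own, mesh-specific authorization rule instead.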
Grafana shows “No data” on dashboards
Confirm that the datasource value passed to helm install matches a real Grafana data source UID, and verify that the data source queries the Prometheus instance that is scraping Fluss.
Confirm that the namespace template variable for the dashboard matches the namespace where Fluss is running. The dashboards inject {namespace="$namespace"} into every PromQL query, so an unset or incorrect value yields zero series.
Further Reading
- Apache Fluss observability quickstart: upstream metrics reference.
- Fluss Helm chart values reference: full set of fields under fluss:.
- Fluss configuration reference: list of all metrics.* keys.
- Prometheus Operator API reference: ServiceMonitor schema.
- kube-prometheus-stack Helm chart: serviceMonitorSelector, datasource provisioning, dashboard sidecar.