Self-Managed (not available for BYOC)

Kubernetes Probes

Kubernetes probes support cluster monitoring:

  • Readiness probes are used by Kubernetes to decide when containers are ready to start accepting traffic, and to signal Pods as ready.
  • Liveness probes are used to decide when containers need to be restarted. For example, they can catch deadlock conditions where an application is still running but one or more containers are blocked.

As part of the standard Deployment configuration, Ververica Platform configures default health endpoints for the appmanager and gateway containers to enable Kubernetes readinessProbe and livenessProbe functionality to monitor the behaviour of a running Flink application.

You can check appmanager and gateway container health by running a curl command from inside the container. See Access the Health Endpoint (#access-the-health-endpoint) below.

Defaults

The default values are specified as follows, based on the Deployment configuration template:

YAML
livenessProbe:
  httpGet:
    path: /actuator/health
    port: management
  initialDelaySeconds: 90
  timeoutSeconds: 10
readinessProbe:
  httpGet:
    path: /actuator/health
    port: management
  initialDelaySeconds: 10
  timeoutSeconds: 10

Configuration

To configure alternative endpoints or change the delay and timeout values, add a configuration fragment to your main Helm values file (values.yaml by default), which you pass on the command line when you install or upgrade Ververica Platform:

BASH
helm upgrade --install <release-name> <chart> --values values.yaml

The configurable values are the following:

  • MANAGEMENT_ENDPOINTS_WEB_BASE_PATH: Base path for building the container endpoints.
  • MANAGEMENT_ENDPOINTS_WEB_PATH_MAPPING_HEALTH: Container health probe endpoint, appended to the base path.
  • initialDelaySeconds: Time in seconds that Kubernetes waits after the container starts before it sends the first readinessProbe or livenessProbe request.
  • timeoutSeconds: Time in seconds that the container has to respond to a readinessProbe or livenessProbe request. If the container does not respond in time, the probe counts as failed. When the number of consecutive failures reaches the failure threshold for the probe, the probe failure behaviour is triggered.
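As a rough illustration of how these timings interact, the sketch below estimates how long a container that never responds could run before the liveness failure behaviour is triggered. It assumes the Kubernetes defaults of periodSeconds 10 and failureThreshold 3, which apply only if the chart does not override them, and it ignores kubelet scheduling jitter. The function name estimate_restart_seconds is hypothetical and not part of Ververica Platform.

```shell
# Rough estimate only: the first probe is sent after initialDelaySeconds, further
# probes follow every periodSeconds, and each one fails after timeoutSeconds.
# Arguments: initialDelaySeconds timeoutSeconds [periodSeconds] [failureThreshold]
estimate_restart_seconds() {
  initial_delay=$1
  timeout=$2
  period=${3:-10}            # assumed Kubernetes default
  failure_threshold=${4:-3}  # assumed Kubernetes default
  echo $(( initial_delay + (failure_threshold - 1) * period + timeout ))
}

# With the default liveness settings above (90 s delay, 10 s timeout):
estimate_restart_seconds 90 10   # prints 120
```

Raising initialDelaySeconds gives a slow-starting container more time before the first probe, while raising timeoutSeconds tolerates slower responses once probing has begun.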

Example configuration

To configure the settings for the appmanager container, update the configuration values under the appmanager root property in the Helm values file. In this example, the default endpoint is changed from /actuator/health to /appmanager/health; the probe timings are unchanged:

YAML
appmanager:
  env:
    - name: "MANAGEMENT_ENDPOINTS_WEB_BASE_PATH"
      value: "/appmanager"
    - name: "MANAGEMENT_ENDPOINTS_WEB_PATH_MAPPING_HEALTH"
      value: "/health"
  livenessProbe:
    httpGet:
      path: /appmanager/health # default is /actuator/health
      port: management
    initialDelaySeconds: 90
  readinessProbe:
    httpGet:
      path: /appmanager/health # default is /actuator/health
      port: management
    initialDelaySeconds: 10

To configure the settings for the gateway container, update the configuration values under the gateway root property in the Helm values file. In this example, the default endpoint is changed from /actuator/health to /gateway/health; the probe timings are unchanged:

YAML
gateway:
  env:
    - name: "MANAGEMENT_ENDPOINTS_WEB_BASE_PATH"
      value: "/gateway"
    - name: "MANAGEMENT_ENDPOINTS_WEB_PATH_MAPPING_HEALTH"
      value: "/health"
  livenessProbe:
    httpGet:
      path: /gateway/health # default is /actuator/health
      port: management
    initialDelaySeconds: 90
  readinessProbe:
    httpGet:
      path: /gateway/health # default is /actuator/health
      port: management
    initialDelaySeconds: 10
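In both examples the probe path has to match the endpoint that the two environment variables produce. The following sketch shows that relationship, assuming the endpoint is the base path followed by the health mapping, as the descriptions of the configurable values suggest. The helper probe_path is hypothetical and exists only for illustration.

```shell
# Join a base path and a health mapping into the path a probe should request.
# probe_path is a hypothetical helper, not part of Ververica Platform or Helm.
probe_path() {
  base=${1%/}      # drop one trailing slash from the base path, if present
  mapping=${2#/}   # drop one leading slash from the mapping, if present
  printf '%s/%s\n' "$base" "$mapping"
}

probe_path /appmanager /health   # prints /appmanager/health
probe_path /gateway /health      # prints /gateway/health
```

If the path in livenessProbe or readinessProbe differs from the result, the probes would request an endpoint that the container does not serve.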

To verify the configuration after installation, access the health endpoint as described below and check the output.

Access the Health Endpoint

To access the health endpoint, run the following commands:

  1. From the bash prompt in a terminal, open a shell in the container using kubectl. For example, open a shell in the gateway container with the following command, replacing pod-name with the name of your Pod (this assumes the container image provides /bin/sh):
BASH
kubectl exec -it pod-name -c gateway -- /bin/sh
  2. Run the following curl command to check the health of the gateway container, replacing management-port with the number of the management port. This example assumes the endpoint was reconfigured from the default to /gateway/health:
BASH
curl http://localhost:management-port/gateway/health

The output should be similar to the terminal output below, showing an UP / DOWN status with relevant details:

JSON
{"status":"UP","components":{"db":{"status":"UP","details":{"database":"SQLite","validationQuery":"isValid()"}},"discoveryComposite":{"description":"Discovery Client not initialized","status":"UNKNOWN","components":{"discoveryClient":{"description":"Discovery Client not initialized","status":"UNKNOWN"}}},"diskSpace":{"status":"UP","details":{"total":101203873792,"free":78737813504,"threshold":10485760,"exists":true}}}}
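When you only need the overall result, you can extract the top-level status from the response. The sketch below uses sed for this and assumes the compact single-line layout shown above, where "status" is the first key. The function name health_status is hypothetical.

```shell
# Print the top-level "status" value from a health response read on stdin.
# Assumes the compact layout shown above, where "status" is the first key.
health_status() {
  sed -n 's/^{"status":"\([A-Z_]*\)".*/\1/p'
}

# Example with a shortened response. Inside the container, pipe the output of
# curl -s for the health endpoint into health_status instead.
printf '%s\n' '{"status":"UP","components":{"db":{"status":"UP"}}}' | health_status
```

The function prints nothing if the response does not start with a status field, which a calling script can treat as a failure.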