Deployments

Applies to Self-Managed v2

Deployments are the core resource abstraction within Ververica Platform to manage Apache Flink® jobs. A Deployment specifies the desired state of a Flink job and its configuration. Ververica Platform tracks and reports each Deployment's status and derives other resources from it. Whenever the Deployment specification is modified, Ververica Platform will ensure that the corresponding Flink job will eventually reflect this change.

Overview

Deployments tie together a sequence of deployed Flink Jobs, their accumulated state, an event log, policies to execute upgrades, and a template for creating Flink jobs. A Deployment will always be backed by zero or one active Flink job(s). Each Flink job is either executed in application mode or session mode.

At a high level, each Deployment consists of three parts:

metadata Auxiliary information (e.g. name or modification timestamps) to manage the Deployment. This part is mostly user-configurable.
spec Configuration and behaviour of the Deployment. This part is fully user-configurable.
status Current status of the Deployment. This part is read-only.

The main part of a Deployment is its spec which is described in the following sections.

Specification

The spec section is the main part of each Deployment and consists of:

A Desired State to control the state of a Deployment.
A Deployment Mode that determines how to deploy the Flink job.
A Deployment Template that defines the necessary information needed to deploy a Flink job.
Deployment Upgrades settings that define which strategy to apply when upgrading a running Flink job.
Limits on the allowed operations Ververica Platform will perform in various scenarios.
When creating or modifying a Deployment, optional fields will be expanded with their default values (Deployment Defaults).

Desired State

A Deployment resource has a desired state (specified in spec.state). A Deployment's desired state can be one of the following:

RUNNING This indicates that the user would like to start a Flink job as defined in the spec.template section.
CANCELLED This indicates that the user would like to terminate any currently running Flink jobs.
SUSPENDED This indicates that the user would like to gracefully terminate the currently running job and take a snapshot of its state.

When the desired state does not match the current state, Ververica Platform will try to perform a state transition to achieve the desired state.

Note

The desired state will only be eventually achieved. The behavior of modifications happening concurrently to a state transition is undefined. Ververica Platform only guarantees that eventually the latest modification will be acted upon. Intermediate states may not be observed. This means that an ongoing state transition will not necessarily be immediately interrupted if the desired state changes while an operation is still ongoing.
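The note above can be illustrated with a toy model. This is only a sketch under stated assumptions: the class `DeploymentModel` and its methods are hypothetical, transitions are treated as instantaneous, and the real platform tracks more status values than the three desired states used here. It shows that when the desired state is modified twice between two reconciliation passes, only the latest value is acted upon.

```python
class DeploymentModel:
    """Toy model (hypothetical): a desired state, a current state, a transition log."""

    def __init__(self, state="CANCELLED"):
        self.desired = state
        self.current = state
        self.transitions = []

    def set_desired(self, state):
        # Each modification simply overwrites the previous desired state.
        self.desired = state

    def reconcile(self):
        # Runs between transitions and only looks at the latest desired state.
        if self.current != self.desired:
            self.transitions.append((self.current, self.desired))
            self.current = self.desired


deployment = DeploymentModel()
deployment.set_desired("RUNNING")
deployment.reconcile()                # CANCELLED -> RUNNING
deployment.set_desired("SUSPENDED")   # overwritten before the next pass
deployment.set_desired("CANCELLED")
deployment.reconcile()                # RUNNING -> CANCELLED
print(deployment.transitions)
```

In this model the intermediate SUSPENDED request never produces a transition, which mirrors the statement that intermediate states may not be observed.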

Please refer to the Deployment Lifecycle page for more details.

Deployment Mode

Deployments can either be executed in application or session mode. This can be configured by providing one of the following attributes:

deploymentTargetName Execute the Deployment in application mode. The name references an existing Deployment Target in the same Namespace. In this mode, the deployed Flink job will have exclusive access to the Flink cluster.
sessionClusterName Execute the Deployment in session mode. The name references an existing Session Cluster resource in the same Namespace. In this mode, deployed Flink jobs share resources with other Flink jobs running on the same Flink cluster.
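As a rough illustration of how the two attributes select the mode, the following sketch inspects a spec represented as a plain dictionary. The function name `deployment_mode` is hypothetical, and rejecting a spec that sets both or neither attribute is an assumption of this sketch, not documented platform behaviour. A `null` value is treated as "not set", as in the YAML examples on this page.

```python
def deployment_mode(spec):
    """Return 'application' or 'session' for a spec dict (hypothetical helper)."""
    target = spec.get("deploymentTargetName")
    session = spec.get("sessionClusterName")
    if target and not session:
        return "application"
    if session and not target:
        return "session"
    # Assumption of this sketch: exactly one of the two attributes must be set.
    raise ValueError("set exactly one of deploymentTargetName or sessionClusterName")


print(deployment_mode({"deploymentTargetName": "test", "sessionClusterName": None}))
```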

Please refer to the Deployment Modes page for more details.

Non-production mode

A job can be deployed in non-production mode strictly for testing. Such jobs do not count against license resources and are not supported as production deployments.

Deployment Template

The Deployment template section specifies which Apache Flink® job to execute and how to execute it, including its configuration.

You can think of settings in the template being directly applied to your job instances, whereas the overall Deployment specification section defines how to control these jobs over time (for instance, how to do upgrades).

Please refer to the Deployment Templates page for more details.

Deployment Upgrades

When Ververica Platform discovers that a running Flink job deviates from the specified Deployment template, it will perform an upgrade of the Flink job to reconcile the situation. The behaviour of Ververica Platform is defined by an upgrade strategy (spec.upgradeStrategy) and a restore strategy (spec.restoreStrategy).

Please refer to the Deployment Upgrades page for more details.

Deployment Defaults

When a Deployment is created, default values for attributes in the spec section will be filled in as specified in the global and namespace Deployment defaults configuration. Therefore, a Deployment will always be fully specified after it has been created.

Please refer to the Deployment Defaults page for more details.

Strategic Merge Patch for Kubernetes Pod Templates

Users can add volumes, volumeMounts, and environment variables on top of the base settings in the Deployment Defaults. This feature enables merging new changes without sacrificing existing configurations.

Unlike Kubernetes' built-in strategic merge patch, which applies predefined merge rules to Kubernetes objects, this YAML directive lets users specify which array to patch using strategic merge semantics. The directive is an element of the YAML array it applies to and has two fields:

$patch: Indicates the patch method (currently only merge is supported).
mergeKey: Specifies the field used to search for the target element in the target array (deployment default array) and the patch element in the patch array (deployment array) to merge.

The patch directive has several key points to consider:

It applies specifically to the YAML editor for deployment (JAR/PYTHON/SQL) creation.
It only applies to arrays within the deployment spec section.
It is applicable to nested arrays.
If mergeKey is missing in the directive, the patch will be appended to the target.
If the field specified by the mergeKey is absent in an element of the target array, the element will be retained in the merge result.
If the field specified by the mergeKey is absent in an element of the patch array, the element will be appended to the merge result.

Below is an example of each component of the merge process: Deployment Default, New Deployment, and the merged result.

Deployment Default:

YAML

spec:
  maxJobCreationAttempts: 4
  maxSavepointCreationAttempts: 4
  restoreStrategy:
    allowNonRestoredState: false
    kind: LATEST_STATE
  state: CANCELLED
  template:
    metadata:
      annotations:
        flink.queryable-state.enabled: 'false'
        flink.security.ssl.enabled: 'false'
    spec:
      artifact:
        flinkImageRegistry: eu.gcr.io/vvp-devel-240810
        flinkImageRepository: flink
        flinkImageTag: 1.18.1-stream1-scala_2.12-java8
        flinkVersion: '1.18'
        kind: JAR
      flinkConfiguration:
        execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION
        execution.checkpointing.interval: 10s
        taskmanager.numberOfTaskSlots: '1'
        web.cancel.enable: 'false'
      kubernetes:
        taskManagerPodTemplate:
          spec:
            containers:
              - name: container-1
                env:
                  - name: name-1-a
                    value: value-1-a
                  - name: name-1-b
                    value: value-1-b
                volumeMounts:
                  - mountPath: /volume-1
                    name: volume-1
                  - mountPath: /volume-2
                    name: volume-2
              - name: container-2
                env:
                  - name: name-2-a
                    value: value-2-a
                  - name: name-2-b
                    value: value-2-b
                volumeMounts:
                  - mountPath: /volume-1
                    name: volume-1
                  - mountPath: /volume-2
                    name: volume-2
            volumes:
              - name: volume-1
                secret:
                  secretName: name-1
              - name: volume-2
                secret:
                  secretName: name-2
      logging:
        log4jLoggers:
          '': INFO
        loggingProfile: default
      parallelism: 2
      resources:
        jobmanager:
          cpu: 1
          memory: 1G
        taskmanager:
          cpu: 1
          memory: 2G
  upgradeStrategy:
    kind: STATEFUL

Deployment:

YAML

metadata:
  displayName: test
spec:
  deploymentTargetId: null
  deploymentTargetName: test
  sessionClusterName: null
  template:
    spec:
      artifact:
        jarUri: >-
          https://repo1.maven.org/maven2/org/apache/flink/flink-examples-streaming/1.18.0/flink-examples-streaming-1.18.0-WindowJoin.jar
        kind: JAR
      kubernetes:
        taskManagerPodTemplate:
          spec:
            containers:
              - name: container-1
                env:
                  - name: name-1-a
                    value: value-1-a
                  - name: name-1-b
                    value: value-1-b
                volumeMounts:
                  - mountPath: /volume-1
                    name: volume-1-update
                  - mountPath: /volume-3
                    name: volume-3
                  - $patch: merge
                    mergeKey: mountPath
              - name: container-2
                env:
                  - name: name-2-a
                    value: value-2-a
                  - name: name-2-b
                    value: value-2-b
                volumeMounts:
                  - mountPath: /volume-1
                    name: volume-1-update
                  - mountPath: /volume-3
                    name: volume-3
              - name: container-3
                env:
                  - name: name-3-a
                    value: value-3-a
                  - name: name-3-b
                    value: value-3-b
                volumeMounts:
                  - mountPath: /volume-1
                    name: volume-1
                  - mountPath: /volume-2
                    name: volume-2
              - $patch: merge
                mergeKey: env
            volumes:
              - name: volume-1
                secret:
                  secretName: name-1-update
              - name: volume-3
                secret:
                  secretName: name-3
              - $patch: merge
                mergeKey: name

The merged result:

YAML

metadata:
  displayName: test
spec:
  deploymentTargetId: null
  deploymentTargetName: test
  maxJobCreationAttempts: 4
  maxSavepointCreationAttempts: 4
  restoreStrategy:
    allowNonRestoredState: false
    kind: LATEST_STATE
  sessionClusterName: null
  state: CANCELLED
  template:
    metadata:
      annotations:
        flink.queryable-state.enabled: 'false'
        flink.security.ssl.enabled: 'false'
    spec:
      artifact:
        flinkImageRegistry: eu.gcr.io/vvp-devel-240810
        flinkImageRepository: flink
        flinkImageTag: 1.18.1-stream1-scala_2.12-java8
        flinkVersion: '1.18'
        jarUri: >-
          https://repo1.maven.org/maven2/org/apache/flink/flink-examples-streaming/1.18.0/flink-examples-streaming-1.18.0-WindowJoin.jar
        kind: JAR
      flinkConfiguration:
        execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION
        execution.checkpointing.interval: 10s
        taskmanager.numberOfTaskSlots: '1'
        web.cancel.enable: 'false'
      kubernetes:
        taskManagerPodTemplate:
          spec:
            containers:
              - env:
                  - name: name-1-a
                    value: value-1-a
                  - name: name-1-b
                    value: value-1-b
                name: container-1
                volumeMounts:
                  - mountPath: /volume-1
                    name: volume-1-update
                  - mountPath: /volume-2
                    name: volume-2
                  - mountPath: /volume-3
                    name: volume-3
              - env:
                  - name: name-2-a
                    value: value-2-a
                  - name: name-2-b
                    value: value-2-b
                name: container-2
                volumeMounts:
                  - mountPath: /volume-1
                    name: volume-1-update
                  - mountPath: /volume-3
                    name: volume-3
              - env:
                  - name: name-3-a
                    value: value-3-a
                  - name: name-3-b
                    value: value-3-b
                name: container-3
                volumeMounts:
                  - mountPath: /volume-1
                    name: volume-1
                  - mountPath: /volume-2
                    name: volume-2
            volumes:
              - name: volume-1
                secret:
                  secretName: name-1-update
              - name: volume-2
                secret:
                  secretName: name-2
              - name: volume-3
                secret:
                  secretName: name-3
      logging:
        log4jLoggers:
          '': INFO
        loggingProfile: default
      parallelism: 2
      resources:
        jobmanager:
          cpu: 1
          memory: 1G
        taskmanager:
          cpu: 1
          memory: 2G
  upgradeStrategy:
    kind: STATEFUL
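The merge rules listed earlier can be approximated in a few lines of Python. This is a minimal sketch, not the platform's implementation: the function names (`merge_value`, `merge_list`, `find_match`) are hypothetical, the data is modelled as plain dictionaries and lists, and the behaviour for arrays without a directive (the Deployment's array replaces the default array) is inferred from the container-2 volumeMounts in the example above.

```python
from copy import deepcopy


def is_directive(element):
    return isinstance(element, dict) and "$patch" in element


def find_match(target, element, merge_key):
    """Index of the target element sharing the mergeKey value, or None."""
    if merge_key is None or not isinstance(element, dict) or merge_key not in element:
        return None
    for index, candidate in enumerate(target):
        if (isinstance(candidate, dict) and merge_key in candidate
                and candidate[merge_key] == element[merge_key]):
            return index
    return None


def merge_list(target, patch):
    directive = next((e for e in patch if is_directive(e)), None)
    elements = [e for e in patch if not is_directive(e)]
    if directive is None:
        # Inferred from the example: without a directive the patch array wins.
        return [merge_value(None, e) for e in elements]
    merge_key = directive.get("mergeKey")
    merged = deepcopy(target)  # target elements are always retained
    for element in elements:
        index = find_match(target, element, merge_key)
        if index is None:
            merged.append(merge_value(None, element))  # no match: append
        else:
            merged[index] = merge_value(target[index], element)  # match: merge
    return merged


def merge_value(target, patch):
    """Merge a patch (Deployment) value onto a target (default) value."""
    if isinstance(patch, dict):
        base = target if isinstance(target, dict) else {}
        merged = deepcopy(base)
        for key, value in patch.items():
            merged[key] = merge_value(base.get(key), value)
        return merged
    if isinstance(patch, list):
        return merge_list(target if isinstance(target, list) else [], patch)
    return deepcopy(patch)


default_mounts = [
    {"mountPath": "/volume-1", "name": "volume-1"},
    {"mountPath": "/volume-2", "name": "volume-2"},
]
deployment_mounts = [
    {"mountPath": "/volume-1", "name": "volume-1-update"},
    {"mountPath": "/volume-3", "name": "volume-3"},
    {"$patch": "merge", "mergeKey": "mountPath"},
]
merged_mounts = merge_value(default_mounts, deployment_mounts)
print(merged_mounts)
```

Applied to the container-1 volumeMounts from the example, the sketch updates /volume-1, retains /volume-2, and appends /volume-3, in that order.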

Limits

During state transitions of a Deployment, Ververica Platform creates a Job resource and potentially triggers a Savepoint.

These operations might fail due to transient reasons, like a network issue, or misconfiguration. For these cases you can limit the number of attempts that Ververica Platform tries to create a Job or Savepoint before transitioning to a FAILED state:

Key	Description	Default
maxSavepointCreationAttempts	Maximum attempted Savepoints before failing.	4
maxJobCreationAttempts	Maximum Job creation attempts before failing.	4

The behavior on failed job creation attempts can be customized by setting the jobFailureExpirationTime duration parameter. Any attempts older than the specified value will be disregarded. If set to 0 or not specified, old attempts will be considered indefinitely.
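The interaction between an attempt limit and the expiration time can be sketched as follows. The function `limit_reached` is hypothetical, and two details are assumptions of this sketch rather than documented behaviour: the limit counts as reached once the number of relevant failed attempts equals the maximum, and an attempt exactly as old as the expiration time still counts.

```python
from datetime import datetime, timedelta


def limit_reached(failure_times, now, max_attempts, expiration=timedelta(0)):
    """Return True if the failed attempts that still count reach max_attempts."""
    if expiration > timedelta(0):
        # Attempts older than the expiration time are disregarded.
        relevant = [t for t in failure_times if now - t <= expiration]
    else:
        # 0 or unspecified: old attempts are considered indefinitely.
        relevant = list(failure_times)
    return len(relevant) >= max_attempts


now = datetime(2024, 1, 1, 12, 0, 0)
failures = [now - timedelta(minutes=m) for m in (10, 8, 6, 1)]

print(limit_reached(failures, now, max_attempts=4))
print(limit_reached(failures, now, max_attempts=4, expiration=timedelta(minutes=5)))
```

With no expiration all four failures count, so the limit of 4 is reached. With a five-minute expiration only the most recent failure is considered, so the limit is not reached.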

The jobFailureExpirationTime value needs to conform to the format “{amount}{time unit}”, e.g. “123ms”, “321 s”. Supported time units are:

Unit	shorthand
DAYS	d, day
HOURS	h, hour
MINUTES	min, minute
SECONDS	s, sec, second
MILLISECONDS	ms, milli, millisecond
MICROSECONDS	µs, micro, microsecond
NANOSECONDS	ns, nano, nanosecond
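To make the format concrete, here is a small parser for such duration strings. It is a sketch only: the name `parse_duration_nanos` is hypothetical, it accepts exactly the shorthands listed in the table above, and it rejects values without a unit because this page does not describe how those are interpreted.

```python
import re

# Nanoseconds per unit, keyed by the shorthands from the table above.
NANOS_PER_UNIT = {}
for labels, nanos in [
    (("d", "day"), 86_400 * 10**9),
    (("h", "hour"), 3_600 * 10**9),
    (("min", "minute"), 60 * 10**9),
    (("s", "sec", "second"), 10**9),
    (("ms", "milli", "millisecond"), 10**6),
    (("µs", "micro", "microsecond"), 10**3),
    (("ns", "nano", "nanosecond"), 1),
]:
    for label in labels:
        NANOS_PER_UNIT[label] = nanos

# An integer amount, optional whitespace, then a unit label.
_DURATION = re.compile(r"\s*(\d+)\s*(\D+?)\s*")


def parse_duration_nanos(text):
    """Parse '{amount}{time unit}' (e.g. '123ms', '321 s') into nanoseconds."""
    match = _DURATION.fullmatch(text)
    if match is None:
        raise ValueError(f"not a duration: {text!r}")
    amount, unit = match.groups()
    try:
        return int(amount) * NANOS_PER_UNIT[unit.lower()]
    except KeyError:
        raise ValueError(f"unknown time unit: {unit!r}") from None


print(parse_duration_nanos("123ms"))
print(parse_duration_nanos("321 s"))
```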

Full Example

The following snippet is a complete example of a Deployment in application mode, including all optional keys and a Deployment Template. A Deployment in session mode references a Session Cluster through sessionClusterName instead of deploymentTargetName.

Application Mode

YAML

kind: Deployment
apiVersion: v1
metadata:
  name: top-speed-windowing-example
  displayName: TopSpeedWindowing Example
  labels:
    env: testing
spec:
  state: RUNNING
  deploymentTargetName: default
  restoreStrategy:
    kind: LATEST_STATE
  upgradeStrategy:
    kind: STATEFUL
  maxSavepointCreationAttempts: 4
  maxJobCreationAttempts: 4
  template:
    metadata:
      annotations:
        flink.queryable-state.enabled: 'false'
        flink.security.ssl.enabled: 'false'
    spec:
      artifact:
        kind: jar
        jarUri: https://repo1.maven.org/maven2/org/apache/flink/flink-examples-streaming_2.12/1.12.7/flink-examples-streaming_2.12-1.12.7-TopSpeedWindowing.jar
        additionalDependencies:
        - s3://mybucket/some_additional_library.jar
        - s3://mybucket/some_additional_resources
        mainArgs: --windowSize 10 --windowUnit minutes
        entryClass: org.apache.flink.streaming.examples.windowing.TopSpeedWindowing
        flinkVersion: 1.12
        flinkImageRegistry: registry.ververica.com/v2.10
        flinkImageRepository: flink
        flinkImageTag: 1.12.7-stream2-scala_2.12
      flinkConfiguration:
        execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION
        execution.checkpointing.interval: 10s
        high-availability: vvp-kubernetes
        state.backend: filesystem
      parallelism: 2
      numberOfTaskManagers: 2
      resources:
        jobManager:
          cpu: 1
          memory: 1g
        taskManager:
          cpu: 1.0
          memory: 2g
      logging:
        loggingProfile: default
        log4jLoggers:
          "": INFO
          org.apache.flink.streaming.examples: DEBUG
      kubernetes:
        pods:
          envVars:
          - name: KEY
            value: VALUE
