Upgrade Advisory
This documentation is for Flux (v1) and Helm Operator (v1). Both projects are in maintenance mode and will soon reach end-of-life. We strongly recommend you familiarise yourself with the newest Flux and start looking at your migration path.
For documentation regarding the latest Flux, please refer to this section.
Rollbacks
From time to time a release made by the Helm Operator may fail. This section of the guide explains how you can recover from a failed release by enabling rollbacks.
Caution
Rollbacks of Helm charts containing StatefulSet resources can be a tricky operation, and are one of the main reasons automated rollbacks are not enabled by default. Verify that a manual rollback (using helm) of your Helm chart does not cause any problems before enabling it.
Enabling rollbacks
When rollbacks for a HelmRelease are enabled, the Helm Operator will detect a faulty upgrade, including post-upgrade helm test failures (if enabled), and instruct Helm to perform a rollback. It will not attempt a new upgrade unless it detects a change in values and/or the chart, or retries have been enabled. Changes are detected by comparing the failed release to a fresh dry-run release.
Rollbacks can be enabled by setting .rollback.enable:
spec:
  rollback:
    enable: true
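For context, the snippet above sits inside a complete HelmRelease resource. The following is a minimal sketch of where it goes; the release name, namespace, repository URL, chart name, and version are all placeholders chosen for illustration and do not come from this guide.

```yaml
# Illustrative HelmRelease. All names and chart coordinates are placeholders.
apiVersion: helm.fluxcd.io/v1
kind: HelmRelease
metadata:
  name: my-app                  # hypothetical resource name
  namespace: default
spec:
  releaseName: my-app           # hypothetical Helm release name
  chart:
    repository: https://charts.example.com/   # placeholder repository URL
    name: my-app                               # hypothetical chart name
    version: 1.0.0                             # placeholder version
  rollback:
    enable: true                # roll back when an upgrade fails
```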
Wait interaction
When rollbacks are enabled, resource waiting defaults to true since this is necessary to validate whether the release should be rolled back or not.
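If you prefer to make this behaviour explicit rather than rely on the default, the setting can be written out next to the rollback configuration. This sketch assumes that the resource waiting option referred to here is the .wait field of the HelmRelease spec; check this against the HelmRelease reference for your operator version.

```yaml
spec:
  # Assumption: `wait` is the HelmRelease field that controls resource waiting.
  # With rollbacks enabled it already defaults to true, so this line is optional.
  wait: true
  rollback:
    enable: true
```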
Tweaking the rollback configuration
To get more fine-grained control over how the rollback is performed by Helm, the .rollback section of the HelmRelease resource offers a few additional settings:
spec:
  rollback:
    enable: true
    disableHooks: false
    force: false
    recreate: false
    timeout: 300
The definition of the listed keys is as follows:
enable: Enables performing a rollback when a release fails.
disableHooks (Optional): When set to true, prevents hooks from running during rollback. Defaults to false when omitted.
force (Optional): When set to true, forces resource updates through delete/recreate if needed. Defaults to false when omitted.
recreate (Optional): When set to true, restarts the pods for the resource if applicable. Defaults to false when omitted.
timeout (Optional): Time to wait for any individual Kubernetes operation during rollback, in seconds. Defaults to 300 when omitted.
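As an illustration of the keys described above, here is a sketch of a configuration for a chart whose resources are slow to become ready and whose hooks should not run on rollback. The values 600 and true are arbitrary examples, not recommendations.

```yaml
spec:
  rollback:
    enable: true
    disableHooks: true   # skip hooks while rolling back
    timeout: 600         # seconds per Kubernetes operation (default: 300)
    # force and recreate are omitted, so both default to false
```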
Warning
When your chart requires a high non-default timeout value, it is advised to increase the terminationGracePeriod on the Helm Operator pod to not end up with a release in a faulty state due to the operator receiving a SIGKILL signal during an upgrade.
Enabling retries of rolled back releases
Sometimes the cause of an upgrade failure may be transient. To guard yourself against this, it is possible to instruct the Helm Operator to retry the upgrade of a rolled back release by setting .rollback.retry to true. This will cause the Helm Operator to retry the upgrade until the .rollback.maxRetries limit is reached:
spec:
  rollback:
    enable: true
    retry: true
    maxRetries: 5
The definition of the listed keys is as follows:
enable: Enables performing a rollback when a release fails.
retry (Optional): When set to true, retries the upgrade of a failed release until maxRetries is reached. Defaults to false when omitted.
maxRetries (Optional): The maximum number of retries that should be attempted for a rolled back release. Defaults to 5 when omitted; use 0 for an unlimited number of retries.
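To illustrate the special meaning of 0 described above, the following sketch keeps retrying a rolled back release with no upper bound. Whether unlimited retries are appropriate depends on your chart; this is an example of the syntax only.

```yaml
spec:
  rollback:
    enable: true
    retry: true
    maxRetries: 0   # 0 means an unlimited number of retries
```

Omitting maxRetries altogether while leaving retry set to true bounds the retries at the default of 5.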