Question
ING
NL
Last activity: 7 Mar 2024 6:33 EST
Quiescing and graceful shutdown in Kubernetes environments
Good day.
Currently, we are attempting to implement zero-downtime rolling restarts of Pega deployments in Kubernetes.
Pega is deployed via up-to-date Helm charts into its own namespace.
What we expect to see during a rollout restart is pega-web pods terminating gracefully, with all clients connected to those pods via the Kubernetes Ingress/Service moved to new pods without disruption.
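For context, the rolling update behaviour we are aiming for on the web tier looks roughly like the sketch below; the names, labels, image and numbers are illustrative placeholders, not values copied from the actual Pega Helm chart.

# Sketch of the rolling update behaviour we are aiming for on the web tier.
# Names, labels, image and numbers are illustrative placeholders, not values
# copied from the actual Pega Helm chart.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pega-web
  namespace: pega
spec:
  replicas: 3
  selector:
    matchLabels:
      app: pega-web
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # keep the full replica count serving during the rollout
      maxSurge: 1         # start a replacement pod before an old one is terminated
  template:
    metadata:
      labels:
        app: pega-web
    spec:
      containers:
        - name: pega-web-tomcat
          image: pega-web:latest   # placeholder image reference
          ports:
            - containerPort: 8080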
Unfortunately, we cannot quite achieve this yet.
What happens is that when a pega-web pod is scheduled for termination, it is immediately removed from the Endpoints resource and thus becomes unavailable to clients and ingresses.
No network traffic from the Ingress or end users is allowed to reach the terminating pod any more.
This is as expected, per design of Kubernetes itself.
This unfortunately means guaranteed downtime and loss of sessions for our end users.
We have tried quiescing with both immediate drain and slow drain, but the results do not differ much. We have also tried adding a preStop lifecycle hook to allow connection draining to happen before quiescing starts.
But alas, users connected to the pod that is being terminated lose their sessions and state.
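For reference, the relevant fragment of the pod template we experimented with looks roughly like this; the container name, sleep duration and grace period are placeholders from our attempts, not official Pega Helm chart values.

# Fragment of the pod template used in the drain-before-quiesce attempt.
# Container name, sleep duration and grace period are placeholders, not
# official Pega Helm chart values.
spec:
  template:
    spec:
      # Allow time for the preStop sleep plus Pega quiesce/shutdown before the
      # kubelet escalates SIGTERM to SIGKILL.
      terminationGracePeriodSeconds: 300
      containers:
        - name: pega-web-tomcat
          lifecycle:
            preStop:
              exec:
                # Hold the pod open so in-flight requests can finish while the
                # endpoint removal propagates, before quiescing starts.
                command: ["sh", "-c", "sleep 60"]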
Given your expertise with Pega Cloud (which also runs on Kubernetes), we would like to ask for your advice on the configuration we can use to achieve zero-downtime restarts.
This can be Kubernetes configuration (such as the use of EndpointSlices vs. Endpoints) or Pega configuration; we are open to any and all suggestions.