Question
Accenture
JP
Last activity: 21 Jun 2024 4:43 EDT
Deployment on AWS EKS with a focus on high availability
Hello,
I am in the process of setting up a Pega deployment on AWS EKS with a focus on high availability. My goal is to ensure that pods are evenly distributed across multiple AWS availability zones and spread across different hosts within those zones.
While I have found that topologySpreadConstraints in Kubernetes can be used to distribute pods at the zone level, this feature seems to be missing from the provided Pega Helm chart, particularly for clustering service and Search and Reporting Service (SRS) pods.
Is there a recommended approach or best practice within the Pega community for implementing both availability zone and host-level distribution for these services? Specifically, I'm looking for a way to include topologySpreadConstraints or an equivalent functionality in the Helm chart configuration for clustering service and SRS pods.
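For reference, the plain Kubernetes pod-spec fragment we are aiming for looks roughly like this (a sketch only; the app label is illustrative and would need to match the actual pod labels):

topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone    # spread evenly across availability zones
  whenUnsatisfiable: DoNotSchedule
  labelSelector:
    matchLabels:
      app: <pod-label>                        # illustrative selector
- maxSkew: 1
  topologyKey: kubernetes.io/hostname         # spread across hosts within each zone
  whenUnsatisfiable: ScheduleAnyway
  labelSelector:
    matchLabels:
      app: <pod-label>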
I would greatly appreciate any advice or insights on how to configure our environment to meet these high availability requirements.
Thank you for your time and assistance!
Yuju Tanaka
Eclatprime Digital Private Limited
SG
Hi Yuju,
In the Pega values.yaml there is a section around the HPA settings where you can define your node requirements and decide exactly where pods should be created inside the cluster.
Using that approach you can dedicate a few nodes to the web tier only, a few to batch, and a few to stream. This helps Pega because whenever a pod goes down, a replacement pod is created on the same set of nodes instead of on any other available node in the cluster.
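To illustrate the general idea in plain Kubernetes terms, here is a generic sketch (not the Pega chart's own syntax); the node label pega-node-role: web is hypothetical and you would have to apply it to your nodes or node groups yourself:

# Generic Kubernetes sketch, not Pega-chart syntax.
# Assumes the web-tier nodes carry the hypothetical label pega-node-role=web.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-tier                  # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-tier
  template:
    metadata:
      labels:
        app: web-tier
    spec:
      nodeSelector:
        pega-node-role: web       # pods schedule only onto nodes carrying this label
      containers:
      - name: web
        image: <your-web-image>   # placeholder image

With a nodeSelector like this, a rescheduled pod can only land on nodes carrying that label, which is the dedicated-nodes behaviour described above.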
Updated: 20 Jun 2024 21:40 EDT
Accenture
JP
Thank you for your response.
Based on our review, we have not found a method to control node placement using HPA (Horizontal Pod Autoscaler). It is likely that other features such as nodeSelector and affinity/anti-affinity rules would need to be combined for this purpose. If you could provide a specific example of how to configure node placement using HPA, we would greatly appreciate it.
Eclatprime Digital Private Limited
SG
Please go through the values.yaml file at the link below and go to the hpa section.
There you can see the topology and related settings.
https://github.com/pegasystems/pega-helm-charts/blob/master/charts/pega/values.yaml
Phalanx Consultancy Services
GB
Hey, you are right, there does not currently seem to be an option to include topologySpreadConstraints within the Helm charts.
Possibly wait for Pega to update the charts :)
or
One way to work around this is to deploy SRS using the current Helm chart, retrieve the deployment manifest, make the changes for the topology spread, and then redeploy the updated manifest. This is an easy (dirty) way to achieve HA and test, but most definitely not the right way, as the configuration drifts from the original Helm version.
The other (recommended) approach is to make the changes via Helm so that configuration revision integrity is maintained. This is a bit tricky: you have to make changes to the Helm template files in the chart and ultimately to the values.yaml.
- Download the Helm chart repo to your local working environment, navigate to "srs/templates/srsservice_deployment.yaml", and add the following block to the deployment's pod spec:
topologySpreadConstraints:
{{- range .Values.srsRuntime.topologySpreadConstraints }}
- maxSkew: {{ .maxSkew }}
  topologyKey: {{ .topologyKey }}
  whenUnsatisfiable: {{ .whenUnsatisfiable }}
  labelSelector:
    matchLabels:
      app: srs
{{- end }}
You can also use the indent technique to update the srsservice_deployment.yaml, as sketched below.
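A common Helm pattern for that indent technique is to render the whole list from values.yaml with toYaml and nindent. This is a sketch, assuming the same srsRuntime.topologySpreadConstraints value as above; adjust the nindent level to the depth at which the block sits inside the pod spec:

{{- /* Render the list defined in values.yaml; adjust nindent to the template's indentation. */}}
{{- with .Values.srsRuntime.topologySpreadConstraints }}
topologySpreadConstraints:
  {{- toYaml . | nindent 2 }}
{{- end }}

With this form, whatever is set under srsRuntime.topologySpreadConstraints in values.yaml is copied through to the rendered manifest, so the template does not need to be touched again when new fields are added.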
- Update the chart's values.yaml with details of the topology spread, for example:
srsRuntime:
  replicaCount: 2
  srsImage: <image>
  imagePullPolicy: "IfNotPresent"
  # topologySpreadConstraints:   ### like how it is mentioned in the pega/values.yaml
  # - maxSkew: <integer>
  #   topologyKey: <string>
  #   whenUnsatisfiable: <string>
  #   labelSelector: <object>
Once all the changes are done, use the locally updated Helm charts to deploy. I have not tested these changes, but hopefully this works. Please test it and let us know if it worked.
Updated: 20 Jun 2024 21:41 EDT
Accenture
JP
Thanks for your response.
The method you have presented certainly seems to work well. However, we are concerned that if we customize the helm chart provided by Pega, that environment will no longer be supported.
Phalanx Consultancy Services
GB
@YujuT16716329, sure, I understand. But currently there seems to be no option in the Helm deployment template to use node selectors, anti-affinity, etc.
So maybe you can do one thing without changing anything in the Helm charts:
- First, deploy as-is using Pega's Helm chart for SRS.
- Create separate manifests for the PDB, the HPA, and then for the deployment.
- Apply the PDB and HPA manifests, and then patch the deployment manifest onto your existing deployment.
Example:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: srs-pdb
  namespace: your-srs-namespace
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/instance: backingservices
      app.kubernetes.io/name: srs-service
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: srs-hpa
  namespace: your-srs-namespace
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: your-srs-deployment-name
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: your-srs-deployment-name
  namespace: your-srs-namespace
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app.kubernetes.io/name
                operator: In
                values:
                - srs-service
            topologyKey: "kubernetes.io/hostname"
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: eks.amazonaws.com/nodegroup
                operator: In
                values:
                - default
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app.kubernetes.io/name: srs-service
Change the topologyKey and the values for nodeSelectorTerms accordingly.