Question
Idexx
US
Last activity: 8 Oct 2024 8:13 EDT
GKE container startup using cloud-sql-proxy sidecar
We use cloud-sql-proxy as a sidecar to connect securely to a Postgres Cloud SQL instance from GKE. During pod startup, if the proxy is not yet ready and accepting connections, pega-web fails to start. The window is a matter of milliseconds.
Error from container log: SEVERE [main] com.pega.pegarules.internal.bootstrap.PRBootstrapDataSource. Unable to connect to database. java.sql.SQLException: Cannot create PoolableConnectionFactory (Connection to localhost:5432 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.)
At this point the container does not fully start and waits for the startupProbe to time out before it is restarted. On the next start the container comes up without issue, since the cloud-sql-proxy sidecar is already running and accepting connections.
This has a huge impact on scaling workloads, which takes much longer than needed.
**To Reproduce** Configure a sidecar container for Cloud SQL connections from the web tier.
    tier:
      - name: "web"
        custom:
          sidecarContainers:
            - name: cloud-sql-proxy
              image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.7.0
              args:
                - "--private-ip"
                - "--port=5432"
                - "<INSTANCE_CONNECTION_NAME>"
              securityContext:
                runAsNonRoot: true
              resources:
                requests:
                  memory: "2Gi"
                  cpu: "1"
        nodeType: "WebUser"
**Expected behavior** It would be nice if there were some retry logic in startup for the PRBootstrapDataSource connection, or if the Pega images delayed startup until the sidecar is ready.
The cloud-sql-proxy has health checks. Those could be enabled, and the Pega image could then check that the proxy is online before starting.
    until $(curl --output /dev/null --silent --head --fail http:// Proprietary information hidden:9090/liveness); do
      printf '.'
      sleep .1
    done
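A bounded version of that loop might look like the sketch below, so that a proxy that never comes up fails the container quickly instead of hanging until the startupProbe gives up. This is only an illustration of the idea, not something the Pega images do today: `wait_until` and `PROXY_HEALTH_URL` are names made up for this example, and it assumes the proxy is started with `--health-check` and that `curl` is available in the image.

```shell
#!/bin/sh
# wait_until MAX_ATTEMPTS INTERVAL_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until it succeeds.
# Returns 0 on the first success, 1 after MAX_ATTEMPTS failed tries.
wait_until() {
  max_attempts=$1
  interval=$2
  shift 2
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1   # give up; the caller decides how to fail
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  return 0
}

# Possible use in a wrapper entrypoint (assumption: PROXY_HEALTH_URL is set
# to the proxy's liveness endpoint; it is not a variable Pega or the proxy
# defines). 300 tries at 0.1 s is roughly a 30 s budget.
#
#   wait_until 300 0.1 \
#     curl --output /dev/null --silent --head --fail "$PROXY_HEALTH_URL" \
#     || { echo "cloud-sql-proxy not ready" >&2; exit 1; }
#   exec "$@"
```

Passing the check as a command keeps the helper independent of curl, so the same loop could poll something else (for example a TCP check on port 5432) if the image has no curl.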
Has anyone run into this issue with GKE? Are there other potential workarounds?
***Edited by Moderator Marije to add closed INC reference and FDBK-105941***