Question
Rulesware
CA
Last activity: 2 Oct 2024 3:46 EDT
Replace External Kafka with Azure event hubs
We have externalized Kafka for our Pega 8.8.3. It’s working well, but we want to replace Kafka and Zookeeper VMs with Azure Event Hubs.
1. In externalizing Kafka for Pega, is it acceptable to use Azure Event Hubs as the Kafka implementation?
2. In externalizing Kafka for Pega, is it acceptable to use a Kafka cluster hosted by Azure HDInsight as the Kafka implementation?
3. If the answer to 1) or 2) is “no”, can you explain why that is the case, what specific incompatibility would arise from its use?
4. In general terms, it seems that so long as the solution exposes the appropriate Kafka broker endpoints and ports, Pega should not know or care what the underlying implementation is – is that assumption correct?
***Edited by Moderator Marije to add Case ID***
Accepted Solution
Updated: 11 Apr 2024 10:41 EDT
Pegasystems Inc.
GB
Here is the response that was provided by our Global Client Support team for the closed support ticket:
There are many different flavors of Kafka nowadays, and it is very challenging for us to validate the external Stream service against all of them, so our Engineering team has validated the main ones: Amazon MSK, Confluent Platform/Cloud, Instaclustr, Bitnami, and Apache Kafka.
That being said, we have helped another client implement EventHub integration with their external Stream service on Platform 8.8.2 (also tested on 8.7.3), and it has been working without issues since then. The only limitation is that Platform currently supports only the PLAIN and SCRAM SASL authentication mechanisms, while MS EventHub is commonly used with OAuth 2.0.
Here is some useful information that will help you achieve a successful stream integration with EventHub using PLAIN authentication (the following assumes default settings for a new EventHub namespace):
External Stream integration with MS EventHub
Every EventHub cluster created with default settings utilizes the following ports:
- 9092 - for inter-broker communication (insecure/not exposed), given that 3 AZs provide data redundancy.
- 9093 - for client connections; SSL-secured (no client-side truststore required by default) and exposed over the internet by default (see the connectivity check below).
- 443 - for API calls, Schema Registry over HTTPS, and Service Bus operations.
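Before configuring Pega, it can be worth confirming that port 9093 is reachable from the Pega nodes and presents a valid TLS certificate. A minimal check, assuming OpenSSL is available on the node and using a hypothetical namespace name:
openssl s_client -connect your-eventhub-namespace.servicebus.windows.net:9093 </dev/null
A successful handshake (certificate chain printed, no connection errors) shows the endpoint is reachable before any Kafka-level authentication is attempted.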
To establish a connection to the EventHub endpoint, a Kafka principal needs to be created, which Microsoft calls a SharedAccessKey. This SharedAccessKey needs to be created with the MANAGE permission (Microsoft calls resource permissions "claims"), which includes the LISTEN and SEND operations (a level of access similar to granting ALL operations in Kafka's ACL authorizer).
A SharedAccessKey can be created via Azure portal > EventHub > All services > your-eventhub-namespace > Settings > Shared access policies.
Once the SharedAccessKey is created, copy the Connection string–primary key, which will be used as the password for authentication when constructing the Kafka producer/consumer later. The Connection string–primary key will look like the example below:
Endpoint=sb://hidden.servicebus.windows.net/;SharedAccessKeyName=Principal;SharedAccessKey=HIDDEN=
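As an alternative to the portal, the same policy can be created and its connection string retrieved with the Azure CLI. This is a sketch with hypothetical resource group and namespace names; verify the exact options against your CLI version:
az eventhubs namespace authorization-rule create --resource-group my-rg --namespace-name your-eventhub-namespace --name Principal --rights Manage Listen Send
az eventhubs namespace authorization-rule keys list --resource-group my-rg --namespace-name your-eventhub-namespace --name Principal --query primaryConnectionString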
EventHub's authentication accepts the SASL PLAIN mechanism with SSL encryption, so when translating the client properties into PRCONFIGs, the security protocol and mechanism should be:
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
Since PLAIN is used, a username and password need to be provided for the Plain LoginModule. For EventHub, the connection string mentioned earlier contains the necessary strings for both the username and password (SharedAccessKeyName & SharedAccessKey), but EventHub expects the credentials to be passed with the literal $ConnectionString as the username, which is parsed on the server side during the authentication challenges. The SASL JAAS login client property should therefore be configured as below:
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="Endpoint=sb://hidden.servicebus.windows.net/;SharedAccessKeyName=Principal;SharedAccessKey=hidden=";
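Before wiring these values into Pega, the credentials can be sanity-checked with the stock Kafka console tools, using a client.properties file carrying the same settings (file name and values are illustrative):
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="Endpoint=sb://hidden.servicebus.windows.net/;SharedAccessKeyName=Principal;SharedAccessKey=hidden=";
Listing topics with this file should succeed if the credentials and endpoint are correct:
bin/kafka-topics.sh --bootstrap-server hidden.servicebus.windows.net:9093 --command-config client.properties --list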
Since EventHub uses 3 availability zones by default, the replication factor must be set to 3.
There are no permission prefixes for topic names (Microsoft calls topics "event hubs"), so any stream name pattern can be used with the MANAGE permission/claim, and Platform's commands to create topics should succeed.
Here are the external stream PRCONFIGs that were used for our internal test, as an example (use single quotes for the username and password):
<env name="services/stream/provider" value="ExternalKafka"/> <env name="services/stream/broker/url" value="hidden.servicebus.windows.net:9093"/> <env name="services/stream/external/replication/factor" value="3"/> <env name="services/stream/encryption/security/protocol" value="SASL_SSL"/> <env name="services/stream/name/pattern" value="pega873-{stream.name}"/> <env name="services/stream/encryption/sasl/mechanism" value="PLAIN"/> <env name="services/stream/encryption/sasl/jaas/config" value="org.apache.kafka.common.security.plain.PlainLoginModule required username='$ConnectionString' password='Endpoint=sb://hidden.servicebus.windows.net/;SharedAccessKeyName=Principal;SharedAccessKey=hidden=';" />
With the above configuration in place, all topics/event hubs should be created automatically once Platform starts. Even though the EventHub portal displays topics in lower case, they are actually created using Platform's logic (QP topics in upper case, stream data sets in camel case, and internal topics in lower case).
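To confirm what was created, the event hubs in the namespace can be listed with the Azure CLI (hypothetical resource group and namespace names):
az eventhubs eventhub list --resource-group my-rg --namespace-name your-eventhub-namespace --query "[].name" --output table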
Because Platform needs to create more than 10 topics/event hubs per Pega instance, a Premium plan is required; otherwise, the namespace is limited to only 10 topics and "PolicyViolationException" errors are expected.
If you intend to point multiple Pega instances at the same EventHub namespace, make sure their corresponding stream name patterns (PRCONFIG "services/stream/name/pattern") have unique prefixes/suffixes to avoid resource conflicts, as shown in the sketch below.
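For example, two Pega instances sharing one namespace could be kept apart with distinct prefixes (hypothetical values, one line per instance's prconfig):
<env name="services/stream/name/pattern" value="pega-dev-{stream.name}"/>
<env name="services/stream/name/pattern" value="pega-uat-{stream.name}"/>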
***********************************************************************************************
Original answer provided by GenAI below
- Yes, you can replace Kafka and Zookeeper VMs with Azure Event Hubs.
- Yes, it is acceptable to use a Kafka cluster hosted by Azure HDInsight as the Kafka implementation for Pega Platform.
- The answer to both 1) and 2) is "yes", so there's no need to explain any incompatibility.
- Your assumption is correct. As long as the solution exposes the appropriate Kafka broker endpoints and ports, Pega Platform should be able to connect and operate without knowing the specifics of the underlying implementation.
⚠ This is a GenAI-powered tool. All generated answers require validation against the provided references.
External Kafka in your deployment
Platform Support Guide > Kafka
Is Kafka essential to Pega platform?
***********************************************************************************************
Lantiqx
GB
External Kafka using Azure Event Hub is not supported. Even on Azure, you have to use an actual Kafka deployment.
https://support.pega.com/question/azure-event-hub-external-kafka
https://support.pega.com/question/it-possible-use-azure-event-hub-external-kafka
Rulesware
CA
Thanks @KrishnaChaitanyaG16918484 for your reply.
Could you please also let me know the answers to questions 3 and 4? I would appreciate it.
Best Regards
Rulesware
CA
A detailed answer is available at INC-B12925.
Accenture Brazil
BR
Our customer is planning to upgrade Pega from version 8.8.2 to 24.1. Is external Kafka using Azure Event Hub supported by Pega in Standard Support?
Thanks.
Updated: 2 Oct 2024 3:46 EDT
Pegasystems Inc.
GB
@Jonathan Pereira does the answer provided by GCS not cover your question?
This is a closed post.
I see that you already logged a new PSC question here. Please continue on that thread.