Discussion

jedrb

Member since 2021

7 posts

PEGA

Posted: Dec 12, 2024

Last activity: Dec 12, 2024

Knowledge Buddy REST APIs

This article introduces the Pega Knowledge Buddy API and looks at how useful it is as a RAG (Retrieval-Augmented Generation) service. I evaluated it against the following categories:

Retrieval Quality
Context Management
Integration Capabilities

I also include the results of my tests against the Pega Knowledge Buddy API to illustrate the basics of this integration.

Pega Knowledge Buddy

Pega Knowledge Buddy is an AI-powered application that uses Natural Language Processing (NLP) and machine learning to help users quickly find accurate information. By understanding user queries more effectively, it provides more personalized responses based on the data sources that you provide.

Pega Knowledge Buddy offers concise answers to questions that would typically require browsing through multiple articles as users would in a conventional search experience. Furthermore, Knowledge Buddy responses are customized according to the audience and their corresponding access permissions.

Key benefits of Knowledge Buddy include:

Natural Language Processing (NLP) capabilities to understand and interpret user queries.
Customizable data sources for personalized and relevant responses.
Security features for access roles, data sources, and articles that are incorporated or ingested into Knowledge Buddy by using APIs.

Depending on your business environment, you can integrate Knowledge Buddy into the following applications:

Pega Knowledge
Pega CRM
Other Pega applications
Any third-party application

How does Pega Knowledge Buddy work? First, you create or identify the content that you want your buddy to use as a basis for its responses. This content then serves as a data source.

Once you have configured one or more data sources, you can create a buddy – a Large Language Model-based tool that employs RAG to answer questions using the knowledge you provided. Pega Knowledge Buddy enables you to create multiple buddies, each with a unique set of data sources, and each with a different query purpose, including sales, customer service, marketing and more.

Multiple Knowledge Buddies

Data Ingestion

Customers can input any text-based content into Knowledge Buddy using the REST API service. When used in combination with Pega Knowledge Management software, data ingestion can run automatically whenever a knowledge article is published. Once the content is uploaded, Pega Knowledge Buddy generates a set of content chunks and processes them through the Pega GenAI gateway. This step creates embeddings: numeric vector representations that capture the semantic features of each chunk. Finally, the embeddings are saved in the Pega GenAI Vector Store.
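To make the chunking step more concrete, here is a minimal sketch of fixed-size chunking with overlap. It is not Pega's implementation: the function name `chunk_text` is hypothetical, and I am assuming that chunk size and overlap are measured in characters, which the post does not confirm.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters. This is an
    illustrative sketch, not the algorithm Knowledge Buddy uses.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in the range [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The sketch rejects an overlap that equals the chunk size, because the window would never advance. Overlap is useful because a sentence that straddles a chunk boundary still appears whole in at least one chunk.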

Asking Questions

Pega Knowledge Buddy receives the question from the application through the corresponding REST API service. The question is then passed through the Pega GenAI gateway, where it is converted into embeddings. Next, Pega Knowledge Buddy runs a similarity search in the vector database to find relevant content embeddings that match the question. Once the matching content is identified, Knowledge Buddy adds it to a specially formulated prompt. This prompt, along with the user role information, is sent to the Pega GenAI gateway again. The Large Language Model within the GenAI gateway processes the prompt and generates the answer to the user's question.
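The retrieval flow described above can be sketched with a toy vector store. Everything here is an assumption for illustration: the three-dimensional vectors are made up by hand (real embeddings come from a model and have far more dimensions), and `cosine_similarity`, `top_k`, and `build_prompt` are hypothetical names, not Pega functions.

```python
import math

# Toy store of (chunk text, embedding) pairs. The vectors are invented.
TOY_STORE = [
    ("MILLION DOLLAR BABY by Tommy Richman", [0.9, 0.1, 0.0]),
    ("Not Like Us by Kendrick Lamar", [0.1, 0.9, 0.0]),
]


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, store, k=1):
    """Return the k chunk texts whose embeddings are most similar to the query."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question, chunks):
    """Place the retrieved chunks in front of the question, as a RAG prompt would."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

In the real service the query vector would come from embedding the question through the GenAI gateway; here it would be passed in by hand, for example `top_k([0.2, 0.8, 0.1], TOY_STORE)`.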

API

Knowledge Buddy provides REST APIs for various functions, including ingesting content, asking questions, deleting content, providing feedback on received responses, and performing semantic searches on content within the Knowledge Buddy.

Knowledge Buddy APIs use the OAuth 2.0 authentication model, but also support Basic authentication. If required, you can change the authentication method.
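As a rough sketch of what the two authentication options mean for a client, the helpers below build the standard HTTP `Authorization` header for each scheme. The function names are hypothetical, and obtaining the OAuth 2.0 access token itself is out of scope here because the post does not describe the token endpoint.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build an HTTP Basic header: base64 of 'username:password'."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def bearer_auth_header(access_token: str) -> dict:
    """Build a Bearer header from an OAuth 2.0 access token obtained elsewhere."""
    return {"Authorization": f"Bearer {access_token}"}
```

Either dictionary can be merged into the headers of a request to the Knowledge Buddy APIs, depending on which authentication method the service is configured to use.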

Ingestion API

The Ingestion API ingests new content or updates existing content in the Knowledge Buddy database.

The request contains the attributes of the content being ingested: metadata and configuration for the data source, the chunking configuration, and the content itself.

Below is a sample request that demonstrates the required structure. The `content` attribute contains all the information needed to answer incoming queries. The `attributes` array inside each `text` entry repeats the same information as name/value pairs, to allow filtering when searching through the knowledge base.

{
    "dataSource": "top_songs",
    "collection": "spotify",
    "objectId": "001",
    "title": "MILLION DOLLAR BABY",
    "chunkingMethod": "NONE",
    "chunkSize": 200,
    "chunkOverlap": 200,
    "roles": [
        {
            "value": "KnowledgeBuddy:Public"
        }
    ],
    "text": [
        {
            "content": "1. \"MILLION DOLLAR BABY\" by Tommy Richman (Released April 26, 2024): This is currently the #1 ranked track with an impressive track score of 725.4. It's performing exceptionally well on social media, particularly on TikTok with over 5.7 million posts and a staggering 5.3 billion views. The song has nearly 390 million Spotify streams and appears in over 30,000 playlists. Its high Spotify popularity score of 92 and significant YouTube presence (84 million views) indicate it's a major mainstream hit. It's a clean (non-explicit) track that's getting strong radio play with over 40,000 AirPlay spins.",
            "roles": [
                {
                    "value": "KnowledgeBuddy:Public"
                }
            ],
            "attributes": [
                {
                    "name": "artist",
                    "values": [
                        {
                            "value": "Tommy Richman"
                        }
                    ]
                },
                {
                    "name": "releaseDate",
                    "values": [
                        {
                            "value": "2024-04-26"
                        }
                    ]
                },
                {
                    "name": "trackScore",
                    "values": [
                        {
                            "value": "725.4"
                        }
                    ]
                },
                {
                    "name": "spotifyStreams",
                    "values": [
                        {
                            "value": "390000000"
                        }
                    ]
                },
                {
                    "name": "tiktokPosts",
                    "values": [
                        {
                            "value": "5700000"
                        }
                    ]
                },
                {
                    "name": "tiktokViews",
                    "values": [
                        {
                            "value": "5300000000"
                        }
                    ]
                },
                {
                    "name": "youtubeViews",
                    "values": [
                        {
                            "value": "84000000"
                        }
                    ]
                },
                {
                    "name": "playlistCount",
                    "values": [
                        {
                            "value": "30000"
                        }
                    ]
                },
                {
                    "name": "airplaySpins",
                    "values": [
                        {
                            "value": "40000"
                        }
                    ]
                },
                {
                    "name": "spotifyPopularityScore",
                    "values": [
                        {
                            "value": "92"
                        }
                    ]
                },
                {
                    "name": "explicit",
                    "values": [
                        {
                            "value": "false"
                        }
                    ]
                }
            ]
        }
    ]
}
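Writing the nested `attributes` structure by hand is tedious, so a small helper can generate it from a flat dictionary. This is a sketch that mirrors the sample request above. The helper name `build_ingestion_payload` and its defaults are my own assumptions, and I have not verified which fields the API treats as required.

```python
def _as_text(value) -> str:
    """Render a value as the string form used in the sample request."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def build_ingestion_payload(data_source, collection, object_id, title, content,
                            attributes, role="KnowledgeBuddy:Public",
                            chunking_method="NONE", chunk_size=200,
                            chunk_overlap=200):
    """Build a request body shaped like the sample ingestion request."""
    return {
        "dataSource": data_source,
        "collection": collection,
        "objectId": object_id,
        "title": title,
        "chunkingMethod": chunking_method,
        "chunkSize": chunk_size,
        "chunkOverlap": chunk_overlap,
        "roles": [{"value": role}],
        "text": [
            {
                "content": content,
                "roles": [{"value": role}],
                # Each flat key/value pair becomes one name/values entry.
                "attributes": [
                    {"name": name, "values": [{"value": _as_text(value)}]}
                    for name, value in attributes.items()
                ],
            }
        ],
    }
```

For example, passing `{"artist": "Tommy Richman", "explicit": False}` as `attributes` yields two entries in the same shape as the sample, with the boolean rendered as the string "false".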

Find more about the Ingestion API.

Question API

Use the Question API to ask questions to a Knowledge Buddy and receive answers.

The request contains metadata/configuration of the question. It may include question-related attributes, a flag to include search results, or filtering attributes.

Find more about the Question API.
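A client call to the Question API could be prepared roughly as follows. This is a sketch under assumptions: the field names are taken from the test requests in this post, `build_question_request` is a hypothetical helper, and the endpoint URL must be supplied by the caller because the post does not give it. The function only builds the request and does not send it.

```python
import json
import urllib.request


def build_question_request(endpoint_url, question, buddy, user_email,
                           auth_headers, include_search_results=False):
    """Prepare a POST request for the Question API without sending it."""
    payload = {
        "question": question,
        "questionSource": "chat",
        "questionTag": "user query",
        "includeSearchResults": include_search_results,
        "buddy": buddy,
        "userEmail": user_email,
    }
    headers = {"Content-Type": "application/json", **auth_headers}
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

The prepared request can then be sent with `urllib.request.urlopen(request)`, and the response body decoded as JSON.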

Retrieval Quality

I tested various queries. You can find the examples below.

Question 1 - general question:

{
    "question": "What is the #1 ranked track?",
    "questionSource": "chat",
    "questionTag": "user query",
    "includeSearchResults": false,
    "buddy": "APITesting",
    "userEmail": "[email protected]"
}

Answer to question 1:

{
    "questionID": "ASK-72038",
    "answer": "The #1 ranked track is \"MILLION DOLLAR BABY\" by Tommy Richman, released on April 26, 2024. It has an impressive track score of 725.4 and is performing exceptionally well on social media, particularly on TikTok with over 5.7 million posts and 5.3 billion views. The song has nearly 390 million Spotify streams and appears in over 30,000 playlists. It has a Spotify popularity score of 92 and significant YouTube presence with 84 million views.",
    "status": "Results found"
}

Question 2 - asking about a content attribute:

{
    "question": "What is the released date of the #2 ranked track?",
    "questionSource": "chat",
    "questionTag": "user query",
    "includeSearchResults": false,
    "buddy": "APITesting",
    "userEmail": "[email protected]"
}

Answer to question 2:

{
    "questionID": "ASK-72026",
    "answer": "The #2 ranked track, \"Not Like Us\" by Kendrick Lamar, was released on May 4, 2024.",
    "status": "Results found"
}

Question 3 - asking about a content attribute across multiple content items:

{
    "question": "What are the tracks that have Spotify popularity of 92?",
    "questionSource": "chat",
    "questionTag": "user query",
    "includeSearchResults": false,
    "buddy": "APITesting",
    "userEmail": "[email protected]"
}

Semantic search is a search method that understands the intent and contextual meaning of a query, not just matching keywords.

Question 4 - Semantic search:

{
    "question": "Who sings the second top song?",
    "questionSource": "chat",
    "questionTag": "user query",
    "includeSearchResults": false,
    "buddy": "APITesting",
    "userEmail": "[email protected]"
}

Answer to question 4:

{
    "questionID": "ASK-72040",
    "answer": "The second top song is \"Not Like Us\" by Kendrick Lamar.",
    "status": "Results found"
}

Overall, the quality and relevance of the retrieved results were acceptable. To get better results, I would probably need to experiment further with the chunking attributes of the Ingestion API.

Context management

For context management, it is possible to retrieve specific portions of a document. When the `includeSearchResults` flag is enabled in the request, the Knowledge Buddy API also returns the chunks of data that the semantic search retrieved.

Because the REST API is stateless, it does not maintain context between related requests. You can work around this by keeping the conversation state on the client side that calls the Question API.
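One simple way to handle state on the client is to keep the recent question/answer pairs and fold them into the text of the next question. The sketch below shows that idea only. The `Conversation` class is hypothetical, and I have not tested how well Knowledge Buddy handles questions that carry prior turns like this.

```python
class Conversation:
    """Keeps recent turns on the client, since the API itself is stateless."""

    def __init__(self, max_turns: int = 3):
        if max_turns < 1:
            raise ValueError("max_turns must be at least 1")
        self.max_turns = max_turns
        self.turns = []  # list of (question, answer) tuples

    def record(self, question: str, answer: str) -> None:
        """Store a completed exchange after the API has answered."""
        self.turns.append((question, answer))

    def contextualize(self, question: str) -> str:
        """Prefix the new question with the most recent turns, if any."""
        recent = self.turns[-self.max_turns:]
        if not recent:
            return question
        history = "\n".join(f"Q: {q}\nA: {a}" for q, a in recent)
        return f"Previous conversation:\n{history}\n\nCurrent question: {question}"
```

The string returned by `contextualize` would be sent as the `question` field, so a follow-up such as "When was it released?" arrives together with the exchange it refers to. Capping the history with `max_turns` keeps the request from growing without bound.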

Integration capabilities

The Pega Knowledge Buddy API can serve content that integrates with your application. The API responds in JSON format, which gives the retrieved content a clear structure and simplifies using it in your application.
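As an illustration of consuming the response, the sketch below parses a response body into a small typed object. The field names `questionID`, `answer`, and `status` come from the answers shown in this post. The names `BuddyAnswer` and `parse_answer` are hypothetical, and "Results found" is the only status value I observed, so treating anything else as "not found" is an assumption.

```python
import json
from dataclasses import dataclass


@dataclass
class BuddyAnswer:
    question_id: str
    answer: str
    status: str

    @property
    def found(self) -> bool:
        # Assumption: only this status value was seen in the tests above.
        return self.status == "Results found"


def parse_answer(body: str) -> BuddyAnswer:
    """Parse a Question API response body, failing loudly on missing fields."""
    data = json.loads(body)
    missing = [key for key in ("questionID", "answer", "status") if key not in data]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    return BuddyAnswer(
        question_id=data["questionID"],
        answer=data["answer"],
        status=data["status"],
    )
```

Validating the fields up front means a changed or failed response surfaces as a clear error rather than a `KeyError` deeper in the application.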

Conclusion

Pega Knowledge Buddy API proves to be a robust solution for implementing RAG (Retrieval-Augmented Generation) services. Its key strengths include:

Strong retrieval capabilities with semantic search functionality that understands context and intent beyond simple keyword matching.
Flexible content ingestion system that allows for detailed metadata and attribute configuration.
Well-structured JSON responses that enable easy integration with external applications.

While the API's stateless nature requires client-side context management, the overall architecture provides a solid foundation for building knowledge-based applications. The combination of OAuth 2.0 security, customizable chunking options, and comprehensive documentation makes it a viable choice for organizations looking to implement AI-powered information retrieval systems.

Pega Knowledge Buddy 24.2