Question


BCBS Florida
US
Last activity: 30 Jun 2025 9:51 EDT
How to retrieve specific text from a string
Below is my example string. I want to extract six separate values (StartTime, EndTime, Location, Subject, Attendees, and EmailBody) and show each one in a different field. How can we achieve this with Pega Robotics 22?
************************************************************************
1. StartTime: Thursday, March 18th, 2023, 2:00 PM EST
2. ENDTIME: Thursday, March 18th, 2023, 4:00 PM EST
3. Location: Conference Room A, 123 Main St, New York, NY 10001
4. Subject: AI Focus Group Meeting
5. Attendees:
John Doe ( Proprietary information hidden) Jane Smith ( Proprietary information hidden) David Lee ( Proprietary information hidden)
6. EmailBody: You are invited to join our AI Focus Group meeting to discuss the latest developments and advancements in Artificial Intelligence. This meeting will provide a platform for experts and enthusiasts to share their knowledge, experiences, and ideas on AI
****************************************
Accepted Solution
Updated: 30 Jun 2025 9:51 EDT


Pegasystems Inc.
US
@SrinivasR3193 Sure. The explanations are below. I use a tool called Expresso to build and test RegEx, but if you are unsure about where to start, I have found AI like Copilot, ChatGPT, or Gemini to be a good place to start. If you are using the later versions of Pega Robot Studio and have configured your platform with access to AI, you can also ask AI to generate you a C# script from within the script container.
(?s)(?<=^|\n)(\d\.\s.*?)(?=\n\d\.|\n*$)
This is the expression to match each section. Each section starts with a number and ends with either the end of the string or the start of the next section.
/// <summary>
///  Regular expression built for C# on: Mon, Jun 30, 2025, 06:24:22 AM
///  Using Expresso Version: 3.1.7917, http://www.ultrapico.com
///
///  A description of the regular expression:
///
///  Change options within the enclosing group [s]
///      Turn ON Single Line option
///  Match a prefix but exclude it from the capture. [^|\n]
///      Select from 2 alternatives
///          Beginning of line or string
///          New line
///  [1]: A numbered capture group. [\d\.\s.*?]
///      \d\.\s.*?
///          Any digit
///          Literal .
///          Whitespace
///          Any character, any number of repetitions, as few as possible
///  Match a suffix but exclude it from the capture. [\n\d\.|\n*$]
///      Select from 2 alternatives
///          \n\d\.
///              New line
///              Any digit
///              Literal .
///          \n*$
///              New line, any number of repetitions
///              End of line or string
///
///
/// </summary>
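To try the section expression outside Robot Studio, here is a minimal C# sketch that runs it over the sample string from the question with Regex.Matches. It is a standalone console program using top-level statements, not the automation from the sample project. The variable names are made up for illustration, and the email body is shortened to its first sentence.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Sample input from the question, joined with "\n" so the line endings are predictable.
string input = string.Join("\n", new[]
{
    "1. StartTime: Thursday, March 18th, 2023, 2:00 PM EST",
    "2. ENDTIME: Thursday, March 18th, 2023, 4:00 PM EST",
    "3. Location: Conference Room A, 123 Main St, New York, NY 10001",
    "4. Subject: AI Focus Group Meeting",
    "5. Attendees:",
    "John Doe ( Proprietary information hidden) Jane Smith ( Proprietary information hidden) David Lee ( Proprietary information hidden)",
    "6. EmailBody: You are invited to join our AI Focus Group meeting to discuss the latest developments and advancements in Artificial Intelligence."
});

// Section pattern from the post: a digit, a period, whitespace, then everything
// up to the next "\n<digit>." or the end of the string.
const string SectionPattern = @"(?s)(?<=^|\n)(\d\.\s.*?)(?=\n\d\.|\n*$)";

// Collect each matched section as its own string.
var sections = new List<string>();
foreach (Match m in Regex.Matches(input, SectionPattern))
{
    sections.Add(m.Value);
}

// Print every section in brackets so embedded line breaks are visible.
foreach (string section in sections)
{
    Console.WriteLine("[" + section + "]");
}
```

Two things to keep in mind: \d\. matches exactly one digit before the period, so as written the pattern only recognizes single-digit section numbers, and if the source text uses \r\n line endings, sections can end with a stray \r that needs trimming.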
(?<=^\d+\.\s*)([^:\n]+)(?=:\s*(?:\n|.))
This is the expression for the label portion of the section. Essentially, after the number and period and before the colon.
/// <summary>
///  Regular expression built for C# on: Mon, Jun 30, 2025, 06:27:22 AM
///  Using Expresso Version: 3.1.7917, http://www.ultrapico.com
///
///  A description of the regular expression:
///
///  Match a prefix but exclude it from the capture. [^\d+\.\s*]
///      ^\d+\.\s*
///          Beginning of line or string
///          Any digit, one or more repetitions
///          Literal .
///          Whitespace, any number of repetitions
///  [1]: A numbered capture group. [[^:\n]+]
///      Any character that is NOT in this class: [:\n], one or more repetitions
///  Match a suffix but exclude it from the capture. [:\s*(?:\n|.)]
///      :\s*(?:\n|.)
///          :
///          Whitespace, any number of repetitions
///          Match expression but don't capture it. [\n|.]
///              Select from 2 alternatives
///                  New line
///                  Any character
///
///
/// </summary>
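As a rough illustration of the label expression, the sketch below applies it to one section at a time. Applying it per section is my assumption, since ^ only matches at the start of the string when the Multiline option is off. GetLabel is a hypothetical helper name, not something from the sample project.

```csharp
using System;
using System.Text.RegularExpressions;

// Label pattern from the post, written as a C# verbatim string so each
// backslash appears once.
const string LabelPattern = @"(?<=^\d+\.\s*)([^:\n]+)(?=:\s*(?:\n|.))";

// Hypothetical helper: returns the label of one section, or "" if none is found.
string GetLabel(string section)
{
    Match m = Regex.Match(section, LabelPattern);
    // \s* in the lookbehind may match zero characters, so the capture can
    // start with the space that follows the period; Trim() removes it.
    return m.Success ? m.Value.Trim() : string.Empty;
}

Console.WriteLine(GetLabel("1. StartTime: Thursday, March 18th, 2023, 2:00 PM EST"));
Console.WriteLine(GetLabel("5. Attendees:\nJohn Doe ( Proprietary information hidden)"));
```

The two calls should print StartTime and Attendees. Without the Trim() call, the captured label would most likely carry a leading space.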
(?<=^\d+\.\s*[^:\n]+:\s*)([\s\S]*?)(?=(?:\n\d+\.\s*[^:\n]+:)|\n*$)
This is the expression for the remainder of the section after the label. It finds everything but the label.
/// <summary>
///  Regular expression built for C# on: Mon, Jun 30, 2025, 06:29:31 AM
///  Using Expresso Version: 3.1.7917, http://www.ultrapico.com
///
///  A description of the regular expression:
///
///  Match a prefix but exclude it from the capture. [^\d+\.\s*[^:\n]+:\s*]
///      ^\d+\.\s*[^:\n]+:\s*
///          Beginning of line or string
///          Any digit, one or more repetitions
///          Literal .
///          Whitespace, any number of repetitions
///          Any character that is NOT in this class: [:\n], one or more repetitions
///          :
///          Whitespace, any number of repetitions
///  [1]: A numbered capture group. [[\s\S]*?]
///      Any character in this class: [\s\S], any number of repetitions, as few as possible
///  Match a suffix but exclude it from the capture. [(?:\n\d+\.\s*[^:\n]+:)|\n*$]
///      Select from 2 alternatives
///          Match expression but don't capture it. [\n\d+\.\s*[^:\n]+:]
///              \n\d+\.\s*[^:\n]+:
///                  New line
///                  Any digit, one or more repetitions
///                  Literal .
///                  Whitespace, any number of repetitions
///                  Any character that is NOT in this class: [:\n], one or more repetitions
///                  :
///          \n*$
///              New line, any number of repetitions
///              End of line or string
///
///
/// </summary>
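The value expression can be exercised the same way. This is only a sketch, again assuming the pattern is applied to one section at a time. GetValue is a hypothetical helper name.

```csharp
using System;
using System.Text.RegularExpressions;

// Value pattern from the post: everything after "N. Label:" up to the next
// numbered label or the end of the string.
const string ValuePattern =
    @"(?<=^\d+\.\s*[^:\n]+:\s*)([\s\S]*?)(?=(?:\n\d+\.\s*[^:\n]+:)|\n*$)";

// Hypothetical helper: returns the text after the label of one section.
string GetValue(string section)
{
    Match m = Regex.Match(section, ValuePattern);
    // The capture can begin with the space or line break that follows the
    // colon, so the result is trimmed before it is returned.
    return m.Success ? m.Value.Trim() : string.Empty;
}

Console.WriteLine(GetValue("1. StartTime: Thursday, March 18th, 2023, 2:00 PM EST"));
Console.WriteLine(GetValue("5. Attendees:\nJohn Doe ( Proprietary information hidden)"));
```

The colon inside "2:00 PM" does not confuse the pattern, because the lookbehind only accepts a label that runs from the section number to the first colon. The Attendees value starts on the line after the label, which is why the helper trims the leading line break.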


Pegasystems Inc.
US
With a strict format like the one you provided, the best way to parse this is to use Regular Expressions (RegEx). If you are not an expert in RegEx, you can consult your favorite search engine or an AI model like Gemini or Copilot to generate them for you. You can then use the RegExMatches method to return a list of the matches for a given RegEx.
There are other ways to accomplish this as well. You could split the string into lines, check whether each line starts with specific text (using the StartsWith method from the Toolbox), and then parse the remaining text using the various methods available in the Toolbox under Data handling\Strings.
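For comparison, here is roughly what the split-and-StartsWith idea looks like in plain C#. This is a sketch using the standard string methods rather than the Toolbox components, and the variable names are invented for illustration. It assumes a line that does not start with a known label (such as the attendee list) continues the previous field.

```csharp
using System;
using System.Collections.Generic;

string input = string.Join("\n", new[]
{
    "1. StartTime: Thursday, March 18th, 2023, 2:00 PM EST",
    "2. ENDTIME: Thursday, March 18th, 2023, 4:00 PM EST",
    "3. Location: Conference Room A, 123 Main St, New York, NY 10001",
    "4. Subject: AI Focus Group Meeting",
    "5. Attendees:",
    "John Doe ( Proprietary information hidden) Jane Smith ( Proprietary information hidden) David Lee ( Proprietary information hidden)",
    "6. EmailBody: You are invited to join our AI Focus Group meeting to discuss the latest developments and advancements in Artificial Intelligence."
});

// Field names from the question; matching ignores case because the sample uses "ENDTIME".
string[] labels = { "StartTime", "EndTime", "Location", "Subject", "Attendees", "EmailBody" };
var fields = new Dictionary<string, string>();
string current = ""; // label of the field currently being filled

foreach (string rawLine in input.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None))
{
    string line = rawLine.Trim();

    // Drop a leading "N." prefix when the text before the first period is a number.
    string body = line;
    int dot = line.IndexOf('.');
    if (dot > 0 && int.TryParse(line.Substring(0, dot), out _))
    {
        body = line.Substring(dot + 1).TrimStart();
    }

    // Look for a known label followed by a colon at the start of the line.
    string matched = "";
    foreach (string label in labels)
    {
        if (body.StartsWith(label + ":", StringComparison.OrdinalIgnoreCase))
        {
            matched = label;
            break;
        }
    }

    if (matched.Length > 0)
    {
        current = matched;
        fields[current] = body.Substring(matched.Length + 1).Trim();
    }
    else if (current.Length > 0 && line.Length > 0)
    {
        // Continuation line: append it to the field that is currently open.
        fields[current] = (fields[current] + " " + line).Trim();
    }
}

foreach (KeyValuePair<string, string> pair in fields)
{
    Console.WriteLine(pair.Key + " = " + pair.Value);
}
```

Compared with the RegEx route, this is more verbose but each step is easy to follow, and the list of labels is the only thing to change if the format gains a field.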
You could also just use a C# script if you are most comfortable writing C#. The downside to that is you couldn't use breakpoints in Studio to assist with debugging your script.
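If you go the C# script route, one possible shape is a single method that returns all six values at once. The sketch below is an assumption-heavy illustration: ParseFields is a hypothetical name, and the pattern with named groups is my own, not one taken from the sample project. It is written as a standalone console program, so inside a script container the body would need to be adapted.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: maps each label to its value. Keys ignore case so that
// "ENDTIME" in the input can be read back as "EndTime".
Dictionary<string, string> ParseFields(string text)
{
    var fields = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    // Illustrative pattern: "N. Label:" at the start of a line, then the rest of
    // that line plus any following lines that do not begin a new numbered item.
    const string pattern =
        @"(?m)^\d+\.\s*(?<label>[^:\r\n]+):[ \t]*(?<value>.*(?:\r?\n(?!\d+\.\s).*)*)";

    foreach (Match m in Regex.Matches(text, pattern))
    {
        fields[m.Groups["label"].Value.Trim()] = m.Groups["value"].Value.Trim();
    }
    return fields;
}

string input = string.Join("\n", new[]
{
    "1. StartTime: Thursday, March 18th, 2023, 2:00 PM EST",
    "2. ENDTIME: Thursday, March 18th, 2023, 4:00 PM EST",
    "3. Location: Conference Room A, 123 Main St, New York, NY 10001",
    "4. Subject: AI Focus Group Meeting",
    "5. Attendees:",
    "John Doe ( Proprietary information hidden) Jane Smith ( Proprietary information hidden) David Lee ( Proprietary information hidden)",
    "6. EmailBody: You are invited to join our AI Focus Group meeting to discuss the latest developments and advancements in Artificial Intelligence."
});

Dictionary<string, string> result = ParseFields(input);
Console.WriteLine(result["StartTime"]);
Console.WriteLine(result["EndTime"]);
Console.WriteLine(result["Attendees"]);
```

Each entry in the returned dictionary could then be assigned to its own field or variable.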
Here is a sample project where I used some relatively complex RegEx to accomplish this. If you'd like an explanation for each RegEx, please let me know.


BCBS Florida
US
@ThomasSasnett Thanks Thomas for the sample code.
Can you please explain each RegEx in detail? That way, if I need to make any changes, I can make them easily.