Using email bots for natural language processing and machine learning in Pega Platform 8.4
Email bots allow a user to take advantage of rich natural language processing (NLP) and machine learning (ML) capabilities to classify emails and extract important data from them. This information can then be used to decide the best course of action, whether that is routing the email to the correct team, creating a case, sending a response, or just making a suggestion to the person responding. You can do all of this without being an expert in data science or writing code.
Consider the correct type of entity
Before you start training, consider which type of entity to use. A RUTA or Keyword entity type can be much easier to set up, and more accurate, than a machine-learning-based entity. However, extracting data with machine learning is powerful, so this article focuses mostly on how you can use the machine learning capabilities to train models that identify topics and entities.
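To illustrate why a rule-based entity can be the simpler choice, here is a minimal Python sketch of keyword and pattern matching. It is not Pega or RUTA syntax; the keyword list, the policy number format, and the function name are all hypothetical and chosen only for illustration. When the values you want to extract follow a fixed list or a predictable format like this, a few rules may be enough and no training data is needed.

```python
import re

# Hypothetical keyword list and pattern, assumed only for this illustration.
ACCOUNT_TYPE_KEYWORDS = {"checking", "savings", "credit card"}
POLICY_NUMBER_PATTERN = re.compile(r"\b[A-Z]{2}-\d{6}\b")


def extract_entities(text):
    """Return keyword and pattern matches found in the email text."""
    lowered = text.lower()
    # Keyword-style entity: a value is detected if it appears in a fixed list.
    account_types = sorted(k for k in ACCOUNT_TYPE_KEYWORDS if k in lowered)
    # Pattern-style entity: a value is detected if it matches a fixed format.
    policy_numbers = POLICY_NUMBER_PATTERN.findall(text)
    return {"account_type": account_types, "policy_number": policy_numbers}


email = "Please update the address on my savings account, policy AB-123456."
result = extract_entities(email)
print(result)
```

Values that vary freely in wording, such as a description of a problem, do not fit this kind of rule well, which is where a trained model becomes more useful.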
Have an overview of the scope
It’s good to have an overview of the scope of your bot before you get started. If the scope is broad and the different outcomes are relatively distinct, then you will likely need less training data to tell them apart. However, if the bot is doing several things that are all very similar (for example, different kinds of address changes), then you will need more training data to differentiate them. If the outcomes are very similar, it may be better to handle that differentiation at the case level rather than asking the bot to determine the correct case type. For example, it might be better to have a flag on the case that differentiates between commercial and residential address changes, rather than trying to get the NLP to make that distinction.
Start training
Once you have an idea of the initial scope of your email bot, you can start training your topics and entities. One easy way to start this training is to use the Training Data tab of the Email channel. This tab shows the text of emails that were already triaged if you previously enabled the Record training data check box for your Email channel. You can also add new entries manually. It is often easiest to think about the different ways your users will initiate an action and quickly add them for each topic. For more information about training the email bot, see Training the model for the Email channel.
Now that you have data on the Training Data tab, you can select an item, choose the appropriate topic, highlight text and then right-click to add entities, and then mark the record as reviewed. Start with 20 entries for each topic and entity as a baseline. At any point you can build the model and start testing it directly from the training data screen. This allows you to quickly iterate through the process of building an intelligent bot and evaluating its accuracy.
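If you keep your own list of the training entries you have added, a quick tally can show which topics are still short of the 20-entry baseline. The following Python sketch assumes a hypothetical record layout (a dictionary with a topic and a reviewed flag) and hypothetical topic names; it does not read from Pega Platform or reflect how the Training Data tab stores its records.

```python
from collections import Counter

BASELINE = 20  # Starting point suggested above for each topic and entity.


def topics_below_baseline(records, expected_topics, baseline=BASELINE):
    """Return {topic: reviewed_count} for topics with too few reviewed entries."""
    # Only entries marked as reviewed are counted toward the baseline.
    counts = Counter(r["topic"] for r in records if r.get("reviewed"))
    return {
        topic: counts.get(topic, 0)
        for topic in expected_topics
        if counts.get(topic, 0) < baseline
    }


# Hypothetical training entries, assumed only for this illustration.
records = (
    [{"topic": "address_change", "reviewed": True}] * 22
    + [{"topic": "billing_question", "reviewed": True}] * 12
    + [{"topic": "billing_question", "reviewed": False}] * 5
)
shortfalls = topics_below_baseline(
    records, ["address_change", "billing_question", "close_account"]
)
print(shortfalls)
```

The same tally could be kept per entity. The baseline is only a starting point, so very similar topics may need more entries before the model separates them reliably.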
There are other ways to build and enhance your model by using Pega Prediction Studio™, but those are more complex and will be covered in another article.
Note: To learn about text analysis and using NLP to detect the correct information from user emails, see Understanding text analysis.