Question

How can I convert text entered in a rich text editor control to plain text?
I have a rich text editor control on a screen where the user enters comments and clicks Submit. The comments should then be saved to a DB table.
Can anybody help me with the following?
Because the control is a rich text editor, the text entered by the user contains HTML tags, but I want to save only plain text in the DB table. So, how do I convert rich text (which contains HTML tags) to plain text?
Thanks in advance.
***Updated by moderator: Lochan to close post***
This post has been archived for educational purposes. Contents and links will no longer be updated. If you have the same/similar question, please write a new post.
Accepted Solution

The link below may be useful for you. It worked for me.
https://pdn.pega.com/support-articles/rich-text-editor-adds-extra-paragraph-tags

Is there any reason to use a rich text editor if you want to save the input as plain text?
You could just use a text area for this purpose.

The reason is that the comments are sent in an email, and we also save those comments in the DB for audit purposes.

Any additional input here?
I have a client that needs to push the rich text as plain text to a document.
You can try the filterRichText method defined in the StringUtils class.
It will convert most of the markup. Some tags, such as <p>, still remain, because the method filters rich text using a predefined policy that allows tags like <p>.
Let me know if this helps.
Otherwise, I can help you define your own policy and code for filtering.
Hi,
We have the same requirement. When we use the filterRichText function, only a few tags are removed. Most tags, such as <span>, <font>, and <p>, are not removed. Could you please suggest another solution?

Hey, here is a simple solution to remove HTML tags from the text.
Write a Java step in an activity and include the code below, which replaces each HTML tag with an empty string.
Declare a local variable, for example textLocal, and put the original text in it.
textLocal=textLocal.replaceAll("\\<.*?\\>","");
Enjoy!
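
For anyone who wants to adapt the one-liner above, here is a minimal sketch of the same regex approach using only the Java standard library. The class name TagStripper and the method stripTags are hypothetical names chosen for illustration, not part of Pega. It assumes reasonably well-formed rich text editor output. It is not a real HTML parser, so a ">" inside an attribute value, or the contents of script and style blocks, will not be handled correctly.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a Pega or library class.
final class TagStripper {
    // <br> tags and closing </p> or </div> tags become line breaks,
    // so separate paragraphs do not run together in the plain text.
    private static final Pattern BLOCK_BREAK =
            Pattern.compile("(?i)<br\\s*/?>|</(p|div)\\s*>");

    // Any remaining tag. [^>] also matches newlines, so a tag that spans
    // several lines is removed too (a plain '.' would stop at a line break).
    private static final Pattern ANY_TAG = Pattern.compile("<[^>]*>");

    static String stripTags(String html) {
        if (html == null) {
            return "";
        }
        String text = BLOCK_BREAK.matcher(html).replaceAll("\n");
        text = ANY_TAG.matcher(text).replaceAll("");
        return text.trim();
    }
}
```

In a Java step this could be called as textLocal = TagStripper.stripTags(textLocal); or the two replaceAll calls could be inlined. Character entities such as &amp; are left untouched by this sketch.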

This will just remove the tags and not other HTML markup such as character entities ("&nbsp;" etc.).
I did a bit of Google searching, and there's a suggestion (I have not tried it!) that the Java jsoup library will let you convert HTML to plain text with this construct:
String plain = new HtmlToPlainText().getPlainText(Jsoup.parse(html));
Try searching the web for
jsoup clean whitelist
/Eric

Eric,
We repackage the Apache Commons Lang library and thus have access to StringEscapeUtils (Commons Lang 2.6 API).
"com.pega.apache.commons.lang.StringEscapeUtils" is our class name, and it is used in pzPageMessagesToPageList.
Excerpt from the activity Java step:
message.putString("pyDescription", com.pega.apache.commons.lang.StringEscapeUtils.unescapeHtml(pageMessage));
Reference: Pega7: Escape HTML markup in a String from RTE field
-Alexei
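
To illustrate what unescaping does, here is a minimal standard-library sketch that decodes a handful of named entities plus numeric character references. It is not the Commons Lang implementation. EntityDecoder and decode are hypothetical names, the entity table is deliberately tiny, and mapping &nbsp; to a regular space is my own choice for plain-text output (a full unescape would normally yield a non-breaking space, U+00A0).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a Pega or Commons Lang class.
final class EntityDecoder {
    // Small illustrative table; HTML defines many more named entities.
    private static final Map<String, String> NAMED = new HashMap<String, String>();
    static {
        NAMED.put("nbsp", " ");   // assumption: a regular space suits plain text
        NAMED.put("amp", "&");
        NAMED.put("lt", "<");
        NAMED.put("gt", ">");
        NAMED.put("quot", "\"");
        NAMED.put("apos", "'");
    }

    // Matches &name;, &#123; and &#x1F; style references.
    private static final Pattern ENTITY =
            Pattern.compile("&(#[0-9]+|#[xX][0-9a-fA-F]+|[a-zA-Z][a-zA-Z0-9]*);");

    static String decode(String text) {
        Matcher m = ENTITY.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String body = m.group(1);
            String replacement;
            if (body.startsWith("#x") || body.startsWith("#X")) {
                replacement = fromCodePoint(body.substring(2), 16, m.group());
            } else if (body.startsWith("#")) {
                replacement = fromCodePoint(body.substring(1), 10, m.group());
            } else {
                // Names missing from the table are left exactly as they were.
                String known = NAMED.get(body);
                replacement = (known != null) ? known : m.group();
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    private static String fromCodePoint(String digits, int radix, String fallback) {
        try {
            int cp = Integer.parseInt(digits, radix);
            return Character.isValidCodePoint(cp)
                    ? new String(Character.toChars(cp))
                    : fallback;
        } catch (NumberFormatException e) {
            return fallback; // number too large to parse; keep the original text
        }
    }
}
```

The text is decoded in a single pass, so "&amp;lt;" becomes "&lt;" rather than "<". If you combine this with tag removal, remove the tags first and decode afterwards. Otherwise a "<" that the user typed as text (stored as &lt;) would be treated as the start of a tag.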

You can configure CKEditor like this. Put this script in a non-auto-generated section and include that section below the rich text editor field.
<script>
CKEDITOR.config.font_defaultLabel = 'Arial';
CKEDITOR.config.fontSize_defaultLabel = '10px';
CKEDITOR.config.entities = true;
CKEDITOR.config.basicEntities = true;
CKEDITOR.config.entities_greek = false;
CKEDITOR.config.entities_latin = false;
CKEDITOR.config.forcePasteAsPlainText = true;
CKEDITOR.config.removeButtons = 'Copy,Cut,Paste,Undo,Redo,Print,Form,TextField,Textarea,Button,SelectAll,CreateDiv,Table,PasteText,PasteFromWord,Select,HiddenField,BGColor,TextColor,RemoveFormat,BulletedList,NumberedList,Image,Format,FontSize,Font,Italic,Bold,Underline,JustifyCenter';
CKEDITOR.on('instanceReady', function (ev) {
  ev.editor.window.$.document.body.style.fontFamily = "Arial";
  ev.editor.window.$.document.body.style.fontSize = "10px";
});
</script>