Conversion of Word & Excel documents to PDF from Case Attachments

Question

Surya Toleti

Member since 2018

1 post

Capgemini America Inc

Posted: Aug 4, 2021

Last activity: Aug 12, 2021

Posted: 4 Aug 2021 5:46 EDT
Last activity: 12 Aug 2021 10:56 EDT

Closed

Solved

Conversion of Word & Excel documents to PDF from Case Attachments

Report

I have a requirement to convert the attached documents on a case (Word and Excel) to PDF and send the same to case requestor via email notification.

Pega’s OOB PDF conversion is based on HTML Stream/markup. Since our scenario is from attachments on the case we have either pyAttachStream object or pyAttachStream converted to byte array.

Tried converting the same using custom java code but the pdf generated is not in proper format/corrupt/unable to open.

Appreciate for any generic solution you have worked/know which works for any type of document converting to PDF.

***Edited by Moderator: Pooja Gadige to add platform capability tag***

To see attachments, please log in.

Pega Platform 8.2.7

Dev/Designer Studio

Case Management

Cross-Industry

Lead System Architect

Like (0)
Share this page Facebook Twitter LinkedIn Email Copying... Copied!

Accepted Solution

Posted: 3 years ago

Updated: 3 years ago

Posted: 12 Aug 2021 9:50 EDT
Updated: 12 Aug 2021 9:52 EDT

Surya Toleti

Capgemini America Inc

replied to Tanay Kumar Bal

Report

@Tanay Kumar Bal

Thank you very much for your support. Able to generate doc & docx formats and merge to a single pdf using the shared code.

Differences being the attachments stored to Pega DB & Get Attach Stream and decode it for generating pfd's.

In the generated final pdf after merging all the documents, is there anyway we can differentiate someway between the appended documents?

View reply inline

To see attachments, please log in.

Posted: 3 years ago

Posted: 4 Aug 2021 23:58 EDT

Tanay Kumar Bal

LTIMindtree Limited

replied to Surya Toleti

Report

Hi @Surya Toleti,

We had a similar requirement where we needed to merge all the "selected" documents from the documents attached in the case (there was a UI for selection), only if the documents attached to the case were of a particular file extension - which included - JPEG,JPG, GIF, PNG, TXT, DOC, DOCX and PDF.

The attached JAVA was written after importing some custom jar files for this conversion and then merging. Note the parameter PDFDocument at the end of the JAVA code which is then passed on to OOTB activity Code-Pega-PDF.AttachToWork to finally attach the merged PDF in the current case.

To see attachments, please log in.

Likes (3)

S S K Suryanarayana Toleti Adrian Zafir Satish Chagarlamudi

Posted: 3 years ago

Posted: 5 Aug 2021 10:47 EDT

Surya Toleti

Capgemini America Inc

replied to Tanay Kumar Bal

Report

Hi @Tanay Kumar Bal Thanks for your reply and sharing the code. Can you please elaborate on "Custom jar files for this conversion" and how they are added to your environment?

We are in 8.2.7 and on cloud. Please share your experience and thoughts on this as well. Thanks in advance.

To see attachments, please log in.

Like (0)

Posted: 3 years ago

Updated: 3 years ago

Posted: 5 Aug 2021 12:00 EDT
Updated: 5 Aug 2021 12:02 EDT

Tanay Kumar Bal

LTIMindtree Limited

replied to Surya Toleti

Report

@Surya Toleti We didnt merge XLS so not sure which jar you would require for that.

For DOC and DOCX we required to import following jars

fr.opensagres.poi.xwpf.converter.core-2.0.2
fr.opensagres.poi.xwpf.converter.pdf-2.0.2
fr.opensagres.xdocreport.itext.extension-2.0.2

To import jars you can use normal import option in Dev Studio and create your own code set in pega.

import jar

You need to restart the cloud instance post import before using the classes and methods in your JAVA code (function)

To see attachments, please log in.

Likes (1)

S S K Suryanarayana Toleti

Posted: 3 years ago

Posted: 9 Aug 2021 3:11 EDT

Surya Toleti

Capgemini America Inc

replied to Tanay Kumar Bal

Report

Hi @Tanay Kumar Bal,

Thanks for your reply. We tried adding the above mentioned jars and restarted the cloud server. Still the itext classes are not compiling in Java step. Checking on it.

I see there is custom utility/function in the shared code. What should be the dependency for it? I believe it is for some format conversion and suppose it works without that scale ratio conversion as well. Please let us know your inputs on this.

//img.scalePercent(75); float scaleRatio = crm_crmutils.CalculateScaleRatio(document, img); if (scaleRatio < 1F) { img.scalePercent(scaleRatio * 100F); }"

To see attachments, please log in.

Like (0)

Posted: 3 years ago

Posted: 9 Aug 2021 4:19 EDT

Tanay Kumar Bal

LTIMindtree Limited

replied to Surya Toleti

Report

@Surya Toleti

Which classes are not compiling?
You can try removing the line which adjusts the scale of the image. The function requires Pega CS.

To see attachments, please log in.

Like (0)

Posted: 3 years ago

Posted: 9 Aug 2021 5:38 EDT

Surya Toleti

Capgemini America Inc

replied to Tanay Kumar Bal

Report

Hi @Tanay Kumar Bal,

com.itextpdf.text.* is not getting compiled in our Java step. We are interested in doc/docx. Hence commented the image logic from java code.

While importing we have done all the 3 jar imports using one CodeSet and Version. Does they require separate CodeSet and Version for each i.e opensagres-core(01-01-01), opensagres-pdf(01-01-01) & opensagres-extension(01-01-01)? Please confirm.

To see attachments, please log in.

Like (0)

Posted: 3 years ago

Posted: 9 Aug 2021 10:57 EDT

Tanay Kumar Bal

LTIMindtree Limited

replied to Surya Toleti

Report

@Surya Toleti

Try importing "itextpdf-5.5.9" as well and see if it works please. Also please make sure you restart the server after import. Codeset name and version are not related to any kind of execution error. You can mention as you wish.

To see attachments, please log in.

Like (0)

Posted: 3 years ago

Posted: 10 Aug 2021 14:04 EDT

Surya Toleti

Capgemini America Inc

replied to Tanay Kumar Bal

Report

Hi @Tanay Kumar Bal,

After importing the itext PDF jar code is fine. Now I have difficulty reading the files from cloud repository as mentioned in your code below. Any other way we can get the input stream of the documents since we are already in loop of attachments.

ParameterPage paramPage = tools.getParameterPage(); paramPage.putParamValue("repositoryName", "pegacloudfilestorage"); paramPage.putParamValue("filePath", "/attachments/" + repositoryFileName); paramPage.putParamValue("responseType", "STREAM"); pega.getDeclarativePageUtils().deleteDeclarativePageInstance("D_pxGetFile", paramPage); //fetch the file stream from pega repositoryFileName isCurrentStream = (java.io.InputStream) pega.findDataPage("D_pxGetFile", false, "repositoryName", "pegacloudfilestorage", "filePath", "/attachments/" + repositoryFileName, "responseType", "STREAM").getObject("pyStream");

Hi @Tanay Kumar Bal,

I am getting below error and causing thread dump in logs for accessing the file from above cloud folder. Please suggest if this can achieved in any other ways.

"There was an issue creating stream for the file.: Cannot find S3 object at Container/Bucket=cuttyhunk-prod-data-bucket-us-east-1; prefix/root=null; path/key=XXX/DevTest/XXX/filestorage/attachments/Base64 To PDF as an Attachment_1_TRS-302.docx"

I tried taking the input stream as below. But due to thread dump I could not see the proper results.

isCurrentStream = (java.io.InputStream) myStepPage.getObject("pyStream");

Show Less

To see attachments, please log in.

Like (0)

Posted: 3 years ago

Posted: 11 Aug 2021 0:44 EDT

Tanay Kumar Bal

LTIMindtree Limited

replied to Surya Toleti

Report

@Surya Toleti

Check to see if in your application you are using cloud file repository to store the attachments or are you storing locally in Pega DB. To verify this navigate to your application rule, Integration and security tab, Content storage.

If Pega DB then this would not work. In that case you need to fetch the pyAttachStream from Data-WorkAttach-File, decode the value and set it to the input stream. The code should look somewhat like below

String AttachStream = <open the Data-WorkAttach-File instance and set the value of pyAttachStream here>;

isCurrentStream = new java.io.ByteArrayInputStream(org.apache.commons.codec.binary.Base64.decodeBase64(AttachStream));

To see attachments, please log in.

Likes (1)

S S K Suryanarayana Toleti

Accepted Solution

Posted: 3 years ago

Updated: 3 years ago

Posted: 12 Aug 2021 9:50 EDT
Updated: 12 Aug 2021 9:52 EDT

Surya Toleti

Capgemini America Inc

replied to Tanay Kumar Bal

Report

@Tanay Kumar Bal

Thank you very much for your support. Able to generate doc & docx formats and merge to a single pdf using the shared code.

Differences being the attachments stored to Pega DB & Get Attach Stream and decode it for generating pfd's.

In the generated final pdf after merging all the documents, is there anyway we can differentiate someway between the appended documents?

To see attachments, please log in.

Likes (1)

Kapil Dev Gautam

Posted: 3 years ago

Posted: 12 Aug 2021 10:56 EDT

Tanay Kumar Bal

LTIMindtree Limited

replied to Surya Toleti

Report

@Surya Toleti

Not sure if I understand your question properly. Do you want to put some additional text in between merged documents? Why do you need to differentiate?

To see attachments, please log in.

Like (0)

Question

Conversion of Word & Excel documents to PDF from Case Attachments

Need help or want to help others?

Experience the benefits of Support Center when you log in.

Question

Conversion of Word & Excel documents to PDF from Case Attachments

Related content:

Need help or want to help others?

Experience the benefits of Support Center when you log in.

We'd prefer it if you saw us at our best.