Question
Carelon Global Solutions(Legato)
IN
Last activity: 12 Aug 2021 10:56 EDT
Conversion of Word & Excel documents to PDF from Case Attachments
I have a requirement to convert the attached documents on a case (Word and Excel) to PDF and send the same to case requestor via email notification.
Pega’s OOB PDF conversion is based on HTML Stream/markup. Since our scenario is from attachments on the case we have either pyAttachStream object or pyAttachStream converted to byte array.
Tried converting the same using custom java code but the pdf generated is not in proper format/corrupt/unable to open.
Appreciate for any generic solution you have worked/know which works for any type of document converting to PDF.
-
Like (0)
-
Share this page Facebook Twitter LinkedIn Email Copying... Copied!
Accepted Solution
Updated: 12 Aug 2021 9:52 EDT
Carelon Global Solutions(Legato)
IN
Thank you very much for your support. Able to generate doc & docx formats and merge to a single pdf using the shared code.
Differences being the attachments stored to Pega DB & Get Attach Stream and decode it for generating pfd's.
In the generated final pdf after merging all the documents, is there anyway we can differentiate someway between the appended documents?
Maantic Inc.
IN
Hi @Surya Toleti,
We had a similar requirement where we needed to merge all the "selected" documents from the documents attached in the case (there was a UI for selection), only if the documents attached to the case were of a particular file extension - which included - JPEG,JPG, GIF, PNG, TXT, DOC, DOCX and PDF.
The attached JAVA was written after importing some custom jar files for this conversion and then merging. Note the parameter PDFDocument at the end of the JAVA code which is then passed on to OOTB activity Code-Pega-PDF.AttachToWork to finally attach the merged PDF in the current case.
-
S S K Suryanarayana Toleti Adrian Zafir Satish Chagarlamudi
Carelon Global Solutions(Legato)
IN
Hi @Tanay Kumar Bal Thanks for your reply and sharing the code. Can you please elaborate on "Custom jar files for this conversion" and how they are added to your environment?
We are in 8.2.7 and on cloud. Please share your experience and thoughts on this as well. Thanks in advance.
Updated: 5 Aug 2021 12:02 EDT
Maantic Inc.
IN
@Surya Toleti We didnt merge XLS so not sure which jar you would require for that.
For DOC and DOCX we required to import following jars
- fr.opensagres.poi.xwpf.converter.core-2.0.2
- fr.opensagres.poi.xwpf.converter.pdf-2.0.2
- fr.opensagres.xdocreport.itext.extension-2.0.2
To import jars you can use normal import option in Dev Studio and create your own code set in pega.
You need to restart the cloud instance post import before using the classes and methods in your JAVA code (function)
-
S S K Suryanarayana Toleti
Carelon Global Solutions(Legato)
IN
Hi @Tanay Kumar Bal,
Thanks for your reply. We tried adding the above mentioned jars and restarted the cloud server. Still the itext classes are not compiling in Java step. Checking on it.
I see there is custom utility/function in the shared code. What should be the dependency for it? I believe it is for some format conversion and suppose it works without that scale ratio conversion as well. Please let us know your inputs on this.
//img.scalePercent(75); float scaleRatio = crm_crmutils.CalculateScaleRatio(document, img); if (scaleRatio < 1F) { img.scalePercent(scaleRatio * 100F); }"
Maantic Inc.
IN
- Which classes are not compiling?
- You can try removing the line which adjusts the scale of the image. The function requires Pega CS.
Carelon Global Solutions(Legato)
IN
Hi @Tanay Kumar Bal,
com.itextpdf.text.* is not getting compiled in our Java step. We are interested in doc/docx. Hence commented the image logic from java code.
While importing we have done all the 3 jar imports using one CodeSet and Version. Does they require separate CodeSet and Version for each i.e opensagres-core(01-01-01), opensagres-pdf(01-01-01) & opensagres-extension(01-01-01)? Please confirm.
Maantic Inc.
IN
Try importing "itextpdf-5.5.9" as well and see if it works please. Also please make sure you restart the server after import. Codeset name and version are not related to any kind of execution error. You can mention as you wish.
Carelon Global Solutions(Legato)
IN
Hi @Tanay Kumar Bal,
After importing the itext PDF jar code is fine. Now I have difficulty reading the files from cloud repository as mentioned in your code below. Any other way we can get the input stream of the documents since we are already in loop of attachments.
ParameterPage paramPage = tools.getParameterPage(); paramPage.putParamValue("repositoryName", "pegacloudfilestorage"); paramPage.putParamValue("filePath", "/attachments/" + repositoryFileName); paramPage.putParamValue("responseType", "STREAM"); pega.getDeclarativePageUtils().deleteDeclarativePageInstance("D_pxGetFile", paramPage); //fetch the file stream from pega repositoryFileName isCurrentStream = (java.io.InputStream) pega.findDataPage("D_pxGetFile", false, "repositoryName", "pegacloudfilestorage", "filePath", "/attachments/" + repositoryFileName, "responseType", "STREAM").getObject("pyStream");
Hi @Tanay Kumar Bal,
After importing the itext PDF jar code is fine. Now I have difficulty reading the files from cloud repository as mentioned in your code below. Any other way we can get the input stream of the documents since we are already in loop of attachments.
ParameterPage paramPage = tools.getParameterPage(); paramPage.putParamValue("repositoryName", "pegacloudfilestorage"); paramPage.putParamValue("filePath", "/attachments/" + repositoryFileName); paramPage.putParamValue("responseType", "STREAM"); pega.getDeclarativePageUtils().deleteDeclarativePageInstance("D_pxGetFile", paramPage); //fetch the file stream from pega repositoryFileName isCurrentStream = (java.io.InputStream) pega.findDataPage("D_pxGetFile", false, "repositoryName", "pegacloudfilestorage", "filePath", "/attachments/" + repositoryFileName, "responseType", "STREAM").getObject("pyStream");
I am getting below error and causing thread dump in logs for accessing the file from above cloud folder. Please suggest if this can achieved in any other ways.
"There was an issue creating stream for the file.: Cannot find S3 object at Container/Bucket=cuttyhunk-prod-data-bucket-us-east-1; prefix/root=null; path/key=XXX/DevTest/XXX/filestorage/attachments/Base64 To PDF as an Attachment_1_TRS-302.docx"
I tried taking the input stream as below. But due to thread dump I could not see the proper results.
isCurrentStream = (java.io.InputStream) myStepPage.getObject("pyStream");
Maantic Inc.
IN
Check to see if in your application you are using cloud file repository to store the attachments or are you storing locally in Pega DB. To verify this navigate to your application rule, Integration and security tab, Content storage.
If Pega DB then this would not work. In that case you need to fetch the pyAttachStream from Data-WorkAttach-File, decode the value and set it to the input stream. The code should look somewhat like below
String AttachStream = <open the Data-WorkAttach-File instance and set the value of pyAttachStream here>;
isCurrentStream = new java.io.ByteArrayInputStream(org.apache.commons.codec.binary.Base64.decodeBase64(AttachStream));
-
S S K Suryanarayana Toleti
Accepted Solution
Updated: 12 Aug 2021 9:52 EDT
Carelon Global Solutions(Legato)
IN
Thank you very much for your support. Able to generate doc & docx formats and merge to a single pdf using the shared code.
Differences being the attachments stored to Pega DB & Get Attach Stream and decode it for generating pfd's.
In the generated final pdf after merging all the documents, is there anyway we can differentiate someway between the appended documents?
-
Kapil Dev Gautam
Maantic Inc.
IN
Not sure if I understand your question properly. Do you want to put some additional text in between merged documents? Why do you need to differentiate?