Read PDF Data
I have the following scenario.
1. During case processing, Pega sends an email with an editable PDF attached for users to fill out (the users are external, not within Pega).
2. The user opens the PDF attachment, fills in the form, and replies to the email with the filled-in PDF attached.
3. Pega has an email listener configured to read incoming emails. Pega should read the incoming email and its PDF, then update the case properties with the values the user entered in the PDF.
To test this, I configured an email listener and sent an email with a filled-in PDF attached (please see the attachment FillablePDF1).
I received the PDF as a Base64 string in pyAttachmentPage. I wrote some Java code, which I found on this forum, to read the PDF data.
org.apache.pdfbox.pdmodel.PDDocument doc = null;
org.apache.pdfbox.text.PDFTextStripper pdfStripper;
java.io.InputStream is = new java.io.ByteArrayInputStream( Base64Util.decodeToByteArray( EncodedPDF ) );
try {
    doc = org.apache.pdfbox.pdmodel.PDDocument.load( is );
    if (doc.isEncrypted()) {
        oLog.infoForced("Document is encrypted: trying to decrypt with blank password");
        try {
            // doc.decrypt("");
            doc.setAllSecurityToBeRemoved(true);
        } catch(Exception e) {
            throw new PRRuntimeException(e);
        }
    }
    pdfStripper = new org.apache.pdfbox.text.PDFTextStripper();
    ExtractedText = pdfStripper.getText(doc);
    oLog.infoForced("---Extracted Text---" + "\n" + ExtractedText);
    doc.close();
} catch(Exception e) {
    throw new PRRuntimeException(e);
}
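As an aside on the decoding step: to rule out a decoding problem, the Base64 string can be checked in isolation before it is handed to PDFBox. The following is only a minimal sketch in plain Java, with no Pega or PDFBox classes. The class name PdfAttachmentCheck and its methods are hypothetical, and it assumes the attachment arrives as standard or MIME-style Base64 and that a well-formed PDF begins with the "%PDF-" marker.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of Pega or PDFBox.
final class PdfAttachmentCheck {

    // A well-formed PDF file is expected to begin with this ASCII marker.
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    private PdfAttachmentCheck() {
    }

    // Decodes the Base64 string. The MIME decoder ignores the line breaks
    // that email encoders commonly insert into attachment bodies.
    static byte[] decode(String encodedPdf) {
        return Base64.getMimeDecoder().decode(encodedPdf);
    }

    // Returns true if the decoded bytes start with the "%PDF-" marker.
    static boolean looksLikePdf(byte[] data) {
        if (data == null || data.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (data[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```

If looksLikePdf returns true for the decoded attachment, the decoding step is probably fine and any missing data comes from how the PDF content is read afterwards.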
But apparently the PDFTextStripper code only reads the static text from the PDF (I printed it in the Pega logs). I don't see any of the filled-in data, such as name or address, in the logs. Here's what I see in the log: only static text, no input data.
PDF Form Example This is an example of a user fillable PDF form. Normally PDF is used as a final publishing format. However PDF has an option to be used as an entry form that can be edited and saved by the user. The fields of this form have been selected to demonstrate as many as possible of the common entry fields. This document and PDF form have been created with OpenOffice (version 3.4.0). To fill out the form, make sure the PDF file is not read-only. If the file is read-only save it first to a folder or computer desktop. Close this file and open the saved file. Please fill out the following fields. Important fields are marked yellow. Given Name: Family Name: Address 1: House nr: Address 2: Postcode: City: Country: Gender: Height (cm): Driving License: I speak and understand (tick all that apply):
Any ideas on whether this is doable, or whether I am missing something?