Question
Sensiple Software Solutions Pvt Ltd
IN
Last activity: 10 Sep 2019 10:57 EDT
Data capture
Hi,
I am trying to capture the value of the 'Balance Due' field in this PDF. Can someone suggest a way to get the amount from the Balance Due field? I have attached my PDF for reference.
Thanks.
Pegasystems Inc.
US
Here you go. The key to working with PDFs is to look at the Developer Tools available in the PDFViewer. You can highlight the various parts you can extract and then determine how to navigate the document from there.
I hard-coded this to use the attached PDF, but you could replace that with a selection if you like.
Accenture
IN
Hi Thomos,
What is the use of the PDFViewer in the Windows form?
Pegasystems Inc.
US
It serves two purposes:
1. It allows you to see the PDF to validate what you are pulling.
2. It allows you to enable the developer options and highlight the words, segments, and lines. This helps with development, because you can see how the viewer parses the file.
Sensiple Software Solutions Pvt Ltd
IN
Hi,
Thanks for the response! I looked at the attachment and it is working great.
I couldn't find System.IO.Path#GetDirectoryName in my Robotics Studio.
Also, can you please explain the PDF_P_LoadFiles.os automation flow, and the use of the RuntimeHost and System.IO.Path blocks in that flow?
Is it possible to capture the same fields through OCR techniques?
Pegasystems Inc.
US
1. To add any static .NET method (like System.IO.Path.GetDirectoryName), right-click on an area of the Toolbox and select Choose Items. Then select "Pega Robotics Static Members". Next, select "From Global Assembly Cache". Finally, select the assembly that contains the method you wish to use. Most of the interesting ones are in either mscorlib or System; in this case, select mscorlib (it is near the bottom of the "m" entries, so if you click the letter "n" and scroll up, you can find it quicker) and locate the Directory node. Simply check the box next to whichever method you like.
2. I wanted to attach the PDF to the example for ease of distribution. In practice, you'd load the PDF from another path. The logic there is just used to locate the file on-disk in the extract directory of the solution, since I have attached the file to the deployment.
3. OCR simply turns whatever file you have into a PDF to use the PDF Connector component on it, so not directly. I guess you could get the entire text of the file and parse it yourself though.
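To illustrate points 2 and 3 outside of Robotics Studio, here is a minimal Python sketch. It is not the automation from the attached solution. The function names, the example path, and the sample text are all hypothetical, and the sketch assumes the amount follows the "Balance Due" label on the same line. The first helper mirrors the idea of taking the directory of a known file and locating the PDF next to it; the second parses the amount out of text that has already been extracted from the PDF.

```python
import re
from decimal import Decimal
from pathlib import PureWindowsPath
from typing import Optional


def pdf_next_to(known_file: str, pdf_name: str) -> str:
    """Return the path of pdf_name in the same directory as known_file.

    Rough analogue of calling System.IO.Path.GetDirectoryName and then
    joining the PDF file name onto the result (hypothetical helper).
    """
    return str(PureWindowsPath(known_file).parent / pdf_name)


# Label, optional colon, optional currency symbol, then the amount.
# Assumption: the amount uses "," for thousands and "." for decimals.
_BALANCE_DUE = re.compile(
    r"Balance\s+Due\s*:?\s*[$€£]?\s*(\d[\d,]*(?:\.\d+)?)",
    re.IGNORECASE,
)


def extract_balance_due(text: str) -> Optional[Decimal]:
    """Return the amount after 'Balance Due', or None if the label is absent."""
    match = _BALANCE_DUE.search(text)
    if match is None:
        return None
    # Drop thousands separators before converting to a Decimal.
    return Decimal(match.group(1).replace(",", ""))
```

Calling `extract_balance_due` on the full text of the document returns a `Decimal` when the label is found and `None` otherwise, so the caller can decide how to handle a PDF with a different layout.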
Sensiple Software Solutions Pvt Ltd
IN
Hi,
I see you have uploaded the sample PDF in the solution. I could neither attach my sample PDF to the solution nor locate my PDF folder through automation.
Can you tell me how to open PDF files located in my own folder and load them in the PDF viewer in Robotics Studio? I tried using the Path.GetDirectoryName method, but an error occurs.
Thanks.
Accenture
IN
Hi,
I implemented the same approach in the automation below and was able to extract a particular line from a PDF on my local drive.
I guess you missed out adding the pdf source to the pdfviewer component.