How to fetch specific data from a PDF using iTextSharp in OpenSpan?
How can we fetch particular data from a PDF using iTextSharp in OpenSpan?
***Edited by Moderator Marissa to update categories***
This content is closed to future replies and is no longer being maintained or updated.
Links may no longer function. If you have a similar request, please write a new post.
This question has already been asked numerous times on StackOverflow, for example:
https://stackoverflow.com/questions/15767952/how-to-detect-table-start-in-itextsharp
Ultimately, it comes down to two points:
- PDF documents contain (at minimum) only rendering instructions. A piece of text in a PDF document does not exist as such. Even text extraction itself is a non-trivial problem.
- Detecting (let alone processing) tables, lists and other content in PDF documents is very hard. It's the kind of thing thesis papers are written about.
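To illustrate why text extraction is non-trivial, here is a minimal Python sketch. It assumes a PDF parser has already produced positioned text chunks as `(x, y, text)` tuples, possibly in an order unrelated to reading order. The function name `rebuild_lines`, the `y_tolerance` parameter, and the sample data are hypothetical and not part of iText or OpenSpan; real extractors also have to handle fonts, rotation, and spacing, which this sketch ignores.

```python
from collections import defaultdict


def rebuild_lines(chunks, y_tolerance=2.0):
    """Group positioned text chunks into lines, top to bottom, left to right."""
    rows = defaultdict(list)
    for x, y, text in chunks:
        # Snap y to a bucket so chunks on nearly the same baseline share a row.
        rows[round(y / y_tolerance)].append((x, text))

    lines = []
    # PDF y coordinates grow upward, so a larger y is higher on the page.
    for key in sorted(rows, reverse=True):
        parts = [text for _, text in sorted(rows[key])]
        lines.append(" ".join(parts))
    return lines


# Hypothetical chunks, deliberately out of reading order.
chunks = [
    (300.0, 700.0, "2024-001"),
    (72.0, 680.0, "Total:"),
    (72.0, 700.0, "Invoice No:"),
    (300.0, 680.0, "150.00"),
]
lines = rebuild_lines(chunks)
```

Even this toy case needs a tolerance for "same line" and a sort by position; tables and multi-column layouts make the grouping step much harder.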
If all of your incoming PDF documents look similar, you can use iText7 and pdf2Data (an iText7 add-on that features some good algorithms for table detection, sentence and paragraph detection, and so on).
Try it out at http://pdf2data.online/
Learn more about iText at https://developers.itextpdf.com/
If you have further questions, it is advisable to post them on StackOverflow (unless you are a paying customer, in which case you can directly access our Jira board). We make a point of checking StackOverflow at least daily.
iTextSharp is an external library, so you would need to call iTextSharp's own methods to extract data from a PDF. The easiest way to do that in OpenSpan would be to use a C# script.
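Once the page text has been extracted as a plain string, fetching one particular value is usually a pattern match. The sketch below shows that second step in Python purely for illustration; in OpenSpan the same logic would live in the C# script. The function `find_field`, the "Label: value" layout, and the sample text are assumptions, not something taken from iTextSharp.

```python
import re


def find_field(text, label):
    """Return the value following 'label:' on the same line, or None if absent."""
    # re.escape keeps characters in the label from being read as regex syntax.
    pattern = re.escape(label) + r"\s*:\s*(.+)"
    match = re.search(pattern, text)
    return match.group(1).strip() if match else None


# Hypothetical text, standing in for what a PDF text extractor returned.
extracted = "Invoice No: 2024-001\nTotal: 150.00\n"

invoice_no = find_field(extracted, "Invoice No")
total = find_field(extracted, "Total")
missing = find_field(extracted, "Due Date")
```

This only works when the extracted text keeps each label and its value on the same line, which depends on how the PDF was produced.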