
I used the iTextSharp library to convert the PDF files to text and went from there. Once you have the file in text format, you can work out how to parse it, since the text for a given kind of document has the same structure every time. In my case, the documents differ by vendor, hence a different parser for each vendor. That's the gist of it.
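
To make the "one parser per vendor" idea concrete, here is a rough sketch in Python of how that dispatch might look once the PDF has already been turned into plain text. It is not my actual code: the vendor names (acme, globex), the field labels, and the helper names are all made up for illustration, and the text extraction step itself is left out.

```python
import re
from typing import Callable, Dict, Optional


def _find(pattern: str, text: str) -> Optional[str]:
    """Return the first capture group of pattern in text, or None if absent."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


def _to_amount(raw: Optional[str]) -> Optional[float]:
    """Convert a string like '1,234.50' to a float; pass None through."""
    return float(raw.replace(",", "")) if raw is not None else None


# Hypothetical layout: "Invoice No: A-17" and "Total: $1,234.50"
def parse_acme(text: str) -> dict:
    return {
        "invoice": _find(r"Invoice No:\s*(\S+)", text),
        "total": _to_amount(_find(r"Total:\s*\$?([\d,]+\.\d{2})", text)),
    }


# Hypothetical layout: "INV# G-9" and "AMOUNT DUE   87.00"
def parse_globex(text: str) -> dict:
    return {
        "invoice": _find(r"INV#\s*(\S+)", text),
        "total": _to_amount(_find(r"AMOUNT DUE\s+([\d,]+\.\d{2})", text)),
    }


# One entry per vendor; adding a vendor means writing one more function.
PARSERS: Dict[str, Callable[[str], dict]] = {
    "acme": parse_acme,
    "globex": parse_globex,
}


def parse_document(vendor: str, text: str) -> dict:
    """Pick the parser registered for this vendor and run it on the text."""
    try:
        parser = PARSERS[vendor]
    except KeyError:
        raise ValueError(f"no parser registered for vendor {vendor!r}")
    return parser(text)
```

Calling parse_document("acme", extracted_text) then returns a small dict of fields. The point of the lookup table is that each vendor's quirks stay inside its own function, so a layout change from one vendor does not touch the others.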

