Tuesday, June 16, 2015

How to Load Data From a PDF File in SSIS

This is a question someone asked today. I haven't done this myself, but I remember reading about it somewhere, so I thought I'd dig into it today.

Some forums suggest a product, Datawatch Monarch.

http://www.datawatch.com/monarch-is-back/

Out of the box, Datawatch Monarch can work with a wide range of report formats including PDF, XML, HTML, text, spool and ASCII files. Access data from invoices, sales reports, balance sheets, customer lists, inventory, logs and more. The system is easy to use, allowing you to quickly select a file and automatically convert it into structured data for analysis.

The following blog post explains how to import several file types into SQL Server.

http://sqlage.blogspot.com/2013/12/ssis-how-to-import-files-text-pdf-excel.html

The following page also has a lot of links.

http://www.forumarray.com/ssis-data-flow-source-component-to-read-a-pdf-file-226359

The following blog post is the most relevant one for me. http://sql31.blogspot.com/2013/03/how-to-load-data-from-pdf-file-in-ssis.html

It uses a Script Task, and you have the option of customizing it.
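
To give a rough idea of the kind of custom logic this involves, here is a minimal sketch of the parsing step only. It is not the code from the blog above. An SSIS Script Task would be written in C# or VB.NET; Python is used here purely for illustration. The sketch assumes the text has already been extracted from the PDF by some library, and the sample report, column layout, `parse_rows` and `rows_to_csv` are all hypothetical.

```python
import csv
import io
import re

# Hypothetical sample: text as it might look after extraction from a PDF report.
SAMPLE_TEXT = """\
Invoice Report            Page 1
InvoiceNo  Customer       Amount
1001       Acme Corp      250.00
1002       Globex Ltd     1,100.50
Total                     1,350.50
"""

# Assumed layout of a data row: numeric id, customer name, amount,
# with columns separated by two or more spaces.
ROW_PATTERN = re.compile(r"^(\d+)\s{2,}(.+?)\s{2,}([\d,]+\.\d{2})$")


def parse_rows(text):
    """Keep only the lines that look like data rows; skip titles and totals."""
    rows = []
    for line in text.splitlines():
        match = ROW_PATTERN.match(line.strip())
        if match:
            invoice_no, customer, amount = match.groups()
            # Remove thousands separators before converting the amount.
            rows.append((int(invoice_no), customer, float(amount.replace(",", ""))))
    return rows


def rows_to_csv(rows):
    """Serialize the parsed rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["InvoiceNo", "Customer", "Amount"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(parse_rows(SAMPLE_TEXT)))
```

Once the rows are in a delimited form like this, one option is to load the file with a Flat File Source in the data flow. The regular expression is the part you would customize for each report layout.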

1 comment:

  1. Another great option for extracting data from PDFs to SQL Server is ReportMiner from Astera Software. Its rule based data extraction is based on a visual drag and drop interface and can extract data from documents, including PDF, DOC, XLS, TXT, PRN, and other files. You can find more details here -- http://www.astera.com/reportminer

    Best,
    Bob Smith

    Note : I work for Astera Software
