PDF Extractor SDK enables developers to easily extract text and images from PDFs, convert PDFs to XML or CSV for Excel, and retrieve information about PDF files through .NET and ActiveX interfaces. No additional software is necessary for its use.

With Bytescout PDF Extractor SDK, you can extract tables from PDF files to Excel (CSV) by reading cells from a given rectangle, and you can convert tables in PDFs to XML files. The SDK also reads information about a PDF file: metadata such as the title, author, and description, as well as the number of pages and whether the file is encrypted.
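To make the idea of "reading cells from a given rectangle" concrete, here is a minimal conceptual sketch in Python. It does not use the SDK or reproduce its API. The function names, the `(x, y, text)` fragment format, the `row_tolerance` parameter, and the sample data are all assumptions made for illustration. The sketch keeps only the text fragments that fall inside a rectangle, groups them into rows by vertical position, and writes the result as CSV.

```python
import csv
import io


def cells_in_rectangle(fragments, left, top, width, height, row_tolerance=2.0):
    """Return rows of cell text for fragments located inside the rectangle.

    `fragments` is a list of (x, y, text) tuples. This format is hypothetical
    and is not the SDK's data model.
    """
    right, bottom = left + width, top + height
    inside = [f for f in fragments
              if left <= f[0] <= right and top <= f[1] <= bottom]
    # Visit fragments from top to bottom so rows are built in reading order.
    inside.sort(key=lambda f: (f[1], f[0]))

    rows = []
    row_y = None
    for x, y, text in inside:
        # Start a new row when the fragment is too far below the current row.
        if row_y is None or abs(y - row_y) > row_tolerance:
            rows.append([])
            row_y = y
        rows[-1].append((x, text))
    # Within each row, order the cells from left to right.
    return [[text for _, text in sorted(row)] for row in rows]


def rows_to_csv(rows):
    """Serialize rows of cells to a CSV string."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up page content: a title, a two-row table, and a footer.
fragments = [
    (50.0, 20.0, "Invoice 1042"),
    (50.0, 100.0, "Item"), (200.0, 100.5, "Qty"),
    (50.0, 120.0, "Widget"), (200.0, 119.5, "3"),
    (50.0, 300.0, "Page 1"),
]
rows = cells_in_rectangle(fragments, left=40, top=90, width=200, height=50)
csv_text = rows_to_csv(rows)
print(csv_text, end="")
```

The tolerance-based row grouping is one simple design choice among several. A real extractor would have to handle multi-line cells, merged cells, and skewed scans, none of which this sketch attempts.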
The SDK can also extract embedded images from PDF documents in ASP.NET, VB.NET, C#, VB6, and VBScript. DocumentMerger and DocumentSplitter interfaces and classes are available for merging and splitting PDF documents.
A notable strength of PDF Extractor SDK is that it does not require Adobe Reader or any other PDF reader to be installed. It provides .NET and ActiveX interfaces and is written in 100% managed C# code. Overall, it is an excellent choice for software developers who need a robust PDF extraction tool.
Version 9.0.0.3079:
Added filtering of extracted content by font name, font size, and color.
Updated OCR engine to the latest version. Update the language files from the 'tessdata' folder.
Improved text extraction, line grouping in tabular data, performance, XFA forms extraction, and TableDetector. Fixed PDF parsing issues.
Versions 8.8.0.3015, 8.7.0.3006, 8.7.0.2980, 8.6.0.2911, 8.5.0.2855, 8.3.0.2792, 8.2.0.2697a:
Same release notes as version 9.0.0.3079.