
VB.NET PDF - Extract Text from Scanned PDF Using OCR SDK for VB.NET


VB.NET Tutorial for Using OCR Library to Extract Text from Adobe PDF Document in Visual Basic Class




Overview



OCR SDK for VB.NET projects in Visual Studio .NET


Recognize text in scanned Adobe PDF documents from a Visual Basic .NET application


Run OCR on any specified area of a PDF page in .NET WinForms and ASP.NET web applications


.NET library for batch OCR of PDF text content in VB.NET


Supports .NET WinForms, ASP.NET MVC in IIS, ASP.NET Ajax, Azure cloud services, DNN (DotNetNuke), and SharePoint


Recognize a whole PDF document and get all of its text content in VB.NET


Recognize a single page of a PDF document and extract its text content in a Visual Basic .NET class


Recognize a scanned PDF file and output the OCR result to an Adobe PDF file


Recognize a scanned PDF document and output the OCR result to an MS Word file


Online VB.NET class source code for evaluation


Free VB.NET components and controls to download and use with the .NET Framework




Extract Text from Whole PDF Document in VB.NET



Right-click the project, select "Add Reference...", then locate and add the following DLLs as project references:


  RasterEdge.Imaging.Basic.dll


  RasterEdge.Imaging.Basic.Codec.dll


  RasterEdge.Imaging.Drawing.dll


  RasterEdge.Imaging.Font.dll


  RasterEdge.Imaging.Processing.dll


  RasterEdge.XDoc.Raster.dll


  RasterEdge.XDoc.Raster.Core.dll


  RasterEdge.XDoc.PDF.dll


  RasterEdge.XImage.AdvancedCleanup.Core.dll


  RasterEdge.XImage.OCR.dll


  RasterEdge.XImage.OCR.Tesseract.dll


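Import the corresponding namespaces at the top of your VB.NET class file:


  Imports System.Drawing  ' provides the Bitmap type used in the demo code


  Imports RasterEdge.Imaging.Basic


  Imports RasterEdge.XDoc.PDF


  Imports RasterEdge.XImage.OCR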


Add the following VB.NET demo code to your project. It runs OCR on the first page of the PDF and saves the recognized text to a file.




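' Folder that holds the OCR train resource files (adjust to your installation)
Dim OcrSource As String = "D:\Alice\DLL\Source\"
OCRHandler.SetTrainResourcePath(OcrSource)

' Load the scanned PDF and render its first page (index 0) to a bitmap
Dim pdf As PDFDocument = New PDFDocument("C:\input.pdf")
Dim page As BasePage = pdf.GetPage(0)
Dim bmp As Bitmap = page.ConvertToImage()

' Run OCR on the rendered page and save the recognized text as a .txt file
Dim ocrPage As OCRPage = OCRHandler.Import(bmp)
ocrPage.Recognize()
ocrPage.SaveTo(MIMEType.TXT, "C:\output.txt")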
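

The sample above handles only page index 0. To cover a whole document, the same calls could be repeated for every page. The following is a minimal sketch of that idea, not the vendor's reference implementation. It assumes that PDFDocument exposes a GetPageCount() method, which this tutorial does not show, so verify it against the SDK reference before relying on it. The module name and the per-page output file naming are hypothetical and chosen only for illustration.

```vbnet
Imports System.Drawing
Imports RasterEdge.Imaging.Basic
Imports RasterEdge.XDoc.PDF
Imports RasterEdge.XImage.OCR

Module WholePdfOcrSketch   ' hypothetical module name
    Sub Main()
        ' Same train resource folder as in the single-page sample
        OCRHandler.SetTrainResourcePath("D:\Alice\DLL\Source\")

        Dim pdf As PDFDocument = New PDFDocument("C:\input.pdf")

        ' ASSUMPTION: GetPageCount() is not shown in this tutorial;
        ' confirm the exact member name in the SDK reference.
        Dim pageCount As Integer = pdf.GetPageCount()

        For i As Integer = 0 To pageCount - 1
            Dim page As BasePage = pdf.GetPage(i)

            ' Dispose each rendered bitmap as soon as its page is processed
            Using bmp As Bitmap = page.ConvertToImage()
                Dim ocrPage As OCRPage = OCRHandler.Import(bmp)
                ocrPage.Recognize()

                ' Hypothetical naming scheme: one text file per page
                Dim outPath As String = "C:\output_page" & (i + 1).ToString() & ".txt"
                ocrPage.SaveTo(MIMEType.TXT, outPath)
            End Using
        Next
    End Sub
End Module
```

Writing one file per page keeps each SaveTo call identical to the single-page sample. If a single combined text file is needed, the per-page files can be concatenated afterwards with standard .NET file APIs.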