
C#: Extract Text from Jpeg, Png, Gif, Bitmap Images


Free Visual C# Demos for Extracting Text from Common Raster Image Files




Overview



Using the XImage.OCR for .NET APIs, C# developers can extract text from image files such as Jpeg, Png, Bitmap, and Gif. You can also define a specific region on an image and extract text from that region only. The following list shows what you can do with XImage.OCR for .NET; a demo for each task is provided in this tutorial.




Run OCR on a Jpeg, Png, Gif, or Bitmap image file and extract all the text it contains


Specify a target zone on a loaded image and recognize the text inside it


Save the recognized text to a PDF or Word document




C# Project DLLs: Extract Text from Image



To run the following image text extraction sample code successfully, set up your project as follows:


Add References


  RasterEdge.XImage.OCR.dll


  RasterEdge.XImage.OCR.Tesseract.dll


  RasterEdge.Imaging.Basic.dll


  RasterEdge.Imaging.Basic.Codec.dll


  RasterEdge.Imaging.Drawing.dll


  RasterEdge.Imaging.Font.dll


  RasterEdge.Imaging.Processing.dll


  RasterEdge.XImage.AdvancedCleanup.Core.dll


Using Namespaces


  using RasterEdge.XImage.OCR;


Note: If you get the error "Could not load file or assembly 'RasterEdge.Imaging.Basic' or any other assembly or one of its dependencies. An attempt to load a program with an incorrect format", check your project configuration as follows:

       If you are using x64 libraries/dlls, right-click the project -> Properties -> Build -> Platform target: x64.

       If you are using x86 libraries/dlls, the platform target should be x86.
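
If you are not sure which platform your application actually runs under, a quick check like the sketch below may help. It uses only standard .NET members (`Environment.Is64BitProcess`) and is not part of XImage.OCR; the class name `PlatformCheck` is made up for this illustration.

```csharp
using System;

public static class PlatformCheck
{
    // Reports whether the current process is 64-bit or 32-bit.
    // The RasterEdge DLLs you reference should be built for the same bitness.
    public static string ProcessBitness()
    {
        return Environment.Is64BitProcess ? "64-bit" : "32-bit";
    }
}
```

Calling `Console.WriteLine(PlatformCheck.ProcessBitness());` at program start shows which set of DLLs the process expects.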




Using C# Demo to Extract Text from Image File



The C# demo code below extracts all text content from a Bmp image and saves it to a text file.




// The folder that contains '.traineddata' files.
// DefaultSourceFolder and RootFolder are placeholders for paths in your own project.
OCRHandler.SetTrainResourcePath(DefaultSourceFolder);

// Set input file path.
String inputFilePath = RootFolder + "\\" + "Test.bmp";

// Set output file path.
String outputFilePath = RootFolder + "\\" + "Output.txt";

OCRHandler.Translate(inputFilePath, MIMEType.TXT, outputFilePath);
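
Before calling the demo above, it can be useful to confirm that the trained-data folder and the input image really exist. The following is a minimal sketch using only standard .NET file APIs; it performs no OCR itself, and the names `OcrPreflight` and `TryCheck` are hypothetical helpers, not part of the SDK.

```csharp
using System;
using System.IO;
using System.Linq;

public static class OcrPreflight
{
    // Returns true when the inputs look usable. Otherwise returns false and
    // sets 'problem' to a short description of the first issue found.
    public static bool TryCheck(string trainedDataFolder, string inputFilePath, out string problem)
    {
        problem = "";

        if (!Directory.Exists(trainedDataFolder))
        {
            problem = "Trained-data folder not found: " + trainedDataFolder;
            return false;
        }

        // The folder is expected to hold at least one '.traineddata' file.
        if (!Directory.EnumerateFiles(trainedDataFolder, "*.traineddata").Any())
        {
            problem = "No '.traineddata' files in: " + trainedDataFolder;
            return false;
        }

        if (!File.Exists(inputFilePath))
        {
            problem = "Input image not found: " + inputFilePath;
            return false;
        }

        return true;
    }
}
```

If `TryCheck(DefaultSourceFolder, inputFilePath, out problem)` returns false, report `problem` and skip the OCR call.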





Using C# Demo to Extract Text from Specified Zone in Image



The following Visual C# OCR demo illustrates how to extract text from a specified region of a Bmp image.




// The folder that contains '.traineddata' files.
OCRHandler.SetTrainResourcePath(DefaultSourceFolder);

// Set input file path.
String inputFilePath = RootFolder + "\\" + "Test.bmp";

// Set output file path.
String outputFilePath = RootFolder + "\\" + "Output1.txt";

// Import the image file.
OCRDocument doc = OCRHandler.Import(inputFilePath);

// Get the first page.
OCRPage page = doc.GetPage(0);

// Create a page zone starting at point (10, 10) with width 400 and height 300.
OCRZone pageZone = page.CreateZone(new Rectangle(10, 10, 400, 300));

// Apply recognizing.
pageZone.Recognize();

// Output the result to a text file.
pageZone.SaveTo(MIMEType.TXT, outputFilePath);
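
This tutorial does not say what happens when a zone extends past the image edges, so you may want to clip the rectangle first. The sketch below assumes the `Rectangle` in the demo behaves like `System.Drawing.Rectangle` (x, y, width, height) and that you already know the image size in pixels; `ZoneHelper`, `imageWidth`, and `imageHeight` are hypothetical names, not SDK members.

```csharp
using System.Drawing;

public static class ZoneHelper
{
    // Clips a requested zone to an image of the given pixel size.
    // The result is Rectangle.Empty when the zone lies entirely outside the
    // image; callers should also check for positive Width and Height.
    public static Rectangle ClampToImage(Rectangle requested, int imageWidth, int imageHeight)
    {
        Rectangle bounds = new Rectangle(0, 0, imageWidth, imageHeight);
        return Rectangle.Intersect(requested, bounds);
    }
}
```

For a 300 x 200 pixel image, the demo's zone (10, 10, 400, 300) would be clipped to (10, 10, 290, 190).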





Using C# Demo to Scan Image and Output OCR Result to PDF



To save the content to PDF, you need to add the following extra libraries.


        RasterEdge.XDoc.PDF.dll


        RasterEdge.XImage.Raster.dll


        RasterEdge.XImage.Raster.Core.dll


You can copy the C# sample code below to scan your image file and save the recognized text content to a PDF document.




// The folder that contains '.traineddata' files.
OCRHandler.SetTrainResourcePath(DefaultSourceFolder);

// Set input file path.
String inputFilePath = RootFolder + "\\" + "Test.bmp";

// Set output file path.
String outputFilePath = RootFolder + "\\" + "Output.pdf";

OCRHandler.Translate(inputFilePath, MIMEType.PDF, outputFilePath);





Using C# Demo to Scan Image and Output OCR Result to Word



Besides PDF, you can also output the OCR result to a Microsoft Word document.




// The folder that contains '.traineddata' files.
OCRHandler.SetTrainResourcePath(DefaultSourceFolder);

// Set input file path.
String inputFilePath = RootFolder + "\\" + "Test.bmp";

// Set output file path.
String outputFilePath = RootFolder + "\\" + "Output.docx";

OCRHandler.Translate(inputFilePath, MIMEType.DOCX, outputFilePath);
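
The demos above each handle a single file. If you have a folder of images, one possible approach is to build the list of input and output paths first and then run the same `OCRHandler.Translate` call for each pair. The sketch below covers only the path planning with standard .NET APIs; `OcrBatchPlanner` and `Plan` are hypothetical names, and the extension list is an assumption based on the formats this tutorial mentions.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class OcrBatchPlanner
{
    // Extensions for the raster formats covered in this tutorial (assumed list).
    private static readonly HashSet<string> ImageExtensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { ".jpg", ".jpeg", ".png", ".gif", ".bmp" };

    // Pairs every matching image in inputFolder (Key) with an output path in
    // outputFolder (Value) that keeps the file name and uses outputExtension,
    // for example ".txt", ".pdf" or ".docx".
    public static List<KeyValuePair<string, string>> Plan(
        string inputFolder, string outputFolder, string outputExtension)
    {
        var jobs = new List<KeyValuePair<string, string>>();
        foreach (string inputPath in Directory.EnumerateFiles(inputFolder))
        {
            if (!ImageExtensions.Contains(Path.GetExtension(inputPath)))
                continue;

            string outputName = Path.ChangeExtension(
                Path.GetFileName(inputPath), outputExtension);
            jobs.Add(new KeyValuePair<string, string>(
                inputPath, Path.Combine(outputFolder, outputName)));
        }
        return jobs;
    }
}
```

Each resulting pair can then be passed to the call shown in the demos, for example `OCRHandler.Translate(job.Key, MIMEType.PDF, job.Value)`.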