
C#: Extract Text from Jpeg, Png, Gif, Bitmap Images


Free Visual C# Demos for Extracting Text from Common Raster Image Files




Overview



Using the XImage.OCR APIs, C# users can extract text from image files such as Jpeg, Png, Bitmap, and Gif. You may also define a specific region of an image and extract text from that region only. The following list shows what you can do with XImage.OCR for .NET; a demo for each item is provided in this tutorial.




Apply OCR to a Jpeg, Png, Gif, or Bitmap image file and extract all the text it contains


Use the API to specify a target zone on a loaded image and recognize its text content


Save recognized text to a PDF or Word document




C# Project DLLs: Extract Text from



To run the following sample code successfully, please do as follows:


Add References


  RasterEdge.XImage.OCR.dll


  RasterEdge.XImage.OCR.Tesseract.dll


  RasterEdge.Imaging.Basic.dll


  RasterEdge.Imaging.Basic.Codec.dll


  RasterEdge.Imaging.Drawing.dll


  RasterEdge.Imaging.Font.dll


  RasterEdge.Imaging.Processing.dll


  RasterEdge.XImage.AdvancedCleanup.Core.dll


Using Namespaces


  using RasterEdge.XImage.OCR;


Note: If you get the error "Could not load file or assembly 'RasterEdge.Imaging.Basic' or any other assembly or one of its dependencies. An attempt to load a program with an incorrect format", please check your configuration as follows:

       If you are using x64 libraries/dlls, right-click the project -> Properties -> Build -> Platform target: x64.

       If you are using x86 libraries/dlls, the platform target should be x86.
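
To confirm which case applies before changing the platform target, it can help to print the bitness of the running process. The sketch below uses only standard .NET calls (Environment.Is64BitProcess and IntPtr.Size); the class and method names are hypothetical and are not part of XImage.OCR.

```csharp
using System;

// Hypothetical helper, not part of XImage.OCR.
public static class PlatformCheck
{
    // Returns "64-bit" when the current process runs as 64-bit, otherwise "32-bit".
    public static string DescribeProcessBitness()
    {
        return Environment.Is64BitProcess ? "64-bit" : "32-bit";
    }

    // Returns the native pointer size in bytes (8 in a 64-bit process, 4 in a 32-bit one).
    public static int PointerSizeInBytes()
    {
        return IntPtr.Size;
    }
}
```

Calling Console.WriteLine(PlatformCheck.DescribeProcessBitness()) at startup shows whether the process matches the x64 or x86 DLLs you referenced.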




Using C# Demo to Extract Text from Image File



In this section, you will see a piece of C# demo code that will help you quickly extract all text content from a Bmp image.




// The folder that contains '.traineddata' files.
OCRHandler.SetTrainResourcePath(DefaultSourceFolder);

// Set input file path.
String inputFilePath = RootFolder + "\\" + "Test.bmp";

// Set output file path.
String outputFilePath = RootFolder + "\\" + "Output.txt";

OCRHandler.Translate(inputFilePath, MIMEType.TXT, outputFilePath);
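
The demo above relies on two values it does not define, DefaultSourceFolder and RootFolder, which you must set to your own folders. The following is a minimal sketch of how such paths might be built and checked with standard System.IO calls; DemoPaths, InFolder, and HasTrainedData are hypothetical helper names, not part of XImage.OCR.

```csharp
using System.IO;

// Hypothetical helpers, not part of XImage.OCR.
public static class DemoPaths
{
    // Joins a folder and a file name using the platform's directory separator.
    public static string InFolder(string rootFolder, string fileName)
    {
        return Path.Combine(rootFolder, fileName);
    }

    // Returns true when the folder exists and holds at least one '.traineddata' file.
    public static bool HasTrainedData(string folder)
    {
        return Directory.Exists(folder)
            && Directory.GetFiles(folder, "*.traineddata").Length > 0;
    }
}
```

Checking HasTrainedData on the folder you intend to pass to SetTrainResourcePath, and File.Exists on the input path, is one way to catch a wrong folder before the OCR call runs.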





Using C# Demo to Extract Text from Specified Zone in Image



The following Visual C# OCR demo illustrates how to extract text from a specified region of a Bmp image.




// The folder that contains '.traineddata' files.
OCRHandler.SetTrainResourcePath(DefaultSourceFolder);

// Set input file path.
String inputFilePath = RootFolder + "\\" + "Test.bmp";

// Set output file path.
String outputFilePath = RootFolder + "\\" + "Output1.txt";

// Import the image file.
OCRDocument doc = OCRHandler.Import(inputFilePath);

// Get the first page.
OCRPage page = doc.GetPage(0);

// Create a page zone starting at point (10, 10) with width 400 and height 300.
OCRZone pageZone = page.CreateZone(new Rectangle(10, 10, 400, 300));

// Run recognition on the zone.
pageZone.Recognize();

// Output the result to a text file.
pageZone.SaveTo(MIMEType.TXT, outputFilePath);
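
The zone in the demo above assumes the image is at least 410 x 310 pixels. How CreateZone treats a rectangle that extends past the image edge is not stated here, so one cautious option is to clamp the rectangle first. This is a minimal sketch, assuming the demo's Rectangle is System.Drawing.Rectangle; ZoneHelper and ClampToImage are hypothetical names, and the caller must supply the image width and height.

```csharp
using System.Drawing;

// Hypothetical helper, not part of XImage.OCR.
public static class ZoneHelper
{
    // Clamps a requested zone to the image bounds (0, 0, imageWidth, imageHeight).
    // Callers should check that the result has positive Width and Height
    // before using it, since a zone outside the image yields no usable area.
    public static Rectangle ClampToImage(Rectangle requested, int imageWidth, int imageHeight)
    {
        Rectangle bounds = new Rectangle(0, 0, imageWidth, imageHeight);
        return Rectangle.Intersect(requested, bounds);
    }
}
```

For example, clamping the demo's zone (10, 10, 400, 300) to a 300 x 200 image gives a zone at (10, 10) with width 290 and height 190.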





Using C# Demo to Scan Image and Output OCR Result to PDF



To save the content to PDF, you need to add the following extra libraries.


        RasterEdge.XDoc.PDF.dll


        RasterEdge.XImage.Raster.dll


        RasterEdge.XImage.Raster.Core.dll


You may directly copy the C# sample code below to scan your image file and save the recognized text content to a PDF document.




// The folder that contains '.traineddata' files.
OCRHandler.SetTrainResourcePath(DefaultSourceFolder);

// Set input file path.
String inputFilePath = RootFolder + "\\" + "Test.bmp";

// Set output file path.
String outputFilePath = RootFolder + "\\" + "Output.pdf";

OCRHandler.Translate(inputFilePath, MIMEType.PDF, outputFilePath);





Using C# Demo to Scan Image and Output OCR Result to Word



Besides PDF, you may also output the OCR result to a Microsoft Word document.




// The folder that contains '.traineddata' files.
OCRHandler.SetTrainResourcePath(DefaultSourceFolder);

// Set input file path.
String inputFilePath = RootFolder + "\\" + "Test.bmp";

// Set output file path.
String outputFilePath = RootFolder + "\\" + "Output.docx";

OCRHandler.Translate(inputFilePath, MIMEType.DOCX, outputFilePath);