
C#: Extract Text from Jpeg, Png, Gif, Bitmap Images


Free Visual C# Demos Showing How to Extract Text from Common Raster Image Files




Overview



With the XImage.OCR for .NET APIs, C# users can extract text from image files such as Jpeg, Png, Bitmap, and Gif. They may also define a specific region on an image and extract text from that region only. The following list shows what you can do with XImage.OCR for .NET; a demo for each item is provided in this tutorial.


Run OCR on a Jpeg, Png, Gif, or Bitmap image file and extract all the text it contains


Specify a target zone on a loaded image and recognize its text content


Save the recognized text to a PDF or Word document




C# Project DLLs: Extract Text from Image



To run the image text extraction sample code below successfully, first complete the following setup:


Add References


  RasterEdge.XImage.OCR.dll


  RasterEdge.XImage.OCR.Tesseract.dll


  RasterEdge.Imaging.Basic.dll


  RasterEdge.Imaging.Basic.Codec.dll


  RasterEdge.Imaging.Drawing.dll


  RasterEdge.Imaging.Font.dll


  RasterEdge.Imaging.Processing.dll


  RasterEdge.XImage.AdvancedCleanup.Core.dll


Using Namespaces


  using RasterEdge.XImage.OCR;
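
A missing reference usually only shows up as a load error at run time, so it can help to check the output folder up front. The following is a minimal sketch using only the .NET base class library; the class name `ReferenceCheck` and the method `FindMissing` are hypothetical helpers, not part of XImage.OCR. The DLL names are copied from the list above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of the XImage.OCR SDK.
public static class ReferenceCheck
{
    // DLL names copied from the reference list in this tutorial.
    public static readonly string[] RequiredDlls =
    {
        "RasterEdge.XImage.OCR.dll",
        "RasterEdge.XImage.OCR.Tesseract.dll",
        "RasterEdge.Imaging.Basic.dll",
        "RasterEdge.Imaging.Basic.Codec.dll",
        "RasterEdge.Imaging.Drawing.dll",
        "RasterEdge.Imaging.Font.dll",
        "RasterEdge.Imaging.Processing.dll",
        "RasterEdge.XImage.AdvancedCleanup.Core.dll"
    };

    // Returns the names from 'required' that do not exist as files in 'directory'.
    public static List<string> FindMissing(string directory, IEnumerable<string> required)
    {
        return required
            .Where(name => !File.Exists(Path.Combine(directory, name)))
            .ToList();
    }
}
```

A typical call would be `ReferenceCheck.FindMissing(AppContext.BaseDirectory, ReferenceCheck.RequiredDlls)`, assuming the DLLs are copied to the application's output folder; an empty list means every listed file was found.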




Using C# Demo to Extract Text from Image File



The C# demo code in this section extracts all text content from a Bmp image and writes it to a text file.




// The folder that contains '.traineddata' files.
OCRHandler.SetTrainResourcePath(DefaultSourceFolder);

// Set input file path.
String inputFilePath = RootFolder + "\\" + "Test.bmp";

// Set output file path.
String outputFilePath = RootFolder + "\\" + "Output.txt";

// Recognize the whole image and write the result to the text file.
OCRHandler.Translate(inputFilePath, MIMEType.TXT, outputFilePath);
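
The demo above leaves `DefaultSourceFolder` and `RootFolder` undefined; they stand for your own folders. Before calling the OCR API it may be worth confirming that the training-data folder really holds '.traineddata' files and that the input image exists. The sketch below does this with the .NET base class library only; `OcrPaths`, `HasTrainedData`, and `DescribeProblem` are hypothetical names introduced for illustration.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of the XImage.OCR SDK.
public static class OcrPaths
{
    // True when 'folder' exists and contains at least one '.traineddata' file.
    public static bool HasTrainedData(string folder)
    {
        return Directory.Exists(folder)
            && Directory.EnumerateFiles(folder, "*.traineddata").Any();
    }

    // Returns a short description of the first problem found, or null if both paths look usable.
    public static string DescribeProblem(string trainFolder, string inputFilePath)
    {
        if (!HasTrainedData(trainFolder))
            return "No '.traineddata' files found in: " + trainFolder;
        if (!File.Exists(inputFilePath))
            return "Input image not found: " + inputFilePath;
        return null;
    }
}
```

The input path can be built with `Path.Combine(RootFolder, "Test.bmp")`, which is equivalent to the string concatenation used in the demo on Windows and avoids hard-coding the separator.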





Using C# Demo to Extract Text from Specified Zone in Image



The following Visual C# OCR demo illustrates how to extract text from a specified region of a Bmp image.




// The folder that contains '.traineddata' files.
OCRHandler.SetTrainResourcePath(DefaultSourceFolder);

// Set input file path.
String inputFilePath = RootFolder + "\\" + "Test.bmp";

// Set output file path.
String outputFilePath = RootFolder + "\\" + "Output1.txt";

// Import the image file.
OCRDocument doc = OCRHandler.Import(inputFilePath);

// Get the first page.
OCRPage page = doc.GetPage(0);

// Create a page zone starting at point (10, 10), 400 wide and 300 high.
OCRZone pageZone = page.CreateZone(new Rectangle(10, 10, 400, 300));

// Apply recognizing.
pageZone.Recognize();

// Output the result to a text file.
pageZone.SaveTo(MIMEType.TXT, outputFilePath);
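
A zone rectangle that extends past the image edge is an easy mistake when the coordinates are hard-coded. The sketch below trims a requested zone to the image bounds before it is passed on. It assumes the `Rectangle` used in the demo is `System.Drawing.Rectangle` and that the caller already knows the image width and height; `ZoneHelper` and `ClampZone` are hypothetical names, not SDK members.

```csharp
using System;
using System.Drawing;

// Hypothetical helper, not part of the XImage.OCR SDK.
public static class ZoneHelper
{
    // Returns the part of 'zone' that lies inside an image of the given size.
    // The result is Rectangle.Empty when the zone is entirely outside the image.
    public static Rectangle ClampZone(Rectangle zone, int imageWidth, int imageHeight)
    {
        Rectangle bounds = new Rectangle(0, 0, imageWidth, imageHeight);
        return Rectangle.Intersect(zone, bounds);
    }
}
```

For example, clamping the demo zone (10, 10, 400, 300) to a 200 x 150 image yields (10, 10, 190, 140). Checking `IsEmpty` on the result before creating the zone avoids asking for a region with no pixels.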





Using C# Demo to Scan Image and Output OCR Result to PDF



To save the content to PDF, the following additional libraries are required:


  RasterEdge.XDoc.PDF.dll


  RasterEdge.XImage.Raster.dll


  RasterEdge.XImage.Raster.Core.dll


You may copy the C# sample code below directly to scan your image file and save the recognized text content to a PDF document.




// The folder that contains '.traineddata' files.
OCRHandler.SetTrainResourcePath(DefaultSourceFolder);

// Set input file path.
String inputFilePath = RootFolder + "\\" + "Test.bmp";

// Set output file path.
String outputFilePath = RootFolder + "\\" + "Output.pdf";

// Recognize the whole image and write the result to the PDF file.
OCRHandler.Translate(inputFilePath, MIMEType.PDF, outputFilePath);





Using C# Demo to Scan Image and Output OCR Result to Word



Besides a PDF document, you may also output the OCR result to a Microsoft Word document.




// The folder that contains '.traineddata' files.
OCRHandler.SetTrainResourcePath(DefaultSourceFolder);

// Set input file path.
String inputFilePath = RootFolder + "\\" + "Test.bmp";

// Set output file path.
String outputFilePath = RootFolder + "\\" + "Output.docx";

// Recognize the whole image and write the result to the Word file.
OCRHandler.Translate(inputFilePath, MIMEType.DOCX, outputFilePath);
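
The text, PDF, and Word demos differ only in the output file extension and the `MIMEType` member passed to `Translate`. One possible way to keep the two in step is to derive the format name from the output path, as in the sketch below. `OutputFormat` and `FromPath` are hypothetical names; the returned strings simply mirror the `MIMEType.TXT`, `MIMEType.PDF`, and `MIMEType.DOCX` members used in the demos, and mapping them to the actual `MIMEType` value is left to the caller.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the XImage.OCR SDK.
public static class OutputFormat
{
    // Returns "TXT", "PDF", or "DOCX" for a supported output path,
    // or null when the extension matches none of the demos.
    public static string FromPath(string outputFilePath)
    {
        switch (Path.GetExtension(outputFilePath).ToLowerInvariant())
        {
            case ".txt": return "TXT";
            case ".pdf": return "PDF";
            case ".docx": return "DOCX";
            default: return null;
        }
    }
}
```

Returning null for an unknown extension lets the caller reject the path before any OCR work starts, rather than producing a file whose content does not match its name.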