C#: Use OCR SDK Library to Get Image and Document Text


C#.NET Online Tutorial: How to Extract Text from Tiff, Jpeg, Png, Gif, Bmp, and Scanned PDF Files




RasterEdge provides a comprehensive Optical Character Recognition (OCR) SDK that is fully developed, highly accurate, and easy to work with in C#.NET, VB.NET, ASP.NET web, and .NET WinForms development environments. This online tutorial covers the high-level OCR toolkit for C# class programming. With this C# imaging OCR SDK, you can extract text from images such as Jpeg, Png, Bmp, Gif, and Tiff, as well as from scanned PDF documents, and quickly output it to a text file, SVG image, or PDF file. If you need to add OCR to a .NET application, the RasterEdge .NET OCR SDK is a strong choice. Flexible C# options for OCR recognition, detection, and settings are provided for better performance.




Major Features



RasterEdge OCR SDK provides mature functions for recognizing characters in the image and document types supported by the RasterEdge .NET Document Imaging SDK.


Free to implement reliable and high performance Optical Character Recognition in any .NET development environment


Simple to integrate .NET Imaging OCR Software into C# and VB.NET programming applications


Support using this OCR SDK to extract image and document text content in various popular languages


Able to recognize text in images captured by a digital camera, scanned documents, or image-only PDFs using C# OCR toolkit


Support both monochrome and bitonal color image recognition for scanned documents and pictures in C#


Complete and rapid report of extracted text, including size, font, location, character attribute, etc.




Sample Code



RasterEdge.com provides free sample code for using our .NET OCR SDK. The example below uses Visual C# programming code to extract text from a Jpeg image and print it to the console. Please note that you first need to add the following four assemblies to your C#.NET project as references.


RasterEdge.Imaging.OCR.dll


RasterEdge.Imaging.OCR.Tesseract.dll


RasterEdge.Imaging.Basic.Codec.dll


RasterEdge.Imaging.Basic.dll




// Register all referenced RasterEdge DLLs
WorkRegistry.Reset();

// Set the training data path. Please put eng.traineddata (for English) under the directory you specified.
OCRHandler.SetTrainResourcePath(@"c:\source");

REImage img = new REImage(@"C:\page.jpeg");

// Resize image to improve accuracy. If the image is clear enough, skip this.
img = img.Resize(new Size((int)img.Width * 2, (int)img.Height * 2));

// Recognize characters from this image. Default language is English.
OCRPage page = OCRHandler.Import(img);
page.Recognize();

Console.WriteLine(page.GetText());
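
The sample above prints the recognized text to the console. To keep the result in a text file instead, a minimal sketch such as the following could be used. It relies only on standard .NET file APIs. The class name OcrTextOutput and the method Save are hypothetical helpers introduced here for illustration and are not part of the RasterEdge SDK.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of the RasterEdge SDK.
static class OcrTextOutput
{
    // Writes recognized text to a file as UTF-8 without a byte order mark,
    // creating the target directory first if it does not exist yet.
    public static void Save(string recognizedText, string outputPath)
    {
        if (recognizedText == null)
            throw new ArgumentNullException(nameof(recognizedText));

        // Resolve the directory part of the path so it can be created on demand.
        string directory = Path.GetDirectoryName(Path.GetFullPath(outputPath));
        if (!string.IsNullOrEmpty(directory))
        {
            Directory.CreateDirectory(directory);
        }

        File.WriteAllText(outputPath, recognizedText, new UTF8Encoding(false));
    }
}
```

Assuming page is the OCRPage from the sample above and page.Recognize() has already run, a call such as OcrTextOutput.Save(page.GetText(), @"C:\page.txt") would write the text to disk. The output path here is only an example.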





How To List



Install, Deploy and Distribute SDK

1. System requirements
2. How to install SDK into Visual Studio
3. How to deploy SDK into IIS server
4. How to distribute SDK with your Windows application


Basic SDK Concept

1. OCRHandler Class
2. OCRRecSetting Class
3. OCRDocument and OCRPage Classes
4. OCRZone Class


Supported Languages

The RasterEdge OCR module supports recognizing text in various languages, such as English, Spanish, French, German, Italian, and Russian. The full list of languages and their corresponding abbreviations is available on the supported languages page.






Extract Text from Tiff

You may start by learning how to use the RasterEdge .NET OCR SDK in your application to extract text from a Tiff image file. Free sample code is provided on this Visual C# tutorial page.


Extract File Page Content

This online C# tutorial explains how to extract the content of a page, extract the content of a specified page region, and extract text from multi-page files (like Tiff and scanned PDF).


Extract Content from Image

Free Visual C# sample code is provided. You may copy it directly into your .NET application to extract content from an image (Tiff, scanned PDF, Jpeg, Png, Bmp, ...) and output it to a text or PDF file.