C# PDF Image Reader Library
How to read and extract images from an existing PDF file using C# .NET
A .NET library that supports PDF image extraction from a page, a region on a page, or a whole PDF document in C#
In this tutorial, you will learn how to read and extract images from a PDF file using C# without Acrobat installed.
How to read and extract PDF file images using C#
- Best C#.NET library for extracting images from Adobe PDF pages in Visual Studio .NET Framework projects
- Provides trial SDK components for quick integration of PDF image extraction into Visual C#.NET WinForms and ASP.NET projects
- Free C# source code for extracting an image from a specified position on a PDF page in a .NET class
- Free PDF text and image how-tos for C#: remove text from pdf c#, pdf replace text c#, c# pdf insert image, c# add text to pdf, c# extract text from pdf, c# remove images from pdf.
- Supports .NET Core, ASP.NET Core MVC, .NET WinForms, ASP.NET MVC in IIS, ASP.NET Ajax, Azure cloud service, DNN (DotNetNuke), and SharePoint
- Extracts various types of images from a PDF file, such as XObject Image, XObject Form, and Inline Image
- Gets JPEG (JPG) and other high-quality image files from a PDF document
- Extracts vector images from PDF in a .NET console application
- Extracts all images from a whole PDF or from a specified PDF page
- Captures images from a whole PDF based on special characteristics
- Scans images to PDF, TIFF, and various image formats
- Gets image information, such as location, zonal information, and metadata
About class PDFImage
Using the XDoc.PDF for .NET SDK, you can read and extract images from a PDF document, a page, or a region of a page. The extracted images are stored in PDFImage objects.
Class PDFImage includes the following properties and methods:
- Position: Position of the image on the page. Unit: pixel (at 96 dpi)
- IsRotated: Indicates whether the image is rotated.
- Image: Gets the embedded image resource related to this object.
- IsInlineImage: Indicates whether the resource is an Inline Image.
- IsForm: Indicates whether the resource is an XObject Form. Returns false if the resource is an XObject Image or Inline Image.
- IsXObjectForm: Same as IsForm.
- IsRGB: Indicates whether the image is an XObject Image with ColorSpace DeviceRGB.
- IsCMYK: Indicates whether the image is an XObject Image with ColorSpace DeviceCMYK.
- IsGray: Indicates whether the image is an XObject Image with ColorSpace DeviceGray.
- IsIndexed: Indicates whether the image is an XObject Image with ColorSpace Indexed.
- IsCIEBased: Indicates whether the image is an XObject Image with a CIE-based ColorSpace.
- RectangleF GetBoundary(): Gets the boundary of the item in the Device Space (Windows-like coordinate system). Unit: pixel (at 96 dpi)
- GetBitmap(): Gets the appearance of the page item in the Device Space (Windows-like coordinate system).
- GetColorSpaceName(): Gets the Color Space type of the image. Only valid for PDFImageType.XObjImage.
Extract images from a PDF document using C#.NET
Using the XDoc.PDF C# library, you can read and extract images from a whole PDF file, from individual PDF pages, or from a specified location or area on a PDF page.
C# extract images from whole PDF document
#region extract images from whole pdf document
internal static void extractImagesFromPdfFile()
{
    String inputFilePath = @"C:\demo.pdf";
    // Open a document.
    PDFDocument doc = new PDFDocument(inputFilePath);
    // Extract all images in the document.
    List<PDFImage> allImages = PDFImageHandler.ExtractImages(doc);
}
#endregion
C# extract images from specified PDF page
#region extract images from one pdf page
internal static void extractImagesFromPdfPage()
{
    String inputFilePath = @"C:\demo.pdf";
    // Open a document.
    PDFDocument doc = new PDFDocument(inputFilePath);
    // Get the first page (page indexes are zero-based).
    PDFPage page = (PDFPage)doc.GetPage(0);
    // Extract all images on one pdf page.
    List<PDFImage> allImages = PDFImageHandler.ExtractImages(page);
}
#endregion
C# read the image at a specified position (coordinates) inside a PDF document
The XDoc.PDF SDK uses a Windows-style coordinate system.
Points on the page are described by x- and y-coordinate pairs.
The x-coordinates increase to the right; the y-coordinates increase from top to bottom. The origin (0,0) is the top-left corner of the PDF page.
#region read the image from specified position (coordinates) inside pdf document
internal static void extractImagesFromSpecifiedPosition()
{
    String inputFilePath = @"C:\demo.pdf";
    // Open a document.
    PDFDocument doc = new PDFDocument(inputFilePath);
    // Get the page at index 3 (the fourth page; page indexes are zero-based).
    PDFPage page = (PDFPage)doc.GetPage(3);
    // Select image by the point (50F, 100F).
    PDFImage img = PDFImageHandler.SelectImage(page, new PointF(50F, 100F));
    // ...
}
#endregion
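Coordinates taken from other PDF tools are often expressed in PDF default user space (points of 1/72 inch, origin at the bottom-left, y increasing upward), so they may need converting before being passed as a top-left-origin pixel position at 96 dpi. The following is a minimal sketch of that conversion, not part of the XDoc.PDF API: the class PdfCoordinateSketch and its methods are hypothetical names, and it assumes an unrotated page whose visible area starts at user-space (0,0).

```csharp
using System;
using System.Drawing;

// Hypothetical helper (not an XDoc.PDF class): converts between PDF default
// user space (points, bottom-left origin, y up) and a top-left-origin pixel
// space at 96 dpi. Assumes an unrotated page with its origin at (0,0).
public static class PdfCoordinateSketch
{
    public const float PointsPerInch = 72f;
    public const float PixelsPerInch = 96f;

    public static float PointsToPixels(float points)
    {
        return points * PixelsPerInch / PointsPerInch;
    }

    public static float PixelsToPoints(float pixels)
    {
        return pixels * PointsPerInch / PixelsPerInch;
    }

    // pdfPoint is in points; pageHeightPoints is the page height in points.
    public static PointF UserSpaceToDevice(PointF pdfPoint, float pageHeightPoints)
    {
        float x = PointsToPixels(pdfPoint.X);
        // Flip the y axis: distance from the top edge instead of the bottom edge.
        float y = PointsToPixels(pageHeightPoints - pdfPoint.Y);
        return new PointF(x, y);
    }

    // devicePoint is in pixels at 96 dpi, measured from the top-left corner.
    public static PointF DeviceToUserSpace(PointF devicePoint, float pageHeightPoints)
    {
        float x = PixelsToPoints(devicePoint.X);
        float y = pageHeightPoints - PixelsToPoints(devicePoint.Y);
        return new PointF(x, y);
    }
}
```

For example, on a US Letter page (792 points high), the user-space point (72, 720) lies one inch from the left edge and one inch below the top edge, which corresponds to the pixel position (96, 96).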
Read a list of images from a specified area on a PDF page
For the API: public RectangleF(float x, float y, float width, float height)
(x, y) is the coordinate of the rectangle's top-left corner; width and height are the rectangle's width and height.
// open a document
String inputFilePath = Program.RootPath + "\\" + "3.pdf";
PDFDocument doc = new PDFDocument(inputFilePath);
// get the first page
int pageIndex = 0;
PDFPage page = (PDFPage)doc.GetPage(pageIndex);
// define the region (Rectangle [50F, 50F, 300F, 400F]) of the page
RectangleF region = new RectangleF(50F, 50F, 300F, 400F);
// get all images in the region in sequence (from bottom to top)
List<PDFImage> images = PDFImageHandler.SelectImages(page, region);
// select the top image in the region
PDFImage image1 = PDFImageHandler.SelectImage(page, region);
// select the bottom image in the region
int sequenceIndex = 0;
PDFImage image2 = PDFImageHandler.SelectImage(page, region, sequenceIndex);
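To make the RectangleF semantics concrete, here is a small self-contained sketch that uses only System.Drawing.RectangleF. It does not call XDoc.PDF, and the source does not say whether SelectImages matches images that merely overlap the region or only those fully inside it, so both rules are shown. RegionSketch and its method names are hypothetical; the boundary rectangles stand in for values such as those returned by GetBoundary().

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;

// Hypothetical helper (not an XDoc.PDF class) illustrating two possible
// region-matching rules for image boundaries given in top-left-origin pixels.
public static class RegionSketch
{
    // Boundaries that share any interior area with the region.
    public static List<RectangleF> Overlapping(IEnumerable<RectangleF> boundaries, RectangleF region)
    {
        var hits = new List<RectangleF>();
        foreach (RectangleF boundary in boundaries)
        {
            if (region.IntersectsWith(boundary))
            {
                hits.Add(boundary);
            }
        }
        return hits;
    }

    // Boundaries that lie entirely within the region.
    public static List<RectangleF> FullyInside(IEnumerable<RectangleF> boundaries, RectangleF region)
    {
        var hits = new List<RectangleF>();
        foreach (RectangleF boundary in boundaries)
        {
            if (region.Contains(boundary))
            {
                hits.Add(boundary);
            }
        }
        return hits;
    }
}
```

With the region new RectangleF(50F, 50F, 300F, 400F), the edges are Left = 50, Top = 50, Right = 350, and Bottom = 450. A boundary of (300, 400, 100, 100) overlaps that region but extends past its right and bottom edges, so it would be matched by Overlapping and not by FullyInside.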
Frequently Asked Questions
How to extract pictures from a PDF?
You can open the PDF file, select the image, right-click it, and choose "Edit Text & Image". Then copy the selected image and paste it into another document application, such as Microsoft Word.
Using the C# PDF library, you can select an image from the PDF and copy it to another program from C# ASP.NET web and Windows Forms applications.
Is there a free PDF image extractor?
There are free online and desktop PDF tools that support image extraction from PDF files. Using the RasterEdge PDF C# library, you can extract images from a PDF file in a C# class.
How do I export selected images from a PDF?
You can copy the highlighted image from the PDF into Word or another document or image program.
Using the C# PDF image library, you can save images from a PDF to image files in PNG, JPEG, bitmap, and TIFF formats.
How to copy an image from a PDF and save as JPEG?
You can extract and save the image from the PDF using a PDF image extractor tool, and then use an image tool to convert it to JPEG format.
Using the RasterEdge XDoc.PDF C# library, you can select the image in the PDF and export it to a JPEG image directly in a C# application.
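As a rough illustration of the JPEG export step, the sketch below saves a System.Drawing.Bitmap as a JPEG file with an explicit quality setting. It uses only the standard System.Drawing APIs (Windows-only on recent .NET versions) and assumes that the bitmap obtained from an extracted image, for instance via GetBitmap(), is a System.Drawing.Bitmap, which the text above does not state explicitly. JpegExportSketch and SaveAsJpeg are hypothetical names.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Linq;

// Hypothetical helper (not an XDoc.PDF class): writes a bitmap to disk as JPEG.
public static class JpegExportSketch
{
    // quality ranges from 0 (smallest file) to 100 (best quality).
    public static void SaveAsJpeg(Bitmap bitmap, string outputPath, long quality = 90L)
    {
        // Look up the built-in JPEG encoder.
        ImageCodecInfo jpegCodec = ImageCodecInfo.GetImageEncoders()
            .First(codec => codec.FormatID == ImageFormat.Jpeg.Guid);

        using (var parameters = new EncoderParameters(1))
        {
            parameters.Param[0] = new EncoderParameter(
                System.Drawing.Imaging.Encoder.Quality, quality);
            bitmap.Save(outputPath, jpegCodec, parameters);
        }
    }
}
```

A call such as JpegExportSketch.SaveAsJpeg(bitmap, @"C:\output.jpg") would then write the file; the output path here is only a placeholder.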