C# PDF Text Remover Library
How to delete or remove text from a PDF file using C# .NET
How to use C# library source code to remove text content from a PDF document with the .NET PDF Component SDK
In this tutorial, you will learn how to find and remove text from a PDF file using C# in ASP.NET and Windows applications.
How to remove text from a PDF file using C#
- Professional PDF SDK library for deleting text from Adobe PDF files in Visual Studio .NET Framework programs
- Free evaluation components that can delete PDF text in both C#.NET WinForms and ASP.NET projects
- Related PDF text and image editing topics for the C# library: extract images from a PDF file, add text to a PDF, remove images from a PDF, add an image to a PDF, and search text in a PDF
- Supports WinForms (.NET Core, .NET Framework), ASP.NET .NET Core MVC, IIS, ASP.NET Ajax, Azure cloud service, DNN (DotNetNuke), SharePoint
- Delete text from a PDF file in preview in ASP.NET, without the Adobe PDF reader component installed
- C# class source code that helps users delete text characters at a specified position in a PDF from a .NET console application
- Pull text out of a selected PDF page or the whole PDF document in .NET WinForms
- Remove text formatting by modifying the text font, size, color, and so on
- Other PDF editing functions, such as adding PDF text, text boxes, and fields
C#: how to remove text search results from a PDF document
To remove text search results from PDF pages, you first need to find the text. With the XDoc.PDF for .NET SDK, you can find and locate text through the following methods:
- Run a simple text search
- Search with a regular expression
- Find all text inside a page region
- Find a text character by its page position
Simple text search
Using C#, you can run searches to find specific text items in a PDF file. A simple text search looks for a search term within a list of PDF pages or within a page region.
You can also set advanced search options when searching a PDF document. The search options and example C# source code are:
- WholeWord: Finds only occurrences of the complete word. For example, if you search for the word side, occurrences inside longer words such as inside or sidebar aren't found.
- IgnoreCase: When true, finds occurrences regardless of capitalization. For example, if you search for the word White, the words white and WHITE are also found. When false, only occurrences that match the capitalization you provide are found.
- ContextExpansion: The number of surrounding characters returned along with the matched text
RESearchOption searchOps = new RESearchOption();
searchOps.MatchString = "RasterEdge";
searchOps.IgnoreCase = true;
searchOps.WholeWord = false;
searchOps.ContextExpansion = 0;
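As an illustration of what the WholeWord and IgnoreCase options mean, the sketch below reproduces the same two behaviors on a plain string with the standard .NET Regex class. It does not call XDoc.PDF: the helper name CountMatches and the sample sentence are assumptions made for this example, and the SDK's own matching rules may differ in edge cases.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SearchOptionDemo
{
    // Hypothetical helper (not part of XDoc.PDF): counts occurrences of term in text.
    public static int CountMatches(string text, string term, bool ignoreCase, bool wholeWord)
    {
        // Escape the term so it is treated as literal text, not as a regex pattern.
        string pattern = Regex.Escape(term);
        if (wholeWord)
        {
            // \b anchors the match at word boundaries on both sides.
            pattern = @"\b" + pattern + @"\b";
        }
        RegexOptions options = ignoreCase ? RegexOptions.IgnoreCase : RegexOptions.None;
        return Regex.Matches(text, pattern, options).Count;
    }

    public static void Main()
    {
        string text = "Look inside: one side is White, the other side is white.";
        Console.WriteLine(CountMatches(text, "side", false, false));  // 3 (includes "inside")
        Console.WriteLine(CountMatches(text, "side", false, true));   // 2 (whole words only)
        Console.WriteLine(CountMatches(text, "white", false, false)); // 1 (case-sensitive)
        Console.WriteLine(CountMatches(text, "white", true, false));  // 2 (case ignored)
    }
}
```

Trying a few terms this way before running a delete can help you judge whether WholeWord = false would remove more text than intended.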
Text search with regular expression
In C#, you can run an advanced text search with a regular expression. The following C# example source code defines a search for URLs.
// Search pattern for URLs (RegexOptions requires: using System.Text.RegularExpressions;)
String pattern = @"\b(\S+)://(\S+)\b";
RegexOptions regexOps = RegexOptions.IgnoreCase;
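To check what this pattern matches before using it to delete anything, you can run it against a plain string with the standard .NET Regex class. The sketch below does that. FindUrls and the sample text are hypothetical and not part of XDoc.PDF.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UrlPatternDemo
{
    // Hypothetical helper: returns every substring matched by the tutorial's URL pattern.
    public static List<string> FindUrls(string text)
    {
        string pattern = @"\b(\S+)://(\S+)\b";
        var results = new List<string>();
        foreach (Match m in Regex.Matches(text, pattern, RegexOptions.IgnoreCase))
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Main()
    {
        string text = "Docs: https://www.rasteredge.com and FTP://files.example.org/a.pdf here";
        foreach (string url in FindUrls(text))
        {
            Console.WriteLine(url);
        }
        // Prints:
        // https://www.rasteredge.com
        // FTP://files.example.org/a.pdf
    }
}
```

Note that the pattern accepts any run of non-whitespace characters before "://", so it matches more than http and https links. Because it contains no literal letters, RegexOptions.IgnoreCase has no practical effect on this particular pattern.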
C#: how to search and remove text from PDF pages
The following C# example code shows how to run a simple text search and delete the results from a range of PDF pages.
String inputFilePath = @"C:\1.pdf";
String outputFilePath = @"C:\output.pdf";
// Open file
PDFDocument doc = new PDFDocument(inputFilePath);
// Search text "RasterEdge"
String matchString = "RasterEdge";
// Set search option
RESearchOption searchOps = new RESearchOption();
searchOps.MatchString = matchString;
searchOps.IgnoreCase = true;
searchOps.WholeWord = false;
searchOps.ContextExpansion = 0;
// Set search range (from page 1 to 3)
int pageOffset = 0;
int pageCount = 3;
// Remove all search results from the document
doc.SearchTextAndDelete(matchString, searchOps, pageOffset, pageCount);
// Save file
doc.Save(outputFilePath);
You can also search for text within a specified region of a PDF page and delete the results.
String inputFilePath = @"C:\1.pdf";
String outputFilePath = @"C:\output.pdf";
// Open file
PDFDocument doc = new PDFDocument(inputFilePath);
// Search text "RasterEdge"
String matchString = "RasterEdge";
// Set search option
RESearchOption searchOps = new RESearchOption();
searchOps.MatchString = matchString;
searchOps.IgnoreCase = true;
searchOps.WholeWord = false;
searchOps.ContextExpansion = 0;
// Set target page region in the 1st page.
int pageIndex = 0;
// Region: start point (0,0), width = 500, height = 300. Unit: pixel (in 96 dpi).
RectangleF pageRegion = new RectangleF(0, 0, 500, 300);
// Remove all search results found in the page region
doc.SearchTextAndDelete(matchString, searchOps, pageIndex, pageRegion);
// Save file
doc.Save(outputFilePath);
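The region in this example is given in pixels at 96 dpi, while coordinates read from a PDF are often expressed in points (1/72 inch). If your measurements are in points, a conversion along the following lines may help. This is a minimal sketch: the helper names PointsToPixels and RegionFromPoints are hypothetical and not part of XDoc.PDF, and the code only rescales values. It assumes both coordinate systems share the same origin, which the tutorial does not state.

```csharp
using System;
using System.Drawing;

public static class RegionUnits
{
    private const float PixelsPerInch = 96f; // resolution stated in the tutorial
    private const float PointsPerInch = 72f; // 1 point = 1/72 inch

    // Converts a length in points to pixels at 96 dpi.
    public static float PointsToPixels(float points)
    {
        return points * PixelsPerInch / PointsPerInch;
    }

    // Builds a pixel-based region from a rectangle measured in points.
    public static RectangleF RegionFromPoints(float x, float y, float width, float height)
    {
        return new RectangleF(
            PointsToPixels(x),
            PointsToPixels(y),
            PointsToPixels(width),
            PointsToPixels(height));
    }

    public static void Main()
    {
        // 375 x 225 points corresponds to the 500 x 300 pixel region used in the tutorial.
        RectangleF region = RegionFromPoints(0f, 0f, 375f, 225f);
        Console.WriteLine(region.Width + " x " + region.Height); // 500 x 300
    }
}
```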
C#: how to search and remove text from PDF pages using a regular expression
The following C# example code shows how to run a regular expression search over a range of PDF pages and delete the results.
String inputFilePath = @"C:\1.pdf";
String outputFilePath = @"C:\output.pdf";
// Open file
PDFDocument doc = new PDFDocument(inputFilePath);
// Search pattern for URL
String pattern = @"\b(\S+)://(\S+)\b";
RegexOptions regexOps = RegexOptions.IgnoreCase;
// Set search range (from page 1 to 3)
int pageOffset = 0;
int pageCount = 3;
// Remove all search results from the document
doc.SearchTextAndDelete(pattern, regexOps, pageOffset, pageCount);
// Save file
doc.Save(outputFilePath);
You can also run a regular expression search within a PDF page region and delete the results.
String inputFilePath = @"C:\1.pdf";
String outputFilePath = @"C:\output.pdf";
// Open file
PDFDocument doc = new PDFDocument(inputFilePath);
// Search pattern for URL
String pattern = @"\b(\S+)://(\S+)\b";
RegexOptions regexOps = RegexOptions.IgnoreCase;
// Set target page region in the 1st page.
int pageIndex = 0;
// Region: start point (0,0), width = 500, height = 300. Unit: pixel (in 96 dpi).
RectangleF pageRegion = new RectangleF(0, 0, 500, 300);
// Remove all search results found in the page region
doc.SearchTextAndDelete(pattern, regexOps, pageIndex, pageRegion);
// Save file
doc.Save(outputFilePath);
C#: delete all text at a page location
Besides removing text found by a search, you can also delete text content by page location, such as page coordinates or a page area.
Remove all text characters in a PDF page region using C#
You can delete all text characters within a region of a PDF page.
String inputFilePath = Program.RootPath + "\\" + "1.pdf";
String outputFilePath = Program.RootPath + "\\" + "output.pdf";
// open document
PDFDocument doc = new PDFDocument(inputFilePath);
// get the 3rd page
PDFPage page = (PDFPage)doc.GetPage(2);
// set redact region
RectangleF region = new RectangleF(100F, 100F, 300F, 300F);
// create redaction option
RedactionOptions options = new RedactionOptions();
options.AreaFillColor = Color.Black;
// process redaction
PDFTextHandler.RedactText(page, region, options);
// output file
doc.Save(outputFilePath);
Delete text content by page position (x, y coordinates)
You can remove the text character located at a given point on a PDF page.
// open a document
String inputFilePath = Program.RootPath + "\\" + "1.pdf";
PDFDocument doc = new PDFDocument(inputFilePath);
// get a text manager from the document object
PDFTextMgr textMgr = PDFTextHandler.ExportPDFTextManager(doc);
// get the first page from the document
int pageIndex = 0;
PDFPage page = (PDFPage)doc.GetPage(pageIndex);
// select char at position (127F, 187F)
PointF cursor = new PointF(127F, 187F);
PDFTextCharacter aChar = textMgr.SelectChar(page, cursor);
// delete a selected character
textMgr.DeleteChar(aChar);
// output the new document
String outputFilePath = Program.RootPath + "\\" + "output.pdf";
doc.Save(outputFilePath);
Delete characters in a PDF page
// open a document
String inputFilePath = Program.RootPath + "\\" + "1.pdf";
PDFDocument doc = new PDFDocument(inputFilePath);
// get a text manager from the document object
PDFTextMgr textMgr = PDFTextHandler.ExportPDFTextManager(doc);
// get the first page from the document
int pageIndex = 0;
PDFPage page = (PDFPage)doc.GetPage(pageIndex);
// extract all characters in the page
List<PDFTextCharacter> chars = textMgr.ExtractTextCharacter(page);
int cnt = 0;
// delete every third character, starting with the first (indices 0, 3, 6, ...)
foreach (PDFTextCharacter aChar in chars)
{
    if (cnt % 3 == 0)
    {
        textMgr.DeleteChar(aChar);
    }
    cnt++;
}
// output the new document
String outputFilePath = Program.RootPath + "\\" + "output.pdf";
doc.Save(outputFilePath);
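To see which characters the cnt % 3 == 0 test selects, the sketch below applies the same counter logic to a plain string instead of a PDF page. It does not use XDoc.PDF. RemoveEveryThird is a hypothetical helper written only for this illustration.

```csharp
using System;
using System.Text;

public static class EveryThirdDemo
{
    // Hypothetical helper: drops the characters at indices 0, 3, 6, ... and keeps the rest.
    public static string RemoveEveryThird(string input)
    {
        var kept = new StringBuilder();
        int cnt = 0;
        foreach (char c in input)
        {
            // Same test as the tutorial loop, inverted: keep what would not be deleted.
            if (cnt % 3 != 0)
            {
                kept.Append(c);
            }
            cnt++;
        }
        return kept.ToString();
    }

    public static void Main()
    {
        // Indices 0 (R), 3 (t), 6 (E) and 9 (e) are removed.
        Console.WriteLine(RemoveEveryThird("RasterEdge")); // aserdg
    }
}
```

Because the counter starts at 0, the very first character on the page is among those deleted. Change the test to cnt % 3 == 2 if you want to remove the third, sixth, and ninth characters instead.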