
PDF Text VB.NET Library
How to search and find text in PDF file using VB.NET


Learn how to search text in a PDF document and obtain text content and location information in a VB.NET application





In this VB.NET tutorial, you will learn how to search PDF for text in Visual Basic .NET applications.

  • Search text in a PDF document, in pages, or in page regions
  • Search text using regular expressions
  • Find and get the coordinates of text in a PDF
  • Easy to integrate into your VB.NET Windows Forms, WPF, and Console applications

How to find and search text in a PDF file using VB.NET

  1. Download the XDoc.PDF Text Editor VB.NET library
  2. Install the VB.NET library to search text in a PDF document
  3. Follow the step-by-step tutorial










  • Visual Studio .NET PDF document SDK, built on .NET Framework 2.0 and compatible with the VB.NET programming language
  • Search and find text content in multi-page Adobe PDF files in .NET WinForms and ASP.NET
  • Search text in PDF images by using XDoc.PDF SDK for VB.NET
  • Supports .NET WinForms, ASP.NET MVC in IIS, ASP.NET Ajax, Azure cloud service, DNN (DotNetNuke), and SharePoint
  • Find and get PDF text position details in a Visual Basic class program
  • Search a specified PDF page or the whole document
  • Supports various search options, such as whole word, ignore case, and match string
  • Search and replace PDF text programmatically in VB.NET
  • Online VB.NET class source code and a free VB.NET XDoc.PDF library and component are available


If the source PDF document has multiple pages, finding specific text by hand can be difficult. Our VB.NET PDF Document Add-On lets you search for text in a target PDF document programmatically; the samples on this page call the Search method of the PDFDocument class. Once the text is found, you can work with the results according to your needs, for example by reading the page index and coordinates of each match.

The API and VB.NET sample code below show how to search for text in a target PDF document from a Visual Studio project using the VB language. If you are a Visual C# .NET programmer, see the Visual C# tutorial for PDF text search in .NET projects.









About text search on PDF



Using the XDoc.PDF for .NET SDK, you can easily search text in a PDF document. You can find and locate text in the following ways:


  1. Use a regular expression to search and find text
  2. Find all matching text inside a page region, a page range, or the whole PDF document
  3. Extract the text coordinates from the search results




Text search options


Using VB.NET, you can enable advanced text search options through the RESearchOption class. The properties used in the samples on this page are listed below.


  1. MatchString: The search term, as set in the sample code below.
  2. WholeWord: When True, finds only occurrences of the complete word. For example, if you search for the word side, the word inside isn't found.
  3. IgnoreCase: When True, finds occurrences regardless of capitalization. For example, if you search for the word White, the words white and WHITE are also found. When False, only occurrences that match the capitalization you provide are found.
  4. ContextExpansion: The number of characters of surrounding text returned with each match.
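
To make the meaning of these options concrete, the following is a minimal sketch that imitates them with the standard .NET Regex class on a plain string. It does not call XDoc.PDF at all: the helper names BuildPattern and WithContext and the sample sentence are hypothetical, and the library's own matching rules may differ in detail.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module SearchOptionSketch
    ' Hypothetical helper: builds a .NET regex that imitates the documented
    ' meaning of WholeWord. It is not part of XDoc.PDF.
    Function BuildPattern(term As String, wholeWord As Boolean) As String
        Dim escaped As String = Regex.Escape(term)
        If wholeWord Then
            Return "\b" & escaped & "\b"
        End If
        Return escaped
    End Function

    ' Hypothetical helper: returns the match plus up to `expansion`
    ' characters on each side, imitating ContextExpansion.
    Function WithContext(source As String, m As Match, expansion As Integer) As String
        Dim startIndex As Integer = Math.Max(0, m.Index - expansion)
        Dim stopIndex As Integer = Math.Min(source.Length, m.Index + m.Length + expansion)
        Return source.Substring(startIndex, stopIndex - startIndex)
    End Function

    Sub Main()
        Dim sample As String = "Look inside; the white side is White."
        Dim opts As RegexOptions = RegexOptions.IgnoreCase

        ' Whole-word search for "side": the "side" in "inside" is skipped.
        For Each m As Match In Regex.Matches(sample, BuildPattern("side", True), opts)
            Console.WriteLine("'{0}' at {1}, context: '{2}'", m.Value, m.Index, WithContext(sample, m, 5))
        Next

        ' Case-insensitive search for "White": counts "white" and "White".
        Console.WriteLine(Regex.Matches(sample, BuildPattern("White", False), opts).Count)
    End Sub
End Module
```

Escaping the term with Regex.Escape keeps characters such as "." or "(" in a search term from being read as regex syntax.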




Text search with regular expression


You can search a PDF file with a regular expression. See the VB.NET example code further down this page for a regular expression search over PDF pages.
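
Before running a pattern against a PDF, it can help to check it on plain text first. The sketch below uses only System.Text.RegularExpressions with the same URL pattern and RegexOptions value as the PDF sample on this page. The sample string is a hypothetical stand-in for text extracted from a page, and no XDoc.PDF call is involved.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexPatternCheck
    Sub Main()
        ' Same pattern and option as the PDF regular expression sample.
        Dim pattern As String = "\b(\S+)://(\S+)\b"
        Dim regexOps As RegexOptions = RegexOptions.IgnoreCase

        ' Hypothetical text standing in for the content of a PDF page.
        Dim sample As String = "Docs: https://example.com/help and ftp://files.example.org"

        ' Group 1 is the part before "://", group 2 is the part after it.
        For Each m As Match In Regex.Matches(sample, pattern, regexOps)
            Console.WriteLine("Match: '{0}'  scheme='{1}'  rest='{2}'",
                              m.Value, m.Groups(1).Value, m.Groups(2).Value)
        Next
    End Sub
End Module
```

Because \S+ matches any run of non-whitespace characters, this pattern is deliberately loose; tighten it if your documents contain text such as "a://b" that should not count as a URL.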





Get searched text coordinates


After you run a text search on a PDF file in VB.NET code, you get a list of SearchResultItem objects. Each SearchResultItem has a CombinedResultArea property, a collection of result areas. Each result area exposes the page index (PageIndex) and a rectangle (Area) with the following values:

  1. Area.X: the X coordinate of the top-left corner of the text area on the PDF page
  2. Area.Y: the Y coordinate of the top-left corner of the text area on the PDF page
  3. Area.Width: the width of the text area
  4. Area.Height: the height of the text area
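
Once the four values are available in pixels (the samples on this page print them through ToPixel()), they can be handled like any other rectangle. The sketch below uses the standard System.Drawing.RectangleF type to test a match area against a region. The numeric values of the match area are hypothetical placeholders, not output from XDoc.PDF.

```vbnet
Imports System
Imports System.Drawing

Module ResultAreaSketch
    Sub Main()
        ' Hypothetical pixel values standing in for
        ' Area.X, Area.Y, Area.Width and Area.Height of one result area.
        Dim hit As New RectangleF(120.0F, 80.0F, 64.0F, 14.0F)

        ' Region used by the page-region sample: origin (0, 0), 500 x 300 pixels.
        Dim pageRegion As New RectangleF(0, 0, 500, 300)

        ' Contains is True only when the whole match area lies inside the region;
        ' IntersectsWith is True when the two rectangles overlap at all.
        Console.WriteLine("Fully inside region: {0}", pageRegion.Contains(hit))
        Console.WriteLine("Overlaps region: {0}", pageRegion.IntersectsWith(hit))
        Console.WriteLine("Bottom-right corner: {0},{1}", hit.Right, hit.Bottom)
    End Sub
End Module
```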










How to search text on PDF document using VB.NET?


The sections below show how to search text in a whole PDF document, in a range of pages, or in a page region using VB.NET code.







VB.NET search text on whole pdf document


The steps and VB.NET code below show how to search text in a whole PDF document.

  1. Create a new PDFDocument object from an existing PDF file
  2. Define a String variable holding the search term "RasterEdge"
  3. Create a new RESearchOption object and set the search options
  4. Call the PDFDocument.Search() method to search the PDF and return the search results



        Dim inputFilePath As String = "C:\1.pdf"

        ' Open file
        Dim doc As PDFDocument = New PDFDocument(inputFilePath)

        ' Search text "RasterEdge"
        Dim matchString As String = "RasterEdge"
        ' Set search option
        Dim searchOps As RESearchOption = New RESearchOption()
        searchOps.MatchString = matchString
        searchOps.IgnoreCase = True
        searchOps.WholeWord = False
        searchOps.ContextExpansion = 10

        ' Apply searching
        Dim result As SearchResult = doc.Search(matchString, searchOps)

        ' Show result
        If result.HaveMatched Then
            For Each item In result.Result
                Console.WriteLine("Matched String: '{0}'", item.MatchedString)
                Console.WriteLine("Context String: '{0}'", item.ContextString)
                Console.WriteLine("Result Area(s):")
                For Each area In item.CombinedResultArea
                    Console.WriteLine("  {0}: {1},{2}; W={3}; H={4}", area.PageIndex, area.Area.X.ToPixel(), area.Area.Y.ToPixel(), area.Area.Width.ToPixel(), area.Area.Height.ToPixel())
                Next
            Next
        End If






How to search text on a single PDF page or on consecutive PDF pages in VB.NET code


The VB.NET code below shows how to do a text search on the first three PDF pages.



Dim inputFilePath As String = "C:\1.pdf"

' Open file
Dim doc As PDFDocument = New PDFDocument(inputFilePath)

' Search text "RasterEdge"
Dim matchString As String = "RasterEdge"
' Set search option
Dim searchOps As RESearchOption = New RESearchOption()
searchOps.MatchString = matchString
searchOps.IgnoreCase = True
searchOps.WholeWord = False
searchOps.ContextExpansion = 10
' Set search range (pages 1 to 3; the offset of the first page is 0)
Dim pageOffset As Integer = 0
Dim pageCount As Integer = 3

' Apply searching
Dim result As SearchResult = doc.Search(matchString, searchOps, pageOffset, pageCount)

' Show result
If result.HaveMatched Then
    For Each item In result.Result
        Console.WriteLine("Matched String: '{0}'", item.MatchedString)
        Console.WriteLine("Context String: '{0}'", item.ContextString)
        Console.WriteLine("Result Area(s):")
        For Each area In item.CombinedResultArea
            Console.WriteLine("  {0}: {1},{2}; W={3}; H={4}", area.PageIndex, area.Area.X.ToPixel(), area.Area.Y.ToPixel(), area.Area.Width.ToPixel(), area.Area.Height.ToPixel())
        Next
    Next
End If






Search text in the specified page region in VB.NET


The VB.NET code below shows how to search text inside a region of a PDF page.



Dim inputFilePath As String = "C:\1.pdf"

' Open file
Dim doc As PDFDocument = New PDFDocument(inputFilePath)

' Search text "RasterEdge"
Dim matchString As String = "RasterEdge"
' Set search option
Dim searchOps As RESearchOption = New RESearchOption()
searchOps.MatchString = matchString
searchOps.IgnoreCase = True
searchOps.WholeWord = False
searchOps.ContextExpansion = 10
' Set target page region in the 1st page.
Dim pageIndex As Integer = 0
' Region: start point (0, 0), width = 500, height = 300. Unit: pixels (at 96 dpi).
' RectangleF requires "Imports System.Drawing" at the top of the file.
Dim pageRegion As RectangleF = New RectangleF(0, 0, 500, 300)

' Apply searching
Dim result As SearchResult = doc.Search(matchString, searchOps, pageIndex, pageRegion)

' Show result
If result.HaveMatched Then
    For Each item In result.Result
        Console.WriteLine("Matched String: '{0}'", item.MatchedString)
        Console.WriteLine("Context String: '{0}'", item.ContextString)
        Console.WriteLine("Result Area(s):")
        For Each area In item.CombinedResultArea
            Console.WriteLine("  {0}: {1},{2}; W={3}; H={4}", area.PageIndex, area.Area.X.ToPixel(), area.Area.Y.ToPixel(), area.Area.Width.ToPixel(), area.Area.Height.ToPixel())
        Next
    Next
End If








How to search text with regular expression on PDF file using Visual Basic .NET


The VB.NET code below shows how to search the first three pages of a PDF with a regular expression that matches URLs.



Dim inputFilePath As String = "C:\1.pdf"

' Open file
Dim doc As PDFDocument = New PDFDocument(inputFilePath)

' Search pattern for URL
Dim pattern As String = "\b(\S+)://(\S+)\b"
' RegexOptions requires "Imports System.Text.RegularExpressions" at the top of the file.
Dim regexOps As RegexOptions = RegexOptions.IgnoreCase
' Set search range (from page 1 to 3)
Dim pageOffset As Integer = 0
Dim pageCount As Integer = 3

' Apply searching
Dim result As MatchResult = doc.Search(pattern, regexOps, pageOffset, pageCount)

' Show result
If result.HaveMatched Then
    For Each item In result.GetResult()
        Console.WriteLine("Matched String: '{0}'", item.MatchedString)
        Console.WriteLine("Context String: '{0}'", item.ContextString)
        Console.WriteLine("Result Area(s):")
        For Each area In item.CombinedResultArea
            Console.WriteLine("  {0}: {1},{2}; W={3}; H={4}", area.PageIndex, area.Area.X.ToPixel(), area.Area.Y.ToPixel(), area.Area.Width.ToPixel(), area.Area.Height.ToPixel())
        Next
    Next
Else
    Console.WriteLine("No Matched Item")
End If