Pdf Library for .NET

Search for Text in PDF

The Pdf To Text Converter provided by SelectPdf, described in the Extract Text from PDF section, offers another useful feature: it can search for text in a PDF document and return the location of every occurrence.

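This is done with the Search method of the PdfToText class. The method takes the text to search for, a case-sensitivity flag, and a whole-words-only flag, and returns an array of TextPosition objects. Each TextPosition gives the page number of the occurrence (PageNumber) and its bounding rectangle on that page (X, Y, Width, Height).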
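As a minimal sketch of that call outside a web page, the console program below loads a document, runs a search, and prints the location of each occurrence. The file name and search term are hypothetical placeholders, and the program assumes the SelectPdf library is referenced by the project; only the members that also appear in the sample on this page are used.

```csharp
using System;
using SelectPdf;

class SearchSketch
{
    static void Main()
    {
        // Hypothetical input file and search term, for illustration only.
        string filePdf = "selectpdf.pdf";
        string searchText = "pdf";

        // instantiate a pdf to text converter object and load the file
        PdfToText pdfToText = new PdfToText();
        pdfToText.Load(filePdf);

        // caseSensitive = false, wholeWordsOnly = false:
        // ignore case and also match inside longer words
        TextPosition[] positions = pdfToText.Search(searchText, false, false);

        // print the page and bounding rectangle of every occurrence
        foreach (TextPosition position in positions)
        {
            Console.WriteLine(
                "Page {0}: x={1}, y={2}, width={3}, height={4}",
                position.PageNumber, position.X, position.Y,
                position.Width, position.Height);
        }
    }
}
```

If the text is not found, the loop body simply never runs, assuming Search returns an empty array in that case.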

Sample Code

This sample shows how to use the SelectPdf PDF Library for .NET to search for text in a PDF document. It runs in an ASP.NET page: the search text and options are read from form controls, and a new PDF with every occurrence highlighted is sent to the HTTP response.

// the test file
string filePdf = Server.MapPath("~/files/selectpdf.pdf");

// settings
bool caseSensitive = ChkCaseSensitive.Checked;
bool wholeWordsOnly = ChkWholeWordsOnly.Checked;

// instantiate a pdf to text converter object
PdfToText pdfToText = new PdfToText();

// load PDF file
pdfToText.Load(filePdf);

// search for text and retrieve all found text positions
TextPosition[] positions = pdfToText.Search(TxtSearchText.Text, 
    caseSensitive, wholeWordsOnly);

// open the existing PDF document in editing mode
PdfDocument doc = new PdfDocument(filePdf);

// highlight the found text in the existing PDF document
for (int i = 0; i < positions.Length; i++)
{
    TextPosition position = positions[i];

    // PageNumber is 1-based, so subtract 1 to index the Pages collection
    PdfPage page = doc.Pages[position.PageNumber - 1];

    PdfRectangleElement rect = new PdfRectangleElement(
        position.X, position.Y, position.Width, position.Height);
    rect.BackColor = new PdfColor(240, 240, 0);
    rect.Transparency = 30;
    page.Add(rect);
}

// save pdf document
doc.Save(Response, false, "Sample.pdf");

// close pdf document
doc.Close();
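The same TextPosition array can also be summarized without modifying the document. The helper below is a hypothetical sketch (the class and method names are not part of SelectPdf) that groups the search results by page and prints how many occurrences were found on each one. It relies only on the PageNumber property used in the sample above.

```csharp
using System;
using System.Linq;
using SelectPdf;

static class SearchSummary
{
    // Hypothetical helper: prints the number of matches found on each page.
    public static void PrintMatchesPerPage(TextPosition[] positions)
    {
        // group occurrences by page number and list pages in ascending order
        var perPage = positions
            .GroupBy(p => p.PageNumber)
            .OrderBy(g => g.Key);

        foreach (var group in perPage)
        {
            Console.WriteLine("Page {0}: {1} match(es)", group.Key, group.Count());
        }
    }
}
```

It could be called as SearchSummary.PrintMatchesPerPage(positions) right after the Search call. Pages with no occurrences are not listed.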