Search for Text in PDF
The Pdf To Text Converter provided by SelectPdf, described in the Extract Text from PDF section, offers one more useful feature: it can search for text in a PDF document and retrieve the location of every occurrence found.
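This is done with the Search method of the PdfToText class, which takes the text to search for plus two flags (case sensitive, whole words only) and returns an array of TextPosition objects. Each TextPosition gives the 1-based page number of a match and the rectangle (X, Y, Width, Height) that encloses it.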
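As a quick illustration of the Search call on its own, the following minimal console sketch lists every match with its page and rectangle. It uses only the PdfToText and TextPosition members that appear in the sample on this page. The file path, the search term, and the class name SearchTextDemo are placeholders chosen for illustration, and the SelectPdf library is assumed to be referenced by the project.

```csharp
using System;
using SelectPdf;

class SearchTextDemo
{
    static void Main(string[] args)
    {
        // Placeholder inputs (assumptions): pass a real PDF path and search term as arguments
        string filePdf = args.Length > 0 ? args[0] : "selectpdf.pdf";
        string searchText = args.Length > 1 ? args[1] : "pdf";

        // instantiate a pdf to text converter object and load the PDF file
        PdfToText pdfToText = new PdfToText();
        pdfToText.Load(filePdf);

        // case-insensitive search that also matches inside longer words
        TextPosition[] positions = pdfToText.Search(searchText, false, false);

        Console.WriteLine($"Found {positions.Length} occurrence(s) of \"{searchText}\".");

        // PageNumber is 1-based, as the highlighting sample relies on (Pages[PageNumber - 1])
        foreach (TextPosition position in positions)
        {
            Console.WriteLine(
                $"page {position.PageNumber}: x={position.X}, y={position.Y}, " +
                $"width={position.Width}, height={position.Height}");
        }
    }
}
```

Printing the positions first is a simple way to check that the search settings return the expected matches before modifying any document.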
This sample shows how to use SelectPdf PDF Library for .NET to search for text in a PDF document. A new PDF will be created highlighting the text that has been found.
// the test file
string filePdf = Server.MapPath("~/files/selectpdf.pdf");

// settings
bool caseSensitive = ChkCaseSensitive.Checked;
bool wholeWordsOnly = ChkWholeWordsOnly.Checked;

// instantiate a pdf to text converter object
PdfToText pdfToText = new PdfToText();

// load PDF file
pdfToText.Load(filePdf);

// search for text and retrieve all found text positions
TextPosition[] positions = pdfToText.Search(TxtSearchText.Text,
    caseSensitive, wholeWordsOnly);

// open the existing PDF document in editing mode
PdfDocument doc = new PdfDocument(filePdf);

// highlight the found text in the existing PDF document
for (int i = 0; i < positions.Length; i++)
{
    TextPosition position = (TextPosition)positions[i];

    PdfPage page = doc.Pages[position.PageNumber - 1];

    PdfRectangleElement rect = new PdfRectangleElement(
        position.X, position.Y, position.Width, position.Height);
    rect.BackColor = new PdfColor(240, 240, 0);
    rect.Transparency = 30;
    page.Add(rect);
}

// save pdf document
doc.Save(Response, false, "Sample.pdf");

// close pdf document
doc.Close();