SelectPdf for .NET - Pdf To Text Converter. Extract Text from PDF - VB.NET / ASP.NET MVC Sample

This sample shows how to use SelectPdf Pdf Library for .NET to extract text from a PDF document.

The sample uses the following (existing) test PDF:
Test PDF document

Text Layout:
Start Page:
End Page: (leave empty to extract until the last page)

Note: The free trial version of SelectPdf always extracts text from the first 3 pages of the PDF document, regardless of the page settings supplied.


Sample Code VB.NET



Imports System.Web.Mvc
Imports SelectPdf

Namespace Controllers
    Public Class PdfToTextConverterController
        Inherits Controller

        ' GET: PdfToTextConverter
        Function Index() As ActionResult
            Return View()
        End Function

        <HttpPost>
        Public Function SubmitAction(collection As FormCollection) As ActionResult
            ' the test file
            Dim filePdf As String = Server.MapPath("~/files/selectpdf.pdf")

            ' settings
            Dim text_layout As String = collection("DdlTextLayout")
            Dim textLayout As TextLayout = DirectCast([Enum].Parse(
                GetType(TextLayout), text_layout, True), TextLayout)

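            ' page range: fall back to the defaults when a field is empty or not a number
            Dim startPage As Integer
            If Not Integer.TryParse(collection("TxtStartPage"), startPage) Then
                startPage = 1
            End If

            ' an empty end page stays 0 (extract until the last page)
            Dim endPage As Integer
            If Not Integer.TryParse(collection("TxtEndPage"), endPage) Then
                endPage = 0
            End If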

            ' instantiate a pdf to text converter object
            Dim pdfToText As New PdfToText()

            ' load PDF file
            pdfToText.Load(filePdf)

            ' set the properties
            pdfToText.Layout = textLayout
            pdfToText.StartPageNumber = startPage
            pdfToText.EndPageNumber = endPage

            ' extract the text
            Dim text As String = pdfToText.GetText()

            ' convert text to UTF-8 bytes
            Dim utf8 As Byte() = System.Text.Encoding.UTF8.GetBytes(text)

            ' return the resulting text file
            Dim fileResult As FileResult = New FileContentResult(utf8,
                                                "text/plain; charset=UTF-8")
            fileResult.FileDownloadName = "output.txt"
            Return fileResult
        End Function
    End Class
End Namespace
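
The page-range handling in SubmitAction can also be isolated in a small helper, which makes the fallback rules easier to test on their own. The sketch below is only an illustration: ParsePage, its minimum parameter, and the PageRangeSketch module are hypothetical names, not part of SelectPdf, and the code uses only the .NET base class library. It assumes, as in the sample above, that the start page defaults to 1 and that an end page of 0 means "until the last page".

```vb
Imports System

Module PageRangeSketch
    ' Hypothetical helper (not part of SelectPdf).
    ' Parses a raw form value into a page number. Returns the fallback when
    ' the value is missing, not a valid integer, or below the given minimum.
    Function ParsePage(raw As String, fallback As Integer, minimum As Integer) As Integer
        Dim value As Integer
        If Integer.TryParse(raw, value) AndAlso value >= minimum Then
            Return value
        End If
        Return fallback
    End Function

    Sub Main()
        ' Start page: default 1, smallest accepted value 1.
        Console.WriteLine(ParsePage("3", 1, 1))      ' prints 3
        Console.WriteLine(ParsePage("", 1, 1))       ' prints 1
        Console.WriteLine(ParsePage("-2", 1, 1))     ' prints 1

        ' End page: default 0 (assumed to mean "until the last page").
        Console.WriteLine(ParsePage(Nothing, 0, 0))  ' prints 0
        Console.WriteLine(ParsePage("abc", 0, 0))    ' prints 0
    End Sub
End Module
```

In the controller, such a helper would be called with collection("TxtStartPage") and collection("TxtEndPage") before assigning StartPageNumber and EndPageNumber. Rejecting values below a minimum is a design choice of this sketch; the original sample passes any parsed integer through unchanged.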