SelectPdf for .NET - Pdf To Text Converter. Extract Text from PDF - VB.NET / ASP.NET Sample

This sample shows how to use SelectPdf Pdf Library for .NET to extract text from a PDF document.

The sample uses the following (existing) test PDF:
Test PDF document

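Sample inputs:

Text Layout: the layout used for the extracted text
Start Page: the first page to extract (the code defaults to 1)
End Page: the last page to extract (leave empty to extract until the last page)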

Note: The free trial version of SelectPdf always extracts text from the first 3 pages of the PDF document, regardless of the page settings it receives.


Sample Code VB.NET



Imports SelectPdf

Public Class pdf_to_text_converter
    Inherits System.Web.UI.Page

    Protected Sub BtnSubmit_Click(sender As Object, e As EventArgs)
        ' the test file
        Dim filePdf As String = Server.MapPath("~/files/selectpdf.pdf")

        ' settings
        Dim text_layout As String = DdlTextLayout.SelectedValue
        Dim textLayout As TextLayout = _
            DirectCast([Enum].Parse(GetType(TextLayout), text_layout, True),  _
            TextLayout)

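        ' start page: fall back to 1 when the field is empty or not a number
        Dim startPage As Integer
        If Not Integer.TryParse(TxtStartPage.Text, startPage) Then
            startPage = 1
        End If

        ' end page: stays 0 when the field is empty or not a number
        ' (an empty field extracts until the last page)
        Dim endPage As Integer = 0
        Integer.TryParse(TxtEndPage.Text, endPage)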

        ' instantiate a pdf to text converter object
        Dim pdfToText As New PdfToText()

        ' load PDF file
        pdfToText.Load(filePdf)

        ' set the properties
        pdfToText.Layout = textLayout
        pdfToText.StartPageNumber = startPage
        pdfToText.EndPageNumber = endPage

        ' extract the text
        Dim text As String = pdfToText.GetText()

        ' convert text to UTF-8 bytes
        Dim utf8 As Byte() = System.Text.Encoding.UTF8.GetBytes(text)

        ' send text to browser
        Response.Clear()
        Response.ClearHeaders()

        Response.AddHeader("Content-Type", "text/plain; charset=UTF-8")
        Response.AddHeader("Content-Length", utf8.Length.ToString())
        Response.AppendHeader("content-disposition", _
                              "attachment;filename=""output.txt""")

        Response.BinaryWrite(utf8)
        Response.[End]()
    End Sub
End Class
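
The page-number handling in the sample can also be isolated into a small helper, which makes the defaults easier to see and to test outside the web page. The following is only a minimal sketch: the module name PageRangeSketch and the function ParsePageNumber are hypothetical and are not part of SelectPdf. It relies only on the standard Integer.TryParse call and on the defaults the sample uses (1 for the start page, 0 for the end page).

```vbnet
Imports System

Module PageRangeSketch
    ' Hypothetical helper (not part of SelectPdf): returns the parsed page
    ' number, or the fallback when the text is empty or not a valid integer.
    Function ParsePageNumber(rawText As String, fallback As Integer) As Integer
        Dim value As Integer
        If Integer.TryParse(rawText, value) Then
            Return value
        End If
        Return fallback
    End Function

    Sub Main()
        ' Empty start page: falls back to 1, the sample's default.
        Console.WriteLine(ParsePageNumber("", 1))
        ' Empty end page: falls back to 0, the sample's default for
        ' an empty End Page field.
        Console.WriteLine(ParsePageNumber("", 0))
        ' A valid number is returned unchanged.
        Console.WriteLine(ParsePageNumber("5", 0))
    End Sub
End Module
```

In the page code, the two values could then be obtained with ParsePageNumber(TxtStartPage.Text, 1) and ParsePageNumber(TxtEndPage.Text, 0) before assigning them to StartPageNumber and EndPageNumber.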