PDF To TEXT API

SelectPdf Pdf To Text REST API is an online solution that lets you extract text from your PDF documents or search them for specific words. The API is easy to use, and the integration can be done in a few minutes with only a few lines of code.

The free trial key for the online API is valid for 7 days and it includes 200 conversions.

Features

  • Extract text from PDF.
  • Search PDF.
  • Specify start and end page for partial file processing.
  • Specify output format (plain text or html).
  • Use a PDF from an online location (url) or upload a local PDF document.

Usage

It’s very easy to use SelectPdf Pdf To Text REST API. All you need is a license key, and you can start extracting text from your PDFs right away. For examples in PHP, Java, Ruby, Python, Perl, Node.JS, C# or VB.net, see the Code Examples section.

Here is a basic usage example using a POST request:

API endpoint: https://selectpdf.com/api2/pdftotext/
Method: POST
Content Type: multipart/form-data

POST /api2/pdftotext/ HTTP/1.1
Content-Type: multipart/form-data; boundary=---011000010111000001101001
Host: selectpdf.com
Content-Length: 273

-----011000010111000001101001
Content-Disposition: form-data; name="key"

_put___your___license___key___here__
-----011000010111000001101001
Content-Disposition: form-data; name="url"

https://selectpdf.com/demo/files/selectpdf.pdf
-----011000010111000001101001--
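
The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names (`encode_multipart`, `build_request`, `extract_text`) are hypothetical, and decoding the response as UTF-8 is an assumption. Only the endpoint and the `key` and `url` fields come from the example above.

```python
import urllib.request
import uuid

API_ENDPOINT = "https://selectpdf.com/api2/pdftotext/"


def encode_multipart(fields, boundary=None):
    """Encode a dict of text fields as a multipart/form-data body."""
    boundary = boundary or uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        # Each part starts with "--" followed by the boundary.
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")
        lines.append(str(value))
    # The closing delimiter has an extra "--" appended.
    lines.append(f"--{boundary}--")
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    content_type = f"multipart/form-data; boundary={boundary}"
    return body, content_type


def build_request(license_key, pdf_url):
    """Build the POST request without sending it."""
    body, content_type = encode_multipart({"key": license_key, "url": pdf_url})
    return urllib.request.Request(
        API_ENDPOINT,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


def extract_text(license_key, pdf_url, timeout=60):
    """Send the request and return the response body (network call).

    Assumption: the response body is UTF-8 encoded text.
    """
    request = build_request(license_key, pdf_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `extract_text("your-license-key", "https://selectpdf.com/demo/files/selectpdf.pdf")` would send the request. `urllib` fills in the Content-Length header automatically.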

Options

Mandatory Parameters

SelectPdf Pdf To Text REST API has only 2 mandatory parameters. The rest of the parameters are optional. When they are missing, the default value is used.

Parameter  Description
key        Your API license key.
url        The URL of the PDF to process, or a single PDF file uploaded with the request.

Optional Parameters

These parameters are used by both “Convert” and “Search” actions.

Parameter      Description
action         Specify the action performed on the PDF. Possible values: Convert, Search. Default value: Convert.
start_page     Start page number in the PDF. Default value: 1 (first page).
end_page       End page number in the PDF. Default value: 0 (process until the last page).
user_password  PDF document password (if the document is password protected).
timeout        Maximum amount of time (in seconds) for this job (up to 120 seconds allowed). Default value: 30 seconds.

Text Extraction Parameters

These parameters are used only by “Convert” action.

Parameter      Description
text_layout    Output text layout. Possible values: 0 – Original, 1 – Reading Order. Default value: 0.
output_format  Output format. Possible values: 0 – Text, 1 – Html. Default value: 0.
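
As a rough sketch, the form fields for a “Convert” call could be assembled and checked on the client before sending. The function name `build_convert_fields` is hypothetical, and the range checks (including rejecting an `end_page` smaller than `start_page`) are client-side assumptions rather than documented API behavior. The parameter names, allowed values and defaults are taken from the tables above.

```python
def build_convert_fields(license_key, pdf_url, start_page=1, end_page=0,
                         text_layout=0, output_format=0, timeout=30,
                         user_password=None):
    """Return the form fields for a Convert request as a dict of strings."""
    if start_page < 1:
        raise ValueError("start_page must be 1 or greater")
    # end_page == 0 means "process until the last page".
    if end_page != 0 and end_page < start_page:
        raise ValueError("end_page must be 0 or not less than start_page")
    if text_layout not in (0, 1):
        raise ValueError("text_layout must be 0 (Original) or 1 (Reading Order)")
    if output_format not in (0, 1):
        raise ValueError("output_format must be 0 (Text) or 1 (Html)")
    if not 1 <= timeout <= 120:
        raise ValueError("timeout must be between 1 and 120 seconds")

    fields = {
        "key": license_key,
        "url": pdf_url,
        "action": "Convert",
        "start_page": str(start_page),
        "end_page": str(end_page),
        "text_layout": str(text_layout),
        "output_format": str(output_format),
        "timeout": str(timeout),
    }
    # Only send the password when the document is protected.
    if user_password is not None:
        fields["user_password"] = user_password
    return fields
```

For example, `build_convert_fields(key, pdf_url, start_page=2, end_page=5, output_format=1)` describes a request for pages 2 to 5 as HTML. The resulting dict still has to be posted as multipart/form-data.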

Search Parameters

These parameters are used only by “Search” action.

Parameter         Description
search_text       Text to search.
case_sensitive    Specify if the search is case sensitive or not. Possible values: True, False. Default value: False.
whole_words_only  Specify if the search works on whole words or not. Possible values: True, False. Default value: False.
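
A “Search” call can be prepared the same way. The sketch below uses hypothetical helper names (`build_search_fields`, `parse_search_response`). It only decodes the JSON response and makes no assumption about the fields inside it, since the structure of the returned positions is not described here.

```python
import json


def build_search_fields(license_key, pdf_url, search_text,
                        case_sensitive=False, whole_words_only=False):
    """Return the form fields for a Search request as a dict of strings."""
    if not search_text:
        raise ValueError("search_text must not be empty")
    return {
        "key": license_key,
        "url": pdf_url,
        "action": "Search",
        "search_text": search_text,
        # The documented values are the strings "True" and "False".
        "case_sensitive": str(bool(case_sensitive)),
        "whole_words_only": str(bool(whole_words_only)),
    }


def parse_search_response(body):
    """Decode the JSON document returned by a successful Search call."""
    return json.loads(body)
```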

Return Codes

Our API returns HTTP response codes, which you can check to see if the conversion was successful or not. Here is the list of HTTP codes used by SelectPdf REST API:

Code                        Description
200 OK                      The API call succeeded.
                            If action is ‘Convert’, the extracted text is returned.
                            If action is ‘Search’, a JSON document with the positions of the searched text is returned.
400 Bad Request             URL or file not uploaded. The body of the response contains an explanation in plain text.
401 Authorization Required  License key not specified or invalid. The body of the response contains an explanation in plain text.
415 Unsupported Media Type  The PDF to TEXT API requires data to be posted as multipart/form-data.
499 Custom                  Conversion error. The body of the response contains an explanation in plain text.
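
One way to act on these codes in client code is to map each documented status to an error. This is only a sketch: `PdfToTextError` and `check_response` are hypothetical names, and any status not listed above is reported as unexpected.

```python
class PdfToTextError(Exception):
    """Raised for any non-200 status (hypothetical helper, not part of the API)."""

    def __init__(self, status, reason, detail):
        super().__init__(f"{status} {reason}: {detail}")
        self.status = status
        self.reason = reason
        self.detail = detail


# Reasons taken from the list of return codes above.
STATUS_REASONS = {
    400: "Bad Request (URL or file not uploaded)",
    401: "Authorization Required (license key missing or invalid)",
    415: "Unsupported Media Type (data must be posted as multipart/form-data)",
    499: "Conversion error",
}


def check_response(status, body):
    """Return the body on 200, otherwise raise PdfToTextError."""
    if status == 200:
        return body
    reason = STATUS_REASONS.get(status, "Unexpected status")
    # For error codes the body carries a plain-text explanation.
    raise PdfToTextError(status, reason, body.strip())
```

With Python's `urllib.request`, a non-2xx response is raised as `urllib.error.HTTPError`. Its `code` attribute and `read()` method supply the status and body to pass to a helper like this one.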

Remarks

  • Concurrency: depending on payment plan, the following number of concurrent requests can be sent to the API:
      Free Trial – 1 request
      Entry Level – 2 requests
      Standard Level – 4 requests
      Advanced Level – 8 requests
      Premium Level – 8 requests
      Ultra Level – 16 requests
      Dedicated Level – 16 requests.

    If more requests are sent, they are either queued or rejected with a Too Many Requests error.

  • PDF file size: The API accepts files up to 100 MB. Every 50 pages count as 1 conversion credit.
  • The API can process PDF documents with selectable text. It will not work with text embedded in images. This is not an OCR tool.
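
The limits above can be reflected in client code, for instance by capping the number of worker threads and estimating credits in advance. This sketch assumes that a partial block of 50 pages is rounded up to a full credit, which the remarks do not state explicitly. The names `estimate_credits` and `run_bounded` are hypothetical.

```python
import math
from concurrent.futures import ThreadPoolExecutor

# Concurrent request limits per payment plan, as listed above.
CONCURRENT_REQUESTS = {
    "Free Trial": 1,
    "Entry": 2,
    "Standard": 4,
    "Advanced": 8,
    "Premium": 8,
    "Ultra": 16,
    "Dedicated": 16,
}


def estimate_credits(page_count):
    """Estimate conversion credits for a document.

    Assumption: each started block of 50 pages costs one credit.
    """
    if page_count < 1:
        raise ValueError("page_count must be 1 or greater")
    return math.ceil(page_count / 50)


def run_bounded(plan, jobs):
    """Run zero-argument callables with at most the plan's concurrency.

    Results are returned in the same order as the jobs.
    """
    with ThreadPoolExecutor(max_workers=CONCURRENT_REQUESTS[plan]) as pool:
        futures = [pool.submit(job) for job in jobs]
        return [future.result() for future in futures]
```

Keeping the pool size at or below the plan limit avoids having requests queued or rejected by the API.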

Code Examples

Pdf To Text Conversion in PHP

Pdf To Text Conversion in Java

Pdf To Text Conversion in Ruby

Pdf To Text Conversion in Python

Pdf To Text Conversion in C#

Pdf To Text Conversion in Go

Pdf To Text Conversion in Node.js

Test SelectPdf PDF To TEXT online RESTful API Now

Request a Demo License Key for SelectPdf REST API right now. Feel free to ask any questions if needed.

SelectPdf for Cloud’s platform-independent Pdf To Text API is a REST API that can be used from any language or platform that supports REST, including .NET, Java, PHP, Ruby, Rails, Python and jQuery. Almost all platforms and languages provide native REST clients, so the API works for web, desktop, mobile and cloud applications alike. Try the PDF to TEXT API now for free.