SelectPdf Pdf To Text REST API is an online solution that lets you extract text from your PDF documents or search them for specific words. The API is easy to use: integration takes a few minutes and only a few lines of code.
GET A DEMO LICENSE KEY NOW
The free trial key for the online API is valid for 7 days and it includes 200 conversions.
Features
- Extract text from PDF.
- Search PDF.
- Specify start and end page for partial file processing.
- Specify output format (plain text or HTML).
- Use a PDF from an online location (URL) or upload a local PDF document.
Usage
Using the SelectPdf Pdf To Text REST API requires only a license key. Once you have one, you can start extracting text from your PDFs right away. For samples in PHP, Java, Ruby, Python, C#, Go and Node.js, see the Code Examples section below.
Here is a basic usage example using a POST request:
Method: POST
Content Type: multipart/form-data
POST /api2/pdftotext/ HTTP/1.1
Content-Type: multipart/form-data; boundary=---011000010111000001101001
Host: selectpdf.com
Content-Length: 273

-----011000010111000001101001
Content-Disposition: form-data; name="key"

_put___your___license___key___here__
-----011000010111000001101001
Content-Disposition: form-data; name="url"

https://selectpdf.com/demo/files/selectpdf.pdf
-----011000010111000001101001--
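The request above can also be assembled programmatically. The following is a minimal sketch using only the Python standard library. The helper names `encode_multipart` and `extract_text` are illustrative and not part of the SelectPdf API. The fixed boundary is reused only so that the body matches the example (a random boundary is the usual choice), and the response is assumed to be UTF-8 text.

```python
import urllib.request

API_URL = "https://selectpdf.com/api2/pdftotext/"
# Boundary copied from the raw request above; any unique string works.
BOUNDARY = "---011000010111000001101001"


def encode_multipart(fields, boundary=BOUNDARY):
    """Encode simple text fields as a multipart/form-data body (bytes)."""
    lines = []
    for name, value in fields.items():
        lines.append("--" + boundary)
        lines.append('Content-Disposition: form-data; name="%s"' % name)
        lines.append("")  # blank line separates part headers from the value
        lines.append(str(value))
    lines.append("--" + boundary + "--")
    lines.append("")  # the body ends with a final CRLF
    return "\r\n".join(lines).encode("utf-8")


def extract_text(key, pdf_url):
    """POST the two mandatory fields and return the response body as text."""
    body = encode_multipart({"key": key, "url": pdf_url})
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "multipart/form-data; boundary=" + BOUNDARY},
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError for error status codes such as 401.
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")  # UTF-8 is an assumption
```

With the placeholder key and the demo URL, `encode_multipart` produces a 273-byte body, which agrees with the Content-Length header in the raw request.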
Options
Mandatory Parameters
SelectPdf Pdf To Text REST API has only 2 mandatory parameters. The rest of the parameters are optional. When they are missing, the default value is used.
Parameter | Description |
key | Your API license key |
url | The URL of the PDF to process |
— OR — | |
=== Single file upload === |
Optional Parameters
These parameters are used by both “Convert” and “Search” actions.
Parameter | Description |
action | Specify the action performed on the PDF. Possible values: Convert, Search. Default value: Convert. |
start_page | Start page number in the PDF. Default value is 1 (first page). |
end_page | End page number in the PDF. Default value is 0 (process till the last page). |
user_password | PDF document password (if the document is password protected). |
timeout | Maximum amount of time (in seconds) for this job (up to 120 seconds allowed). Default value: 30 seconds. |
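As a rough illustration of how the shared parameters fit together, the sketch below collects them into a dictionary of form fields. The function name `common_fields` and the client-side checks are assumptions made for this example; the API itself may validate differently. The fields would be posted as multipart/form-data alongside `key` and `url`, as in the basic usage example.

```python
MAX_TIMEOUT_SECONDS = 120  # upper bound stated in the table above


def common_fields(key, pdf_url, start_page=1, end_page=0,
                  user_password=None, timeout=30):
    """Return form fields shared by the Convert and Search actions."""
    # These checks are client-side assumptions, not documented API behavior.
    if start_page < 1:
        raise ValueError("start_page is 1-based")
    if end_page != 0 and end_page < start_page:
        raise ValueError("end_page must be 0 (last page) or >= start_page")
    if not 0 < timeout <= MAX_TIMEOUT_SECONDS:
        raise ValueError("timeout must be between 1 and 120 seconds")

    fields = {
        "key": key,
        "url": pdf_url,
        "start_page": str(start_page),
        "end_page": str(end_page),
        "timeout": str(timeout),
    }
    # Only send a password when the document is protected.
    if user_password is not None:
        fields["user_password"] = user_password
    return fields
```

For example, `common_fields(key, pdf_url, start_page=2, end_page=5)` describes a request that processes pages 2 through 5 only.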
Text Extraction Parameters
These parameters are used only by “Convert” action.
Parameter | Description |
text_layout | Output text layout. Possible values: 0 – Original, 1 – Reading Order. Default value: 0. |
output_format | Output format. Possible values: 0 – Text, 1 – Html. Default value: 0. |
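Because both options are numeric codes, named constants can make calling code easier to read. This is a small sketch; the enum and function names are hypothetical, and only the numeric values come from the table above.

```python
from enum import IntEnum


class TextLayout(IntEnum):
    ORIGINAL = 0
    READING_ORDER = 1


class OutputFormat(IntEnum):
    TEXT = 0
    HTML = 1


def convert_fields(text_layout=TextLayout.ORIGINAL,
                   output_format=OutputFormat.TEXT):
    """Return the form fields specific to the Convert action."""
    return {
        "action": "Convert",
        # int() first, so the numeric code is sent rather than the enum name.
        "text_layout": str(int(text_layout)),
        "output_format": str(int(output_format)),
    }
```

Calling `convert_fields(TextLayout.READING_ORDER, OutputFormat.HTML)` yields the fields for HTML output in reading order.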
Search Parameters
These parameters are used only by “Search” action.
Parameter | Description |
search_text | Text to search. |
case_sensitive | Specify if the search is case sensitive or not. Possible values: True, False. Default value: False. |
whole_words_only | Specify if the search works on whole words or not. Possible values: True, False. Default value: False. |
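A search request might be prepared as in the sketch below. The helper names are illustrative, and the empty-text check is an assumption. The layout of the JSON returned by a search is not described here, so the sketch only decodes it and does not assume any particular keys.

```python
import json


def search_fields(search_text, case_sensitive=False, whole_words_only=False):
    """Return the form fields specific to the Search action."""
    if not search_text:
        raise ValueError("search_text must not be empty")  # assumed rule
    return {
        "action": "Search",
        "search_text": search_text,
        # str(True) is "True" and str(False) is "False", as in the table.
        "case_sensitive": str(bool(case_sensitive)),
        "whole_words_only": str(bool(whole_words_only)),
    }


def parse_search_response(body):
    """Decode the JSON body of a successful Search call."""
    return json.loads(body)
```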
Return Codes
Our API returns HTTP response codes, which you can check to see if the conversion was successful or not. Here is the list of HTTP codes used by SelectPdf REST API:
Code | Description |
200 OK | The API call succeeded. If action is ‘Convert’, the extracted text is returned. If action is ‘Search’, a JSON document with the positions of the searched text is returned. |
400 Bad Request | No URL was specified and no file was uploaded. The body of the response contains an explanation in plain text. |
401 Authorization Required | License key not specified or invalid. The body of the response contains an explanation in plain text. |
415 Unsupported Media Type | The PDF to TEXT API requires data to be posted as multipart/form-data. |
499 Custom | Conversion error. The body of the response contains an explanation in plain text. |
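One way to act on these codes is to map them to a single exception type. The following is a sketch under that assumption; `PdfToTextError` and `check_response` are hypothetical names, and the reason strings paraphrase the table above.

```python
class PdfToTextError(Exception):
    """Raised for any non-200 response (illustrative name)."""

    def __init__(self, status, reason, detail):
        super().__init__("%d %s: %s" % (status, reason, detail))
        self.status = status
        self.detail = detail


# Reasons paraphrased from the table above.
KNOWN_ERRORS = {
    400: "Bad Request (URL or file not uploaded)",
    401: "Authorization Required (license key missing or invalid)",
    415: "Unsupported Media Type (post data as multipart/form-data)",
    499: "Conversion error",
}


def check_response(status, body):
    """Return body on 200 OK, otherwise raise PdfToTextError."""
    if status == 200:
        return body
    reason = KNOWN_ERRORS.get(status, "Unexpected status")
    # The API puts a plain-text explanation in the body of error responses.
    raise PdfToTextError(status, reason, body.strip())
```

The caller passes the HTTP status code and the decoded response body, so the helper works with any HTTP client.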
Remarks
- Concurrency: depending on payment plan, the following number of concurrent requests can be sent to the API:
Free Trial – 1 request
Entry Level – 2 requests
Standard Level – 4 requests
Advanced Level – 8 requests
Premium Level – 8 requests
Ultra Level – 16 requests
Dedicated Level – 16 requests
If more requests are sent, they are either queued or rejected with a Too Many Requests error.
- PDF file size: The API accepts files up to 100 MB. Every 50 pages count as 1 conversion credit.
- The API can process PDF documents with selectable text. It will not work with text embedded in images. This is not an OCR tool.
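The limits above can be mirrored on the client side so that a program does not exceed its plan. The sketch below is illustrative: the helper names are hypothetical, and it assumes that a partial block of 50 pages is rounded up to a full credit, which the remark above does not state explicitly.

```python
import math
import threading

# Concurrent request limits per plan, copied from the list above.
PLAN_LIMITS = {
    "Free Trial": 1,
    "Entry Level": 2,
    "Standard Level": 4,
    "Advanced Level": 8,
    "Premium Level": 8,
    "Ultra Level": 16,
    "Dedicated Level": 16,
}
PAGES_PER_CREDIT = 50


def estimate_credits(page_count):
    """Estimate credits, assuming a partial block of 50 pages counts as one."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    return math.ceil(page_count / PAGES_PER_CREDIT)


def make_limiter(plan):
    """Return a semaphore that caps in-flight requests at the plan's limit."""
    return threading.BoundedSemaphore(PLAN_LIMITS[plan])
```

A single limiter would be created once and shared by all worker threads, each wrapping its API call in `with limiter:` so that extra requests wait locally instead of being rejected.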
Code Examples
Pdf To Text Conversion in PHP
<?php
// Requires the pecl_http extension (http\Client).
$client = new http\Client;
$request = new http\Client\Request;

// Build the multipart/form-data body with the two mandatory fields.
$body = new http\Message\Body;
$body->addForm([
    'key' => '_put___your___license___key___here__',
    'url' => 'https://selectpdf.com/demo/files/selectpdf.pdf'
], null);

$request->setRequestUrl('https://selectpdf.com/api2/pdftotext/');
$request->setRequestMethod('POST');
$request->setBody($body);

$client->enqueue($request)->send();
$response = $client->getResponse();

// Print the extracted text (or the error explanation).
echo $response->getBody();
?>
Pdf To Text Conversion in Java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Send the multipart body as a pre-built string (java.net.http requires Java 11+).
HttpRequest request = HttpRequest.newBuilder()
    .uri(URI.create("https://selectpdf.com/api2/pdftotext/"))
    .header("Content-Type", "multipart/form-data; boundary=---011000010111000001101001")
    .method("POST", HttpRequest.BodyPublishers.ofString("-----011000010111000001101001\r\nContent-Disposition: form-data; name=\"key\"\r\n\r\n_put___your___license___key___here__\r\n-----011000010111000001101001\r\nContent-Disposition: form-data; name=\"url\"\r\n\r\nhttps://selectpdf.com/demo/files/selectpdf.pdf\r\n-----011000010111000001101001--\r\n"))
    .build();
HttpResponse<String> response = HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
System.out.println(response.body());
Pdf To Text Conversion in Ruby
require 'uri'
require 'net/http'
require 'openssl'

url = URI("https://selectpdf.com/api2/pdftotext/")

http = Net::HTTP.new(url.host, url.port)
http.use_ssl = true
# Verify the server certificate; VERIFY_NONE would disable TLS validation.
http.verify_mode = OpenSSL::SSL::VERIFY_PEER

request = Net::HTTP::Post.new(url)
request["Content-Type"] = 'multipart/form-data; boundary=---011000010111000001101001'
request.body = "-----011000010111000001101001\r\nContent-Disposition: form-data; name=\"key\"\r\n\r\n_put___your___license___key___here__\r\n-----011000010111000001101001\r\nContent-Disposition: form-data; name=\"url\"\r\n\r\nhttps://selectpdf.com/demo/files/selectpdf.pdf\r\n-----011000010111000001101001--\r\n"

response = http.request(request)
puts response.read_body
Pdf To Text Conversion in Python
import requests

url = "https://selectpdf.com/api2/pdftotext/"

# Pre-built multipart body; its boundary must match the Content-Type header.
payload = "-----011000010111000001101001\r\nContent-Disposition: form-data; name=\"key\"\r\n\r\n_put___your___license___key___here__\r\n-----011000010111000001101001\r\nContent-Disposition: form-data; name=\"url\"\r\n\r\nhttps://selectpdf.com/demo/files/selectpdf.pdf\r\n-----011000010111000001101001--\r\n"
headers = {"Content-Type": "multipart/form-data; boundary=---011000010111000001101001"}

response = requests.request("POST", url, data=payload, headers=headers)

print(response.text)
Pdf To Text Conversion in C#
using System;
using System.Net.Http;
using System.Net.Http.Headers;

var client = new HttpClient();
var request = new HttpRequestMessage
{
    Method = HttpMethod.Post,
    RequestUri = new Uri("https://selectpdf.com/api2/pdftotext/"),
    Content = new MultipartFormDataContent
    {
        new StringContent("_put___your___license___key___here__")
        {
            Headers =
            {
                ContentDisposition = new ContentDispositionHeaderValue("form-data")
                {
                    Name = "key",
                }
            }
        },
        new StringContent("https://selectpdf.com/demo/files/selectpdf.pdf")
        {
            Headers =
            {
                ContentDisposition = new ContentDispositionHeaderValue("form-data")
                {
                    Name = "url",
                }
            }
        },
    },
};
using (var response = await client.SendAsync(request))
{
    // Throws HttpRequestException for non-success status codes.
    response.EnsureSuccessStatusCode();
    var body = await response.Content.ReadAsStringAsync();
    Console.WriteLine(body);
}
Pdf To Text Conversion in Go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"strings"
)

func main() {
	url := "https://selectpdf.com/api2/pdftotext/"

	payload := strings.NewReader("-----011000010111000001101001\r\nContent-Disposition: form-data; name=\"key\"\r\n\r\n_put___your___license___key___here__\r\n-----011000010111000001101001\r\nContent-Disposition: form-data; name=\"url\"\r\n\r\nhttps://selectpdf.com/demo/files/selectpdf.pdf\r\n-----011000010111000001101001--\r\n")

	req, err := http.NewRequest("POST", url, payload)
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Add("Content-Type", "multipart/form-data; boundary=---011000010111000001101001")

	// Check the error before touching res, which is nil on failure.
	res, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer res.Body.Close()

	// io.ReadAll replaces the deprecated ioutil.ReadAll (Go 1.16+).
	body, err := io.ReadAll(res.Body)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(res.Status)
	fmt.Println(string(body))
}
Pdf To Text Conversion in Node.js
const request = require('request');

const options = {
  method: 'POST',
  url: 'https://selectpdf.com/api2/pdftotext/',
  // No Content-Type header here: with formData, the request library
  // generates the multipart boundary and sets the header itself.
  formData: {
    key: '_put___your___license___key___here__',
    url: 'https://selectpdf.com/demo/files/selectpdf.pdf'
  }
};

request(options, function (error, response, body) {
  if (error) throw new Error(error);

  console.log(body);
});
Test SelectPdf PDF To TEXT online RESTful API Now
Request a Demo License Key for SelectPdf REST API right now. Feel free to ask any questions if needed.
SelectPdf for Cloud’s platform-independent Pdf To Text API is a REST API, so it can be called from any language or platform that can send HTTP requests: .NET, Java, PHP, Ruby, Rails, Python, jQuery and many more. It works the same way on web, desktop, mobile and cloud platforms. Try the PDF to TEXT API now for free.