Package com.selectpdf
Class PdfToTextClient
java.lang.Object
com.selectpdf.ApiClient
com.selectpdf.PdfToTextClient
public class PdfToTextClient extends ApiClient
Pdf To Text Conversion with SelectPdf Online API.
package com.selectpdf;

public class PdfToText {
    public static void main(String[] args) throws Exception {
        String testUrl = "https://selectpdf.com/demo/files/selectpdf.pdf";
        String testPdf = "Input.pdf";
        String localFile = "Result.txt";
        String apiKey = "Your API key here";

        System.out.println(String.format("This is SelectPdf-%s.", ApiClient.CLIENT_VERSION));

        try {
            PdfToTextClient client = new PdfToTextClient(apiKey);

            // set parameters - see full list at https://selectpdf.com/pdf-to-text-api/
            client
                .setStartPage(1) // start page (processing starts from here)
                .setEndPage(0) // end page (set 0 to process the file until the end)
                .setOutputFormat(ApiEnums.OutputFormat.Text) // set output format (0-Text or 1-HTML)
            ;

            System.out.println("Starting pdf to text...");

            // convert local pdf to local text file
            client.getTextFromFileToFile(testPdf, localFile);

            // extract text from local pdf to memory
            // String text = client.getTextFromFile(testPdf);
            // print text
            // System.out.println(text);

            // convert pdf from public url to local text file
            // client.getTextFromUrlToFile(testUrl, localFile);

            // extract text from pdf from public url to memory
            // String text = client.getTextFromUrl(testUrl);
            // print text
            // System.out.println(text);

            System.out.println(String.format("Finished! Number of pages: %d.", client.getNumberOfPages()));

            // get API usage
            UsageClient usageClient = new UsageClient(apiKey);
            String usage = usageClient.getUsage(false);
            System.out.printf("Usage details: %s.\r\n", usage);

            // org.json.JSONObject usageObject = new org.json.JSONObject(usage);
            // int available = usageObject.getInt("available");
            // System.out.printf("Conversions remaining this month: %d.\r\n", available);
        }
        catch (Exception ex) {
            System.out.println("An error occurred: " + ex.getMessage());
        }
    }
}
-
Field Summary
Fields inherited from class com.selectpdf.ApiClient
apiAsyncEndpoint, apiEndpoint, apiWebElementsEndpoint, AsyncCallsMaxPings, AsyncCallsPingInterval, binaryData, CLIENT_VERSION, files, headers, jobId, lastHTTPCode, MULTIPART_FORM_DATA_BOUNDARY, NEW_LINE, numberOfPages, parameters
-
Constructor Summary
Constructor
    Description
PdfToTextClient(java.lang.String apiKey)
    Construct the Pdf To Text Client.
-
Method Summary
Modifier and Type  Method
    Description
java.lang.String  getTextFromFile(java.lang.String inputPdf)
    Get the text from the specified pdf.
java.lang.String  getTextFromFileAsync(java.lang.String inputPdf)
    Get the text from the specified pdf with an asynchronous call.
void  getTextFromFileToFile(java.lang.String inputPdf, java.lang.String outputFilePath)
    Get the text from the specified pdf and write it to the specified text file.
void  getTextFromFileToFileAsync(java.lang.String inputPdf, java.lang.String outputFilePath)
    Get the text from the specified pdf with an asynchronous call and write it to the specified text file.
void  getTextFromFileToStream(java.lang.String inputPdf, java.io.OutputStream stream)
    Get the text from the specified pdf and write it to the specified stream.
void  getTextFromFileToStreamAsync(java.lang.String inputPdf, java.io.OutputStream stream)
    Get the text from the specified pdf with an asynchronous call and write it to the specified stream.
java.lang.String  getTextFromUrl(java.lang.String url)
    Get the text from the specified pdf.
java.lang.String  getTextFromUrlAsync(java.lang.String url)
    Get the text from the specified pdf with an asynchronous call.
void  getTextFromUrlToFile(java.lang.String url, java.lang.String outputFilePath)
    Get the text from the specified pdf and write it to the specified text file.
void  getTextFromUrlToFileAsync(java.lang.String url, java.lang.String outputFilePath)
    Get the text from the specified pdf with an asynchronous call and write it to the specified text file.
void  getTextFromUrlToStream(java.lang.String url, java.io.OutputStream stream)
    Get the text from the specified pdf and write it to the specified stream.
void  getTextFromUrlToStreamAsync(java.lang.String url, java.io.OutputStream stream)
    Get the text from the specified pdf with an asynchronous call and write it to the specified stream.
java.lang.String  searchFile(java.lang.String inputPdf, java.lang.String textToSearch)
    Search for a specific text in a PDF document.
java.lang.String  searchFile(java.lang.String inputPdf, java.lang.String textToSearch, java.lang.Boolean caseSensitive, java.lang.Boolean wholeWordsOnly)
    Search for a specific text in a PDF document.
java.lang.String  searchFileAsync(java.lang.String inputPdf, java.lang.String textToSearch)
    Search for a specific text in a PDF document with an asynchronous call.
java.lang.String  searchFileAsync(java.lang.String inputPdf, java.lang.String textToSearch, java.lang.Boolean caseSensitive, java.lang.Boolean wholeWordsOnly)
    Search for a specific text in a PDF document with an asynchronous call.
java.lang.String  searchUrl(java.lang.String url, java.lang.String textToSearch)
    Search for a specific text in a PDF document.
java.lang.String  searchUrl(java.lang.String url, java.lang.String textToSearch, java.lang.Boolean caseSensitive, java.lang.Boolean wholeWordsOnly)
    Search for a specific text in a PDF document.
java.lang.String  searchUrlAsync(java.lang.String url, java.lang.String textToSearch)
    Search for a specific text in a PDF document with an asynchronous call.
java.lang.String  searchUrlAsync(java.lang.String url, java.lang.String textToSearch, java.lang.Boolean caseSensitive, java.lang.Boolean wholeWordsOnly)
    Search for a specific text in a PDF document with an asynchronous call.
PdfToTextClient  setCustomParameter(java.lang.String parameterName, java.lang.String parameterValue)
    Set a custom parameter.
PdfToTextClient  setEndPage(int endPage)
    Set End Page number.
PdfToTextClient  setOutputFormat(ApiEnums.OutputFormat outputFormat)
    Set the output format.
PdfToTextClient  setStartPage(int startPage)
    Set Start Page number.
PdfToTextClient  setTextLayout(ApiEnums.TextLayout textLayout)
    Set the text layout.
PdfToTextClient  setTimeout(int timeout)
    Set the maximum amount of time (in seconds) for this job.
PdfToTextClient  setUserPassword(java.lang.String userPassword)
    Set PDF user password.

Methods inherited from class com.selectpdf.ApiClient
getNumberOfPages, performPost, performPostAsMultipartFormData, serializeDictionary, serializeParameters, setApiAsyncEndpoint, setApiEndpoint, setApiWebElementsEndpoint, startAsyncJob, startAsyncJobMultipartFormData
-
Constructor Details
-
PdfToTextClient
public PdfToTextClient(java.lang.String apiKey)
Construct the Pdf To Text Client.
Parameters:
apiKey - API Key.
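The usage example at the top of this page hard-codes the API key. A minimal sketch of reading it from the environment instead is shown here; the ApiKeys helper and the variable name SELECTPDF_API_KEY are hypothetical and not part of the SelectPdf client.

```java
import java.util.Map;

// Hypothetical helper (not part of com.selectpdf): resolves the API key from
// an environment map so the key does not have to be hard-coded in source.
final class ApiKeys {
    // Assumed variable name; use whatever fits your deployment.
    static final String ENV_NAME = "SELECTPDF_API_KEY";

    static String resolve(Map<String, String> env) {
        String key = env.get(ENV_NAME);
        if (key == null || key.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable " + ENV_NAME);
        }
        return key.trim();
    }

    public static void main(String[] args) {
        String apiKey = resolve(System.getenv());
        // The key would then be passed to the constructor documented above:
        // PdfToTextClient client = new PdfToTextClient(apiKey);
        System.out.println("API key loaded (" + apiKey.length() + " characters).");
    }
}
```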
-
Method Details
-
getTextFromFile
public java.lang.String getTextFromFile(java.lang.String inputPdf)
Get the text from the specified pdf.
Parameters:
inputPdf - Path to a local PDF file.
Returns:
Extracted text.
-
getTextFromFileToFile
public void getTextFromFileToFile(java.lang.String inputPdf, java.lang.String outputFilePath) throws java.io.IOException
Get the text from the specified pdf and write it to the specified text file.
Parameters:
inputPdf - Path to a local PDF file.
outputFilePath - The output file where the resulting text will be written.
Throws:
java.io.IOException
-
getTextFromFileToStream
public void getTextFromFileToStream(java.lang.String inputPdf, java.io.OutputStream stream) throws java.io.IOException
Get the text from the specified pdf and write it to the specified stream.
Parameters:
inputPdf - Path to a local PDF file.
stream - The output stream where the resulting text will be written.
Throws:
java.io.IOException
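Because the stream variants accept any java.io.OutputStream, the output can be collected in memory as well as written to disk. The sketch below shows that pattern with a ByteArrayOutputStream. The StreamWriter interface and the fake writer in main are stand-ins so that the code runs without the SelectPdf client, and decoding as UTF-8 is an assumption, since this page does not state the encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class StreamCapture {
    // Hypothetical stand-in for a call such as
    // client.getTextFromFileToStream(inputPdf, stream).
    interface StreamWriter {
        void writeTo(OutputStream stream) throws IOException;
    }

    // Runs the writer against an in-memory buffer and decodes the bytes.
    // UTF-8 is an assumption; this page does not state the encoding.
    static String capture(StreamWriter writer) throws IOException {
        try (ByteArrayOutputStream buffer = new ByteArrayOutputStream()) {
            writer.writeTo(buffer);
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Fake writer so the sketch runs without the SelectPdf client.
        String text = capture(out -> out.write("Hello PDF".getBytes(StandardCharsets.UTF_8)));
        System.out.println(text);
    }
}
```

With the real client, the lambda would presumably become something like stream -> client.getTextFromFileToStream("Input.pdf", stream).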
-
getTextFromFileAsync
public java.lang.String getTextFromFileAsync(java.lang.String inputPdf)
Get the text from the specified pdf with an asynchronous call.
Parameters:
inputPdf - Path to a local PDF file.
Returns:
Extracted text.
-
getTextFromFileToFileAsync
public void getTextFromFileToFileAsync(java.lang.String inputPdf, java.lang.String outputFilePath) throws java.io.IOException
Get the text from the specified pdf with an asynchronous call and write it to the specified text file.
Parameters:
inputPdf - Path to a local PDF file.
outputFilePath - The output file where the resulting text will be written.
Throws:
java.io.IOException
-
getTextFromFileToStreamAsync
public void getTextFromFileToStreamAsync(java.lang.String inputPdf, java.io.OutputStream stream) throws java.io.IOException
Get the text from the specified pdf with an asynchronous call and write it to the specified stream.
Parameters:
inputPdf - Path to a local PDF file.
stream - The output stream where the resulting text will be written.
Throws:
java.io.IOException
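This page does not describe how the asynchronous variants work internally. The inherited fields AsyncCallsMaxPings and AsyncCallsPingInterval suggest that a job is started and then polled for completion. Under that assumption only, a generic polling loop might look like the sketch below. Poller, its parameters and the status Supplier are hypothetical; this is not SelectPdf's implementation.

```java
import java.util.Optional;
import java.util.function.Supplier;

final class Poller {
    // Calls check up to maxPings times, sleeping intervalMillis between
    // attempts, and returns the first non-empty result.
    static <T> T poll(Supplier<Optional<T>> check, int maxPings, long intervalMillis)
            throws InterruptedException {
        for (int ping = 1; ping <= maxPings; ping++) {
            Optional<T> result = check.get();
            if (result.isPresent()) {
                return result.get();
            }
            if (ping < maxPings) {
                Thread.sleep(intervalMillis);
            }
        }
        throw new IllegalStateException("No result after " + maxPings + " pings");
    }

    public static void main(String[] args) throws InterruptedException {
        int[] calls = {0};
        // Fake job that reports completion on the third status check.
        String text = poll(
                () -> ++calls[0] < 3 ? Optional.<String>empty() : Optional.of("done"),
                10, 5);
        System.out.println(text + " after " + calls[0] + " checks");
    }
}
```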
-
getTextFromUrl
public java.lang.String getTextFromUrl(java.lang.String url)
Get the text from the specified pdf.
Parameters:
url - Address of the PDF file.
Returns:
Extracted text.
-
getTextFromUrlToFile
public void getTextFromUrlToFile(java.lang.String url, java.lang.String outputFilePath) throws java.io.IOException
Get the text from the specified pdf and write it to the specified text file.
Parameters:
url - Address of the PDF file.
outputFilePath - The output file where the resulting text will be written.
Throws:
java.io.IOException
-
getTextFromUrlToStream
public void getTextFromUrlToStream(java.lang.String url, java.io.OutputStream stream) throws java.io.IOException
Get the text from the specified pdf and write it to the specified stream.
Parameters:
url - Address of the PDF file.
stream - The output stream where the resulting text will be written.
Throws:
java.io.IOException
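The usage example on this page passes a public URL to the URL variants and a local path to the file variants. A small client-side check can catch a local path passed to a URL method by mistake. The PdfUrls helper below is hypothetical, and restricting the check to http and https is an assumption; this page does not list the accepted schemes.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class PdfUrls {
    // Returns true for absolute http/https URLs that include a host.
    static boolean isHttpUrl(String url) {
        if (url == null) {
            return false;
        }
        try {
            URI uri = new URI(url.trim());
            String scheme = uri.getScheme();
            return uri.getHost() != null
                    && ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme));
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isHttpUrl("https://selectpdf.com/demo/files/selectpdf.pdf"));
        System.out.println(isHttpUrl("Input.pdf"));
    }
}
```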
-
getTextFromUrlAsync
public java.lang.String getTextFromUrlAsync(java.lang.String url)
Get the text from the specified pdf with an asynchronous call.
Parameters:
url - Address of the PDF file.
Returns:
Extracted text.
-
getTextFromUrlToFileAsync
public void getTextFromUrlToFileAsync(java.lang.String url, java.lang.String outputFilePath) throws java.io.IOException
Get the text from the specified pdf with an asynchronous call and write it to the specified text file.
Parameters:
url - Address of the PDF file.
outputFilePath - The output file where the resulting text will be written.
Throws:
java.io.IOException
-
getTextFromUrlToStreamAsync
public void getTextFromUrlToStreamAsync(java.lang.String url, java.io.OutputStream stream) throws java.io.IOException
Get the text from the specified pdf with an asynchronous call and write it to the specified stream.
Parameters:
url - Address of the PDF file.
stream - The output stream where the resulting text will be written.
Throws:
java.io.IOException
-
searchFile
public java.lang.String searchFile(java.lang.String inputPdf, java.lang.String textToSearch)
Search for specific text in a PDF document. The search is case insensitive and also matches partial words. The pages searched are specified by the setStartPage() and setEndPage() methods.
Parameters:
inputPdf - Path to a local PDF file.
textToSearch - Text to search.
Returns:
List with text positions in the current PDF document.
-
searchFile
public java.lang.String searchFile(java.lang.String inputPdf, java.lang.String textToSearch, java.lang.Boolean caseSensitive, java.lang.Boolean wholeWordsOnly)
Search for specific text in a PDF document. The pages searched are specified by the setStartPage() and setEndPage() methods.
Parameters:
inputPdf - Path to a local PDF file.
textToSearch - Text to search.
caseSensitive - Whether the search is case sensitive.
wholeWordsOnly - Whether the search matches whole words only.
Returns:
List with text positions in the current PDF document.
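The effect of the caseSensitive and wholeWordsOnly flags can be illustrated locally with java.util.regex. This is only a sketch of what the two options mean on a plain string. It is not how the service searches a PDF, and the SearchFlags class is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SearchFlags {
    // Counts occurrences of term in text under the two search options.
    static int countMatches(String text, String term,
                            boolean caseSensitive, boolean wholeWordsOnly) {
        if (term.isEmpty()) {
            return 0;
        }
        // Quote the term so regex metacharacters are matched literally.
        String regex = Pattern.quote(term);
        if (wholeWordsOnly) {
            regex = "\\b" + regex + "\\b";
        }
        int flags = caseSensitive ? 0 : (Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        Matcher matcher = Pattern.compile(regex, flags).matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "Cat catalog cat";
        // Defaults of the two-argument overloads: case insensitive, partial words.
        System.out.println(countMatches(text, "cat", false, false));
        // Whole words only, case sensitive.
        System.out.println(countMatches(text, "cat", true, true));
    }
}
```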
-
searchFileAsync
public java.lang.String searchFileAsync(java.lang.String inputPdf, java.lang.String textToSearch)
Search for specific text in a PDF document with an asynchronous call. The search is case insensitive and also matches partial words. The pages searched are specified by the setStartPage() and setEndPage() methods.
Parameters:
inputPdf - Path to a local PDF file.
textToSearch - Text to search.
Returns:
List with text positions in the current PDF document.
-
searchFileAsync
public java.lang.String searchFileAsync(java.lang.String inputPdf, java.lang.String textToSearch, java.lang.Boolean caseSensitive, java.lang.Boolean wholeWordsOnly)
Search for specific text in a PDF document with an asynchronous call. The pages searched are specified by the setStartPage() and setEndPage() methods.
Parameters:
inputPdf - Path to a local PDF file.
textToSearch - Text to search.
caseSensitive - Whether the search is case sensitive.
wholeWordsOnly - Whether the search matches whole words only.
Returns:
List with text positions in the current PDF document.
-
searchUrl
public java.lang.String searchUrl(java.lang.String url, java.lang.String textToSearch)
Search for specific text in a PDF document. The search is case insensitive and also matches partial words. The pages searched are specified by the setStartPage() and setEndPage() methods.
Parameters:
url - Address of the PDF file.
textToSearch - Text to search.
Returns:
List with text positions in the current PDF document.
-
searchUrl
public java.lang.String searchUrl(java.lang.String url, java.lang.String textToSearch, java.lang.Boolean caseSensitive, java.lang.Boolean wholeWordsOnly)
Search for specific text in a PDF document. The pages searched are specified by the setStartPage() and setEndPage() methods.
Parameters:
url - Address of the PDF file.
textToSearch - Text to search.
caseSensitive - Whether the search is case sensitive.
wholeWordsOnly - Whether the search matches whole words only.
Returns:
List with text positions in the current PDF document.
-
searchUrlAsync
public java.lang.String searchUrlAsync(java.lang.String url, java.lang.String textToSearch)
Search for specific text in a PDF document with an asynchronous call. The search is case insensitive and also matches partial words. The pages searched are specified by the setStartPage() and setEndPage() methods.
Parameters:
url - Address of the PDF file.
textToSearch - Text to search.
Returns:
List with text positions in the current PDF document.
-
searchUrlAsync
public java.lang.String searchUrlAsync(java.lang.String url, java.lang.String textToSearch, java.lang.Boolean caseSensitive, java.lang.Boolean wholeWordsOnly)
Search for specific text in a PDF document with an asynchronous call. The pages searched are specified by the setStartPage() and setEndPage() methods.
Parameters:
url - Address of the PDF file.
textToSearch - Text to search.
caseSensitive - Whether the search is case sensitive.
wholeWordsOnly - Whether the search matches whole words only.
Returns:
List with text positions in the current PDF document.
-
setStartPage
public PdfToTextClient setStartPage(int startPage)
Set Start Page number. Default value is 1 (first page of the document).
Parameters:
startPage - Start page number (1-based).
Returns:
Reference to the current object.
-
setEndPage
public PdfToTextClient setEndPage(int endPage)
Set End Page number. Default value is 0 (process until the last page of the document).
Parameters:
endPage - End page number (1-based).
Returns:
Reference to the current object.
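The page range convention (1-based start page, end page 0 meaning "until the last page") can be expressed as a small calculation. The PageRange helper below is hypothetical. Clamping an end page beyond the document length and treating an inverted range as empty are assumptions, since this page does not say how the service handles those cases.

```java
final class PageRange {
    // Resolves the effective end page; 0 means "until the last page".
    static int effectiveEnd(int endPage, int totalPages) {
        return endPage == 0 ? totalPages : Math.min(endPage, totalPages);
    }

    // Number of pages that a given start/end setting would cover.
    static int pageCount(int startPage, int endPage, int totalPages) {
        if (startPage < 1) {
            throw new IllegalArgumentException("startPage is 1-based");
        }
        if (endPage < 0) {
            throw new IllegalArgumentException("endPage must be 0 or a 1-based page number");
        }
        int end = effectiveEnd(endPage, totalPages);
        return Math.max(0, end - startPage + 1);
    }

    public static void main(String[] args) {
        // Defaults documented above: start page 1, end page 0.
        System.out.println(pageCount(1, 0, 10));
        System.out.println(pageCount(3, 5, 10));
    }
}
```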
-
setUserPassword
public PdfToTextClient setUserPassword(java.lang.String userPassword)
Set PDF user password.
Parameters:
userPassword - PDF user password.
Returns:
Reference to the current object.
-
setTextLayout
public PdfToTextClient setTextLayout(ApiEnums.TextLayout textLayout)
Set the text layout. The default value is TextLayout.Original.
Parameters:
textLayout - The text layout.
Returns:
Reference to the current object.
-
setOutputFormat
public PdfToTextClient setOutputFormat(ApiEnums.OutputFormat outputFormat)
Set the output format. The default value is OutputFormat.Text.
Parameters:
outputFormat - The output format.
Returns:
Reference to the current object.
-
setTimeout
public PdfToTextClient setTimeout(int timeout)
Set the maximum amount of time (in seconds) for this job. The default value is 30 seconds. Use a larger value (up to 120 seconds allowed) for large documents.
Parameters:
timeout - Timeout in seconds.
Returns:
Reference to the current object.
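Given the documented default of 30 seconds and maximum of 120 seconds, a caller may want to normalise a requested timeout before passing it to setTimeout. The Timeouts helper below is a hypothetical sketch. Falling back to the default for non-positive values is an assumption, as this page does not say how such values are treated.

```java
final class Timeouts {
    static final int DEFAULT_SECONDS = 30;  // documented default
    static final int MAX_SECONDS = 120;     // documented maximum

    // Returns a timeout within the documented limits.
    static int clamp(int requestedSeconds) {
        if (requestedSeconds <= 0) {
            return DEFAULT_SECONDS; // assumption: treat invalid input as "use the default"
        }
        return Math.min(requestedSeconds, MAX_SECONDS);
    }

    public static void main(String[] args) {
        // With the real client this might be used as:
        // client.setTimeout(Timeouts.clamp(300));
        System.out.println(clamp(300));
    }
}
```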
-
setCustomParameter
public PdfToTextClient setCustomParameter(java.lang.String parameterName, java.lang.String parameterValue)
Set a custom parameter. Do not use this method unless advised by SelectPdf.
Parameters:
parameterName - Parameter name.
parameterValue - Parameter value.
Returns:
Reference to the current object.
-