SelectPdf Online REST API is a platform-independent PDF manipulation API. It is cloud-based and can be used from any language: .NET (C# or VB.NET), Java, PHP, Python, Go, Ruby, Node.js, Perl and many more. Today we are presenting the dedicated Python client library for the SelectPdf API.
The SelectPdf Python client library can be used to take advantage of the features offered by the SelectPdf Online REST API:
HTML to PDF REST API – Use SelectPdf HTML To PDF Online REST API to generate PDF documents from web page URLs or raw HTML code.
PDF to TEXT REST API – SelectPdf Pdf To Text REST API is a cloud-based solution that can be used to extract text from PDF documents or search PDF documents for specific words.
PDF Merge REST API – SelectPdf Pdf Merge REST API is an online solution that can be used to merge local or remote PDFs into a final PDF document.
All these APIs can be easily integrated with Python scripts and applications using the dedicated client library.
Installation
Download selectpdf-api-python-client-1.4.0.zip, unzip it and run:
python setup.py install
OR
Install SelectPdf Python Client for Online API via PyPI: SelectPdf API on PyPI.
OR
Clone selectpdf-api-python-client from Github and install the library.
cd selectpdf-api-python-client
python setup.py install
Get a trial key for SelectPdf online REST API
Once the library is installed, you need a key to be able to access the API.
GET A DEMO LICENSE KEY NOW
The free trial key for the online API is valid for 7 days and it includes 200 conversions.
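The samples below place the key directly in an apiKey variable. As a minimal sketch, the key could instead be read from an environment variable so that it stays out of source control. The variable name SELECTPDF_API_KEY and the helper load_api_key are assumptions made for this illustration. They are not part of the SelectPdf library.

```python
import os

# Hypothetical variable name chosen for this sketch; SelectPdf does not define it.
API_KEY_ENV_VAR = "SELECTPDF_API_KEY"


def load_api_key(environ=None):
    """Return the API key found in the environment, or raise if it is missing."""
    if environ is None:
        environ = os.environ
    key = environ.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(
            "Set the {0} environment variable to your SelectPdf key.".format(API_KEY_ENV_VAR)
        )
    return key


# Usage in the samples that follow (instead of a hard-coded string):
# apiKey = load_api_key()
```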
Sample Code
The Python client library makes accessing the SelectPdf online REST API very easy. Here are a few samples that present the main features of the API. For details and the full list of parameters, see the individual API pages: HTML to PDF API, PDF to TEXT API or PDF Merge API.
Convert HTML to PDF in Python
The following sample shows the main features of the HTML To PDF API. Comment or uncomment the relevant lines to convert a URL or a raw HTML string, either to a local file or to memory.
# -*- coding: utf-8 -*-
import sys, json
import selectpdf
url = "https://selectpdf.com"
localFile = "Test.pdf"
apiKey = "Your API key here"
pythonVersion = "Python 3" if selectpdf.IS_PYTHON3 else "Python 2"
print ("This is SelectPdf-{0} using {1}.".format(selectpdf.CLIENT_VERSION, pythonVersion))
try:
    client = selectpdf.HtmlToPdfClient(apiKey)
    # set parameters - see full list at https://selectpdf.com/html-to-pdf-api/
    # main properties
    client.setPageSize(selectpdf.PageSize.A4) # PDF page size
    client.setPageOrientation(selectpdf.PageOrientation.Portrait) # PDF page orientation
    client.setMargins(0) # PDF page margins
    client.setRenderingEngine(selectpdf.RenderingEngine.WebKit) # rendering engine
    client.setConversionDelay(1) # conversion delay
    client.setNavigationTimeout(30) # navigation timeout
    client.setShowPageNumbers(False) # page numbers
    client.setPageBreaksEnhancedAlgorithm(True) # enhanced page break algorithm
    # additional properties
    # client.setUseCssPrint(True) # enable CSS media print
    # client.setDisableJavascript(True) # disable javascript
    # client.setDisableInternalLinks(True) # disable internal links
    # client.setDisableExternalLinks(True) # disable external links
    # client.setKeepImagesTogether(True) # keep images together
    # client.setScaleImages(True) # scale images to create smaller pdfs
    # client.setSinglePagePdf(True) # generate a single page PDF
    # client.setUserPassword("password") # secure the PDF with a password
    # generate automatic bookmarks
    # client.setPdfBookmarksSelectors("H1, H2") # create outlines (bookmarks) for the specified elements
    # client.setViewerPageMode(selectpdf.PageMode.UseOutlines) # display outlines (bookmarks) in viewer
    print ("Starting conversion ...")
    # convert url to file
    client.convertUrlToFile(url, localFile)
    # convert url to memory
    # pdf = client.convertUrl(url)
    # convert html string to file
    # client.convertHtmlStringToFile("This is some html.", localFile)
    # convert html string to memory
    # pdf = client.convertHtmlString("This is some html.")
    print ("Finished! Number of pages: {0}.".format(client.getNumberOfPages()))
    # get API usage
    usageClient = selectpdf.UsageClient(apiKey)
    usage = usageClient.getUsage()
    print("Conversions remaining this month: {0}.".format(usage["available"]))
except selectpdf.ApiException as ex:
    print ("An error occurred: {0}.".format(ex.getMessage()))
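The sample above configures the client through a series of set... calls. When the options come from a configuration file or from command line arguments, one possible approach is to drive those setters from a dictionary. The helper apply_settings below is a hypothetical convenience written for this illustration and is not part of the SelectPdf library. It only assumes that the client exposes setters named set plus the option name, as the sample does.

```python
def apply_settings(client, settings):
    """Call client.set<Name>(value) for every entry in the settings dict.

    Hypothetical helper: it relies only on the set<Name> naming pattern
    used by the setters in the sample (setMargins, setConversionDelay, ...).
    """
    for name, value in settings.items():
        setter = getattr(client, "set" + name, None)
        if not callable(setter):
            # Fail loudly on typos instead of silently ignoring an option.
            raise AttributeError("Client has no setter named set{0}".format(name))
        setter(value)
    return client


# Usage with the real client (setter names taken from the sample above):
# client = selectpdf.HtmlToPdfClient(apiKey)
# apply_settings(client, {"Margins": 0, "ConversionDelay": 1, "NavigationTimeout": 30})
```

Raising on an unknown option name is a design choice for this sketch: a misspelled setting stops the script before any conversion is consumed.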
Convert HTML to PDF with custom header/footer in Python
The following sample shows how to convert a web page to PDF and set a custom header and footer.
# -*- coding: utf-8 -*-
import sys, json
import selectpdf
url = "https://selectpdf.com"
localFile = "Test.pdf"
apiKey = "Your API key here"
pythonVersion = "Python 3" if selectpdf.IS_PYTHON3 else "Python 2"
print ("This is SelectPdf-{0} using {1}.".format(selectpdf.CLIENT_VERSION, pythonVersion))
try:
    client = selectpdf.HtmlToPdfClient(apiKey)
    # set parameters - see full list at https://selectpdf.com/html-to-pdf-api/
    client.setMargins(0) # PDF page margins
    client.setPageBreaksEnhancedAlgorithm(True) # enhanced page break algorithm
    # header properties
    client.setShowHeader(True) # display header
    # client.setHeaderHeight(50) # header height
    # client.setHeaderUrl(url) # header url
    client.setHeaderHtml("This is the HEADER!!!!") # header html
    # footer properties
    client.setShowFooter(True) # display footer
    # client.setFooterHeight(60) # footer height
    # client.setFooterUrl(url) # footer url
    client.setFooterHtml("This is the FOOTER!!!!") # footer html
    # footer page numbers
    client.setShowPageNumbers(True) # show page numbers in footer
    client.setPageNumbersTemplate("{page_number} / {total_pages}") # page numbers template
    client.setPageNumbersFontName("Verdana") # page numbers font name
    client.setPageNumbersFontSize(12) # page numbers font size
    client.setPageNumbersAlignment(selectpdf.PageNumbersAlignment.Center) # page numbers alignment (2-Center)
    print ("Starting conversion ...")
    # convert url to file
    client.convertUrlToFile(url, localFile)
    # convert url to memory
    # pdf = client.convertUrl(url)
    # convert html string to file
    # client.convertHtmlStringToFile("This is some html.", localFile)
    # convert html string to memory
    # pdf = client.convertHtmlString("This is some html.")
    print ("Finished! Number of pages: {0}.".format(client.getNumberOfPages()))
    # get API usage
    usageClient = selectpdf.UsageClient(apiKey)
    usage = usageClient.getUsage()
    print("Conversions remaining this month: {0}.".format(usage["available"]))
except selectpdf.ApiException as ex:
    print ("An error occurred: {0}.".format(ex.getMessage()))
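The page numbers template in the sample uses the placeholders {page_number} and {total_pages}. The substitution itself happens on the SelectPdf service, so the following is only a local illustration of what such a template would expand to, assuming the placeholders are filled in per page as their names suggest. The function preview_page_numbers is hypothetical and uses plain Python string formatting, not the API.

```python
def preview_page_numbers(template, total_pages):
    """Expand a page numbers template locally, one string per page.

    Illustration only: the real substitution is done by the service.
    """
    return [
        template.format(page_number=number, total_pages=total_pages)
        for number in range(1, total_pages + 1)
    ]


for footer_text in preview_page_numbers("{page_number} / {total_pages}", 3):
    print(footer_text)
```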
Extract text from PDF in Python
The following sample shows how to extract the text from a PDF document using the SelectPdf API. Comment or uncomment the relevant lines to extract text from a local PDF or from a PDF at a remote URL, either to a local file or to memory.
# -*- coding: utf-8 -*-
import sys, json
import selectpdf
testUrl = "https://selectpdf.com/demo/files/selectpdf.pdf"
testPdf = "Input.pdf"
localFile = "Result.txt"
apiKey = "Your API key here"
pythonVersion = "Python 3" if selectpdf.IS_PYTHON3 else "Python 2"
print ("This is SelectPdf-{0} using {1}.".format(selectpdf.CLIENT_VERSION, pythonVersion))
try:
    client = selectpdf.PdfToTextClient(apiKey)
    # set parameters - see full list at https://selectpdf.com/pdf-to-text-api/
    client.setStartPage(1) # start page (processing starts from here)
    client.setEndPage(0) # end page (set 0 to process file until the end)
    client.setOutputFormat(selectpdf.OutputFormat.Text) # set output format (0-Text or 1-HTML)
    print ("Starting pdf to text ...")
    # convert local pdf to local text file
    client.getTextFromFileToFile(testPdf, localFile)
    # extract text from local pdf to memory
    # text = client.getTextFromFile(testPdf)
    # print text
    # print (text)
    # convert pdf from public url to local text file
    # client.getTextFromUrlToFile(testUrl, localFile)
    # extract text from pdf from public url to memory
    # text = client.getTextFromUrl(testUrl)
    # print text
    # print (text)
    print ("Finished! Number of pages processed: {0}.".format(client.getNumberOfPages()))
    # get API usage
    usageClient = selectpdf.UsageClient(apiKey)
    usage = usageClient.getUsage()
    print("Conversions remaining this month: {0}.".format(usage["available"]))
except selectpdf.ApiException as ex:
    print ("An error occurred: {0}.".format(ex.getMessage()))
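The sample above switches between local and remote input by commenting lines in and out. A small dispatcher could make that choice at run time instead. The function extract_text below is a hypothetical helper written for this illustration. It calls only getTextFromFile and getTextFromUrl, which appear in the sample, and it assumes that any source starting with http:// or https:// is a remote PDF.

```python
def extract_text(client, source):
    """Return the text of a PDF given a local path or a public URL.

    Hypothetical helper: anything starting with http:// or https:// is
    treated as a URL, everything else as a local file path.
    """
    if source.lower().startswith(("http://", "https://")):
        return client.getTextFromUrl(source)
    return client.getTextFromFile(source)


# Usage with the real client from the sample above:
# client = selectpdf.PdfToTextClient(apiKey)
# text = extract_text(client, "https://selectpdf.com/demo/files/selectpdf.pdf")
```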
Search for text in PDF using Python
The following sample shows how to search a PDF document for a specific text.
# -*- coding: utf-8 -*-
import sys, json
import selectpdf
testUrl = "https://selectpdf.com/demo/files/selectpdf.pdf"
testPdf = "Input.pdf"
apiKey = "Your API key here"
pythonVersion = "Python 3" if selectpdf.IS_PYTHON3 else "Python 2"
print ("This is SelectPdf-{0} using {1}.".format(selectpdf.CLIENT_VERSION, pythonVersion))
try:
    client = selectpdf.PdfToTextClient(apiKey)
    # set parameters - see full list at https://selectpdf.com/pdf-to-text-api/
    client.setStartPage(1) # start page (processing starts from here)
    client.setEndPage(0) # end page (set 0 to process file until the end)
    client.setOutputFormat(selectpdf.OutputFormat.Text) # set output format (0-Text or 1-HTML)
    print ("Starting search pdf ...")
    # search local pdf
    results = client.searchFile(testPdf, "pdf")
    # search pdf from public url
    # results = client.searchUrl(testUrl, "pdf")
    print ("Search results:\n{0}\nSearch results count: {1}.".format(json.dumps(results, indent=4), len(results)))
    print ("Finished! Number of pages processed: {0}.".format(client.getNumberOfPages()))
    # get API usage
    usageClient = selectpdf.UsageClient(apiKey)
    usage = usageClient.getUsage()
    print("Conversions remaining this month: {0}.".format(usage["available"]))
except selectpdf.ApiException as ex:
    print ("An error occurred: {0}.".format(ex.getMessage()))
Merge PDFs using Python
The following sample shows how to merge several PDF documents into a final file. The source PDFs can be local files or PDFs from remote URLs. The final PDF can be retrieved in memory or saved to a local file.
# -*- coding: utf-8 -*-
import sys, json
import selectpdf
testUrl = "https://selectpdf.com/demo/files/selectpdf.pdf"
testPdf = "Input.pdf"
localFile = "Result.pdf"
apiKey = "Your API key here"
pythonVersion = "Python 3" if selectpdf.IS_PYTHON3 else "Python 2"
print ("This is SelectPdf-{0} using {1}.".format(selectpdf.CLIENT_VERSION, pythonVersion))
try:
    client = selectpdf.PdfMergeClient(apiKey)
    # set parameters - see full list at https://selectpdf.com/pdf-merge-api/
    # specify the pdf files that will be merged (order will be preserved in the final pdf)
    client.addFile(testPdf) # add PDF from local file
    client.addUrlFile(testUrl) # add PDF from public url
    # client.addFileWithPassword(testPdf, "pdf_password") # add PDF (that requires a password) from local file
    # client.addUrlFileWithPassword(testUrl, "pdf_password") # add PDF (that requires a password) from public url
    print ("Starting pdf merge ...")
    # merge pdfs to local file
    client.saveToFile(localFile)
    # merge pdfs to memory
    # pdf = client.save()
    print ("Finished! Number of pages: {0}.".format(client.getNumberOfPages()))
    # get API usage
    usageClient = selectpdf.UsageClient(apiKey)
    usage = usageClient.getUsage()
    print("Conversions remaining this month: {0}.".format(usage["available"]))
except selectpdf.ApiException as ex:
    print ("An error occurred: {0}.".format(ex.getMessage()))
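When the list of documents to merge is built at run time, the four add... methods used in the sample can be wrapped in a single loop. The sketch below is a hypothetical helper, not part of the SelectPdf library. It assumes each source is either a string or a (source, password) tuple, and that URLs can be recognized by their http:// or https:// prefix. Sources are added in list order, which the sample states is preserved in the final PDF.

```python
def is_url(source):
    """Treat anything starting with http:// or https:// as a remote PDF."""
    return source.lower().startswith(("http://", "https://"))


def add_sources(client, sources):
    """Queue PDFs on a merge client in the order given.

    Each item is a path or URL string, or a (source, password) tuple
    for documents that require a password. Hypothetical helper.
    """
    for item in sources:
        source, password = item if isinstance(item, tuple) else (item, None)
        if is_url(source):
            if password is None:
                client.addUrlFile(source)
            else:
                client.addUrlFileWithPassword(source, password)
        elif password is None:
            client.addFile(source)
        else:
            client.addFileWithPassword(source, password)
    return client


# Usage with the real client from the sample above:
# client = selectpdf.PdfMergeClient(apiKey)
# add_sources(client, ["Input.pdf", "https://selectpdf.com/demo/files/selectpdf.pdf"])
# client.saveToFile("Result.pdf")
```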
The above Python samples can also be found in the GitHub repository: Python Samples.
