SelectPdf Online REST API is a platform-independent PDF manipulation API. It is cloud-based and can be used from any language: .NET (C# or VB.NET), Java, PHP, Python, Go, Ruby, Node.js, Perl and many more. Today we are presenting the dedicated Python client library for the SelectPdf API.
The SelectPdf Python client library can be used to take advantage of the features offered by SelectPdf Online REST API:
HTML to PDF REST API – Use SelectPdf HTML To PDF Online REST API to generate PDF documents from web page URLs or raw HTML code.
PDF to TEXT REST API – SelectPdf Pdf To Text REST API is a cloud-based solution that can be used to extract text from PDF documents or search PDF documents for specific words.
PDF Merge REST API – SelectPdf Pdf Merge REST API is an online solution that can be used to merge local or remote PDFs into a single PDF document.
All these APIs can be easily integrated with Python scripts and applications using the dedicated client library.
Installation
Download selectpdf-api-python-client-1.4.0.zip, unzip it and run:
python setup.py install
OR
Install SelectPdf Python Client for Online API via PyPI: SelectPdf API on PyPI.
OR
Clone selectpdf-api-python-client from GitHub and install the library.
cd selectpdf-api-python-client
python setup.py install
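Whichever install route is used, a quick check that Python can locate the package may save some debugging later. The following is a minimal sketch using only the standard library (Python 3); the helper name is_module_installed is illustrative and not part of the SelectPdf library, and it assumes the package is importable as selectpdf, as in the samples on this page.

```python
import importlib.util


def is_module_installed(module_name):
    """Return True if Python can locate a top-level module with this name."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_module_installed("selectpdf"):
        print("selectpdf is installed.")
    else:
        print("selectpdf was not found; re-run one of the install steps.")
```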
Get a trial key for SelectPdf online REST API
Once the library is installed, you need a key to be able to access the API.
GET A DEMO LICENSE KEY NOW
The free trial key for the online API is valid for 7 days and it includes 200 conversions.
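Since the key grants access to your conversion quota, it can be convenient to keep it out of the script itself. The sketch below reads it from an environment variable; the variable name SELECTPDF_API_KEY and the helper load_api_key are assumptions made for illustration, not something the library defines.

```python
import os


def load_api_key(env_var="SELECTPDF_API_KEY"):
    """Read the API key from an environment variable (the name is hypothetical)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            "Set the {0} environment variable to your API key.".format(env_var)
        )
    return key
```

The returned value can then be passed wherever the samples use apiKey, for example selectpdf.HtmlToPdfClient(load_api_key()).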
Sample Code
The Python client library makes accessing SelectPdf online REST API very easy. Here are a few samples that present the main features of the API. For details and the full list of parameters, see the individual API pages: HTML to PDF API, PDF to TEXT API, or PDF Merge API.
Convert HTML to PDF in Python
The following sample shows the main features of the HTML To PDF API. Comment or uncomment lines to convert a URL to a file or to memory, or to convert raw HTML to a file or to memory.
# -*- coding: utf-8 -*-
import sys, json
import selectpdf

url = "https://selectpdf.com"
localFile = "Test.pdf"
apiKey = "Your API key here"

pythonVersion = "Python 3" if selectpdf.IS_PYTHON3 else "Python 2"
print("This is SelectPdf-{0} using {1}.".format(selectpdf.CLIENT_VERSION, pythonVersion))

try:
    client = selectpdf.HtmlToPdfClient(apiKey)

    # set parameters - see full list at https://selectpdf.com/html-to-pdf-api/

    # main properties
    client.setPageSize(selectpdf.PageSize.A4) # PDF page size
    client.setPageOrientation(selectpdf.PageOrientation.Portrait) # PDF page orientation
    client.setMargins(0) # PDF page margins
    client.setRenderingEngine(selectpdf.RenderingEngine.WebKit) # rendering engine
    client.setConversionDelay(1) # conversion delay
    client.setNavigationTimeout(30) # navigation timeout
    client.setShowPageNumbers(False) # page numbers
    client.setPageBreaksEnhancedAlgorithm(True) # enhanced page break algorithm

    # additional properties
    # client.setUseCssPrint(True) # enable CSS media print
    # client.setDisableJavascript(True) # disable javascript
    # client.setDisableInternalLinks(True) # disable internal links
    # client.setDisableExternalLinks(True) # disable external links
    # client.setKeepImagesTogether(True) # keep images together
    # client.setScaleImages(True) # scale images to create smaller pdfs
    # client.setSinglePagePdf(True) # generate a single page PDF
    # client.setUserPassword("password") # secure the PDF with a password

    # generate automatic bookmarks
    # client.setPdfBookmarksSelectors("H1, H2") # create outlines (bookmarks) for the specified elements
    # client.setViewerPageMode(selectpdf.PageMode.UseOutlines) # display outlines (bookmarks) in viewer

    print("Starting conversion ...")

    # convert url to file
    client.convertUrlToFile(url, localFile)

    # convert url to memory
    # pdf = client.convertUrl(url)

    # convert html string to file
    # client.convertHtmlStringToFile("This is some <b>html</b>.", localFile)

    # convert html string to memory
    # pdf = client.convertHtmlString("This is some <b>html</b>.")

    print("Finished! Number of pages: {0}.".format(client.getNumberOfPages()))

    # get API usage
    usageClient = selectpdf.UsageClient(apiKey)
    usage = usageClient.getUsage()
    print("Conversions remained this month: {0}.".format(usage["available"]))

except selectpdf.ApiException as ex:
    print("An error occurred: {0}.".format(ex.getMessage()))
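The sample above converts a single URL. When several pages need converting, the same convertUrlToFile call can be wrapped in a loop. This is a minimal sketch, not part of the library: the helper convert_urls and the page-N.pdf naming scheme are made up for illustration, and the client argument is assumed to be a selectpdf.HtmlToPdfClient configured as shown above.

```python
import os


def convert_urls(client, urls, output_dir):
    """Convert each URL to its own PDF file and return the paths written.

    `client` is assumed to expose convertUrlToFile(url, path), as
    selectpdf.HtmlToPdfClient does in the sample above.
    """
    paths = []
    for index, url in enumerate(urls, start=1):
        # Hypothetical naming scheme: page-1.pdf, page-2.pdf, ...
        path = os.path.join(output_dir, "page-{0}.pdf".format(index))
        client.convertUrlToFile(url, path)
        paths.append(path)
    return paths
```

Passing the client in as a parameter keeps the settings (page size, margins and so on) in one place and lets the helper be exercised with a stand-in object during testing.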
Convert HTML to PDF with custom header/footer in Python
The following sample shows how to convert a web page to PDF and set a custom header or footer.
# -*- coding: utf-8 -*-
import sys, json
import selectpdf

url = "https://selectpdf.com"
localFile = "Test.pdf"
apiKey = "Your API key here"

pythonVersion = "Python 3" if selectpdf.IS_PYTHON3 else "Python 2"
print("This is SelectPdf-{0} using {1}.".format(selectpdf.CLIENT_VERSION, pythonVersion))

try:
    client = selectpdf.HtmlToPdfClient(apiKey)

    # set parameters - see full list at https://selectpdf.com/html-to-pdf-api/
    client.setMargins(0) # PDF page margins
    client.setPageBreaksEnhancedAlgorithm(True) # enhanced page break algorithm

    # header properties
    client.setShowHeader(True) # display header
    # client.setHeaderHeight(50) # header height
    # client.setHeaderUrl(url) # header url
    client.setHeaderHtml("This is the <b>HEADER</b>!!!!") # header html

    # footer properties
    client.setShowFooter(True) # display footer
    # client.setFooterHeight(60) # footer height
    # client.setFooterUrl(url) # footer url
    client.setFooterHtml("This is the <b>FOOTER</b>!!!!") # footer html

    # footer page numbers
    client.setShowPageNumbers(True) # show page numbers in footer
    client.setPageNumbersTemplate("{page_number} / {total_pages}") # page numbers template
    client.setPageNumbersFontName("Verdana") # page numbers font name
    client.setPageNumbersFontSize(12) # page numbers font size
    client.setPageNumbersAlignment(selectpdf.PageNumbersAlignment.Center) # page numbers alignment (2-Center)

    print("Starting conversion ...")

    # convert url to file
    client.convertUrlToFile(url, localFile)

    # convert url to memory
    # pdf = client.convertUrl(url)

    # convert html string to file
    # client.convertHtmlStringToFile("This is some <b>html</b>.", localFile)

    # convert html string to memory
    # pdf = client.convertHtmlString("This is some <b>html</b>.")

    print("Finished! Number of pages: {0}.".format(client.getNumberOfPages()))

    # get API usage
    usageClient = selectpdf.UsageClient(apiKey)
    usage = usageClient.getUsage()
    print("Conversions remained this month: {0}.".format(usage["available"]))

except selectpdf.ApiException as ex:
    print("An error occurred: {0}.".format(ex.getMessage()))
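The page numbers template in the sample uses two placeholders, {page_number} and {total_pages}. The actual substitution happens on the SelectPdf service, so the snippet below is only a local illustration of what the template presumably expands to on a given page; the function preview_page_numbers is hypothetical and not part of the client library.

```python
def preview_page_numbers(template, page_number, total_pages):
    """Local illustration only: fill the two placeholders used in the sample."""
    return (
        template
        .replace("{page_number}", str(page_number))
        .replace("{total_pages}", str(total_pages))
    )


if __name__ == "__main__":
    # Preview the footer text for page 2 of a 5-page document.
    print(preview_page_numbers("{page_number} / {total_pages}", 2, 5))
```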
Extract text from PDF in Python
The following sample shows how to extract the text from a PDF document using SelectPdf API. Comment or uncomment lines to extract text from a local PDF or from a PDF at a remote URL, either to a file or to memory.
# -*- coding: utf-8 -*-
import sys, json
import selectpdf

testUrl = "https://selectpdf.com/demo/files/selectpdf.pdf"
testPdf = "Input.pdf"
localFile = "Result.txt"
apiKey = "Your API key here"

pythonVersion = "Python 3" if selectpdf.IS_PYTHON3 else "Python 2"
print("This is SelectPdf-{0} using {1}.".format(selectpdf.CLIENT_VERSION, pythonVersion))

try:
    client = selectpdf.PdfToTextClient(apiKey)

    # set parameters - see full list at https://selectpdf.com/pdf-to-text-api/
    client.setStartPage(1) # start page (processing starts from here)
    client.setEndPage(0) # end page (set 0 to process file til the end)
    client.setOutputFormat(selectpdf.OutputFormat.Text) # set output format (0-Text or 1-HTML)

    print("Starting pdf to text ...")

    # convert local pdf to local text file
    client.getTextFromFileToFile(testPdf, localFile)

    # extract text from local pdf to memory
    # text = client.getTextFromFile(testPdf)
    # print text
    # print (text)

    # convert pdf from public url to local text file
    # client.getTextFromUrlToFile(testUrl, localFile)

    # extract text from pdf from public url to memory
    # text = client.getTextFromUrl(testUrl)
    # print text
    # print (text)

    print("Finished! Number of pages processed: {0}.".format(client.getNumberOfPages()))

    # get API usage
    usageClient = selectpdf.UsageClient(apiKey)
    usage = usageClient.getUsage()
    print("Conversions remained this month: {0}.".format(usage["available"]))

except selectpdf.ApiException as ex:
    print("An error occurred: {0}.".format(ex.getMessage()))
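Instead of commenting and uncommenting lines, the choice between a local PDF and a remote one can be made at run time. The sketch below is one possible way to do that; extract_text is an illustrative helper rather than a library function, and it assumes that anything starting with http:// or https:// is a public URL and that everything else is a local path.

```python
def extract_text(client, source):
    """Extract text to memory from a local PDF path or a public URL.

    `client` is assumed to expose getTextFromUrl and getTextFromFile,
    as selectpdf.PdfToTextClient does in the sample above.
    """
    if source.lower().startswith(("http://", "https://")):
        return client.getTextFromUrl(source)
    return client.getTextFromFile(source)
```

With the variables from the sample, extract_text(client, testUrl) would take the remote route and extract_text(client, testPdf) the local one.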
Search for text in PDF using Python
The following sample shows how to search a PDF document for specific text.
# -*- coding: utf-8 -*-
import sys, json
import selectpdf

testUrl = "https://selectpdf.com/demo/files/selectpdf.pdf"
testPdf = "Input.pdf"
apiKey = "Your API key here"

pythonVersion = "Python 3" if selectpdf.IS_PYTHON3 else "Python 2"
print("This is SelectPdf-{0} using {1}.".format(selectpdf.CLIENT_VERSION, pythonVersion))

try:
    client = selectpdf.PdfToTextClient(apiKey)

    # set parameters - see full list at https://selectpdf.com/pdf-to-text-api/
    client.setStartPage(1) # start page (processing starts from here)
    client.setEndPage(0) # end page (set 0 to process file til the end)
    client.setOutputFormat(selectpdf.OutputFormat.Text) # set output format (0-Text or 1-HTML)

    print("Starting search pdf ...")

    # search local pdf
    results = client.searchFile(testPdf, "pdf")

    # search pdf from public url
    # results = client.searchUrl(testUrl, "pdf")

    print("Search results:\n{0}\nSearch results count: {1}.".format(json.dumps(results, indent=4), len(results)))

    print("Finished! Number of pages processed: {0}.".format(client.getNumberOfPages()))

    # get API usage
    usageClient = selectpdf.UsageClient(apiKey)
    usage = usageClient.getUsage()
    print("Conversions remained this month: {0}.".format(usage["available"]))

except selectpdf.ApiException as ex:
    print("An error occurred: {0}.".format(ex.getMessage()))
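To look for more than one term, the searchFile call from the sample can be repeated per term. The following sketch collects the number of results for each term; count_matches is a hypothetical helper, and it relies only on what the sample shows, namely that searchFile(path, text) returns something len() can be applied to. Each call is a separate request to the API.

```python
def count_matches(client, pdf_path, terms):
    """Return a dict mapping each search term to its number of results.

    `client` is assumed to expose searchFile(path, text), as
    selectpdf.PdfToTextClient does in the sample above.
    """
    counts = {}
    for term in terms:
        results = client.searchFile(pdf_path, term)
        counts[term] = len(results)
    return counts
```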
Merge PDFs using Python
The following sample shows how to merge several PDF documents into a final file. The source PDFs can be local files or PDFs from remote URLs. The final PDF can be retrieved in memory or saved to a local file.
# -*- coding: utf-8 -*-
import sys, json
import selectpdf

testUrl = "https://selectpdf.com/demo/files/selectpdf.pdf"
testPdf = "Input.pdf"
localFile = "Result.pdf"
apiKey = "Your API key here"

pythonVersion = "Python 3" if selectpdf.IS_PYTHON3 else "Python 2"
print("This is SelectPdf-{0} using {1}.".format(selectpdf.CLIENT_VERSION, pythonVersion))

try:
    client = selectpdf.PdfMergeClient(apiKey)

    # set parameters - see full list at https://selectpdf.com/pdf-merge-api/

    # specify the pdf files that will be merged (order will be preserved in the final pdf)
    client.addFile(testPdf) # add PDF from local file
    client.addUrlFile(testUrl) # add PDF From public url
    # client.addFileWithPassword(testPdf, "pdf_password") # add PDF (that requires a password) from local file
    # client.addUrlFileWithPassword(testUrl, "pdf_password") # add PDF (that requires a password) from public url

    print("Starting pdf merge ...")

    # merge pdfs to local file
    client.saveToFile(localFile)

    # merge pdfs to memory
    # pdf = client.save()

    print("Finished! Number of pages: {0}.".format(client.getNumberOfPages()))

    # get API usage
    usageClient = selectpdf.UsageClient(apiKey)
    usage = usageClient.getUsage()
    print("Conversions remained this month: {0}.".format(usage["available"]))

except selectpdf.ApiException as ex:
    print("An error occurred: {0}.".format(ex.getMessage()))
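Because the merge order follows the order in which files are added, a list of sources can be fed to the client in sequence. The sketch below does this for a mixed list of local paths and URLs; merge_sources is an illustrative helper, not a library function, and it assumes that entries starting with http:// or https:// are remote and that none of the PDFs needs a password.

```python
def merge_sources(client, sources, output_path):
    """Add PDFs in list order, then save the merged result to a local file.

    `client` is assumed to expose addFile, addUrlFile and saveToFile,
    as selectpdf.PdfMergeClient does in the sample above.
    """
    for source in sources:
        if source.lower().startswith(("http://", "https://")):
            client.addUrlFile(source)  # remote PDF
        else:
            client.addFile(source)  # local PDF
    client.saveToFile(output_path)
```

With the variables from the sample, merge_sources(client, [testPdf, testUrl], localFile) would reproduce the two addFile/addUrlFile calls followed by saveToFile.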
The Python samples above can also be found in the GitHub repository: Python Samples.