ChromePDF is a small Python 3 library that uses Selenium and Google Chrome to convert HTML into a PDF. This is accomplished by using Chrome's Page.printToPDF
DevTools command.
ChromePDF provides a function that accepts an html string, plus a dict of page parameters, and other settings, and returns the bytes containing a PDF:
pdf_bytes = generate_pdf(html_string, pdf_kwargs, **kwargs)
As of version 1.4, ChromePDF can also be run from the command line.
python -m chromepdf generate-pdf --chrome-path=/usr/bin/google-chrome path/to/input.html path/to/output.pdf
Because it renders via Chrome, it supports a wide range of CSS and HTML tags that should display the same as if you used Chrome to view the HTML itself.
ChromePDF can take advantage of Django settings for configuration, but Django is not a required dependency.
Official ChromePDF releases are available on PyPI.
1. Install ChromePDF via pip.
The latest version can be installed via PyPI. This will also install Selenium, the only direct dependency (Selenium versions 3 and 4 are supported; 3 will be used if already present, otherwise will install 4).
You may view the Changelog for a list of all ChromePDF version changes. ChromePDF uses semantic versioning for its release numbering. You are encouraged to pin your requirements to a Major.Minor version in a manner that will also receive Bugfix updates like so:
pip install django-chromepdf~=1.7.3
2. Set the location of your Chrome executable. This can be done in one of two ways:
- In your Django settings, set
CHROMEPDF['CHROME_PATH']
to the full path of the executable (E.G.,r'C:\Program Files (x86)\Google\...\chrome.exe'
) - OR, pass
chrome_path
as a keyword argument to thegenerate_pdf()
function.
A chromedriver executable is necessary to interface between Selenium and Chrome. ChromePDF will automatically check the version of your Chrome binary and download the required chromedriver from the Chrome website if it doesn't exist, into your site-packages/chromepdf/chromedrivers/
folder. If the Chrome binary ever gets upgraded, ChromePDF will realize this and download the required driver for it.
You may disable these automatic downloads in the following way:
- In your Django settings, set
CHROMEPDF['CHROMEDRIVER_DOWNLOADS']
to False - OR, pass a
chromedriver_downloads=False
argument togenerate_pdf()
You may also specify a chromedriver path manually. This is recommended if you disable downloads:
- In your Django settings, set
CHROMEPDF['CHROMEDRIVER_PATH']
to the full path of the executable (E.G.,r'C:\Users\myuser\...\chromedriver_win32\chromedriver.exe'
) - OR, pass a
chromedriver_path
argument togenerate_pdf()
containing the path. - OR, if both of the above are not set, and you've disabled downloads, and if your chromedriver is in your
PATH
environment variable, then Selenium should be able to find it automatically.
Note: generate_pdf()
cannot include external files including CSS. You must include all your CSS within <style>
tags or as inline styles.
# NOTE: This example assumes that you've set Django's settings.CHROMEPDF['CHROME_PATH'] = '(path to your Chrome instance)'
from chromepdf import generate_pdf
with open('myfile.html','r') as f:
html_string = f.read()
pdf_kwargs = {
'paperFormat': 'A4',
'marginTop': '2.5cm',
'marginLeft': '2cm',
'marginRight': '2cm',
'marginBottom': '3.5cm',
'displayHeaderFooter': True,
'headerTemplate': '',
'footerTemplate': '''
<div style='font-size: 12px; width: 100%; padding: 0; padding-left: 2cm; padding-bottom: 1cm; margin: 0; '>
Page <span class='pageNumber'></span> of <span class='totalPages'></span>
</div>
''',
}
pdf_bytes = generate_pdf(html_string, pdf_kwargs)
with open('myfile.pdf', 'wb') as file:
file.write(pdf_bytes)
ChromePDF can generate PDFs from the command-line. This method will not rely on Django settings. Example syntax:
# Convert file.html into file.pdf and place in the same directory
python -m chromepdf generate-pdf path/to/file.html [kwargs]
# Convert file.html and place the output PDF file at a specific path.
python -m chromepdf generate-pdf path/to/file.html path/to/output.pdf [kwargs]
Keyword arguments for command-line usage are almost identical to generate_pdf()
keyword arguments:
--chrome-path=path/to/google-chrome
--chromedriver-path=path/to/chromedriver
--chromedriver-downloads=0 # 0 or 1
--chrome-args="--arg1 --arg2" # Always use quotes to avoid misinterpreting as commands.
--pdf-kwargs-json=path/to/file.json # JSON file containing a dict of values to pass to pdf_kwargs
The recommended usage is to pass a --chrome-path
, and let ChromePDF handle the chromedriver downloads automatically.
python -m chromepdf generate-pdf --chrome-path=/usr/bin/google-chrome path/to/file.html path/to/output.pdf
The command will have a return code of zero on success, and nonzero on failure.
You can specify default settings in your Django settings file, if desired, via a CHROMEPDF
settings. Anything passed via the pdf_kwargs
argument will override the PDF_KWARGS
settings.
# settings.__init__.py
CHROMEPDF = {
'CHROME_PATH': r'C:\Program Files (x86)\Google\...\chrome.exe',
'CHROME_ARGS': [], # Optional list of command-line argument strings to pass to Chrome when rendering a PDF.
'CHROMEDRIVER_PATH': None, # will rely on downloads instead
'CHROMEDRIVER_DOWNLOADS': True, # automatically download the correct chromedriver for the chrome path
'PDF_KWARGS': {
'paperFormat': 'A4',
'marginTop': '2.5cm',
'marginLeft': '2cm',
'marginRight': '2cm',
'marginBottom': '3.5cm',
}
}
The pdf_kwargs
argument to generate_pdf()
lets you specify all the arguments for Chrome's Page.printToPDF
API. Its API can be viewed here:
Page.printToPDF API (url is funky. If you get a 404, try reloading/refreshing)
Some shortcut parameters are provided by generate_pdf()
for convenience. Here is a list of all the options:
Layout:
- scale: Scale of the PDF. Default
1
. - landscape:
True
to use landscape mode. DefaultFalse
.
Page Dimensions:
- paperWidth: Width of the paper, in inches. Can also use some CSS string values, like
'30cm'
. Default:8.5
- paperHeight: Height of the paper, in inches. Can also use some CSS string values, like
'30cm'
. Default:11
- paperFormat: A string indicating a paper size format, such as
'letter'
or'A4'
. Case-insensitive. This will overridepaperWidth
andpaperHeight
. Not part ofPage.printToPDF
API. Provided for convenience.
Content:
- displayHeaderFooter:
True
to display header and footer. DefaultFalse
. - headerTemplate: HTML containing the header for all pages. Default is an empty string. You may pass html tags with specific classes in order to insert values. For example,
<span class='title'></span>
would insert the the title.- date: formatted print date
- title: document title
- url: document location
- pageNumber: current page number
- totalPages: total pages in the document
- footerTemplate: HTML containing the footer for all pages. Default is an empty string. You may pass html tags with specific classes in order to insert values (same as above)
- printBackground:
True
to print background graphics. DefaultFalse
.
Margins:
- margin: Shortcut used to set all four margin values at once. Not part of
Page.printToPDF
API. Provided for convenience. - marginTop: Top margin. Default:
'1cm'
- marginBottom: Bottom margin. Default:
'1cm'
- marginLeft: Left margin. Default:
'1cm'
- marginRight: Right margin. Default:
'1cm'
Page Ranges:
- pageRanges: String indicating page ranges to use. Example:
'1-5, 8, 11-13'
- ignoreInvalidPageRanges: If
True
, will silently ignore invalid 'pageRanges' values. DefaultFalse
.