Author: Michael Kim [email protected]
Update: I have made a project template for Dockerized development here.
The idea of pip
roots back to the import
keyword in Python,
and that the keyword works for both standard library and user-defined modules.
While user-defined modules are often single-use and not very complicated, it can be helpful that they can be reused across different projects without copy-pasting, or even shared with other developers.
Before moving on to pip
, there are several other possible approaches.
-
Add modules to the standard Python library.
This is not a good approach because every developer needs different libraries, so increasing the size of the Python distribution is not beneficial. Also, code in the standard library should have a higher standard and have less flexibility when changes are needed.
-
Modify
PYTHONPATH
environment variable.While this can work locally on one machine, modifying the system setup can be problematic when it comes to distribution/deployment, and it has a high chance of messing things up on other parts of the system.
From the homepage:
pip is the package installer for Python. You can use pip to install packages from the Python Package Index and other indexes.
pip
is the package installer,
while Python Package Index, or pypi
,
is the package distribution platform that pip
references by default.
Because running pip install {package}
will find the package on pypi
,
download, and then install it,
it is easy to confuse them as one integral service.
However, a package for pip
does not have to live on pypi
,
as we'll demonstrate in this tutorial,
and apparently you can download packages from pypi
without using pip
.
It is recommended to create a virtual environment and do everything in it for the purpose of this tutorial, so that you won't mess up your python installation.
For Python 3.6+, you may use the venv
module in the standard library.
HOWTO
For previous versions of Python, you may use virtualenv
.
After creating the virtual environment, it might be a good idea to update the base packages we are going to use:
$ pip install -U pip setuptools
Since pip
is going to install modules that we can import
,
we need to have one ready first.
Let's create my_pip_package.py
:
def hello_world():
print("This is my first pip package!")
Confirm that it can be imported properly:
$ python -c "import my_pip_package; my_pip_package.hello_world()"
This is my first pip package!
Checkout the repo at this stage using the 01-create-module
tag.
setup.py
is used to tell pip
how to install the package.
You can find the full documentation here.
For this tutorial we will have the most basic setup ready, and expand upon it.
from setuptools import setup
from my_pip_package import __version__
setup(
name='my_pip_package',
version=__version__,
url='https://github.com/MichaelKim0407/tutorial-pip-package',
author='Michael Kim',
author_email='[email protected]',
py_modules=['my_pip_package'],
)
Change url and author info for yourself.
Add this to my_pip_package.py
:
__version__ = 'dev'
To confirm that setup.py
works properly:
$ pip install -e .
It should install the package
and create a folder called my_pip_package.egg-info
.
If you are using version control systems like git
,
make sure to ignore that folder.
Now, you should be able to import the package outside of the folder:
$ cd ..
$ python -c "import my_pip_package; my_pip_package.hello_world()"
This is my first pip package!
If you have pushed your code to a git hosting service, you should be able to install it anywhere right now:
$ pip install git+git://github.com/MichaelKim0407/tutorial-pip-package.git#egg=my_pip_package
(replace with your own repo url)
Note for pipenv
:
you should use -e
flag so that pipenv
will pick up dependencies in the lock file.
Checkout the repo at this stage using the 02-setup-py
tag.
This step is optional, if you want to keep everything in one file. However, the setup is slightly different so we'll keep this as a separate step.
First, turn the Python module into a package:
$ mkdir my_pip_package
$ mv my_pip_package.py my_pip_package/__init__.py
Add another Python file in the package, e.g. math.py
:
def add(x, y):
return x + y
Change the following lines in setup.py
:
from setuptools import setup
->
from setuptools import setup, find_packages
py_modules=['my_pip_package']
->
packages=find_packages()
Test that everything works:
$ python -c "import my_pip_package; my_pip_package.hello_world()"
This is my first pip package!
$ python -c "from my_pip_package.math import add; print(add(1, 3))"
4
Checkout the repo at this stage using the 03-convert-package
tag.
If you want to use another pip
library as dependency,
you can specify it in setup.py
.
First, let's add the following code to math.py
:
from returns import returns
@returns(int)
def div_int(x, y):
return x / y
The returns
decorator comes from the returns-decorator
package
(DISCLAIMER: created by the author of this tutorial),
which is available on pypi
.
When writing production code you should totally use //
,
but for the sake of demonstration let's use the decorator for now.
To specify returns-decorator
as a dependency,
add the following entry to setup(...)
in setup.py
:
install_requires=[
'returns-decorator',
],
Run pip install -e .
again to pick up the new dependency.
Now verify that it works:
$ python -c "from my_pip_package.math import div_int; print(div_int(3, 2))"
1
You may also specify versions of your dependency,
e.g. returns-decorator>=1.1
.
For the full spec, see PEP 508.
Checkout the repo at this stage using the 04-dependency
tag.
Sometimes certain parts of your code require a specific dependency, but it's not necessarily useful for all use cases.
One example would be the sqlalchemy
library,
which supports a variety of SQL dialects,
but in most cases anyone using it would only be interested in one dialect.
Installing all dependencies is both inefficient and messy, so it's better to let the user decide what exactly is needed. However, it would be cumbersome for the user to install the specific dependencies. This is where extra dependencies some in.
For this tutorial, after the last step,
let's pretend that we don't want to always install returns-decorator
unless math
is used.
We can replace the install_requires
with the following:
extras_require={
'math': [
'returns-decorator',
],
},
Note the s
: install_requires
is singular but extras_require
is plural.
Now, we can install the extra dependency by appending [math]
in the installation:
$ pip install -e .[math]
or
$ pip install git+git://github.com/MichaelKim0407/tutorial-pip-package.git#egg=my_pip_package[math]
However, we are not finished just yet - since we want to add more extra dependencies in the future, it's better to keep them organized.
One good habit is to make a [dev]
extra dependency,
which includes all dependencies needed for local development.
In setup.py
:
extra_math = [
'returns-decorator',
]
extra_dev = [
*extra_math,
]
and in setup(...)
:
extras_require={
'math': extra_math,
'dev': extra_dev,
},
Now we can just run pip install -e .[dev]
whenever we want to setup a dev environment.
Checkout the repo at this stage using the 05-extra-dependency
tag.
pip
allows packages to create command line entries in the bin/
folder.
First, let's make a function that accepts command line arguments in math.py
,
and make the module callable:
def cmd_add(args=None):
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('x', type=float)
parser.add_argument('y', type=float)
parsed_args = parser.parse_args(args)
print(add(parsed_args.x, parsed_args.y))
if __name__ == '__main__':
cmd_add()
Test it out:
$ python my_pip_package/math.py 1.5 3
4.5
Now, add the following entry to setup(...)
:
entry_points={
'console_scripts': [
'add=my_pip_package.math:cmd_add',
],
},
The syntax is {cmd entry name}={module path}:{function name}
.
Run pip install -e .[dev]
again to create the command line entry.
$ add 1.6 4
5.6
The __name__ == '__main__'
part is not really needed,
so let's remove it.
Also, since the add
command requires the [math]
dependency,
let's make it explicit for anyone wishing to use the command:
extra_bin = [
*extra_math,
]
and
extra_requires = {
...,
'bin': extra_bin,
}
Checkout the repo at this stage using the 06-command
tag.
If you are developing a package you should probably include tests from the beginning, but since it's a different step in the setup we'll do it now.
For this tutorial, we'll be using pytest
for testing and pytest-cov
for coverage.
Lets include the packages in the extras:
extra_test = [
*extra_math,
'pytest>=4',
'pytest-cov>=2',
]
and update the [dev]
extra dependency to include testing:
extra_dev = [
*extra_test,
]
Run pip install -e .[dev]
again to pick up the new dependencies.
For the sake of length, we'll add to the repo without writing them down here.
Run pytest
to test the package.
Once everything's passed, we can move on for coverage test.
Create .coveragerc
:
[run]
source = my_pip_package
And run pytest --cov
to see coverage.
--cov-report
can also be specified to provide formatting for coverage report.
My favorite is pytest --cov --cov-report term-missing:skip-covered
,
which lists all the line numbers that are not covered by tests,
while hiding all files that have been completely covered.
Lastly, don't forget to ignore the test output in .gitignore
:
.pytest_cache/
.coverage
Checkout the repo at this stage using the 07-tests
tag.
UPDATE: You should also instruct find_packages
to ignore the tests
folder by specifying find_packages(exclude=['tests', 'tests.*'])
.
I am not going to go back and rewrite Git history, but this is included in latest setup.py
.
While testing locally can catch a lot of problems already, running tests automatically is a further step on quality control, especially multiple developers are involved, and it also shows the world that your library is indeed working as intended.
For GitHub repos, we'll be using Travis CI to run the CI tests.
We'll be using Coveralls for coverage reporting. (There is an alternative called Codecov, however it has a pretty significant issue for Python.)
First, coveralls
requires an extra dependency,
so let's create an extra called ci
:
extra_ci = [
*extra_test,
'python-coveralls',
]
Next, add the CI configuration, which should be called .travis.yml
.
Details on how to write it can be found here.
See code in repo for how we are doing it.
Let's also add the badges to the top of our README file so everyone can see them immediately. The code to embed badges can be found on travis and coveralls. After the CI runs successfully, the badges will be updated.
Checkout the repo at this stage using the 08-ci
tag.
At this point, your library can already be shared with the world, however it is not on pypi yet.
To release on pypi, there are a few things we need to take care of.
First, add some classifiers for your package in setup()
.
A full list of classifiers can be found here.
Next, change __version__
to a standard version string, such as 1.0
.
Next, change the name of your package, if you followed the tutorial thus far,
since my_pip_package
would be taken by me.
Be creative!
The name
argument in setup()
does not need to match the name of the python package,
but it's better to keep them the same so that anyone that installs your library won't be confused.
You may also want to add a description
in setup()
.
Once everything is good, we can package the library:
$ python setup.py sdist
If should create a .tar.gz
file under dist/
.
You can unzip the file to inspect its contents.
Also, don't forget to add dist/
to .gitignore
.
The file is now ready to be uploaded to pypi
.
Create an account on pypi
, and store the credentials in ~/.pypirc
:
[pypi]
username =
password =
Finally, to upload the file:
$ twine upload dist/{packaged file}.tar.gz
Your package should now show up on pypi
and installable using pip install
.
It would also be a good idea to create a release on GitHub, and drop the packaged file as an attachment.
Checkout the repo at this stage using the 09-release
tag.