Performance Comparison

All of the code and scripts below were run on an i7-8750H locked to 0.8 GHz. Every benchmark except burgled-batteries3 uses Python 3.10. The Julia version is 1.10. The SBCL version is 2.4.3 unless the output shown says otherwise.

Burgled-batteries3 uses SBCL 1.5.4 and python3.6m.

run-py

python run-py.py
Evaluating performance of pystr_i through 1000000 calls...
Calls per second:  1576057.6666891782 

Evaluating performance of pycall_str through 100000 calls...
Calls per second:  1074461.2959969486
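The "calls per second" figures reported throughout can be obtained with a simple timing loop. The following is a minimal sketch of such a harness, not the contents of run-py.py: the helper name `calls_per_second` and the body of `pystr_i` are assumptions made for illustration, and only the name `pystr_i` is taken from the output above.

```python
import time


def calls_per_second(fn, n):
    """Call fn(i) for i in range(n) and return the achieved calls per second."""
    start = time.perf_counter()
    for i in range(n):
        fn(i)
    elapsed = time.perf_counter() - start
    return n / elapsed


def pystr_i(i):
    # Hypothetical stand-in for the benchmarked function: convert an integer
    # to its string representation.
    return str(i)


if __name__ == "__main__":
    n = 1_000_000
    print(f"Evaluating performance of pystr_i through {n} calls...")
    print("Calls per second: ", calls_per_second(pystr_i, n))
```

The loop overhead is included in the measurement, so a harness like this slightly understates the rate of the call itself.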

run-bb

Install burgled-batteries3 by following the instructions here. The burgled-batteries3 repository must be somewhere quicklisp can find. Usually, this is /path/to/quicklisp/local-projects/.

Finally, activate the burgled-batteries3 environment and run

sbcl --no-userinit --load /path/to/quicklisp/setup.lisp --script run-bb.lisp
This is SBCL 1.5.4, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses.  See the CREDITS and COPYING files in the
distribution for more information.


Evaluating (DEFPYFUN "str" (OBJECT))

Evaluating performance of
  (LAMBDA (X) (DECLARE (OPTIMIZE SPEED)) (STR X))
on the basis of 100000 runs...
Calls per second: 9282.466

Evaluating performance of
  (LAMBDA (X) (DECLARE (OPTIMIZE SPEED)) (STR* X))
on the basis of 100000 runs...
Calls per second: 16339.869

run-pycall

Install julia from here.

Install PyCall.jl from the Julia REPL:

using Pkg
Pkg.add("PyCall")

Finally, run the run-pycall.jl script

/path/to/julia run-pycall.jl
Evaluating performance of pystr_i through 100000 calls...
Calls per second: 239483.74837415223

Evaluating performance of pycall_str through 10000 calls...
Calls per second: 20160.842444870534

run-py4cl2-cffi

Make sure python3-config is accessible in the environment.

The py4cl2-cffi repository must be somewhere quicklisp can find. Usually, this is /path/to/quicklisp/local-projects/.

sbcl --no-userinit --load /path/to/quicklisp/setup.lisp --eval '(ql:quickload "py4cl2-cffi")'

Finally, run the run-py4cl2-cffi.lisp script

sbcl --no-userinit --load /path/to/quicklisp/setup.lisp --script run-py4cl2-cffi.lisp
This is SBCL 2.4.3, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses.  See the CREDITS and COPYING files in the
distribution for more information.
gcc -I/media/common-storage/micromamba/envs/python310/include/python3.10 -I/media/common-storage/micromamba/envs/python310/include/python3.10 -c -Wall -Werror -fpic py4cl-utils.c && gcc -shared -o libpy4cl-utils.so py4cl-utils.o

Evaluating performance of
  (LAMBDA (X) (DECLARE (OPTIMIZE SPEED)) (PYSTR X))
on the basis of 1000000 runs...
Calls per second: 369280.6

Evaluating performance of
  (LAMBDA (X) (DECLARE (OPTIMIZE SPEED)) (PYCALL "str" X))
on the basis of 1000000 runs...
Calls per second: 164367.52

Or on CCL:

ccl --no-init --load ~/quicklisp/setup.lisp --load run-py4cl2-cffi.lisp --eval '(quit)'
Evaluating performance of
  (LAMBDA (X) (DECLARE (OPTIMIZE SPEED)) (PYSTR X))
on the basis of 1000000 runs...
Calls per second: 76780.38

Evaluating performance of
  (LAMBDA (X) (DECLARE (OPTIMIZE SPEED)) (PYCALL "str" X))
on the basis of 1000000 runs...
Calls per second: 45005.156
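The two benchmarked forms, `(PYSTR X)` and `(PYCALL "str" X)`, are reported in the summary as the PyObject_Str and PyObject_Call routes. As a rough illustration of the difference between those two C API entry points, the sketch below calls both through `ctypes.pythonapi` from within CPython. It is not how py4cl2-cffi invokes them; the function names `str_direct` and `str_via_call` are hypothetical.

```python
import ctypes

api = ctypes.pythonapi  # handle to the running CPython's C API (CPython only)

# PyObject *PyObject_Str(PyObject *o)
api.PyObject_Str.restype = ctypes.py_object
api.PyObject_Str.argtypes = [ctypes.py_object]

# PyObject *PyObject_Call(PyObject *callable, PyObject *args, PyObject *kwargs)
# kwargs is declared as a raw pointer so that None can be passed as NULL.
api.PyObject_Call.restype = ctypes.py_object
api.PyObject_Call.argtypes = [ctypes.py_object, ctypes.py_object, ctypes.c_void_p]


def str_direct(obj):
    # Dedicated entry point: no argument tuple, no generic call dispatch.
    return api.PyObject_Str(obj)


def str_via_call(obj):
    # Generic entry point: builds an argument tuple and calls the `str` type.
    return api.PyObject_Call(str, (obj,), None)


if __name__ == "__main__":
    print(str_direct(42), str_via_call(42))
```

The generic route has to build an argument tuple and dispatch through the callable, which is consistent with PyObject_Call being the slower column in every row of the summary.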

run-py4cl2-cffi-no-gil

This variant of the run-py4cl2-cffi.lisp script acquires the GIL before the task starts and releases it only after the task is complete. This is in contrast to the usual approach of releasing the GIL as soon as possible.

sbcl --no-userinit --load /path/to/quicklisp/setup.lisp --script run-py4cl2-cffi-no-gil.lisp

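This gives a noticeable boost over the run-py4cl2-cffi.lisp results above (about 410,000 versus 369,000 calls per second for PYSTR, and about 238,000 versus 164,000 for PYCALL):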

This is SBCL 2.4.3, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses.  See the CREDITS and COPYING files in the
distribution for more information.
gcc -I/media/common-storage/micromamba/envs/python310/include/python3.10 -I/media/common-storage/micromamba/envs/python310/include/python3.10 -c -Wall -Werror -fpic py4cl-utils.c && gcc -shared -o libpy4cl-utils.so py4cl-utils.o

Evaluating performance of
  (LAMBDA (X) (DECLARE (OPTIMIZE SPEED)) (PYSTR X))
on the basis of 1000000 runs...
Calls per second: 409838.4

Evaluating performance of
  (LAMBDA (X) (DECLARE (OPTIMIZE SPEED)) (PYCALL "str" X))
on the basis of 1000000 runs...
Calls per second: 238323.47
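The two GIL strategies described for this variant differ only in where the acquire and release calls sit relative to the loop. The sketch below shows both patterns using `PyGILState_Ensure` and `PyGILState_Release` through `ctypes.pythonapi`. This is an assumption for illustration: the source does not say which C API calls py4cl2-cffi uses to manage the GIL, and the function names are hypothetical. Because the Python interpreter thread already holds the GIL here, the sketch shows the call pattern only and does not reproduce the cost a foreign (Lisp) thread would pay.

```python
import ctypes

api = ctypes.pythonapi  # CPython only

# PyGILState_STATE PyGILState_Ensure(void); the state is a C enum, i.e. an int.
api.PyGILState_Ensure.restype = ctypes.c_int
api.PyGILState_Ensure.argtypes = []
# void PyGILState_Release(PyGILState_STATE)
api.PyGILState_Release.restype = None
api.PyGILState_Release.argtypes = [ctypes.c_int]

api.PyObject_Str.restype = ctypes.py_object
api.PyObject_Str.argtypes = [ctypes.py_object]


def str_release_per_call(values):
    """Usual approach: acquire and release the GIL around every single call."""
    out = []
    for v in values:
        state = api.PyGILState_Ensure()
        try:
            out.append(api.PyObject_Str(v))
        finally:
            api.PyGILState_Release(state)
    return out


def str_hold_across_task(values):
    """No-GIL-release variant: acquire once, do all the work, release once."""
    state = api.PyGILState_Ensure()
    try:
        return [api.PyObject_Str(v) for v in values]
    finally:
        api.PyGILState_Release(state)


if __name__ == "__main__":
    data = [1, 2.5, None]
    print(str_release_per_call(data))
    print(str_hold_across_task(data))
```

Holding the GIL across the whole task removes one acquire/release pair per call, at the price of blocking any other thread that needs the interpreter for the duration of the task.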

run-py4cl

Make sure python3-config is accessible in the environment.

The py4cl repository must be somewhere quicklisp can find. Usually, this is /path/to/quicklisp/local-projects/.

sbcl --no-userinit --load /path/to/quicklisp/setup.lisp --eval '(ql:quickload "py4cl")'

Finally, run the run-py4cl.lisp script

sbcl --no-userinit --load /path/to/quicklisp/setup.lisp --script run-py4cl.lisp
This is SBCL 2.3.11, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses.  See the CREDITS and COPYING files in the
distribution for more information.

Evaluating performance of
  (LAMBDA (X) (DECLARE (OPTIMIZE SPEED)) (PYTHON-CALL "str" X))
on the basis of 10000 runs...
Calls per second: 3576.2117

Evaluating performance of
  (LAMBDA (X)
    (DECLARE (OPTIMIZE SPEED))
    (REMOTE-OBJECTS
      (PYTHON-CALL "str" X)))
on the basis of 10000 runs...
Calls per second: 3857.6765

Or on CCL:

ccl --no-init --load ~/quicklisp/setup.lisp --load run-py4cl.lisp --eval '(quit)'
Evaluating performance of
  (LAMBDA (X) (DECLARE (OPTIMIZE SPEED)) (PYTHON-CALL "str" X))
on the basis of 10000 runs...
Calls per second: 1917.8184

Evaluating performance of
  (LAMBDA (X) (DECLARE (OPTIMIZE SPEED)) (REMOTE-OBJECTS (PYTHON-CALL "str" X)))
on the basis of 10000 runs...
Calls per second: 1626.3612

Summary

The table below summarizes the approximate number of calls per second each library reaches using either PyObject_Call or PyObject_Str. The factor in parentheses is the slowdown relative to plain Python. A dash (-) indicates either that no such facility is available, or that I could not find how to use it.

| Library \ How                   | PyObject_Call  | PyObject_Str   |
|---------------------------------+----------------+----------------|
| Python                          |   (x1) 1000000 |   (x1) 1600000 |
| burgled-batteries3              |    (x61) 16500 |              - |
| PyCall.jl                       |    (x3) 320000 |    (x3) 500000 |
| py4cl2-cffi-no-gil (SBCL 2.4.3) |    (x4) 240000 |    (x4) 410000 |
| py4cl2-cffi (SBCL 2.4.3)        |    (x6) 164000 |    (x4) 370000 |
| py4cl2-cffi (SBCL 1.5.4)        |    (x7) 148000 |    (x4) 370000 |
| py4cl2-cffi (CCL)               |    (x22) 45000 |    (x20) 77000 |
| py4cl (SBCL)                    |    (x250) 4000 |              - |
| py4cl (CCL)                     |    (x500) 2000 |              - |
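The (xN) factors in the summary appear to be the Python baseline divided by the library's calls per second, rounded to a whole number. A minimal sketch of that arithmetic for the PyObject_Call column follows; the dictionary and the `slowdown` helper are hypothetical names, and the figures are the rounded ones from the summary, so a factor recomputed from the raw logs may differ by one.

```python
# Rounded PyObject_Call figures (calls per second) taken from the summary.
PYTHON_BASELINE = 1_000_000

pyobject_call_rates = {
    "burgled-batteries3": 16_500,
    "PyCall.jl": 320_000,
    "py4cl2-cffi-no-gil (SBCL 2.4.3)": 240_000,
    "py4cl2-cffi (SBCL 2.4.3)": 164_000,
    "py4cl2-cffi (SBCL 1.5.4)": 148_000,
    "py4cl2-cffi (CCL)": 45_000,
    "py4cl (SBCL)": 4_000,
    "py4cl (CCL)": 2_000,
}


def slowdown(baseline, rate):
    """Return how many times slower `rate` is than `baseline`, rounded."""
    return round(baseline / rate)


if __name__ == "__main__":
    for library, rate in pyobject_call_rates.items():
        print(f"{library}: x{slowdown(PYTHON_BASELINE, rate)}")
```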