Comparing implementations of the Monkey Language XII: Speeding Up Python

In a previous episode, I wrote three implementations of the Monkey Language in different interpreted languages: Ruby, Python and Lua. Surprisingly, the Ruby implementation was the fastest, and the Python one was the slowest.

In this episode, we'll look at several alternatives for speeding up our Monkey interpreter in Python.

mypyc

As described in the documentation:

Mypyc compiles Python modules to C extensions. It uses standard Python type hints to generate fast code.

Basically, you can take any Python code and compile it, but some changes are necessary to get the best performance. The ones that made the biggest difference for me were:

  • Type hint everything

  • Avoid nested functions
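To make the two changes above concrete, here is a minimal sketch. The function names (`read_identifier_nested`, `is_letter`, `read_identifier`) are hypothetical and are not taken from the PR; they only illustrate the kind of rewrite, assuming a lexer-style helper.

```python
# Before: no type hints, and a helper nested inside the function.
def read_identifier_nested(source, start):
    def is_letter(ch):
        return ch.isalpha() or ch == "_"

    end = start
    while end < len(source) and is_letter(source[end]):
        end += 1
    return source[start:end]


# After: the helper lives at module level and everything is type hinted,
# so a compiler that relies on type hints has static types to work with.
def is_letter(ch: str) -> bool:
    return ch.isalpha() or ch == "_"


def read_identifier(source: str, start: int) -> str:
    end: int = start
    while end < len(source) and is_letter(source[end]):
        end += 1
    return source[start:end]
```

Both versions behave the same under regular Python; only the second follows the two guidelines. Assuming the code lives in a file called `lexer.py` (a made-up name), it would be compiled with `mypyc lexer.py`.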

If you want to see the changes that I made, you can have a look at this PR.

Results

Mypyc ran my interpreter 12% faster than regular Python, but that gain isn't free: you have to change your code to get the best performance.
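For reference, a figure like "12% faster" can be derived from two wall-clock timings as sketched below. The convention (ratio of run times minus one) and the sample timings are assumptions for illustration, not the measurements behind the number above.

```python
def percent_faster(baseline_seconds: float, candidate_seconds: float) -> float:
    """Return how much faster the candidate is, as a percentage.

    Assumed convention: a candidate that finishes in half the time
    is reported as 100% faster.
    """
    if baseline_seconds <= 0 or candidate_seconds <= 0:
        raise ValueError("timings must be positive")
    return (baseline_seconds / candidate_seconds - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical timings: 11.2 s before, 10.0 s after.
    print(f"{percent_faster(11.2, 10.0):.1f}% faster")
```

Under this convention, a runtime that is 20 times faster would show up as 1900% faster, which is why large speedups are usually quoted as a multiplier instead.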

Pyston

From the website:

Pyston is an open-source faster implementation of the Python programming language, designed for the performance and compatibility challenges of large real-world applications.

Pyston comes in two flavours:

  • Full: A full implementation of Python 3.8

  • Lite: An extension module that includes a JIT and is available for Python 3.7 to 3.10

My interpreter depends on Python 3.10 features, so I cannot use Pyston Full, but Lite should be good enough.
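As far as I understand the Pyston README, Lite can either be loaded automatically by installing the `pyston_lite_autoload` package with pip, or enabled by hand from the `pyston_lite` package. The sketch below shows the manual route; the wrapper function `enable_pyston_lite` is a hypothetical helper, and the fallback to plain Python when the package is missing is my own addition.

```python
def enable_pyston_lite() -> bool:
    """Try to turn on the Pyston Lite JIT; return True if it was enabled."""
    try:
        # Only available after `pip install pyston_lite`.
        import pyston_lite
    except ImportError:
        # Package not installed: keep running on the regular interpreter.
        return False
    pyston_lite.enable()
    return True


if __name__ == "__main__":
    print("Pyston Lite enabled:", enable_pyston_lite())
```

Calling this once at program start keeps the rest of the code untouched.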

Results

Pyston runs my interpreter 18% faster and, best of all, with zero changes to my code.

But why would compiled code (mypyc) run slower than interpreted code (Pyston)? I think it comes down to my specific use case. Mypyc (and similar tools such as Nuitka) tends to optimise math operations, which my interpreter doesn't do often. Pyston, on the other hand, seems to speed things up in general, using these techniques (from the README.md):

  • A very-low-overhead JIT using DynASM

  • Quickening

  • Aggressive attribute caching
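To illustrate the kind of workload discussed above, here is a tiny tree-walking evaluator. It is a hypothetical sketch, not code from my interpreter: the node classes and `evaluate` are invented for the example. Most of the work per node is type checks, attribute lookups and function calls, with very little arithmetic.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class IntegerLiteral:
    value: int


@dataclass
class Infix:
    operator: str
    left: "Node"
    right: "Node"


Node = Union[IntegerLiteral, Infix]


def evaluate(node: Node) -> int:
    # Dispatch on the node type: isinstance checks and attribute reads
    # dominate, while the actual math is a single operation per node.
    if isinstance(node, IntegerLiteral):
        return node.value
    if isinstance(node, Infix):
        left = evaluate(node.left)
        right = evaluate(node.right)
        if node.operator == "+":
            return left + right
        if node.operator == "*":
            return left * right
        raise ValueError(f"unknown operator: {node.operator}")
    raise TypeError(f"unknown node: {node!r}")
```

In code shaped like this, repeated lookups such as `node.left` and `node.operator` are the sort of thing attribute caching is meant to help with.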

PyPy

From the website:

A fast, compliant alternative implementation of Python

The PyPy installation process is more complicated than for mypyc and Pyston (both of which are installed with pip), but once it's finished, PyPy should run like any other Python distribution.

Results

Mind-blowing: PyPy is almost 20 times faster than Python. Of course, my interpreter is the perfect use case for PyPy. In the PyPy project's own words:

So the case where PyPy works best is when executing long-running programs where a significant fraction of the time is spent executing Python code.
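The "long-running" part matters because a JIT needs some time to warm up before it pays off. A rough way to see this is to time the same workload for several rounds on each runtime. The sketch below uses a recursive `fib` as a stand-in workload; that choice, and the helper `time_rounds`, are assumptions for illustration and not the benchmark behind the numbers in this post.

```python
import platform
import time


def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def time_rounds(rounds: int = 5, n: int = 22) -> list[float]:
    """Run the same workload several times and return each round's duration."""
    timings = []
    for _ in range(rounds):
        start = time.perf_counter()
        fib(n)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Prints e.g. "CPython" or "PyPy", so results from different
    # runtimes are easy to tell apart.
    print(platform.python_implementation(), platform.python_version())
    for i, seconds in enumerate(time_rounds(), start=1):
        print(f"round {i}: {seconds:.4f}s")
```

On a JIT-based runtime the later rounds would typically be faster than the first, while on regular Python they should stay roughly flat.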

Failed experiments

Cython

At the time of writing, the latest version of Cython supports Python 3 but not the new features of Python 3.10, which my interpreter depends on.

GraalPython

GraalPython supports all the Python 3.10 features, but when I tried to execute my interpreter, I got a recursion error:

RecursionError: maximum recursion depth exceeded

No trickery was enough to get rid of this error, and there is nothing in the documentation about it.
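For context, the usual workaround for this error on regular Python is to raise the recursion limit and run the deep call in a thread with a bigger stack, as sketched below. The helper `run_with_deep_recursion` and the default sizes are hypothetical, and nothing here establishes that this approach helps on GraalPython.

```python
import sys
import threading
from typing import Any, Callable


def depth(n: int) -> int:
    """Recurse n levels deep and return the depth reached."""
    return 0 if n == 0 else 1 + depth(n - 1)


def run_with_deep_recursion(
    func: Callable[..., Any],
    *args: Any,
    limit: int = 50_000,
    stack_bytes: int = 64 * 1024 * 1024,
) -> Any:
    """Call func(*args) in a thread with a raised recursion limit and a larger stack."""
    outcome: dict = {}

    def target() -> None:
        try:
            outcome["value"] = func(*args)
        except BaseException as exc:  # hand the error back to the caller
            outcome["error"] = exc

    old_limit = sys.getrecursionlimit()
    old_stack = threading.stack_size(stack_bytes)  # returns the previous size
    sys.setrecursionlimit(limit)
    try:
        thread = threading.Thread(target=target)
        thread.start()
        thread.join()
    finally:
        # Restore the previous settings for the rest of the program.
        sys.setrecursionlimit(old_limit)
        threading.stack_size(old_stack)

    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

With the default limit, `depth(5000)` raises a `RecursionError` on regular Python, while `run_with_deep_recursion(depth, 5000)` completes.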

Conclusion

I've only touched the tip of the iceberg when it comes to speeding up Python. There are more libraries and runtimes to explore, but for now, I must say that I'm incredibly impressed by PyPy for its astounding performance and by Pyston for its low-effort, high-reward approach.