Before we get to PyPy, we need to understand what the implementation of a language means.
A program that runs (or compiles) the programs you write in a language is called an implementation of that programming language. For example, gcc, icc, and clang are different implementations of C, while CPython and PyPy are implementations of Python.
PyPy itself is written in RPython, a restricted subset of Python. That source is translated to C (or other things), which is then compiled to machine code. The resulting executable takes your Python programs and compiles them directly to machine code (or not!) without the “C” step in between.
So, when you say
python myScript.py, you are using CPython.
You could use PyPy instead, and run
pypy myScript.py. In most cases, your script would work just the same. (And if it doesn’t, it’s probably a bug!)
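If you are ever unsure which implementation is actually running your script, the standard library can tell you. The snippet below is a minimal sketch; the helper name describe_interpreter is made up for illustration.

```python
import platform


def describe_interpreter():
    """Return a short string naming the running Python implementation."""
    # platform.python_implementation() returns e.g. "CPython" or "PyPy"
    name = platform.python_implementation()
    version = platform.python_version()
    return f"{name} {version}"


print(describe_interpreter())
```

Run the same file once with python and once with pypy, and the first word of the output should change accordingly.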
Ultimately the goal of PyPy is to do the exact same work as CPython, but a lot faster.
Just in Time?
PyPy comes with a JIT compiler. When PyPy is translated into an executable such as
pypy-c, that executable contains a full virtual machine, which can optionally include a Just-In-Time compiler.
What that basically means is, when you compile a C program, the machine code gets put into a file so you can run it; whereas, when you use PyPy’s JIT to “compile” your program, the machine code goes into memory and is run right away, because it only compiles little bits of your program at a time. So it never saves a copy (saving a copy would not be useful due to a million technical details, I am told).
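As a loose analogy, Python's own built-ins can compile a piece of source in memory and run it right away, without writing anything to disk. Note the assumption here: compile() produces bytecode, not machine code, so this only illustrates the "compile into memory, then run" idea, not what PyPy's JIT actually emits. The source string is a made-up stand-in for "a little bit of your program".

```python
# Hypothetical snippet standing in for a small piece of a larger program.
source = "result = sum(range(10))"

# Compile to a code object that lives only in memory (no file is written).
code_obj = compile(source, "<in-memory>", "exec")

# Run it immediately in a fresh namespace.
namespace = {}
exec(code_obj, namespace)

print(namespace["result"])  # 45
```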
PyPy has a whole ton of different modes of operation. JITing is one of them. Usually it’s the one with the most performance impact, but your code is not necessarily JITed immediately. PyPy has an interpreter as well, and it decides which to use depending on a number of factors.
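To make the "interpreter first, JIT later" idea concrete, here is a toy sketch. It is not how PyPy is implemented: the class ToyRuntime, the threshold value, and the two "paths" are all invented for illustration. It only shows the general shape of a runtime that starts on a slow path and switches to a faster one once a piece of code has been run often enough.

```python
# Toy illustration only; the names and the threshold are hypothetical.
HOT_THRESHOLD = 3  # kept small so the switch is easy to see


class ToyRuntime:
    """Runs code on a slow path until it is 'hot', then on a fast path."""

    def __init__(self, slow_path, make_fast_path, threshold=HOT_THRESHOLD):
        self.slow_path = slow_path
        self.make_fast_path = make_fast_path
        self.threshold = threshold
        self.calls = 0
        self.fast_path = None  # built lazily, held in memory only
        self.log = []          # records which path served each call

    def run(self, *args):
        self.calls += 1
        # Once the call count passes the threshold, "compile" the fast path.
        if self.fast_path is None and self.calls > self.threshold:
            self.fast_path = self.make_fast_path()
        if self.fast_path is not None:
            self.log.append("fast")
            return self.fast_path(*args)
        self.log.append("slow")
        return self.slow_path(*args)


def interpret_sum(n):
    """Slow path: add up 0..n-1 one step at a time."""
    total = 0
    for i in range(n):
        total += i
    return total


def compile_sum():
    """Stand-in for compilation: return a specialized closed-form version."""
    return lambda n: n * (n - 1) // 2 if n > 0 else 0


runtime = ToyRuntime(interpret_sum, compile_sum)
results = [runtime.run(10) for _ in range(5)]
print(results)      # [45, 45, 45, 45, 45]
print(runtime.log)  # ['slow', 'slow', 'slow', 'fast', 'fast']
```

The results never change, only the path that produced them does, which is the whole point: same work, done faster once the runtime decides it is worth the effort.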
Turning back time
Now, if you dig into how PyPy came into existence, you will discover that PyPy is, in fact, a followup to the Psyco project. Wikipedia says, “PyPy’s aim is to have a just-in-time specializing compiler with scope, which was not available for Psyco.”
So, I inevitably wondered: what happened to Psyco? Why PyPy?
Why couldn’t Psyco be improved in itself?
The answer being: The improvement is called PyPy!
But let’s dig into this a bit more. There is an important software design concept here that one must understand.
In this particular case, it so happens that starting a new project called for less effort than improving the slightly flawed one. Actually, “slightly flawed” is not the best way to put it. Psyco was basically a hook for improving the performance of integer math. In order to do more than that, it needed vastly more information. Speeding up object operations proved to be impossible.
Now let’s step back and look at the bigger picture. At what point does one decide to give up improving an existing project and make something new?
Turns out, never. It’s (almost) always the wrong decision.
Hmm so, why PyPy and not Psyco?
In the case of Psyco/PyPy it is kind of a different thing. It’s hard to explain exactly why one doesn’t become the other, but it’s as if you wrote a command-line program to make an image greyscale, and then later, you wanted to make a competitor to Photoshop. Both things manipulate images, but the Photoshop clone is so much bigger, and does so much more stuff, that it doesn’t necessarily help to build the whole thing around the infrastructure you built for a little command-line tool that just decreases saturation. So it’s not like you are throwing out one perfectly good greyscaler so you can build a new one; you’re just starting a new project with a different purpose. (Which may, eventually, contain code to make an image greyscale, but that’s almost beside the point.)
So, we are saying Psyco didn’t have the same goals as PyPy: PyPy is more ambitious and has broader goals.
Now, once again, what exactly is PyPy?
It’s just another Python implementation, one that’s pretty bleeding edge when it comes to compiler design.