Tuesday, February 14, 2012

PyPy and its future challenges

Obviously I'm biased, but I think PyPy is progressing fairly well. However,
I would like to mention some areas where I think PyPy is lagging: either it is
not living up to its promises, or the design decisions simply didn't
turn out as well as we had hoped. In a fairly arbitrary order:

  • Whole-program type inference. This decision has been haunting the
    separate compilation effort for a while. It's also one of the reasons
    why RPython errors are confusing and why the compilation time is so long.
    This is less of a concern for users, but more of a concern for developers
    and potential developers.

  • Memory impact. We have never scientifically measured the
    memory impact of PyPy on examples. There are reports of outrageous PyPy
    memory usage, but they're usually very cryptic ("my app uses 300M") and not
    really reported in a way that's reproducible for us. We simply have to start
    measuring memory impact on benchmarks. You can definitely help by providing
    us with reproducible examples (they don't have to be small, but they have
    to be open source).
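
As a rough illustration of what a reproducible memory report could contain,
here is a minimal sketch that prints the interpreter version and the peak
resident set size around a workload. It assumes a Unix system (the standard
`resource` module is not available on Windows). The names `peak_rss_kib`,
`workload` and `measure` are hypothetical, and `workload` is only a stand-in
for the real application code:

```python
import resource
import sys


def peak_rss_kib():
    """Return the peak resident set size of this process in KiB."""
    usage = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    return usage // 1024 if sys.platform == "darwin" else usage


def workload(n=200000):
    # Hypothetical stand-in for the application code being reported.
    return [(i, str(i)) for i in range(n)]


def measure(func, *args):
    """Run func and return its result with peak RSS before and after."""
    before = peak_rss_kib()
    result = func(*args)
    after = peak_rss_kib()
    return result, before, after


if __name__ == "__main__":
    data, before, after = measure(workload)
    print("interpreter: %s" % sys.version.replace("\n", " "))
    print("peak RSS: %d KiB -> %d KiB" % (before, after))
```

Running the same script under both CPython and PyPy, and attaching the two
outputs together with the source, gives numbers that can actually be compared.
Peak RSS is a coarse measure, so treat it as a starting point rather than a
full memory profile.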

The items in the next group are all connected. The fundamental question is:
what do we do in situations where the JIT does not help? There are many causes,
but, in general, PyPy is often slower than CPython in all of these cases.
A representative, difficult example is running tests. Ideally, for
perfect unit tests, each piece of code is executed only once, so the JIT
never gets a chance to warm up. Other examples include short-running scripts.
All of this can be addressed by working on one or more of the following:

  • Slow runtime. Our runtime is slow. This is caused by a combination
    of using a higher-level language than C and relative immaturity
    compared to CPython. The former is at least partly a GCC problem: we emit
    code that does not look like hand-written C, and GCC does a worse job of
    optimizing it. A good example is operations on longs, which are about 2x
    slower than CPython's, partly because GCC is unable to effectively
    optimize the code generated by PyPy's translator.

  • Long JIT warmup time. This is again a combination of issues.
    Partly it is a consequence of the design decision to trace at the
    meta-level, which takes more time, but partly it is an issue with our
    current implementation that can be addressed. It's also true that in some
    edge cases, like running large and complex programs with a lot
    of megamorphic call sites, we don't do a very good job of tracing. Because
    a good example of this case is running PyPy's own test suite, I expect
    we will invest some work into this eventually.

  • Slow interpreter. This one is very similar to the slow runtime: it's
    a combination of using RPython and the fact that we did not spend much
    time optimizing it. Unlike the runtime, we might solve it by having an
    unoptimizing JIT or some other medium-level solution that would work well
    enough. Some effort has been invested, but, as usual, we lack the
    manpower to proceed as rapidly as we would like.
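
To see how much warmup matters for a particular piece of code, one option is
to time the same function over several rounds in a single process and compare
the first rounds with the later ones. The sketch below is only an illustration
under that assumption: `hot_loop` and `time_rounds` are hypothetical names,
and the loop body is an arbitrary placeholder for real code:

```python
import time


def hot_loop(n):
    # Arbitrary placeholder for a loop the JIT could eventually compile.
    total = 0
    for i in range(n):
        total += i * i % 7
    return total


def time_rounds(func, arg, rounds=10):
    """Call func(arg) repeatedly and return the wall time of each round."""
    timings = []
    for _ in range(rounds):
        start = time.perf_counter()
        func(arg)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    for idx, seconds in enumerate(time_rounds(hot_loop, 200000)):
        print("round %d: %.4f s" % (idx, seconds))
```

If the early rounds are clearly slower than the later ones, the difference is
a rough estimate of the warmup cost. A script that only ever gets as far as
those early rounds (a unit test, a short-running tool) pays that cost without
seeing the benefit.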

Thanks for bearing with me this far. This blog post was partly influenced
by accusations that we're doing dishonest PR by claiming that PyPy is always
fast. I don't think this is the case, and I hope I have clarified some of the
weak spots, both here and on the performance page.

EDIT: For what it's worth, I don't mention interfacing with C here. That's not because I think it's not relevant; it simply did not quite fit with the other topics in this blog post. Consider the list non-exhaustive.