PyPy - How to not write Virtual Machines for
Dynamic Languages
Armin Rigo
Institut für Informatik
Heinrich-Heine-Universität Düsseldorf
ESUG 2007
Armin Rigo PyPy - How to not write Virtual Machines for Dynamic Languages
Scope
This talk is about:
implementing dynamic languages
(with a focus on complicated ones)
in a context of limited resources
(academic, open source, or domain-specific)
Complicated = requiring a large VM
Smalltalk (etc...): typically small core VM
Python (etc...): the VM contains quite a lot
Limited resources
Only near-complete implementations are really useful
Minimize duplication of effort among implementers
Our point
Our point:
Do not write virtual machines “by hand”
Instead, write interpreters in high-level languages
Meta-programming is your friend
Common Approaches to VM construction
Using C directly (or C disguised as another language)
CPython
Ruby
Spidermonkey (Mozilla’s JavaScript VM)
but also: Squeak, Scheme48
Building on top of a general-purpose OO VM
Jython, IronPython
JRuby, IronRuby
Implementing VMs in C
When writing a VM in C it is hard to reconcile:
flexibility, maintainability
simplicity of the VM
performance (needs dynamic compilation techniques)
Python Case
CPython is a very simple bytecode VM, performance not
great
Psyco is a just-in-time-specializer, very complex, hard to
maintain, but good performance
Stackless is a fork of CPython adding microthreads. It was
never incorporated into CPython for complexity reasons
Compilers are a bad encoding of Semantics
to reach good performance levels, dynamic compilation is
often needed
a dynamic compiler needs to encode language semantics
this encoding is often obscure and hard to change
Python Case
Psyco is a dynamic compiler for Python
synchronizing with CPython’s rapid development is a lot of
effort
many of CPython’s new features not supported well
Fixing of Early Design Decisions
when starting a VM in C, many design decisions need to
be made upfront
examples: memory management technique, threading
model
the decision is manifested throughout the VM source
very hard to change later
Python Case
CPython uses reference counting, increfs and decrefs
everywhere
CPython uses OS threads with one global lock, hard to
change to lightweight threads or finer locking
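To make "increfs and decrefs everywhere" concrete, here is a minimal sketch that observes CPython's reference counts from Python code. It assumes a CPython interpreter (`sys.getrefcount` is implementation-specific), and the helper name `refcount_delta` is made up for this illustration.

```python
import sys

def refcount_delta():
    """Return how much an object's reference count grows when one
    more reference to it is created (CPython-specific behaviour)."""
    obj = object()                    # a fresh, ordinary heap object
    before = sys.getrefcount(obj)
    holder = [obj]                    # storing obj in a list takes one more reference
    after = sys.getrefcount(obj)
    del holder                        # dropping the list releases that reference again
    return after - before

# Guarded because other implementations may not provide sys.getrefcount.
if hasattr(sys, "getrefcount"):
    print("extra references held:", refcount_delta())
```

Every such reference change is bookkeeping done inside the VM's C code, which is why the memory management choice ends up spread throughout the source.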
Implementation Proliferation
restrictions of the original implementation lead to
re-implementations, forks
all implementations need to be synchronized with
language evolution
lots of duplicate effort
Python Case
several serious implementations: CPython, Stackless,
Psyco, Jython, IronPython, PyPy
the implementations have various grades of compliance
Implementing Languages on Top of General-Purpose
OO VMs
users wish to have easy interoperation with the
general-purpose OO VMs used by the industry (JVM, CLR)
therefore re-implementations of the language on the OO
VMs are started
even more implementation proliferation
implementing on top of an OO VM has its own set of
problems
Python Case
Jython is a Python-to-Java-bytecode compiler
IronPython is a Python-to-CLR-bytecode compiler
both are slightly incompatible with the newest CPython
version (especially Jython)
Benefits of implementing on top of OO VMs
higher level of implementation
the VM supplies a GC and, in most cases, a JIT
better interoperability than what the C level provides
some proponents believe that eventually one single VM
should be enough
The problems of OO VMs
some of the benefits of OO VMs don’t work out in practice
most immediate problem: it can be hard to map concepts
of the dynamic language to the host OO VM
performance is often not improved, and can be very bad,
because of the semantic mismatch between the dynamic
language and the host VM
poor interoperability with everything outside the OO VM
in practice, one OO VM is not enough
Python Case
Jython is about 5 times slower than CPython
IronPython is about as fast as CPython (but some
introspection features missing)
PyPy’s Approach to VM Construction
Goal: achieve flexibility, simplicity and performance together
Approach: auto-generate VMs from high-level descriptions
of the language
... using meta-programming techniques and aspects
high-level description: an interpreter written in a high-level
language
... which we translate (i.e. compile) to VMs running on top
of various targets, like C/Posix, CLR, JVM
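As a rough illustration of what "an interpreter written in a high-level language" can look like, here is a toy stack-machine interpreter. The opcode names and the `run` function are invented for this sketch and have nothing to do with PyPy's actual Python interpreter.

```python
def run(program):
    """Interpret a list of (opcode, argument) pairs on an operand stack."""
    stack = []
    pc = 0                                  # index of the next instruction
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif op == "JUMP_IF_ZERO":
            if stack.pop() == 0:
                pc = arg                    # absolute jump target
        else:
            raise ValueError("unknown opcode: %r" % (op,))
    return stack.pop()

# Hypothetical example program: computes 6 * 7.
PROGRAM = [("PUSH", 6), ("PUSH", 7), ("MUL", None)]
print(run(PROGRAM))
```

Nothing in this source says how memory is managed or how threads work; under the approach described here, those properties would be decided when the interpreter is translated to a concrete target.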
PyPy
PyPy = Python interpreter written in RPython + translation
toolchain for RPython
What is RPython
RPython is a subset of Python
subset chosen in such a way that type-inference can be
performed
still a high-level language (unlike SLang or Prescheme)
...really a subset, can’t give a small example of code that
doesn’t just look like Python :-)
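The slides give no RPython example, so the following is only a sketch of what "type inference can be performed" might mean in practice. It assumes the restriction is that each variable keeps a single consistent type; both functions are ordinary Python, and the function names are invented here.

```python
def sum_of_squares(n):
    # 'total' and 'i' are integers on every path through the function,
    # so a type inferencer can give each variable one static type.
    total = 0
    i = 0
    while i < n:
        total += i * i
        i += 1
    return total

def mixed_result(flag):
    # Perfectly valid Python, but 'value' is an int on one branch and a
    # str on the other. Code like this is what a type-inferable subset
    # would have to reject (assumption, see the note above).
    if flag:
        value = 1
    else:
        value = "one"
    return value

print(sum_of_squares(4))
print(type(mixed_result(True)).__name__, type(mixed_result(False)).__name__)
```

The first function looks like any other Python code, which matches the remark that RPython is "really a subset".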
Auto-generating VMs
high-level source: early design decisions not necessary
we need a custom translation toolchain to compile the
interpreter to a full VM
many aspects of the final VM are orthogonal to the
interpreter source: they are inserted during translation
translation aspect ≅ monads, with more ad-hoc control
Examples
Garbage Collection strategy
Threading models (e.g. coroutines with CPS...)
non-trivial translation aspect: auto-generating a dynamic
compiler from the interpreter
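A toy way to picture an aspect that is "inserted during translation": the interpreter code below only asks for memory, and a separate step decides how allocation behaves. This is a loose analogy, not PyPy's toolchain; `interpreter_source`, `translate`, and the policy names are all hypothetical.

```python
def interpreter_source(alloc):
    """The 'interpreter': builds a pair without knowing how memory is managed."""
    def make_pair(first, second):
        cell = alloc(2)                     # request a cell with two slots
        cell[0] = first
        cell[1] = second
        return cell
    return make_pair

def translate(source, gc_policy):
    """Hypothetical 'translation' step that weaves in a memory-management policy."""
    stats = {"policy": gc_policy, "allocations": 0}

    def alloc(size):
        stats["allocations"] += 1
        if gc_policy == "refcount":
            # Reference counting: every object carries an extra count field.
            return [None] * size + [1]
        # Tracing-style policy: no per-object header in this toy model.
        return [None] * size

    return source(alloc), stats

make_pair_rc, rc_stats = translate(interpreter_source, "refcount")
make_pair_tr, tr_stats = translate(interpreter_source, "tracing")
print(make_pair_rc("a", "b"), make_pair_tr("a", "b"))
```

The same `interpreter_source` yields two differently laid-out "VMs", which is the sense in which the memory management decision stays out of the interpreter source.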
Good Points of the Approach
Simplicity:
dynamic languages can be implemented in a high level
language
separation of concerns from low-level details
a potential single-source-fits-all interpreter – less
duplication of effort
runs everywhere with the same semantics – no outdated
implementations, no ties to any standard platform
PyPy
arguably the most readable Python implementation so far
Good Points of the Approach
Flexibility at all levels:
when writing the interpreter (high-level languages rule!)
when adapting the translation toolchain as necessary
to break abstraction barriers when necessary
Example
boxed integer objects, represented as tagged pointers
manual system-level RPython code
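The tagged-pointer idea can be sketched with plain integers standing in for machine words. The encoding below (lowest bit set to 1 marks a small integer, aligned pointers keep it at 0) is one common scheme chosen for illustration; the slides do not say which encoding PyPy uses, and all names are invented.

```python
TAG_BITS = 1      # number of low bits reserved for the tag
INT_TAG = 1       # low bit 1 marks an unboxed integer; aligned pointers end in 0

def tag_int(value):
    """Encode a small integer directly in a machine word."""
    return (value << TAG_BITS) | INT_TAG

def is_tagged_int(word):
    """Tell an encoded integer apart from an (aligned) pointer."""
    return (word & INT_TAG) == INT_TAG

def untag_int(word):
    """Recover the integer; the arithmetic right shift preserves the sign."""
    return word >> TAG_BITS

word = tag_int(21)
print(word, is_tagged_int(word), untag_int(word))
```

With such an encoding, small integers need no heap allocation at all, while code written against the interpreter's integer objects can stay unchanged.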
Good Points of the Approach
Performance:
“reasonable” performance
can generate a dynamic compiler from the interpreter
(work in progress, 60x faster on very simple Python code)
JIT compiler generator
almost orthogonal to the interpreter source - applicable
to many languages, follows language evolution “for free”
based on Partial Evaluation
benefits from a high-level interpreter and a tweakable
translation toolchain
generating a dynamic compiler is easier than generating a
static one!
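Partial evaluation means specializing a program with respect to the inputs that are already known. The classic textbook example is a power function specialized for a fixed exponent, sketched here; it only illustrates the principle and is not how PyPy's JIT generator works. In the JIT setting, the interpreter would play the role of `power` and the user's program the role of the known input `n`.

```python
def power(x, n):
    """General version: loops over n on every call."""
    result = 1
    for _ in range(n):
        result *= x
    return result

def specialize_power(n):
    """Partially evaluate power() for a known n, yielding loop-free code."""
    body = " * ".join(["x"] * n) if n > 0 else "1"
    name = "power_%d" % n
    source = "def %s(x):\n    return %s\n" % (name, body)
    namespace = {}
    exec(source, namespace)                 # compile the residual program
    return namespace[name], source

cube, cube_source = specialize_power(3)
print(cube_source)
print(cube(2), power(2, 3))
```

The loop over `n` has disappeared from the generated function: the work that depended only on the known input was done once, at specialization time.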
Open Issues / Drawbacks / Further Work
writing the translation toolchain in the first place takes lots
of effort (but it can be reused)
writing a good GC is still necessary. But: maybe we can
reuse existing good GCs (e.g. from the Jikes RVM)?
conceptually simple approach but many abstraction layers
dynamic compiler generation seems to work, but needs
more effort. Also: can we layer it on top of the JIT of a
general purpose OO VM?
Conclusion / Meta-Points
high-level languages are suitable to implement dynamic
languages
doing so has many benefits
VMs shouldn’t be written by hand
PyPy’s concrete approach is not so important
diversity is good
let’s write more meta-programming toolchains!
For more information
PyPy
http://codespeak.net/pypy/
“Sprints”
Main way we develop PyPy
They are programming camps, a few days to one week
long
We may have one in Bern soon (PyPy+Squeak) and/or in
Germany (JIT and other topics)
See also
Google for the full paper corresponding to these slides, which was
submitted to Dyla’2007