cpython/Tools/cases_generator/README.md
Guido van Rossum 51fc725117
gh-104584: Baby steps towards generating and executing traces (#105924)
Added a new, experimental, tracing optimizer and interpreter (a.k.a. "tier 2"). This currently pessimizes, so don't use yet -- this is infrastructure so we can experiment with optimizing passes. To enable it, pass ``-Xuops`` or set ``PYTHONUOPS=1``. To get debug output, set ``PYTHONUOPSDEBUG=N`` where ``N`` is a debug level (0-4, where 0 is no debug output and 4 is excessively verbose).

All of this code is likely to change dramatically before the 3.13 feature freeze. But this is a first step.
2023-06-26 19:02:57 -07:00

37 lines
1.6 KiB
Markdown

# Tooling to generate interpreters
Documentation for the instruction definitions in `Python/bytecodes.c`
("the DSL") is [here](interpreter_definition.md).
What's currently here:
- `lexer.py`: lexer for C, originally written by Mark Shannon
- `plexer.py`: OO interface on top of lexer.py; main class: `PLexer`
- `parser.py`: Parser for instruction definition DSL; main class `Parser`
- `generate_cases.py`: driver script to read `Python/bytecodes.c` and
write `Python/generated_cases.c.h` (and several other files)
- `test_generator.py`: tests, require manual running using `pytest`
Note that there is some dummy C code at the top and bottom of
`Python/bytecodes.c`
to fool text editors like VS Code into believing this is valid C code.
## A bit about the parser
The parser class uses a pretty standard recursive descent scheme,
but with unlimited backtracking.
The `PLexer` class tokenizes the entire input before parsing starts.
We do not run the C preprocessor.
Each parsing method returns either an AST node (a `Node` instance)
or `None`, or raises `SyntaxError` (showing the error in the C source).
Most parsing methods are decorated with `@contextual`, which automatically
resets the tokenizer input position when `None` is returned.
Parsing methods may also raise `SyntaxError`, which is irrecoverable.
When a parsing method returns `None`, it is possible that after backtracking
a different parsing method returns a valid AST.
Neither the lexer nor the parsers are complete or fully correct.
Most known issues are tersely indicated by `# TODO:` comments.
We plan to fix issues as they become relevant.