-
Notifications
You must be signed in to change notification settings - Fork 174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Project Proposal: Lox Programming Language Compiler and Bytecode Virtual Machine #409
Comments
Sounds good overall! Can you please expand on what you have written under "empirically measure success"? You have mentioned some test cases, which is great! What exactly will you do, and what outcomes do you expect? Please make a list. For example, it could include "I will write a test harness that uses these test files as input, and I expect 100% of the tests in this directory to pass." Or you could pick a subset of the tests. And even if you don't implement any optimizations, I think you should measure performance and compare it against some sort of baseline (which baseline? you pick). Smaller comments:
Do you have a specific justification for this (or perhaps a specific baseline that Pratt is "more adoptable" than)? If I were to guess, I would have supposed that a simple recursive descent parser would be the most common in industry. :)
Do you mean "during parsing" (i.e., the parser will directly output bytecode)? Or is there some other intermediate data structure (which might be referred to as the "parse tree")?
I strongly suspect that there will be plenty of low-hanging fruit for optimization that does not require |
Thanks for your detailed response. I added the performance test in the How will you empirically measure success?.
Sorry I don't have justification for this. I just read from this book that talks about it's more likely to implement front-end with Pratt Parser than generative tools like Bison or Flex. |
Sounds good; thank you! |
What will you do?
Inspired by Crafting Interpreters
I will be building a full-functioned Lox programming language compiler and a bytecode interpreter/ virtual machine in Rust language from scratch. Lox is a language that combines the syntax of Python and C with dynamic typing. And it has some extensions of object-oriented features and first-class functions.
As a new Rust user, one of the main difficulties of this project would be to correctly program Rust in the canonical way and pay best efforts to pass Rust's borrow checker.
How will you do it?
Lox Compiler
This part will be built as fast and as easy as possible. Instead of using Bison- or Flex-like tools, the scanner and parser will be implemented from scratch. Pratt Parser Algorithm will be adopted in the front-end. Moreover, I won’t build the AST (abstract syntax tree). Instead, the bytecode will be generated immediately during parsing.
Bytecode Interpreter
The language (ISA) for the interpreter is a stack machine that does data movements by mainly push and pop operations. This is not adopted in modern, high-performance architectures, but it can be very friendly for the front-end compiler development with quite acceptable performance. Furthermore, a Mark-Sweep Garbage Collection will be implemented as the memory management mechanism.
How will you empirically measure success?
There are some official test cases. The most important metric of language implementation is to make it correct. I will write a test harness that uses these tests as input. I expect it will pass the tests that cover features of this language I implemented.
If time permits, I could find the hotspot of bytecode simulation and try to optimize it by either algorithm improvement or hacking Rust by using unsafe block.
I will do the performance comparison with both jlox and clox from official implementation. The former is AST tree-walking interpreter implemented by Java, and the latter is bytecode interpreter implemented by C. It could be especially interesting to compare the performance between C's and Rust's implementation (mine).
Team members:
Possible Future Works
Build a compiler that compiles bytecode into native machine language (Arm64 for my computer) that can either be stack-machine-like style that doesn't need to take care of the register or register machine style that we should perform register allocation. The later one would be a huge project.
The above compiler can be used for the JIT compilation and we can also do the performance measurement on bytecode interpretation, stack style asm, or register style asm.
Or, build a profiler to measure performance between different backend implementation.
The text was updated successfully, but these errors were encountered: