🚀 A modern, optimizing bytecode compiler built from scratch in C++ with lexer, parser, optimizer, and stack-based VM. Features 17 opcodes, recursion, arrays, and a live web REPL.

IsVohi/ByteCode-Compiler

Optimizing Bytecode Compiler

A high-performance C++17 bytecode compiler with optimization passes for a stack-based virtual machine.

Project Overview

This project implements a complete compilation pipeline from source to optimized bytecode execution. The compiler supports a custom high-level language with functions, arrays, loops, and more.

Key Features

  • Stack-Based Bytecode VM: 17 opcodes for arithmetic, control, arrays, functions, and I/O
  • Modular Design: Separate lexer, parser, codegen, optimizer, VM
  • Full Language Support: Functions, recursion, arrays, loops, conditionals
  • Interactive REPL: Persistent state across commands
  • Robust Error Handling: Custom exceptions per compilation stage
  • Testing: GoogleTest suite with 150+ tests
  • Benchmark Suite: Performance profiling and HTML dashboard

Example Syntax

// Variables and arithmetic
let x = 10;
let y = 20;
print(x + y);           // 30

// Functions and recursion
fn factorial(n) {
    if (n <= 1) {
        return 1;
    }
    return n * factorial(n - 1);
}
print(factorial(5));    // 120

// Arrays
let arr = [1, 2, 3, 4, 5];
arr[0] = 99;
print(arr);             // [99, 2, 3, 4, 5]

// Nested arrays
let matrix = [[1, 2], [3, 4]];
print(matrix[1][0]);    // 3

// Control flow
for (let i = 0; i < 5; i = i + 1) {
    if (i == 2) { continue; }
    print(i);
}

// While loops with break
while (x > 0) {
    if (x == 5) { break; }
    x = x - 1;
}
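
To connect this syntax to the stack-based VM, here is a minimal sketch of how an expression statement such as `print(x + y)` might be lowered to bytecode. The opcode values come from the Bytecode Specification table; the `Node` and `Instr` layouts, the helper names, and the idea that variables live in numbered slots are assumptions made for illustration, not the project's actual codegen API.

```cpp
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Opcode values follow the Bytecode Specification table.
enum class Op : std::uint8_t { CONST = 0x00, LOAD = 0x01, ADD = 0x03, MUL = 0x05, PRINT = 0x0C };

// Assumed encoding: one opcode plus one integer operand per instruction.
struct Instr { Op op; int operand; };

// Hypothetical expression node: 'n' = number, 'v' = variable slot, '+' / '*' = binary operator.
struct Node {
    char kind;
    int value;  // literal value or variable slot
    std::shared_ptr<Node> lhs, rhs;
};
using NodePtr = std::shared_ptr<Node>;

NodePtr num(int v)    { return std::make_shared<Node>(Node{'n', v, nullptr, nullptr}); }
NodePtr var(int slot) { return std::make_shared<Node>(Node{'v', slot, nullptr, nullptr}); }
NodePtr bin(char op, NodePtr l, NodePtr r) {
    return std::make_shared<Node>(Node{op, 0, std::move(l), std::move(r)});
}

// Post-order walk: emit both operands first, then the operator that consumes them.
void emit(const Node& n, std::vector<Instr>& out) {
    switch (n.kind) {
    case 'n': out.push_back({Op::CONST, n.value}); break;
    case 'v': out.push_back({Op::LOAD, n.value}); break;
    default:
        emit(*n.lhs, out);
        emit(*n.rhs, out);
        out.push_back({n.kind == '+' ? Op::ADD : Op::MUL, 0});
    }
}

// Lower print(expr) to bytecode: evaluate the expression, then PRINT the result.
std::vector<Instr> compilePrint(const Node& expr) {
    std::vector<Instr> out;
    emit(expr, out);
    out.push_back({Op::PRINT, 0});
    return out;
}
```

With `x` in slot 0 and `y` in slot 1, `compilePrint(*bin('+', var(0), var(1)))` yields the four instructions LOAD 0, LOAD 1, ADD, PRINT.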

Build Instructions

Prerequisites

  • CMake 3.16+
  • C++17 compiler (GCC 7+, Clang 5+, MSVC 2017+)
  • Git

Building

mkdir build && cd build
cmake ..
make

Running Compiler

# Run a source file
./build/compiler script.src

# Run with optimizations disabled
./build/compiler script.src --no-opt

# Run with profiling enabled
./build/compiler script.src --profile

# Run with verbose output
./build/compiler script.src --verbose

Open in New Terminal Window (macOS)

# Open REPL in a new Terminal window
./run.sh

# Run a file in a new Terminal window
./run.sh examples/comprehensive_demo.txt

Interactive REPL

./compiler
# ByteCode Compiler REPL v1.0.0
# > let x = 5;
# > print(x * 2);
# 10
# > exit

Running Tests

cd build
# Run the test binary directly
./tests
# Or run the suite through CTest
ctest --output-on-failure

Architecture

Source → Lexer → Parser → AST → Optimizer → Codegen → Bytecode → VM
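
As an illustration of the first stage in this pipeline, the following is a minimal tokenizer sketch. `TokenKind`, `Token`, and `tokenize` are hypothetical names, and the sketch treats every punctuation character as its own token, so a two-character operator such as `<=` would need extra handling in a real lexer.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical token kinds; a real lexer would likely separate keywords and operators.
enum class TokenKind { Number, Identifier, Symbol };

struct Token {
    TokenKind kind;
    std::string text;
};

// Split source text into numbers, identifiers/keywords, and one-character symbols.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }  // skip whitespace
        std::size_t start = i;
        if (std::isdigit(c)) {
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            tokens.push_back({TokenKind::Number, src.substr(start, i - start)});
        } else if (std::isalpha(c) || c == '_') {
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
            tokens.push_back({TokenKind::Identifier, src.substr(start, i - start)});
        } else if (std::ispunct(c)) {
            tokens.push_back({TokenKind::Symbol, std::string(1, src[i])});
            ++i;
        } else {
            throw std::runtime_error("Unexpected character in source");
        }
    }
    return tokens;
}
```

For the input `let x = 10;` this produces five tokens: `let`, `x`, `=`, `10`, and `;`.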

Core components status:

Component   Status
Lexer       ✅ Implemented
Parser      ✅ Implemented
Codegen     ✅ Implemented
Optimizer   ✅ Implemented
VM          ✅ Implemented
Arrays      ✅ Implemented
REPL        ✅ Implemented
CLI         ✅ Implemented
Tests       ✅ 150+ tests

Project Structure

ByteCode-Compiler/
├── src/
│   ├── main.cpp        # CLI and REPL
│   ├── lexer.cpp       # Tokenizer
│   ├── parser.cpp      # Recursive descent parser
│   ├── ast.cpp         # AST node implementations
│   ├── codegen.cpp     # Bytecode generator
│   ├── optimizer.cpp   # Optimization passes
│   └── vm.cpp          # Virtual machine
├── include/
│   ├── common.h        # Types, opcodes, exceptions
│   ├── lexer.h
│   ├── parser.h
│   ├── ast.h
│   ├── codegen.h
│   ├── optimizer.h
│   ├── vm.h
│   └── profiler.h
├── tests/
│   ├── test_lexer.cpp
│   ├── test_parser.cpp
│   ├── test_codegen.cpp
│   ├── test_vm.cpp
│   ├── test_optimizer.cpp
│   ├── test_arrays.cpp
│   ├── test_control_flow.cpp
│   ├── test_bubblesort.cpp
│   └── test_end_to_end.cpp
├── examples/
│   └── comprehensive_demo.txt
├── benchmarks/
│   └── benchmark.py
├── results/
│   └── dashboard.html
├── CMakeLists.txt
└── README.md

Bytecode Specification

Opcode          Hex         Description
CONST           0x00        Push constant
LOAD            0x01        Load variable
STORE           0x02        Store variable
ADD             0x03        Pop two and push sum
SUB             0x04        Pop two and push difference
MUL             0x05        Pop two and push product
DIV             0x06        Pop two and push quotient
MOD             0x07        Pop two and push modulo
JUMP            0x08        Unconditional jump
JUMP_IF_ZERO    0x09        Jump if top is zero
CALL            0x0A        Call function
RETURN          0x0B        Return from function
PRINT           0x0C        Pop and print top
EQ/NEQ/LT/...   0x0D-0x12   Comparison operators
BUILD_ARRAY     0x13        Build array from stack
ARRAY_LOAD      0x14        Load from array index
ARRAY_STORE     0x15        Store to array index
POP             0x16        Pop and discard top
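
To show how a stack machine executes opcodes like these, here is a minimal interpreter loop for a subset of the table. The opcode values are taken from the table; the `Instr` layout, the `run` function, the choice to collect printed values in a vector, the assumption that JUMP_IF_ZERO pops the value it tests, and the underflow message are all illustrative and may differ from the project's `vm.cpp`.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Opcode values are taken from the table above; only a subset is handled here.
enum class Op : std::uint8_t {
    CONST = 0x00, ADD = 0x03, SUB = 0x04, MUL = 0x05, DIV = 0x06,
    JUMP = 0x08, JUMP_IF_ZERO = 0x09, PRINT = 0x0C, POP = 0x16
};

// Assumed encoding: one opcode plus one integer operand per instruction.
struct Instr { Op op; int operand; };

// Execute the program and return every value PRINT consumed, in order.
std::vector<int> run(const std::vector<Instr>& code) {
    std::vector<int> stack;
    std::vector<int> printed;
    auto pop = [&stack]() {
        if (stack.empty()) throw std::runtime_error("VM error: Stack underflow");  // hypothetical message
        int v = stack.back();
        stack.pop_back();
        return v;
    };
    std::size_t pc = 0;  // index of the next instruction
    while (pc < code.size()) {
        const Instr& in = code[pc++];
        switch (in.op) {
        case Op::CONST: stack.push_back(in.operand); break;
        case Op::ADD: { int b = pop(); int a = pop(); stack.push_back(a + b); break; }
        case Op::SUB: { int b = pop(); int a = pop(); stack.push_back(a - b); break; }
        case Op::MUL: { int b = pop(); int a = pop(); stack.push_back(a * b); break; }
        case Op::DIV: {
            int b = pop(); int a = pop();
            if (b == 0) throw std::runtime_error("VM error: Division by zero");
            stack.push_back(a / b);
            break;
        }
        case Op::JUMP: pc = static_cast<std::size_t>(in.operand); break;
        case Op::JUMP_IF_ZERO:
            // Assumption: the tested value is popped before the jump is taken.
            if (pop() == 0) pc = static_cast<std::size_t>(in.operand);
            break;
        case Op::PRINT: printed.push_back(pop()); break;
        case Op::POP: pop(); break;
        }
    }
    return printed;
}
```

Running the sequence CONST 2, CONST 3, ADD, CONST 4, MUL, PRINT through `run` evaluates `(2 + 3) * 4` and returns a single printed value, 20.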

Limits

  • Stack Size: 256
  • Variables: 1024 per scope
  • Instructions: 65535 per program
  • Functions: 256 per program
  • Bytecode Version: 1
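
One way such limits can be enforced is with compile-time constants and explicit bounds checks. The sketch below assumes the stack limit counts values (not bytes); `BoundedStack`, the constant names, and the error messages are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Values from the Limits list; the constant names are illustrative.
constexpr std::size_t kMaxStack = 256;
constexpr std::size_t kMaxVariablesPerScope = 1024;
constexpr std::size_t kMaxInstructions = 65535;
constexpr std::size_t kMaxFunctions = 256;

// Fixed-capacity operand stack that fails loudly instead of growing past the limit.
class BoundedStack {
public:
    void push(int v) {
        if (size_ == kMaxStack) throw std::overflow_error("VM error: Stack overflow");
        data_[size_++] = v;
    }
    int pop() {
        if (size_ == 0) throw std::underflow_error("VM error: Stack underflow");
        return data_[--size_];
    }
    std::size_t size() const { return size_; }

private:
    std::array<int, kMaxStack> data_{};
    std::size_t size_ = 0;
};
```

A fixed `std::array` keeps the limit visible in the type and avoids heap allocation in the interpreter's hot path; deep recursion would then surface as an overflow exception on the 257th push.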

Compiler Flags

Flag        Description
--no-opt    Disable optimizations
--profile   Enable execution profiling
--verbose   Print detailed compilation info
--dump      Dump generated bytecode

Optimizations

The optimizer performs several passes:

  • Constant Folding: Evaluates constant expressions at compile time
  • Dead Code Elimination: Removes unreachable code after return statements
  • Unused Variable Detection: Identifies and warns about unused variables
  • Function Inlining: Inlines small, non-recursive functions
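
As an illustration of the first pass, here is a minimal constant-folding sketch over a tiny expression tree. The `Expr` type and the `number`, `variable`, `binary`, and `fold` helpers are hypothetical and much simpler than a full AST; only `+`, `-`, and `*` are folded, and everything else is left for runtime.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical expression node, not the project's AST.
struct Expr {
    enum class Kind { Number, Variable, Binary };
    Kind kind = Kind::Number;
    int value = 0;                   // used when kind == Number
    std::string name;                // used when kind == Variable
    char op = 0;                     // used when kind == Binary: '+', '-', '*'
    std::unique_ptr<Expr> lhs, rhs;  // used when kind == Binary
};
using ExprPtr = std::unique_ptr<Expr>;

ExprPtr number(int v) {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::Kind::Number;
    e->value = v;
    return e;
}

ExprPtr variable(std::string n) {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::Kind::Variable;
    e->name = std::move(n);
    return e;
}

ExprPtr binary(char op, ExprPtr l, ExprPtr r) {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::Kind::Binary;
    e->op = op;
    e->lhs = std::move(l);
    e->rhs = std::move(r);
    return e;
}

// Fold bottom-up: simplify both children first, then replace a binary node
// whose operands are both literals with a single literal.
ExprPtr fold(ExprPtr e) {
    if (e->kind != Expr::Kind::Binary) return e;
    e->lhs = fold(std::move(e->lhs));
    e->rhs = fold(std::move(e->rhs));
    if (e->lhs->kind == Expr::Kind::Number && e->rhs->kind == Expr::Kind::Number) {
        int a = e->lhs->value;
        int b = e->rhs->value;
        switch (e->op) {
        case '+': return number(a + b);
        case '-': return number(a - b);
        case '*': return number(a * b);
        default: break;  // unknown operator: leave the node for runtime
        }
    }
    return e;
}
```

Folding `2 + 3 * 4` collapses the whole tree to the literal 14, while `x + 2 * 5` keeps the addition and only folds the right operand to 10.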

Error Handling

Custom exception classes for each stage:

Exception        Stage
CompilerError    Base class
LexerError       Lexical errors
ParserError      Syntax errors
CodegenError     Code generation
OptimizerError   Optimization
VMError          Runtime errors

Example errors:

> print(x / 0);
Error: VM error: Division by zero

> let arr = [1, 2]; print(arr[10]);
Error: VM error: Array index out of bounds
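
A hierarchy like the one in the table could be laid out as follows. The class names come from the table and the two messages from the examples above; deriving from `std::runtime_error`, adding the stage prefix in the constructor, and the `checkedDivide`, `checkedIndex`, and `report` helpers are assumptions made for this sketch.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Base class for all stages (assumed to derive from std::runtime_error).
class CompilerError : public std::runtime_error {
public:
    explicit CompilerError(const std::string& msg) : std::runtime_error(msg) {}
};

// Runtime errors; the "VM error: " prefix mirrors the example output above.
class VMError : public CompilerError {
public:
    explicit VMError(const std::string& msg) : CompilerError("VM error: " + msg) {}
};

// Hypothetical helper: division guarded the way the first example suggests.
int checkedDivide(int a, int b) {
    if (b == 0) throw VMError("Division by zero");
    return a / b;
}

// Hypothetical helper: bounds-checked array read.
int checkedIndex(const std::vector<int>& arr, int index) {
    if (index < 0 || static_cast<std::size_t>(index) >= arr.size()) {
        throw VMError("Array index out of bounds");
    }
    return arr[static_cast<std::size_t>(index)];
}

// Format an error the way the REPL examples display it.
std::string report(const CompilerError& e) {
    return std::string("Error: ") + e.what();
}
```

Because every stage-specific class shares the `CompilerError` base, a REPL or CLI can use a single `catch (const CompilerError&)` handler and still report which stage failed.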

Benchmarks

Run benchmarks and view results:

cd benchmarks
python benchmark.py
# Results written to ../results/benchmarks.csv
# Open ../results/dashboard.html for charts

Contributing Guidelines

  • Maintain C++17 compliance
  • Add tests for new features
  • Keep warnings as errors (-Werror)
  • Document public APIs
  • Follow existing code style

Status: Fully implemented and tested. All 150+ tests passing.
