- Tutorial
- Architecture
- VM Assembly Syntax and Commands
- Command-Line Interface (CLI)
- Developer TODO / Roadmap
Here is a link to the MyVM tutorial: Tutorial
This project is a stack-based virtual machine (VM) with a separate call stack for managing function calls.
A stack-based VM executes instructions primarily by manipulating a stack data structure.
Instead of operating directly on registers, most instructions push values onto the stack and pop them when needed.
For example, consider the following program:
PUSH 10
PUSH 20
ADD

- `PUSH 10` pushes the value `10` onto the stack.
- `PUSH 20` pushes the value `20`.
- `ADD` pops the top two values (`20` and `10`), adds them, and pushes the result (`30`) back onto the stack.
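
The walkthrough above can be sketched in a few lines of Rust. This is a minimal illustration, not MyVM's actual implementation: the `Instr` enum and the `run` function are hypothetical names, and wrapping addition is an assumption of the sketch.

```rust
/// Hypothetical instruction type for this sketch; not MyVM's real opcode type.
#[derive(Debug, Clone, Copy)]
enum Instr {
    Push(i32),
    Add,
}

/// Runs the instructions on an empty stack and returns the final stack.
/// Returns `None` if `Add` finds fewer than two values on the stack.
fn run(program: &[Instr]) -> Option<Vec<i32>> {
    let mut stack = Vec::new();
    for instr in program {
        match *instr {
            Instr::Push(value) => stack.push(value),
            Instr::Add => {
                let rhs = stack.pop()?; // top of the stack (20 in the example)
                let lhs = stack.pop()?; // value below it (10 in the example)
                stack.push(lhs.wrapping_add(rhs)); // assumption: wrap on overflow
            }
        }
    }
    Some(stack)
}
```

Calling `run(&[Instr::Push(10), Instr::Push(20), Instr::Add])` leaves a single value, `30`, on the stack.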
This simple model makes it easier to implement compilers and interpreters since there is no need to manage complex register allocations.
The VM uses a Reduced Instruction Set Computing (RISC) design.
RISC focuses on having a small, well-defined set of simple instructions, each performing a single operation efficiently.
Each instruction in the VM is represented as an Opcode.
An Opcode (Operation Code) is the numeric or symbolic representation of an instruction that tells the VM what operation to perform.
Opcodes may also take parameters (operands).
For example:
PUSH 42 ; Opcode: PUSH, Operand: 42
PUSH 32 ; Opcode: PUSH, Operand: 32
ADD ; Opcode: ADD (no operands)

Some Opcodes have variants.
For example, the Jump instruction has multiple variants that depend on conditions:
- `jmp`: Unconditional jump
- `jnz`: Jump if not zero
- `jz`: Jump if zero
- `jg`: Jump if greater
- `jl`: Jump if less
- `jge`: Jump if greater or equal
- `jle`: Jump if less or equal

The VM core is composed of three primary components: memory, registers, and flags.

- Memory is the main storage. It contains:
  - Uploaded program code
  - Variables and data
  - Execution stack
- The stack is located at the end of memory and grows backwards (toward lower addresses) to avoid collisions with program data.
- If the stack grows beyond its capacity, a stack overflow occurs.
- Each memory cell is 32 bits wide.
- Registers are small, fast storage locations inside the VM that hold data during execution.
- This VM has:
  - 8 general-purpose registers: `r0`...`r7`
  - Program Counter (PC) register: stores the address of the next instruction to execute
- Registers allow the VM to perform operations more quickly than relying solely on memory.
- Flags are single-bit indicators that store the outcome of operations.
- They are used by conditional instructions (like jumps) to determine execution flow.
- This VM defines the following flags:
  - Zero (Z): Set when the result of an operation is `0`
  - Overflow (O): Set when an arithmetic operation produces a value outside the representable range
  - Negative (N): Set when the result of an operation is negative
  - Carry (C): Set when an arithmetic operation generates a carry/borrow bit beyond the register size
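
To make the flag definitions concrete, here is a minimal Rust sketch of how the four flags could be derived from a 32-bit addition. The `Flags` struct and `add_with_flags` function are hypothetical names, and the mapping is an assumption based on common CPU conventions (Carry for unsigned wrap-around, Overflow for signed wrap-around), not a description of MyVM's internals.

```rust
/// Hypothetical flag set mirroring the Z, O, N and C flags listed above.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
struct Flags {
    zero: bool,
    overflow: bool,
    negative: bool,
    carry: bool,
}

/// Adds two 32-bit cells and derives the flags from the result.
fn add_with_flags(a: u32, b: u32) -> (u32, Flags) {
    // Unsigned view: `carry` is true when the sum does not fit in 32 bits.
    let (result, carry) = a.overflowing_add(b);
    // Signed view: `overflow` is true when the signed sum leaves the i32 range.
    let (_, overflow) = (a as i32).overflowing_add(b as i32);
    let flags = Flags {
        zero: result == 0,
        overflow,
        negative: (result as i32) < 0, // sign bit of the result
        carry,
    };
    (result, flags)
}
```

For instance, `add_with_flags(1, u32::MAX)` wraps to `0` and sets Zero and Carry, while `add_with_flags(i32::MAX as u32, 1)` sets Overflow and Negative.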
The VM uses a custom assembly language for programming, supporting constants, memory addresses, registers, labels, macros, and interrupts.
Every program must have exactly one .start label that serves as the entry point. This is where execution begins.
- No `.start` label: results in a compilation error
- Multiple `.start` labels: results in a compilation error
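
The rule above could be checked with a small pass over the source before compilation. The sketch below is only an illustration: `check_start_label` is a hypothetical helper, the error messages are invented, and it assumes a label sits alone on its line, optionally followed by a `;` comment.

```rust
/// Checks that the source contains exactly one `.start` label.
/// Simplification: a line counts as the label when its text, with any
/// `;` comment and surrounding whitespace removed, is exactly `.start`.
fn check_start_label(source: &str) -> Result<(), String> {
    let count = source
        .lines()
        .map(|line| line.split(';').next().unwrap_or("").trim())
        .filter(|line| *line == ".start")
        .count();
    match count {
        1 => Ok(()),
        0 => Err("no .start label found".to_string()),
        n => Err(format!("expected one .start label, found {}", n)),
    }
}
```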
Example:
[text]
.start
PUSH 10
INT 0 0
TERM

Programs must be organized into sections using `[section_name]` tags. There are two mandatory sections:

- `[text]` - Contains executable code
- `[data]` - Contains data definitions and constants

Writing code in the [data] section or defining data in the [text] section will result in an error.
Example:
[data]
$name dw "Pedram" 0
$scores w 0x2301 0x1212 19 20
[text]
.start
PUSH $name
INT 0 3
TERM

Data can be defined in the `[data]` section using the following syntax:

$[identifier] [data type] [... data separated with space ...]
Supported data types:
- `b` - Byte: stores data in 1-byte arrays
- `w` - Word: stores data in 2-byte arrays
- `dw` - Double Word: stores data in 4-byte arrays

Examples:
[data]
$name b "Pedram"
$scores w 0x2301 0x1212 19 20
$data dw 0xaaaaaaaa 0xbbbbbbbb

Note on memory storage: Each memory cell is 32-bit (4 bytes). When using `b` or `w` types, data is packed into memory cells at the bit level. For example:

- `b 0xaa 0xbb 0xcc` stores as: `0xaabbcc00`
- `w 0xaabb 0xcc` stores as: `0xaabbcc00`

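
The byte case above can be reproduced with a short Rust sketch. It covers only the `b` type, and `pack_bytes` is a hypothetical helper rather than the compiler's actual function: it places the first byte in the most significant position of each cell and pads the last cell with zero bytes, which matches the `b 0xaa 0xbb 0xcc` example.

```rust
/// Packs bytes into 32-bit cells, four bytes per cell.
/// The first byte of each group lands in the most significant position,
/// and an incomplete final cell is padded with zero bytes.
fn pack_bytes(bytes: &[u8]) -> Vec<u32> {
    bytes
        .chunks(4)
        .map(|chunk| {
            let mut cell = [0u8; 4];
            cell[..chunk.len()].copy_from_slice(chunk);
            u32::from_be_bytes(cell)
        })
        .collect()
}
```

With this layout, `pack_bytes(&[0xaa, 0xbb, 0xcc])` yields the single cell `0xaabbcc00`.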
The following syntax is supported for accessing data:
; Push operations
push $name ; pushes $name address to stack
push [$name] ; pushes $name value to stack
push [$name + 4] ; pushes $name value with offset to stack
push [$name + r0] ; pushes $name value with offset stored in register to stack
; Move operations
move r0 $name ; move address of $name into register r0
move r0 [$name] ; move value of $name into register r0
move r0 [$name + 2] ; move value of $name with offset into register r0
move r0 [$name + r1] ; move value of $name with offset stored in r1 into register r0
move r0 &r1 ; move the value stored at the address held in register r1 into r0

Constants:

- Decimal: `42`
- Hexadecimal: `0x2A`
- Binary: `0b101010`

Memory addresses:

- Use `&[number]` to reference memory addresses: `&0x1010`, `&321`, `&0b101010`
- They point to the value stored at the specified memory address.
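
As an illustration of the three number formats and the `&` address prefix, a tokenizer might parse operands roughly as follows. This is a sketch under stated assumptions: `parse_number` and `parse_address` are hypothetical names, and values are assumed to fit in an unsigned 32-bit cell.

```rust
/// Parses a decimal, `0x` hexadecimal, or `0b` binary literal into a 32-bit value.
/// Returns `None` for malformed input or values that do not fit in 32 bits.
fn parse_number(token: &str) -> Option<u32> {
    if let Some(hex) = token.strip_prefix("0x") {
        u32::from_str_radix(hex, 16).ok()
    } else if let Some(bin) = token.strip_prefix("0b") {
        u32::from_str_radix(bin, 2).ok()
    } else {
        token.parse().ok()
    }
}

/// Parses an address operand such as `&0x1010` and returns the address itself.
/// Returns `None` when the `&` prefix is missing or the number is malformed.
fn parse_address(token: &str) -> Option<u32> {
    parse_number(token.strip_prefix('&')?)
}
```

All three spellings of the same constant (`42`, `0x2A`, `0b101010`) parse to the same value.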
Metas are special commands that control how the VM executes or prepares code.
They start with @ and may have parameters:
- `@ORG x`: Sets the origin address in memory for the following code.
- `@INCLUDE "./file.asm"`: Includes another assembly file into the current file.

Comments:

- Start with `;`

; This is a comment

The VM supports registers:

- General-purpose: `r0`, `r1`, `r2`, `r3`, `r4`, `r5`, `r6`, `r7`
- Special-purpose: `pc` (Program Counter)

| Command | Parameters | Description |
|---|---|---|
| `PUSH 10` | number | Pushes a constant number to the stack |
| `PUSH r0` | register | Pushes value of a register onto the stack |
| `PUSH &10` | address | Pushes value from memory address onto the stack |
| `PUSH $name` | data label | Pushes address of data label to stack |
| `PUSH [$name]` | data label | Pushes value of data label to stack |
| `PUSH [$name + 1]` | data label + offset | Pushes value of data label with offset to stack |
| `PUSH [$name + r0]` | data label + register | Pushes value of data label with offset in register to stack |
| `POP r1` | register | Pops value from stack into a register |
| `POP &32` | address | Pops value from stack into a memory address |
| `ADD` | - | Pops two values, adds them, pushes result |
| `SUB` | - | Pops two values, subtracts, pushes result |
| `MUL` | - | Pops two values, multiplies, pushes result |
| `DIV` | - | Pops two values, divides, pushes result, puts remainder in `r3` |
| `DROP` | - | Drops the top item of the stack |
| `SWAP` | - | Swaps top two items on stack |
| `INC r0` | register | Increases register by 1 |
| `DEC r0` | register | Decreases register by 1 |
| `MOVE r0 10` | register, value | Moves constant into register |
| `MOVE r0 r1` | register, register | Moves value from one register to another |
| `MOVE r0 &12` | register, address | Moves value from memory address to register |
| `MOVE r0 $name` | register, data | Moves address of data label to register |
| `MOVE r0 [$name]` | register, data | Moves value of data label to register |
| `MOVE r0 [$name + 2]` | register, data | Moves value of data label with offset to register |
| `MOVE r0 [$name + r1]` | register, data | Moves value of data label with register offset to register |
| `MOVE r0 &r1` | register, register | Moves value from address in register to register |
| `STORE 1010 32` | address, value | Stores constant into memory |
| `STORE 1010 r3` | address, register | Stores register value into memory |
| `JMP .label` | label | Unconditional jump |
| `JNZ .label` | label | Jump if not zero |
| `JZ .label` | label | Jump if zero |
| `JG .label` | label | Jump if greater |
| `JGE .label` | label | Jump if greater or equal |
| `JL .label` | label | Jump if less |
| `JLE .label` | label | Jump if less or equal |
| `AND` | - | Pops two values, bitwise AND, pushes result |
| `OR` | - | Pops two values, bitwise OR, pushes result |
| `XOR` | - | Pops two values, bitwise XOR, pushes result |
| `NOT` | - | Pops one value, bitwise NOT, pushes result |
| `SHR 10` | value | Pops value, shifts right by constant, pushes result |
| `SHR r3` | register | Pops value, shifts right by register value, pushes result |
| `SHL 10` | value | Pops value, shifts left by constant, pushes result |
| `SHL r3` | register | Pops value, shifts left by register value, pushes result |
| `CALL 32` | address | Calls procedure at address |
| `CALL .label` | label | Calls procedure by label |
| `CALL r0` | register | Calls procedure at address in register |
| `CALL &323` | address | Calls procedure at memory address |
| `SAFECALL 32` | address | Calls procedure at address and preserves machine state (registers and flags) |
| `SAFECALL .label` | label | Calls procedure by label and preserves machine state (registers and flags) |
| `SAFECALL r0` | register | Calls procedure at address in register and preserves machine state (registers and flags) |
| `SAFECALL &323` | address | Calls procedure at memory address and preserves machine state (registers and flags) |
| `RET` | - | Returns from procedure |
| `DUP` | - | Duplicates top stack item |
| `DUP 10` | number | Duplicates top stack item n times |
| `DUP r3` | register | Duplicates top stack item `r3` times |
| `INT 0 2` | module, function | Calls an interrupt (see below) |
| `TERM` | - | Terminates code execution |

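
The difference between `CALL` and `SAFECALL` can be illustrated with a separate call stack that optionally snapshots the machine state. This Rust sketch is an assumption about how such a mechanism might work, not MyVM's actual code: `Machine`, `Frame`, `call` and `ret` are hypothetical names, and restoring the snapshot on `RET` is a guess at what "preserves machine state" means.

```rust
/// Hypothetical call-stack frame for this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Frame {
    return_address: u32,
    /// `Some` only for SAFECALL: registers and flags to restore on RET.
    saved: Option<([u32; 8], u8)>,
}

/// Hypothetical machine state: program counter, r0..r7, flags, call stack.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Machine {
    pc: u32,
    registers: [u32; 8],
    flags: u8,
    call_stack: Vec<Frame>,
}

impl Machine {
    fn new() -> Self {
        Machine { pc: 0, registers: [0; 8], flags: 0, call_stack: Vec::new() }
    }

    /// CALL (`safe == false`) or SAFECALL (`safe == true`) to `target`.
    /// Assumes `pc` already points at the instruction after the call.
    fn call(&mut self, target: u32, safe: bool) {
        let saved = if safe { Some((self.registers, self.flags)) } else { None };
        self.call_stack.push(Frame { return_address: self.pc, saved });
        self.pc = target;
    }

    /// RET: pops a frame and jumps back; returns `false` if the call stack is empty.
    fn ret(&mut self) -> bool {
        match self.call_stack.pop() {
            Some(frame) => {
                if let Some((registers, flags)) = frame.saved {
                    self.registers = registers;
                    self.flags = flags;
                }
                self.pc = frame.return_address;
                true
            }
            None => false,
        }
    }
}
```

In this model, a procedure reached through `call(target, true)` can overwrite any register, and the caller sees its original values again after `ret()`; with `call(target, false)` the changes remain visible.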
- Labels mark code positions and procedures for jumps and calls.
- Start with `.` and contain only alphanumeric characters (no spaces):

.sayhello
; code here
RET

An interrupt is a pre-defined function in a VM module that performs operations outside normal instructions. Interrupts allow modular, system-level functionality like I/O.

INT module_number function_number

Module 0 (IO) functions:

| Function | Description |
|---|---|
| 0 | Pops top of stack and prints it |
| 1 | Pops number n from stack, then pops n items and prints them |
| 2 | Pops a stop value, then continuously pops and prints until reaching stop value |
| 3 | Pops address of string in [data] and prints it until reaching 0 |
| 4 | Pops number from stack and prints it as string |
Interrupts provide a bridge between VM code and system-level functions without complicating the instruction set.
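
The stack behavior of functions 0 to 2 in the table can be sketched as follows. To keep the example independent of how values are rendered, the hypothetical `io_interrupt` function returns the values that would be printed, in print order, instead of printing them. Two details are assumptions of this sketch: the value matching the stop value is consumed but not printed, and function 2 also stops when the stack runs out.

```rust
/// Returns the values module 0 would print, in print order.
/// Returns `None` if a required pop hits an empty stack or the function
/// number is not covered by this sketch.
fn io_interrupt(stack: &mut Vec<u32>, function: u32) -> Option<Vec<u32>> {
    match function {
        // Function 0: pop the top value and print it.
        0 => Some(vec![stack.pop()?]),
        // Function 1: pop a count n, then pop and print n values.
        1 => {
            let n = stack.pop()?;
            let mut out = Vec::new();
            for _ in 0..n {
                out.push(stack.pop()?);
            }
            Some(out)
        }
        // Function 2: pop a stop value, then pop and print until it appears again.
        2 => {
            let stop = stack.pop()?;
            let mut out = Vec::new();
            while let Some(value) = stack.pop() {
                if value == stop {
                    break; // assumption: the matching value is consumed, not printed
                }
                out.push(value);
            }
            Some(out)
        }
        _ => None,
    }
}
```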
@ORG 0x100
.start
CALL .f
TERM
.f
PUSH 13
INT 0 0
RET

[data]
$hello dw "Hello World!" 10 13 0
[text]
.start
push $hello
int 0 3
term

@org 0
.start
push 123
call .split
TERM
.split
push 10
swap
div
push r3
call .printdigit
dup
push 10
swap
sub
drop
jge .split
call .printdigit
call .newline
ret
; push digit as parameter
.printdigit
push 48
add
int 0 0
ret
.newline
push 10
push 13
push 10
int 0 2
ret

This project includes a CLI tool, built with Rust and Clap, that compiles assembly code into binary and executes binary files on the VM.
The CLI provides two main commands: compile and exec.
After cloning the repository, build the CLI with Cargo:
cargo build --release

The compiled executable will be in `target/release/`. You can run it directly:

./myvm [COMMAND] [OPTIONS]

The `compile` command compiles an assembly source file into a binary that can be executed on the VM.
Usage:
./myvm compile -p source.asm -o output.bin

| Option | Description |
|---|---|
| `-p, --path` | Path to the source assembly file |
| `-o, --output` | Path to the output binary file |

How it works:
- Reads the assembly file at the given path
- Compiles it using the VM compiler
- Writes a binary file with:
- Header: origin address (u32)
- Body: compiled bytecode (u32 per instruction)
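
The header-plus-body layout can be sketched as a pair of Rust helpers. `encode` and `decode` are hypothetical names, and little-endian byte order is an assumption of this sketch, since the byte order is not stated here.

```rust
/// Serializes the origin address (header) followed by the bytecode words (body).
/// Assumption: every u32 is written in little-endian byte order.
fn encode(origin: u32, body: &[u32]) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(4 * (body.len() + 1));
    bytes.extend_from_slice(&origin.to_le_bytes());
    for word in body {
        bytes.extend_from_slice(&word.to_le_bytes());
    }
    bytes
}

/// Splits a binary back into `(origin, body)`.
/// Returns `None` if the length is not a non-zero multiple of four bytes.
fn decode(bytes: &[u8]) -> Option<(u32, Vec<u32>)> {
    if bytes.len() < 4 || bytes.len() % 4 != 0 {
        return None;
    }
    let mut words = bytes
        .chunks_exact(4)
        .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]));
    let origin = words.next()?; // first word is the header
    Some((origin, words.collect()))
}
```

A binary produced by `encode(0x100, &[1, 2, 3])` is 16 bytes long: 4 bytes of header and 12 bytes of body.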
The `exec` command executes a compiled binary file on the VM.
Usage:
./myvm exec -p output.bin --cells 2048 --stack 256

Options:

| Option | Description | Default |
|---|---|---|
| `-p, --path` | Path to the binary file | - |
| `-c, --cells` | Number of memory cells in the VM | 2048 |
| `-s, --stack` | Number of cells allocated for the stack | 256 |
| `-d, --dump` | Dumps the VM's memory layout to stdout | false |

How it works:
- Reads the binary file
- Parses the origin address from the header
- Loads the bytecode into VM memory
- Configures the VM memory and stack size
- Sets the program counter to the origin
- Executes instructions sequentially until `TERM` or an error occurs
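
To show how the `--cells` and `--stack` values relate to a stack that sits at the end of memory and grows backwards, here is a minimal Rust model. `Memory` and its fields are hypothetical names, and the exact bookkeeping (a stack pointer that starts one past the last cell) is an assumption of the sketch.

```rust
/// Hypothetical memory model: `cells` 32-bit cells, with the last
/// `stack_cells` of them reserved for the stack.
struct Memory {
    cells: Vec<u32>,
    stack_limit: usize, // lowest index the stack may occupy
    sp: usize,          // index of the current top; `cells.len()` when empty
}

impl Memory {
    fn new(cells: usize, stack_cells: usize) -> Self {
        assert!(stack_cells <= cells, "stack cannot be larger than memory");
        Memory { cells: vec![0; cells], stack_limit: cells - stack_cells, sp: cells }
    }

    /// Pushes toward lower addresses; fails once the reserved area is full.
    fn push(&mut self, value: u32) -> Result<(), &'static str> {
        if self.sp == self.stack_limit {
            return Err("stack overflow");
        }
        self.sp -= 1;
        self.cells[self.sp] = value;
        Ok(())
    }

    /// Pops the most recently pushed value, or `None` if the stack is empty.
    fn pop(&mut self) -> Option<u32> {
        if self.sp == self.cells.len() {
            return None;
        }
        let value = self.cells[self.sp];
        self.sp += 1;
        Some(value)
    }
}
```

With the defaults, `Memory::new(2048, 256)` would reserve cells 1792 to 2047 for the stack, and the first push would land in cell 2047.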
This is a hobby project, but it is fully open to contributions. Here are some key areas to work on:
- Error Handling
  - Detect and handle stack overflows, invalid memory access, and illegal opcodes
  - Provide descriptive runtime error messages
- Heap and Memory Management
  - Implement dynamic memory allocation
  - Add garbage collection or memory reuse strategies
- IO Interrupt Module
  - Expand module 0 functionality
  - Support reading input, printing formatted output, and file operations
- Network Interrupt Module
  - Add network communication interrupts for sending/receiving data
  - Enable TCP/UDP support for simple network programs
- More unit tests
  - Write unit tests for all modules and functions
- Code docs
  - Write better code docs
- Create a REPL
  - REPL in CLI