ScarDB: A High-Performance RDBMS Engine from Scratch in Rust

ScarDB is a lightweight Relational Database Management System (RDBMS) implemented in Rust. It features a custom storage engine, a schema-driven binary serialization format, B-Tree indexing for optimized lookups, and a relational execution engine capable of performing Nested Loop Joins.

This project demonstrates the implementation of low-level database internals, moving from raw byte manipulation on disk to a high-level SQL interface.

Live web SQL interface

rdbms.oscardev.site (SQL Interface)

Demonstration

screenrecording-2026-01-14_19-58-37.mp4

Key Features

Custom Pager-Based Storage: Manages data in 4KB pages to optimize disk I/O.
Fixed-Length Binary Format: Uses a schema-driven binary format for storage, ensuring $O(1)$ row access via offsets.
B-Tree Indexing: Implements primary key constraints and optimized point-lookups using a memory-resident B-Tree index.
Query Optimizer: A built-in planner that automatically switches from a "Full Table Scan" to an "Index Lookup" when filtering by Primary Key.
Relational Joins: Supports INNER JOIN operations using a Nested Loop Join algorithm.
Full CRUD Support: Supports CREATE, INSERT, SELECT, UPDATE, DELETE, and DROP TABLE.
Advanced SQL Features: Includes AUTOINCREMENT for primary keys and WHERE clause filtering.
Interactive REPL: A full-featured CLI with command history.
Web Integration: A REST API demo showing ScarDB used as a library to power a web application.

Architecture

1. The Storage Layer (The Pager & Bitmask)

Data is stored in a dedicated .db file for each table. To manage space efficiently:

Pages: The file is divided into 4096-byte blocks.
Bitmask Management: Each page contains a 64-byte header with a bitmask. This allows ScarDB to track occupied vs. empty slots, ensuring that when a row is deleted, the space is immediately reclaimed for the next INSERT.
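
The slot bookkeeping described above can be sketched as follows. This is a minimal illustration, not ScarDB's actual code: the `Page` type, its method names, and the choice to store the bitmask at the very start of the 64-byte header (one bit per slot, least significant bit first) are all assumptions made for the example.

```rust
const PAGE_SIZE: usize = 4096;
const HEADER_SIZE: usize = 64;

/// One 4096-byte page: a 64-byte header holding the slot bitmask,
/// followed by fixed-size row slots. (Hypothetical layout.)
struct Page {
    bytes: [u8; PAGE_SIZE],
    row_size: usize,
}

impl Page {
    fn new(row_size: usize) -> Self {
        assert!(row_size > 0 && row_size <= PAGE_SIZE - HEADER_SIZE);
        Page { bytes: [0u8; PAGE_SIZE], row_size }
    }

    /// Slot count is bounded by the payload area and by the number of
    /// bits the header can hold (64 bytes * 8 bits).
    fn capacity(&self) -> usize {
        ((PAGE_SIZE - HEADER_SIZE) / self.row_size).min(HEADER_SIZE * 8)
    }

    /// True if the slot's bit is set in the header bitmask.
    fn is_occupied(&self, slot: usize) -> bool {
        self.bytes[slot / 8] & (1u8 << (slot % 8)) != 0
    }

    /// Writes `row` into the first free slot and returns the slot number,
    /// or `None` when every slot is occupied.
    fn insert(&mut self, row: &[u8]) -> Option<usize> {
        assert_eq!(row.len(), self.row_size);
        let slot = (0..self.capacity()).find(|&s| !self.is_occupied(s))?;
        let start = HEADER_SIZE + slot * self.row_size;
        self.bytes[start..start + self.row_size].copy_from_slice(row);
        self.bytes[slot / 8] |= 1u8 << (slot % 8);
        Some(slot)
    }

    /// Clears the slot's bit. The old bytes stay in place and are simply
    /// overwritten by the next insert that reuses the slot.
    fn delete(&mut self, slot: usize) {
        self.bytes[slot / 8] &= !(1u8 << (slot % 8));
    }
}
```

Because `delete` only flips a bit, the freed slot is the first one `insert` finds on its next call, which matches the "immediately reclaimed" behavior described above.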

2. The Catalog (Metadata Persistence)

The catalog.json file acts as the database's "brain." It persists table schemas (column names, types, primary key flags) and sequences for AUTOINCREMENT counters.
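
As a rough picture of what the catalog holds, here is an in-memory sketch. The type and field names (`Catalog`, `Column`, `sequences`) are hypothetical, the counter starting at 1 is an assumption, and JSON persistence via serde is deliberately left out so the example depends only on the standard library.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum ColumnType {
    Int,
    Varchar(usize), // maximum length in bytes
    Boolean,
}

#[derive(Debug, Clone)]
struct Column {
    name: String,
    ty: ColumnType,
    primary_key: bool,
}

/// Table schemas plus one AUTOINCREMENT counter per table.
#[derive(Debug, Default)]
struct Catalog {
    tables: HashMap<String, Vec<Column>>,
    sequences: HashMap<String, i64>,
}

impl Catalog {
    /// Registers a schema and starts its sequence at zero.
    fn create_table(&mut self, name: &str, columns: Vec<Column>) -> Result<(), String> {
        if self.tables.contains_key(name) {
            return Err(format!("table {} already exists", name));
        }
        self.tables.insert(name.to_string(), columns);
        self.sequences.insert(name.to_string(), 0);
        Ok(())
    }

    /// Advances and returns the table's counter, or `None` for an unknown table.
    fn next_id(&mut self, table: &str) -> Option<i64> {
        let seq = self.sequences.get_mut(table)?;
        *seq += 1;
        Some(*seq)
    }

    /// Removes both the schema and its sequence; returns whether the table existed.
    fn drop_table(&mut self, name: &str) -> bool {
        self.sequences.remove(name);
        self.tables.remove(name).is_some()
    }
}
```

Keeping the sequence next to the schema means an id handed out before a restart is not reused afterwards, provided the catalog is written back to disk after each increment.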

3. The Execution Engine

The engine transforms SQL AST (Abstract Syntax Tree) into logical commands:

Index Optimization: If a query filters on a Primary Key (e.g., WHERE id = 5), the engine bypasses the file scan and probes the B-Tree for the exact page and slot.
Joins: Joins are handled by a Nested Loop Join. The engine iterates through the "Outer" table and matches records in the "Inner" table based on the join predicate.
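
The two behaviors above can be illustrated with a small sketch. The `Plan` enum, `choose_plan`, and the `User`/`Order` structs are hypothetical names chosen for the example; the real engine works on the AST produced by sqlparser-rs rather than on plain strings and typed structs.

```rust
#[derive(Debug, Clone, PartialEq)]
struct User {
    id: i64,
    name: String,
}

#[derive(Debug, Clone, PartialEq)]
struct Order {
    id: i64,
    user_id: i64,
}

/// The two access paths described in the text.
#[derive(Debug, PartialEq)]
enum Plan {
    IndexLookup,
    FullTableScan,
}

/// Chooses the index only when the equality filter targets the primary key column.
fn choose_plan(filter_column: &str, primary_key: &str) -> Plan {
    if filter_column == primary_key {
        Plan::IndexLookup
    } else {
        Plan::FullTableScan
    }
}

/// Nested loop join on `users.id = orders.user_id`: for every outer row,
/// scan the whole inner table and keep the pairs that satisfy the predicate.
fn nested_loop_join<'a>(outer: &'a [User], inner: &'a [Order]) -> Vec<(&'a User, &'a Order)> {
    let mut matches = Vec::new();
    for user in outer {
        for order in inner {
            if user.id == order.user_id {
                matches.push((user, order));
            }
        }
    }
    matches
}
```

The join visits every outer/inner pair, so its cost grows with the product of the two table sizes. That is acceptable for small tables and keeps the executor simple.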

Technical Decisions & Trade-offs

Why Rust? I chose Rust for its zero-cost abstractions and memory safety. It allowed for safe raw byte manipulation when serializing data for disk storage.
Why Fixed-Length Records? By requiring a maximum length for strings (VARCHAR), we ensure that every row in a table is the same size. This allows for fast in-place updates and predictable offset math.
B-Tree Indexing: I chose a memory-resident B-Tree (reconstructed on startup) to ensure $O(\log N)$ lookup performance while maintaining code simplicity for this challenge.
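
To make the fixed-length and index trade-offs concrete, here is a sketch for the `users` table from the syntax examples. The byte layout is an assumption (4-byte little-endian `id`, one length byte, a 20-byte zero-padded name, one byte for `active`), and the standard library's `BTreeMap` stands in for ScarDB's own B-Tree.

```rust
use std::collections::BTreeMap;

const HEADER_SIZE: usize = 64;
const NAME_MAX: usize = 20; // VARCHAR(20)
const ROW_SIZE: usize = 4 + 1 + NAME_MAX + 1; // id + name length + padded name + active

#[derive(Debug, PartialEq)]
struct UserRow {
    id: i32,
    name: String,
    active: bool,
}

/// Encodes a row into exactly ROW_SIZE bytes; rejects names longer than the VARCHAR limit.
fn encode(row: &UserRow) -> Option<[u8; ROW_SIZE]> {
    let name = row.name.as_bytes();
    if name.len() > NAME_MAX {
        return None;
    }
    let mut buf = [0u8; ROW_SIZE];
    buf[0..4].copy_from_slice(&row.id.to_le_bytes());
    buf[4] = name.len() as u8;
    buf[5..5 + name.len()].copy_from_slice(name);
    buf[5 + NAME_MAX] = row.active as u8;
    Some(buf)
}

/// Decodes a row; every field sits at a fixed position, so no scanning is needed.
fn decode(buf: &[u8; ROW_SIZE]) -> UserRow {
    let id = i32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]);
    let len = (buf[4] as usize).min(NAME_MAX);
    let name = String::from_utf8_lossy(&buf[5..5 + len]).into_owned();
    UserRow { id, name, active: buf[5 + NAME_MAX] != 0 }
}

/// Byte offset of a slot within its page: constant-time arithmetic.
fn slot_offset(slot: usize) -> usize {
    HEADER_SIZE + slot * ROW_SIZE
}

/// Rebuilds the in-memory index at startup from (page, slot, row bytes) triples,
/// mapping each primary key to its location on disk.
fn rebuild_index(rows: &[(usize, usize, [u8; ROW_SIZE])]) -> BTreeMap<i32, (usize, usize)> {
    rows.iter()
        .map(|(page, slot, bytes)| (decode(bytes).id, (*page, *slot)))
        .collect()
}
```

An in-place update under this layout is just `encode` followed by a write at `slot_offset(slot)`. The cost is wasted space for short strings and a full scan at startup to rebuild the index.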

Supported SQL Syntax

-- Table Creation
CREATE TABLE users (id INT PRIMARY KEY AUTOINCREMENT, name VARCHAR(20), active BOOLEAN);

-- Data Manipulation
INSERT INTO users (name, active) VALUES ('Oscar', true);
UPDATE users SET active = false WHERE name = 'Oscar';
DELETE FROM users WHERE id = 1;

-- Querying & Joining
SELECT * FROM users WHERE active = true;
SELECT * FROM users JOIN orders ON users.id = orders.user_id;

-- Cleanup
DROP TABLE users;

How to Run

Prerequisites

Rust (latest stable)

1. Interactive CLI (REPL)

cargo run --package cli

2. Web App Demo

cargo run --package web

Then, access the API at http://localhost:3000.

3. Running Tests

cargo test

Credits & Acknowledgments

sqlparser-rs: Used for parsing SQL strings into an AST.
rustyline: Used for the interactive REPL interface.
axum/tokio: Used for the web demonstration.
serde: Used for catalog serialization.

Word of Reflection

This challenge was a deep dive into the "magic" of databases. Implementing the pager and the bitmask was particularly enlightening, as it forced me to think about how data actually lives on a physical platter rather than just in memory. Even if not every SQL edge case is handled, the core engine demonstrates a robust understanding of relational theory and systems programming.

oscarmuya/rust-rdbms
