talagrand/parpl
parpl

parpl: Parse Programming Languages

NOT RELEASED YET

Note: parpl is not released to crates.io yet, but will be soon!

A foundation for building interpreters, compilers, and code analysis tools

Parpl provides standards-conformant parsers for programming languages, designed to integrate into your project via trait abstractions. You bring your own AST representation; parpl handles the parsing.

Design

Parpl is a library that provides:

  • Standards-conformant parsing that strictly follows language specifications
  • Trait-based AST construction so you define your own node types
  • Zero opinions about your memory management (arena, Rc, Box—your choice)

Starter implementations are included and designed to be forked. Use them as-is for quick integration, or adapt the code to your project's needs.

Supported Parsers

| Parser | Feature Flag | Specification                       |
|--------|--------------|-------------------------------------|
| CEL    | cel          | Common Expression Language          |
| Scheme | scheme       | R7RS Small external representations |

Installation

[dependencies]
parpl = { version = "0.1", features = ["cel", "scheme"] }

# Include starter implementations (use as-is or fork)
parpl = { version = "0.1", features = ["cel", "scheme", "reference"] }

Core Concepts

Bring Your Own AST

The key abstraction is the writer trait. You implement it to construct your AST nodes as the parser encounters each construct:

// For CEL: implement CelWriter
impl CelWriter for MyInterpreter {
    type Expr = MyExpr;
    type StringId = MySymbolId;
    // ...
    
    fn binary(&mut self, op: BinaryOp, left: Self::Expr, right: Self::Expr, span: Span) 
        -> Result<Self::Expr, Self::Error> 
    {
        // Build your own binary expression node
        Ok(MyExpr::Binary { op, left, right })
    }
}

// For Scheme: implement DatumWriter
impl DatumWriter for MySchemeReader {
    type Output = MyDatum;
    type StringId = MySymbolId;
    // ...
    
    // Trait bounds on `I` are omitted in this excerpt; `collect()` requires `I` to be an iterator.
    fn list<I>(&mut self, items: I, span: Span) -> Result<Self::Output, Self::Error> {
        // Build your own list representation
        Ok(MyDatum::List(items.collect()))
    }
}
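
To make the pattern concrete without depending on parpl itself, here is a minimal, self-contained sketch of the same idea. Every name in it (`ExprWriter`, `TreeWriter`, `EvalWriter`, `parse_prefix`) is hypothetical and invented for illustration; none of it is parpl's API. It shows one toy parser driving two different writers: one builds a boxed tree, the other evaluates on the fly and never allocates a node.

```rust
// Hypothetical sketch of the writer-trait pattern; not parpl's API.

/// Binary operators understood by the toy parser.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Op {
    Add,
    Mul,
}

/// The parser calls these methods; the implementor chooses the node type.
pub trait ExprWriter {
    type Expr;
    fn int(&mut self, value: i64) -> Self::Expr;
    fn binary(&mut self, op: Op, left: Self::Expr, right: Self::Expr) -> Self::Expr;
}

/// Writer 1: builds a boxed tree.
#[derive(Debug, PartialEq)]
pub enum Ast {
    Int(i64),
    Binary(Op, Box<Ast>, Box<Ast>),
}

pub struct TreeWriter;

impl ExprWriter for TreeWriter {
    type Expr = Ast;
    fn int(&mut self, value: i64) -> Ast {
        Ast::Int(value)
    }
    fn binary(&mut self, op: Op, left: Ast, right: Ast) -> Ast {
        Ast::Binary(op, Box::new(left), Box::new(right))
    }
}

/// Writer 2: builds no tree at all and evaluates while parsing.
pub struct EvalWriter;

impl ExprWriter for EvalWriter {
    type Expr = i64;
    fn int(&mut self, value: i64) -> i64 {
        value
    }
    fn binary(&mut self, op: Op, left: i64, right: i64) -> i64 {
        match op {
            Op::Add => left + right,
            Op::Mul => left * right,
        }
    }
}

/// Toy parser for whitespace-separated prefix notation, e.g. "+ 1 * 2 3".
/// Returns None on malformed input or trailing tokens.
pub fn parse_prefix<W: ExprWriter>(src: &str, writer: &mut W) -> Option<W::Expr> {
    let mut tokens = src.split_whitespace();
    let expr = parse_expr(&mut tokens, writer)?;
    if tokens.next().is_some() {
        return None;
    }
    Some(expr)
}

fn parse_expr<'a, W: ExprWriter>(
    tokens: &mut impl Iterator<Item = &'a str>,
    writer: &mut W,
) -> Option<W::Expr> {
    let tok = tokens.next()?;
    let op = match tok {
        "+" => Op::Add,
        "*" => Op::Mul,
        // Anything else must be an integer literal.
        _ => return Some(writer.int(tok.parse::<i64>().ok()?)),
    };
    let left = parse_expr(tokens, writer)?;
    let right = parse_expr(tokens, writer)?;
    Some(writer.binary(op, left, right))
}
```

Calling `parse_prefix("+ 1 * 2 3", &mut EvalWriter)` and `parse_prefix("+ 1 * 2 3", &mut TreeWriter)` runs the same parsing code; only the associated `Expr` type, and therefore the allocation strategy, differs.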

Using the Parsers

// CEL
let parser = parpl::cel::CelParser::default();
let ast = parser.parse("user.age >= 18", &mut my_writer)?;

// Scheme
let parser = parpl::scheme::SchemeParser::default();
let datum = parser.parse("(+ 1 2 3)", &mut my_writer)?;

Starter Implementations

The reference feature includes working implementations designed to be forked.

// Example using reference implementation (requires "reference" feature)
use bumpalo::Bump; // assumed source of `Bump`: the bumpalo arena crate
use parpl::StringPool;
use parpl::scheme::reference::arena::{ArenaDatumWriter, Datum};

let arena = Bump::new();
let mut interner = StringPool::new();
let mut writer = ArenaDatumWriter::new(&arena, &mut interner);
let parser = parpl::scheme::SchemeParser::default();
let datum = parser.parse("(lambda (x) x)", &mut writer)?;

Features

  • Specification Conformant: Strictly follows the CEL spec and the R7RS Scheme standard
  • Zero-Copy Parsing: Lexers borrow from input; your AST controls allocation
  • Memory Safe: Configurable depth limits guard against stack overflow from deeply nested input
  • Rich Error Reporting: Source spans on all errors, REPL-friendly detection of incomplete input
  • String Interning: Shared infrastructure for efficient symbol handling
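
For readers new to string interning, the idea can be sketched in a few lines. This is not the implementation behind parpl's `StringPool`; `Interner` and `SymbolId` are hypothetical names, and the sketch favors clarity over efficiency (it stores each string twice). Each distinct string is mapped to a small integer handle, so symbol comparison becomes an integer comparison.

```rust
use std::collections::HashMap;

/// Hypothetical symbol handle; not parpl's `StringId` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct SymbolId(u32);

/// Maps each distinct string to a stable `SymbolId`.
#[derive(Default)]
pub struct Interner {
    ids: HashMap<String, SymbolId>,
    names: Vec<String>,
}

impl Interner {
    /// Returns the existing id for `name`, or assigns the next free one.
    pub fn intern(&mut self, name: &str) -> SymbolId {
        if let Some(&id) = self.ids.get(name) {
            return id;
        }
        let id = SymbolId(self.names.len() as u32);
        self.names.push(name.to_owned());
        self.ids.insert(name.to_owned(), id);
        id
    }

    /// Looks up the text for an id produced by this interner.
    /// Panics if `id` came from a different interner and is out of range.
    pub fn resolve(&self, id: SymbolId) -> &str {
        &self.names[id.0 as usize]
    }
}
```

A writer that stores `SymbolId` in its nodes instead of `String` keeps nodes small and `Copy`-friendly, which is one reason a writer trait might expose an associated string-id type.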

Safety Limits

Both parsers enforce configurable depth limits to prevent stack overflow:

// CEL: Two-phase depth protection
let parser = parpl::cel::Builder::default()
    .max_parse_depth(128)  // Heuristic pre-validation
    .max_ast_depth(24)     // Precise AST recursion limit
    .max_call_limit(10_000_000)  // DoS protection
    .build();

// Scheme: Single depth limit
let parser = parpl::scheme::Builder::default()
    .max_depth(64)
    .build();
let datum = parser.parse(source, &mut writer)?;
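
As an illustration of why a depth limit helps, here is a minimal sketch of a linear pre-scan that rejects over-nested input before any recursive parsing starts. It is not parpl's implementation; `check_depth` and `NestError` are hypothetical names. The scan is only a heuristic: it counts every parenthesis, including ones that a real lexer would treat as part of a string literal or comment.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum NestError {
    /// Nesting went past the configured limit.
    TooDeep { limit: usize },
    /// A ')' had no matching '(' or a '(' was never closed.
    Unbalanced,
}

/// Returns the deepest parenthesis nesting in `src`, or an error as soon as
/// the depth exceeds `max_depth`. Runs in one pass with no recursion, so the
/// scan itself cannot overflow the stack.
pub fn check_depth(src: &str, max_depth: usize) -> Result<usize, NestError> {
    let mut depth = 0usize;
    let mut deepest = 0usize;
    for ch in src.chars() {
        match ch {
            '(' => {
                depth += 1;
                if depth > max_depth {
                    return Err(NestError::TooDeep { limit: max_depth });
                }
                deepest = deepest.max(depth);
            }
            ')' => {
                depth = depth.checked_sub(1).ok_or(NestError::Unbalanced)?;
            }
            _ => {}
        }
    }
    if depth == 0 {
        Ok(deepest)
    } else {
        Err(NestError::Unbalanced)
    }
}
```

With a limit of 64, an input of 100,000 opening parentheses is rejected after the 65th character instead of driving a recursive parser 100,000 frames deep.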

Error Handling

All errors include source spans for precise error reporting:

pub enum Error {
    Incomplete,           // Input needs more data (REPL-friendly)
    IncompleteToken,      // Input ends mid-token
    Syntax { span, ... }, // Syntax error with location
    Unsupported { ... },  // Feature not supported by writer
    LimitExceeded { ... }, // Safety limit exceeded
    WriterError { ... },  // Error from your writer implementation
}
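
The value of a distinct "incomplete" error is easiest to see in a REPL, where it means "keep reading" rather than "report a failure". The sketch below shows that control flow under loud assumptions: `ReadError`, `read`, and `LineBuffer` are hypothetical stand-ins, the stand-in parser only checks parenthesis balance, and none of this is parpl's API.

```rust
/// Hypothetical, reduced error type; the enum shown above has more variants.
#[derive(Debug, PartialEq)]
pub enum ReadError {
    /// Input ended before the form was closed.
    Incomplete,
    /// A real error at the given byte offset.
    Syntax { offset: usize },
}

/// Stand-in "parser": accepts text whose parentheses balance.
pub fn read(src: &str) -> Result<(), ReadError> {
    let mut depth = 0usize;
    for (offset, ch) in src.char_indices() {
        match ch {
            '(' => depth += 1,
            ')' => depth = depth.checked_sub(1).ok_or(ReadError::Syntax { offset })?,
            _ => {}
        }
    }
    if depth == 0 {
        Ok(())
    } else {
        Err(ReadError::Incomplete)
    }
}

/// Accumulates REPL lines until they form a complete input.
#[derive(Default)]
pub struct LineBuffer {
    pending: String,
}

impl LineBuffer {
    /// Feeds one line of input.
    /// - `Ok(Some(text))`: a complete form is ready; the buffer is reset.
    /// - `Ok(None)`: input is incomplete; prompt for a continuation line.
    /// - `Err(_)`: a real error; the buffered text is discarded.
    pub fn feed(&mut self, line: &str) -> Result<Option<String>, ReadError> {
        self.pending.push_str(line);
        self.pending.push('\n');
        match read(&self.pending) {
            Ok(()) => Ok(Some(std::mem::take(&mut self.pending))),
            Err(ReadError::Incomplete) => Ok(None),
            Err(err) => {
                self.pending.clear();
                Err(err)
            }
        }
    }
}
```

A REPL loop would call `feed` once per line, show a continuation prompt on `Ok(None)`, evaluate on `Ok(Some(_))`, and print the error otherwise.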

Feature Flags

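| Feature   | Description                                | Default |
|-----------|--------------------------------------------|---------|
| cel       | CEL parser                                 | No      |
| scheme    | Scheme parser                              | No      |
| reference | Starter implementations (use as-is or fork) | No      |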

Parpl ships with no default features. Enable the parser(s) you need.

Current Status

Implemented

  • Complete CEL lexer and parser (pest-based)
  • Complete R7RS Scheme lexer (winnow-based) and reader
  • Reference implementations demonstrating trait usage
  • Configurable safety limits
  • Comprehensive error types with spans
  • String interning infrastructure

Future Plans

  • R7RS define-syntax macro expander
  • Additional language parsers

Related Projects

  • rulesxp - Expression evaluator for Scheme and JsonLogic

License

MIT License - see LICENSE file for details.