This is the term project for the CS F363 Compiler Construction course at BITS Pilani. The project implements a compiler front-end for a custom language. It consists of a lexical analyzer (lexer) and a syntax analyzer (parser) using an LL(1) parsing approach.
- Purpose: The lexer converts source code into tokens.
- Key Features:
  - DFA Implementation: Uses a deterministic finite automaton with twin buffers for efficient file I/O.
  - Tokenization: Recognizes tokens such as identifiers, numbers (integer and real), keywords, and operators.
  - Error Handling: Reports invalid characters or patterns with line numbers.
  - Symbol Table: Manages reserved keywords and identifiers.
- Main Files:
  - lexer.c, lexerDef.h
  - helper_function.c, helper_function.h
  - symbol_table.c, symbol_table.h, symbol_tableDef.h
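The twin-buffer idea listed under Key Features can be sketched roughly as follows. This is a minimal illustration, not the code in lexer.c: the names `TwinBuffer`, `tb_init`, `tb_next` and the tiny `HALF_SIZE` are all hypothetical, and retraction (stepping back a character for the DFA) is left out to keep it short.

```c
#include <assert.h>
#include <stdio.h>

#define HALF_SIZE 8  /* deliberately tiny so refills happen often (assumed value) */

/* Two fixed-size halves: while one half is being scanned, the other can be
   refilled, so the file is read in blocks rather than one char at a time. */
typedef struct {
    FILE  *fp;
    char   buf[2][HALF_SIZE];
    size_t len[2];   /* number of valid bytes in each half */
    int    active;   /* index of the half currently being scanned */
    size_t pos;      /* next unread position inside the active half */
} TwinBuffer;

static void tb_fill(TwinBuffer *tb, int half) {
    tb->len[half] = fread(tb->buf[half], 1, HALF_SIZE, tb->fp);
}

void tb_init(TwinBuffer *tb, FILE *fp) {
    assert(tb != NULL && fp != NULL);
    tb->fp = fp;
    tb->active = 0;
    tb->pos = 0;
    tb->len[1] = 0;
    tb_fill(tb, 0);
}

/* Returns the next character of the input, or EOF when it is exhausted. */
int tb_next(TwinBuffer *tb) {
    if (tb->pos == tb->len[tb->active]) {
        /* A short read means the file ended inside this half. */
        if (tb->len[tb->active] < HALF_SIZE) return EOF;
        /* Otherwise switch halves and load the next block. */
        int other = 1 - tb->active;
        tb_fill(tb, other);
        tb->active = other;
        tb->pos = 0;
        if (tb->len[other] == 0) return EOF;
    }
    return (unsigned char)tb->buf[tb->active][tb->pos++];
}
```

A caller would open the source file, call `tb_init` once, and then pull characters with `tb_next` until it returns EOF. Because the previously scanned half is left untouched until the next switch, a real lexer can still recover a lexeme that straddles the boundary.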
- Purpose: The parser checks syntactic correctness and builds a parse tree.
- Key Features:
  - Grammar Reading: Reads grammar rules from files (e.g., final_grammar_index_clean.txt).
  - FIRST and FOLLOW Sets: Computes these sets for all non-terminals.
  - Parse Table Construction: Builds an LL(1) parse table used to guide the parser.
  - Predictive Parsing: Uses a stack to match tokens against grammar rules and constructs a parse tree.
  - Error Reporting: Reports syntax errors with detailed messages.
- Main Files:
  - parser.c, parser.h, parserDef.h
  - stack.c, stack.h, stackDef.h
  - parseTree.c, parseTree.h, parseTreeDef.h
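To make the FIRST-set step more concrete, here is a small sketch of the usual fixed-point computation on a toy expression grammar. The grammar, the bitmask representation, and the name `compute_first` are assumptions made for this example; the project reads its grammar from a file and its actual data structures may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Toy grammar (assumed for illustration only):
     E  -> T E'
     E' -> + T E' | epsilon
     T  -> id | ( E )                                   */
enum { T_ID, T_PLUS, T_LPAREN, T_RPAREN, T_EPS, NUM_TERMS,
       N_E = NUM_TERMS, N_EP, N_T, NUM_SYMS };

#define MAX_RHS 3
#define BIT(t) (1u << (t))

typedef struct { int lhs; int rhs[MAX_RHS]; int len; } Rule;

static const Rule RULES[] = {
    { N_E,  { N_T, N_EP },               2 },
    { N_EP, { T_PLUS, N_T, N_EP },       3 },
    { N_EP, { T_EPS },                   1 },
    { N_T,  { T_ID },                    1 },
    { N_T,  { T_LPAREN, N_E, T_RPAREN }, 3 },
};
#define NUM_RULES (sizeof RULES / sizeof RULES[0])

/* Fills first[s] with a bitmask of terminals (bit T_EPS marks nullable). */
void compute_first(unsigned first[NUM_SYMS]) {
    assert(first != NULL);
    /* FIRST of a terminal is the terminal itself; non-terminals start empty. */
    for (int s = 0; s < NUM_SYMS; s++)
        first[s] = (s < NUM_TERMS) ? BIT(s) : 0u;

    int changed = 1;
    while (changed) {                 /* repeat until no set grows */
        changed = 0;
        for (size_t r = 0; r < NUM_RULES; r++) {
            const Rule *rule = &RULES[r];
            unsigned add = 0u;
            int all_nullable = 1;
            /* Walk the right-hand side while every symbol so far is nullable. */
            for (int i = 0; i < rule->len && all_nullable; i++) {
                unsigned f = first[rule->rhs[i]];
                add |= f & ~BIT(T_EPS);
                if (!(f & BIT(T_EPS))) all_nullable = 0;
            }
            if (all_nullable) add |= BIT(T_EPS);
            if ((first[rule->lhs] | add) != first[rule->lhs]) {
                first[rule->lhs] |= add;
                changed = 1;
            }
        }
    }
}
```

For this grammar the loop settles after a few passes, with FIRST(E) and FIRST(T) both containing `id` and `(`, and FIRST(E') containing `+` and epsilon. FOLLOW sets can be computed with a similar fixed-point loop that consumes these FIRST sets.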
- Purpose: The driver provides a menu-based interface to run the compiler.
- Features:
  - Remove comments from the source code.
  - Print the list of tokens generated by the lexer.
  - Parse the source code and print the parse tree.
  - Measure execution time for parsing.
- Main File:
  - driver.c
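Measuring the execution time of a phase can be done with the standard `clock()` function. The sketch below is one plausible way to do it, not necessarily what driver.c does: `time_phase_seconds`, `phase_fn` and the dummy `busy_phase` workload are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical signature for "a phase to be timed", e.g. lexing plus parsing. */
typedef void (*phase_fn)(void *ctx);

/* Runs the phase once and returns the CPU time it consumed, in seconds. */
double time_phase_seconds(phase_fn phase, void *ctx) {
    assert(phase != NULL);
    clock_t start = clock();
    phase(ctx);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Dummy workload standing in for the parser: sums 0..999 into *ctx. */
void busy_phase(void *ctx) {
    unsigned long *sum = (unsigned long *)ctx;
    for (unsigned long i = 0; i < 1000UL; i++)
        *sum += i;
}
```

Note that `clock()` reports processor time rather than wall-clock time, so time spent waiting on I/O is not necessarily included in the result.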
- Compile the Main Compiler:
  In the project root directory, run the following command, which compiles the project and then runs the compiler:
  make run f1="file_to_be_compiled.txt" f2="output_file_for_parse_tree.txt"
The main executable requires two arguments:
./compiler <source_file> <parse_tree_output_file>
For example:
./compiler testcases/lexer_test_cases/t2.txt output_parse_tree.txt
A menu will appear with options:
- 0: Exit.
- 1: Remove comments (shows the cleaned file).
- 2: Print token list.
- 3: Parse the source code and generate a parse tree (saved to the output file).
- 4: Display execution time for parsing.
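Option 1 amounts to a single pass over the source text. The sketch below shows one way such a pass could look. It rests on a loud assumption: this README does not state the language's comment syntax, so the example treats `%` as starting a comment that runs to the end of the line. The function name `strip_comments` is also hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define COMMENT_CHAR '%'  /* ASSUMPTION: the real comment marker may differ */

/* Copies src to dst without comments and returns the number of characters
   written (excluding the terminator). Newlines are kept so that line numbers
   in later error messages still match the original file.
   dst must have room for at least as many bytes as src, terminator included. */
size_t strip_comments(const char *src, char *dst) {
    assert(src != NULL && dst != NULL);
    size_t n = 0;
    int in_comment = 0;
    for (; *src != '\0'; src++) {
        if (*src == '\n') {
            in_comment = 0;       /* a comment ends with its line */
            dst[n++] = '\n';
        } else if (in_comment) {
            continue;             /* skip comment text */
        } else if (*src == COMMENT_CHAR) {
            in_comment = 1;       /* comment starts here */
        } else {
            dst[n++] = *src;
        }
    }
    dst[n] = '\0';
    return n;
}
```

Working on in-memory strings keeps the example short. A file-based version would apply the same state machine to characters as they are read.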
Test cases are provided in the testcases directory (for example, testcases/lexer_test_cases/t2.txt).
This project implements a complete compiler front-end for a custom language, integrating a DFA-based lexer and an LL(1) predictive parser. The design is modular with clear separation of concerns, and it is accompanied by test cases and a dedicated test harness (test.c) for verifying grammar analysis. With further testing and refinement, this project will serve as a solid foundation for subsequent phases such as semantic analysis and code generation.
Known issue: error recovery in the parser is not yet semantically correct (to be fixed).
- Code Integration:
  The lexer reads the source file using twin buffers, tokenizes it using a DFA in lexer.c, and stores tokens in a symbol table.
  The parser reads the grammar, computes FIRST and FOLLOW sets (using ComputeFirstAndFollowSets()), builds an LL(1) parse table, and parses the tokens into a parse tree using a stack.
  The driver (driver.c) ties these components together and offers a menu for testing each phase.
- Building and Running:
  The Makefile builds the main executable (compiler).
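The stack-driven parsing step described under Code Integration can be illustrated with a small table-driven recognizer. This is a sketch under several assumptions: it uses a toy expression grammar with a hand-written LL(1) table instead of the project's grammar file, it only accepts or rejects the input rather than building a parse tree, and names such as `ll1_parse` and `TABLE` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy grammar (assumed): E -> T E' ; E' -> + T E' | epsilon ; T -> id | ( E ) */
enum { T_ID, T_PLUS, T_LPAREN, T_RPAREN, T_END, NUM_TERMS,
       N_E = NUM_TERMS, N_EP, N_T, NUM_SYMS };
#define NUM_NONTERMS (NUM_SYMS - NUM_TERMS)
#define MAX_RHS 3
#define STACK_MAX 64

typedef struct { int rhs[MAX_RHS]; int len; } Rule;

static const Rule RULES[] = {
    /* 0: E  -> T E'    */ { { N_T, N_EP },               2 },
    /* 1: E' -> + T E'  */ { { T_PLUS, N_T, N_EP },       3 },
    /* 2: E' -> epsilon */ { { 0 },                       0 },
    /* 3: T  -> id      */ { { T_ID },                    1 },
    /* 4: T  -> ( E )   */ { { T_LPAREN, N_E, T_RPAREN }, 3 },
};

/* TABLE[non-terminal][lookahead] = rule index, or -1 for a syntax error. */
static const int TABLE[NUM_NONTERMS][NUM_TERMS] = {
    /*         id   +   (   )   $  */
    /* E  */ {  0, -1,  0, -1, -1 },
    /* E' */ { -1,  1, -1,  2,  2 },
    /* T  */ {  3, -1,  4, -1, -1 },
};

/* Returns 1 if the T_END-terminated token array is accepted, else 0. */
int ll1_parse(const int *tokens) {
    assert(tokens != NULL);
    int stack[STACK_MAX];
    int top = 0;
    int pos = 0;
    stack[top++] = T_END;   /* bottom-of-stack marker */
    stack[top++] = N_E;     /* start symbol */

    while (top > 0) {
        int sym = stack[--top];
        int look = tokens[pos];
        if (look < 0 || look >= NUM_TERMS) return 0;   /* not a valid token */
        if (sym < NUM_TERMS) {
            /* Terminal on the stack: it must match the lookahead. */
            if (sym != look) return 0;
            if (sym == T_END) return 1;                /* input fully matched */
            pos++;
        } else {
            /* Non-terminal: pick a rule from the table and expand it. */
            int r = TABLE[sym - NUM_TERMS][look];
            if (r < 0) return 0;
            if (top + RULES[r].len > STACK_MAX) return 0;
            for (int i = RULES[r].len - 1; i >= 0; i--)
                stack[top++] = RULES[r].rhs[i];        /* push in reverse */
        }
    }
    return 0;
}
```

Pushing the right-hand side in reverse leaves its leftmost symbol on top of the stack, which is what makes the derivation leftmost. A full parser would attach a tree node to each stack entry and add children at every expansion.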