kaulcodes/CompilerConstructionBITSPILANI

Compiler Construction BITS PILANI - CS F363

This is the term project for the CS F363 Compiler Construction course at BITS Pilani. The project implements a compiler front-end for a custom language. It consists of a lexical analyzer (lexer) and a syntax analyzer (parser) using an LL(1) parsing approach.

Project Overview

Lexical Analyzer (Lexer)

  • Purpose: Converts source code into tokens.
  • Key Features:
    • DFA Implementation: Uses a deterministic finite automaton with twin buffers for efficient file I/O.
    • Tokenization: Recognizes tokens such as identifiers, numbers (integer and real), keywords, and operators.
    • Error Handling: Reports invalid characters or patterns with line numbers.
    • Symbol Table: Manages reserved keywords and identifiers.
  • Main Files:
    • lexer.c, lexerDef.h
    • helper_function.c, helper_function.h
    • symbol_table.c, symbol_table.h, symbol_tableDef.h
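
To make the DFA-based tokenization concrete, here is a minimal sketch of a longest-match scanner for identifiers, integers, and reals. It is not the code in lexer.c: the token kinds, the state names, and the `next_token` function are hypothetical, and the accepted patterns (C-style identifiers, `digits` and `digits.digits`) are assumptions, since the language's actual token rules are not listed here. It also scans an in-memory string rather than twin buffers.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds; the project's real names live in lexerDef.h. */
typedef enum { TK_EOF, TK_ID, TK_INT, TK_REAL, TK_ERROR } TokenKind;

/* DFA states used by this sketch only. */
typedef enum { S_START, S_ID, S_INT, S_DOT, S_REAL } State;

/*
 * Scans one token from src starting at *pos, using longest match.
 * On return, *start and *len delimit the lexeme and *pos is just past it.
 */
TokenKind next_token(const char *src, size_t *pos, size_t *start, size_t *len)
{
    assert(src && pos && start && len);

    while (isspace((unsigned char)src[*pos])) (*pos)++;  /* skip whitespace */
    *start = *pos;
    *len = 0;
    if (src[*pos] == '\0') return TK_EOF;

    State state = S_START;
    TokenKind accepted = TK_ERROR;  /* kind of the last accepting state seen */
    size_t accepted_end = *pos;     /* input position just after that match */
    size_t i = *pos;

    for (;;) {
        unsigned char c = (unsigned char)src[i];
        State next = S_START;
        int ok = 1;
        switch (state) {
        case S_START:
            if (isalpha(c) || c == '_') next = S_ID;
            else if (isdigit(c)) next = S_INT;
            else ok = 0;
            break;
        case S_ID:
            if (isalnum(c) || c == '_') next = S_ID; else ok = 0;
            break;
        case S_INT:
            if (isdigit(c)) next = S_INT;
            else if (c == '.') next = S_DOT;
            else ok = 0;
            break;
        case S_DOT:   /* a digit must follow the dot */
        case S_REAL:
            if (isdigit(c)) next = S_REAL; else ok = 0;
            break;
        }
        if (!ok) break;             /* no transition: stop and retract */
        state = next;
        i++;
        if (state == S_ID)        { accepted = TK_ID;   accepted_end = i; }
        else if (state == S_INT)  { accepted = TK_INT;  accepted_end = i; }
        else if (state == S_REAL) { accepted = TK_REAL; accepted_end = i; }
    }

    if (accepted == TK_ERROR) {
        /* No valid prefix: consume one character so the caller can report it. */
        *len = 1;
        *pos = *start + 1;
        return TK_ERROR;
    }
    *len = accepted_end - *start;
    *pos = accepted_end;            /* retract to the end of the longest match */
    return accepted;
}
```

Remembering the last accepting position lets the scanner retract when it reads ahead too far: in this sketch, the input `12.` yields the integer `12`, and the dangling `.` is reported as an error on the next call.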

Syntax Analyzer (Parser)

  • Purpose: Checks syntactic correctness and builds a parse tree.
  • Key Features:
    • Grammar Reading: Reads grammar rules from files (e.g., final_grammar_index_clean.txt).
    • FIRST and FOLLOW Sets: Computes these sets for all non-terminals.
    • Parse Table Construction: Builds an LL(1) parse table used to guide the parser.
    • Predictive Parsing: Uses a stack to match tokens against grammar rules and constructs a parse tree.
    • Error Reporting: Reports syntax errors with detailed messages.
  • Main Files:
    • parser.c, parser.h, parserDef.h
    • stack.c, stack.h, stackDef.h
    • parseTree.c, parseTree.h, parseTreeDef.h
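
The stack-driven predictive parsing step can be illustrated with a small table-driven recognizer. This is a sketch under stated assumptions, not the code in parser.c: the toy grammar, the `Symbol` enum, the hand-written parse table, and `ll1_parse` are all hypothetical, and the sketch only accepts or rejects input instead of building a parse tree.

```c
#include <assert.h>
#include <stddef.h>

/* Toy grammar for illustration only (not the project's grammar):
 *   rule 0: E  -> T E'
 *   rule 1: E' -> + T E'
 *   rule 2: E' -> epsilon
 *   rule 3: T  -> id
 * Terminals come first so that "X < NUM_TERMINALS" tests for a terminal. */
typedef enum {
    T_ID, T_PLUS, T_END, NUM_TERMINALS,
    N_E = NUM_TERMINALS, N_EPRIME, N_T, NUM_SYMBOLS
} Symbol;

#define NUM_NONTERMINALS (NUM_SYMBOLS - NUM_TERMINALS)
#define MAX_RHS   3
#define NO_RULE   (-1)
#define STACK_CAP 64

typedef struct { int len; Symbol rhs[MAX_RHS]; } Rule;

static const Rule rules[] = {
    { 2, { N_T, N_EPRIME } },
    { 3, { T_PLUS, N_T, N_EPRIME } },
    { 0, { T_END } },                 /* epsilon: rhs is unused */
    { 1, { T_ID } },
};

/* LL(1) table: rows are non-terminals, columns are lookahead terminals. */
static const int table[NUM_NONTERMINALS][NUM_TERMINALS] = {
    /*            id       +        $      */
    /* E  */ {    0,       NO_RULE, NO_RULE },
    /* E' */ {    NO_RULE, 1,       2       },
    /* T  */ {    3,       NO_RULE, NO_RULE },
};

/* Returns 1 if the T_END-terminated token sequence is accepted, else 0. */
int ll1_parse(const Symbol *input)
{
    assert(input != NULL);
    Symbol stack[STACK_CAP];
    int top = 0;
    size_t ip = 0;

    stack[top++] = T_END;             /* bottom-of-stack marker */
    stack[top++] = N_E;               /* start symbol */

    while (top > 0) {
        Symbol X = stack[--top];
        Symbol a = input[ip];
        assert(a < NUM_TERMINALS);    /* input must contain terminals only */

        if (X < NUM_TERMINALS) {      /* terminal on top: must match lookahead */
            if (X != a) return 0;
            if (X == T_END) return 1; /* both stack and input are exhausted */
            ip++;
        } else {                      /* non-terminal: consult the table */
            int r = table[X - NUM_TERMINALS][a];
            if (r == NO_RULE) return 0;
            /* Push the right-hand side in reverse so its first symbol is on top. */
            for (int k = rules[r].len - 1; k >= 0; k--) {
                if (top >= STACK_CAP) return 0;
                stack[top++] = rules[r].rhs[k];
            }
        }
    }
    return 0;
}
```

A parser that builds a tree would attach a child node for each pushed symbol at the point where the rule is expanded. An error-recovering parser would replace the early `return 0` with a synchronization step.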

Driver

  • Purpose: Provides a menu-based interface to run the compiler.
  • Features:
    • Remove comments from the source code.
    • Print the list of tokens generated by the lexer.
    • Parse the source code and print the parse tree.
    • Measure execution time for parsing.
  • Main File: driver.c
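
One plausible way to measure the execution time of a phase is the standard `clock()` function. The sketch below assumes that approach; driver.c may measure time differently, and `time_phase_seconds`, `PhaseFn`, and `count_call` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type standing in for "run the lexer and parser". */
typedef void (*PhaseFn)(void *ctx);

/*
 * Runs phase(ctx) once and returns the CPU time it used, in seconds.
 * If ticks_out is non-NULL, the raw clock tick count is stored there too.
 */
double time_phase_seconds(PhaseFn phase, void *ctx, clock_t *ticks_out)
{
    assert(phase != NULL);
    clock_t start = clock();
    phase(ctx);
    clock_t end = clock();
    if (ticks_out != NULL) *ticks_out = end - start;
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Trivial stand-in phase: counts how many times it was invoked. */
void count_call(void *ctx)
{
    int *calls = ctx;
    (*calls)++;
}
```

Note that `clock()` reports processor time rather than wall-clock time, so time spent blocked on file I/O may not be counted.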

How to Build

  1. Compile and run the compiler: In the project root directory, run:
    make run f1="file_to_be_compiled.txt" f2="output_file_for_parse_tree.txt"
    This compiles the compiler and runs it, with f1 as the source file to compile and f2 as the output file for the parse tree.

How to Run

Running the Compiler

The main executable requires two arguments:

./compiler <source_file> <parse_tree_output_file>

For example:

./compiler testcases/lexer_test_cases/t2.txt output_parse_tree.txt

A menu will appear with options:

  • 0: Exit.
  • 1: Remove comments (shows the cleaned file).
  • 2: Print token list.
  • 3: Parse the source code and generate a parse tree (saved to the output file).
  • 4: Display execution time for parsing.
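
Menu option 1 can be illustrated with a short comment stripper. The comment syntax of the custom language is not described here, so this sketch assumes a single-line comment that starts with `%` and runs to the end of the line. Both that marker and the `strip_comments` function are assumptions, not the project's actual implementation.

```c
#include <assert.h>
#include <stddef.h>

/* ASSUMPTION: a comment starts at COMMENT_CHAR and runs to end of line. */
#define COMMENT_CHAR '%'

/*
 * Copies src into dst (capacity cap, including the terminating NUL),
 * dropping comment text but keeping every newline.
 * Returns the number of characters written, excluding the NUL.
 */
size_t strip_comments(const char *src, char *dst, size_t cap)
{
    assert(src != NULL && dst != NULL && cap > 0);
    size_t out = 0;
    int in_comment = 0;

    for (size_t i = 0; src[i] != '\0' && out + 1 < cap; i++) {
        char c = src[i];
        if (c == '\n') in_comment = 0;           /* a newline ends the comment */
        else if (c == COMMENT_CHAR) in_comment = 1;
        if (!in_comment) dst[out++] = c;
    }
    dst[out] = '\0';
    return out;
}
```

Keeping the newlines means line numbers in the cleaned text still match the original file, which matters when later phases report errors by line.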

Testing

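Test cases are provided in the testcases directory; for example, testcases/lexer_test_cases/t2.txt is one of the lexer test inputs.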

Conclusion

This project implements a complete compiler front-end for a custom language, integrating a DFA-based lexer and an LL(1) predictive parser. The design is modular with clear separation of concerns, and it is accompanied by test cases and a dedicated test harness (test.c) for verifying grammar analysis. With further testing and refinement, this project will serve as a solid foundation for subsequent phases such as semantic analysis and code generation.

NOTE

Error recovery in the parser is not yet semantically correct (to be fixed).


Final Summary

  • Code Integration:
    The lexer reads the source file using twin buffers, tokenizes it using a DFA in lexer.c, and stores tokens in a symbol table.
    The parser reads the grammar, computes FIRST and FOLLOW sets (using ComputeFirstAndFollowSets()), builds an LL(1) parse table, and parses the tokens into a parse tree using a stack.
    The driver (driver.c) ties these components together and offers a menu for testing each phase.

  • Building and Running:
    The Makefile builds the main executable (compiler).
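
The FIRST-set half of the grammar analysis mentioned under Code Integration can be sketched as a fixed-point iteration over the rules. This is an illustration on a toy grammar, not the project's ComputeFirstAndFollowSets(): the grammar, the bitmask representation, and `compute_first` are hypothetical, and FOLLOW sets are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy grammar for illustration only (not the project's grammar):
 *   E  -> T E'
 *   E' -> + T E' | epsilon
 *   T  -> id | ( E )
 * Each FIRST set is a bitmask: bit t is set if terminal t is in the set,
 * and EPSILON_BIT is set if the symbol can derive the empty string. */
enum { T_ID, T_PLUS, T_LPAREN, T_RPAREN, NUM_TERMINALS };
enum { N_E = NUM_TERMINALS, N_EPRIME, N_T, NUM_SYMBOLS };
#define EPSILON_BIT (1u << NUM_TERMINALS)

typedef struct { int lhs; int len; int rhs[3]; } Rule;

static const Rule rules[] = {
    { N_E,      2, { N_T, N_EPRIME } },
    { N_EPRIME, 3, { T_PLUS, N_T, N_EPRIME } },
    { N_EPRIME, 0, { 0 } },                      /* epsilon: rhs is unused */
    { N_T,      1, { T_ID } },
    { N_T,      3, { T_LPAREN, N_E, T_RPAREN } },
};
#define NUM_RULES (sizeof rules / sizeof rules[0])

void compute_first(unsigned first[NUM_SYMBOLS])
{
    /* FIRST of a terminal is the terminal itself; non-terminals start empty. */
    for (int s = 0; s < NUM_SYMBOLS; s++)
        first[s] = (s < NUM_TERMINALS) ? (1u << s) : 0u;

    int changed = 1;
    while (changed) {                 /* repeat until no set grows */
        changed = 0;
        for (size_t r = 0; r < NUM_RULES; r++) {
            unsigned add = 0;
            int k = 0;
            /* Walk the right-hand side while its symbols can vanish. */
            for (; k < rules[r].len; k++) {
                unsigned f = first[rules[r].rhs[k]];
                add |= f & ~EPSILON_BIT;
                if (!(f & EPSILON_BIT)) break;
            }
            /* Every symbol was nullable (or the rule is empty). */
            if (k == rules[r].len) add |= EPSILON_BIT;

            if ((first[rules[r].lhs] | add) != first[rules[r].lhs]) {
                first[rules[r].lhs] |= add;
                changed = 1;
            }
        }
    }
}
```

For this toy grammar the iteration settles on FIRST(E) = FIRST(T) = { id, ( } and FIRST(E') = { +, epsilon }. An LL(1) table builder would then place each rule in the columns given by the FIRST set of its right-hand side, falling back to FOLLOW for nullable rules.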

About

CS F363 Compiler Construction Term Project

Languages

  • C 98.6%
  • Makefile 1.4%