weyrick edited this page Jan 17, 2012 · 1 revision

Here you can find documentation on the internals of the Roadsend PHP compiler and runtime. This page is only of interest to those developing the actual compiler, debugger, runtime and extensions.

Roadsend PHP is written in [www-sop.inria.fr/mimosa/fp/Bigloo/ Bigloo Scheme] and C. To hack on Roadsend PHP, you will need to understand both. If you have never programmed in [en.wikipedia.org/wiki/Scheme_%28programming_language%29 Scheme], [www.schemers.org/ schemers.org] is a good place to start.

The major parts of Roadsend PHP include:

'''Frontend'''

Command line interface, argument parsing, main driver.

'''Compiler (generator)'''

Generates Scheme code from an AST.

'''Interpreter (evaluator)'''

Evaluates the nodes of an AST.

'''Debugger'''

Step debugger and evaluator.

'''Runtime'''

Basic runtime functionality: PHP hashes, objects, resources, basic type handling, output buffering, constants, INI files, runtime state initialization and maintenance.

'''Extensions'''

MySQL, XML, SQLite, PCRE, etc.

'''Web Backends'''

FastCGI and MicroServer interfaces.

The source tree is organized as follows:

{{{
/pcc

/benchmarks       [benchmark test suite]
/bugs             [regression test suite]
/compiler         [core compiler: parser/lexer/interpreter/generator]
/doc              [documentation: manual, tutorials]
/libs             [compiled runtime libraries]
/packages         [packaging: rpm, deb, self installer]
/runtime          [core runtime: hashes, objects, resources, etc]
  /php-ext        [extensions written in PHP]
    /pdo          [PDO database abstraction]
  /ext            [extensions]
    /standard     [standard library: array_*, strlen, fopen, etc]
    /xml          [xml extension]
    /sqlite       [sqlite extension]
    /curl         [curl extension]
    /gtk          [PHP-GTK extension]
    /gtk2         [PHP-GTK2 extension]
    /mysql        [mysql extension]
    /odbc         [ODBC extension]
    /pcre         [PCRE extension]
    /pcc-win      [pcc windows extension]
    /sockets      [sockets extension]
/testoutput       [output directory for main test suite]
/tests            [main test suite]
/sa-tests         [standalone tests that don't compare against zend but test other compiler functionality]
/tidbits          [misc]
/tools            [tools]
  /profiler       [profiler]
  /readline       [readline command line editing library]
  /libwebserver   [microserver C library]
  /shortpath      [windows shortpath]
/webconnect       [web connect backends]
  /fastcgi        [fastcgi backend]
  /micro          [microserver backend]
  /tests          [web test suite]
  /apache1        [apache1 module (deprecated)]
  /apache2        [apache2 module (deprecated)]
  /cgi            [cgi module (deprecated)]
/zend-tests       [test suite from main zend php distro]

}}}

The general workflow for a typical pcc run is:

1. scan the given source file(s) and build an [http://en.wikipedia.org/wiki/Abstract_syntax_tree Abstract Syntax Tree] (AST)
1. if interpreting:
   1. walk the AST and evaluate each node immediately
   1. end
1. if compiling:
   1. walk the AST and generate appropriate scheme code for each node
   1. create an appropriate "stub" file, depending on target (commandline, microserver, fastcgi, library)
   1. run bigloo on generated code
      1. bigloo generates C code
      1. bigloo calls GCC
         1. GCC compiles and links to machine code
   1. end
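
The two paths in the list above differ only in what happens at each node during the walk. The following is a minimal Python sketch of that idea, not the Roadsend implementation (which is written in Scheme). The tuple-based node format, the function names, and the emitted "(echo ...)" form are all assumptions made for illustration.

```python
# Toy AST: each node is a (kind, payload) tuple. The node kinds are
# hypothetical and only loosely modelled on what --dump-ast prints.
AST = [
    ("echo", ("string", "hello world\n")),
    ("nop", None),
]


def scheme_string(text):
    """Render a Python string as a double-quoted Scheme string literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def evaluate(nodes):
    """Interpreting: walk the AST and act on each node immediately."""
    output = []
    for kind, payload in nodes:
        if kind == "echo":
            output.append(payload[1])
        elif kind == "nop":
            continue
        else:
            raise ValueError(f"unknown node kind: {kind}")
    return "".join(output)


def generate(nodes):
    """Compiling: walk the same AST, but emit Scheme-like source text."""
    forms = []
    for kind, payload in nodes:
        if kind == "echo":
            # "(echo ...)" is an invented form, not the real runtime API.
            forms.append(f"(echo {scheme_string(payload[1])})")
        elif kind == "nop":
            continue  # a no-op produces no code
        else:
            raise ValueError(f"unknown node kind: {kind}")
    return "(begin " + " ".join(forms) + ")"


if __name__ == "__main__":
    print(evaluate(AST), end="")
    print(generate(AST))
```

In the real compiler the generated text would then be combined with a target-specific stub and handed to Bigloo; the sketch stops at the code generation step.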

You can dump the tokens created by the lexer pass by using the "--dump-toks" command line switch. This can help debug parser problems:

{{{
$ pcc --dump-toks hello.php
(echokey (string . "hello world\n") semi semi)
}}}

You can dump the nodes of the AST by using the "--dump-ast" command line switch:

{{{
$ pcc --dump-ast hello.php
(PHP-AST

(|original-filename:| "hello.php")
(|real-filename:|
  "/home/user/tmp/php/hello.php")
(|project-relative-filename:| "unknown")
(|program-name:| "unknown")
(|import-asts:| ())
(|nodes:|
  ((ECHO-STMT
     (|location:|
       (3 . "/home/user/tmp/php/hello.php"))
     (|stuff:|
       ((LITERAL-STRING
          (|location:|
            (3 . "/home/user/tmp/php/hello.php"))
          (|value:| "hello world\n")))))
   (NOP (|location:|
          (5 . "/home/user/tmp/php/hello.php"))))))

}}}
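
Because the dump is a plain s-expression, it is easy to post-process with a small reader. The sketch below is a hypothetical Python helper, not part of pcc. It parses an abbreviated copy of the dump above and lists the node types in the order they appear, assuming that node types are the all-uppercase symbols at the head of a list.

```python
import re

# One token: "(", ")", a double-quoted string, or a bare atom.
TOKEN = re.compile(r'\s*(?:(\()|(\))|"((?:[^"\\]|\\.)*)"|([^\s()"]+))')


def read_sexp(text):
    """Parse text into nested lists.

    Atoms stay plain strings; string literals become ("str", raw_contents)
    so they cannot be confused with symbols.
    """
    stack = [[]]
    pos = 0
    while True:
        match = TOKEN.match(text, pos)
        if match is None:
            break
        pos = match.end()
        lparen, rparen, string, atom = match.groups()
        if lparen:
            stack.append([])
        elif rparen:
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        elif string is not None:
            stack[-1].append(("str", string))
        else:
            stack[-1].append(atom)
    if text[pos:].strip() or len(stack) != 1:
        raise ValueError("malformed s-expression")
    return stack[0]


def node_kinds(tree):
    """Collect the uppercase head symbol of every list, depth first."""
    kinds = []

    def walk(item):
        if isinstance(item, list):
            if item and isinstance(item[0], str) and item[0].isupper():
                kinds.append(item[0])
            for child in item:
                walk(child)

    walk(tree)
    return kinds


# Abbreviated copy of the dump shown above.
DUMP = r'''
(PHP-AST
 (|nodes:|
  ((ECHO-STMT
    (|stuff:|
     ((LITERAL-STRING
       (|value:| "hello world\n")))))
   (NOP (|location:| (5 . "/home/user/tmp/php/hello.php"))))))
'''

if __name__ == "__main__":
    print(node_kinds(read_sexp(DUMP)))
```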

You can dump the AST after annotation with "--dump-types".

You can create a flow graph of the program for use with [graphviz.org graphviz] with:

{{{
$ pcc --dump-flow foo.php | dot -Tgif -o foo.gif
}}}
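
The pipe into dot implies that the switch writes a graph description in graphviz's DOT language; the exact text pcc emits is not shown on this page. As a rough illustration of the kind of input dot consumes, this Python sketch builds a DOT digraph from a list of edges. The function name and the node labels are made up for the example.

```python
def to_dot(edges, name="flow"):
    """Return DOT source for a directed graph given (source, target) pairs."""
    lines = [f"digraph {name} {{"]
    for source, target in edges:
        # Quoting node IDs lets them contain spaces or punctuation.
        lines.append(f'  "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical flow for a one-statement script.
    print(to_dot([("entry", "echo"), ("echo", "exit")]))
```

Saving that output to a file and running dot with "-Tgif" on it renders the same style of image as the command above.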

The core runtime is composed of the following modules:

php-runtime

The main runtime module which exports the scheme PHP runtime API

php-objects

Implements the functionality of and provides an API for working with PHP objects.

php-hash

Implements the functionality of and provides an API for working with PHP hashes.

opaque-math

An implementation of PHP numbers that handles basic arithmetic with automatic type conversion and overflow detection.

php-errors

Error handling, stack traces, exceptions

php-ini

Generic INI file reading as well as getting/setting php.ini compatible values

constants

PHP constants implementation (i.e., define()) as well as 'special' constants like __FILE__

extended-streams

Base code for file/network stream operations

finalizers

Implementation of resource finalizers

grasstable

Custom hash table code used by the runtime

resources

Implementation of PHP resources

signatures

Function signatures

utils

Utility functions used throughout the runtime

builtin-classes

Builtin PHP class and interface definitions, like Exception

containers

Routines and structures for working with variable containers (e.g. for reference variables)

dynarray

Implementation of dynamic arrays

environments

Implementation of variable environments for globals, functions, and methods

php-functions

Implementation of PHP function calling semantics

php-types

Routines for working with the various PHP types, including type conversion

url-rewriter

A parser that handles URL rewriting for output buffering

web-var-cache

A method for caching PHP variables that persist through web requests

output-buffering

Runtime implementation of output buffering
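
Of the modules above, opaque-math has the most visible effect on PHP semantics: in PHP, integer arithmetic that overflows produces a float instead of wrapping around. The following is a minimal Python sketch of that rule, assuming 64-bit PHP integers. The real module is part of the Scheme and C runtime and its API is not shown on this page; "php_add" and "php_mul" are hypothetical names.

```python
# Assumption: a platform where PHP integers are 64 bits wide.
PHP_INT_MAX = 2**63 - 1
PHP_INT_MIN = -(2**63)


def _fit(result):
    """Keep an integer result if it fits, otherwise convert it to float."""
    if PHP_INT_MIN <= result <= PHP_INT_MAX:
        return result
    return float(result)  # overflow detected: widen to float


def php_add(a, b):
    """Add two PHP numbers with automatic type conversion."""
    if isinstance(a, int) and isinstance(b, int):
        return _fit(a + b)
    return float(a) + float(b)  # any float operand makes the result float


def php_mul(a, b):
    """Multiply two PHP numbers with automatic type conversion."""
    if isinstance(a, int) and isinstance(b, int):
        return _fit(a * b)
    return float(a) * float(b)


if __name__ == "__main__":
    print(php_add(1, 2))
    print(php_add(PHP_INT_MAX, 1))
    print(php_mul(2**32, 2**32))
```

Python integers never overflow, which is why the sketch can compute the exact result first and range-check it afterwards; a C or Scheme implementation has to detect the overflow without relying on a wider integer type.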

Each extension has a similar layout.

extension makefile

A makefile which inherits a common set of extension make functionality from extensions.mk

library bindings

Most extensions involve binding functions from a C library. This is done using the [http://www-sop.inria.fr/mimosa/fp/Bigloo/doc/bigloo-23.html#The-C-interface FFI provided by bigloo].

extension code

The actual scheme code that provides the functionality of the extension. It makes use of the PHP scheme runtime API and library bindings.

tests

A test suite

The core compiler is composed of the following modules:

ast

Abstract Syntax Tree structure definitions and helper routines

basic-blocks

A basic block is a collection of AST nodes with linear control flow, like a Scheme begin form

cfa

Control Flow Analysis (AST annotation)

config

pcc.conf routines

commandline

command line front end

containers

container analysis (AST annotation)

debugger

step debugger backend

declare

Runs over the AST(s) and makes sure that everything is declared and widened (AST annotation)

driver

entry point for (compile, interpret, run-url) routines

evaluate

evaluate an AST directly

generate

generate scheme code from an AST

include

include file handling

lexers

PHP lexer

parser

PHP parser

pdb

step debugger front end

synhighlight

code highlighting

tags

ctags compatible source code "tagger"

target

base "target" code. Targets are the desired end result: command line application, FastCGI application, library, etc.
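
To make the basic-blocks entry above concrete: a basic block is a run of nodes that always execute one after another, so a new block starts wherever control flow can diverge. The Python sketch below is a simplified illustration, not the module's algorithm. It assumes a flat list of node kinds and a hypothetical set of branching kinds, and it only ends blocks at branching nodes; it does not also split at branch targets as a full analysis would.

```python
# Hypothetical node kinds after which control flow may not fall through.
BRANCHING = {"if", "while", "return", "break"}


def split_basic_blocks(nodes):
    """Group a flat list of node kinds into straight-line blocks."""
    blocks = []
    current = []
    for kind in nodes:
        current.append(kind)
        if kind in BRANCHING:
            # The branching node is the last node of its block.
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


if __name__ == "__main__":
    program = ["assign", "echo", "if", "echo", "return", "echo"]
    for block in split_basic_blocks(program):
        print(block)
```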

FastCGI

The FastCGI interface, for compiling PHP source code to FastCGI binaries. These binaries can also be used as normal CGI binaries.

Micro

The MicroServer interface, for compiling PHP source code to a binary file with a built-in web server.
