Parsing PHP Files
To parse PHP files, you first need to import the modules which provide the AST datatypes and the parsing functionality:
import lang::php::ast::AbstractSyntax;
import lang::php::util::Utils;
To parse a single PHP file, use the function loadPHPFile:
ast = loadPHPFile(|file:///Users/mhills/Projects/phpsa/corpus/MediaWiki/mediawiki-1.19.1/profileinfo.php|);
This parses the given file and returns an AST representation of its contents, tagged with source locations.
To parse an entire system, use the function loadPHPFiles. This function comes in two varieties. The first takes a location, which should be a directory, and parses all files with either a .php or a .inc extension:
sys = loadPHPFiles(|file:///Users/mhills/Projects/phpsa/corpus/MediaWiki/mediawiki-1.19.1|);
The second also takes a set of file extensions, which determines which files are parsed:
sys = loadPHPFiles(|file:///Users/mhills/Projects/phpsa/corpus/MediaWiki/mediawiki-1.19.1|, {"php","inc"});
Both return a System, imported as follows:
import lang::php::util::System;
A System is an alias for a map from locations (the location of each file) to ASTs (the AST representing the file at that location).
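Because a System is a map, it can be traversed like any other Rascal map. The following is a minimal sketch, assuming only the modules and functions named above; listParsedFiles is a hypothetical helper, not part of the library:

```rascal
import lang::php::ast::AbstractSyntax;
import lang::php::util::Utils;
import lang::php::util::System;
import IO;
import Map;

// Hypothetical helper (not part of the library): parse every PHP file
// under root, print the location of each parsed file, then the total.
void listParsedFiles(loc root) {
    System sys = loadPHPFiles(root);
    // Iterating over a map yields its keys, here the file locations.
    for (loc l <- sys) {
        println(l);
    }
    println("Parsed <size(sys)> files");
}
```

The AST for a given file can then be looked up with sys[l], where l is one of the locations in the map.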
All of the functions shown above throw the runtime exception AssertionFailed when the location provided is not valid. In all cases the location must be given with the file scheme and must exist. For loadPHPFile the location must also be a file, while for loadPHPFiles the location must be a directory.
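One way to avoid the exception is to check a location before loading it. The sketch below restates the documented preconditions using functions from Rascal's standard IO module; canLoadFile and canLoadDirectory are hypothetical names, and the library's own checks may differ in detail:

```rascal
import IO;

// Hypothetical helpers (not part of the library). Each restates the
// preconditions described above: file scheme, location exists, and
// the right kind of entry for the loader in question.

// True if l looks acceptable as an argument to loadPHPFile.
bool canLoadFile(loc l) = l.scheme == "file" && exists(l) && isFile(l);

// True if l looks acceptable as an argument to loadPHPFiles.
bool canLoadDirectory(loc l) = l.scheme == "file" && exists(l) && isDirectory(l);
```

A caller could guard a load with if (canLoadDirectory(root)) { ... } and report the problem itself rather than handling AssertionFailed.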
The parser was updated on 4 and 5 June, 2013, to match the output of the current version of the external parser and to fix a couple of bugs. To summarize:
- Support for yield in PHP 5.5 has been added to the AST
- List assignment now works correctly for multi-level lists
- List assignment now works correctly for empty positions, e.g., list($a,,$b)
- List assignment now uses the standard assignment expression constructor with a list expression target instead of the list assignment constructor
- Namespaces without blocks are differentiated from namespaces with empty blocks by using namespaceHeader for the former
It's easy to translate back to the original AST for everything but yield (which will most likely not be in existing ASTs) and nested lists. The functions that do this are in NormalizeAST and are named oldListAssignments and oldNamespaces. Each takes a script and returns a script that uses the original features. This does not currently propagate annotations.
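As a rough illustration, the two functions can be chained on a freshly parsed file. The module path lang::php::ast::NormalizeAST is an assumption (the text above gives only the module name), so adjust the import to match your checkout:

```rascal
import lang::php::ast::AbstractSyntax;
import lang::php::util::Utils;
// Assumed module path; only the name NormalizeAST is given above.
import lang::php::ast::NormalizeAST;

// Parse a file with the updated parser.
ast = loadPHPFile(|file:///Users/mhills/Projects/phpsa/corpus/MediaWiki/mediawiki-1.19.1/profileinfo.php|);

// Convert back to the original list assignment and namespace
// constructors. Annotations on rewritten nodes are not carried over.
old = oldNamespaces(oldListAssignments(ast));
```

Since each function takes a script and returns a script, the order of the two calls should not matter for this purpose.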