An implementation of PEG.js grammar for Erlang
This is a rather straightforward port/implementation of the grammar defined for PEG.js.
- As far as I can tell, implements everything from the PEG.js grammar
- Generates complete usable parsers
- The project is bootstrapped (see priv/pegjs_parse.pegjs). The original grammar for Neotoma is also available in priv/pegjs_parse.peg
- It's based on an earlier definition of the grammar than the one that currently exists for PEG.js. A current-ish version of the grammar has been ported to priv/parser.pegjs, but it causes the VM to quit with an out-of-memory exception on sufficiently large grammars (including its own). See the How to contribute section for more info
- Implements support for the @append extension (see, e.g. core-pegjs in the for-GET project)
- Dialyze, create dialyzer-friendly parsers
> pegjs:file("extra/csv_pegjs.peg").
ok
> c("extra/csv_pegjs.erl").
{ok, csv_pegjs}
> csv_pegjs:parse("a,b,c").
[{<<"head">>,
[{<<"head">>,[[[[],[],<<"a">>]]]},
{<<"tail">>,
[[[<<",">>],[[[[],[],<<"b">>]]]],
[[<<",">>],[[[[],[],<<"c">>]]]]]}]},
{<<"tail">>,[]}]
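The result above is a nested list of {Key, Value} pairs with binary keys, so it can be walked with ordinary list functions. The module below is only a sketch: csv_result_demo and its functions are hypothetical names, the term is copied from the session above instead of being produced by the parser, and treating every <<",">> leaf as a separator is an assumption about the grammar in extra/csv_pegjs.peg (it would also drop a cell whose whole text is a comma).

```erlang
-module(csv_result_demo).
-export([cells/0]).

%% The parse result, copied verbatim from the session above.
result() ->
    [{<<"head">>,
      [{<<"head">>, [[[[], [], <<"a">>]]]},
       {<<"tail">>,
        [[[<<",">>], [[[[], [], <<"b">>]]]],
         [[<<",">>], [[[[], [], <<"c">>]]]]]}]},
     {<<"tail">>, []}].

%% Collect all binary leaves in order, then drop the separators.
cells() ->
    [B || B <- leaves(result()), B =/= <<",">>].

%% Depth-first walk: look inside tagged pairs and lists, keep binaries.
leaves({_Key, Value}) -> leaves(Value);
leaves(List) when is_list(List) -> lists:flatmap(fun leaves/1, List);
leaves(Bin) when is_binary(Bin) -> [Bin].
```

Calling csv_result_demo:cells() on this term yields the three cell binaries in their original order. A real grammar would more likely attach actions to its rules so that the parser returns a cleaner structure directly.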
There are several options you can pass to pegjs:file(File, Options::options()):
-type options() :: [option()].

%% options for pegjs
-type option() :: {output, Dir::string() | binary()}  %% where to put the generated file
                                                      %% Default: directory of the input file
                | {module, string() | binary()}       %% to change the module name
                                                      %% Default: name of the input file
                | pegjs_analyze:option().

%% options for pegjs_analyze
-type option() :: {ignore_unused, boolean()}          %% ignore unused rules. Default: true
                | {ignore_duplicates, boolean()}      %% ignore duplicate rules. Default: false
                | {ignore_unparsed, boolean()}        %% ignore incomplete parses. Default: false
                | {ignore_missing_rules, boolean()}   %% Default: false
                | {ignore_invalid_code, boolean()}    %% Default: false
                | {parser, atom()}                    %% use a different module to parse grammars.
                                                      %% Default: pegjs_parse
                | {root, Dir::string() | binary()}.   %% root directory for @append instructions.
                                                      %% Default: undefined
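As a rough illustration of how these options combine, the call below passes pegjs options and a pegjs_analyze option in the same list. The module name "csv_parser" and the particular flag values are assumptions made up for this example; the module name is given as a string to match the type above. No output is shown because the effect of disabling ignore_unused depends on the grammar.

```erlang
%% Hypothetical session: write the generated parser into src/ under a
%% custom module name, and do not ignore unused rules.
> pegjs:file("extra/csv_pegjs.peg",
             [{output, "src"},           %% pegjs option
              {module, "csv_parser"},    %% pegjs option; assumed name
              {ignore_unused, false}]).  %% pegjs_analyze option
```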
Suggestions and improvements are more than welcome!
Current grammar in priv/pegjs_parse.peg is created for Neotoma,
so you need that to tweak pegjs.
The pegjs_analyze module is inspired by neotoma_analyze from the
2.0-refactor branch of neotoma.
Non-generated parser combinators can be found in priv/pegjs.template.
Safe working parser is always available at src/pegjs_parse.erl.safe.
The current grammar from which the project is now bootstrapped lives in
priv/pegjs_parse.pegjs. When you've tweaked it and you want to try your changes,
generate a different module and tell pegjs to use your new module instead:
> pegjs:file("priv/pegjs_parse.pegjs", [{output, "src"}, {module, modified_parser}]).
ok
> c(modified_parser).
{ok, modified_parser}
> pegjs:file("extra/json.pegjs", [{parser, modified_parser}]).
ok
... etc. ...
Once you're satisfied with your changes, overwrite pegjs_parse (which is used by default):
> pegjs:file("priv/pegjs_parse.pegjs", [{output, "src"}]).
ok
> c(pegjs_parse).
{ok, pegjs_parse}
> pegjs:file("extra/json.pegjs").
ok
... etc. ...
A port of a current-ish version of the PEG.js grammar can be found in
priv/parser.pegjs. src/pegjs.erl, src/pegjs.hrl and src/pegjs_analyze.erl
have all been updated to work with this grammar and will generate a parser
for you. Note, however, that priv/pegjs.template doesn't contain code for
the action combinator.
To generate a parser from this grammar:
> pegjs:file("priv/parser.pegjs", [{output, "src/"}]).
ok
> c("src/parser.erl").
{ok, parser}
> pegjs:file("extra/csv.pegjs", [{parser, parser}]).
ok
... etc ...
However, the parser causes the VM to fail with an out-of-memory exception for
sufficiently large grammars (including parser.pegjs). YMMV. The culprit is
the escape/1,2 function (see initializer section). I haven't figured out what
to do about this yet.
The original parser for pegjs was derived from a grammar defined for Neotoma. You can also start your work from there:
> neotoma:file("priv/pegjs_parse.peg", [{output, "src/"}]).
ok
> pegjs:file(.... etc ... )
However, the original grammar will get increasingly outdated as time goes on, so it's there for reference only.