An ultra lightweight PEG Parser in ANSI C.
 ____                          ____  _____ ____
|  _ \ ___ _ __  _ __   __ _  |  _ \| ____/ ___|
| |_) / _ \ '_ \| '_ \ / _` | | |_) |  _|| |  _
|  __/  __/ |_) | |_) | (_| | |  __/| |__| |_| |
|_|   \___| .__/| .__/ \__,_| |_|   |_____\____|
          |_|   |_|
Peppa PEG is an ultra lightweight PEG (parsing expression grammar) parser in ANSI C.
References: GitHub | Project Home Page | Project Documentation Pages.
This repo currently hosts grammar specifications written in Peppa PEG for the following languages:
ABNF (RFC 5234) | Golang v1.17 | HCL 2 | JSON (ECMA-404) | Lua v5.3 | TOML v1.0.
Assuming cmake is installed on your system, run:
$ cd PeppaPEG/
$ mkdir build
$ cd build
$ cmake ..
$ make
$ make install
Once installed, add the include directive and start using the library:
#include <peppa.h>
You can use pkg-config to link the library:
$ gcc `pkg-config --cflags --libs libpeppa` example.c
Peppa PEG consists of one header file and one C file, so alternatively you can add it to your project by copying the files "peppa.h" and "peppa.c".
Once copied, add the include directive and start using the library:
#include "peppa.h"
You can then compile the library source together with your own code:
$ gcc example.c peppa.c
Peppa PEG ships with a tiny utility, peppa, to help develop a PEG grammar.
Example: given the files json.peg and data.json, run the peppa utility:
$ cat json.peg
@lifted entry = &. value !.;
@lifted value = object / array / string / number / true / false / null;
object = "{" (item ("," item)*)? "}";
item = string ":" value;
array = "[" (value ("," value)*)? "]";
@tight string = "\"" ([\u0020-\u0021] / [\u0023-\u005b] / [\u005d-\U0010ffff] / escape )* "\"";
true = "true";
false = "false";
null = "null";
@tight @squashed number = minus? integral fractional? exponent?;
@tight @squashed @lifted escape = "\\" ("\"" / "/" / "\\" / "b" / "f" / "n" / "r" / "t" / unicode);
@tight @squashed unicode = "u" ([0-9] / [a-f] / [A-F]){4};
minus = "-";
plus = "+";
@squashed @tight integral = "0" / [1-9] [0-9]*;
@squashed @tight fractional = "." [0-9]+;
@tight exponent = i"e" (plus / minus)? [0-9]+;
@spaced @lifted whitespace = " " / "\r" / "\n" / "\t";
$ cat data.json
[{"numbers": [1,2.0,3e1]},[true,false,null],"xyz"]
$ peppa parse -G json.peg -e entry data.json | python3 ./scripts/gendot.py | dot -Tsvg -o/tmp/data.svg
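To make the number rule in json.peg more concrete, here is a hand-written C sketch of what that rule accepts. It only illustrates the PEG semantics (ordered choice, optional parts, the case-insensitive i"e"); it is not how Peppa PEG implements matching, and the match_* function names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* [0-9]*: count the decimal digits at s (may be zero). */
static size_t match_digits(const char *s) {
    size_t n = 0;
    while (isdigit((unsigned char)s[n])) n++;
    return n;
}

/* integral = "0" / [1-9] [0-9]*;  ordered choice: "0" is tried first. */
static size_t match_integral(const char *s) {
    if (s[0] == '0') return 1;
    if (s[0] >= '1' && s[0] <= '9') return 1 + match_digits(s + 1);
    return 0;
}

/* fractional = "." [0-9]+; */
static size_t match_fractional(const char *s) {
    if (s[0] != '.') return 0;
    size_t d = match_digits(s + 1);
    return d ? 1 + d : 0;
}

/* exponent = i"e" (plus / minus)? [0-9]+;  i"e" matches 'e' or 'E'. */
static size_t match_exponent(const char *s) {
    if (s[0] != 'e' && s[0] != 'E') return 0;
    size_t n = 1;
    if (s[n] == '+' || s[n] == '-') n++;
    size_t d = match_digits(s + n);
    return d ? n + d : 0;
}

/* number = minus? integral fractional? exponent?;
 * Returns the number of characters matched at the start of s, 0 on failure. */
size_t match_number(const char *s) {
    assert(s != NULL);
    size_t n = 0;
    if (s[n] == '-') n++;              /* minus? */
    size_t i = match_integral(s + n);
    if (i == 0) return 0;              /* integral is mandatory */
    n += i;
    n += match_fractional(s + n);      /* fractional? (adds 0 if absent) */
    n += match_exponent(s + n);        /* exponent?   (adds 0 if absent) */
    return n;
}
```

Because "0" comes first in the ordered choice, match_number("01") consumes only the leading "0"; in the JSON grammar the surrounding rules then reject the leftover "1".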
In Peppa PEG, grammar syntax can be loaded from a string. Below is an example of JSON grammar syntax.
P4_Grammar* grammar = P4_LoadGrammar(
"@lifted\n"
"entry = &. value !.;\n"
"@lifted\n"
"value = object / array / string / number / true / false / null;\n"
"object = \"{\" (item (\",\" item)*)? \"}\";\n"
"item = string \":\" value;\n"
"array = \"[\" (value (\",\" value)*)? \"]\";\n"
"@tight\n"
"string = \"\\\"\" ([\\u0020-\\u0021] / [\\u0023-\\u005b] / [\\u005d-\\U0010ffff] / escape )* \"\\\"\";\n"
"true = \"true\";\n"
"false = \"false\";\n"
"null = \"null\";\n"
"@tight @squashed\n"
"number = minus? integral fractional? exponent?;\n"
"@tight @squashed @lifted\n"
"escape = \"\\\\\" (\"\\\"\" / \"/\" / \"\\\\\" / \"b\" / \"f\" / \"n\" / \"r\" / \"t\" / unicode);\n"
"@tight @squashed\n"
"unicode = \"u\" ([0-9] / [a-f] / [A-F]){4};\n"
"minus = \"-\";\n"
"plus = \"+\";\n"
"@squashed @tight\n"
"integral = \"0\" / [1-9] [0-9]*;\n"
"@squashed @tight\n"
"fractional = \".\" [0-9]+;\n"
"@tight\n"
"exponent = i\"e\" (plus / minus)? [0-9]+;\n"
"@spaced @lifted\n"
"whitespace = \" \" / \"\\r\" / \"\\n\" / \"\\t\";\n"
);
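Since the grammar is passed in as a C string, a grammar kept in a file such as json.peg has to be read into memory first. A minimal sketch using only the C standard library follows; read_text_file is a hypothetical helper and is not part of Peppa PEG.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read a whole file into a newly allocated NUL-terminated buffer.
 * Returns NULL if the file cannot be opened, sized, read, or allocated.
 * The caller owns the result and must free() it. */
char *read_text_file(const char *path) {
    assert(path != NULL);

    FILE *fp = fopen(path, "rb");
    if (fp == NULL) return NULL;

    char *buf = NULL;
    long size = -1;

    /* Determine the file size by seeking to the end, then rewind. */
    if (fseek(fp, 0, SEEK_END) == 0 &&
        (size = ftell(fp)) >= 0 &&
        fseek(fp, 0, SEEK_SET) == 0) {
        buf = malloc((size_t)size + 1);
        if (buf != NULL) {
            size_t got = fread(buf, 1, (size_t)size, fp);
            if (got != (size_t)size) {
                /* Short read: treat as an error rather than return partial text. */
                free(buf);
                buf = NULL;
            } else {
                buf[got] = '\0';
            }
        }
    }

    fclose(fp);
    return buf;
}
```

The returned buffer could then be handed to P4_LoadGrammar in place of the string literal, and freed once it is no longer needed.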
The input can be parsed via P4_Parse:
P4_Source* source = P4_CreateSource("[{\"numbers\": [1,2.0,3e1]},[true,false,null],\"xyz\"]", "entry");
P4_Parse(grammar, source);
You can traverse the parse tree. For example, the following calls write the parse tree to stdout in JSON format:
P4_Node* root = P4_GetSourceAst(source);
P4_JsonifySourceAst(stdout, root, NULL);
[{"slice":[0,50],"type":"array","children":[
{"slice":[1,25],"type":"object","children":[
{"slice":[2,24],"type":"item","children":[
{"slice":[2,11],"type":"string"},
{"slice":[13,24],"type":"array","children":[
{"slice":[14,15],"type":"number"},
{"slice":[16,19],"type":"number"},
{"slice":[20,23],"type":"number"}]}]}]},
{"slice":[26,43],"type":"array","children":[
{"slice":[27,31],"type":"true"},
{"slice":[32,37],"type":"false"},
{"slice":[38,42],"type":"null"}]},
{"slice":[44,49],"type":"string"}]}]
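In the sample output above, each "slice" pair reads as a half-open [start, stop) offset range into the input: [2,11] covers "numbers" including its quotes, and [20,23] covers 3e1. The following sketch recovers the text of a node from such a pair. It assumes the offsets are byte offsets, which holds for this ASCII input; slice_text is a hypothetical helper, not a Peppa PEG function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy the half-open range [start, stop) of input into a new
 * NUL-terminated string. Returns NULL if the range is invalid or
 * allocation fails. The caller owns the result and must free() it. */
char *slice_text(const char *input, size_t start, size_t stop) {
    assert(input != NULL);

    size_t len = strlen(input);
    if (start > stop || stop > len) return NULL;

    size_t n = stop - start;
    char *out = malloc(n + 1);
    if (out == NULL) return NULL;

    memcpy(out, input + start, n);
    out[n] = '\0';
    return out;
}
```

For the input used above, slice_text(input, 44, 49) yields the five characters "xyz" with its surrounding quotes, matching the last string node in the tree.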
Read the documentation here: https://soasme.com/PeppaPEG/.
Assuming cmake and gcc are installed:
(root) $ mkdir -p build
(root) $ cd build
(build) $ cmake -DENABLE_CHECK=On ..
(build) $ make check
...
100% tests passed, 0 tests failed
If valgrind is installed, you can also run the tests with memory leak checks:
(root) $ mkdir -p build
(root) $ cd build
(build) $ cmake -DENABLE_VALGRIND=ON ..
(build) $ make check
If setting up a testing environment locally is difficult, try Docker:
$ docker run --rm -v `pwd`:/app -it ubuntu:latest bash
# apt-get update && apt-get install -y gcc gdb valgrind make cmake python3 python3-venv python3-pip doxygen
# cd /app && mkdir -p build && cd build && cmake -DENABLE_CHECK=On .. && make check
Peppa PEG docs can be built via doxygen:
(root) $ cd build
(build) $ cmake -DENABLE_DOCS=On ..
(build) $ rm -rf docs && make docs
The output is stored in build/docs.
- Write an INI Parser using Peppa PEG: ini.h, ini.c.
- Write a Mustache Parser using Peppa PEG: mustache.h.
- Write a JSON Parser using Peppa PEG: json.h.
- Write a Calculator Parser using Peppa PEG: calc.h, calc.c.
- Write a Dot Parser using Peppa PEG: dot.h.
Made with ❤️ by Ju.