From 837255cfc4c62e1abf8b19c6f6068cd9246f125f Mon Sep 17 00:00:00 2001 From: Joao Paulo Magalhaes Date: Tue, 11 Jun 2024 18:57:25 +0100 Subject: [PATCH] v0.7.0 --- CMakeLists.txt | 2 +- changelog/0.7.0.md | 191 ++++++++++++++++++++++++++ changelog/current.md | 191 -------------------------- doc/Doxyfile | 2 +- doc/conf.py | 2 +- doc/doxy_main.md | 2 +- doc/sphinx_is_it_rapid.rst | 18 +-- doc/sphinx_quicklinks.rst | 6 +- doc/sphinx_try_quickstart.rst | 2 +- doc/sphinx_using.rst | 18 +-- ext/c4core | 2 +- samples/quickstart.cpp | 2 +- tbump.toml | 2 +- test/test_install/CMakeLists.txt | 2 +- test/test_singleheader/CMakeLists.txt | 2 +- 15 files changed, 222 insertions(+), 222 deletions(-) create mode 100644 changelog/0.7.0.md diff --git a/CMakeLists.txt b/CMakeLists.txt index 1a9e7ff04..762bb9710 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -6,7 +6,7 @@ project(ryml LANGUAGES CXX) include(./compat.cmake) -c4_project(VERSION 0.6.0 STANDALONE +c4_project(VERSION 0.7.0 STANDALONE AUTHOR "Joao Paulo Magalhaes ") diff --git a/changelog/0.7.0.md b/changelog/0.7.0.md new file mode 100644 index 000000000..0c7c60639 --- /dev/null +++ b/changelog/0.7.0.md @@ -0,0 +1,191 @@ +Most of the changes are from the giant Parser refactor described below. Before getting to that, some other minor changes first. + + +### Fixes + +- [#PR431](https://github.com/biojppm/rapidyaml/pull/431) - Emitter: prevent stack overflows when emitting malicious trees by providing a max tree depth for the emit visitor. This was done by adding an `EmitOptions` structure as an argument both to the emitter and to the emit functions, which is then forwarded to the emitter. This `EmitOptions` structure has a max tree depth setting with a default value of 64. +- [#PR431](https://github.com/biojppm/rapidyaml/pull/431) - Fix `_RYML_CB_ALLOC()` using `(T)` in parenthesis, making the macro unusable. 
+- [#434](https://github.com/biojppm/rapidyaml/issues/434) - Ensure empty vals are not deserialized ([#PR436](https://github.com/biojppm/rapidyaml/pull/436)). +- [#PR433](https://github.com/biojppm/rapidyaml/pull/433): + - Fix some corner cases causing read-after-free in the tree's arena when it is relocated while filtering scalars. + - Improve YAML error conformance - detect YAML-mandated parse errors when: + - directives are misplaced (eg [9MMA](https://matrix.yaml.info/details/9MMA.html), [9HCY](https://matrix.yaml.info/details/9HCY.html), [B63P](https://matrix.yaml.info/details/B63P.html), [EB22](https://matrix.yaml.info/details/EB22.html), [SF5V](https://matrix.yaml.info/details/SF5V.html)). + - comments are misplaced (eg [MUS6/00](https://matrix.yaml.info/details/MUS6:00.html), [9JBA](https://matrix.yaml.info/details/9JBA.html), [SU5Z](https://matrix.yaml.info/details/SU5Z.html)) + - a node has both an anchor and an alias (eg [SR86](https://matrix.yaml.info/details/SR86.html), [SU74](https://matrix.yaml.info/details/SU74.html)). + - tags contain [invalid characters](https://yaml.org/spec/1.2.2/#tag-shorthands) `,{}[]` (eg [LHL4](https://matrix.yaml.info/details/LHL4.html), [U99R](https://matrix.yaml.info/details/U99R.html), [WZ62](https://matrix.yaml.info/details/WZ62.html)). + + +### New features + +- [#PR431](https://github.com/biojppm/rapidyaml/pull/431) - append-emitting to existing containers in the `emitrs_` functions, suggested in [#345](https://github.com/biojppm/rapidyaml/issues/345). This was achieved by adding a `bool append=false` as the last parameter of these functions. 
+- [#PR431](https://github.com/biojppm/rapidyaml/pull/431) - add depth query methods: + ```cpp + Tree::depth_asc(id_type) const; // O(log(num_tree_nodes)) get the depth of a node ascending (ie, from root to node) + Tree::depth_desc(id_type) const; // O(num_tree_nodes) get the depth of a node descending (ie, from node to deep-most leaf node) + ConstNodeRef::depth_asc() const; // likewise + ConstNodeRef::depth_desc() const; + NodeRef::depth_asc() const; + NodeRef::depth_desc() const; + ``` +- [#PR432](https://github.com/biojppm/rapidyaml/pull/432) - Added a function to estimate the required tree capacity, based on yaml markup: + ```cpp + size_t estimate_tree_capacity(csubstr); // estimate number of nodes resulting from yaml + ``` + + +------ +All other changes come from [#PR414](https://github.com/biojppm/rapidyaml/pull/414). + +### Parser refactor + +The parser was completely refactored ([#PR414](https://github.com/biojppm/rapidyaml/pull/414)). This was a large and hard job carried out over several months, but it brings important improvements. + +- The new parser is an event-based parser, based on an event dispatcher engine. This engine is templated on event handler, where each event is a function call, which spares branches on the event handler. The parsing code was fully rewritten, and is now much more simple (albeit longer), and much easier to work with and fix. +- YAML standard-conformance was improved significantly. Along with many smaller fixes and additions, (too many to list here), the main changes are the following: + - The parser engine can now successfully parse container keys, emitting all the events in correctly, **but** as before, the ryml tree cannot accomodate these (and this constraint is no longer enforced by the parser, but instead by `EventHandlerTree`). 
For an example of a handler which can accomodate key containers, see the one which is used for the test suite at `test/test_suite/test_suite_event_handler.hpp` + - Anchor keys can now be terminated with colon (eg, `&anchor: key: val`), as dictated by the standard. +- The parser engine can now be used to create native trees in other programming languages, or in cases where the user *must* have container keys. +- Performance of both parsing and emitting improved significantly; see some figures below. + + +### Strict JSON parser + +- A strict JSON parser was added. Use the `parse_json_...()` family of functions to parse json in stricter mode (and faster) than flow-style YAML. + + +### YAML style preserved while parsing + +- The YAML style information is now fully preserved through parsing/emitting round trips. This was made possible because the event model of the new parsing engine now incorporates style varieties. So, for example: + - a scalar parsed from a plain/single-quoted/double-quoted/block-literal/block-folded scalar will be emitted always using its original style in the YAML source + - a container parsed in block-style will always be emitted in block-style + - a container parsed in flow-style will always be emitted in flow-style + Because of this, the style of YAML emitted by ryml changes from previous releases. +- Scalar filtering was improved and is now done directly in the source being parsed (which may be in place or in the arena), except in the cases where the scalar expands and does not fit its initial range, in which case the scalar is filtered out of place to the tree's arena. + - Filtering can now be disabled while parsing, to ensure a fully-readonly parse (but this feature is still experimental and somewhat untested, given the scope of the rewrite work). + - The parser now offers methods to filter scalars in place or out of place. 
+- Style flags were added to `NodeType_e`: + ```cpp + FLOW_SL ///< mark container with single-line flow style (seqs as '[val1,val2], maps as '{key: val,key2: val2}') + FLOW_ML ///< mark container with multi-line flow style (seqs as '[\n val1,\n val2\n], maps as '{\n key: val,\n key2: val2\n}') + BLOCK ///< mark container with block style (seqs as '- val\n', maps as 'key: val') + KEY_LITERAL ///< mark key scalar as multiline, block literal | + VAL_LITERAL ///< mark val scalar as multiline, block literal | + KEY_FOLDED ///< mark key scalar as multiline, block folded > + VAL_FOLDED ///< mark val scalar as multiline, block folded > + KEY_SQUO ///< mark key scalar as single quoted ' + VAL_SQUO ///< mark val scalar as single quoted ' + KEY_DQUO ///< mark key scalar as double quoted " + VAL_DQUO ///< mark val scalar as double quoted " + KEY_PLAIN ///< mark key scalar as plain scalar (unquoted, even when multiline) + VAL_PLAIN ///< mark val scalar as plain scalar (unquoted, even when multiline) + ``` +- Style predicates were added to `NodeType`, `Tree`, `ConstNodeRef` and `NodeRef`: + ```cpp + bool is_container_styled() const; + bool is_block() const + bool is_flow_sl() const; + bool is_flow_ml() const; + bool is_flow() const; + + bool is_key_styled() const; + bool is_val_styled() const; + bool is_key_literal() const; + bool is_val_literal() const; + bool is_key_folded() const; + bool is_val_folded() const; + bool is_key_squo() const; + bool is_val_squo() const; + bool is_key_dquo() const; + bool is_val_dquo() const; + bool is_key_plain() const; + bool is_val_plain() const; + ``` +- Style modifiers were also added: + ```cpp + void set_container_style(NodeType_e style); + void set_key_style(NodeType_e style); + void set_val_style(NodeType_e style); + ``` +- Emit helper predicates were added, and are used when an emitted node was built programatically without style flags: + ```cpp + /** choose a YAML emitting style based on the scalar's contents */ + NodeType_e 
scalar_style_choose(csubstr scalar) noexcept; + /** query whether a scalar can be encoded using single quotes. + * It may not be possible, notably when there is leading + * whitespace after a newline. */ + bool scalar_style_query_squo(csubstr s) noexcept; + /** query whether a scalar can be encoded using plain style (no + * quotes, not a literal/folded block scalar). */ + bool scalar_style_query_plain(csubstr s) noexcept; + ``` + +### Breaking changes + +As a result of the refactor, there are some limited changes with impact in client code. Even though this was a large refactor, effort was directed at keeping maximal backwards compatibility, and the changes are not wide. But they still exist: + +- The existing `parse_...()` methods in the `Parser` class were all removed. Use the corresponding `parse_...(Parser*, ...)` function from the header [`c4/yml/parse.hpp`](https://github.com/biojppm/rapidyaml/blob/master/src/c4/yml/parse.hpp). +- When instantiated by the user, the parser now needs to receive a `EventHandlerTree` object, which is responsible for building the tree. Although fully functional and tested, the structure of this class is still somewhat experimental and is still likely to change. There is an alternative event handler implementation responsible for producing the events for the YAML test suite in `test/test_suite/test_suite_event_handler.hpp`. +- The declaration and definition of `NodeType` was moved to a separate header file `c4/yml/node_type.hpp` (previously it was in `c4/yml/tree.hpp`). +- Some of the node type flags were removed, and several flags (and combination flags) were added. + - Most of the existing flags are kept, as well as their meaning. + - `KEYQUO` and `VALQUO` are now masks of the several style flags for quoted scalars. In general, however, client code using these flags and `.is_val_quoted()` or `.is_key_quoted()` is not likely to require any changes. 
+ + +### New type for node IDs + +A type `id_type` was added to signify the integer type for the node id, defaulting to the backwards-compatible `size_t` which was previously used in the tree. In the future, this type is likely to change, *and probably to a signed type*, so client code is encouraged to always use `id_type` instead of the `size_t`, and specifically not to rely on the signedness of this type. + + +### Reference resolver is now exposed + +The reference (ie, alias) resolver object is now exposed in +[`c4/yml/reference_resolver.hpp`](https://github.com/biojppm/rapidyaml/blob/master/src/c4/yml/reference_resolver.hpp). Previously this object was temporarily instantiated in `Tree::resolve()`. Exposing it now enables the user to reuse this object through different calls, saving a potential allocation on every call. + + +### Tag utilities + +Tag utilities were moved to the new header [`c4/yml/tag.hpp`](https://github.com/biojppm/rapidyaml/blob/master/src/c4/yml/tag.hpp). The types `Tree::tag_directive_const_iterator` and `Tree::TagDirectiveProxy` were deprecated. Fixed also an unitialization problem with `Tree::m_tag_directives`. + + +### Performance improvements + +To compare performance before and after this changeset, the benchmark runs were run (in the same PC), and the results were collected into these two files: + - [results before newparser](https://github.com/biojppm/rapidyaml/blob/master/bm/results/results_before_newparser.md) + - [results after newparser](https://github.com/biojppm/rapidyaml/blob/master/bm/results/results_after_newparser.md) + - (suggestion: compare these files in a diff viewer) + +There are a lot of results in these files, and many insights can be obtained by browsing them; too many to list here. Below we show only some selected results. 
+ + +#### Parsing +Here are some figures for parsing performance, for `bm_ryml_inplace_reuse` (name before) / `bm_ryml_yaml_inplace_reuse` (name after): + +|------|------------|-----------|--------| +| case | B/s before newparser | B/s after newparser | improv % | +|------|------------|-----------|--------| +| [PARSE/appveyor.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/appveyor.yml) | 168.628Mi/s | 232.017Mi/s | ~+40% | +| [PARSE/compile_commands.json](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/compile_commands.yml) | 630.17Mi/s | 609.877Mi/s | ~-3% | +| [PARSE/travis.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/travis.yml) | 193.674Mi/s | 271.598Mi/s | ~+50% | +| [PARSE/scalar_dquot_multiline.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/scalar_dquot_multiline.yml) | 224.796Mi/s | 187.335Mi/s | ~-10% | +| [PARSE/scalar_dquot_singleline.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/scalar_dquot_singleline.yml) | 339.889Mi/s | 388.924Mi/s | ~-16% | + +Some conclusions: +- parse performance improved by ~30%-50% for YAML without filtering-heavy parsing. +- parse performance *decreased* by ~10%-15% for YAML with filtering-heavy parsing. There is still some scope for improvement in the parsing code, so this cost may hopefully be minimized in the future. 
+ + +#### Emitting + +Here are some figures emitting performance improvements retrieved from these files, for `bm_ryml_str_reserve` (name before) / `bm_ryml_yaml_str_reserve` (name after): + +|------|------------|-----------| +| case | B/s before newparser | B/s after newparser | +|------|------------|-----------| +| [EMIT/appveyor.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/appveyor.yml) | 311.718Mi/s | 1018.44Mi/s | +| [EMIT/compile_commands.json](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/compile_commands.yml) | 434.206Mi/s | 771.682Mi/s | +| [EMIT/travis.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/travis.yml) | 333.322Mi/s | 1.41597Gi/s | +| [EMIT/scalar_dquot_multiline.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/scalar_dquot_multiline.yml) | 868.6Mi/s | 692.564Mi/s | +| [EMIT/scalar_dquot_singleline.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/scalar_dquot_singleline.yml) | 336.98Mi/s | 638.368Mi/s | +| [EMIT/style_seqs_flow_outer1000_inner100.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/style_seqs_flow_outer1000_inner100.yml) | 136.826Mi/s | 279.487Mi/s | + +Emit performance improved everywhere by over 1.5x and as much as 3x-4x for YAML without filtering-heavy parsing. diff --git a/changelog/current.md b/changelog/current.md index 69a39c9df..e69de29bb 100644 --- a/changelog/current.md +++ b/changelog/current.md @@ -1,191 +0,0 @@ -Most of the changes are from the giant Parser refactor described below. Before getting to that, some other minor changes first. - - -### Fixes - -- [#PR431](https://github.com/biojppm/rapidyaml/pull/431) - Emitter: prevent stack overflows when emitting malicious trees by providing a max tree depth for the emit visitor. This was done by adding an `EmitOptions` structure as an argument both to the emitter and to the emit functions, which is then forwarded to the emitter. 
This `EmitOptions` structure has a max tree depth setting with a default value of 64. -- [#PR431](https://github.com/biojppm/rapidyaml/pull/431) - Fix `_RYML_CB_ALLOC()` using `(T)` in parenthesis, making the macro unusable. -- [#434](https://github.com/biojppm/rapidyaml/issues/434) - Ensure empty vals are not deserialized ([#PR436](https://github.com/biojppm/rapidyaml/pull/436)). -- [#PR433](https://github.com/biojppm/rapidyaml/pull/433): - - Fix some corner cases causing read-after-free in the tree's arena when it is relocated while filtering scalar. - - Improve YAML error conformance - detect YAML-mandated parse errors when: - - directives are misplaced (eg [9MMA](https://matrix.yaml.info/details/9MMA.html), [9HCY](https://matrix.yaml.info/details/9HCY.html), [B63P](https://matrix.yaml.info/details/B63P.html), [EB22](https://matrix.yaml.info/details/EB22.html), [SF5V](https://matrix.yaml.info/details/SF5V.html)). - - comments are misplaced (eg [MUS6/00](https://matrix.yaml.info/details/MUS6:00.html), [9JBA](https://matrix.yaml.info/details/9JBA.html), [SU5Z](https://matrix.yaml.info/details/SU5Z.html)) - - a node has both an anchor and an alias (eg [SR86](https://matrix.yaml.info/details/SR86.html), [SU74](https://matrix.yaml.info/details/SU74.html)). - - tags contain [invalid characters](https://yaml.org/spec/1.2.2/#tag-shorthands) `,{}[]` (eg [LHL4](https://matrix.yaml.info/details/LHL4.html), [U99R](https://matrix.yaml.info/details/U99R.html), [WZ62](https://matrix.yaml.info/details/WZ62.html)). - - -### New features - -- [#PR431](https://github.com/biojppm/rapidyaml/pull/431) - append-emitting to existing containers in the `emitrs_` functions, suggested in [#345](https://github.com/biojppm/rapidyaml/issues/345). This was achieved by adding a `bool append=false` as the last parameter of these functions. 
-- [#PR431](https://github.com/biojppm/rapidyaml/pull/431) - add depth query methods: - ```cpp - Tree::depth_asc(id_type) const; // O(log(num_tree_nodes)) get the depth of a node ascending (ie, from root to node) - Tree::depth_desc(id_type) const; // O(num_tree_nodes) get the depth of a node descending (ie, from node to deep-most leaf node) - ConstNodeRef::depth_asc() const; // likewise - ConstNodeRef::depth_desc() const; - NodeRef::depth_asc() const; - NodeRef::depth_desc() const; - ``` -- [#PR432](https://github.com/biojppm/rapidyaml/pull/432) - Added a function to estimate the required tree capacity, based on yaml markup: - ```cpp - size_t estimate_tree_capacity(csubstr); // estimate number of nodes resulting from yaml - ``` - - ------- -All other changes come from [#PR414](https://github.com/biojppm/rapidyaml/pull/414). - -### Parser refactor - -The parser was completely refactored ([#PR414](https://github.com/biojppm/rapidyaml/pull/414)). This was a large and hard job carried out over several months, but it brings important improvements. - -- The new parser is an event-based parser, based on an event dispatcher engine. This engine is templated on event handler, where each event is a function call, which spares branches on the event handler. The parsing code was fully rewritten, and is now much more simple (albeit longer), and much easier to work with and fix. -- YAML standard-conformance was improved significantly. Along with many smaller fixes and additions, (too many to list here), the main changes are the following: - - The parser engine can now successfully parse container keys, emitting all the events in correctly, **but** as before, the ryml tree cannot accomodate these (and this constraint is no longer enforced by the parser, but instead by `EventHandlerTree`). 
For an example of a handler which can accomodate key containers, see the one which is used for the test suite at `test/test_suite/test_suite_event_handler.hpp` - - Anchor keys can now be terminated with colon (eg, `&anchor: key: val`), as dictated by the standard. -- The parser engine can now be used to create native trees in other programming languages, or in cases where the user *must* have container keys. -- Performance of both parsing and emitting improved significantly; see some figures below. - - -### Strict JSON parser - -- A strict JSON parser was added. Use the `parse_json_...()` family of functions to parse json in stricter mode (and faster) than flow-style YAML. - - -### YAML style preserved while parsing - -- The YAML style information is now fully preserved through parsing/emitting round trips. This was made possible because the event model of the new parsing engine now incorporates style varieties. So, for example: - - a scalar parsed from a plain/single-quoted/double-quoted/block-literal/block-folded scalar will be emitted always using its original style in the YAML source - - a container parsed in block-style will always be emitted in block-style - - a container parsed in flow-style will always be emitted in flow-style - Because of this, the style of YAML emitted by ryml changes from previous releases. -- Scalar filtering was improved and is now done directly in the source being parsed (which may be in place or in the arena), except in the cases where the scalar expands and does not fit its initial range, in which case the scalar is filtered out of place to the tree's arena. - - Filtering can now be disabled while parsing, to ensure a fully-readonly parse (but this feature is still experimental and somewhat untested, given the scope of the rewrite work). - - The parser now offers methods to filter scalars in place or out of place. 
-- Style flags were added to `NodeType_e`: - ```cpp - FLOW_SL ///< mark container with single-line flow style (seqs as '[val1,val2], maps as '{key: val,key2: val2}') - FLOW_ML ///< mark container with multi-line flow style (seqs as '[\n val1,\n val2\n], maps as '{\n key: val,\n key2: val2\n}') - BLOCK ///< mark container with block style (seqs as '- val\n', maps as 'key: val') - KEY_LITERAL ///< mark key scalar as multiline, block literal | - VAL_LITERAL ///< mark val scalar as multiline, block literal | - KEY_FOLDED ///< mark key scalar as multiline, block folded > - VAL_FOLDED ///< mark val scalar as multiline, block folded > - KEY_SQUO ///< mark key scalar as single quoted ' - VAL_SQUO ///< mark val scalar as single quoted ' - KEY_DQUO ///< mark key scalar as double quoted " - VAL_DQUO ///< mark val scalar as double quoted " - KEY_PLAIN ///< mark key scalar as plain scalar (unquoted, even when multiline) - VAL_PLAIN ///< mark val scalar as plain scalar (unquoted, even when multiline) - ``` -- Style predicates were added to `NodeType`, `Tree`, `ConstNodeRef` and `NodeRef`: - ```cpp - bool is_container_styled() const; - bool is_block() const - bool is_flow_sl() const; - bool is_flow_ml() const; - bool is_flow() const; - - bool is_key_styled() const; - bool is_val_styled() const; - bool is_key_literal() const; - bool is_val_literal() const; - bool is_key_folded() const; - bool is_val_folded() const; - bool is_key_squo() const; - bool is_val_squo() const; - bool is_key_dquo() const; - bool is_val_dquo() const; - bool is_key_plain() const; - bool is_val_plain() const; - ``` -- Style modifiers were also added: - ```cpp - void set_container_style(NodeType_e style); - void set_key_style(NodeType_e style); - void set_val_style(NodeType_e style); - ``` -- Emit helper predicates were added, and are used when an emitted node was built programatically without style flags: - ```cpp - /** choose a YAML emitting style based on the scalar's contents */ - NodeType_e 
scalar_style_choose(csubstr scalar) noexcept; - /** query whether a scalar can be encoded using single quotes. - * It may not be possible, notably when there is leading - * whitespace after a newline. */ - bool scalar_style_query_squo(csubstr s) noexcept; - /** query whether a scalar can be encoded using plain style (no - * quotes, not a literal/folded block scalar). */ - bool scalar_style_query_plain(csubstr s) noexcept; - ``` - -### Breaking changes - -As a result of the refactor, there are some limited changes with impact in client code. Even though this was a large refactor, effort was directed at keeping maximal backwards compatibility, and the changes are not wide. But they still exist: - -- The existing `parse_...()` methods in the `Parser` class were all removed. Use the corresponding `parse_...(Parser*, ...)` function from the header [`c4/yml/parse.hpp`](https://github.com/biojppm/rapidyaml/blob/master/src/c4/yml/parse.hpp). -- When instantiated by the user, the parser now needs to receive a `EventHandlerTree` object, which is responsible for building the tree. Although fully functional and tested, the structure of this class is still somewhat experimental and is still likely to change. There is an alternative event handler implementation responsible for producing the events for the YAML test suite in `test/test_suite/test_suite_event_handler.hpp`. -- The declaration and definition of `NodeType` was moved to a separate header file `c4/yml/node_type.hpp` (previously it was in `c4/yml/tree.hpp`). -- Some of the node type flags were removed, and several flags (and combination flags) were added. - - Most of the existing flags are kept, as well as their meaning. - - `KEYQUO` and `VALQUO` are now masks of the several style flags for quoted scalars. In general, however, client code using these flags and `.is_val_quoted()` or `.is_key_quoted()` is not likely to require any changes. 
- - -### New type for node IDs - -A type `id_type` was added to signify the integer type for the node id, defaulting to the backwards-compatible `size_t` which was previously used in the tree. In the future, this type is likely to change, *and probably to a signed type*, so client code is encouraged to always use `id_type` instead of the `size_t`, and specifically not to rely on the signedness of this type. - - -### Reference resolver is now exposed - -The reference (ie, alias) resolver object is now exposed in -[`c4/yml/reference_resolver.hpp`](https://github.com/biojppm/rapidyaml/blob/master/src/c4/yml/reference_resolver.hpp). Previously this object was temporarily instantiated in `Tree::resolve()`. Exposing it now enables the user to reuse this object through different calls, saving a potential allocation on every call. - - -### Tag utilities - -Tag utilities were moved to the new header [`c4/yml/tag.hpp`](https://github.com/biojppm/rapidyaml/blob/master/src/c4/yml/tag.hpp). The types `Tree::tag_directive_const_iterator` and `Tree::TagDirectiveProxy` were deprecated. Fixed also an unitialization problem with `Tree::m_tag_directives`. - - -### Performance improvements - -To compare performance before and after this changeset, the benchmark runs were run (in the same PC), and the results were collected into these two files: - - [results before newparser](https://github.com/biojppm/rapidyaml/blob/master/bm/results/results_before_newparser.md) - - [results after newparser](https://github.com/biojppm/rapidyaml/blob/master/bm/results/results_after_newparser.md) - - (suggestion: compare these files in a diff viewer) - -There are a lot of results in these files, and many insights can be obtained by browsing them; too many to list here. Below we show only some selected results. 
- - -#### Parsing -Here are some figures for parsing performance, for `bm_ryml_inplace_reuse` (name before) / `bm_ryml_yaml_inplace_reuse` (name after): - -|------|------------|-----------|--------| -| case | B/s before newparser | B/s after newparser | improv % | -|------|------------|-----------|--------| -| [PARSE/appveyor.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/appveyor.yml) | 168.628Mi/s | 232.017Mi/s | ~+40% | -| [PARSE/compile_commands.json](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/compile_commands.yml) | 630.17Mi/s | 609.877Mi/s | ~-3% | -| [PARSE/travis.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/travis.yml) | 193.674Mi/s | 271.598Mi/s | ~+50% | -| [PARSE/scalar_dquot_multiline.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/scalar_dquot_multiline.yml) | 224.796Mi/s | 187.335Mi/s | ~-10% | -| [PARSE/scalar_dquot_singleline.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/scalar_dquot_singleline.yml) | 339.889Mi/s | 388.924Mi/s | ~-16% | - -Some conclusions: -- parse performance improved by ~30%-50% for YAML without filtering-heavy parsing. -- parse performance *decreased* by ~10%-15% for YAML with filtering-heavy parsing. There is still some scope for improvement in the parsing code, so this cost may hopefully be minimized in the future. 
- - -#### Emitting - -Here are some figures emitting performance improvements retrieved from these files, for `bm_ryml_str_reserve` (name before) / `bm_ryml_yaml_str_reserve` (name after): - -|------|------------|-----------| -| case | B/s before newparser | B/s after newparser | -|------|------------|-----------| -| [EMIT/appveyor.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/appveyor.yml) | 311.718Mi/s | 1018.44Mi/s | -| [EMIT/compile_commands.json](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/compile_commands.yml) | 434.206Mi/s | 771.682Mi/s | -| [EMIT/travis.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/travis.yml) | 333.322Mi/s | 1.41597Gi/s | -| [EMIT/scalar_dquot_multiline.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/scalar_dquot_multiline.yml) | 868.6Mi/s | 692.564Mi/s | -| [EMIT/scalar_dquot_singleline.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/scalar_dquot_singleline.yml) | 336.98Mi/s | 638.368Mi/s | -| [EMIT/style_seqs_flow_outer1000_inner100.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/style_seqs_flow_outer1000_inner100.yml) | 136.826Mi/s | 279.487Mi/s | - -Emit performance improved everywhere by over 1.5x and as much as 3x-4x for YAML without filtering-heavy parsing. diff --git a/doc/Doxyfile b/doc/Doxyfile index 6358b1f13..1e72e59f4 100644 --- a/doc/Doxyfile +++ b/doc/Doxyfile @@ -48,7 +48,7 @@ PROJECT_NAME = rapidyaml # could be handy for archiving the generated documentation or if some version # control system is used. 
-PROJECT_NUMBER = 0.6.0 +PROJECT_NUMBER = 0.7.0 # Using the PROJECT_BRIEF tag one can provide an optional one line description # for a project that appears at the top of each page and should give viewer a diff --git a/doc/conf.py b/doc/conf.py index 0067c34f5..0d4840691 100644 --- a/doc/conf.py +++ b/doc/conf.py @@ -10,7 +10,7 @@ project = 'rapidyaml' copyright = '2018-2024 Joao Paulo Magalhaes ' author = 'Joao Paulo Magalhaes ' -release = '0.6.0' +release = '0.7.0' # -- General configuration --------------------------------------------------- # https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration diff --git a/doc/doxy_main.md b/doc/doxy_main.md index 5a64e9ae2..35fb948be 100644 --- a/doc/doxy_main.md +++ b/doc/doxy_main.md @@ -1,6 +1,6 @@ # rapidyaml -* Begin by looking at the [project's README](https://github.com/biojppm/rapidyaml/blob/v0.6.0/README.md) +* Begin by looking at the [project's README](https://github.com/biojppm/rapidyaml/blob/v0.7.0/README.md) * [Documentation page](https://rapidyaml.readthedocs.org) * Next, skim the docs for the @ref doc_quickstart sample. * Good! Now the main ryml topics: diff --git a/doc/sphinx_is_it_rapid.rst b/doc/sphinx_is_it_rapid.rst index 7a0c6d1d3..81010b9c6 100644 --- a/doc/sphinx_is_it_rapid.rst +++ b/doc/sphinx_is_it_rapid.rst @@ -25,11 +25,11 @@ faster. nicely as claimed here, we would definitely like to see it! Please open an issue, or submit a pull request adding the file to `bm/cases - `__, or + `__, or just send us the files. `Here’s a parsing benchmark -`__. Using +`__. Using different approaches within ryml (in-situ/read-only vs. with/without reuse), a YAML / JSON buffer is repeatedly parsed, and compared against other libraries. @@ -40,7 +40,7 @@ Comparison with yaml-cpp The first result set is for Windows, and is using a `appveyor.yml config file -`__. A +`__. 
A comparison of these results is summarized on the table below: =========================== ===== ======= ========== @@ -52,11 +52,11 @@ appveyor / vs2017 / Debug 6.4 0.0844 76x / 1.3% The next set of results is taken in Linux, comparing g++ 8.2 and clang++ 7.0.1 in parsing a YAML buffer from a `travis.yml config file -`__ +`__ or a JSON buffer from a `compile_commands.json file -`__. You +`__. You can `see the full results here -`__. Summarizing: +`__. Summarizing: ========================== ===== ======= ======== Read rates (MB/s) ryml yamlcpp compared @@ -89,9 +89,9 @@ So how does ryml compare against other JSON readers? Well, it may not be the fastest, but it's definitely ahead of the pack! The benchmark is the `same as above -`__, +`__, and it is reading the `compile_commands.json -`__, +`__, The ``_arena`` suffix notes parsing a read-only buffer (so buffer copies are performed), while the ``_inplace`` suffix means that the source buffer can be parsed in place. The ``_reuse`` means the data @@ -131,7 +131,7 @@ Performance emitting -------------------- `Emitting benchmarks -`__ +`__ also show similar speedups from the existing libraries, also anecdotally reported by some users `(eg, here’s a user reporting 25x speedup from yaml-cpp) diff --git a/doc/sphinx_quicklinks.rst b/doc/sphinx_quicklinks.rst index 133d4f80d..ab5b62ed7 100644 --- a/doc/sphinx_quicklinks.rst +++ b/doc/sphinx_quicklinks.rst @@ -15,11 +15,11 @@ Quick links * `Kanban board `_ -* Latest release: `0.6.0 `_ +* Latest release: `0.7.0 `_ - * `Release page [0.6.0] `_ + * `Release page [0.7.0] `_ - * `README [0.6.0] `_ + * `README [0.7.0] `_ * Since latest release (master branch): diff --git a/doc/sphinx_try_quickstart.rst b/doc/sphinx_try_quickstart.rst index 97391fb5c..e36349009 100644 --- a/doc/sphinx_try_quickstart.rst +++ b/doc/sphinx_try_quickstart.rst @@ -11,7 +11,7 @@ include(FetchContent) FetchContent_Declare(ryml GIT_REPOSITORY https://github.com/biojppm/rapidyaml.git - GIT_TAG v0.6.0 + 
GIT_TAG v0.7.0 GIT_SHALLOW FALSE # ensure submodules are checked out ) FetchContent_MakeAvailable(ryml) diff --git a/doc/sphinx_using.rst b/doc/sphinx_using.rst index d498282d8..a2f61bbb6 100644 --- a/doc/sphinx_using.rst +++ b/doc/sphinx_using.rst @@ -7,7 +7,7 @@ Quickstart build samples These samples show different ways of getting ryml into your application. All the samples use `the same quickstart executable -source `__, but are built in different ways, +source `__, but are built in different ways, showing several alternatives to integrate ryml into your project. We also encourage you to refer to the `quickstart docs `__, which extensively cover @@ -29,19 +29,19 @@ more about each sample: +-------------------------------------------------------------------------------------------------+----------------------------------+--------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------+ | Sample name | ryml is part of build? 
| cmake file | commands | +=================================================================================================+==================================+==============================================================================================================+=============================================================================================================+ -| `singleheader `_ | | **yes** | `CMakeLists.txt `_ | `run.sh `_ | +| `singleheader `_ | | **yes** | `CMakeLists.txt `_ | `run.sh `_ | | | | ryml brought as a single | | | | | | header, not as a library | | | +-------------------------------------------------------------------------------------------------+----------------------------------+--------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------+ -| `singleheaderlib `_ | | **yes** | `CMakeLists.txt `_ | | `run_shared.sh `_ | -| | | ryml brought as library | | | `run_static.sh `_ | +| `singleheaderlib `_ | | **yes** | `CMakeLists.txt `_ | | `run_shared.sh `_ | +| | | ryml brought as library | | | `run_static.sh `_ | | | | but from the single header | | | +-------------------------------------------------------------------------------------------------+----------------------------------+--------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------+ -| `add_subdirectory `_ | **yes** | `CMakeLists.txt `_ | `run.sh `_ | +| `add_subdirectory `_ | **yes** | `CMakeLists.txt `_ | `run.sh `_ | 
+-------------------------------------------------------------------------------------------------+----------------------------------+--------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------+ -| `fetch_content `_ | **yes** | `CMakeLists.txt `_ | `run.sh `_ | +| `fetch_content `_ | **yes** | `CMakeLists.txt `_ | `run.sh `_ | +-------------------------------------------------------------------------------------------------+----------------------------------+--------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------+ -| `find_package `_ | | **no** | `CMakeLists.txt `_ | `run.sh `_ | +| `find_package `_ | | **no** | `CMakeLists.txt `_ | `run.sh `_ | | | | needs prior install or package | | | +-------------------------------------------------------------------------------------------------+----------------------------------+--------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------+ @@ -51,7 +51,7 @@ As a single-header ryml is provided chiefly as a cmake library project, but it can also be used as a single header file, and there is a `tool to amalgamate -`__ +`__ the code into a single header file. The amalgamated header file is provided with each release, but you can also generate a customized file suiting your particular needs (or commit): @@ -185,7 +185,7 @@ that c4core is a submodule of the current repo. However, it is still possible to use a c4core version different from the one in the repo (of course, only if there are no incompatibilities between the versions). 
 You can find out how to achieve this by looking at the
-`custom_c4core sample `__.
+`custom_c4core sample `__.
 CMake build settings for ryml
diff --git a/ext/c4core b/ext/c4core
index 6da883076..1cf2a755a 160000
--- a/ext/c4core
+++ b/ext/c4core
@@ -1 +1 @@
-Subproject commit 6da883076e27ae36855cc512589a549797d5f6c0
+Subproject commit 1cf2a755a5853651e42aec99c5ef49bb2ec56cf5
diff --git a/samples/quickstart.cpp b/samples/quickstart.cpp
index d27b738bb..eed235d22 100644
--- a/samples/quickstart.cpp
+++ b/samples/quickstart.cpp
@@ -166,7 +166,7 @@ namespace sample {
  * include(FetchContent)
  * FetchContent_Declare(ryml
  * GIT_REPOSITORY https://github.com/biojppm/rapidyaml.git
- * GIT_TAG v0.6.0
+ * GIT_TAG v0.7.0
  * GIT_SHALLOW FALSE # ensure submodules are checked out
  * )
  * FetchContent_MakeAvailable(ryml)
diff --git a/tbump.toml b/tbump.toml
index 800b467be..a7751f398 100644
--- a/tbump.toml
+++ b/tbump.toml
@@ -5,7 +5,7 @@ github_url = "https://github.com/biojppm/rapidyaml/"
 [version]
-current = "0.6.0"
+current = "0.7.0"
 # Example of a semver regexp.
 # Make sure this matches current_version before
diff --git a/test/test_install/CMakeLists.txt b/test/test_install/CMakeLists.txt
index 67853bae0..1d4479d08 100644
--- a/test/test_install/CMakeLists.txt
+++ b/test/test_install/CMakeLists.txt
@@ -4,7 +4,7 @@ project(ryml
 HOMEPAGE_URL "https://github.com/biojppm/rapidyaml"
 LANGUAGES CXX)
 include(../../ext/c4core/cmake/c4Project.cmake)
-c4_project(VERSION 0.6.0
+c4_project(VERSION 0.7.0
 AUTHOR "Joao Paulo Magalhaes ")
diff --git a/test/test_singleheader/CMakeLists.txt b/test/test_singleheader/CMakeLists.txt
index 21695561c..4fd464831 100644
--- a/test/test_singleheader/CMakeLists.txt
+++ b/test/test_singleheader/CMakeLists.txt
@@ -4,7 +4,7 @@ project(ryml
 HOMEPAGE_URL "https://github.com/biojppm/rapidyaml"
 LANGUAGES CXX)
 include(../../ext/c4core/cmake/c4Project.cmake)
-c4_project(VERSION 0.6.0
+c4_project(VERSION 0.7.0
 AUTHOR "Joao Paulo Magalhaes ")
 # amalgamate ryml to get the single header