[doc] update changelog and README

[ci skip]
This commit is contained in:
Joao Paulo Magalhaes
2022-08-14 13:59:43 +01:00
parent 06c955e147
commit 9097d26e6e
3 changed files with 173 additions and 76 deletions

215
README.md
View File

@@ -250,7 +250,7 @@ eg the [`add_subdirectory()` sample](samples/add_subdirectory/).
// - reuse an existing tree (advised)
// - reuse an existing parser (advised)
char yml_buf[] = "{foo: 1, bar: [2, 3], john: doe}";
ryml::Tree tree = ryml::parse_in_place(ryml::substr(yml_buf));
ryml::Tree tree = ryml::parse_in_place(yml_buf);
// Note: it will always be significantly faster to use mutable
// buffers and reuse tree+parser.
@@ -276,14 +276,14 @@ CHECK(tree.is_seq(bar_id)); // ... and receive the subject index
// The node API is a lightweight abstraction sitting on top of the
// index API, but offering a much more convenient interaction:
ryml::NodeRef root = tree.rootref();
ryml::NodeRef bar = tree["bar"];
ryml::ConstNodeRef root = tree.rootref();
ryml::ConstNodeRef bar = tree["bar"];
CHECK(root.is_map());
CHECK(bar.is_seq());
// NodeRef is a lightweight handle to the tree and associated id:
CHECK(root.tree() == &tree); // NodeRef points at its tree, WITHOUT refcount
CHECK(root.id() == root_id); // NodeRef's id is the index of the node
CHECK(bar.id() == bar_id); // NodeRef's id is the index of the node
// A node ref is a lightweight handle to the tree and associated id:
CHECK(root.tree() == &tree); // a node ref points at its tree, WITHOUT refcount
CHECK(root.id() == root_id); // a node ref's id is the index of the node
CHECK(bar.id() == bar_id); // a node ref's id is the index of the node
// The node API translates very cleanly to the index API, so most
// of the code examples below are using the node API.
@@ -297,14 +297,14 @@ CHECK(bar.id() == bar_id); // NodeRef's id is the index of the node
//------------------------------------------------------------------
// To read the parsed tree
// Node::operator[] does a lookup, is O(num_children[node]).
// maps use string keys, seqs use integral keys.
// ConstNodeRef::operator[] does a lookup, is O(num_children[node]).
CHECK(tree["foo"].is_keyval());
CHECK(tree["foo"].key() == "foo");
CHECK(tree["foo"].val() == "1");
CHECK(tree["bar"].is_seq());
CHECK(tree["bar"].has_key());
CHECK(tree["bar"].key() == "bar");
// maps use string keys, seqs use integral keys:
CHECK(tree["bar"][0].val() == "2");
CHECK(tree["bar"][1].val() == "3");
CHECK(tree["john"].val() == "doe");
@@ -314,12 +314,11 @@ CHECK(tree["john"].val() == "doe");
CHECK(tree[0].id() == tree["foo"].id());
CHECK(tree[1].id() == tree["bar"].id());
CHECK(tree[2].id() == tree["john"].id());
// Tree::operator[](int) searches a root child by its position.
// Tree::operator[](int) searches a ***root*** child by its position.
CHECK(tree[0].id() == tree["foo"].id()); // 0: first child of root
CHECK(tree[1].id() == tree["bar"].id()); // 1: first child of root
CHECK(tree[2].id() == tree["john"].id()); // 2: first child of root
// NodeRef::operator[](int) searches a node child by its position
// on __the node__'s children list:
// NodeRef::operator[](int) searches a ***node*** child by its position:
CHECK(bar[0].val() == "2"); // 0 means first child of bar
CHECK(bar[1].val() == "3"); // 1 means second child of bar
// NodeRef::operator[](string):
@@ -337,29 +336,36 @@ CHECK(root["foo"].id() == root[0].id());
CHECK(root["bar"].id() == root[1].id());
CHECK(root["john"].id() == root[2].id());
// Please note that since a ryml tree uses indexed linked lists for storing
// children, the complexity of `Tree::operator[csubstr]` and
// `Tree::operator[size_t]` is linear on the number of root children. If you use
// it with a large tree where the root has many children, you may get a
// performance hit. To avoid this hit, you can create your own accelerator
// structure. For example, before doing a lookup, do a single traverse at the
// root level to fill an `std::map<csubstr,size_t>` mapping key names to node
// indices; with a node index, a lookup (via `Tree::get()`) is O(1), so this way
// you can get O(log n) lookup from a key.
// IMPORTANT. The ryml tree uses indexed linked lists for storing
// children, so the complexity of `Tree::operator[csubstr]` and
// `Tree::operator[size_t]` is linear on the number of root
// children. If you use `Tree::operator[]` with a large tree where
// the root has many children, you will see a performance hit.
//
// As for `NodeRef`, the difference from `NodeRef::operator[]`
// to `Tree::operator[]` is that the latter refers to the root node, whereas
// the former can be invoked on any node. But the lookup process is the same for
// both and their algorithmic complexity is the same: they are both linear in
// the number of direct children; but depending on the data, that number may
// be very different from one to another.
// To avoid this hit, you can create your own accelerator
// structure. For example, before doing a lookup, do a single
// traverse at the root level to fill an `map<csubstr,size_t>`
// mapping key names to node indices; with a node index, a lookup
// (via `Tree::get()`) is O(1), so this way you can get O(log n)
// lookup from a key. (But please do not use `std::map` if you
// care about performance; use something else like a flat map or
// sorted vector).
//
// As for node refs, the difference from `NodeRef::operator[]` and
// `ConstNodeRef::operator[]` to `Tree::operator[]` is that the
// latter refers to the root node, whereas the former are invoked
// on their target node. But the lookup process works the same for
// both and their algorithmic complexity is the same: they are
// both linear in the number of direct children. But of course,
// depending on the data, that number may be very different from
// one to another.
//------------------------------------------------------------------
// Hierarchy:
{
ryml::NodeRef foo = root.first_child();
ryml::NodeRef john = root.last_child();
ryml::ConstNodeRef foo = root.first_child();
ryml::ConstNodeRef john = root.last_child();
CHECK(tree.size() == 6); // O(1) number of nodes in the tree
CHECK(root.num_children() == 3); // O(num_children[root])
CHECK(foo.num_siblings() == 3); // O(num_children[parent(foo)])
@@ -384,13 +390,13 @@ CHECK(root["john"].id() == root[2].id());
// iterate children using the high-level node API:
{
size_t count = 0;
for(ryml::NodeRef const& child : root.children())
for(ryml::ConstNodeRef const& child : root.children())
CHECK(child.key() == expected_keys[count++]);
}
// iterate siblings using the high-level node API:
{
size_t count = 0;
for(ryml::NodeRef const& child : root["foo"].siblings())
for(ryml::ConstNodeRef const& child : root["foo"].siblings())
CHECK(child.key() == expected_keys[count++]);
}
// iterate children using the lower-level tree index API:
@@ -440,13 +446,19 @@ CHECK(!tree["bar"][1].has_key()); // belongs to a seq, so no key
//------------------------------------------------------------------
// Modifying existing nodes: operator<< vs operator=
// As implied by its name, ConstNodeRef is a reference to a const
// node. It can be used to read from the node, but not write to it
// or modify the hierarchy of the node. If any modification is
// desired then a NodeRef must be used instead:
ryml::NodeRef wroot = tree.rootref();
// operator= assigns an existing string to the receiving node.
// This pointer will be in effect until the tree goes out of scope
// so beware to only assign from strings outliving the tree.
root["foo"] = "says you";
root["bar"][0] = "-2";
root["bar"][1] = "-3";
root["john"] = "ron";
wroot["foo"] = "says you";
wroot["bar"][0] = "-2";
wroot["bar"][1] = "-3";
wroot["john"] = "ron";
// Now the tree is _pointing_ at the memory of the strings above.
// That is OK because those are static strings and will outlive
// the tree.
@@ -465,10 +477,10 @@ CHECK(root["john"].val() == "ron");
// assigns the serialized string to the receiving node. This avoids
// constraints with the lifetime, since the arena lives with the tree.
CHECK(tree.arena().empty());
root["foo"] << "says who"; // requires to_chars(). see serialization samples below.
root["bar"][0] << 20;
root["bar"][1] << 30;
root["john"] << "deere";
wroot["foo"] << "says who"; // requires to_chars(). see serialization samples below.
wroot["bar"][0] << 20;
wroot["bar"][1] << 30;
wroot["john"] << "deere";
CHECK(root["foo"].val() == "says who");
CHECK(root["bar"][0].val() == "20");
CHECK(root["bar"][1].val() == "30");
@@ -478,7 +490,7 @@ CHECK(tree.arena() == "says who2030deere"); // the result of serializations to t
{
std::string ok("in_scope");
// root["john"] = ryml::to_csubstr(ok); // don't, will dangle
root["john"] << ryml::to_csubstr(ok); // OK, copy to the tree's arena
wroot["john"] << ryml::to_csubstr(ok); // OK, copy to the tree's arena
}
CHECK(root["john"] == "in_scope"); // OK!
CHECK(tree.arena() == "says who2030deerein_scope"); // the result of serializations to the tree arena
@@ -489,40 +501,41 @@ CHECK(tree.arena() == "says who2030deerein_scope"); // the result of serializati
// adding a keyval node to a map:
CHECK(root.num_children() == 3);
root["newkeyval"] = "shiny and new"; // using these strings
root.append_child() << ryml::key("newkeyval (serialized)") << "shiny and new (serialized)"; // serializes and assigns the serialization
wroot["newkeyval"] = "shiny and new"; // using these strings
wroot.append_child() << ryml::key("newkeyval (serialized)") << "shiny and new (serialized)"; // serializes and assigns the serialization
CHECK(root.num_children() == 5);
CHECK(root["newkeyval"].key() == "newkeyval");
CHECK(root["newkeyval"].val() == "shiny and new");
CHECK(root["newkeyval (serialized)"].key() == "newkeyval (serialized)");
CHECK(root["newkeyval (serialized)"].val() == "shiny and new (serialized)");
CHECK( ! root["newkeyval"].key().is_sub(tree.arena())); // it's using directly the static string above
CHECK( ! root["newkeyval"].val().is_sub(tree.arena())); // it's using directly the static string above
CHECK( root["newkeyval (serialized)"].key().is_sub(tree.arena())); // it's using a serialization of the string above
CHECK( root["newkeyval (serialized)"].val().is_sub(tree.arena())); // it's using a serialization of the string above
CHECK( ! tree.in_arena(root["newkeyval"].key())); // it's using directly the static string above
CHECK( ! tree.in_arena(root["newkeyval"].val())); // it's using directly the static string above
CHECK( tree.in_arena(root["newkeyval (serialized)"].key())); // it's using a serialization of the string above
CHECK( tree.in_arena(root["newkeyval (serialized)"].val())); // it's using a serialization of the string above
// adding a val node to a seq:
CHECK(root["bar"].num_children() == 2);
root["bar"][2] = "oh so nice";
root["bar"][3] << "oh so nice (serialized)";
wroot["bar"][2] = "oh so nice";
wroot["bar"][3] << "oh so nice (serialized)";
CHECK(root["bar"].num_children() == 4);
CHECK(root["bar"][2].val() == "oh so nice");
CHECK(root["bar"][3].val() == "oh so nice (serialized)");
// adding a seq node:
CHECK(root.num_children() == 5);
root["newseq"] |= ryml::SEQ;
root.append_child() << ryml::key("newseq (serialized)") |= ryml::SEQ;
wroot["newseq"] |= ryml::SEQ;
wroot.append_child() << ryml::key("newseq (serialized)") |= ryml::SEQ;
CHECK(root.num_children() == 7);
CHECK(root["newseq"].num_children() == 0);
CHECK(root["newseq (serialized)"].num_children() == 0);
// adding a map node:
CHECK(root.num_children() == 7);
root["newmap"] |= ryml::MAP;
root.append_child() << ryml::key("newmap (serialized)") |= ryml::SEQ;
wroot["newmap"] |= ryml::MAP;
wroot.append_child() << ryml::key("newmap (serialized)") |= ryml::SEQ;
CHECK(root.num_children() == 9);
CHECK(root["newmap"].num_children() == 0);
CHECK(root["newmap (serialized)"].num_children() == 0);
// operator[] does not mutate the tree until the returned node is
// written to.
//
// When the tree is mutable, operator[] does not mutate the tree
// until the returned node is written to.
//
// Until such time, the NodeRef object keeps in itself the required
// information to write to the proper place in the tree. This is
@@ -530,25 +543,44 @@ CHECK(root["newmap (serialized)"].num_children() == 0);
//
// This means that passing a key/index which does not exist will
// not mutate the tree, but will instead store (in the node) the
// proper place of the tree to do so if and when it is required.
// proper place of the tree to be able to do so, if and when it is
// required.
//
// This is a significant difference from eg, the behavior of
// std::map, which mutates the map immediately within the call to
// operator[].
CHECK(!root.has_child("I am nobody"));
ryml::NodeRef nobody = root["I am nobody"];
CHECK(nobody.valid()); // points at the tree, and a specific place in the tree
CHECK(nobody.is_seed()); // ... but nothing is there yet.
CHECK(!root.has_child("I am nobody")); // same as above
ryml::NodeRef somebody = root["I am somebody"];
CHECK(!root.has_child("I am somebody")); // same as above
CHECK(somebody.valid());
CHECK(somebody.is_seed()); // same as above
somebody = "indeed"; // this will commit to the tree, mutating at the proper place
CHECK(somebody.valid());
CHECK(!somebody.is_seed()); // now the tree has this node, and it is no longer a seed
CHECK(root.has_child("I am somebody"));
CHECK(root["I am somebody"].val() == "indeed");
//
// All of the points above apply only if the tree is mutable. If
// the tree is const, then a NodeRef cannot be obtained from it;
// only a ConstNodeRef, which can never be used to mutate the
// tree.
CHECK(!root.has_child("I am not nothing"));
ryml::NodeRef nothing = wroot["I am nothing"];
CHECK(nothing.valid()); // points at the tree, and a specific place in the tree
CHECK(nothing.is_seed()); // ... but nothing is there yet.
CHECK(!root.has_child("I am nothing")); // same as above
ryml::NodeRef something = wroot["I am something"];
ryml::ConstNodeRef constsomething = wroot["I am something"];
CHECK(!root.has_child("I am something")); // same as above
CHECK(something.valid());
CHECK(something.is_seed()); // same as above
CHECK(!constsomething.valid()); // NOTE: because a ConstNodeRef
// cannot be used to mutate a
// tree, it is only valid() if it
// is pointing at an existing
// node.
something = "indeed"; // this will commit to the tree, mutating at the proper place
CHECK(root.has_child("I am something"));
CHECK(root["I am something"].val() == "indeed");
CHECK(something.valid());
CHECK(!something.is_seed()); // now the tree has this node, so the
// ref is no longer a seed
// now the constref is also valid (but it needs to be reassigned):
ryml::ConstNodeRef constsomethingnew = wroot["I am something"];
CHECK(constsomethingnew.valid());
// note that the old constref is now stale, because it only keeps
// the state at creation:
CHECK(!constsomething.valid());
//------------------------------------------------------------------
@@ -579,7 +611,7 @@ newseq: []
newseq (serialized): []
newmap: {}
newmap (serialized): []
I am somebody: indeed
I am something: indeed
)";
CHECK(buf_result == expected_result);
CHECK(str_result == expected_result);
@@ -587,6 +619,38 @@ CHECK(stream_result == expected_result);
// There are many possibilities to emit to buffer;
// please look at the emit sample functions below.
//------------------------------------------------------------------
// ConstNodeRef vs NodeRef
ryml::NodeRef noderef = tree["bar"][0];
ryml::ConstNodeRef constnoderef = tree["bar"][0];
// ConstNodeRef cannot be used to mutate the tree, but a NodeRef can:
//constnoderef = "21"; // compile error
//constnoderef << "22"; // compile error
noderef = "21"; // ok, can assign because it's not const
CHECK(tree["bar"][0].val() == "21");
noderef << "22"; // ok, can serialize and assign because it's not const
CHECK(tree["bar"][0].val() == "22");
// it is not possible to obtain a NodeRef from a ConstNodeRef:
// noderef = constnoderef; // compile error
// it is always possible to obtain a ConstNodeRef from a NodeRef:
constnoderef = noderef; // ok can assign const <- nonconst
// If a tree is const, then only ConstNodeRef's can be
// obtained from that tree:
ryml::Tree const& consttree = tree;
//noderef = consttree["bar"][0]; // compile error
noderef = tree["bar"][0]; // ok
constnoderef = consttree["bar"][0]; // ok
// ConstNodeRef and NodeRef can be compared for equality.
// Equality means they point at the same node.
CHECK(constnoderef == noderef);
CHECK(!(constnoderef != noderef));
//------------------------------------------------------------------
// Dealing with UTF8
ryml::Tree langs = ryml::parse_in_arena(R"(
@@ -595,6 +659,8 @@ fr: Planète (Gazeuse)
ru: Планета (Газ)
ja: 惑星(ガス)
zh: 行星(气体)
# UTF8 decoding only happens in double-quoted strings,\
# as per the YAML standard
decode this: "\u263A \xE2\x98\xBA"
and this as well: "\u2705 \U0001D11E"
)");
@@ -604,8 +670,9 @@ CHECK(langs["fr"].val() == "Planète (Gazeuse)");
CHECK(langs["ru"].val() == "Планета (Газ)");
CHECK(langs["ja"].val() == "惑星(ガス)");
CHECK(langs["zh"].val() == "行星(气体)");
// and \x \u \U codepoints are decoded (but only when
// they appear inside double-quoted strings):
// and \x \u \U codepoints are decoded (but only when they appear
// inside double-quoted strings, as dictated by the YAML
// standard):
CHECK(langs["decode this"].val() == "☺ ☺");
CHECK(langs["and this as well"].val() == "✅ 𝄞");
@@ -617,7 +684,6 @@ ryml::Location loc = parser.location(tree2["bar"][1]);
CHECK(parser.location_contents(loc).begins_with("30"));
CHECK(loc.line == 3u);
CHECK(loc.col == 4u);
// For further details in location tracking, refer to the sample function.
```
The [quickstart.cpp sample](./samples/quickstart.cpp) (from which the
@@ -647,13 +713,14 @@ sample_emit_to_container(); ///< emit to memory, eg a string or vector-like c
sample_emit_to_stream(); ///< emit to a stream, eg std::ostream
sample_emit_to_file(); ///< emit to a FILE*
sample_emit_nested_node(); ///< pick a nested node as the root when emitting
sample_json(); ///< JSON parsing and emitting: notes and constraints
sample_json(); ///< JSON parsing and emitting
sample_anchors_and_aliases(); ///< deal with YAML anchors and aliases
sample_tags(); ///< deal with YAML type tags
sample_docs(); ///< deal with YAML docs
sample_error_handler(); ///< set a custom error handler
sample_global_allocator(); ///< set a global allocator for ryml
sample_per_tree_allocator(); ///< set per-tree allocators
sample_static_trees(); ///< how to use static trees in ryml
sample_location_tracking(); ///< track node locations in the parsed source tree
```

View File

@@ -1,3 +1,33 @@
### Modifications
- **BREAKING CHANGE**: make the node API const-correct ([PR#267](https://github.com/biojppm/rapidyaml/pull/267)): added `ConstNodeRef` to hold a constant reference to a node. As the name implies, a `ConstNodeRef` object cannot be used in any tree-mutating operation. It is also smaller than the existing `NodeRef` (and faster because it does not need to check its own validity on every access). As a result of this change, there are now some constraints when obtaining a ref from a tree, and existing code is likely to break in this type of situation:
```c++
const Tree const_tree = ...;
NodeRef nr = const_tree.rootref(); // ERROR (was ok): cannot obtain a mutating NodeRef from a const Tree
ConstNodeRef cnr = const_tree.rootref(); // ok
Tree tree = ...;
NodeRef nr = tree.rootref(); // ok
ConstNodeRef cnr = tree.rootref(); // ok (implicit conversion from NodeRef to ConstNodeRef)
// to obtain a ConstNodeRef from a mutable Tree
// while avoiding implicit conversion, use the `c`
// prefix:
ConstNodeRef cnr = tree.crootref();
// likewise for tree.ref() and tree.cref().
nr = cnr; // ERROR: cannot obtain NodeRef from ConstNodeRef
cnr = nr; // ok
```
Naturally, use of `ConstNodeRef` needs to be propagated through client code. One such place is when deserializing types:
```c++
// needs to be changed from:
template<class T> bool read(ryml::NodeRef const& n, T *var);
// ... to:
template<class T> bool read(ryml::ConstNodeRef const& n, T *var);
```
### Fixes
- Fix [#233](https://github.com/biojppm/rapidyaml/issues/233) - accept leading colon in the first key of a flow map (`UNK` node) [PR#234](https://github.com/biojppm/rapidyaml/pull/234):

View File

@@ -358,8 +358,7 @@ void sample_quick_overview()
// As implied by its name, ConstNodeRef is a reference to a const
// node. It can be used to read from the node, but not write to it
// or modify the hierarchy of the node. If any modification is
// desired then a NodeRef must be used instead, which allows
// mutation of the target node:
// desired then a NodeRef must be used instead:
ryml::NodeRef wroot = tree.rootref();
// operator= assigns an existing string to the receiving node.
@@ -443,6 +442,7 @@ void sample_quick_overview()
CHECK(root.num_children() == 9);
CHECK(root["newmap"].num_children() == 0);
CHECK(root["newmap (serialized)"].num_children() == 0);
//
// When the tree is mutable, operator[] does not mutate the tree
// until the returned node is written to.
//