Update README

Joao Paulo Magalhaes
2025-04-02 23:26:30 +01:00
parent 3b3ad1e093
commit 8c82a2eb1f
2 changed files with 37 additions and 171 deletions

116
README.md

@@ -4,7 +4,6 @@
[![Documentation Status](https://readthedocs.org/projects/rapidyaml/badge/?version=latest)](https://rapidyaml.readthedocs.io/latest/?badge=latest)
[![PyPI](https://img.shields.io/pypi/v/rapidyaml?color=g)](https://pypi.org/project/rapidyaml/)
[![Gitter](https://badges.gitter.im/rapidyaml/community.svg)](https://gitter.im/rapidyaml/community)
[![Coveralls](https://coveralls.io/repos/github/biojppm/rapidyaml/badge.svg?branch=master)](https://coveralls.io/github/biojppm/rapidyaml)
[![Codecov](https://codecov.io/gh/biojppm/rapidyaml/branch/master/graph/badge.svg?branch=master)](https://codecov.io/gh/biojppm/rapidyaml)
@@ -83,6 +82,7 @@ and [the roadmap](https://github.com/biojppm/rapidyaml/tree/master/ROADMAP.md).
------
## Table of contents
* [License](#license)
* [Is it rapid?](#is-it-rapid)
* [Comparison with yaml-cpp](#comparison-with-yaml-cpp)
* [Performance reading JSON](#performance-reading-json)
@@ -102,7 +102,11 @@ and [the roadmap](https://github.com/biojppm/rapidyaml/tree/master/ROADMAP.md).
* [Test suite status](#test-suite-status)
* [Known limitations](#known-limitations)
* [Alternative libraries](#alternative-libraries)
* [License](#license)
------
## License
ryml is permissively licensed under the [MIT license](LICENSE.txt).
------
@@ -617,97 +621,22 @@ A JavaScript+WebAssembly port is available, compiled through [emscripten](https:
### Python
(Note that this is a work in progress. Additions will be made and things will
be changed.) With that said, here's an example of the Python API (from a [test file](api/python/tests/)):
(Note that this is a work in progress. Additions will be made and
things will be changed.) The Python port exposes only the
index-based low-level API, which works with node indices and string
views. This API is fast, but you may find it hard to use: it does not
build a Python structure of dicts/seqs/scalars, and all the scalars
are untyped strings. Still, once you have the tree you can walk
over it to create the native Python structure. Have a look at this [test
file](api/python/tests/test_readme.py) to see how the Python API
works, and to judge whether it may be useful for your case.
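The walk from the index-based tree to native Python containers can be sketched as below. This is only an illustration under stated assumptions: `to_native` is a hypothetical helper, not part of ryml, and it relies solely on the tree methods that appear in the README example (`root_id`, `is_map`, `is_seq`, `key`, `val`) plus a `children(tree, node)` iterator such as `ryml.children`. Scalars stay as strings, since the tree does not type them.

```python
def to_native(tree, children, node=None):
    """Recursively convert a parsed tree into dicts, lists and str scalars.

    Hypothetical helper, not part of ryml. `tree` is assumed to offer
    root_id(), is_map(), is_seq(), key() and val(); `children(tree, node)`
    is assumed to yield the ids of the children of `node`.
    """
    if node is None:
        node = tree.root_id()
    if tree.is_map(node):
        # keys come back as byte views, so decode them explicitly
        return {
            str(tree.key(ch), "utf8"): to_native(tree, children, ch)
            for ch in children(tree, node)
        }
    if tree.is_seq(node):
        return [to_native(tree, children, ch) for ch in children(tree, node)]
    # leaf scalar: untyped, so it is returned as a plain string
    return str(tree.val(node), "utf8")
```

With ryml installed, this would presumably be called as `to_native(ryml.parse_in_arena(src), ryml.children)`. Passing the iterator in as an argument keeps the helper independent of the binding, at the cost of a slightly longer call.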
```python
import ryml

# ryml cannot accept strings because it does not take ownership of the
# source buffer; only bytes or bytearrays are accepted.
src = b"{HELLO: a, foo: b, bar: c, baz: d, seq: [0, 1, 2, 3]}"


# verify that the given tree is as expected from the source above
def check(tree: ryml.Tree):
    # For now, only the index-based low-level API is implemented.
    # Here's the node structure for a tree parsed from the source
    # above:
    #
    # [node 0] root, map
    #   ` [node 1] "HELLO": "a"
    #   ` [node 2] "foo": "b"
    #   ` [node 3] "bar": "c"
    #   ` [node 4] "baz": "d"
    #   ` [node 5] "seq":
    #       ` [node 6] "0"
    #       ` [node 7] "1"
    #       ` [node 8] "2"
    #       ` [node 9] "3"
    #
    # let's now do some assertions:
    assert tree.size() == 10
    assert tree.root_id() == 0
    assert tree.is_root(0)
    assert tree.is_map(0)
    assert tree.is_seq(5)
    # use bytes objects for queries
    assert tree.find_child(0, b"HELLO") == 1
    assert tree.find_child(0, b"foo") == 2
    assert tree.find_child(0, b"seq") == 5
    assert tree.key(1) == b"HELLO"
    assert tree.val(1) == b"a"
    assert tree.key(2) == b"foo"
    assert tree.val(2) == b"b"
    assert tree.find_child(0, b"seq") == 5
    # hierarchy:
    assert tree.first_child(0) == 1
    assert tree.last_child(0) == 5
    assert tree.next_sibling(1) == 2
    assert tree.first_sibling(5) == 1
    assert tree.last_sibling(1) == 5
    assert tree.first_child(5) == 6
    assert tree.last_child(5) == 9
    # to loop over children:
    for i, ch in enumerate(ryml.children(tree, 5)):
        assert tree.val(ch) == [b"0", b"1", b"2", b"3"][i]
    # to loop over siblings:
    for i, sib in enumerate(ryml.siblings(tree, 5)):
        assert tree.key(sib) == [b"HELLO", b"foo", b"bar", b"baz", b"seq"][i]
    # to walk over all elements
    visited = [False] * tree.size()
    for node_id, indentation_level in ryml.walk(tree):
        visited[node_id] = True
    assert False not in visited
    # NOTE about encoding!
    k = tree.key(5)
    print(k)  # '<memory at 0x7f80d5b93f48>'
    assert k == b"seq"               # ok, as expected
    assert k != "seq"                # not ok - NOTE THIS!
    assert str(k) != "seq"           # not ok
    assert str(k, "utf8") == "seq"   # ok again


def test_immutable_buffer():
    # copy the source buffer to the tree arena and parse the copy
    tree = ryml.parse_in_arena(src)
    check(tree)  # OK


def test_mutable_buffer():
    # parse a mutable buffer in place
    # requires bytearrays or objects offering writeable memory
    mutable_buffer = bytearray(src)
    # WATCHOUT: the tree is pointing into mutable_buffer!
    tree = ryml.parse_in_place(mutable_buffer)
    check(tree)  # OK
```
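The encoding note in the example can be reproduced with the standard library alone. The sketch below assumes that `tree.key()` returns a `memoryview`, as the printed `<memory at 0x...>` suggests, and uses a plain `memoryview` as a stand-in; it does not call ryml. The last two lines show the read-only distinction between `bytes` and `bytearray`, which is presumably why in-place parsing asks for writeable memory.

```python
# Stand-in for the value returned by tree.key(5) in the example;
# assumed to be a memoryview over the source buffer.
k = memoryview(b"seq")

assert k == b"seq"                          # a memoryview compares equal to matching bytes
assert k != "seq"                           # but it never compares equal to a str
assert str(k) != "seq"                      # str() without an encoding gives the object repr
assert str(k, "utf8") == "seq"              # decoding with an explicit encoding works
assert k.tobytes().decode("utf8") == "seq"  # equivalent, spelled out in two steps

# bytes exposes read-only memory, bytearray exposes writeable memory
assert memoryview(b"abc").readonly
assert not memoryview(bytearray(b"abc")).readonly
```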
As expected, the performance results so far are encouraging. In
a [timeit benchmark](api/python/parse_bm.py) compared
against [PyYaml](https://pyyaml.org/)
and [ruamel.yaml](https://yaml.readthedocs.io/en/latest/), ryml parses
As for performance, in a [timeit benchmark](api/python/bm/parse_bm.py)
compared against [PyYaml](https://pyyaml.org/) and
[ruamel.yaml](https://yaml.readthedocs.io/en/latest/), ryml parses
generally 100x faster, and up to 400x faster:
```
+----------------------------------------+-------+----------+----------+-----------+
| style_seqs_blck_outer1000_inner100.yml | count | time(ms) | avg(ms) | avg(MB/s) |
@@ -903,9 +832,4 @@ compromise, bridges the gap from efficiency to usability. This library
takes inspiration from
[RapidJSON](https://github.com/Tencent/rapidjson) and
[RapidXML](http://rapidxml.sourceforge.net/).
------
## License
ryml is permissively licensed under the [MIT license](LICENSE.txt).


@@ -13,22 +13,14 @@ JavaScript
----------
A JavaScript+WebAssembly port is available, compiled through
`emscripten <https://emscripten.org/>`__. Here's a quick example of
how to compile ryml with emscripten using ``emcmake``:
`emscripten <https://emscripten.org/>`__.
Here's a quick example of how to configure, compile and run the tests
using ``emcmake``:
.. code:: bash

   git clone --recursive https://github.com/biojppm/rapidyaml
   cd rapidyaml
   emcmake cmake -S . -B build \
      -DCMAKE_CXX_FLAGS="-s DISABLE_EXCEPTION_CATCHING=0"
Here's a quick example of how to configure, compile and run the tests
using ``emscripten``:
.. code:: bash
cd rapidyaml
emcmake cmake -S . -B build/emscripten \
-D RYML_DEV=ON \
@@ -45,70 +37,20 @@ using `emscripten`:
Python
------
(Note that this is a work in progress. Additions will be made and things
will be changed.) With that said, here's an example of the Python API:
(Note that this is a work in progress. Additions will be made and
things will be changed.) The Python port exposes only the
index-based low-level API, which works with node indices and string
views. This API is very fast, but you may find it hard to use: it does not
build a Python structure of dicts/seqs/scalars, and all the scalars
are untyped string views into the source buffer. Still, once you
have the tree, you can walk over it to create the native Python
structure. Have a look at
this `test file
<https://github.com/biojppm/rapidyaml/tree/v0.8.0/api/python/tests/test_readme.py>`__
to see how the Python API works, and to judge whether it may be useful for your
case.
.. code:: python

   import ryml

   # ryml cannot accept strings because it does not take ownership of the
   # source buffer; only bytes or bytearrays are accepted.
   src = b"{HELLO: a, foo: b, bar: c, baz: d, seq: [0, 1, 2, 3]}"

   def check(tree):
       # for now, only the index-based low-level API is implemented
       assert tree.size() == 10
       assert tree.root_id() == 0
       assert tree.first_child(0) == 1
       assert tree.next_sibling(1) == 2
       assert tree.first_sibling(5) == 1
       assert tree.last_sibling(1) == 5
       # use bytes objects for queries
       assert tree.find_child(0, b"foo") == 2
       assert tree.key(2) == b"foo"
       assert tree.val(2) == b"b"
       assert tree.find_child(0, b"seq") == 5
       assert tree.is_seq(5)
       # to loop over children:
       for i, ch in enumerate(ryml.children(tree, 5)):
           assert tree.val(ch) == [b"0", b"1", b"2", b"3"][i]
       # to loop over siblings:
       for i, sib in enumerate(ryml.siblings(tree, 5)):
           assert tree.key(sib) == [b"HELLO", b"foo", b"bar", b"baz", b"seq"][i]
       # to walk over all elements
       visited = [False] * tree.size()
       for n, indentation_level in ryml.walk(tree):
           # just a dumb emitter; keys and values are decoded for printing
           left = "  " * indentation_level
           if tree.is_keyval(n):
               print("{}{}: {}".format(left, str(tree.key(n), "utf8"), str(tree.val(n), "utf8")))
           elif tree.is_val(n):
               print("{}- {}".format(left, str(tree.val(n), "utf8")))
           elif tree.is_keyseq(n):
               print("{}{}:".format(left, str(tree.key(n), "utf8")))
           visited[n] = True
       assert False not in visited
       # NOTE about encoding!
       k = tree.key(5)
       print(k)  # '<memory at 0x7f80d5b93f48>'
       assert k == b"seq"               # ok, as expected
       assert k != "seq"                # not ok - NOTE THIS!
       assert str(k) != "seq"           # not ok
       assert str(k, "utf8") == "seq"   # ok again

   # parse immutable buffer
   tree = ryml.parse_in_arena(src)
   check(tree)  # OK

   # parse mutable buffer.
   # requires bytearrays or objects offering writeable memory
   mutable = bytearray(src)
   tree = ryml.parse_in_place(mutable)
   check(tree)  # OK
As expected, the performance results so far are encouraging. In a
`timeit benchmark <api/python/parse_bm.py>`__ compared against
As for performance, in a `timeit benchmark <https://github.com/biojppm/rapidyaml/tree/v0.8.0/api/python/bm/parse_bm.py>`__ comparing against
`PyYaml <https://pyyaml.org/>`__ and
`ruamel.yaml <https://yaml.readthedocs.io/en/latest/>`__, ryml parses
generally 100x faster, and up to 400x faster: