Files
rapidyaml/doc/sphinx_other_languages.rst
2024-04-14 18:36:05 +01:00

119 lines
5.0 KiB
ReStructuredText
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
Other languages
===============
One of the aims of ryml is to provide an efficient YAML API for other
languages. JavaScript is fully available, and there is already a cursory
implementation for Python using only the low-level API. After ironing
out the general approach, other languages are likely to follow (all of
this is possible because were using `SWIG <http://www.swig.org/>`__,
which makes it easy to do so).
JavaScript
----------
A JavaScript+WebAssembly port is available, compiled through
`emscripten <https://emscripten.org/>`__.
Python
------
(Note that this is a work in progress. Additions will be made and things
will be changed.) With that said, heres an example of the Python API:
.. code:: python
import ryml
# ryml cannot accept strings because it does not take ownership of the
# source buffer; only bytes or bytearrays are accepted.
src = b"{HELLO: a, foo: b, bar: c, baz: d, seq: [0, 1, 2, 3]}"
def check(tree):
# for now, only the index-based low-level API is implemented
assert tree.size() == 10
assert tree.root_id() == 0
assert tree.first_child(0) == 1
assert tree.next_sibling(1) == 2
assert tree.first_sibling(5) == 2
assert tree.last_sibling(1) == 5
# use bytes objects for queries
assert tree.find_child(0, b"foo") == 1
assert tree.key(1) == b"foo")
assert tree.val(1) == b"b")
assert tree.find_child(0, b"seq") == 5
assert tree.is_seq(5)
# to loop over children:
for i, ch in enumerate(ryml.children(tree, 5)):
assert tree.val(ch) == [b"0", b"1", b"2", b"3"][i]
# to loop over siblings:
for i, sib in enumerate(ryml.siblings(tree, 5)):
assert tree.key(sib) == [b"HELLO", b"foo", b"bar", b"baz", b"seq"][i]
# to walk over all elements
visited = [False] * tree.size()
for n, indentation_level in ryml.walk(tree):
# just a dumb emitter
left = " " * indentation_level
if tree.is_keyval(n):
print("{}{}: {}".format(left, tree.key(n), tree.val(n))
elif tree.is_val(n):
print("- {}".format(left, tree.val(n))
elif tree.is_keyseq(n):
print("{}{}:".format(left, tree.key(n))
visited[inode] = True
assert False not in visited
# NOTE about encoding!
k = tree.get_key(5)
print(k) # '<memory at 0x7f80d5b93f48>'
assert k == b"seq" # ok, as expected
assert k != "seq" # not ok - NOTE THIS!
assert str(k) != "seq" # not ok
assert str(k, "utf8") == "seq" # ok again
# parse immutable buffer
tree = ryml.parse_in_arena(src)
check(tree) # OK
# parse mutable buffer.
# requires bytearrays or objects offering writeable memory
mutable = bytearray(src)
tree = ryml.parse_in_place(mutable)
check(tree) # OK
As expected, the performance results so far are encouraging. In a
`timeit benchmark <api/python/parse_bm.py>`__ compared against
`PyYaml <https://pyyaml.org/>`__ and
`ruamel.yaml <https://yaml.readthedocs.io/en/latest/>`__, ryml parses
quicker by generally 100x and up to 400x:
::
+----------------------------------------+-------+----------+----------+-----------+
| style_seqs_blck_outer1000_inner100.yml | count | time(ms) | avg(ms) | avg(MB/s) |
+----------------------------------------+-------+----------+----------+-----------+
| parse:RuamelYamlParse | 1 | 4564.812 | 4564.812 | 0.173 |
| parse:PyYamlParse | 1 | 2815.426 | 2815.426 | 0.280 |
| parse:RymlParseInArena | 38 | 588.024 | 15.474 | 50.988 |
| parse:RymlParseInArenaReuse | 38 | 466.997 | 12.289 | 64.202 |
| parse:RymlParseInPlace | 38 | 579.770 | 15.257 | 51.714 |
| parse:RymlParseInPlaceReuse | 38 | 462.932 | 12.182 | 64.765 |
+----------------------------------------+-------+----------+----------+-----------+
(Note that the parse timings above are somewhat biased towards ryml,
because it does not perform any type conversions in Python-land: return
types are merely ``memoryviews`` to the source buffer, possibly copied
to the trees arena).
As for emitting, the improvement can be as high as 3000x:
::
+----------------------------------------+-------+-----------+-----------+-----------+
| style_maps_blck_outer1000_inner100.yml | count | time(ms) | avg(ms) | avg(MB/s) |
+----------------------------------------+-------+-----------+-----------+-----------+
| emit_yaml:RuamelYamlEmit | 1 | 18149.288 | 18149.288 | 0.054 |
| emit_yaml:PyYamlEmit | 1 | 2683.380 | 2683.380 | 0.365 |
| emit_yaml:RymlEmitToNewBuffer | 88 | 861.726 | 9.792 | 99.976 |
| emit_yaml:RymlEmitReuse | 88 | 437.931 | 4.976 | 196.725 |
+----------------------------------------+-------+-----------+-----------+-----------+