refactor: event-based parser policy

rewrite the parser to be event-based, and rewrite the scalar filtering,
moving the filter code into a separate class
wip
wip
wip
filter single quoted is working
refactor to filter processor wip
double quoted wip
double quoted wip
double quoted wip
double quoted wip
double quoted wip
double quoted wip
double quoted wip
double quoted seems to be working
double quoted wip
double quoted wip
double quoted wip
double quoted wip
double quoted wip
double quoted wip
double quoted working!
filter plain scalar wip
wip
filter plain scalar wip
wip
test filter processors
fix write in inplace::translate_esc
block literal wip
block literal wip
block literal wip
block literal wip
block literal wip
block literal wip
block literal working!
filter block folded wip
filter block folded wip
cleanup filter
filter locations are needed only for double quoted scalars
add FilterResult to encapsulate validity
prepare filter for using in parser
in-parser filtering wip
filter empty block literals
filter block folded ok
all filters working
moving filters to parse wip
fix block_folded
fixing block folded WIP
new filter: all tests passing!
fix sanitizer issues
refactor: harmonize parser filtering function names
wip ci fixes
coverage wip
filter arena no longer needed
double quoted filter wip
fix wip
fix wip
fix wip
wip: inplace mid-extending vs end-extending
all tests ok
wip
wip
wip2
wip
wip
wip doc
wip doc
wip anchor
fix newlines in emit of docs
wip ref
wip new parser
wip new parser
wip new parser
fix
wip new parser
wip new parser
wip new parser
wip new parser
wip new parser: tag directives
wip new parser: tag resolving
wip new parser: more sink edge cases
wip new parser: key containers working in the sink
prepare event sink stack
tree parse wip
cleanup event sink
tree parse wip
tree parse wip
tree parse wip
tree parse wip: now parsing simple flow seqs!
new parser wip: flow seqs: added anchor/ref parsing
new parser wip: seq flow goes on while there is a seq flow
new parser wip: seqimap events
new parser wip: seqimap parsing
new parser wip: now parsing flow maps!
wip
wip
new parser wip: block seqs wip
new parser wip: block maps wip
wip
wip
wip
map anchors ok
tags wip
anchors and tags now working
add tests for container keys
structure wip
key containers: working in events from yaml!
wip
wip
docs wip
qmrk wip
qmrk seq blck
qmrk wip
fix seqimap again
qmrk with tags
doc wip
doc wip
doc wip
doc wip
doc wip
doc wip
remove old parsing functions
fix
wip buffered events for container keys
ditto
ditto
ditto
ditto
container keys seem to be working
report error for container keys
flow key containers inside qmrk
remove unused functions
remove more unused functions
comments
wip
comments wip
wip
wip
wip
wip
most tests working
fix more tests
wip: refactor parser to not depend on tree
ditto
remove include dependencies
parser: do not use tree directly
fixes
fix annotations when starting child maps
more fixes
more fixes
more fixes
more fixes
block scalars
block scalars
fixes to scalars
wip
wip
wip
wip
add error location checks
wip
wip
sudden docs
sudden docs wip
sudden docs in block map/seq
first test cases for simple seq are working!
fixing test cases WIP
mark doc only on explicit docs or stream children
more progress
wip
wip
fixing indentless seqs wip
simple seqs are working!
nested_seqx2 working!
disable all un-refactored tests
fix empty_seq
fix empty map/file
empty scalar wip
fix empty scalars
fix test number
fix null vals and empty scalars
fix nested seq
map wip
map wip
fix maps!
fix nested maps!
fix map of seq
fix seq of map
fix sets
explicit key WIP
explicit key WIP
explicit key WIP
explicit key WIP
explicit keys working!
fix regressions
fix generic map seq tests
docs WIP
docs + indentation wip
remove unused functions
fix regressions
rename test_new_parser to test_parser_engine
docs working!
fix json
fix scalar names
anchors wip
anchors wip
anchors wip
anchors mostly working
anchors WIP
anchors/refs working!
move test lib files to a separate folder
tags wip
simple seq
simple seq
tag wip
tags working!
rename TestCase->TestCaseNode, into separate files
remove empty var
fix indentation
fix github_issues
fix github issues
single quoted wip
single quoted wip
single quoted is working!
double quoted wip
double quoted wip
fix plain scalar emit
literal scalar wip
literal scalar wip
literal scalar wip
literal scalar wip
literal scalar wip
move tags to separate source files
minor cleanup
block literal wip
block literal wip
add json parser
update benchmarks
improve json
fix compilation in clang
fix bm_emit
block literal wip
block literal wip
block literal wip
reference resolver
block literal wip
block literal working!
fix regressions
block folded wip
block folded wip
block folded wip
block folded wip
block folded wip
block folded wip
block folded wip
block folded wip: indented blocks
block folded wip
block folded wip
block folded wip
block folded working!
plain scalar wip
plain scalar wip
plain scalar working!
style wip
style wip
style wip
style wip
style WIP
scalar style wip
scalar style ok
fix regression of scalar plain
fix regression of double quoted wip
block literal wip (old)
double quoted wip
fix regression in double quoted
fix merge
add tests for merge
fix merge wip
fix vs compilation wip
parse overloads wip
parse overloads wip
parse overloads
fix merge for styles
fixes to quickstart wip
enable serialize test
improve test merge
fix test serialize
test tree wip
fix locations
test tree wip
test parser wip
fix test for yaml events (from tree)
refactor yaml event tests to use parameterized tests
event tests: use the scalar style information from the tree
event tests: use the container style information from the tree
event tests: working both from parser and tree
improve tag errors
fix tags wip
fix tags
fix bm
fix bm
fix test parser
fix tree wip
fix quickstart wip
fix test tree wip
fix some valgrind warnings
fix quickstart wip
fix tree & quickstart wip
fix docmaps with keyref as the first child
fix parsing into existing nodes
fix quickstart!
more fixes (~regressions from quickstart)
fix tool tests
fix test suite wip
fix test suite wip @215/1633
fix test suite wip @152/1633 91%
disable tests with container keys: 96/1633  94%
test suite wip
test suite parse: update missing errors
fix parsing of scalars starting with ?
fix skipping of whitespace in flow mode 47/1633 97%
fix missing anchor 45/1633 97%
fix neutral tag resolve 43/1633 97%
fix parse of yaml events 39/1633 98%
fix tags normalization 50/1633 97%
fix tags normalization 38/1633 98%
fix scalar with trailing colon : 36/1633 98%
exempt more missing errors. 32/1633 98%
30/1633 98%
22/1633 99%
18/1633 99%
backspace in dquo. 16/1633 99%
8/1633 99%
7/1633 99%
6/1633 99%
3/1633 99%
100% pass!
adding events parser to test suite and events tool
sneaky block container keys WIP
cleanup yaml-events
fix warning
wip
fix block key containers
test suite: fix event emitting WIP
100% tests pass!
fix missing doc UKK6
test suite: add tests comparing reference events and emitted events WIP
test suite: fix comparison of emitted events
100% test pass
enable tests for key containers. 100% pass!
enable error tests for event emitter. 100% pass!
update test suite exclusions
[refac] split event handlers
[fix] compilation in windows
windows exports
fix wip
wip
wip
wip
tab tokens working!
fix NodeType::operator== ambiguity in C++20
clean up test names
cover json as much as possible in the tests
fix the difficult failure in vs-x86-release builds
ensure json is tested in the test groups
fix some problems with the declaration/definition of test groups
minor cleanup in json emit
parser cleanup wip
cleanup and improve coverage
cleanup and improve coverage
cleanup and improve coverage
cleanup and improve coverage
wip cleanup and coverage
wip cleanup and coverage
style is no longer tagged WIP
tidy style API
ensure tree assertions go through the tree's callbacks
style API
bm wip
bm wip
changelog
tidy type+style predicates
add id_type to take place as the new type for node ids
update benchmarks
WIP fix warnings when the id_type is signed 32 bit
wip
wip [ci skip]
woops
wip [ci skip]
add test to ensure #422
fix rebase problem
fix noderef tests which were optimized
github workflows: update checkout version
add some more plain scalar tests
add yamlscript like test
quickstart: call sample_tags/directives on the proper place
add test for 379
update docs post rebase
fix rebase problems and update docs
test parse engine: fix gcc4.8 not accepting C++11 raw strings as macro args
investigating gcc x86 release failures
fix gcc x86 release failures (?)
gcc x86 release failures: cleanup print
update c4core
update swig interface
fix benchmark workflow
improve coverage
improve error logging functions
annotate unreachable to prevent error in visual studio
improve coverage
split event stack wip
split event stack wip
split event stack wip
tidy up some defines, and improve the dump function
emit: disable uncovered statements
Author: Joao Paulo Magalhaes
Date:   2024-05-05 18:47:07 +02:00
Parent: 38f2326c1c
Commit: 735ba65bba
32 changed files with 13808 additions and 8339 deletions


@@ -44,17 +44,30 @@ c4_add_library(ryml
c4/yml/common.cpp
c4/yml/emit.def.hpp
c4/yml/emit.hpp
c4/yml/event_handler_stack.hpp
c4/yml/event_handler_tree.hpp
c4/yml/filter_processor.hpp
c4/yml/fwd.hpp
c4/yml/export.hpp
c4/yml/node.hpp
c4/yml/node.cpp
c4/yml/node_type.hpp
c4/yml/node_type.cpp
c4/yml/parser_state.hpp
c4/yml/parse.hpp
c4/yml/parse.cpp
c4/yml/parse_engine.hpp
c4/yml/parse_engine.def.hpp
c4/yml/preprocess.hpp
c4/yml/preprocess.cpp
c4/yml/reference_resolver.hpp
c4/yml/reference_resolver.cpp
c4/yml/std/map.hpp
c4/yml/std/std.hpp
c4/yml/std/string.hpp
c4/yml/std/vector.hpp
c4/yml/tag.hpp
c4/yml/tag.cpp
c4/yml/tree.hpp
c4/yml/tree.cpp
c4/yml/writer.hpp
@@ -88,6 +101,13 @@ if(RYML_USE_ASSERT)
target_compile_definitions(ryml PUBLIC RYML_USE_ASSERT=1)
endif()
if(CMAKE_COMPILER_IS_GNUCXX)
option(RYML_FANALYZER "Compile with -fanalyzer https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Static-Analyzer-Options.html" OFF)
if(RYML_FANALYZER)
target_compile_options(ryml PUBLIC -fanalyzer)
endif()
endif()
#-------------------------------------------------------
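
The headers added above (event_handler_stack.hpp, event_handler_tree.hpp, filter_processor.hpp, parse_engine.hpp, parser_state.hpp, reference_resolver.hpp) are the pieces of the new event-based pipeline. A minimal sketch of how they fit together, with the type names taken from those file names; the commented-out engine/handler wiring and its member names are assumptions for illustration, while the uncommented parse call is the existing public API:

    #include <ryml.hpp>

    int main()
    {
        // the public entry points keep working as before, but they now drive
        // an event-producing engine feeding a tree-building event handler:
        char src[] = "{foo: 1, bar: [2, 3]}";
        ryml::Tree tree = ryml::parse_in_place(ryml::substr(src));
        // spelled out with the new building blocks (member names are assumptions):
        //   ryml::EventHandlerTree handler(tree.callbacks());
        //   ryml::ParseEngine<ryml::EventHandlerTree> engine(&handler);
        //   engine.parse_in_place_ev("<inline>", ryml::substr(src));
        return tree.rootref().num_children() == 2 ? 0 : 1;
    }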


@@ -28,6 +28,8 @@ void report_error_impl(const char* msg, size_t length, Location loc, FILE *f)
{
if(!loc.name.empty())
{
// this is more portable than using fprintf("%.*s:") which
// is not available in some embedded platforms
fwrite(loc.name.str, 1, loc.name.len, f);
fputc(':', f);
}
@@ -36,13 +38,17 @@ void report_error_impl(const char* msg, size_t length, Location loc, FILE *f)
fprintf(f, "%zu:", loc.col);
if(loc.offset)
fprintf(f, " (%zuB):", loc.offset);
fputc(' ', f);
}
fprintf(f, "%.*s\n", (int)length, msg);
RYML_ASSERT(!csubstr(msg, length).ends_with('\0'));
fwrite(msg, 1, length, f);
fputc('\n', f);
fflush(f);
}
[[noreturn]] void error_impl(const char* msg, size_t length, Location loc, void * /*user_data*/)
{
RYML_ASSERT(!csubstr(msg, length).ends_with('\0'));
report_error_impl(msg, length, loc, nullptr);
#ifdef RYML_DEFAULT_CALLBACK_USES_EXCEPTIONS
throw std::runtime_error(std::string(msg, length));
@@ -98,9 +104,9 @@ Callbacks::Callbacks(void *user_data, pfn_allocate alloc_, pfn_free free_, pfn_e
m_error(error_)
#endif
{
C4_CHECK(m_allocate);
C4_CHECK(m_free);
C4_CHECK(m_error);
RYML_CHECK(m_allocate);
RYML_CHECK(m_free);
RYML_CHECK(m_error);
}
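
For context, a short sketch of how user code installs its own error handler through this Callbacks struct; I believe passing nullptr for the allocate/free hooks keeps the defaults, but treat that detail as an assumption rather than something this diff guarantees:

    #include <ryml.hpp>
    #include <stdexcept>
    #include <string>

    // an error callback must not return; this one throws instead of aborting
    [[noreturn]] void throwing_error(const char* msg, size_t len,
                                     ryml::Location /*loc*/, void* /*user_data*/)
    {
        throw std::runtime_error(std::string(msg, len));
    }

    void install_callbacks()
    {
        ryml::set_callbacks(ryml::Callbacks(nullptr, nullptr, nullptr, &throwing_error));
    }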


@@ -5,8 +5,49 @@
#include <cstddef>
#include <c4/substr.hpp>
#include <c4/dump.hpp>
#include <c4/yml/export.hpp>
#ifdef C4_MSVC
#include <malloc.h>
#else
#include <alloca.h>
#endif
//-----------------------------------------------------------------------------
#ifndef RYML_ERRMSG_SIZE
/// size for the error message buffer
#define RYML_ERRMSG_SIZE (1024)
#endif
#ifndef RYML_LOGBUF_SIZE
/// size for the buffer used to format individual values to string
/// while preparing an error message. This is only used for formatting
/// individual values in the message; final messages will be larger
/// than this value (see @ref RYML_ERRMSG_SIZE). This is also used for
/// the detailed debug log messages when RYML_DBG is defined.
#define RYML_LOGBUF_SIZE (256)
#endif
#ifndef RYML_LOGBUF_SIZE_MAX
/// size for the fallback larger log buffer. When @ref
/// RYML_LOGBUF_SIZE is not large enough to convert a value to string,
/// then temporary stack memory is allocated up to
/// RYML_LOGBUF_SIZE_MAX. This limit is in place to prevent a stack
/// overflow. If the printed value requires more than
/// RYML_LOGBUF_SIZE_MAX, the value is silently skipped.
#define RYML_LOGBUF_SIZE_MAX (1024)
#endif
#ifndef RYML_LOCATIONS_SMALL_THRESHOLD
/// threshold at which a location search will revert from linear to
/// binary search.
#define RYML_LOCATIONS_SMALL_THRESHOLD (30)
#endif
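
These limits are plain preprocessor defaults, so they can be overridden at build time. A small sketch of the intended use (assumption: the override has to be consistent with the one used when compiling the library itself):

    // on the compiler command line:
    //   -DRYML_ERRMSG_SIZE=4096 -DRYML_LOGBUF_SIZE=512
    // or before the first ryml include in the project:
    #define RYML_ERRMSG_SIZE 4096
    #define RYML_LOGBUF_SIZE 512
    #include <ryml.hpp>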
//-----------------------------------------------------------------------------
// Specify groups to have a predefined topic order in doxygen:
@@ -83,6 +124,11 @@
*
*/
/** @defgroup doc_ref_utils Anchor/Reference utilities
*
* @see sample::sample_anchors_and_aliases
* */
/** @defgroup doc_tag_utils Tag utilities
* @see sample::sample_tags
*/
@@ -134,11 +180,13 @@
# define RYML_ASSERT(cond) RYML_CHECK(cond)
# define RYML_ASSERT_MSG(cond, msg) RYML_CHECK_MSG(cond, msg)
# define _RYML_CB_ASSERT(cb, cond) _RYML_CB_CHECK((cb), (cond))
# define _RYML_CB_ASSERT_(cb, cond, loc) _RYML_CB_CHECK((cb), (cond), (loc))
# define RYML_NOEXCEPT
#else
# define RYML_ASSERT(cond)
# define RYML_ASSERT_MSG(cond, msg)
# define _RYML_CB_ASSERT(cb, cond)
# define _RYML_CB_ASSERT_(cb, cond, loc)
# define RYML_NOEXCEPT noexcept
#endif
@@ -148,7 +196,7 @@
do { \
if(C4_UNLIKELY(!(cond))) \
{ \
RYML_DEBUG_BREAK() \
RYML_DEBUG_BREAK(); \
c4::yml::error("check failed: " #cond, c4::yml::Location(__FILE__, __LINE__, 0)); \
C4_UNREACHABLE_AFTER_ERR(); \
} \
@@ -159,7 +207,7 @@
{ \
if(C4_UNLIKELY(!(cond))) \
{ \
RYML_DEBUG_BREAK() \
RYML_DEBUG_BREAK(); \
c4::yml::error(msg ": check failed: " #cond, c4::yml::Location(__FILE__, __LINE__, 0)); \
C4_UNREACHABLE_AFTER_ERR(); \
} \
@@ -167,17 +215,16 @@
#if defined(RYML_DBG) && !defined(NDEBUG) && !defined(C4_NO_DEBUG_BREAK)
# define RYML_DEBUG_BREAK() \
{ \
do { \
if(c4::get_error_flags() & c4::ON_ERROR_DEBUGBREAK) \
{ \
C4_DEBUG_BREAK(); \
} \
}
} while(0)
#else
# define RYML_DEBUG_BREAK()
#endif
/** @endcond */
@@ -190,11 +237,33 @@ namespace yml {
C4_SUPPRESS_WARNING_GCC_CLANG_WITH_PUSH("-Wold-style-cast")
enum : size_t {
/** a null position */
npos = size_t(-1),
#ifndef RYML_ID_TYPE
/** The type of a node id in the YAML tree. In the future, the default
* will likely change to int32_t, which was observed to be faster.
* @see id_type */
#define RYML_ID_TYPE size_t
#endif
/** The type of a node id in the YAML tree; to override the default
* type, define the macro @ref RYML_ID_TYPE to a suitable integer
* type. */
using id_type = RYML_ID_TYPE;
static_assert(std::is_integral<id_type>::value, "id_type must be an integer type");
C4_SUPPRESS_WARNING_GCC_WITH_PUSH("-Wuseless-cast")
enum : id_type {
/** an index to none */
NONE = size_t(-1)
NONE = id_type(-1),
};
C4_SUPPRESS_WARNING_GCC_CLANG_POP
enum : size_t {
/** a null string position */
npos = size_t(-1)
};
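
For illustration, overriding the node id type; as with the buffer sizes above, this is a sketch, and the macro presumably has to match the definition used to build the library:

    #include <cstdint>
    #include <type_traits>
    #define RYML_ID_TYPE int32_t   // must be defined before including ryml
    #include <ryml.hpp>

    static_assert(std::is_integral<ryml::id_type>::value, "id_type is integral");
    static_assert(sizeof(ryml::id_type) == 4, "now a 32-bit id");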
@@ -225,10 +294,11 @@ struct RYML_EXPORT Location : public LineCol
{
csubstr name;
operator bool () const { return !name.empty() || line != 0 || offset != 0; }
operator bool () const { return !name.empty() || line != 0 || offset != 0 || col != 0; }
Location() : LineCol(), name() {}
Location( size_t l, size_t c) : LineCol{ l, c}, name( ) {}
Location( size_t b, size_t l, size_t c) : LineCol{b, l, c}, name( ) {}
Location( csubstr n, size_t l, size_t c) : LineCol{ l, c}, name(n) {}
Location( csubstr n, size_t b, size_t l, size_t c) : LineCol{b, l, c}, name(n) {}
Location(const char *n, size_t l, size_t c) : LineCol{ l, c}, name(to_csubstr(n)) {}
@@ -364,25 +434,25 @@ template<size_t N>
}
#define _RYML_CB_ERR(cb, msg_literal) \
_RYML_CB_ERR_(cb, msg_literal, c4::yml::Location(__FILE__, 0, __LINE__, 0))
#define _RYML_CB_CHECK(cb, cond) \
_RYML_CB_CHECK_(cb, cond, c4::yml::Location(__FILE__, 0, __LINE__, 0))
#define _RYML_CB_ERR_(cb, msg_literal, loc) \
do \
{ \
const char msg[] = msg_literal; \
RYML_DEBUG_BREAK() \
c4::yml::error((cb), \
msg, sizeof(msg), \
c4::yml::Location(__FILE__, 0, __LINE__, 0)); \
RYML_DEBUG_BREAK(); \
c4::yml::error((cb), msg, sizeof(msg)-1, loc); \
C4_UNREACHABLE_AFTER_ERR(); \
} while(0)
#define _RYML_CB_CHECK(cb, cond) \
#define _RYML_CB_CHECK_(cb, cond, loc) \
do \
{ \
if(!(cond)) \
if(C4_UNLIKELY(!(cond))) \
{ \
const char msg[] = "check failed: " #cond; \
RYML_DEBUG_BREAK() \
c4::yml::error((cb), \
msg, sizeof(msg), \
c4::yml::Location(__FILE__, 0, __LINE__, 0)); \
RYML_DEBUG_BREAK(); \
c4::yml::error((cb), msg, sizeof(msg)-1, loc); \
C4_UNREACHABLE_AFTER_ERR(); \
} \
} while(0)
@@ -395,7 +465,51 @@ do \
} while(0)
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
typedef enum {
BLOCK_LITERAL, //!< keep newlines (|)
BLOCK_FOLD //!< replace newline with single space (>)
} BlockStyle_e;
typedef enum {
CHOMP_CLIP, //!< single newline at end (default)
CHOMP_STRIP, //!< no newline at end (-)
CHOMP_KEEP //!< all newlines from end (+)
} BlockChomp_e;
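
A small worked example of how these two enums map onto YAML block scalar headers; the YAML semantics are standard, and the snippet uses only the existing public parse API:

    #include <ryml.hpp>
    #include <cassert>

    void block_scalar_styles()
    {
        char src[] = "keep: |+\n  a\n  b\n\nclip: |\n  a\n  b\nfold: >-\n  a\n  b\n";
        ryml::Tree t = ryml::parse_in_place(ryml::substr(src));
        assert(t["keep"].val() == "a\nb\n\n"); // BLOCK_LITERAL + CHOMP_KEEP
        assert(t["clip"].val() == "a\nb\n");   // BLOCK_LITERAL + CHOMP_CLIP
        assert(t["fold"].val() == "a b");      // BLOCK_FOLD    + CHOMP_STRIP
    }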
/** Abstracts the fact that a scalar filter result may not fit in the
* intended memory. */
struct FilterResult
{
C4_ALWAYS_INLINE bool valid() const noexcept { return str.str != nullptr; }
C4_ALWAYS_INLINE size_t required_len() const noexcept { return str.len; }
C4_ALWAYS_INLINE csubstr get() { RYML_ASSERT(valid()); return str; }
csubstr str;
};
/** Abstracts the fact that a scalar filter result may not fit in the
* intended memory. */
struct FilterResultExtending
{
C4_ALWAYS_INLINE bool valid() const noexcept { return str.str != nullptr; }
C4_ALWAYS_INLINE size_t required_len() const noexcept { return reqlen; }
C4_ALWAYS_INLINE csubstr get() { RYML_ASSERT(valid()); return str; }
csubstr str;
size_t reqlen;
};
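
A hedged sketch of the retry pattern these result types enable: try to filter into the available space, and if it does not fit, get a larger buffer and run the filter again. filter_scalar stands in for one of the parser's scalar filter functions and is not a name from this diff:

    // assume this lives in namespace c4::yml; filter_scalar is hypothetical
    template<class FilterFn>
    csubstr filter_with_retry(FilterFn filter_scalar, csubstr scalar, substr dst, Tree *tree)
    {
        FilterResult r = filter_scalar(scalar, dst);   // try to filter into dst
        if(!r.valid())                                 // dst was too small...
        {
            dst = tree->alloc_arena(r.required_len()); // ...grow (here: in the tree arena)
            r = filter_scalar(scalar, dst);            // ...and run the filter again
        }
        return r.get();
    }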
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
namespace detail {
// is there a better way to do this?
template<int8_t signedval, uint8_t unsignedval>
struct _charconstant_t
: public std::conditional<std::is_signed<char>::value,
@@ -411,10 +525,11 @@ struct _SubstrWriter
{
substr buf;
size_t pos;
_SubstrWriter(substr buf_, size_t pos_=0) : buf(buf_), pos(pos_) {}
_SubstrWriter(substr buf_, size_t pos_=0) : buf(buf_), pos(pos_) { C4_ASSERT(buf.str); }
void append(csubstr s)
{
C4_ASSERT(!s.overlaps(buf));
C4_ASSERT(s.str || !s.len);
if(s.len && pos + s.len <= buf.len)
{
C4_ASSERT(s.str);
@@ -424,12 +539,14 @@ struct _SubstrWriter
}
void append(char c)
{
C4_ASSERT(buf.str);
if(pos < buf.len)
buf.str[pos] = c;
++pos;
}
void append_n(char c, size_t numtimes)
{
C4_ASSERT(buf.str);
if(numtimes && pos + numtimes < buf.len)
memset(buf.str + pos, c, numtimes);
pos += numtimes;
@@ -445,9 +562,71 @@ struct _SubstrWriter
};
} // namespace detail
namespace detail {
// dumpfn is a function abstracting prints to terminal (or to string).
template<class DumpFn, class ...Args>
C4_NO_INLINE void _dump(DumpFn &&dumpfn, csubstr fmt, Args&& ...args)
{
DumpResults results;
// try writing everything:
{
// buffer for converting individual arguments. it is defined
// in a child scope to free it in case the buffer is too small
// for any of the arguments.
char writebuf[RYML_LOGBUF_SIZE];
results = format_dump_resume(std::forward<DumpFn>(dumpfn), writebuf, fmt, std::forward<Args>(args)...);
}
// if any of the arguments failed to fit the buffer, allocate a
// larger buffer (up to a limit) and resume writing.
//
// results.bufsize is set to the size of the largest element
// serialized. Eg int(1) will require 1 byte.
if(C4_UNLIKELY(results.bufsize > RYML_LOGBUF_SIZE))
{
const size_t bufsize = results.bufsize <= RYML_LOGBUF_SIZE_MAX ? results.bufsize : RYML_LOGBUF_SIZE_MAX;
#ifdef C4_MSVC
substr largerbuf = {static_cast<char*>(_alloca(bufsize)), bufsize};
#else
substr largerbuf = {static_cast<char*>(alloca(bufsize)), bufsize};
#endif
results = format_dump_resume(std::forward<DumpFn>(dumpfn), results, largerbuf, fmt, std::forward<Args>(args)...);
}
}
template<class ...Args>
C4_NORETURN C4_NO_INLINE void _report_err(Callbacks const& C4_RESTRICT callbacks, csubstr fmt, Args const& C4_RESTRICT ...args)
{
char errmsg[RYML_ERRMSG_SIZE] = {0};
detail::_SubstrWriter writer(errmsg);
auto dumpfn = [&writer](csubstr s){ writer.append(s); };
_dump(dumpfn, fmt, args...);
writer.append('\n');
const size_t len = writer.pos < RYML_ERRMSG_SIZE ? writer.pos : RYML_ERRMSG_SIZE;
callbacks.m_error(errmsg, len, {}, callbacks.m_user_data);
C4_UNREACHABLE_AFTER_ERR();
}
} // namespace detail
inline csubstr _c4prc(const char &C4_RESTRICT c) // pass by reference!
{
switch(c)
{
case '\n': return csubstr("\\n");
case '\t': return csubstr("\\t");
case '\0': return csubstr("\\0");
case '\r': return csubstr("\\r");
case '\f': return csubstr("\\f");
case '\b': return csubstr("\\b");
case '\v': return csubstr("\\v");
case '\a': return csubstr("\\a");
default: return csubstr(&c, 1);
}
}
/// @endcond
C4_SUPPRESS_WARNING_GCC_CLANG_POP
C4_SUPPRESS_WARNING_GCC_POP
} // namespace yml
} // namespace c4


@@ -17,7 +17,7 @@ namespace c4 {
namespace yml {
void check_invariants(Tree const& t, size_t node=NONE);
void check_invariants(Tree const& t, id_type node=NONE);
void check_free_list(Tree const& t);
void check_arena(Tree const& t);
@@ -26,7 +26,7 @@ void check_arena(Tree const& t);
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
inline void check_invariants(Tree const& t, size_t node)
inline void check_invariants(Tree const& t, id_type node)
{
if(node == NONE)
{
@@ -34,8 +34,8 @@ inline void check_invariants(Tree const& t, size_t node)
node = t.root_id();
}
auto const& n = *t._p(node);
#ifdef RYML_DBG
NodeData const& n = *t._p(node);
#if defined(RYML_DBG) && 0
if(n.m_first_child != NONE || n.m_last_child != NONE)
{
printf("check(%zu): fc=%zu lc=%zu\n", node, n.m_first_child, n.m_last_child);
@@ -100,10 +100,10 @@ inline void check_invariants(Tree const& t, size_t node)
C4_CHECK(t._p(n.m_next_sibling)->m_next_sibling != node);
}
size_t count = 0;
for(size_t i = n.m_first_child; i != NONE; i = t.next_sibling(i))
id_type count = 0;
for(id_type i = n.m_first_child; i != NONE; i = t.next_sibling(i))
{
#ifdef RYML_DBG
#if defined(RYML_DBG) && 0
printf("check(%zu): descend to child[%zu]=%zu\n", node, count, i);
#endif
auto const& ch = *t._p(i);
@@ -131,7 +131,7 @@ inline void check_invariants(Tree const& t, size_t node)
check_arena(t);
}
for(size_t i = t.first_child(node); i != NONE; i = t.next_sibling(i))
for(id_type i = t.first_child(node); i != NONE; i = t.next_sibling(i))
{
check_invariants(t, i);
}
@@ -159,8 +159,8 @@ inline void check_free_list(Tree const& t)
//C4_CHECK(head.m_prev_sibling == NONE);
//C4_CHECK(tail.m_next_sibling == NONE);
size_t count = 0;
for(size_t i = t.m_free_head, prev = NONE; i != NONE; i = t._p(i)->m_next_sibling)
id_type count = 0;
for(id_type i = t.m_free_head, prev = NONE; i != NONE; i = t._p(i)->m_next_sibling)
{
auto const& elm = *t._p(i);
if(&elm != &head)


@@ -4,7 +4,11 @@
#ifndef _C4_YML_COMMON_HPP_
#include "../common.hpp"
#endif
#ifdef RYML_DBG
#include <cstdio>
#endif
//-----------------------------------------------------------------------------
// some debugging scaffolds
@@ -23,109 +27,123 @@
#pragma clang diagnostic ignored "-Werror"
#pragma clang diagnostic ignored "-Wgnu-zero-variadic-macro-arguments"
// some debugging scaffolds
#ifdef RYML_DBG
#include <c4/dump.hpp>
namespace c4 {
inline void _dbg_dumper(csubstr s) { fwrite(s.str, 1, s.len, stdout); };
template<class ...Args>
void _dbg_printf(c4::csubstr fmt, Args&& ...args)
{
static char writebuf[256];
auto results = c4::format_dump_resume<&_dbg_dumper>(writebuf, fmt, std::forward<Args>(args)...);
// resume writing if the results failed to fit the buffer
if(C4_UNLIKELY(results.bufsize > sizeof(writebuf))) // bufsize will be that of the largest element serialized. Eg int(1), will require 1 byte.
{
results = format_dump_resume<&_dbg_dumper>(results, writebuf, fmt, std::forward<Args>(args)...);
if(C4_UNLIKELY(results.bufsize > sizeof(writebuf)))
{
results = format_dump_resume<&_dbg_dumper>(results, writebuf, fmt, std::forward<Args>(args)...);
}
}
}
} // namespace c4
# define _c4dbgt(fmt, ...) this->_dbg ("{}:{}: " fmt , __FILE__, __LINE__, ## __VA_ARGS__)
# define _c4dbgpf(fmt, ...) _dbg_printf("{}:{}: " fmt "\n", __FILE__, __LINE__, ## __VA_ARGS__)
# define _c4dbgp(msg) _dbg_printf("{}:{}: " msg "\n", __FILE__, __LINE__ )
# define _c4dbgq(msg) _dbg_printf(msg "\n")
#ifndef RYML_DBG
# define _c4err(fmt, ...) \
do { if(c4::is_debugger_attached()) { C4_DEBUG_BREAK(); } \
this->_err("ERROR:\n" "{}:{}: " fmt, __FILE__, __LINE__, ## __VA_ARGS__); } while(0)
#else
this->_err("ERROR: " fmt, ## __VA_ARGS__)
# define _c4dbgt(fmt, ...)
# define _c4dbgpf(fmt, ...)
# define _c4dbgpf_(fmt, ...)
# define _c4dbgp(msg)
# define _c4dbgp_(msg)
# define _c4dbgq(msg)
# define _c4presc(...)
# define _c4prscalar(msg, scalar, keep_newlines)
#else
# define _c4err(fmt, ...) \
do { if(c4::is_debugger_attached()) { C4_DEBUG_BREAK(); } \
this->_err("ERROR: " fmt, ## __VA_ARGS__); } while(0)
#endif
do { RYML_DEBUG_BREAK(); this->_err("ERROR:\n" "{}:{}: " fmt, __FILE__, __LINE__, ## __VA_ARGS__); } while(0)
# define _c4dbgt(fmt, ...) do { if(_dbg_enabled()) { \
this->_dbg ("{}:{}: " fmt , __FILE__, __LINE__, ## __VA_ARGS__); } } while(0)
# define _c4dbgpf(fmt, ...) _dbg_printf("{}:{}: " fmt "\n", __FILE__, __LINE__, ## __VA_ARGS__)
# define _c4dbgpf_(fmt, ...) _dbg_printf("{}:{}: " fmt , __FILE__, __LINE__, ## __VA_ARGS__)
# define _c4dbgp(msg) _dbg_printf("{}:{}: " msg "\n", __FILE__, __LINE__ )
# define _c4dbgp_(msg) _dbg_printf("{}:{}: " msg , __FILE__, __LINE__ )
# define _c4dbgq(msg) _dbg_printf(msg "\n")
# define _c4presc(...) do { if(_dbg_enabled()) __c4presc(__VA_ARGS__); } while(0)
# define _c4prscalar(msg, scalar, keep_newlines) \
do { \
_c4dbgpf_("{}: [{}]~~~", msg, scalar.len); \
if(_dbg_enabled()) { \
__c4presc((scalar).str, (scalar).len, (keep_newlines)); \
} \
_c4dbgq("~~~"); \
} while(0)
#endif // RYML_DBG
#define _c4prsp(sp) sp
#define _c4presc(s) __c4presc(s.str, s.len)
inline c4::csubstr _c4prc(const char &C4_RESTRICT c)
//-----------------------------------------------------------------------------
#ifdef RYML_DBG
#include <c4/dump.hpp>
namespace c4 {
inline bool& _dbg_enabled() { static bool enabled = true; return enabled; }
inline void _dbg_set_enabled(bool yes) { _dbg_enabled() = yes; }
inline void _dbg_dumper(csubstr s)
{
switch(c)
if(s.str)
fwrite(s.str, 1, s.len, stdout);
}
inline substr _dbg_buf() noexcept
{
static char writebuf[2048];
return writebuf;
}
template<class ...Args>
C4_NO_INLINE void _dbg_printf(c4::csubstr fmt, Args const& ...args)
{
if(_dbg_enabled())
{
case '\n': return c4::csubstr("\\n");
case '\t': return c4::csubstr("\\t");
case '\0': return c4::csubstr("\\0");
case '\r': return c4::csubstr("\\r");
case '\f': return c4::csubstr("\\f");
case '\b': return c4::csubstr("\\b");
case '\v': return c4::csubstr("\\v");
case '\a': return c4::csubstr("\\a");
default: return c4::csubstr(&c, 1);
substr buf = _dbg_buf();
const size_t needed_size = c4::format_dump(&_dbg_dumper, buf, fmt, args...);
C4_CHECK(needed_size <= buf.len);
}
}
inline void __c4presc(const char *s, size_t len)
inline void __c4presc(const char *s, size_t len, bool keep_newlines=false)
{
RYML_ASSERT(s || !len);
size_t prev = 0;
for(size_t i = 0; i < len; ++i)
{
switch(s[i])
{
case '\n' : if(i > prev) { fwrite(s+prev, 1, i-prev, stdout); } putchar('\\'); putchar('n'); putchar('\n'); prev = i+1; break;
case '\t' : if(i > prev) { fwrite(s+prev, 1, i-prev, stdout); } putchar('\\'); putchar('t'); prev = i+1; break;
case '\0' : if(i > prev) { fwrite(s+prev, 1, i-prev, stdout); } putchar('\\'); putchar('0'); prev = i+1; break;
case '\r' : if(i > prev) { fwrite(s+prev, 1, i-prev, stdout); } putchar('\\'); putchar('r'); prev = i+1; break;
case '\f' : if(i > prev) { fwrite(s+prev, 1, i-prev, stdout); } putchar('\\'); putchar('f'); prev = i+1; break;
case '\b' : if(i > prev) { fwrite(s+prev, 1, i-prev, stdout); } putchar('\\'); putchar('b'); prev = i+1; break;
case '\v' : if(i > prev) { fwrite(s+prev, 1, i-prev, stdout); } putchar('\\'); putchar('v'); prev = i+1; break;
case '\a' : if(i > prev) { fwrite(s+prev, 1, i-prev, stdout); } putchar('\\'); putchar('a'); prev = i+1; break;
case '\x1b': if(i > prev) { fwrite(s+prev, 1, i-prev, stdout); } putchar('\\'); putchar('e'); prev = i+1; break;
case '\n' : _dbg_printf("{}{}{}", csubstr(s+prev, i-prev), csubstr("\\n"), csubstr(keep_newlines ? "\n":"")); prev = i+1; break;
case '\t' : _dbg_printf("{}{}", csubstr(s+prev, i-prev), csubstr("\\t")); prev = i+1; break;
case '\0' : _dbg_printf("{}{}", csubstr(s+prev, i-prev), csubstr("\\0")); prev = i+1; break;
case '\r' : _dbg_printf("{}{}", csubstr(s+prev, i-prev), csubstr("\\r")); prev = i+1; break;
case '\f' : _dbg_printf("{}{}", csubstr(s+prev, i-prev), csubstr("\\f")); prev = i+1; break;
case '\b' : _dbg_printf("{}{}", csubstr(s+prev, i-prev), csubstr("\\b")); prev = i+1; break;
case '\v' : _dbg_printf("{}{}", csubstr(s+prev, i-prev), csubstr("\\v")); prev = i+1; break;
case '\a' : _dbg_printf("{}{}", csubstr(s+prev, i-prev), csubstr("\\a")); prev = i+1; break;
case '\x1b': _dbg_printf("{}{}", csubstr(s+prev, i-prev), csubstr("\\x1b")); prev = i+1; break;
case -0x3e/*0xc2u*/:
if(i+1 < len)
{
if(s[i+1] == -0x60/*0xa0u*/)
{
if(i > prev) { fwrite(s+prev, 1, i-prev, stdout); } putchar('\\'); putchar('_'); prev = i+2; ++i;
_dbg_printf("{}{}", csubstr(s+prev, i-prev), csubstr("\\_")); prev = i+1;
}
else if(s[i+1] == -0x7b/*0x85u*/)
{
if(i > prev) { fwrite(s+prev, 1, i-prev, stdout); } putchar('\\'); putchar('N'); prev = i+2; ++i;
_dbg_printf("{}{}", csubstr(s+prev, i-prev), csubstr("\\N")); prev = i+1;
}
break;
}
break;
case -0x1e/*0xe2u*/:
if(i+2 < len && s[i+1] == -0x80/*0x80u*/)
{
if(s[i+2] == -0x58/*0xa8u*/)
{
if(i > prev) { fwrite(s+prev, 1, i-prev, stdout); } putchar('\\'); putchar('L'); prev = i+3; i += 2;
_dbg_printf("{}{}", csubstr(s+prev, i-prev), csubstr("\\L")); prev = i+1;
}
else if(s[i+2] == -0x57/*0xa9u*/)
{
if(i > prev) { fwrite(s+prev, 1, i-prev, stdout); } putchar('\\'); putchar('P'); prev = i+3; i += 2;
_dbg_printf("{}{}", csubstr(s+prev, i-prev), csubstr("\\P")); prev = i+1;
}
break;
}
break;
}
}
if(len > prev)
fwrite(s + prev, 1, len - prev, stdout);
_dbg_printf("{}", csubstr(s+prev, len-prev));
}
inline void __c4presc(csubstr s, bool keep_newlines=false)
{
__c4presc(s.str, s.len, keep_newlines);
}
} // namespace c4
#endif // RYML_DBG
#pragma clang diagnostic pop
#pragma GCC diagnostic pop
@@ -134,5 +152,4 @@ inline void __c4presc(const char *s, size_t len)
# pragma warning(pop)
#endif
#endif /* _C4_YML_DETAIL_PARSER_DBG_HPP_ */
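
When the library is built with RYML_DBG, the trace above can now be toggled at runtime; a short sketch using the _dbg_set_enabled helper defined in this header:

    #ifdef RYML_DBG
    #include <ryml.hpp>
    #include <c4/yml/detail/parser_dbg.hpp>

    ryml::Tree quiet_parse(c4::substr src)
    {
        c4::_dbg_set_enabled(false);   // silence the parser trace
        ryml::Tree t = ryml::parse_in_place(src);
        c4::_dbg_set_enabled(true);    // restore it
        return t;
    }
    #endif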


@@ -4,20 +4,76 @@
#include "c4/yml/tree.hpp"
#include "c4/yml/node.hpp"
#ifdef RYML_DBG
#define _c4dbg_tree(...) print_tree(__VA_ARGS__)
#define _c4dbg_node(...) print_tree(__VA_ARGS__)
#else
#define _c4dbg_tree(...)
#define _c4dbg_node(...)
#endif
namespace c4 {
namespace yml {
C4_SUPPRESS_WARNING_GCC_CLANG_WITH_PUSH("-Wold-style-cast")
C4_SUPPRESS_WARNING_GCC("-Wuseless-cast")
inline size_t print_node(Tree const& p, size_t node, int level, size_t count, bool print_children)
inline const char* _container_style_code(Tree const& p, id_type node)
{
printf("[%zd]%*s[%zd] %p", count, (2*level), "", node, (void const*)p.get(node));
if(p.is_container(node))
{
if(p._p(node)->m_type & (FLOW_SL|FLOW_ML))
{
return "[FLOW]";
}
if(p._p(node)->m_type & (BLOCK))
{
return "[BLCK]";
}
}
return "";
}
inline char _scalar_code(NodeType masked)
{
if(masked & (KEY_LITERAL|VAL_LITERAL))
return '|';
if(masked & (KEY_FOLDED|VAL_FOLDED))
return '>';
if(masked & (KEY_SQUO|VAL_SQUO))
return '\'';
if(masked & (KEY_DQUO|VAL_DQUO))
return '"';
if(masked & (KEY_PLAIN|VAL_PLAIN))
return '~';
return '@';
}
inline char _scalar_code_key(NodeType t)
{
return _scalar_code(t & KEY_STYLE);
}
inline char _scalar_code_val(NodeType t)
{
return _scalar_code(t & VAL_STYLE);
}
inline char _scalar_code_key(Tree const& p, id_type node)
{
return _scalar_code_key(p._p(node)->m_type);
}
inline char _scalar_code_val(Tree const& p, id_type node)
{
return _scalar_code_val(p._p(node)->m_type);
}
inline id_type print_node(Tree const& p, id_type node, int level, id_type count, bool print_children)
{
printf("[%zu]%*s[%zu] %p", (size_t)count, (2*level), "", (size_t)node, (void const*)p.get(node));
if(p.is_root(node))
{
printf(" [ROOT]");
}
printf(" %s:", p.type_str(node));
char typebuf[128];
csubstr typestr = p.type(node).type_str(typebuf);
RYML_CHECK(typestr.str);
printf(" %.*s", (int)typestr.len, typestr.str);
if(p.has_key(node))
{
if(p.has_key_anchor(node))
@@ -28,65 +84,47 @@ inline size_t print_node(Tree const& p, size_t node, int level, size_t count, bo
if(p.has_key_tag(node))
{
csubstr kt = p.key_tag(node);
csubstr k = p.key(node);
printf(" %.*s '%.*s'", (int)kt.len, kt.str, (int)k.len, k.str);
}
else
{
csubstr k = p.key(node);
printf(" '%.*s'", (int)k.len, k.str);
}
}
else
{
RYML_ASSERT( ! p.has_key_tag(node));
}
if(p.has_val(node))
{
if(p.has_val_tag(node))
{
csubstr vt = p.val_tag(node);
csubstr v = p.val(node);
printf(" %.*s '%.*s'", (int)vt.len, vt.str, (int)v.len, v.str);
}
else
{
csubstr v = p.val(node);
printf(" '%.*s'", (int)v.len, v.str);
}
}
else
{
if(p.has_val_tag(node))
{
csubstr vt = p.val_tag(node);
printf(" %.*s", (int)vt.len, vt.str);
printf(" <%.*s>", (int)kt.len, kt.str);
}
const char code = _scalar_code_key(p, node);
csubstr k = p.key(node);
printf(" %c%.*s%c :", code, (int)k.len, k.str, code);
}
if(p.has_val_anchor(node))
{
auto &a = p.val_anchor(node);
printf(" valanchor='&%.*s'", (int)a.len, a.str);
csubstr a = p.val_anchor(node);
printf(" &%.*s'", (int)a.len, a.str);
}
printf(" (%zd sibs)", p.num_siblings(node));
if(p.has_val_tag(node))
{
csubstr vt = p.val_tag(node);
printf(" <%.*s>", (int)vt.len, vt.str);
}
if(p.has_val(node))
{
const char code = _scalar_code_val(p, node);
csubstr v = p.val(node);
printf(" %c%.*s%c", code, (int)v.len, v.str, code);
}
printf(" (%zu sibs)", (size_t)p.num_siblings(node));
++count;
if(p.is_container(node))
if(!p.is_container(node))
{
printf(" %zd children:\n", p.num_children(node));
printf("\n");
}
else
{
printf(" (%zu children)\n", (size_t)p.num_children(node));
if(print_children)
{
for(size_t i = p.first_child(node); i != NONE; i = p.next_sibling(i))
for(id_type i = p.first_child(node); i != NONE; i = p.next_sibling(i))
{
count = print_node(p, i, level+1, count, print_children);
}
}
}
else
{
printf("\n");
}
return count;
}
@@ -106,21 +144,37 @@ inline void print_node(ConstNodeRef const& p, int level=0)
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
inline size_t print_tree(Tree const& p, size_t node=NONE)
inline id_type print_tree(const char *message, Tree const& p, id_type node=NONE)
{
printf("--------------------------------------\n");
size_t ret = 0;
if(message != nullptr)
printf("%s:\n", message);
id_type ret = 0;
if(!p.empty())
{
if(node == NONE)
node = p.root_id();
ret = print_node(p, node, 0, 0, true);
}
printf("#nodes=%zd vs #printed=%zd\n", p.size(), ret);
printf("#nodes=%zu vs #printed=%zu\n", (size_t)p.size(), (size_t)ret);
printf("--------------------------------------\n");
return ret;
}
inline id_type print_tree(Tree const& p, id_type node=NONE)
{
return print_tree(nullptr, p, node);
}
inline void print_tree(ConstNodeRef const& p, int level)
{
print_node(p, level);
for(ConstNodeRef ch : p.children())
{
print_tree(ch, level+1);
}
}
C4_SUPPRESS_WARNING_GCC_CLANG_POP
} /* namespace yml */
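
These printers are debug helpers rather than emitters; a short usage sketch (detail/print.hpp is not pulled in by ryml.hpp, so it is included explicitly):

    #include <ryml.hpp>
    #include <c4/yml/detail/print.hpp>

    void debug_dump(ryml::Tree const& t)
    {
        // prints node ids, the full type string, styled key/val scalars,
        // anchors and tags, and the child/sibling counts of every node
        ryml::print_tree("after parse", t);
    }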


@@ -18,21 +18,26 @@ C4_SUPPRESS_WARNING_GCC_CLANG_WITH_PUSH("-Wold-style-cast")
namespace detail {
/** A lightweight contiguous stack with SSO. This avoids a dependency on std. */
template<class T, size_t N=16>
/** A lightweight contiguous stack with Small Storage
* Optimization. This is required because std::vector can throw
* exceptions, and we don't want to enforce any particular error
* mechanism. */
template<class T, id_type N=16>
class stack
{
static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
static_assert(std::is_trivially_destructible<T>::value, "T must be trivially destructible");
enum : size_t { sso_size = N };
public:
enum : id_type { sso_size = N };
public:
T m_buf[N];
T m_buf[size_t(N)];
T * m_stack;
size_t m_size;
size_t m_capacity;
id_type m_size;
id_type m_capacity;
Callbacks m_callbacks;
public:
@@ -79,29 +84,29 @@ public:
public:
size_t size() const { return m_size; }
size_t empty() const { return m_size == 0; }
size_t capacity() const { return m_capacity; }
id_type size() const { return m_size; }
id_type empty() const { return m_size == 0; }
id_type capacity() const { return m_capacity; }
void clear()
{
m_size = 0;
}
void resize(size_t sz)
void resize(id_type sz)
{
reserve(sz);
m_size = sz;
}
void reserve(size_t sz);
void reserve(id_type sz);
void push(T const& C4_RESTRICT n)
{
RYML_ASSERT((const char*)&n + sizeof(T) < (const char*)m_stack || &n > m_stack + m_capacity);
_RYML_CB_ASSERT(m_callbacks, (const char*)&n + sizeof(T) < (const char*)m_stack || &n > m_stack + m_capacity);
if(m_size == m_capacity)
{
size_t cap = m_capacity == 0 ? N : 2 * m_capacity;
id_type cap = m_capacity == 0 ? N : 2 * m_capacity;
reserve(cap);
}
m_stack[m_size] = n;
@@ -110,10 +115,10 @@ public:
void push_top()
{
RYML_ASSERT(m_size > 0);
_RYML_CB_ASSERT(m_callbacks, m_size > 0);
if(m_size == m_capacity)
{
size_t cap = m_capacity == 0 ? N : 2 * m_capacity;
id_type cap = m_capacity == 0 ? N : 2 * m_capacity;
reserve(cap);
}
m_stack[m_size] = m_stack[m_size - 1];
@@ -122,25 +127,25 @@ public:
T const& C4_RESTRICT pop()
{
RYML_ASSERT(m_size > 0);
_RYML_CB_ASSERT(m_callbacks, m_size > 0);
--m_size;
return m_stack[m_size];
}
C4_ALWAYS_INLINE T const& C4_RESTRICT top() const { RYML_ASSERT(m_size > 0); return m_stack[m_size - 1]; }
C4_ALWAYS_INLINE T & C4_RESTRICT top() { RYML_ASSERT(m_size > 0); return m_stack[m_size - 1]; }
C4_ALWAYS_INLINE T const& C4_RESTRICT top() const { _RYML_CB_ASSERT(m_callbacks, m_size > 0); return m_stack[m_size - 1]; }
C4_ALWAYS_INLINE T & C4_RESTRICT top() { _RYML_CB_ASSERT(m_callbacks, m_size > 0); return m_stack[m_size - 1]; }
C4_ALWAYS_INLINE T const& C4_RESTRICT bottom() const { RYML_ASSERT(m_size > 0); return m_stack[0]; }
C4_ALWAYS_INLINE T & C4_RESTRICT bottom() { RYML_ASSERT(m_size > 0); return m_stack[0]; }
C4_ALWAYS_INLINE T const& C4_RESTRICT bottom() const { _RYML_CB_ASSERT(m_callbacks, m_size > 0); return m_stack[0]; }
C4_ALWAYS_INLINE T & C4_RESTRICT bottom() { _RYML_CB_ASSERT(m_callbacks, m_size > 0); return m_stack[0]; }
C4_ALWAYS_INLINE T const& C4_RESTRICT top(size_t i) const { RYML_ASSERT(i < m_size); return m_stack[m_size - 1 - i]; }
C4_ALWAYS_INLINE T & C4_RESTRICT top(size_t i) { RYML_ASSERT(i < m_size); return m_stack[m_size - 1 - i]; }
C4_ALWAYS_INLINE T const& C4_RESTRICT top(id_type i) const { _RYML_CB_ASSERT(m_callbacks, i < m_size); return m_stack[m_size - 1 - i]; }
C4_ALWAYS_INLINE T & C4_RESTRICT top(id_type i) { _RYML_CB_ASSERT(m_callbacks, i < m_size); return m_stack[m_size - 1 - i]; }
C4_ALWAYS_INLINE T const& C4_RESTRICT bottom(size_t i) const { RYML_ASSERT(i < m_size); return m_stack[i]; }
C4_ALWAYS_INLINE T & C4_RESTRICT bottom(size_t i) { RYML_ASSERT(i < m_size); return m_stack[i]; }
C4_ALWAYS_INLINE T const& C4_RESTRICT bottom(id_type i) const { _RYML_CB_ASSERT(m_callbacks, i < m_size); return m_stack[i]; }
C4_ALWAYS_INLINE T & C4_RESTRICT bottom(id_type i) { _RYML_CB_ASSERT(m_callbacks, i < m_size); return m_stack[i]; }
C4_ALWAYS_INLINE T const& C4_RESTRICT operator[](size_t i) const { RYML_ASSERT(i < m_size); return m_stack[i]; }
C4_ALWAYS_INLINE T & C4_RESTRICT operator[](size_t i) { RYML_ASSERT(i < m_size); return m_stack[i]; }
C4_ALWAYS_INLINE T const& C4_RESTRICT operator[](id_type i) const { _RYML_CB_ASSERT(m_callbacks, i < m_size); return m_stack[i]; }
C4_ALWAYS_INLINE T & C4_RESTRICT operator[](id_type i) { _RYML_CB_ASSERT(m_callbacks, i < m_size); return m_stack[i]; }
public:
@@ -154,10 +159,12 @@ public:
const_iterator end () const { return (const_iterator)m_stack + m_size; }
public:
void _free();
void _cp(stack const* C4_RESTRICT that);
void _mv(stack * that);
void _cb(Callbacks const& cb);
};
@@ -165,8 +172,8 @@ public:
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
template<class T, size_t N>
void stack<T, N>::reserve(size_t sz)
template<class T, id_type N>
void stack<T, N>::reserve(id_type sz)
{
if(sz <= m_size)
return;
@@ -176,11 +183,12 @@ void stack<T, N>::reserve(size_t sz)
m_capacity = N;
return;
}
T *buf = (T*) m_callbacks.m_allocate(sz * sizeof(T), m_stack, m_callbacks.m_user_data);
memcpy(buf, m_stack, m_size * sizeof(T));
T *buf = (T*) m_callbacks.m_allocate((size_t)sz * sizeof(T), m_stack, m_callbacks.m_user_data);
_RYML_CB_ASSERT(m_callbacks, ((uintptr_t)buf % alignof(T)) == 0u);
memcpy(buf, m_stack, (size_t)m_size * sizeof(T));
if(m_stack != m_buf)
{
m_callbacks.m_free(m_stack, m_capacity * sizeof(T), m_callbacks.m_user_data);
m_callbacks.m_free(m_stack, (size_t)m_capacity * sizeof(T), m_callbacks.m_user_data);
}
m_stack = buf;
m_capacity = sz;
@@ -189,38 +197,38 @@ void stack<T, N>::reserve(size_t sz)
//-----------------------------------------------------------------------------
template<class T, size_t N>
template<class T, id_type N>
void stack<T, N>::_free()
{
RYML_ASSERT(m_stack != nullptr); // this structure cannot be memset() to zero
_RYML_CB_ASSERT(m_callbacks, m_stack != nullptr); // this structure cannot be memset() to zero
if(m_stack != m_buf)
{
m_callbacks.m_free(m_stack, m_capacity * sizeof(T), m_callbacks.m_user_data);
m_callbacks.m_free(m_stack, (size_t)m_capacity * sizeof(T), m_callbacks.m_user_data);
m_stack = m_buf;
m_size = N;
m_capacity = N;
}
else
{
RYML_ASSERT(m_capacity == N);
_RYML_CB_ASSERT(m_callbacks, m_capacity == N);
}
}
//-----------------------------------------------------------------------------
template<class T, size_t N>
template<class T, id_type N>
void stack<T, N>::_cp(stack const* C4_RESTRICT that)
{
if(that->m_stack != that->m_buf)
{
RYML_ASSERT(that->m_capacity > N);
RYML_ASSERT(that->m_size <= that->m_capacity);
_RYML_CB_ASSERT(m_callbacks, that->m_capacity > N);
_RYML_CB_ASSERT(m_callbacks, that->m_size <= that->m_capacity);
}
else
{
RYML_ASSERT(that->m_capacity <= N);
RYML_ASSERT(that->m_size <= that->m_capacity);
_RYML_CB_ASSERT(m_callbacks, that->m_capacity <= N);
_RYML_CB_ASSERT(m_callbacks, that->m_size <= that->m_capacity);
}
memcpy(m_stack, that->m_stack, that->m_size * sizeof(T));
m_size = that->m_size;
@@ -231,19 +239,19 @@ void stack<T, N>::_cp(stack const* C4_RESTRICT that)
//-----------------------------------------------------------------------------
template<class T, size_t N>
template<class T, id_type N>
void stack<T, N>::_mv(stack * that)
{
if(that->m_stack != that->m_buf)
{
RYML_ASSERT(that->m_capacity > N);
RYML_ASSERT(that->m_size <= that->m_capacity);
_RYML_CB_ASSERT(m_callbacks, that->m_capacity > N);
_RYML_CB_ASSERT(m_callbacks, that->m_size <= that->m_capacity);
m_stack = that->m_stack;
}
else
{
RYML_ASSERT(that->m_capacity <= N);
RYML_ASSERT(that->m_size <= that->m_capacity);
_RYML_CB_ASSERT(m_callbacks, that->m_capacity <= N);
_RYML_CB_ASSERT(m_callbacks, that->m_size <= that->m_capacity);
memcpy(m_buf, that->m_buf, that->m_size * sizeof(T));
m_stack = m_buf;
}
@@ -251,7 +259,7 @@ void stack<T, N>::_mv(stack * that)
m_capacity = that->m_capacity;
m_callbacks = that->m_callbacks;
// make sure no deallocation happens on destruction
RYML_ASSERT(that->m_stack != m_buf);
_RYML_CB_ASSERT(m_callbacks, that->m_stack != m_buf);
that->m_stack = that->m_buf;
that->m_capacity = N;
that->m_size = 0;
@@ -260,7 +268,7 @@ void stack<T, N>::_mv(stack * that)
//-----------------------------------------------------------------------------
template<class T, size_t N>
template<class T, id_type N>
void stack<T, N>::_cb(Callbacks const& cb)
{
if(cb != m_callbacks)
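
For illustration, the container behaves like a tiny non-throwing vector that keeps the first N elements in place and reports failures through the given Callbacks; a minimal sketch, not part of this diff:

    #include <c4/yml/detail/stack.hpp>

    void stack_sketch(c4::yml::Callbacks const& cb)
    {
        c4::yml::detail::stack<int, 4> st(cb); // first 4 elements live in m_buf
        st.push(1);
        st.push(2);
        int v = st.top();                      // 2
        st.pop();
        (void)v;
        for(int i = 0; i < 8; ++i)             // growing past the SSO capacity
            st.push(i);                        // allocates through cb.m_allocate
    }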

(File diff suppressed because it is too large.)


@@ -15,13 +15,21 @@
#include "./node.hpp"
#endif
#ifdef emit
#error "emit is defined, likely from a Qt include. This will cause a compilation error. See https://github.com/biojppm/rapidyaml/issues/120"
#endif
#define RYML_DEPRECATE_EMIT \
RYML_DEPRECATED("use emit_yaml() instead. See https://github.com/biojppm/rapidyaml/issues/120")
RYML_DEPRECATED("use emit_yaml() instead. " \
"See https://github.com/biojppm/rapidyaml/issues/120")
#define RYML_DEPRECATE_EMITRS \
RYML_DEPRECATED("use emitrs_yaml() instead. See https://github.com/biojppm/rapidyaml/issues/120")
RYML_DEPRECATED("use emitrs_yaml() instead. " \
"See https://github.com/biojppm/rapidyaml/issues/120")
#ifdef emit
#error "emit is defined, likely from a Qt include. " \
"This will cause a compilation error. " \
"See https://github.com/biojppm/rapidyaml/issues/120"
#endif
C4_SUPPRESS_WARNING_GCC_CLANG_WITH_PUSH("-Wold-style-cast")
//-----------------------------------------------------------------------------
@@ -83,7 +91,7 @@ public:
* @param error_on_excess when true, an error is raised when the
* output buffer is too small for the emitted YAML/JSON
* */
substr emit_as(EmitType_e type, Tree const& t, size_t id, bool error_on_excess);
substr emit_as(EmitType_e type, Tree const& t, id_type id, bool error_on_excess);
/** emit starting at the root node */
substr emit_as(EmitType_e type, Tree const& t, bool error_on_excess=true);
/** emit the given node */
@@ -92,27 +100,30 @@ public:
private:
Tree const* C4_RESTRICT m_tree;
bool m_flow;
void _emit_yaml(size_t id);
void _do_visit_flow_sl(size_t id, size_t ilevel=0);
void _do_visit_flow_ml(size_t id, size_t ilevel=0, size_t do_indent=1);
void _do_visit_block(size_t id, size_t ilevel=0, size_t do_indent=1);
void _do_visit_block_container(size_t id, size_t next_level, size_t do_indent);
void _do_visit_json(size_t id);
void _emit_yaml(id_type id);
void _do_visit_flow_sl(id_type id, id_type ilevel=0);
void _do_visit_flow_ml(id_type id, id_type ilevel=0, id_type do_indent=1);
void _do_visit_block(id_type id, id_type ilevel=0, id_type do_indent=1);
void _do_visit_block_container(id_type id, id_type next_level, bool do_indent);
void _do_visit_json(id_type id);
private:
void _write(NodeScalar const& C4_RESTRICT sc, NodeType flags, size_t level);
void _write(NodeScalar const& C4_RESTRICT sc, NodeType flags, id_type level);
void _write_json(NodeScalar const& C4_RESTRICT sc, NodeType flags);
void _write_doc(size_t id);
void _write_scalar(csubstr s, bool was_quoted);
void _write_scalar_json(csubstr s, bool as_key, bool was_quoted);
void _write_scalar_literal(csubstr s, size_t level, bool as_key, bool explicit_indentation=false);
void _write_scalar_folded(csubstr s, size_t level, bool as_key);
void _write_scalar_squo(csubstr s, size_t level);
void _write_scalar_dquo(csubstr s, size_t level);
void _write_scalar_plain(csubstr s, size_t level);
void _write_doc(id_type id);
void _write_scalar_json_dquo(csubstr s);
void _write_scalar_literal(csubstr s, id_type level, bool as_key);
void _write_scalar_folded(csubstr s, id_type level, bool as_key);
void _write_scalar_squo(csubstr s, id_type level);
void _write_scalar_dquo(csubstr s, id_type level);
void _write_scalar_plain(csubstr s, id_type level);
size_t _write_escaped_newlines(csubstr s, size_t i);
size_t _write_indented_block(csubstr s, size_t i, id_type level);
void _write_tag(csubstr tag)
{
@@ -122,18 +133,28 @@ private:
}
enum : type_bits {
_keysc = (KEY|KEYREF|KEYANCH|KEYQUO|_WIP_KEY_STYLE) | ~(VAL|VALREF|VALANCH|VALQUO|_WIP_VAL_STYLE),
_valsc = ~(KEY|KEYREF|KEYANCH|KEYQUO|_WIP_KEY_STYLE) | (VAL|VALREF|VALANCH|VALQUO|_WIP_VAL_STYLE),
_keysc = (KEY|KEYREF|KEYANCH|KEYQUO|KEY_STYLE) | ~(VAL|VALREF|VALANCH|VALQUO|VAL_STYLE) | CONTAINER_STYLE,
_valsc = ~(KEY|KEYREF|KEYANCH|KEYQUO|KEY_STYLE) | (VAL|VALREF|VALANCH|VALQUO|VAL_STYLE) | CONTAINER_STYLE,
_keysc_json = (KEY) | ~(VAL),
_valsc_json = ~(KEY) | (VAL),
};
C4_ALWAYS_INLINE void _writek(size_t id, size_t level) { _write(m_tree->keysc(id), m_tree->_p(id)->m_type.type & ~_valsc, level); }
C4_ALWAYS_INLINE void _writev(size_t id, size_t level) { _write(m_tree->valsc(id), m_tree->_p(id)->m_type.type & ~_keysc, level); }
C4_ALWAYS_INLINE void _writek(id_type id, id_type level) { _write(m_tree->keysc(id), (m_tree->_p(id)->m_type.type & ~_valsc), level); }
C4_ALWAYS_INLINE void _writev(id_type id, id_type level) { _write(m_tree->valsc(id), (m_tree->_p(id)->m_type.type & ~_keysc), level); }
C4_ALWAYS_INLINE void _writek_json(size_t id) { _write_json(m_tree->keysc(id), m_tree->_p(id)->m_type.type & ~(VAL)); }
C4_ALWAYS_INLINE void _writev_json(size_t id) { _write_json(m_tree->valsc(id), m_tree->_p(id)->m_type.type & ~(KEY)); }
C4_ALWAYS_INLINE void _writek_json(id_type id) { _write_json(m_tree->keysc(id), m_tree->_p(id)->m_type.type & ~(VAL)); }
C4_ALWAYS_INLINE void _writev_json(id_type id) { _write_json(m_tree->valsc(id), m_tree->_p(id)->m_type.type & ~(KEY)); }
void _indent(id_type level, bool enabled)
{
if(enabled)
this->Writer::_do_write(' ', 2u * (size_t)level);
}
void _indent(id_type level)
{
if(!m_flow)
this->Writer::_do_write(' ', 2u * (size_t)level);
}
};
@@ -149,14 +170,14 @@ private:
/** emit YAML to the given file. A null file defaults to stdout.
* Return the number of bytes written. */
inline size_t emit_yaml(Tree const& t, size_t id, FILE *f)
inline size_t emit_yaml(Tree const& t, id_type id, FILE *f)
{
EmitterFile em(f);
return em.emit_as(EMIT_YAML, t, id, /*error_on_excess*/true).len;
}
/** emit JSON to the given file. A null file defaults to stdout.
* Return the number of bytes written. */
inline size_t emit_json(Tree const& t, size_t id, FILE *f)
inline size_t emit_json(Tree const& t, id_type id, FILE *f)
{
EmitterFile em(f);
return em.emit_as(EMIT_JSON, t, id, /*error_on_excess*/true).len;
@@ -265,17 +286,29 @@ inline OStream& operator<< (OStream& s, as_json const& j)
*/
/** emit YAML to the given buffer. Return a substr trimmed to the emitted YAML.
* @param t the tree to emit.
* @param id the node where to start emitting.
* @param buf the output buffer.
* @param error_on_excess Raise an error if the space in the buffer is insufficient.
* @return a substr trimmed to the result. If the buffer is
* insufficient (and error_on_excess is false), the pointer of the
* result will be set to null.
* @overload */
inline substr emit_yaml(Tree const& t, size_t id, substr buf, bool error_on_excess=true)
inline substr emit_yaml(Tree const& t, id_type id, substr buf, bool error_on_excess=true)
{
EmitterBuf em(buf);
return em.emit_as(EMIT_YAML, t, id, error_on_excess);
}
/** emit JSON to the given buffer. Return a substr trimmed to the emitted JSON.
* @param t the tree to emit.
* @param id the node where to start emitting.
* @param buf the output buffer.
* @param error_on_excess Raise an error if the space in the buffer is insufficient.
* @return a substr trimmed to the result. If the buffer is
* insufficient (and error_on_excess is false), the pointer of the
* result will be set to null.
* @overload */
inline substr emit_json(Tree const& t, size_t id, substr buf, bool error_on_excess=true)
inline substr emit_json(Tree const& t, id_type id, substr buf, bool error_on_excess=true)
{
EmitterBuf em(buf);
return em.emit_as(EMIT_JSON, t, id, error_on_excess);
@@ -283,7 +316,12 @@ inline substr emit_json(Tree const& t, size_t id, substr buf, bool error_on_exce
/** emit YAML to the given buffer. Return a substr trimmed to the emitted YAML.
* @param t the tree; will be emitted from the root node.
* @param error_on_excess Raise an error if the space in the buffer is insufficient.
* @param buf the output buffer.
* @return a substr trimmed to the result. If the buffer is
* insufficient (and error_on_excess is false), the pointer of the
* result will be set to null.
* @overload */
inline substr emit_yaml(Tree const& t, substr buf, bool error_on_excess=true)
{
@@ -291,7 +329,12 @@ inline substr emit_yaml(Tree const& t, substr buf, bool error_on_excess=true)
return em.emit_as(EMIT_YAML, t, error_on_excess);
}
/** emit JSON to the given buffer. Return a substr trimmed to the emitted JSON.
* @param t the tree; will be emitted from the root node.
* @param buf the output buffer.
* @param error_on_excess Raise an error if the space in the buffer is insufficient.
* @return a substr trimmed to the result. If the buffer is
* insufficient (and error_on_excess is false), the pointer of the
* result will be set to null.
* @overload */
inline substr emit_json(Tree const& t, substr buf, bool error_on_excess=true)
{
@@ -301,7 +344,12 @@ inline substr emit_json(Tree const& t, substr buf, bool error_on_excess=true)
/** emit YAML to the given buffer. Return a substr trimmed to the emitted YAML.
* @param r the starting node.
* @param buf the output buffer.
* @param error_on_excess Raise an error if the space in the buffer is insufficient.
* @return a substr trimmed to the result. If the buffer is
* insufficient (and error_on_excess is false), the pointer of the
* result will be set to null.
* @overload
*/
inline substr emit_yaml(ConstNodeRef const& r, substr buf, bool error_on_excess=true)
@@ -310,7 +358,12 @@ inline substr emit_yaml(ConstNodeRef const& r, substr buf, bool error_on_excess=
return em.emit_as(EMIT_YAML, r, error_on_excess);
}
/** emit JSON to the given buffer. Return a substr trimmed to the emitted JSON.
* @param r the starting node.
* @param buf the output buffer.
* @param error_on_excess Raise an error if the space in the buffer is insufficient.
* @return a substr trimmed to the result. If the buffer is
* insufficient (and error_on_excess is false), the pointer of the
* result will be set to null.
* @overload
*/
inline substr emit_json(ConstNodeRef const& r, substr buf, bool error_on_excess=true)
@@ -325,7 +378,7 @@ inline substr emit_json(ConstNodeRef const& r, substr buf, bool error_on_excess=
/** emit+resize: emit YAML to the given `std::string`/`std::vector`-like
* container, resizing it as needed to fit the emitted YAML. */
template<class CharOwningContainer>
substr emitrs_yaml(Tree const& t, size_t id, CharOwningContainer * cont)
substr emitrs_yaml(Tree const& t, id_type id, CharOwningContainer * cont)
{
substr buf = to_substr(*cont);
substr ret = emit_yaml(t, id, buf, /*error_on_excess*/false);
@@ -340,7 +393,7 @@ substr emitrs_yaml(Tree const& t, size_t id, CharOwningContainer * cont)
/** emit+resize: emit JSON to the given `std::string`/`std::vector`-like
* container, resizing it as needed to fit the emitted JSON. */
template<class CharOwningContainer>
substr emitrs_json(Tree const& t, size_t id, CharOwningContainer * cont)
substr emitrs_json(Tree const& t, id_type id, CharOwningContainer * cont)
{
substr buf = to_substr(*cont);
substr ret = emit_json(t, id, buf, /*error_on_excess*/false);
@@ -357,7 +410,7 @@ substr emitrs_json(Tree const& t, size_t id, CharOwningContainer * cont)
/** emit+resize: emit YAML to the given `std::string`/`std::vector`-like
* container, resizing it as needed to fit the emitted YAML. */
template<class CharOwningContainer>
CharOwningContainer emitrs_yaml(Tree const& t, size_t id)
CharOwningContainer emitrs_yaml(Tree const& t, id_type id)
{
CharOwningContainer c;
emitrs_yaml(t, id, &c);
@@ -366,7 +419,7 @@ CharOwningContainer emitrs_yaml(Tree const& t, size_t id)
/** emit+resize: emit JSON to the given `std::string`/`std::vector`-like
* container, resizing it as needed to fit the emitted JSON. */
template<class CharOwningContainer>
CharOwningContainer emitrs_json(Tree const& t, size_t id)
CharOwningContainer emitrs_json(Tree const& t, id_type id)
{
CharOwningContainer c;
emitrs_json(t, id, &c);
@@ -464,7 +517,7 @@ CharOwningContainer emitrs_json(ConstNodeRef const& n)
/** @cond dev */
RYML_DEPRECATE_EMIT inline size_t emit(Tree const& t, size_t id, FILE *f)
RYML_DEPRECATE_EMIT inline size_t emit(Tree const& t, id_type id, FILE *f)
{
return emit_yaml(t, id, f);
}
@@ -477,7 +530,7 @@ RYML_DEPRECATE_EMIT inline size_t emit(ConstNodeRef const& r, FILE *f=nullptr)
return emit_yaml(r, f);
}
RYML_DEPRECATE_EMIT inline substr emit(Tree const& t, size_t id, substr buf, bool error_on_excess=true)
RYML_DEPRECATE_EMIT inline substr emit(Tree const& t, id_type id, substr buf, bool error_on_excess=true)
{
return emit_yaml(t, id, buf, error_on_excess);
}
@@ -491,12 +544,12 @@ RYML_DEPRECATE_EMIT inline substr emit(ConstNodeRef const& r, substr buf, bool e
}
template<class CharOwningContainer>
RYML_DEPRECATE_EMITRS substr emitrs(Tree const& t, size_t id, CharOwningContainer * cont)
RYML_DEPRECATE_EMITRS substr emitrs(Tree const& t, id_type id, CharOwningContainer * cont)
{
return emitrs_yaml(t, id, cont);
}
template<class CharOwningContainer>
RYML_DEPRECATE_EMITRS CharOwningContainer emitrs(Tree const& t, size_t id)
RYML_DEPRECATE_EMITRS CharOwningContainer emitrs(Tree const& t, id_type id)
{
return emitrs_yaml<CharOwningContainer>(t, id);
}
@@ -526,6 +579,8 @@ RYML_DEPRECATE_EMITRS CharOwningContainer emitrs(ConstNodeRef const& n)
} // namespace yml
} // namespace c4
C4_SUPPRESS_WARNING_GCC_CLANG_POP
#undef RYML_DEPRECATE_EMIT
#undef RYML_DEPRECATE_EMITRS

@@ -0,0 +1,136 @@
#ifndef _C4_YML_EVENT_HANDLER_STACK_HPP_
#define _C4_YML_EVENT_HANDLER_STACK_HPP_
#ifndef _C4_YML_DETAIL_STACK_HPP_
#include "c4/yml/detail/stack.hpp"
#endif
#ifndef _C4_YML_DETAIL_PARSER_DBG_HPP_
#include "c4/yml/detail/parser_dbg.hpp"
#endif
#ifndef _C4_YML_PARSER_STATE_HPP_
#include "c4/yml/parser_state.hpp"
#endif
#ifdef RYML_DBG
#ifndef _C4_YML_DETAIL_PRINT_HPP_
#include "c4/yml/detail/print.hpp"
#endif
#endif
namespace c4 {
namespace yml {
/** @addtogroup doc_event_handlers
* @{ */
/** Use this class as a base for event handler implementations, to
 * simplify the stack logic. */
template<class HandlerImpl, class HandlerState>
struct EventHandlerStack
{
static_assert(std::is_base_of<ParserState, HandlerState>::value,
"ParserState must be a base of HandlerState");
using state = HandlerState;
public:
detail::stack<state> m_stack;
state *C4_RESTRICT m_curr; ///< current stack level: top of the stack. cached here for easier access.
state *C4_RESTRICT m_parent; ///< parent of the current stack level.
protected:
EventHandlerStack() : m_stack(), m_curr(), m_parent() {}
EventHandlerStack(Callbacks const& cb) : m_stack(cb), m_curr(), m_parent() {}
protected:
void _stack_reset_root()
{
m_stack.clear();
m_stack.push({});
m_parent = nullptr;
m_curr = &m_stack.top();
}
void _stack_reset_non_root()
{
m_stack.clear();
m_stack.push({}); // parent
m_stack.push({}); // node
m_parent = &m_stack.top(1);
m_curr = &m_stack.top();
}
void _stack_push()
{
m_stack.push_top();
m_parent = &m_stack.top(1); // don't use m_curr. watch out for relocations inside the prev push
m_curr = &m_stack.top();
m_curr->reset_after_push();
}
void _stack_pop()
{
_RYML_CB_ASSERT(m_stack.m_callbacks, m_parent);
_RYML_CB_ASSERT(m_stack.m_callbacks, m_stack.size() > 1);
m_parent->reset_before_pop(*m_curr);
m_stack.pop();
m_parent = m_stack.size() > 1 ? &m_stack.top(1) : nullptr;
m_curr = &m_stack.top();
#ifdef RYML_DBG
if(m_parent)
_c4dbgpf("popped! top is now node={} (parent={})", m_curr->node_id, m_parent->node_id);
else
_c4dbgpf("popped! top is now node={} @ ROOT", m_curr->node_id);
#endif
}
protected:
// undefined at the end
#define _has_any_(bits) (static_cast<HandlerImpl const* C4_RESTRICT>(this)->template _has_any__<bits>())
bool _stack_should_push_on_begin_doc() const
{
const bool is_root = (m_stack.size() == 1u);
return is_root && (_has_any_(DOC|VAL|MAP|SEQ) || m_curr->has_children);
}
bool _stack_should_pop_on_end_doc() const
{
const bool is_root = (m_stack.size() == 1u);
return !is_root && _has_any_(DOC);
}
public:
/** Check whether the current parse tokens are trailing on the
* previous doc, and raise an error if they are. This function is
* called by the parse engine (not the event handler) before a doc
* is started. */
void check_trailing_doc_token() const
{
const bool is_root = (m_stack.size() == 1u);
const bool isndoc = (m_curr->flags & NDOC) != 0;
const bool suspicious = _has_any_(MAP|SEQ|VAL);
_c4dbgpf("target={} isroot={} suspicious={} ndoc={}", m_curr->node_id, is_root, suspicious, isndoc);
if((is_root || _has_any_(DOC)) && suspicious && !isndoc)
_RYML_CB_ERR_(m_stack.m_callbacks, "parse error", m_curr->pos);
}
protected:
#undef _has_any_
};
/** @} */
} // namespace yml
} // namespace c4
#endif /* _C4_YML_EVENT_HANDLER_STACK_HPP_ */
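// Illustrative sketch, not part of this header: the minimum scaffolding needed for
// an event handler to reuse the stack logic above. MyHandlerState and MyHandler are
// hypothetical names; a real handler must also implement the full event interface
// required by ParseEngine (see EventHandlerTree below for a complete implementation).
struct MyHandlerState : public ParserState
{
    NodeType pending_type; // whatever per-level data the handler needs
};
struct MyHandler : public EventHandlerStack<MyHandler, MyHandlerState>
{
    // required by the base class: this is what its _has_any_() helper macro calls
    template<type_bits bits> bool _has_any__() const noexcept
    {
        return (m_curr->pending_type.type & bits) != 0;
    }
    void begin_map_val_block() { _stack_push(); } // open a nested level...
    void end_map()             { _stack_pop();  } // ...and close it again
};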

@@ -0,0 +1,700 @@
#ifndef _C4_YML_EVENT_HANDLER_TREE_HPP_
#define _C4_YML_EVENT_HANDLER_TREE_HPP_
#ifndef _C4_YML_TREE_HPP_
#include "c4/yml/tree.hpp"
#endif
#ifndef _C4_YML_EVENT_HANDLER_STACK_HPP_
#include "c4/yml/event_handler_stack.hpp"
#endif
C4_SUPPRESS_WARNING_MSVC_WITH_PUSH(4702) // unreachable code
namespace c4 {
namespace yml {
/** @addtogroup doc_event_handlers
* @{ */
struct EventHandlerTreeState : public ParserState
{
NodeData *tr_data;
};
/** The event handler to create a ryml @ref Tree. See the
* documentation for @ref doc_event_handlers, which has important
* notes about the event model used by rapidyaml. */
struct EventHandlerTree : public EventHandlerStack<EventHandlerTree, EventHandlerTreeState>
{
/** @name types
* @{ */
// our internal state must inherit from parser state
using state = EventHandlerTreeState;
/** @} */
public:
/** @cond dev */
static constexpr const bool is_events = false; // remove
static constexpr const bool is_wtree = true;
Tree *C4_RESTRICT m_tree;
id_type m_id;
#ifdef RYML_DBG
#define _enable_(bits) _enable__<bits>(); _c4dbgpf("node[{}]: enable {}", m_curr->node_id, #bits)
#define _disable_(bits) _disable__<bits>(); _c4dbgpf("node[{}]: disable {}", m_curr->node_id, #bits)
#else
#define _enable_(bits) _enable__<bits>()
#define _disable_(bits) _disable__<bits>()
#endif
#define _has_any_(bits) _has_any__<bits>()
/** @endcond */
public:
/** @name construction and resetting
* @{ */
EventHandlerTree() : EventHandlerStack(), m_tree(), m_id(NONE) {}
EventHandlerTree(Callbacks const& cb) : EventHandlerStack(cb), m_tree(), m_id(NONE) {}
EventHandlerTree(Tree *tree, id_type id) : EventHandlerStack(tree->callbacks()), m_tree(tree), m_id(id)
{
reset(tree, id);
}
void reset(Tree *tree, id_type id)
{
RYML_CHECK(tree);
RYML_CHECK(id < tree->capacity());
if(!tree->is_root(id))
if(tree->is_map(tree->parent(id)))
if(!tree->has_key(id))
c4::yml::error("destination node belongs to a map and has no key");
m_tree = tree;
m_id = id;
if(m_tree->is_root(id))
{
_stack_reset_root();
_reset_parser_state(m_curr, id, m_tree->root_id());
}
else
{
_stack_reset_non_root();
_reset_parser_state(m_parent, id, m_tree->parent(id));
_reset_parser_state(m_curr, id, id);
}
}
/** @} */
public:
/** @name parse events
* @{ */
void start_parse(const char* filename)
{
m_curr->start_parse(filename, m_curr->node_id);
}
void finish_parse()
{
/* This pointer is temporary. Remember that:
*
* - this handler object may be held by the user
* - it may be used with a temporary tree inside the parse function
* - when the parse function returns the temporary tree, its address
* will change
*
* As a result, the user could try to read the tree from m_tree, and
* end up reading the stale temporary object.
*
* So it is better to clear it here; then the user will get an obvious
* segfault when reading from m_tree. */
m_tree = nullptr;
}
void cancel_parse()
{
m_tree = nullptr;
}
/** @} */
public:
/** @name YAML stream events */
/** @{ */
C4_ALWAYS_INLINE void begin_stream() const noexcept { /*nothing to do*/ }
C4_ALWAYS_INLINE void end_stream() const noexcept { /*nothing to do*/ }
/** @} */
public:
/** @name YAML document events */
/** @{ */
/** implicit doc start (without ---) */
void begin_doc()
{
_c4dbgp("begin_doc");
if(_stack_should_push_on_begin_doc())
{
_c4dbgp("push!");
_tr_set_root_as_stream();
_push();
_enable_(DOC);
}
}
/** implicit doc end (without ...) */
void end_doc()
{
_c4dbgp("end_doc");
if(_stack_should_pop_on_end_doc())
{
_tr_remove_speculative();
_c4dbgp("pop!");
_pop();
}
}
/** explicit doc start, with --- */
void begin_doc_expl()
{
_c4dbgp("begin_doc_expl");
_RYML_CB_ASSERT(m_stack.m_callbacks, m_tree->root_id() == m_curr->node_id);
if(!m_tree->is_stream(m_tree->root_id())) //if(_should_push_on_begin_doc())
{
_c4dbgp("ensure stream");
_tr_set_root_as_stream();
id_type first = m_tree->first_child(m_tree->root_id());
_RYML_CB_ASSERT(m_stack.m_callbacks, m_tree->is_stream(m_tree->root_id()));
_RYML_CB_ASSERT(m_stack.m_callbacks, m_tree->num_children(m_tree->root_id()) == 1u);
if(m_tree->has_children(first) || m_tree->is_val(first))
{
_c4dbgp("push!");
_push();
}
else
{
_c4dbgp("tweak");
_push();
_tr_remove_speculative();
m_curr->node_id = m_tree->last_child(m_tree->root_id());
m_curr->tr_data = m_tree->_p(m_curr->node_id);
}
}
else
{
_c4dbgp("push!");
_push();
}
_enable_(DOC);
}
/** explicit doc end, with ... */
void end_doc_expl()
{
_c4dbgp("end_doc_expl");
{
_tr_remove_speculative();
}
if(_stack_should_pop_on_end_doc())
{
_c4dbgp("pop!");
_pop();
}
}
/** @} */
public:
/** @name YAML map events */
/** @{ */
void begin_map_key_flow()
{
_RYML_CB_ERR_(m_stack.m_callbacks, "ryml trees cannot handle containers as keys", m_curr->pos);
}
void begin_map_key_block()
{
_RYML_CB_ERR_(m_stack.m_callbacks, "ryml trees cannot handle containers as keys", m_curr->pos);
}
void begin_map_val_flow()
{
_c4dbgpf("node[{}]: begin_map_val_flow", m_curr->node_id);
_enable_(MAP|FLOW_SL);
_tr_save_loc();
_push();
}
void begin_map_val_block()
{
_c4dbgpf("node[{}]: begin_map_val_block", m_curr->node_id);
_enable_(MAP|BLOCK);
_tr_save_loc();
_push();
}
void end_map()
{
_pop();
_c4dbgpf("node[{}]: end_map_val", m_curr->node_id);
}
/** @} */
public:
/** @name YAML seq events */
/** @{ */
void begin_seq_key_flow()
{
_RYML_CB_ERR_(m_stack.m_callbacks, "ryml trees cannot handle containers as keys", m_curr->pos);
}
void begin_seq_key_block()
{
_RYML_CB_ERR_(m_stack.m_callbacks, "ryml trees cannot handle containers as keys", m_curr->pos);
}
void begin_seq_val_flow()
{
_c4dbgpf("node[{}]: begin_seq_val_flow", m_curr->node_id);
_enable_(SEQ|FLOW_SL);
_tr_save_loc();
_push();
}
void begin_seq_val_block()
{
_c4dbgpf("node[{}]: begin_seq_val_block", m_curr->node_id);
_enable_(SEQ|BLOCK);
_tr_save_loc();
_push();
}
void end_seq()
{
_pop();
_c4dbgpf("node[{}]: end_seq_val", m_curr->node_id);
}
/** @} */
public:
/** @name YAML structure events */
/** @{ */
void add_sibling()
{
_RYML_CB_ASSERT(m_stack.m_callbacks, m_parent);
_RYML_CB_ASSERT(m_stack.m_callbacks, m_tree->has_children(m_parent->node_id));
NodeData const* prev = m_tree->m_buf; // watch out for relocation of the tree nodes
_tr_set_state_(m_curr, m_tree->_append_child__unprotected(m_parent->node_id));
if(prev != m_tree->m_buf)
_tr_refresh_after_relocation();
_c4dbgpf("node[{}]: added sibling={} prev={}", m_parent->node_id, m_curr->node_id, m_tree->prev_sibling(m_curr->node_id));
}
/** set the previous val as the first key of a new map, with flow style.
*
* See the documentation for @ref doc_event_handlers, which has
* important notes about this event.
*/
void actually_val_is_first_key_of_new_map_flow()
{
if(C4_UNLIKELY(m_tree->is_container(m_curr->node_id)))
_RYML_CB_ERR_(m_stack.m_callbacks, "ryml trees cannot handle containers as keys", m_curr->pos);
_RYML_CB_ASSERT(m_stack.m_callbacks, m_parent);
_RYML_CB_ASSERT(m_stack.m_callbacks, m_tree->is_seq(m_parent->node_id));
_RYML_CB_ASSERT(m_stack.m_callbacks, !m_tree->is_container(m_curr->node_id));
_RYML_CB_ASSERT(m_stack.m_callbacks, !m_tree->has_key(m_curr->node_id));
const NodeData tmp = _tr_val2key_(*m_curr->tr_data);
_disable_(_VALMASK|VAL_STYLE);
m_curr->tr_data->m_val = {};
begin_map_val_flow();
m_curr->tr_data->m_type = tmp.m_type;
m_curr->tr_data->m_key = tmp.m_key;
}
/** like its flow counterpart, but this function can only be
* called after the end of a flow-val at root or doc level.
*
* See the documentation for @ref doc_event_handlers, which has
* important notes about this event.
*/
void actually_val_is_first_key_of_new_map_block()
{
_RYML_CB_ERR_(m_stack.m_callbacks, "ryml trees cannot handle containers as keys", m_curr->pos);
}
/** @} */
public:
/** @name YAML scalar events */
/** @{ */
C4_ALWAYS_INLINE void set_key_scalar_plain(csubstr scalar)
{
_c4dbgpf("node[{}]: set key scalar plain: [{}]~~~{}~~~ ({})", m_curr->node_id, scalar.len, scalar, reinterpret_cast<void const*>(scalar.str));
m_curr->tr_data->m_key.scalar = scalar;
_enable_(KEY|KEY_PLAIN);
}
C4_ALWAYS_INLINE void set_val_scalar_plain(csubstr scalar)
{
_c4dbgpf("node[{}]: set val scalar plain: [{}]~~~{}~~~ ({})", m_curr->node_id, scalar.len, scalar, reinterpret_cast<void const*>(scalar.str));
m_curr->tr_data->m_val.scalar = scalar;
_enable_(VAL|VAL_PLAIN);
}
C4_ALWAYS_INLINE void set_key_scalar_dquoted(csubstr scalar)
{
_c4dbgpf("node[{}]: set key scalar dquot: [{}]~~~{}~~~ ({})", m_curr->node_id, scalar.len, scalar, reinterpret_cast<void const*>(scalar.str));
m_curr->tr_data->m_key.scalar = scalar;
_enable_(KEY|KEY_DQUO);
}
C4_ALWAYS_INLINE void set_val_scalar_dquoted(csubstr scalar)
{
_c4dbgpf("node[{}]: set val scalar dquot: [{}]~~~{}~~~ ({})", m_curr->node_id, scalar.len, scalar, reinterpret_cast<void const*>(scalar.str));
m_curr->tr_data->m_val.scalar = scalar;
_enable_(VAL|VAL_DQUO);
}
C4_ALWAYS_INLINE void set_key_scalar_squoted(csubstr scalar)
{
_c4dbgpf("node[{}]: set key scalar squot: [{}]~~~{}~~~ ({})", m_curr->node_id, scalar.len, scalar, reinterpret_cast<void const*>(scalar.str));
m_curr->tr_data->m_key.scalar = scalar;
_enable_(KEY|KEY_SQUO);
}
C4_ALWAYS_INLINE void set_val_scalar_squoted(csubstr scalar)
{
_c4dbgpf("node[{}]: set val scalar squot: [{}]~~~{}~~~ ({})", m_curr->node_id, scalar.len, scalar, reinterpret_cast<void const*>(scalar.str));
m_curr->tr_data->m_val.scalar = scalar;
_enable_(VAL|VAL_SQUO);
}
C4_ALWAYS_INLINE void set_key_scalar_literal(csubstr scalar)
{
_c4dbgpf("node[{}]: set key scalar literal: [{}]~~~{}~~~ ({})", m_curr->node_id, scalar.len, scalar, reinterpret_cast<void const*>(scalar.str));
m_curr->tr_data->m_key.scalar = scalar;
_enable_(KEY|KEY_LITERAL);
}
C4_ALWAYS_INLINE void set_val_scalar_literal(csubstr scalar)
{
_c4dbgpf("node[{}]: set val scalar literal: [{}]~~~{}~~~ ({})", m_curr->node_id, scalar.len, scalar, reinterpret_cast<void const*>(scalar.str));
m_curr->tr_data->m_val.scalar = scalar;
_enable_(VAL|VAL_LITERAL);
}
C4_ALWAYS_INLINE void set_key_scalar_folded(csubstr scalar)
{
_c4dbgpf("node[{}]: set key scalar folded: [{}]~~~{}~~~ ({})", m_curr->node_id, scalar.len, scalar, reinterpret_cast<void const*>(scalar.str));
m_curr->tr_data->m_key.scalar = scalar;
_enable_(KEY|KEY_FOLDED);
}
C4_ALWAYS_INLINE void set_val_scalar_folded(csubstr scalar)
{
_c4dbgpf("node[{}]: set val scalar folded: [{}]~~~{}~~~ ({})", m_curr->node_id, scalar.len, scalar, reinterpret_cast<void const*>(scalar.str));
m_curr->tr_data->m_val.scalar = scalar;
_enable_(VAL|VAL_FOLDED);
}
C4_ALWAYS_INLINE void mark_key_scalar_unfiltered()
{
_enable_(KEY_UNFILT);
}
C4_ALWAYS_INLINE void mark_val_scalar_unfiltered()
{
_enable_(VAL_UNFILT);
}
/** @} */
public:
/** @name YAML anchor/reference events */
/** @{ */
void set_key_anchor(csubstr anchor)
{
_c4dbgpf("node[{}]: set key anchor: [{}]~~~{}~~~", m_curr->node_id, anchor.len, anchor);
RYML_ASSERT(!anchor.begins_with('&'));
_enable_(KEYANCH);
m_curr->tr_data->m_key.anchor = anchor;
}
void set_val_anchor(csubstr anchor)
{
_c4dbgpf("node[{}]: set val anchor: [{}]~~~{}~~~", m_curr->node_id, anchor.len, anchor);
RYML_ASSERT(!anchor.begins_with('&'));
_enable_(VALANCH);
m_curr->tr_data->m_val.anchor = anchor;
}
void set_key_ref(csubstr ref)
{
_c4dbgpf("node[{}]: set key ref: [{}]~~~{}~~~", m_curr->node_id, ref.len, ref);
RYML_ASSERT(ref.begins_with('*'));
_enable_(KEY|KEYREF);
m_curr->tr_data->m_key.anchor = ref.sub(1);
m_curr->tr_data->m_key.scalar = ref;
}
void set_val_ref(csubstr ref)
{
_c4dbgpf("node[{}]: set val ref: [{}]~~~{}~~~", m_curr->node_id, ref.len, ref);
RYML_ASSERT(ref.begins_with('*'));
_enable_(VAL|VALREF);
m_curr->tr_data->m_val.anchor = ref.sub(1);
m_curr->tr_data->m_val.scalar = ref;
}
/** @} */
public:
/** @name YAML tag events */
/** @{ */
void set_key_tag(csubstr tag)
{
_c4dbgpf("node[{}]: set key tag: [{}]~~~{}~~~", m_curr->node_id, tag.len, tag);
_enable_(KEYTAG);
m_curr->tr_data->m_key.tag = tag;
}
void set_val_tag(csubstr tag)
{
_c4dbgpf("node[{}]: set val tag: [{}]~~~{}~~~", m_curr->node_id, tag.len, tag);
_enable_(VALTAG);
m_curr->tr_data->m_val.tag = tag;
}
/** @} */
public:
/** @name YAML directive events */
/** @{ */
void add_directive(csubstr directive)
{
_c4dbgpf("% directive! {}", directive);
_RYML_CB_ASSERT(m_stack.m_callbacks, directive.begins_with('%'));
if(directive.begins_with("%TAG"))
{
// TODO do not use directives in the tree
_RYML_CB_CHECK(m_stack.m_callbacks, m_tree->add_tag_directive(directive));
}
else if(directive.begins_with("%YAML"))
{
_c4dbgpf("%YAML directive! ignoring...: {}", directive);
}
else
{
_c4dbgpf("% directive unknown! ignoring...: {}", directive);
}
}
/** @} */
public:
/** @name arena functions */
/** @{ */
substr alloc_arena(size_t len)
{
return m_tree->alloc_arena(len);
}
/** @} */
public:
/** @cond dev */
void _reset_parser_state(state* st, id_type parse_root, id_type node)
{
_tr_set_state_(st, node);
const NodeType type = m_tree->type(node);
#ifdef RYML_DBG
char flagbuf[80];
#endif
_c4dbgpf("resetting state: initial flags={}", detail::_parser_flags_to_str(flagbuf, st->flags));
if(type == NOTYPE)
{
_c4dbgpf("node[{}] is notype", node);
if(m_tree->is_root(parse_root))
{
_c4dbgpf("node[{}] is root", node);
st->flags |= RUNK|RTOP;
}
else
{
_c4dbgpf("node[{}] is not root. setting USTY", node);
st->flags |= USTY;
}
}
else if(type.is_map())
{
_c4dbgpf("node[{}] is map", node);
st->flags |= RMAP|USTY;
}
else if(type.is_seq())
{
_c4dbgpf("node[{}] is map", node);
st->flags |= RSEQ|USTY;
}
else if(type.has_key())
{
_c4dbgpf("node[{}] has key. setting USTY", node);
st->flags |= USTY;
}
else
{
_RYML_CB_ERR(m_stack.m_callbacks, "cannot append to node");
}
if(type.is_doc())
{
_c4dbgpf("node[{}] is doc", node);
st->flags |= RDOC;
}
_c4dbgpf("resetting state: final flags={}", detail::_parser_flags_to_str(flagbuf, st->flags));
}
/** push a new parent, add a child to the new parent, and set the
* child as the current node */
void _push()
{
_stack_push();
NodeData const* prev = m_tree->m_buf; // watch out against relocation of the tree nodes
m_curr->node_id = m_tree->_append_child__unprotected(m_parent->node_id);
m_curr->tr_data = m_tree->_p(m_curr->node_id);
if(prev != m_tree->m_buf)
_tr_refresh_after_relocation();
_c4dbgpf("pushed! level={}. top is now node={} (parent={})", m_curr->level, m_curr->node_id, m_parent ? m_parent->node_id : NONE);
}
/** end the current scope */
void _pop()
{
_tr_remove_speculative_with_parent();
_stack_pop();
}
public:
template<type_bits bits> C4_HOT C4_ALWAYS_INLINE void _enable__() noexcept
{
m_curr->tr_data->m_type.type = static_cast<NodeType_e>(m_curr->tr_data->m_type.type | bits);
}
template<type_bits bits> C4_HOT C4_ALWAYS_INLINE void _disable__() noexcept
{
m_curr->tr_data->m_type.type = static_cast<NodeType_e>(m_curr->tr_data->m_type.type & (~bits));
}
template<type_bits bits> C4_HOT C4_ALWAYS_INLINE bool _has_any__() const noexcept
{
return (m_curr->tr_data->m_type.type & bits) != 0;
}
public:
C4_ALWAYS_INLINE void _tr_set_state_(state *C4_RESTRICT s, id_type id) noexcept
{
s->node_id = id;
s->tr_data = m_tree->_p(id);
}
void _tr_refresh_after_relocation()
{
_c4dbgp("tree: refreshing stack data after tree data relocation");
for(auto &st : m_stack)
st.tr_data = m_tree->_p(st.node_id);
}
void _tr_set_root_as_stream()
{
_c4dbgp("set root as stream");
_RYML_CB_ASSERT(m_stack.m_callbacks, m_tree->root_id() == 0u);
_RYML_CB_ASSERT(m_stack.m_callbacks, m_curr->node_id == 0u);
const bool hack = !m_tree->has_children(m_curr->node_id) && !m_tree->is_val(m_curr->node_id);
if(hack)
m_tree->_p(m_tree->root_id())->m_type.add(VAL);
m_tree->set_root_as_stream();
_RYML_CB_ASSERT(m_stack.m_callbacks, m_tree->is_stream(m_tree->root_id()));
_RYML_CB_ASSERT(m_stack.m_callbacks, m_tree->has_children(m_tree->root_id()));
_RYML_CB_ASSERT(m_stack.m_callbacks, m_tree->is_doc(m_tree->first_child(m_tree->root_id())));
if(hack)
m_tree->_p(m_tree->first_child(m_tree->root_id()))->m_type.rem(VAL);
_tr_set_state_(m_curr, m_tree->root_id());
}
static NodeData _tr_val2key_(NodeData const& C4_RESTRICT d) noexcept
{
NodeData r = d;
r.m_key = d.m_val;
r.m_val = {};
r.m_type = d.m_type;
static_assert((_VALMASK >> 1u) == _KEYMASK, "required for this function to work");
static_assert((VAL_STYLE >> 1u) == KEY_STYLE, "required for this function to work");
r.m_type.type = ((d.m_type.type & (_VALMASK|VAL_STYLE)) >> 1u);
r.m_type.type = (r.m_type.type & ~(_VALMASK|VAL_STYLE));
r.m_type.type = (r.m_type.type | KEY);
return r;
}
void _tr_remove_speculative()
{
_c4dbgp("remove speculative node");
_RYML_CB_ASSERT(m_stack.m_callbacks, m_tree->size() > 0);
const id_type last_added = m_tree->size() - 1;
if(m_tree->has_parent(last_added))
if(m_tree->_p(last_added)->m_type == NOTYPE)
m_tree->remove(last_added);
}
void _tr_remove_speculative_with_parent()
{
_RYML_CB_ASSERT(m_stack.m_callbacks, m_tree->size() > 0);
const id_type last_added = m_tree->size() - 1;
_RYML_CB_ASSERT(m_stack.m_callbacks, m_tree->has_parent(last_added));
if(m_tree->_p(last_added)->m_type == NOTYPE)
{
_c4dbgpf("remove speculative node with parent. parent={} node={} parent(node)={}", m_parent->node_id, last_added, m_tree->parent(last_added));
m_tree->remove(last_added);
}
}
C4_ALWAYS_INLINE void _tr_save_loc()
{
m_tree->_p(m_curr->node_id)->m_val.scalar.str = m_curr->line_contents.rem.str;
}
#undef _enable_
#undef _disable_
#undef _has_any_
/** @endcond */
};
/** @} */
} // namespace yml
} // namespace c4
C4_SUPPRESS_WARNING_MSVC_POP
#endif /* _C4_YML_EVENT_HANDLER_TREE_HPP_ */
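// Usage note (illustrative sketch, not part of this header): end users normally do
// not drive EventHandlerTree directly. It is the handler behind the default Parser
// alias (see fwd.hpp) and is exercised through the usual parse entry points. The
// function name below is hypothetical.
inline void event_handler_tree_example()
{
    Tree tree = parse_in_arena("{foo: bar, seq: [1, 2, 3]}");
    ConstNodeRef root = tree.rootref();
    // begin_map_val_flow()/set_val_scalar_plain() above produced a flow-style map
    RYML_ASSERT(root.is_map());
    RYML_ASSERT(root["seq"].num_children() == 3);
}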

@@ -0,0 +1,512 @@
#ifndef _C4_YML_FILTER_PROCESSOR_HPP_
#define _C4_YML_FILTER_PROCESSOR_HPP_
#include "c4/yml/common.hpp"
#ifdef RYML_DBG
#include "c4/charconv.hpp"
#include "c4/yml/detail/parser_dbg.hpp"
#endif
namespace c4 {
namespace yml {
/** @defgroup doc_filter_processors Scalar filter processors
*
* These are internal classes used by @ref ParseEngine to parse the
* scalars; normally there is no reason for a user to be manually
* using these classes.
*
* @ingroup doc_parse */
/** @{ */
//-----------------------------------------------------------------------------
/** Filters an input string into a different output string */
struct FilterProcessorSrcDst
{
csubstr src;
substr dst;
size_t rpos; ///< read position
size_t wpos; ///< write position
C4_ALWAYS_INLINE FilterProcessorSrcDst(csubstr src_, substr dst_) noexcept
: src(src_)
, dst(dst_)
, rpos(0)
, wpos(0)
{
RYML_ASSERT(!dst.overlaps(src));
}
C4_ALWAYS_INLINE void setwpos(size_t wpos_) noexcept { wpos = wpos_; }
C4_ALWAYS_INLINE void setpos(size_t rpos_, size_t wpos_) noexcept { rpos = rpos_; wpos = wpos_; }
C4_ALWAYS_INLINE void set_at_end() noexcept { skip(src.len - rpos); }
C4_ALWAYS_INLINE bool has_more_chars() const noexcept { return rpos < src.len; }
C4_ALWAYS_INLINE bool has_more_chars(size_t maxpos) const noexcept { RYML_ASSERT(maxpos <= src.len); return rpos < maxpos; }
C4_ALWAYS_INLINE csubstr rem() const noexcept { return src.sub(rpos); }
C4_ALWAYS_INLINE csubstr sofar() const noexcept { return csubstr(dst.str, wpos <= dst.len ? wpos : dst.len); }
C4_ALWAYS_INLINE FilterResult result() const noexcept
{
FilterResult ret;
ret.str.str = wpos <= dst.len ? dst.str : nullptr;
ret.str.len = wpos;
return ret;
}
C4_ALWAYS_INLINE char curr() const noexcept { RYML_ASSERT(rpos < src.len); return src[rpos]; }
C4_ALWAYS_INLINE char next() const noexcept { return rpos+1 < src.len ? src[rpos+1] : '\0'; }
C4_ALWAYS_INLINE bool skipped_chars() const noexcept { return wpos != rpos; }
C4_ALWAYS_INLINE void skip() noexcept { ++rpos; }
C4_ALWAYS_INLINE void skip(size_t num) noexcept { rpos += num; }
C4_ALWAYS_INLINE void set_at(size_t pos, char c) noexcept
{
RYML_ASSERT(pos < wpos);
dst.str[pos] = c;
}
C4_ALWAYS_INLINE void set(char c) noexcept
{
if(wpos < dst.len)
dst.str[wpos] = c;
++wpos;
}
C4_ALWAYS_INLINE void set(char c, size_t num) noexcept
{
RYML_ASSERT(num > 0);
if(wpos + num <= dst.len)
memset(dst.str + wpos, c, num);
wpos += num;
}
C4_ALWAYS_INLINE void copy() noexcept
{
RYML_ASSERT(rpos < src.len);
if(wpos < dst.len)
dst.str[wpos] = src.str[rpos];
++wpos;
++rpos;
}
C4_ALWAYS_INLINE void copy(size_t num) noexcept
{
RYML_ASSERT(num);
RYML_ASSERT(rpos+num <= src.len);
if(wpos + num <= dst.len)
memcpy(dst.str + wpos, src.str + rpos, num);
wpos += num;
rpos += num;
}
C4_ALWAYS_INLINE void translate_esc(char c) noexcept
{
if(wpos < dst.len)
dst.str[wpos] = c;
++wpos;
rpos += 2;
}
C4_ALWAYS_INLINE void translate_esc_bulk(const char *C4_RESTRICT s, size_t nw, size_t nr) noexcept
{
RYML_ASSERT(nw > 0);
RYML_ASSERT(nr > 0);
RYML_ASSERT(rpos+nr <= src.len);
if(wpos+nw <= dst.len)
memcpy(dst.str + wpos, s, nw);
wpos += nw;
rpos += 1 + nr;
}
C4_ALWAYS_INLINE void translate_esc_extending(const char *C4_RESTRICT s, size_t nw, size_t nr) noexcept
{
translate_esc_bulk(s, nw, nr);
}
};
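// Illustrative sketch, not part of this header: driving the processor by hand (the
// parser normally does this) to copy characters while translating the two-character
// escape "\n" into a newline. The function name is hypothetical.
inline FilterResult filter_srcdst_example(csubstr src, substr dst)
{
    FilterProcessorSrcDst proc(src, dst);
    while(proc.has_more_chars())
    {
        if(proc.curr() == '\\' && proc.next() == 'n')
            proc.translate_esc('\n'); // writes 1 char, consumes 2
        else
            proc.copy();              // writes 1 char, consumes 1
    }
    // for src="ab\\ncd" and a dst of at least 5 chars, the result is "ab\ncd";
    // if dst is too small, result().str.str is null and .len holds the needed size
    return proc.result();
}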
//-----------------------------------------------------------------------------
// filter in place
// debugging scaffold
/** @cond dev */
#if defined(RYML_DBG) && 0
#define _c4dbgip(...) _c4dbgpf(__VA_ARGS__)
#else
#define _c4dbgip(...)
#endif
/** @endcond */
/** Filters in place. While the result may be larger than the source,
* any extending happens only at the end of the string. Consequently,
* it's impossible for characters to be left unfiltered.
*
* @see FilterProcessorInplaceMidExtending */
struct FilterProcessorInplaceEndExtending
{
substr src; ///< the subject string
size_t wcap; ///< write capacity - the capacity of the subject string's buffer
size_t rpos; ///< read position
size_t wpos; ///< write position
C4_ALWAYS_INLINE FilterProcessorInplaceEndExtending(substr src_, size_t wcap_) noexcept
: src(src_)
, wcap(wcap_)
, rpos(0)
, wpos(0)
{
RYML_ASSERT(wcap >= src.len);
}
C4_ALWAYS_INLINE void setwpos(size_t wpos_) noexcept { wpos = wpos_; }
C4_ALWAYS_INLINE void setpos(size_t rpos_, size_t wpos_) noexcept { rpos = rpos_; wpos = wpos_; }
C4_ALWAYS_INLINE void set_at_end() noexcept { skip(src.len - rpos); }
C4_ALWAYS_INLINE bool has_more_chars() const noexcept { return rpos < src.len; }
C4_ALWAYS_INLINE bool has_more_chars(size_t maxpos) const noexcept { RYML_ASSERT(maxpos <= src.len); return rpos < maxpos; }
C4_ALWAYS_INLINE FilterResult result() const noexcept
{
_c4dbgip("inplace: wpos={} wcap={} small={}", wpos, wcap, wpos > rpos);
FilterResult ret;
ret.str.str = (wpos <= wcap) ? src.str : nullptr;
ret.str.len = wpos;
return ret;
}
C4_ALWAYS_INLINE csubstr sofar() const noexcept { return csubstr(src.str, wpos <= wcap ? wpos : wcap); }
C4_ALWAYS_INLINE csubstr rem() const noexcept { return src.sub(rpos); }
C4_ALWAYS_INLINE char curr() const noexcept { RYML_ASSERT(rpos < src.len); return src[rpos]; }
C4_ALWAYS_INLINE char next() const noexcept { return rpos+1 < src.len ? src[rpos+1] : '\0'; }
C4_ALWAYS_INLINE void skip() noexcept { ++rpos; }
C4_ALWAYS_INLINE void skip(size_t num) noexcept { rpos += num; }
void set_at(size_t pos, char c) noexcept
{
RYML_ASSERT(pos < wpos);
const size_t save = wpos;
wpos = pos;
set(c);
wpos = save;
}
void set(char c) noexcept
{
if(wpos < wcap) // respect write-capacity
src.str[wpos] = c;
++wpos;
}
void set(char c, size_t num) noexcept
{
RYML_ASSERT(num);
if(wpos + num <= wcap) // respect write-capacity
memset(src.str + wpos, c, num);
wpos += num;
}
void copy() noexcept
{
RYML_ASSERT(wpos <= rpos);
RYML_ASSERT(rpos < src.len);
if(wpos < wcap) // respect write-capacity
src.str[wpos] = src.str[rpos];
++rpos;
++wpos;
}
void copy(size_t num) noexcept
{
RYML_ASSERT(num);
RYML_ASSERT(rpos+num <= src.len);
RYML_ASSERT(wpos <= rpos);
if(wpos + num <= wcap) // respect write-capacity
{
if(wpos + num <= rpos) // there is no overlap
memcpy(src.str + wpos, src.str + rpos, num);
else // there is overlap
memmove(src.str + wpos, src.str + rpos, num);
}
rpos += num;
wpos += num;
}
void translate_esc(char c) noexcept
{
RYML_ASSERT(rpos + 2 <= src.len);
RYML_ASSERT(wpos <= rpos);
if(wpos < wcap) // respect write-capacity
src.str[wpos] = c;
rpos += 2; // add 1u to account for the escape character
++wpos;
}
void translate_esc_bulk(const char *C4_RESTRICT s, size_t nw, size_t nr) noexcept
{
RYML_ASSERT(nw > 0);
RYML_ASSERT(nr > 0);
RYML_ASSERT(nw <= nr + 1u);
RYML_ASSERT(rpos+nr <= src.len);
RYML_ASSERT(wpos <= rpos);
const size_t wpos_next = wpos + nw;
const size_t rpos_next = rpos + nr + 1u; // add 1u to account for the escape character
RYML_ASSERT(wpos_next <= rpos_next);
if(wpos_next <= wcap)
memcpy(src.str + wpos, s, nw);
rpos = rpos_next;
wpos = wpos_next;
}
C4_ALWAYS_INLINE void translate_esc_extending(const char *C4_RESTRICT s, size_t nw, size_t nr) noexcept
{
translate_esc_bulk(s, nw, nr);
}
};
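// Illustrative sketch, not part of this header: in-place filtering of a
// single-quoted-style escape, collapsing '' into a single quote. The function name
// and the concrete input are hypothetical.
inline void filter_inplace_example()
{
    char buf[] = "don''t";
    substr src(buf, 6);
    FilterProcessorInplaceEndExtending proc(src, /*wcap*/6);
    while(proc.has_more_chars())
    {
        if(proc.curr() == '\'' && proc.next() == '\'')
        {
            proc.copy(); // keep one quote...
            proc.skip(); // ...and drop the other
        }
        else
        {
            proc.copy();
        }
    }
    FilterResult result = proc.result();
    // result.str points at buf and holds "don't" (5 chars): the string shrank,
    // so it was filtered fully in place
    (void)result;
}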
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
/** Filters in place. The result may be larger than the source, and
 * extending may happen anywhere, not just at the end of the string. As
 * a result, characters may be left unfiltered when there is no slack
 * in the buffer and the write-position would overlap the
 * read-position. In YAML, this happens only with double-quoted
 * strings, and only with a small number of escape sequences such as
 * `\L`, which is substituted by three bytes. These escape sequences
 * cause a call to translate_esc_extending(), which is the only entry
 * point to this unfiltered situation.
 *
 * @see FilterProcessorInplaceEndExtending */
struct FilterProcessorInplaceMidExtending
{
substr src; ///< the subject string
size_t wcap; ///< write capacity - the capacity of the subject string's buffer
size_t rpos; ///< read position
size_t wpos; ///< write position
size_t maxcap; ///< the max capacity needed for filtering the string. This may be larger than the final string size.
bool unfiltered_chars; ///< set when characters could not be written due to lack of capacity
C4_ALWAYS_INLINE FilterProcessorInplaceMidExtending(substr src_, size_t wcap_) noexcept
: src(src_)
, wcap(wcap_)
, rpos(0)
, wpos(0)
, maxcap(src.len)
, unfiltered_chars(false)
{
RYML_ASSERT(wcap >= src.len);
}
C4_ALWAYS_INLINE void setwpos(size_t wpos_) noexcept { wpos = wpos_; }
C4_ALWAYS_INLINE void setpos(size_t rpos_, size_t wpos_) noexcept { rpos = rpos_; wpos = wpos_; }
C4_ALWAYS_INLINE void set_at_end() noexcept { skip(src.len - rpos); }
C4_ALWAYS_INLINE bool has_more_chars() const noexcept { return rpos < src.len; }
C4_ALWAYS_INLINE bool has_more_chars(size_t maxpos) const noexcept { RYML_ASSERT(maxpos <= src.len); return rpos < maxpos; }
C4_ALWAYS_INLINE FilterResultExtending result() const noexcept
{
_c4dbgip("inplace: wpos={} wcap={} unfiltered={} maxcap={}", this->wpos, this->wcap, this->unfiltered_chars, this->maxcap);
FilterResultExtending ret;
ret.str.str = (wpos <= wcap && !unfiltered_chars) ? src.str : nullptr;
ret.str.len = wpos;
ret.reqlen = maxcap;
return ret;
}
C4_ALWAYS_INLINE csubstr sofar() const noexcept { return csubstr(src.str, wpos <= wcap ? wpos : wcap); }
C4_ALWAYS_INLINE csubstr rem() const noexcept { return src.sub(rpos); }
C4_ALWAYS_INLINE char curr() const noexcept { RYML_ASSERT(rpos < src.len); return src[rpos]; }
C4_ALWAYS_INLINE char next() const noexcept { return rpos+1 < src.len ? src[rpos+1] : '\0'; }
C4_ALWAYS_INLINE void skip() noexcept { ++rpos; }
C4_ALWAYS_INLINE void skip(size_t num) noexcept { rpos += num; }
void set_at(size_t pos, char c) noexcept
{
RYML_ASSERT(pos < wpos);
const size_t save = wpos;
wpos = pos;
set(c);
wpos = save;
}
void set(char c) noexcept
{
if(wpos < wcap) // respect write-capacity
{
if((wpos <= rpos) && !unfiltered_chars)
src.str[wpos] = c;
}
else
{
_c4dbgip("inplace: add unwritten {}->{} maxcap={}->{}!", unfiltered_chars, true, maxcap, (wpos+1u > maxcap ? wpos+1u : maxcap));
unfiltered_chars = true;
}
++wpos;
maxcap = wpos > maxcap ? wpos : maxcap;
}
void set(char c, size_t num) noexcept
{
RYML_ASSERT(num);
if(wpos + num <= wcap) // respect write-capacity
{
if((wpos <= rpos) && !unfiltered_chars)
memset(src.str + wpos, c, num);
}
else
{
_c4dbgip("inplace: add unwritten {}->{} maxcap={}->{}!", unfiltered_chars, true, maxcap, (wpos+num > maxcap ? wpos+num : maxcap));
unfiltered_chars = true;
}
wpos += num;
maxcap = wpos > maxcap ? wpos : maxcap;
}
void copy() noexcept
{
RYML_ASSERT(rpos < src.len);
if(wpos < wcap) // respect write-capacity
{
if((wpos < rpos) && !unfiltered_chars) // write only if wpos is behind rpos
src.str[wpos] = src.str[rpos];
}
else
{
_c4dbgip("inplace: add unwritten {}->{} (wpos={}!=rpos={})={} (wpos={}<wcap={}) maxcap={}->{}!", unfiltered_chars, true, wpos, rpos, wpos!=rpos, wpos, wcap, wpos<wcap, maxcap, (wpos+1u > maxcap ? wpos+1u : maxcap));
unfiltered_chars = true;
}
++rpos;
++wpos;
maxcap = wpos > maxcap ? wpos : maxcap;
}
void copy(size_t num) noexcept
{
RYML_ASSERT(num);
RYML_ASSERT(rpos+num <= src.len);
if(wpos + num <= wcap) // respect write-capacity
{
if((wpos < rpos) && !unfiltered_chars) // write only if wpos is behind rpos
{
if(wpos + num <= rpos) // there is no overlap
memcpy(src.str + wpos, src.str + rpos, num);
else // there is overlap
memmove(src.str + wpos, src.str + rpos, num);
}
}
else
{
_c4dbgip("inplace: add unwritten {}->{} (wpos={}!=rpos={})={} (wpos={}<wcap={}) maxcap={}->{}!", unfiltered_chars, true, wpos, rpos, wpos!=rpos, wpos, wcap, wpos<wcap);
unfiltered_chars = true;
}
rpos += num;
wpos += num;
maxcap = wpos > maxcap ? wpos : maxcap;
}
void translate_esc(char c) noexcept
{
RYML_ASSERT(rpos + 2 <= src.len);
if(wpos < wcap) // respect write-capacity
{
if((wpos <= rpos) && !unfiltered_chars)
src.str[wpos] = c;
}
else
{
_c4dbgip("inplace: add unfiltered {}->{} maxcap={}->{}!", unfiltered_chars, true, maxcap, (wpos+1u > maxcap ? wpos+1u : maxcap));
unfiltered_chars = true;
}
rpos += 2;
++wpos;
maxcap = wpos > maxcap ? wpos : maxcap;
}
C4_NO_INLINE void translate_esc_bulk(const char *C4_RESTRICT s, size_t nw, size_t nr) noexcept
{
RYML_ASSERT(nw > 0);
RYML_ASSERT(nr > 0);
RYML_ASSERT(nr+1u >= nw);
const size_t wpos_next = wpos + nw;
const size_t rpos_next = rpos + nr + 1u; // add 1u to account for the escape character
if(wpos_next <= wcap) // respect write-capacity
{
if((wpos <= rpos) && !unfiltered_chars) // write only if wpos is behind rpos
memcpy(src.str + wpos, s, nw);
}
else
{
_c4dbgip("inplace: add unwritten {}->{} (wpos={}!=rpos={})={} (wpos={}<wcap={}) maxcap={}->{}!", unfiltered_chars, true, wpos, rpos, wpos!=rpos, wpos, wcap, wpos<wcap);
unfiltered_chars = true;
}
rpos = rpos_next;
wpos = wpos_next;
maxcap = wpos > maxcap ? wpos : maxcap;
}
C4_NO_INLINE void translate_esc_extending(const char *C4_RESTRICT s, size_t nw, size_t nr) noexcept
{
RYML_ASSERT(nw > 0);
RYML_ASSERT(nr > 0);
RYML_ASSERT(rpos+nr <= src.len);
const size_t wpos_next = wpos + nw;
const size_t rpos_next = rpos + nr + 1u; // add 1u to account for the escape character
if(wpos_next <= rpos_next) // read and write do not overlap. just do a vanilla copy.
{
if((wpos_next <= wcap) && !unfiltered_chars)
memcpy(src.str + wpos, s, nw);
rpos = rpos_next;
wpos = wpos_next;
maxcap = wpos > maxcap ? wpos : maxcap;
}
else // there is overlap. move the (to-be-read) string to the right.
{
const size_t excess = wpos_next - rpos_next;
RYML_ASSERT(wpos_next > rpos_next);
if(src.len + excess <= wcap) // ensure we do not go past the end
{
RYML_ASSERT(rpos+nr+excess <= src.len);
if(wpos_next <= wcap)
{
if(!unfiltered_chars)
{
memmove(src.str + wpos_next, src.str + rpos_next, src.len - rpos_next);
memcpy(src.str + wpos, s, nw);
}
rpos = wpos_next; // wpos, not rpos
}
else
{
rpos = rpos_next;
//const size_t unw = nw > (nr + 1u) ? nw - (nr + 1u) : 0;
_c4dbgip("inplace: add unfiltered {}->{} maxcap={}->{}!", unfiltered_chars, true);
unfiltered_chars = true;
}
wpos = wpos_next;
// extend the string up to capacity
src.len += excess;
maxcap = wpos > maxcap ? wpos : maxcap;
}
else
{
//const size_t unw = nw > (nr + 1u) ? nw - (nr + 1u) : 0;
RYML_ASSERT(rpos_next <= src.len);
const size_t required_size = wpos_next + (src.len - rpos_next);
_c4dbgip("inplace: add unfiltered {}->{} maxcap={}->{}!", unfiltered_chars, true, maxcap, required_size > maxcap ? required_size : maxcap);
RYML_ASSERT(required_size > wcap);
unfiltered_chars = true;
maxcap = required_size > maxcap ? required_size : maxcap;
wpos = wpos_next;
rpos = rpos_next;
}
}
}
};
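// Illustrative sketch, not part of this header: when an escape expands (here "\L",
// which becomes the 3-byte UTF8 sequence for U+2028) and the buffer has no slack,
// the processor reports the required size instead of filtering. The function name
// is hypothetical; the parser normally chooses nw/nr when calling the processor.
inline void filter_inplace_extending_example()
{
    char buf[] = "\\L";
    substr src(buf, 2);
    FilterProcessorInplaceMidExtending proc(src, /*wcap*/2); // no slack beyond the source
    proc.translate_esc_extending("\xe2\x80\xa8", /*nw*/3, /*nr*/1); // 2 chars in, 3 chars out
    FilterResultExtending result = proc.result();
    // result.str.str is null: the scalar could not be filtered within wcap;
    // result.reqlen is 3: the caller must retry with a buffer at least this large
    (void)result;
}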
#undef _c4dbgip
/** @} */
} // namespace yml
} // namespace c4
#endif /* _C4_YML_FILTER_PROCESSOR_HPP_ */

src/c4/yml/fwd.hpp
@@ -0,0 +1,24 @@
#ifndef _C4_YML_FWD_HPP_
#define _C4_YML_FWD_HPP_
/** @file fwd.hpp forward declarations */
namespace c4 {
namespace yml {
struct NodeScalar;
struct NodeInit;
struct NodeData;
struct NodeType;
class NodeRef;
class ConstNodeRef;
class Tree;
struct ReferenceResolver;
template<class EventHandler> class ParseEngine;
struct EventHandlerTree;
using Parser = ParseEngine<EventHandlerTree>;
} // namespace yml
} // namespace c4
#endif /* _C4_YML_FWD_HPP_ */

@@ -16,6 +16,7 @@
# pragma GCC diagnostic push
# pragma GCC diagnostic ignored "-Wtype-limits"
# pragma GCC diagnostic ignored "-Wold-style-cast"
# pragma GCC diagnostic ignored "-Wuseless-cast"
#elif defined(_MSC_VER)
# pragma warning(push)
# pragma warning(disable: 4251/*needs to have dll-interface to be used by clients of struct*/)
@@ -79,9 +80,9 @@ struct child_iterator
using tree_type = typename NodeRefType::tree_type;
tree_type * C4_RESTRICT m_tree;
size_t m_child_id;
id_type m_child_id;
child_iterator(tree_type * t, size_t id) : m_tree(t), m_child_id(id) {}
child_iterator(tree_type * t, id_type id) : m_tree(t), m_child_id(id) {}
child_iterator& operator++ () { RYML_ASSERT(m_child_id != NONE); m_child_id = m_tree->next_sibling(m_child_id); return *this; }
child_iterator& operator-- () { RYML_ASSERT(m_child_id != NONE); m_child_id = m_tree->prev_sibling(m_child_id); return *this; }
@@ -108,9 +109,9 @@ struct children_view_
};
template<class NodeRefType, class Visitor>
bool _visit(NodeRefType &node, Visitor fn, size_t indentation_level, bool skip_root=false)
bool _visit(NodeRefType &node, Visitor fn, id_type indentation_level, bool skip_root=false)
{
size_t increment = 0;
id_type increment = 0;
if( ! (node.is_root() && skip_root))
{
if(fn(node, indentation_level))
@@ -131,9 +132,9 @@ bool _visit(NodeRefType &node, Visitor fn, size_t indentation_level, bool skip_r
}
template<class NodeRefType, class Visitor>
bool _visit_stacked(NodeRefType &node, Visitor fn, size_t indentation_level, bool skip_root=false)
bool _visit_stacked(NodeRefType &node, Visitor fn, id_type indentation_level, bool skip_root=false)
{
size_t increment = 0;
id_type increment = 0;
if( ! (node.is_root() && skip_root))
{
if(fn(node, indentation_level))
@@ -169,8 +170,9 @@ struct RoNodeMethods;
/** a CRTP base providing read-only methods for @ref ConstNodeRef and @ref NodeRef */
namespace detail {
template<class Impl, class ConstImpl>
struct detail::RoNodeMethods
struct RoNodeMethods
{
C4_SUPPRESS_WARNING_GCC_CLANG_WITH_PUSH("-Wcast-align")
/** @cond dev */
@@ -218,11 +220,14 @@ public:
C4_ALWAYS_INLINE bool key_is_null() const RYML_NOEXCEPT { _C4RR(); return tree_->key_is_null(id_); }
C4_ALWAYS_INLINE bool val_is_null() const RYML_NOEXCEPT { _C4RR(); return tree_->val_is_null(id_); }
C4_ALWAYS_INLINE C4_PURE bool is_key_unfiltered() const noexcept { _C4RR(); return tree_->is_key_unfiltered(id_); }
C4_ALWAYS_INLINE C4_PURE bool is_val_unfiltered() const noexcept { _C4RR(); return tree_->is_val_unfiltered(id_); }
/** @} */
public:
/** @name node property predicates */
/** @name node type predicates */
/** @{ */
C4_ALWAYS_INLINE bool empty() const RYML_NOEXCEPT { _C4RR(); return tree_->empty(id_); } /**< Forward to Tree::empty(). Node must be readable. */
@@ -238,21 +243,52 @@ public:
C4_ALWAYS_INLINE bool has_key_tag() const RYML_NOEXCEPT { _C4RR(); return tree_->has_key_tag(id_); } /**< Forward to Tree::has_key_tag(). Node must be readable. */
C4_ALWAYS_INLINE bool has_val_tag() const RYML_NOEXCEPT { _C4RR(); return tree_->has_val_tag(id_); } /**< Forward to Tree::has_val_tag(). Node must be readable. */
C4_ALWAYS_INLINE bool has_key_anchor() const RYML_NOEXCEPT { _C4RR(); return tree_->has_key_anchor(id_); } /**< Forward to Tree::has_key_anchor(). Node must be readable. */
C4_ALWAYS_INLINE bool is_key_anchor() const RYML_NOEXCEPT { _C4RR(); return tree_->is_key_anchor(id_); } /**< Forward to Tree::is_key_anchor(). Node must be readable. */
C4_ALWAYS_INLINE bool has_val_anchor() const RYML_NOEXCEPT { _C4RR(); return tree_->has_val_anchor(id_); } /**< Forward to Tree::has_val_anchor(). Node must be readable. */
C4_ALWAYS_INLINE bool is_val_anchor() const RYML_NOEXCEPT { _C4RR(); return tree_->is_val_anchor(id_); } /**< Forward to Tree::is_val_anchor(). Node must be readable. */
C4_ALWAYS_INLINE bool has_anchor() const RYML_NOEXCEPT { _C4RR(); return tree_->has_anchor(id_); } /**< Forward to Tree::has_anchor(). Node must be readable. */
C4_ALWAYS_INLINE bool is_anchor() const RYML_NOEXCEPT { _C4RR(); return tree_->is_anchor(id_); } /**< Forward to Tree::is_anchor(). Node must be readable. */
C4_ALWAYS_INLINE bool is_key_ref() const RYML_NOEXCEPT { _C4RR(); return tree_->is_key_ref(id_); } /**< Forward to Tree::is_key_ref(). Node must be readable. */
C4_ALWAYS_INLINE bool is_val_ref() const RYML_NOEXCEPT { _C4RR(); return tree_->is_val_ref(id_); } /**< Forward to Tree::is_val_ref(). Node must be readable. */
C4_ALWAYS_INLINE bool is_ref() const RYML_NOEXCEPT { _C4RR(); return tree_->is_ref(id_); } /**< Forward to Tree::is_ref(). Node must be readable. */
C4_ALWAYS_INLINE bool is_anchor_or_ref() const RYML_NOEXCEPT { _C4RR(); return tree_->is_anchor_or_ref(id_); } /**< Forward to Tree::is_anchor_or_ref(. Node must be readable. */
C4_ALWAYS_INLINE bool is_key_quoted() const RYML_NOEXCEPT { _C4RR(); return tree_->is_key_quoted(id_); } /**< Forward to Tree::is_key_quoted(). Node must be readable. */
C4_ALWAYS_INLINE bool is_val_quoted() const RYML_NOEXCEPT { _C4RR(); return tree_->is_val_quoted(id_); } /**< Forward to Tree::is_val_quoted(). Node must be readable. */
C4_ALWAYS_INLINE bool is_quoted() const RYML_NOEXCEPT { _C4RR(); return tree_->is_quoted(id_); } /**< Forward to Tree::is_quoted(). Node must be readable. */
C4_ALWAYS_INLINE bool parent_is_seq() const RYML_NOEXCEPT { _C4RR(); return tree_->parent_is_seq(id_); } /**< Forward to Tree::parent_is_seq(). Node must be readable. */
C4_ALWAYS_INLINE bool parent_is_map() const RYML_NOEXCEPT { _C4RR(); return tree_->parent_is_map(id_); } /**< Forward to Tree::parent_is_map(). Node must be readable. */
RYML_DEPRECATED("use has_key_anchor()") bool is_key_anchor() const noexcept { _C4RR(); return tree_->has_key_anchor(id_); }
RYML_DEPRECATED("use has_val_anchor()") bool is_val_hanchor() const noexcept { _C4RR(); return tree_->has_val_anchor(id_); }
RYML_DEPRECATED("use has_anchor()") bool is_anchor() const noexcept { _C4RR(); return tree_->has_anchor(id_); }
RYML_DEPRECATED("use has_anchor() || is_ref()") bool is_anchor_or_ref() const noexcept { _C4RR(); return tree_->is_anchor_or_ref(id_); }
/** @} */
public:
/** @name node container+scalar style predicates */
/** @{ */
C4_ALWAYS_INLINE C4_PURE bool type_has_any(NodeType_e bits) const { _C4RR(); return tree_->type_has_any(id_, bits); }
C4_ALWAYS_INLINE C4_PURE bool type_has_all(NodeType_e bits) const { _C4RR(); return tree_->type_has_all(id_, bits); }
C4_ALWAYS_INLINE C4_PURE bool type_has_none(NodeType_e bits) const { _C4RR(); return tree_->type_has_none(id_, bits); }
C4_ALWAYS_INLINE C4_PURE bool is_container_styled() const { _C4RR(); return tree_->is_container_styled(id_); }
C4_ALWAYS_INLINE C4_PURE bool is_block() const { _C4RR(); return tree_->is_block(id_); }
C4_ALWAYS_INLINE C4_PURE bool is_flow_sl() const { _C4RR(); return tree_->is_flow_sl(id_); }
C4_ALWAYS_INLINE C4_PURE bool is_flow_ml() const { _C4RR(); return tree_->is_flow_ml(id_); }
C4_ALWAYS_INLINE C4_PURE bool is_flow() const { _C4RR(); return tree_->is_flow(id_); }
C4_ALWAYS_INLINE C4_PURE bool is_key_styled() const { _C4RR(); return tree_->is_key_styled(id_); }
C4_ALWAYS_INLINE C4_PURE bool is_val_styled() const { _C4RR(); return tree_->is_val_styled(id_); }
C4_ALWAYS_INLINE C4_PURE bool is_key_literal() const { _C4RR(); return tree_->is_key_literal(id_); }
C4_ALWAYS_INLINE C4_PURE bool is_val_literal() const { _C4RR(); return tree_->is_val_literal(id_); }
C4_ALWAYS_INLINE C4_PURE bool is_key_folded() const { _C4RR(); return tree_->is_key_folded(id_); }
C4_ALWAYS_INLINE C4_PURE bool is_val_folded() const { _C4RR(); return tree_->is_val_folded(id_); }
C4_ALWAYS_INLINE C4_PURE bool is_key_squo() const { _C4RR(); return tree_->is_key_squo(id_); }
C4_ALWAYS_INLINE C4_PURE bool is_val_squo() const { _C4RR(); return tree_->is_val_squo(id_); }
C4_ALWAYS_INLINE C4_PURE bool is_key_dquo() const { _C4RR(); return tree_->is_key_dquo(id_); }
C4_ALWAYS_INLINE C4_PURE bool is_val_dquo() const { _C4RR(); return tree_->is_val_dquo(id_); }
C4_ALWAYS_INLINE C4_PURE bool is_key_plain() const { _C4RR(); return tree_->is_key_plain(id_); }
C4_ALWAYS_INLINE C4_PURE bool is_val_plain() const { _C4RR(); return tree_->is_val_plain(id_); }
C4_ALWAYS_INLINE C4_PURE bool is_key_quoted() const { _C4RR(); return tree_->is_key_quoted(id_); } /**< Forward to Tree::is_key_quoted(). Node must be readable. */
C4_ALWAYS_INLINE C4_PURE bool is_val_quoted() const { _C4RR(); return tree_->is_val_quoted(id_); } /**< Forward to Tree::is_val_quoted(). Node must be readable. */
C4_ALWAYS_INLINE C4_PURE bool is_quoted() const { _C4RR(); return tree_->is_quoted(id_); } /**< Forward to Tree::is_quoted(). Node must be readable. */
/** @} */
public:
@@ -264,17 +300,16 @@ public:
C4_ALWAYS_INLINE bool has_parent() const RYML_NOEXCEPT { _C4RR(); return tree_->has_parent(id_); } /**< Forward to Tree::has_parent() Node must be readable. */
C4_ALWAYS_INLINE bool has_child(ConstImpl const& n) const RYML_NOEXCEPT { _C4RR(); return n.readable() ? tree_->has_child(id_, n.m_id) : false; } /**< Node must be readable. */
C4_ALWAYS_INLINE bool has_child(size_t node) const RYML_NOEXCEPT { _C4RR(); return tree_->has_child(id_, node); } /**< Node must be readable. */
C4_ALWAYS_INLINE bool has_child(id_type node) const RYML_NOEXCEPT { _C4RR(); return tree_->has_child(id_, node); } /**< Node must be readable. */
C4_ALWAYS_INLINE bool has_child(csubstr name) const RYML_NOEXCEPT { _C4RR(); return tree_->has_child(id_, name); } /**< Node must be readable. */
C4_ALWAYS_INLINE bool has_children() const RYML_NOEXCEPT { _C4RR(); return tree_->has_children(id_); } /**< Node must be readable. */
C4_ALWAYS_INLINE bool has_sibling(ConstImpl const& n) const RYML_NOEXCEPT { _C4RR(); return n.readable() ? tree_->has_sibling(id_, n.m_id) : false; } /**< Node must be readable. */
C4_ALWAYS_INLINE bool has_sibling(size_t node) const RYML_NOEXCEPT { _C4RR(); return tree_->has_sibling(id_, node); } /**< Node must be readable. */
C4_ALWAYS_INLINE bool has_sibling(id_type node) const RYML_NOEXCEPT { _C4RR(); return tree_->has_sibling(id_, node); } /**< Node must be readable. */
C4_ALWAYS_INLINE bool has_sibling(csubstr name) const RYML_NOEXCEPT { _C4RR(); return tree_->has_sibling(id_, name); } /**< Node must be readable. */
/** does not count this node */
C4_ALWAYS_INLINE bool has_other_siblings() const RYML_NOEXCEPT { _C4RR(); return tree_->has_other_siblings(id_); }
/** counts this node */
RYML_DEPRECATED("use has_other_siblings()") bool has_siblings() const RYML_NOEXCEPT { _C4RR(); return tree_->has_siblings(id_); }
/** @} */
@@ -285,8 +320,9 @@ public:
/** @{ */
template<class U=Impl>
C4_ALWAYS_INLINE auto doc(size_t i) RYML_NOEXCEPT -> _C4_IF_MUTABLE(Impl) { RYML_ASSERT(tree_); return {tree__, tree__->doc(i)}; } /**< Forward to Tree::doc(). Node must be readable. */
C4_ALWAYS_INLINE ConstImpl doc(size_t i) const RYML_NOEXCEPT { RYML_ASSERT(tree_); return {tree_, tree_->doc(i)}; } /**< Forward to Tree::doc(). Node must be readable. */
C4_ALWAYS_INLINE auto doc(id_type i) RYML_NOEXCEPT -> _C4_IF_MUTABLE(Impl) { RYML_ASSERT(tree_); return {tree__, tree__->doc(i)}; } /**< Forward to Tree::doc(). Node must be readable. */
/** succeeds even when the node may have invalid or seed id */
C4_ALWAYS_INLINE ConstImpl doc(id_type i) const RYML_NOEXCEPT { RYML_ASSERT(tree_); return {tree_, tree_->doc(i)}; } /**< Forward to Tree::doc(). Node must be readable. */
template<class U=Impl>
C4_ALWAYS_INLINE auto parent() RYML_NOEXCEPT -> _C4_IF_MUTABLE(Impl) { _C4RR(); return {tree__, tree__->parent(id__)}; } /**< Forward to Tree::parent(). Node must be readable. */
@@ -301,8 +337,8 @@ public:
C4_ALWAYS_INLINE ConstImpl last_child () const RYML_NOEXCEPT { _C4RR(); return {tree_, tree_->last_child (id_)}; } /**< Forward to Tree::last_child(). Node must be readable. */
template<class U=Impl>
C4_ALWAYS_INLINE auto child(size_t pos) RYML_NOEXCEPT -> _C4_IF_MUTABLE(Impl) { _C4RR(); return {tree__, tree__->child(id__, pos)}; } /**< Forward to Tree::child(). Node must be readable. */
C4_ALWAYS_INLINE ConstImpl child(size_t pos) const RYML_NOEXCEPT { _C4RR(); return {tree_, tree_->child(id_, pos)}; } /**< Forward to Tree::child(). Node must be readable. */
C4_ALWAYS_INLINE auto child(id_type pos) RYML_NOEXCEPT -> _C4_IF_MUTABLE(Impl) { _C4RR(); return {tree__, tree__->child(id__, pos)}; } /**< Forward to Tree::child(). Node must be readable. */
C4_ALWAYS_INLINE ConstImpl child(id_type pos) const RYML_NOEXCEPT { _C4RR(); return {tree_, tree_->child(id_, pos)}; } /**< Forward to Tree::child(). Node must be readable. */
template<class U=Impl>
C4_ALWAYS_INLINE auto find_child(csubstr name) RYML_NOEXCEPT -> _C4_IF_MUTABLE(Impl) { _C4RR(); return {tree__, tree__->find_child(id__, name)}; } /**< Forward to Tree::first_child(). Node must be readable. */
@@ -325,26 +361,27 @@ public:
C4_ALWAYS_INLINE ConstImpl last_sibling () const RYML_NOEXCEPT { _C4RR(); return {tree_, tree_->last_sibling(id_)}; } /**< Forward to Tree::last_sibling(). Node must be readable. */
template<class U=Impl>
C4_ALWAYS_INLINE auto sibling(size_t pos) RYML_NOEXCEPT -> _C4_IF_MUTABLE(Impl) { _C4RR(); return {tree__, tree__->sibling(id__, pos)}; } /**< Forward to Tree::sibling(). Node must be readable. */
C4_ALWAYS_INLINE ConstImpl sibling(size_t pos) const RYML_NOEXCEPT { _C4RR(); return {tree_, tree_->sibling(id_, pos)}; } /**< Forward to Tree::sibling(). Node must be readable. */
C4_ALWAYS_INLINE auto sibling(id_type pos) RYML_NOEXCEPT -> _C4_IF_MUTABLE(Impl) { _C4RR(); return {tree__, tree__->sibling(id__, pos)}; } /**< Forward to Tree::sibling(). Node must be readable. */
C4_ALWAYS_INLINE ConstImpl sibling(id_type pos) const RYML_NOEXCEPT { _C4RR(); return {tree_, tree_->sibling(id_, pos)}; } /**< Forward to Tree::sibling(). Node must be readable. */
template<class U=Impl>
C4_ALWAYS_INLINE auto find_sibling(csubstr name) RYML_NOEXCEPT -> _C4_IF_MUTABLE(Impl) { _C4RR(); return {tree__, tree__->find_sibling(id__, name)}; } /**< Forward to Tree::find_sibling(). Node must be readable. */
C4_ALWAYS_INLINE ConstImpl find_sibling(csubstr name) const RYML_NOEXCEPT { _C4RR(); return {tree_, tree_->find_sibling(id_, name)}; } /**< Forward to Tree::find_sibling(). Node must be readable. */
/** O(num_children). Forward to Tree::num_children(). */
C4_ALWAYS_INLINE size_t num_children() const RYML_NOEXCEPT { _C4RR(); return tree_->num_children(id_); }
C4_ALWAYS_INLINE id_type num_children() const RYML_NOEXCEPT { _C4RR(); return tree_->num_children(id_); }
C4_ALWAYS_INLINE size_t num_siblings() const RYML_NOEXCEPT { _C4RR(); return tree_->num_siblings(id_); }
/** O(num_children). Forward to Tree::num_siblings(). */
C4_ALWAYS_INLINE id_type num_siblings() const RYML_NOEXCEPT { _C4RR(); return tree_->num_siblings(id_); }
/** O(num_siblings). Return the number of siblings except this. */
C4_ALWAYS_INLINE size_t num_other_siblings() const RYML_NOEXCEPT { _C4RR(); return tree_->num_other_siblings(id_); }
/** O(num_siblings). Forward to Tree::num_other_siblings(). */
C4_ALWAYS_INLINE id_type num_other_siblings() const RYML_NOEXCEPT { _C4RR(); return tree_->num_other_siblings(id_); }
/** O(num_children). Return the position of a child within this node, using Tree::child_pos(). */
C4_ALWAYS_INLINE size_t child_pos(ConstImpl const& n) const RYML_NOEXCEPT { _C4RR(); _RYML_CB_ASSERT(tree_->m_callbacks, n.readable()); return tree_->child_pos(id_, n.m_id); }
/** O(num_children). Forward to Tree::child_pos(). */
C4_ALWAYS_INLINE id_type child_pos(ConstImpl const& n) const RYML_NOEXCEPT { _C4RR(); _RYML_CB_ASSERT(tree_->m_callbacks, n.readable()); return tree_->child_pos(id_, n.m_id); }
/** O(num_siblings) */
C4_ALWAYS_INLINE size_t sibling_pos(ConstImpl const& n) const RYML_NOEXCEPT { _C4RR(); _RYML_CB_ASSERT(tree_->callbacks(), n.readable()); return tree_->child_pos(tree_->parent(id_), n.m_id); }
/** O(num_siblings). Forward to Tree::sibling_pos(). */
C4_ALWAYS_INLINE id_type sibling_pos(ConstImpl const& n) const RYML_NOEXCEPT { _C4RR(); _RYML_CB_ASSERT(tree_->callbacks(), n.readable()); return tree_->child_pos(tree_->parent(id_), n.m_id); }
/** @} */
@@ -376,7 +413,7 @@ public:
C4_ALWAYS_INLINE auto operator[] (csubstr key) RYML_NOEXCEPT -> _C4_IF_MUTABLE(Impl)
{
_C4RR();
size_t ch = tree__->find_child(id__, key);
id_type ch = tree__->find_child(id__, key);
return ch != NONE ? Impl(tree__, ch) : Impl(tree__, id__, key);
}
@@ -399,10 +436,10 @@ public:
*
* @see https://github.com/biojppm/rapidyaml/issues/389 */
template<class U=Impl>
C4_ALWAYS_INLINE auto operator[] (size_t pos) RYML_NOEXCEPT -> _C4_IF_MUTABLE(Impl)
C4_ALWAYS_INLINE auto operator[] (id_type pos) RYML_NOEXCEPT -> _C4_IF_MUTABLE(Impl)
{
_C4RR();
size_t ch = tree__->child(id__, pos);
id_type ch = tree__->child(id__, pos);
return ch != NONE ? Impl(tree__, ch) : Impl(tree__, id__, pos);
}
@@ -418,7 +455,7 @@ public:
C4_ALWAYS_INLINE ConstImpl operator[] (csubstr key) const RYML_NOEXCEPT
{
_C4RR();
size_t ch = tree_->find_child(id_, key);
id_type ch = tree_->find_child(id_, key);
_RYML_CB_ASSERT(tree_->m_callbacks, ch != NONE);
return {tree_, ch};
}
@@ -432,10 +469,10 @@ public:
* it is UB to use the return value if it is not valid.
*
* @see https://github.com/biojppm/rapidyaml/issues/389 */
C4_ALWAYS_INLINE ConstImpl operator[] (size_t pos) const RYML_NOEXCEPT
C4_ALWAYS_INLINE ConstImpl operator[] (id_type pos) const RYML_NOEXCEPT
{
_C4RR();
size_t ch = tree_->child(id_, pos);
id_type ch = tree_->child(id_, pos);
_RYML_CB_ASSERT(tree_->m_callbacks, ch != NONE);
return {tree_, ch};
}
@@ -486,7 +523,7 @@ public:
_RYML_CB_CHECK(tree_->m_callbacks, (id_ >= 0 && id_ < tree_->capacity()));
_RYML_CB_CHECK(tree_->m_callbacks, ((Impl const*)this)->readable());
_RYML_CB_CHECK(tree_->m_callbacks, tree_->is_map(id_));
size_t ch = tree__->find_child(id__, key);
id_type ch = tree__->find_child(id__, key);
return ch != NONE ? Impl(tree__, ch) : Impl(tree__, id__, key);
}
@@ -508,7 +545,7 @@ public:
* valid (points at a tree and a node), b) the calling object must
* be readable (must not be in seed state), c) the calling object
* must be pointing at a MAP node. The preconditions are similar
* to the non-const operator[](size_t), but instead of using
* to the non-const operator[](id_type), but instead of using
* assertions, this function directly checks those conditions and
* calls the error callback if any of the checks fail.
*
@@ -516,15 +553,15 @@ public:
* seed state, the error callback is not invoked when this
* happens. */
template<class U=Impl>
C4_ALWAYS_INLINE auto at(size_t pos) -> _C4_IF_MUTABLE(Impl)
C4_ALWAYS_INLINE auto at(id_type pos) -> _C4_IF_MUTABLE(Impl)
{
RYML_CHECK(tree_ != nullptr);
const size_t cap = tree_->capacity();
const id_type cap = tree_->capacity();
_RYML_CB_CHECK(tree_->m_callbacks, (id_ >= 0 && id_ < cap));
_RYML_CB_CHECK(tree_->m_callbacks, (pos >= 0 && pos < cap));
_RYML_CB_CHECK(tree_->m_callbacks, ((Impl const*)this)->readable());
_RYML_CB_CHECK(tree_->m_callbacks, tree_->is_container(id_));
size_t ch = tree__->child(id__, pos);
id_type ch = tree__->child(id__, pos);
return ch != NONE ? Impl(tree__, ch) : Impl(tree__, id__, pos);
}
@@ -543,7 +580,7 @@ public:
_RYML_CB_CHECK(tree_->m_callbacks, (id_ >= 0 && id_ < tree_->capacity()));
_RYML_CB_CHECK(tree_->m_callbacks, ((Impl const*)this)->readable());
_RYML_CB_CHECK(tree_->m_callbacks, tree_->is_map(id_));
size_t ch = tree_->find_child(id_, key);
id_type ch = tree_->find_child(id_, key);
_RYML_CB_CHECK(tree_->m_callbacks, ch != NONE);
return {tree_, ch};
}
@@ -551,21 +588,21 @@ public:
/** Get a child by position, with error checking; complexity is
* O(pos).
*
* Behaves as operator[](size_t) const, but always raises an error
* Behaves as operator[](id_type) const, but always raises an error
* (even when RYML_USE_ASSERT is set to false) when the returned
* node does not exist, or when this node is not readable, or when
* it is not a container. This behaviour is similar to
* std::vector::at(), but the error consists in calling the error
* callback instead of directly raising an exception. */
ConstImpl at(size_t pos) const
ConstImpl at(id_type pos) const
{
RYML_CHECK(tree_ != nullptr);
const size_t cap = tree_->capacity();
const id_type cap = tree_->capacity();
_RYML_CB_CHECK(tree_->m_callbacks, (id_ >= 0 && id_ < cap));
_RYML_CB_CHECK(tree_->m_callbacks, (pos >= 0 && pos < cap));
_RYML_CB_CHECK(tree_->m_callbacks, ((Impl const*)this)->readable());
_RYML_CB_CHECK(tree_->m_callbacks, tree_->is_container(id_));
size_t ch = tree_->child(id_, pos);
id_type ch = tree_->child(id_, pos);
_RYML_CB_CHECK(tree_->m_callbacks, ch != NONE);
return {tree_, ch};
}
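// Editorial sketch (not part of the diff): the practical difference between
// the two child accessors documented above, assuming a `Tree tree` that was
// already parsed. operator[] only asserts (when assertions are enabled) on a
// missing child, whereas at() always invokes the error callback:
//
//     ConstNodeRef root = tree.crootref();
//     ConstNodeRef a = root["key"];    // asserts if "key" is absent
//     ConstNodeRef b = root.at("key"); // calls the error callback if "key" is absent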
@@ -721,14 +758,14 @@ public:
/** visit every child node calling fn(node) */
template<class Visitor>
bool visit(Visitor fn, size_t indentation_level=0, bool skip_root=true) const RYML_NOEXCEPT
bool visit(Visitor fn, id_type indentation_level=0, bool skip_root=true) const RYML_NOEXCEPT
{
_C4RR();
return detail::_visit(*(ConstImpl const*)this, fn, indentation_level, skip_root);
}
/** visit every child node calling fn(node) */
template<class Visitor, class U=Impl>
auto visit(Visitor fn, size_t indentation_level=0, bool skip_root=true) RYML_NOEXCEPT
auto visit(Visitor fn, id_type indentation_level=0, bool skip_root=true) RYML_NOEXCEPT
-> _C4_IF_MUTABLE(bool)
{
_C4RR();
@@ -737,14 +774,14 @@ public:
/** visit every child node calling fn(node, level) */
template<class Visitor>
bool visit_stacked(Visitor fn, size_t indentation_level=0, bool skip_root=true) const RYML_NOEXCEPT
bool visit_stacked(Visitor fn, id_type indentation_level=0, bool skip_root=true) const RYML_NOEXCEPT
{
_C4RR();
return detail::_visit_stacked(*(ConstImpl const*)this, fn, indentation_level, skip_root);
}
/** visit every child node calling fn(node, level) */
template<class Visitor, class U=Impl>
auto visit_stacked(Visitor fn, size_t indentation_level=0, bool skip_root=true) RYML_NOEXCEPT
auto visit_stacked(Visitor fn, id_type indentation_level=0, bool skip_root=true) RYML_NOEXCEPT
-> _C4_IF_MUTABLE(bool)
{
_C4RR();
@@ -768,6 +805,7 @@ public:
C4_SUPPRESS_WARNING_GCC_CLANG_POP
};
} // detail
//-----------------------------------------------------------------------------
@@ -787,7 +825,7 @@ public:
public:
Tree const* C4_RESTRICT m_tree;
size_t m_id;
id_type m_id;
friend NodeRef;
friend struct detail::RoNodeMethods<ConstNodeRef, ConstNodeRef>;
@@ -800,7 +838,7 @@ public:
ConstNodeRef() : m_tree(nullptr), m_id(NONE) {}
ConstNodeRef(Tree const &t) : m_tree(&t), m_id(t .root_id()) {}
ConstNodeRef(Tree const *t) : m_tree(t ), m_id(t->root_id()) {}
ConstNodeRef(Tree const *t, size_t id) : m_tree(t), m_id(id) {}
ConstNodeRef(Tree const *t, id_type id) : m_tree(t), m_id(id) {}
ConstNodeRef(std::nullptr_t) : m_tree(nullptr), m_id(NONE) {}
ConstNodeRef(ConstNodeRef const&) = default;
@@ -852,7 +890,7 @@ public:
/** @{ */
C4_ALWAYS_INLINE Tree const* tree() const noexcept { return m_tree; }
C4_ALWAYS_INLINE size_t id() const noexcept { return m_id; }
C4_ALWAYS_INLINE id_type id() const noexcept { return m_id; }
/** @} */
@@ -927,7 +965,7 @@ public:
private:
Tree *C4_RESTRICT m_tree;
size_t m_id;
id_type m_id;
/** This member is used to enable lazy operator[] writing. When a child
* with a key or index is not found, m_id is set to the id of the parent
@@ -945,7 +983,7 @@ private:
friend struct detail::RoNodeMethods<NodeRef, ConstNodeRef>;
// require valid: a helper macro, undefined at the end
#define _C4RV() \
#define _C4RR() \
RYML_ASSERT(m_tree != nullptr); \
_RYML_CB_ASSERT(m_tree->m_callbacks, m_id != NONE && !is_seed())
// require id: a helper macro, undefined at the end
@@ -961,9 +999,9 @@ public:
NodeRef() : m_tree(nullptr), m_id(NONE), m_seed() { _clear_seed(); }
NodeRef(Tree &t) : m_tree(&t), m_id(t .root_id()), m_seed() { _clear_seed(); }
NodeRef(Tree *t) : m_tree(t ), m_id(t->root_id()), m_seed() { _clear_seed(); }
NodeRef(Tree *t, size_t id) : m_tree(t), m_id(id), m_seed() { _clear_seed(); }
NodeRef(Tree *t, size_t id, size_t seed_pos) : m_tree(t), m_id(id), m_seed() { m_seed.str = nullptr; m_seed.len = seed_pos; }
NodeRef(Tree *t, size_t id, csubstr seed_key) : m_tree(t), m_id(id), m_seed(seed_key) {}
NodeRef(Tree *t, id_type id) : m_tree(t), m_id(id), m_seed() { _clear_seed(); }
NodeRef(Tree *t, id_type id, id_type seed_pos) : m_tree(t), m_id(id), m_seed() { m_seed.str = nullptr; m_seed.len = (size_t)seed_pos; }
NodeRef(Tree *t, id_type id, csubstr seed_key) : m_tree(t), m_id(id), m_seed(seed_key) {}
NodeRef(std::nullptr_t) : m_tree(nullptr), m_id(NONE), m_seed() {}
inline void _clear_seed() { /*do the following manually or an assert is triggered: */ m_seed.str = nullptr; m_seed.len = NONE; }
@@ -1031,12 +1069,10 @@ public:
RYML_DEPRECATED("use !readable()") bool operator== (std::nullptr_t) const { return m_tree == nullptr || m_id == NONE || is_seed(); }
RYML_DEPRECATED("use readable()") bool operator!= (std::nullptr_t) const { return !(m_tree == nullptr || m_id == NONE || is_seed()); }
RYML_DEPRECATED("use `this->val() == s`") bool operator== (csubstr s) const { _C4RV(); _RYML_CB_ASSERT(m_tree->m_callbacks, has_val()); return m_tree->val(m_id) == s; }
RYML_DEPRECATED("use `this->val() != s`") bool operator!= (csubstr s) const { _C4RV(); _RYML_CB_ASSERT(m_tree->m_callbacks, has_val()); return m_tree->val(m_id) != s; }
RYML_DEPRECATED("use `this->val() == s`") bool operator== (csubstr s) const { _C4RR(); _RYML_CB_ASSERT(m_tree->m_callbacks, has_val()); return m_tree->val(m_id) == s; }
RYML_DEPRECATED("use `this->val() != s`") bool operator!= (csubstr s) const { _C4RR(); _RYML_CB_ASSERT(m_tree->m_callbacks, has_val()); return m_tree->val(m_id) != s; }
/** @endcond */
/** @} */
public:
/** @name node_property_getters
@@ -1045,7 +1081,7 @@ public:
C4_ALWAYS_INLINE C4_PURE Tree * tree() noexcept { return m_tree; }
C4_ALWAYS_INLINE C4_PURE Tree const* tree() const noexcept { return m_tree; }
C4_ALWAYS_INLINE C4_PURE size_t id() const noexcept { return m_id; }
C4_ALWAYS_INLINE C4_PURE id_type id() const noexcept { return m_id; }
/** @} */
@@ -1056,7 +1092,7 @@ public:
void create() { _apply_seed(); }
void change_type(NodeType t) { _C4RV(); m_tree->change_type(m_id, t); }
void change_type(NodeType t) { _C4RR(); m_tree->change_type(m_id, t); }
void set_type(NodeType t) { _apply_seed(); m_tree->_set_flags(m_id, t); }
void set_key(csubstr key) { _apply_seed(); m_tree->_set_key(m_id, key); }
@@ -1068,37 +1104,11 @@ public:
void set_key_ref(csubstr key_ref) { _apply_seed(); m_tree->set_key_ref(m_id, key_ref); }
void set_val_ref(csubstr val_ref) { _apply_seed(); m_tree->set_val_ref(m_id, val_ref); }
template<class T>
size_t set_key_serialized(T const& C4_RESTRICT k)
{
_apply_seed();
csubstr s = m_tree->to_arena(k);
m_tree->_set_key(m_id, s);
return s.len;
}
template<class T>
size_t set_val_serialized(T const& C4_RESTRICT v)
{
_apply_seed();
csubstr s = m_tree->to_arena(v);
m_tree->_set_val(m_id, s);
return s.len;
}
size_t set_val_serialized(std::nullptr_t)
{
_apply_seed();
m_tree->_set_val(m_id, csubstr{});
return 0;
}
void set_container_style(NodeType_e style) { _C4RR(); m_tree->set_container_style(m_id, style); }
void set_key_style(NodeType_e style) { _C4RR(); m_tree->set_key_style(m_id, style); }
void set_val_style(NodeType_e style) { _C4RR(); m_tree->set_val_style(m_id, style); }
/** encode a blob as base64 into the tree's arena, then assign the
* result to the node's key @return the size of base64-encoded
* blob */
size_t set_key_serialized(fmt::const_base64_wrapper w);
/** encode a blob as base64 into the tree's arena, then assign the
* result to the node's val @return the size of base64-encoded
* blob */
size_t set_val_serialized(fmt::const_base64_wrapper w);
public:
inline void clear()
{
@@ -1189,6 +1199,45 @@ public:
return m_tree->to_arena(s);
}
template<class T>
size_t set_key_serialized(T const& C4_RESTRICT k)
{
_apply_seed();
csubstr s = m_tree->to_arena(k);
m_tree->_set_key(m_id, s);
return s.len;
}
size_t set_key_serialized(std::nullptr_t)
{
_apply_seed();
m_tree->_set_key(m_id, csubstr{});
return 0;
}
template<class T>
size_t set_val_serialized(T const& C4_RESTRICT v)
{
_apply_seed();
csubstr s = m_tree->to_arena(v);
m_tree->_set_val(m_id, s);
return s.len;
}
size_t set_val_serialized(std::nullptr_t)
{
_apply_seed();
m_tree->_set_val(m_id, csubstr{});
return 0;
}
/** encode a blob as base64 into the tree's arena, then assign the
* result to the node's key
* @return the size of base64-encoded blob */
size_t set_key_serialized(fmt::const_base64_wrapper w);
/** encode a blob as base64 into the tree's arena, then assign the
* result to the node's val
* @return the size of base64-encoded blob */
size_t set_val_serialized(fmt::const_base64_wrapper w);
/** serialize a variable, then assign the result to the node's val */
inline NodeRef& operator<< (csubstr s)
{
@@ -1250,14 +1299,14 @@ private:
m_id = m_tree->append_child(m_id);
m_tree->_set_key(m_id, m_seed);
m_seed.str = nullptr;
m_seed.len = NONE;
m_seed.len = (size_t)NONE;
}
else if(m_seed.len != NONE) // we have a seed index: create a child at that position
else if(m_seed.len != (size_t)NONE) // we have a seed index: create a child at that position
{
_RYML_CB_ASSERT(m_tree->m_callbacks, m_tree->num_children(m_id) == m_seed.len);
_RYML_CB_ASSERT(m_tree->m_callbacks, (size_t)m_tree->num_children(m_id) == m_seed.len);
m_id = m_tree->append_child(m_id);
m_seed.str = nullptr;
m_seed.len = NONE;
m_seed.len = (size_t)NONE;
}
else
{
@@ -1287,7 +1336,7 @@ public:
inline NodeRef insert_child(NodeRef after)
{
_C4RV();
_C4RR();
_RYML_CB_ASSERT(m_tree->m_callbacks, after.m_tree == m_tree);
NodeRef r(m_tree, m_tree->insert_child(m_id, after.m_id));
return r;
@@ -1295,7 +1344,7 @@ public:
inline NodeRef insert_child(NodeInit const& i, NodeRef after)
{
_C4RV();
_C4RR();
_RYML_CB_ASSERT(m_tree->m_callbacks, after.m_tree == m_tree);
NodeRef r(m_tree, m_tree->insert_child(m_id, after.m_id));
r._apply(i);
@@ -1304,14 +1353,14 @@ public:
inline NodeRef prepend_child()
{
_C4RV();
_C4RR();
NodeRef r(m_tree, m_tree->insert_child(m_id, NONE));
return r;
}
inline NodeRef prepend_child(NodeInit const& i)
{
_C4RV();
_C4RR();
NodeRef r(m_tree, m_tree->insert_child(m_id, NONE));
r._apply(i);
return r;
@@ -1319,14 +1368,14 @@ public:
inline NodeRef append_child()
{
_C4RV();
_C4RR();
NodeRef r(m_tree, m_tree->append_child(m_id));
return r;
}
inline NodeRef append_child(NodeInit const& i)
{
_C4RV();
_C4RR();
NodeRef r(m_tree, m_tree->append_child(m_id));
r._apply(i);
return r;
@@ -1334,7 +1383,7 @@ public:
inline NodeRef insert_sibling(ConstNodeRef const& after)
{
_C4RV();
_C4RR();
_RYML_CB_ASSERT(m_tree->m_callbacks, after.m_tree == m_tree);
NodeRef r(m_tree, m_tree->insert_sibling(m_id, after.m_id));
return r;
@@ -1342,7 +1391,7 @@ public:
inline NodeRef insert_sibling(NodeInit const& i, ConstNodeRef const& after)
{
_C4RV();
_C4RR();
_RYML_CB_ASSERT(m_tree->m_callbacks, after.m_tree == m_tree);
NodeRef r(m_tree, m_tree->insert_sibling(m_id, after.m_id));
r._apply(i);
@@ -1351,14 +1400,14 @@ public:
inline NodeRef prepend_sibling()
{
_C4RV();
_C4RR();
NodeRef r(m_tree, m_tree->prepend_sibling(m_id));
return r;
}
inline NodeRef prepend_sibling(NodeInit const& i)
{
_C4RV();
_C4RR();
NodeRef r(m_tree, m_tree->prepend_sibling(m_id));
r._apply(i);
return r;
@@ -1366,14 +1415,14 @@ public:
inline NodeRef append_sibling()
{
_C4RV();
_C4RR();
NodeRef r(m_tree, m_tree->append_sibling(m_id));
return r;
}
inline NodeRef append_sibling(NodeInit const& i)
{
_C4RV();
_C4RR();
NodeRef r(m_tree, m_tree->append_sibling(m_id));
r._apply(i);
return r;
@@ -1383,7 +1432,7 @@ public:
inline void remove_child(NodeRef & child)
{
_C4RV();
_C4RR();
_RYML_CB_ASSERT(m_tree->m_callbacks, has_child(child));
_RYML_CB_ASSERT(m_tree->m_callbacks, child.parent().id() == id());
m_tree->remove(child.id());
@@ -1391,11 +1440,11 @@ public:
}
//! remove the nth child of this node
inline void remove_child(size_t pos)
inline void remove_child(id_type pos)
{
_C4RV();
_C4RR();
_RYML_CB_ASSERT(m_tree->m_callbacks, pos >= 0 && pos < num_children());
size_t child = m_tree->child(m_id, pos);
id_type child = m_tree->child(m_id, pos);
_RYML_CB_ASSERT(m_tree->m_callbacks, child != NONE);
m_tree->remove(child);
}
@@ -1403,8 +1452,8 @@ public:
//! remove a child by name
inline void remove_child(csubstr key)
{
_C4RV();
size_t child = m_tree->find_child(m_id, key);
_C4RR();
id_type child = m_tree->find_child(m_id, key);
_RYML_CB_ASSERT(m_tree->m_callbacks, child != NONE);
m_tree->remove(child);
}
@@ -1417,7 +1466,7 @@ public:
* `n.move({})`. */
inline void move(ConstNodeRef const& after)
{
_C4RV();
_C4RR();
m_tree->move(m_id, after.m_id);
}
@@ -1427,7 +1476,7 @@ public:
* pointer is reset to the tree of the parent node. */
inline void move(NodeRef const& parent, ConstNodeRef const& after)
{
_C4RV();
_C4RR();
if(parent.m_tree == m_tree)
{
m_tree->move(m_id, parent.m_id, after.m_id);
@@ -1445,9 +1494,9 @@ public:
* default-constructed reference like this: `n.move({})`. */
inline NodeRef duplicate(ConstNodeRef const& after) const
{
_C4RV();
_C4RR();
_RYML_CB_ASSERT(m_tree->m_callbacks, m_tree == after.m_tree || after.m_id == NONE);
size_t dup = m_tree->duplicate(m_id, m_tree->parent(m_id), after.m_id);
id_type dup = m_tree->duplicate(m_id, m_tree->parent(m_id), after.m_id);
NodeRef r(m_tree, dup);
return r;
}
@@ -1459,17 +1508,17 @@ public:
* this: `n.move({})`. */
inline NodeRef duplicate(NodeRef const& parent, ConstNodeRef const& after) const
{
_C4RV();
_C4RR();
_RYML_CB_ASSERT(m_tree->m_callbacks, parent.m_tree == after.m_tree || after.m_id == NONE);
if(parent.m_tree == m_tree)
{
size_t dup = m_tree->duplicate(m_id, parent.m_id, after.m_id);
id_type dup = m_tree->duplicate(m_id, parent.m_id, after.m_id);
NodeRef r(m_tree, dup);
return r;
}
else
{
size_t dup = parent.m_tree->duplicate(m_tree, m_id, parent.m_id, after.m_id);
id_type dup = parent.m_tree->duplicate(m_tree, m_id, parent.m_id, after.m_id);
NodeRef r(parent.m_tree, dup);
return r;
}
@@ -1477,7 +1526,7 @@ public:
inline void duplicate_children(NodeRef const& parent, ConstNodeRef const& after) const
{
_C4RV();
_C4RR();
_RYML_CB_ASSERT(m_tree->m_callbacks, parent.m_tree == after.m_tree);
if(parent.m_tree == m_tree)
{
@@ -1491,7 +1540,7 @@ public:
/** @} */
#undef _C4RV
#undef _C4RR
#undef _C4RID
};
@@ -1500,13 +1549,13 @@ public:
inline ConstNodeRef::ConstNodeRef(NodeRef const& that)
: m_tree(that.m_tree)
, m_id(!that.is_seed() ? that.id() : NONE)
, m_id(!that.is_seed() ? that.id() : (id_type)NONE)
{
}
inline ConstNodeRef::ConstNodeRef(NodeRef && that)
: m_tree(that.m_tree)
, m_id(!that.is_seed() ? that.id() : NONE)
, m_id(!that.is_seed() ? that.id() : (id_type)NONE)
{
}
@@ -1514,14 +1563,14 @@ inline ConstNodeRef::ConstNodeRef(NodeRef && that)
inline ConstNodeRef& ConstNodeRef::operator= (NodeRef const& that)
{
m_tree = (that.m_tree);
m_id = (!that.is_seed() ? that.id() : NONE);
m_id = (!that.is_seed() ? that.id() : (id_type)NONE);
return *this;
}
inline ConstNodeRef& ConstNodeRef::operator= (NodeRef && that)
{
m_tree = (that.m_tree);
m_id = (!that.is_seed() ? that.id() : NONE);
m_id = (!that.is_seed() ? that.id() : (id_type)NONE);
return *this;
}

src/c4/yml/node_type.cpp (new file)

@@ -0,0 +1,209 @@
#include "c4/yml/node_type.hpp"
namespace c4 {
namespace yml {
const char* NodeType::type_str(NodeType_e ty) noexcept
{
switch(ty & _TYMASK)
{
case KEYVAL:
return "KEYVAL";
case KEY:
return "KEY";
case VAL:
return "VAL";
case MAP:
return "MAP";
case SEQ:
return "SEQ";
case KEYMAP:
return "KEYMAP";
case KEYSEQ:
return "KEYSEQ";
case DOCSEQ:
return "DOCSEQ";
case DOCMAP:
return "DOCMAP";
case DOCVAL:
return "DOCVAL";
case DOC:
return "DOC";
case STREAM:
return "STREAM";
case NOTYPE:
return "NOTYPE";
default:
if((ty & KEYVAL) == KEYVAL)
return "KEYVAL***";
if((ty & KEYMAP) == KEYMAP)
return "KEYMAP***";
if((ty & KEYSEQ) == KEYSEQ)
return "KEYSEQ***";
if((ty & DOCSEQ) == DOCSEQ)
return "DOCSEQ***";
if((ty & DOCMAP) == DOCMAP)
return "DOCMAP***";
if((ty & DOCVAL) == DOCVAL)
return "DOCVAL***";
if(ty & KEY)
return "KEY***";
if(ty & VAL)
return "VAL***";
if(ty & MAP)
return "MAP***";
if(ty & SEQ)
return "SEQ***";
if(ty & DOC)
return "DOC***";
return "(unk)";
}
}
csubstr NodeType::type_str(substr buf, NodeType_e flags) noexcept
{
size_t pos = 0;
bool gotone = false;
#define _prflag(fl, txt) \
do { \
if((flags & fl) == (fl)) \
{ \
if(gotone) \
{ \
if(pos + 1 < buf.len) \
buf[pos] = '|'; \
++pos; \
} \
csubstr fltxt = txt; \
if(pos + fltxt.len <= buf.len) \
memcpy(buf.str + pos, fltxt.str, fltxt.len); \
pos += fltxt.len; \
gotone = true; \
flags = (flags & ~fl); /*remove the flag*/ \
} \
} while(0)
_prflag(STREAM, "STREAM");
_prflag(DOC, "DOC");
// key properties
_prflag(KEY, "KEY");
_prflag(KEYTAG, "KTAG");
_prflag(KEYANCH, "KANCH");
_prflag(KEYREF, "KREF");
_prflag(KEY_LITERAL, "KLITERAL");
_prflag(KEY_FOLDED, "KFOLDED");
_prflag(KEY_SQUO, "KSQUO");
_prflag(KEY_DQUO, "KDQUO");
_prflag(KEY_PLAIN, "KPLAIN");
_prflag(KEY_UNFILT, "KUNFILT");
// val properties
_prflag(VAL, "VAL");
_prflag(VALTAG, "VTAG");
_prflag(VALANCH, "VANCH");
_prflag(VALREF, "VREF");
_prflag(VAL_LITERAL, "VLITERAL");
_prflag(VAL_FOLDED, "VFOLDED");
_prflag(VAL_SQUO, "VSQUO");
_prflag(VAL_DQUO, "VDQUO");
_prflag(VAL_PLAIN, "VPLAIN");
_prflag(VAL_UNFILT, "VUNFILT");
// container properties
_prflag(MAP, "MAP");
_prflag(SEQ, "SEQ");
_prflag(FLOW_SL, "FLOWSL");
_prflag(FLOW_ML, "FLOWML");
_prflag(BLOCK, "BLCK");
if(pos == 0)
_prflag(NOTYPE, "NOTYPE");
#undef _prflag
if(pos < buf.len)
{
buf[pos] = '\0';
return buf.first(pos);
}
else
{
csubstr failed;
failed.len = pos + 1;
failed.str = nullptr;
return failed;
}
}
//-----------------------------------------------------------------------------
// see https://www.yaml.info/learn/quote.html#noplain
bool scalar_style_query_squo(csubstr s) noexcept
{
return ! s.first_of_any("\n ", "\n\t");
}
// see https://www.yaml.info/learn/quote.html#noplain
bool scalar_style_query_plain(csubstr s) noexcept
{
if(s.begins_with("-."))
{
if(s == "-.inf" || s == "-.INF")
return true;
else if(s.sub(2).is_number())
return true;
}
return s != ':'
&& ( ! s.begins_with_any("-:?*&,'\"{}[]|>%#@`\r")) // @ and ` are reserved characters
&& ( ! s.ends_with_any(":#"))
// make this check in the last place, as it has linear
// complexity, while the previous ones are
// constant-time
&& (s.first_of("\n#:[]{},") == npos);
}
NodeType_e scalar_style_choose(csubstr s) noexcept
{
if(s.len)
{
if(s.begins_with_any(" \n\t")
||
s.ends_with_any(" \n\t"))
{
return SCALAR_DQUO;
}
else if( ! scalar_style_query_plain(s))
{
return scalar_style_query_squo(s) ? SCALAR_SQUO : SCALAR_DQUO;
}
// nothing remarkable - use plain
return SCALAR_PLAIN;
}
return s.str ? SCALAR_SQUO : SCALAR_PLAIN;
}
NodeType_e scalar_style_json_choose(csubstr s) noexcept
{
// do not quote special cases
bool plain = (
(s == "true" || s == "false" || s == "null")
||
(
// do not quote numbers
s.is_number()
&&
(
// quote integral numbers if they have a leading 0
// https://github.com/biojppm/rapidyaml/issues/291
(!(s.len > 1 && s.begins_with('0')))
// do not quote reals with leading 0
// https://github.com/biojppm/rapidyaml/issues/313
|| (s.find('.') != csubstr::npos)
)
)
);
return plain ? SCALAR_PLAIN : SCALAR_DQUO;
}
} // namespace yml
} // namespace c4

src/c4/yml/node_type.hpp (new file)

@@ -0,0 +1,271 @@
#ifndef C4_YML_NODE_TYPE_HPP_
#define C4_YML_NODE_TYPE_HPP_
#ifndef _C4_YML_COMMON_HPP_
#include "c4/yml/common.hpp"
#endif
C4_SUPPRESS_WARNING_MSVC_PUSH
C4_SUPPRESS_WARNING_GCC_CLANG_PUSH
C4_SUPPRESS_WARNING_GCC_CLANG("-Wold-style-cast")
namespace c4 {
namespace yml {
/** @addtogroup doc_node_type
*
* @{
*/
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
/** the integral type necessary to cover all the bits for NodeType_e */
using type_bits = uint32_t;
/** a bit mask for marking node types and styles */
typedef enum : type_bits {
#define __(v) (type_bits(1) << v) // a convenience define, undefined below
NOTYPE = 0, ///< no node type or style is set
KEY = __(0), ///< is member of a map, must have non-empty key
VAL = __(1), ///< a scalar: has a scalar (ie string) value, possibly empty. must be a leaf node, and cannot be MAP or SEQ
MAP = __(2), ///< a map: a parent of KEYVAL/KEYSEQ/KEYMAP nodes
SEQ = __(3), ///< a seq: a parent of VAL/SEQ/MAP nodes
DOC = __(4), ///< a document
STREAM = __(5)|SEQ, ///< a stream: a seq of docs
KEYREF = __(6), ///< a *reference: the key references an &anchor
VALREF = __(7), ///< a *reference: the val references an &anchor
KEYANCH = __(8), ///< the key has an &anchor
VALANCH = __(9), ///< the val has an &anchor
KEYTAG = __(10), ///< the key has a tag
VALTAG = __(11), ///< the val has a tag
_TYMASK = __(12)-1, ///< all the bits up to here
//
// unfiltered flags:
//
KEY_UNFILT = __(12), ///< the key scalar was left unfiltered; the parser was set not to filter. @see ParserOptions
VAL_UNFILT = __(13), ///< the val scalar was left unfiltered; the parser was set not to filter. @see ParserOptions
//
// style flags:
//
FLOW_SL = __(14), ///< mark container with single-line flow style (seqs as '[val1,val2]', maps as '{key: val,key2: val2}')
FLOW_ML = __(15), ///< (NOT IMPLEMENTED, work in progress) mark container with multi-line flow style (seqs as '[\n val1,\n val2\n]', maps as '{\n key: val,\n key2: val2\n}')
BLOCK = __(16), ///< mark container with block style (seqs as '- val\n', maps as 'key: val')
KEY_LITERAL = __(17), ///< mark key scalar as multiline, block literal |
VAL_LITERAL = __(18), ///< mark val scalar as multiline, block literal |
KEY_FOLDED = __(19), ///< mark key scalar as multiline, block folded >
VAL_FOLDED = __(20), ///< mark val scalar as multiline, block folded >
KEY_SQUO = __(21), ///< mark key scalar as single quoted '
VAL_SQUO = __(22), ///< mark val scalar as single quoted '
KEY_DQUO = __(23), ///< mark key scalar as double quoted "
VAL_DQUO = __(24), ///< mark val scalar as double quoted "
KEY_PLAIN = __(25), ///< mark key scalar as plain scalar (unquoted, even when multiline)
VAL_PLAIN = __(26), ///< mark val scalar as plain scalar (unquoted, even when multiline)
//
// type combination masks:
//
KEYVAL = KEY|VAL,
KEYSEQ = KEY|SEQ,
KEYMAP = KEY|MAP,
DOCMAP = DOC|MAP,
DOCSEQ = DOC|SEQ,
DOCVAL = DOC|VAL,
//
// style combination masks:
//
SCALAR_LITERAL = KEY_LITERAL|VAL_LITERAL,
SCALAR_FOLDED = KEY_FOLDED|VAL_FOLDED,
SCALAR_SQUO = KEY_SQUO|VAL_SQUO,
SCALAR_DQUO = KEY_DQUO|VAL_DQUO,
SCALAR_PLAIN = KEY_PLAIN|VAL_PLAIN,
KEYQUO = KEY_SQUO|KEY_DQUO|KEY_FOLDED|KEY_LITERAL, ///< key style is one of ', ", > or |
VALQUO = VAL_SQUO|VAL_DQUO|VAL_FOLDED|VAL_LITERAL, ///< val style is one of ', ", > or |
KEY_STYLE = KEY_LITERAL|KEY_FOLDED|KEY_SQUO|KEY_DQUO|KEY_PLAIN, ///< mask of all the scalar styles for key (not container styles!)
VAL_STYLE = VAL_LITERAL|VAL_FOLDED|VAL_SQUO|VAL_DQUO|VAL_PLAIN, ///< mask of all the scalar styles for val (not container styles!)
SCALAR_STYLE = KEY_STYLE|VAL_STYLE,
CONTAINER_STYLE_FLOW = FLOW_SL|FLOW_ML,
CONTAINER_STYLE_BLOCK = BLOCK,
CONTAINER_STYLE = FLOW_SL|FLOW_ML|BLOCK,
STYLE = SCALAR_STYLE | CONTAINER_STYLE,
//
// mixed masks
_KEYMASK = KEY | KEYQUO | KEYANCH | KEYREF | KEYTAG,
_VALMASK = VAL | VALQUO | VALANCH | VALREF | VALTAG,
#undef __
} NodeType_e;
constexpr C4_ALWAYS_INLINE C4_CONST NodeType_e operator| (NodeType_e lhs, NodeType_e rhs) noexcept { return (NodeType_e)(((type_bits)lhs) | ((type_bits)rhs)); }
constexpr C4_ALWAYS_INLINE C4_CONST NodeType_e operator& (NodeType_e lhs, NodeType_e rhs) noexcept { return (NodeType_e)(((type_bits)lhs) & ((type_bits)rhs)); }
constexpr C4_ALWAYS_INLINE C4_CONST NodeType_e operator>> (NodeType_e bits, uint32_t n) noexcept { return (NodeType_e)(((type_bits)bits) >> n); }
constexpr C4_ALWAYS_INLINE C4_CONST NodeType_e operator<< (NodeType_e bits, uint32_t n) noexcept { return (NodeType_e)(((type_bits)bits) << n); }
constexpr C4_ALWAYS_INLINE C4_CONST NodeType_e operator~ (NodeType_e bits) noexcept { return (NodeType_e)(~(type_bits)bits); }
C4_ALWAYS_INLINE NodeType_e& operator&= (NodeType_e &subject, NodeType_e bits) noexcept { subject = (NodeType_e)((type_bits)subject & (type_bits)bits); return subject; }
C4_ALWAYS_INLINE NodeType_e& operator|= (NodeType_e &subject, NodeType_e bits) noexcept { subject = (NodeType_e)((type_bits)subject | (type_bits)bits); return subject; }
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
/** wraps a NodeType_e element with some syntactic sugar and predicates */
struct RYML_EXPORT NodeType
{
public:
NodeType_e type;
public:
C4_ALWAYS_INLINE NodeType() noexcept : type(NOTYPE) {}
C4_ALWAYS_INLINE NodeType(NodeType_e t) noexcept : type(t) {}
C4_ALWAYS_INLINE NodeType(type_bits t) noexcept : type((NodeType_e)t) {}
C4_ALWAYS_INLINE bool has_any(NodeType_e t) const noexcept { return (type & t) != 0u; }
C4_ALWAYS_INLINE bool has_all(NodeType_e t) const noexcept { return (type & t) == t; }
C4_ALWAYS_INLINE bool has_none(NodeType_e t) const noexcept { return (type & t) == 0; }
C4_ALWAYS_INLINE void set(NodeType_e t) noexcept { type = t; }
C4_ALWAYS_INLINE void add(NodeType_e t) noexcept { type = (type|t); }
C4_ALWAYS_INLINE void rem(NodeType_e t) noexcept { type = (type & ~t); }
C4_ALWAYS_INLINE void addrem(NodeType_e bits_to_add, NodeType_e bits_to_remove) noexcept { type |= bits_to_add; type &= ~bits_to_remove; }
C4_ALWAYS_INLINE void clear() noexcept { type = NOTYPE; }
public:
C4_ALWAYS_INLINE operator NodeType_e & C4_RESTRICT () noexcept { return type; }
C4_ALWAYS_INLINE operator NodeType_e const& C4_RESTRICT () const noexcept { return type; }
public:
/** @name node type queries
* @{ */
/** return a preset string based on the node type */
C4_ALWAYS_INLINE const char *type_str() const noexcept { return type_str(type); }
/** return a preset string based on the node type */
static const char* type_str(NodeType_e t) noexcept;
/** fill a string with the node type flags. If the string is small, returns {null, len} */
C4_ALWAYS_INLINE csubstr type_str(substr buf) const noexcept { return type_str(buf, type); }
/** fill a string with the node type flags. If the string is small, returns {null, len} */
static csubstr type_str(substr buf, NodeType_e t) noexcept;
public:
/** @name node type queries
* @{ */
C4_ALWAYS_INLINE bool is_notype() const noexcept { return type == NOTYPE; }
C4_ALWAYS_INLINE bool is_stream() const noexcept { return ((type & STREAM) == STREAM) != 0; }
C4_ALWAYS_INLINE bool is_doc() const noexcept { return (type & DOC) != 0; }
C4_ALWAYS_INLINE bool is_container() const noexcept { return (type & (MAP|SEQ|STREAM)) != 0; }
C4_ALWAYS_INLINE bool is_map() const noexcept { return (type & MAP) != 0; }
C4_ALWAYS_INLINE bool is_seq() const noexcept { return (type & SEQ) != 0; }
C4_ALWAYS_INLINE bool has_key() const noexcept { return (type & KEY) != 0; }
C4_ALWAYS_INLINE bool has_val() const noexcept { return (type & VAL) != 0; }
C4_ALWAYS_INLINE bool is_val() const noexcept { return (type & KEYVAL) == VAL; }
C4_ALWAYS_INLINE bool is_keyval() const noexcept { return (type & KEYVAL) == KEYVAL; }
C4_ALWAYS_INLINE bool has_key_tag() const noexcept { return (type & KEYTAG) != 0; }
C4_ALWAYS_INLINE bool has_val_tag() const noexcept { return (type & VALTAG) != 0; }
C4_ALWAYS_INLINE bool has_key_anchor() const noexcept { return (type & KEYANCH) != 0; }
C4_ALWAYS_INLINE bool has_val_anchor() const noexcept { return (type & VALANCH) != 0; }
C4_ALWAYS_INLINE bool has_anchor() const noexcept { return (type & (KEYANCH|VALANCH)) != 0; }
C4_ALWAYS_INLINE bool is_key_ref() const noexcept { return (type & KEYREF) != 0; }
C4_ALWAYS_INLINE bool is_val_ref() const noexcept { return (type & VALREF) != 0; }
C4_ALWAYS_INLINE bool is_ref() const noexcept { return (type & (KEYREF|VALREF)) != 0; }
C4_ALWAYS_INLINE bool is_key_unfiltered() const noexcept { return (type & (KEY_UNFILT)) != 0; }
C4_ALWAYS_INLINE bool is_val_unfiltered() const noexcept { return (type & (VAL_UNFILT)) != 0; }
RYML_DEPRECATED("use has_key_anchor()") bool is_key_anchor() const noexcept { return has_key_anchor(); }
RYML_DEPRECATED("use has_val_anchor()") bool is_val_anchor() const noexcept { return has_val_anchor(); }
RYML_DEPRECATED("use has_anchor()") bool is_anchor() const noexcept { return has_anchor(); }
RYML_DEPRECATED("use has_anchor() || is_ref()") bool is_anchor_or_ref() const noexcept { return has_anchor() || is_ref(); }
/** @} */
public:
/** @name container+scalar style queries
* @{ */
C4_ALWAYS_INLINE bool is_container_styled() const noexcept { return (type & (CONTAINER_STYLE)) != 0; }
C4_ALWAYS_INLINE bool is_block() const noexcept { return (type & (BLOCK)) != 0; }
C4_ALWAYS_INLINE bool is_flow_sl() const noexcept { return (type & (FLOW_SL)) != 0; }
C4_ALWAYS_INLINE bool is_flow_ml() const noexcept { return (type & (FLOW_ML)) != 0; }
C4_ALWAYS_INLINE bool is_flow() const noexcept { return (type & (FLOW_ML|FLOW_SL)) != 0; }
C4_ALWAYS_INLINE bool is_key_styled() const noexcept { return (type & (KEY_STYLE)) != 0; }
C4_ALWAYS_INLINE bool is_val_styled() const noexcept { return (type & (VAL_STYLE)) != 0; }
C4_ALWAYS_INLINE bool is_key_literal() const noexcept { return (type & (KEY_LITERAL)) != 0; }
C4_ALWAYS_INLINE bool is_val_literal() const noexcept { return (type & (VAL_LITERAL)) != 0; }
C4_ALWAYS_INLINE bool is_key_folded() const noexcept { return (type & (KEY_FOLDED)) != 0; }
C4_ALWAYS_INLINE bool is_val_folded() const noexcept { return (type & (VAL_FOLDED)) != 0; }
C4_ALWAYS_INLINE bool is_key_squo() const noexcept { return (type & (KEY_SQUO)) != 0; }
C4_ALWAYS_INLINE bool is_val_squo() const noexcept { return (type & (VAL_SQUO)) != 0; }
C4_ALWAYS_INLINE bool is_key_dquo() const noexcept { return (type & (KEY_DQUO)) != 0; }
C4_ALWAYS_INLINE bool is_val_dquo() const noexcept { return (type & (VAL_DQUO)) != 0; }
C4_ALWAYS_INLINE bool is_key_plain() const noexcept { return (type & (KEY_PLAIN)) != 0; }
C4_ALWAYS_INLINE bool is_val_plain() const noexcept { return (type & (VAL_PLAIN)) != 0; }
C4_ALWAYS_INLINE bool is_key_quoted() const noexcept { return (type & KEYQUO) != 0; }
C4_ALWAYS_INLINE bool is_val_quoted() const noexcept { return (type & VALQUO) != 0; }
C4_ALWAYS_INLINE bool is_quoted() const noexcept { return (type & (KEYQUO|VALQUO)) != 0; }
C4_ALWAYS_INLINE void set_container_style(NodeType_e style) noexcept { type = ((style & CONTAINER_STYLE) | (type & ~CONTAINER_STYLE)); }
C4_ALWAYS_INLINE void set_key_style(NodeType_e style) noexcept { type = ((style & KEY_STYLE) | (type & ~KEY_STYLE)); }
C4_ALWAYS_INLINE void set_val_style(NodeType_e style) noexcept { type = ((style & VAL_STYLE) | (type & ~VAL_STYLE)); }
/** @} */
};
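// Editorial sketch (not part of this header): NodeType is a thin wrapper over
// the flag enum above; flags are composed with set()/add() and queried with
// the predicates. The function below is illustrative only.
inline NodeType example_make_dquo_keyval() noexcept
{
    NodeType t;
    t.set(KEYVAL);               // a leaf node carrying both a key and a val
    t.add(KEY_PLAIN | VAL_DQUO); // plain key, double-quoted val
    return t;                    // t.is_keyval() and t.is_val_dquo() now hold
}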
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
/** @name scalar style helpers
* @{ */
/** choose a YAML emitting style based on the scalar's contents */
RYML_EXPORT NodeType_e scalar_style_choose(csubstr scalar) noexcept;
/** choose a json style based on the scalar's contents */
RYML_EXPORT NodeType_e scalar_style_json_choose(csubstr scalar) noexcept;
/** query whether a scalar can be encoded using single quotes.
* It may not be possible, notably when there is leading
* whitespace after a newline. */
RYML_EXPORT bool scalar_style_query_squo(csubstr s) noexcept;
/** query whether a scalar can be encoded using plain style (no
* quotes, not a literal/folded block scalar). */
RYML_EXPORT bool scalar_style_query_plain(csubstr s) noexcept;
/** YAML-sense query of nullity. returns true if the scalar points
* to `nullptr` or is otherwise equal to one of the strings
* `"~"`,`"null"`,`"Null"`,`"NULL"` */
RYML_EXPORT inline C4_NO_INLINE bool scalar_is_null(csubstr s) noexcept
{
return s.str == nullptr ||
s == "~" ||
s == "null" ||
s == "Null" ||
s == "NULL";
}
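// Editorial sketch (not part of this header): one way the helpers above can
// be combined when picking an emit style for a scalar. The policy of keeping
// null scalars plain is an assumption made for illustration only.
inline NodeType_e example_choose_style(csubstr scalar) noexcept
{
    if(scalar_is_null(scalar))
        return SCALAR_PLAIN;            // ~/null/Null/NULL stay unquoted
    return scalar_style_choose(scalar); // otherwise defer to the generic policy
}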
/** @} */
/** @} */
} // namespace yml
} // namespace c4
C4_SUPPRESS_WARNING_MSVC_POP
C4_SUPPRESS_WARNING_GCC_CLANG_POP
#endif /* C4_YML_NODE_TYPE_HPP_ */

File diff suppressed because it is too large

File diff suppressed because it is too large

File diff suppressed because it is too large

src/c4/yml/parse_engine.hpp (new file)

@@ -0,0 +1,767 @@
#ifndef _C4_YML_PARSE_ENGINE_HPP_
#define _C4_YML_PARSE_ENGINE_HPP_
#ifndef _C4_YML_DETAIL_PARSER_DBG_HPP_
#include "c4/yml/detail/parser_dbg.hpp"
#endif
#ifndef _C4_YML_PARSER_STATE_HPP_
#include "c4/yml/parser_state.hpp"
#endif
#if defined(_MSC_VER)
# pragma warning(push)
# pragma warning(disable: 4251/*needs to have dll-interface to be used by clients of struct*/)
#endif
namespace c4 {
namespace yml {
/** @addtogroup doc_parse
* @{ */
/** @defgroup doc_event_handlers Event Handlers
*
* @brief rapidyaml implements its parsing logic with a two-level
* model, where a @ref ParseEngine object reads through the YAML
* source, and dispatches events to an EventHandler bound to the @ref
* ParseEngine. Because @ref ParseEngine is templated on the event
* handler, the binding uses static polymorphism, without any virtual
* functions. The actual handler object can be changed at run time
* (but it must, of course, be of the type given as the template parameter).
* This is thus a very efficient architecture, and it further enables
* users to provide their own custom handler if they wish to bypass the
* rapidyaml @ref Tree.
*
* There are two handlers implemented in this project:
*
* - @ref EventHandlerTree is the handler responsible for creating the
* ryml @ref Tree
*
* - @ref EventHandlerYamlStd is the handler responsible for emitting
* standardized [YAML test suite
* events](https://github.com/yaml/yaml-test-suite), used (only) in
* the CI of this project.
*
*
* ### Event model
*
* The event model used by the parse engine and event handlers follows
* very closely the event model in the [YAML test
* suite](https://github.com/yaml/yaml-test-suite).
*
* Consider for example this YAML,
* ```yaml
* {foo: bar,foo2: bar2}
* ```
* which would produce these events in the test-suite parlance:
* ```
* +STR
* +DOC
* +MAP {}
* =VAL :foo
* =VAL :bar
* =VAL :foo2
* =VAL :bar2
* -MAP
* -DOC
* -STR
* ```
*
* For reference, the @ref ParseEngine object will produce this
* sequence of calls to its bound EventHandler:
* ```cpp
* handler.begin_stream();
* handler.begin_doc();
* handler.begin_map_val_flow();
* handler.set_key_scalar_plain("foo");
* handler.set_val_scalar_plain("bar");
* handler.add_sibling();
* handler.set_key_scalar_plain("foo2");
* handler.set_val_scalar_plain("bar2");
* handler.end_map();
* handler.end_doc();
* handler.end_stream();
* ```
*
* For many other examples of all areas of YAML and how ryml's parse
* model corresponds to the YAML standard model, refer to the [unit
* tests for the parse
* engine](https://github.com/biojppm/rapidyaml/tree/master/test/test_parse_engine.cpp).
*
*
* ### Special events
*
* Most of the parsing events adopted by rapidyaml in its event model
* are fairly obvious, but there are two less-obvious events requiring
* some explanation.
*
* These events exist to make it easier to parse some special YAML
* cases. They are called by the parser when a just-handled
* value/container is actually the first key of a new map:
*
* - `actually_val_is_first_key_of_new_map_flow()` (@ref EventHandlerTree::actually_val_is_first_key_of_new_map_flow() "see implementation in EventHandlerTree" / @ref EventHandlerYamlStd::actually_val_is_first_key_of_new_map_flow() "see implementation in EventHandlerYamlStd")
* - `actually_val_is_first_key_of_new_map_block()` (@ref EventHandlerTree::actually_val_is_first_key_of_new_map_block() "see implementation in EventHandlerTree" / @ref EventHandlerYamlStd::actually_val_is_first_key_of_new_map_block() "see implementation in EventHandlerYamlStd")
*
* For example, consider an implicit map inside a seq: `[a: b, c:
* d]` which is parsed as `[{a: b}, {c: d}]`. The standard event
* sequence for this YAML would be the following:
* ```cpp
* handler.begin_seq_val_flow();
* handler.begin_map_val_flow();
* handler.set_key_scalar_plain("a");
* handler.set_val_scalar_plain("b");
* handler.end_map();
* handler.add_sibling();
* handler.begin_map_val_flow();
* handler.set_key_scalar_plain("c");
* handler.set_val_scalar_plain("d");
* handler.end_map();
* handler.end_seq();
* ```
* The problem with this event sequence is that it forces the
* parser to delay setting the val scalar (in this case "a" and
* "c") until it knows whether the scalar is a key or a val. This
* would require the parser to store the scalar until this
* time. For instance, in the example above, the parser should
* delay setting "a" and "c", because they are in fact keys and
* not vals. Until then, the parser would have to store "a" and
* "c" in its internal state. The downside is that this complexity
* cost would apply even if there is no implicit map -- every val
* in a seq would have to be delayed until one of the
* disambiguating subsequent tokens `,-]:` is found.
* By calling this function, the parser can avoid this complexity:
* it preemptively sets the scalar as a val, and the call to this
* function then creates the map and rearranges the scalar as its
* key. Now the cost applies only once: when a seqimap starts. So
* the following (easier and cheaper) event sequence has the same
* effect as the standard event sequence above:
* ```cpp
* handler.begin_seq_val_flow();
* handler.set_val_scalar_plain("notmap");
* handler.set_val_scalar_plain("a"); // preemptively set "a" as val!
* handler.actually_as_new_map_key(); // create a map, move the "a" val as the key of the first child of the new map
* handler.set_val_scalar_plain("b"); // now "a" is a key and "b" the val
* handler.end_map();
* handler.set_val_scalar_plain("c"); // "c" also as val!
* handler.actually_as_block_flow(); // likewise
* handler.set_val_scalar_plain("d"); // now "c" is a key and "b" the val
* handler.end_map();
* handler.end_seq();
* ```
* This also applies to container keys (although ryml's tree
* cannot accommodate these): the parser can preemptively set a
* container as a val, and call this event to turn that container
* into a key. For example, consider this yaml:
* ```yaml
* [aa, bb]: [cc, dd]
* # ^ ^ ^
* # | | |
* # (2) (1) (3) <- event sequence
* ```
* The standard event sequence for this YAML would be the
* following:
* ```cpp
* handler.begin_map_val_block(); // (1)
* handler.begin_seq_key_flow(); // (2)
* handler.set_val_scalar_plain("aa");
* handler.add_sibling();
* handler.set_val_scalar_plain("bb");
* handler.end_seq();
* handler.begin_seq_val_flow(); // (3)
* handler.set_val_scalar_plain("cc");
* handler.add_sibling();
* handler.set_val_scalar_plain("dd");
* handler.end_seq();
* handler.end_map();
* ```
* The problem with the sequence above is that, reading from
* left-to-right, the parser can only detect the proper calls at
* (1) and (2) once it reaches (1) in the YAML source. So, the
* parser would have to buffer the entire event sequence starting
* from the beginning until it reaches (1). Using this function,
* the parser can do instead:
* ```cpp
* handler.begin_seq_val_flow(); // (2) -- preemptively as val!
* handler.set_val_scalar_plain("aa");
* handler.add_sibling();
* handler.set_val_scalar_plain("bb");
* handler.end_seq();
* handler.actually_val_is_first_key_of_new_map_block(); // (1) -- adjust when finding that the previous val was actually a key.
* handler.begin_seq_val_flow(); // (3) -- go on as before
* handler.set_val_scalar_plain("cc");
* handler.add_sibling();
* handler.set_val_scalar_plain("dd");
* handler.end_seq();
* handler.end_map();
* ```
*/
class Tree;
class NodeRef;
class ConstNodeRef;
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
/** Options to give to the parser to control its behavior. */
struct RYML_EXPORT ParserOptions
{
private:
typedef enum : uint32_t {
SCALAR_FILTERING = (1u << 0),
LOCATIONS = (1u << 1),
DEFAULTS = SCALAR_FILTERING,
} Flags_e;
uint32_t flags = DEFAULTS;
public:
ParserOptions() = default;
public:
/** @name source location tracking */
/** @{ */
/** enable/disable source location tracking */
ParserOptions& locations(bool enabled) noexcept
{
if(enabled)
flags |= LOCATIONS;
else
flags &= ~LOCATIONS;
return *this;
}
/** query source location tracking status */
C4_ALWAYS_INLINE bool locations() const noexcept { return (flags & LOCATIONS); }
/** @} */
public:
/** @name scalar filtering status (experimental; disable at your discretion) */
/** @{ */
/** enable/disable scalar filtering while parsing */
ParserOptions& scalar_filtering(bool enabled) noexcept
{
if(enabled)
flags |= SCALAR_FILTERING;
else
flags &= ~SCALAR_FILTERING;
return *this;
}
/** query scalar filtering status */
C4_ALWAYS_INLINE bool scalar_filtering() const noexcept { return (flags & SCALAR_FILTERING); }
/** @} */
};
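// Editorial sketch (not part of this header): both setters return *this, so
// ParserOptions can be configured by chaining. The values chosen here are
// illustrative only.
inline ParserOptions example_parser_options() noexcept
{
    return ParserOptions()
        .locations(true)         // enable source location tracking
        .scalar_filtering(true); // keep the default filtering behavior
}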
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
/** This is the main driver of parsing logic: it scans the YAML or
* JSON source for tokens, and emits the appropriate sequence of
* parsing events to its event handler. The parse engine itself has no
* special limitations, and *can* accommodate containers as keys; it is the
* event handler that may introduce additional constraints.
*
* There are two implemented handlers (see @ref doc_event_handlers,
* which has important notes about the event model):
*
* - @ref EventHandlerTree is the handler responsible for creating the
* ryml @ref Tree
*
* - @ref EventHandlerYamlStd is the handler responsible for emitting
* standardized [YAML test suite
* events](https://github.com/yaml/yaml-test-suite), used (only) in
* the CI of this project. This is not part of the library and is
* not installed.
*/
template<class EventHandler>
class ParseEngine
{
public:
using handler_type = EventHandler;
public:
/** @name construction and assignment */
/** @{ */
ParseEngine(EventHandler *evt_handler, ParserOptions opts={});
~ParseEngine();
ParseEngine(ParseEngine &&);
ParseEngine(ParseEngine const&);
ParseEngine& operator=(ParseEngine &&);
ParseEngine& operator=(ParseEngine const&);
/** @} */
public:
/** @name modifiers */
/** @{ */
/** Reserve a certain capacity for the parsing stack.
* This should be larger than the expected depth of the parsed
* YAML tree.
*
* The parsing stack is the only (potential) heap memory used
* directly by the parser.
*
* If the requested capacity is below the default
* stack size of 16, the memory is used directly in the parser
* object; otherwise it will be allocated from the heap.
*
* @note this reserves memory only for the parser itself; all the
* allocations for the parsed tree will go through the tree's
* allocator (when different).
*
* @note for maximum efficiency, the tree and the arena can (and
* should) also be reserved. */
void reserve_stack(id_type capacity)
{
m_evt_handler->m_stack.reserve(capacity);
}
/** Reserve a certain capacity for the array used to track node
* locations in the source buffer. */
void reserve_locations(size_t num_source_lines)
{
_resize_locations(num_source_lines);
}
RYML_DEPRECATED("filter arena no longer needed")
void reserve_filter_arena(size_t) {}
/** @} */
public:
/** @name getters */
/** @{ */
/** Get the options used to build this parser object. */
ParserOptions const& options() const { return m_options; }
/** Get the current callbacks in the parser. */
Callbacks const& callbacks() const { RYML_ASSERT(m_evt_handler); return m_evt_handler->m_stack.m_callbacks; }
/** Get the name of the latest file parsed by this object. */
csubstr filename() const { return m_file; }
/** Get the latest YAML buffer parsed by this object. */
csubstr source() const { return m_buf; }
id_type stack_capacity() const { RYML_ASSERT(m_evt_handler); return m_evt_handler->m_stack.capacity(); }
size_t locations_capacity() const { return m_newline_offsets_capacity; }
RYML_DEPRECATED("filter arena no longer needed")
size_t filter_arena_capacity() const { return 0u; }
/** @} */
public:
/** @name parse methods */
/** @{ */
/** parse YAML in place, emitting events to the current handler */
void parse_in_place_ev(csubstr filename, substr src);
/** parse JSON in place, emitting events to the current handler */
void parse_json_in_place_ev(csubstr filename, substr src);
/** Quickly inspect the source to estimate the number of nodes the
* resulting tree is likely to have. If a tree is empty before
* parsing, considerable time will be spent growing it, so calling
* this to reserve the tree size prior to parsing is likely to
* result in a time gain. We encourage using this method before
* parsing, but, as always, measure its impact on performance to
* obtain a good trade-off.
*
* @note since this method is meant for optimizing performance, it
* is approximate: the result may be smaller than the actual number
* of nodes in the parsed tree, notably if the YAML uses implicit
* maps as flow seq members, as in `[these: are, individual:
* maps]`. */
static id_type estimate_tree_capacity(csubstr src);
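/* Editorial sketch (not part of this header): a typical call sequence,
* assuming `handler` is an EventHandlerTree already bound to a destination
* Tree `t`, and `src` is a mutable substr holding the YAML (the handler
* setup API is not shown in this diff):
* ```cpp
* ParseEngine<EventHandlerTree> parser(&handler);
* t.reserve(ParseEngine<EventHandlerTree>::estimate_tree_capacity(src));
* parser.reserve_stack(64); // optional: for deeply nested documents
* parser.parse_in_place_ev("file.yaml", src);
* ```
*/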
/** @} */
public:
/** @name deprecated parse_methods
* @{ */
/** @cond dev */
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding function in parse.hpp.") typename std::enable_if<U::is_wtree, void>::type parse_in_place(csubstr filename, substr yaml, Tree *t, size_t node_id);
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding function in parse.hpp.") typename std::enable_if<U::is_wtree, void>::type parse_in_place( substr yaml, Tree *t, size_t node_id);
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding function in parse.hpp.") typename std::enable_if<U::is_wtree, void>::type parse_in_place(csubstr filename, substr yaml, Tree *t );
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding function in parse.hpp.") typename std::enable_if<U::is_wtree, void>::type parse_in_place( substr yaml, Tree *t );
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding function in parse.hpp.") typename std::enable_if<U::is_wtree, void>::type parse_in_place(csubstr filename, substr yaml, NodeRef node );
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding function in parse.hpp.") typename std::enable_if<U::is_wtree, void>::type parse_in_place( substr yaml, NodeRef node );
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding function in parse.hpp.") typename std::enable_if<U::is_wtree, Tree>::type parse_in_place(csubstr filename, substr yaml );
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding function in parse.hpp.") typename std::enable_if<U::is_wtree, Tree>::type parse_in_place( substr yaml );
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding function in parse.hpp.") typename std::enable_if<U::is_wtree, void>::type parse_in_arena(csubstr filename, csubstr yaml, Tree *t, size_t node_id);
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding function in parse.hpp.") typename std::enable_if<U::is_wtree, void>::type parse_in_arena( csubstr yaml, Tree *t, size_t node_id);
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding function in parse.hpp.") typename std::enable_if<U::is_wtree, void>::type parse_in_arena(csubstr filename, csubstr yaml, Tree *t );
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding function in parse.hpp.") typename std::enable_if<U::is_wtree, void>::type parse_in_arena( csubstr yaml, Tree *t );
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding function in parse.hpp.") typename std::enable_if<U::is_wtree, void>::type parse_in_arena(csubstr filename, csubstr yaml, NodeRef node );
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding function in parse.hpp.") typename std::enable_if<U::is_wtree, void>::type parse_in_arena( csubstr yaml, NodeRef node );
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding function in parse.hpp.") typename std::enable_if<U::is_wtree, Tree>::type parse_in_arena(csubstr filename, csubstr yaml );
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding function in parse.hpp.") typename std::enable_if<U::is_wtree, Tree>::type parse_in_arena( csubstr yaml );
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding csubstr version in parse.hpp.") typename std::enable_if<U::is_wtree, void>::type parse_in_arena(csubstr filename, substr yaml, Tree *t, size_t node_id);
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding csubstr version in parse.hpp.") typename std::enable_if<U::is_wtree, void>::type parse_in_arena( substr yaml, Tree *t, size_t node_id);
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding csubstr version in parse.hpp.") typename std::enable_if<U::is_wtree, void>::type parse_in_arena(csubstr filename, substr yaml, Tree *t );
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding csubstr version in parse.hpp.") typename std::enable_if<U::is_wtree, void>::type parse_in_arena( substr yaml, Tree *t );
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding csubstr version in parse.hpp.") typename std::enable_if<U::is_wtree, void>::type parse_in_arena(csubstr filename, substr yaml, NodeRef node );
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding csubstr version in parse.hpp.") typename std::enable_if<U::is_wtree, void>::type parse_in_arena( substr yaml, NodeRef node );
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding csubstr version in parse.hpp.") typename std::enable_if<U::is_wtree, Tree>::type parse_in_arena(csubstr filename, substr yaml );
template<class U=EventHandler> RYML_DEPRECATED("deliberately undefined. use the freestanding csubstr version in parse.hpp.") typename std::enable_if<U::is_wtree, Tree>::type parse_in_arena( substr yaml );
/** @endcond */
/** @} */
public:
/** @name locations */
/** @{ */
/** Get the location of a node of the last tree to be parsed by this parser. */
Location location(Tree const& tree, id_type node_id) const;
/** Get the location of a node of the last tree to be parsed by this parser. */
Location location(ConstNodeRef node) const;
/** Get the string starting at a particular location, to the end
* of the parsed source buffer. */
csubstr location_contents(Location const& loc) const;
/** Given a pointer to a buffer position, get the location.
* @param[in] val must be pointing to somewhere in the source
* buffer that was last parsed by this object. */
Location val_location(const char *val) const;
/** @} */
public:
/** @name scalar filtering */
/** @{*/
/** filter a plain scalar */
FilterResult filter_scalar_plain(csubstr scalar, substr dst, size_t indentation) noexcept;
/** filter a plain scalar in place */
FilterResult filter_scalar_plain_in_place(substr scalar, size_t cap, size_t indentation) noexcept;
/** filter a single-quoted scalar */
FilterResult filter_scalar_squoted(csubstr scalar, substr dst) noexcept;
/** filter a single-quoted scalar in place */
FilterResult filter_scalar_squoted_in_place(substr scalar, size_t cap) noexcept;
/** filter a double-quoted scalar */
FilterResult filter_scalar_dquoted(csubstr scalar, substr dst);
/** filter a double-quoted scalar in place */
FilterResultExtending filter_scalar_dquoted_in_place(substr scalar, size_t cap);
/** filter a block-literal scalar */
FilterResult filter_scalar_block_literal(csubstr scalar, substr dst, size_t indentation, BlockChomp_e chomp) noexcept;
/** filter a block-literal scalar in place */
FilterResult filter_scalar_block_literal_in_place(substr scalar, size_t cap, size_t indentation, BlockChomp_e chomp) noexcept;
/** filter a block-folded scalar */
FilterResult filter_scalar_block_folded(csubstr scalar, substr dst, size_t indentation, BlockChomp_e chomp) noexcept;
/** filter a block-folded scalar in place */
FilterResult filter_scalar_block_folded_in_place(substr scalar, size_t cap, size_t indentation, BlockChomp_e chomp) noexcept;
/** @} */
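/* Editorial sketch (not part of this header): in-place filtering of a
* single-quoted scalar that occupies a range of the parsed buffer. The
* members of FilterResult are not shown in this diff, so only the call is
* illustrated; `scalar` and `cap` (the writable capacity at that range) are
* assumptions:
* ```cpp
* FilterResult result = parser.filter_scalar_squoted_in_place(scalar, cap);
* ```
*/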
private:
struct ScannedScalar
{
substr scalar;
bool needs_filter;
};
struct ScannedBlock
{
substr scalar;
size_t indentation;
BlockChomp_e chomp;
};
bool _is_doc_begin(csubstr s);
bool _is_doc_end(csubstr s);
bool _scan_scalar_plain_blck(ScannedScalar *C4_RESTRICT sc, size_t indentation);
bool _scan_scalar_plain_seq_flow(ScannedScalar *C4_RESTRICT sc);
bool _scan_scalar_plain_seq_blck(ScannedScalar *C4_RESTRICT sc);
bool _scan_scalar_plain_map_flow(ScannedScalar *C4_RESTRICT sc);
bool _scan_scalar_plain_map_blck(ScannedScalar *C4_RESTRICT sc);
bool _scan_scalar_map_json(ScannedScalar *C4_RESTRICT sc);
bool _scan_scalar_seq_json(ScannedScalar *C4_RESTRICT sc);
bool _scan_scalar_plain_unk(ScannedScalar *C4_RESTRICT sc);
bool _is_valid_start_scalar_plain_flow(csubstr s);
ScannedScalar _scan_scalar_squot();
ScannedScalar _scan_scalar_dquot();
void _scan_block(ScannedBlock *C4_RESTRICT sb, size_t indref);
csubstr _scan_anchor();
csubstr _scan_ref_seq();
csubstr _scan_ref_map();
csubstr _scan_tag();
public: // exposed for testing
/** @cond dev */
csubstr _filter_scalar_plain(substr s, size_t indentation);
csubstr _filter_scalar_squot(substr s);
csubstr _filter_scalar_dquot(substr s);
csubstr _filter_scalar_literal(substr s, size_t indentation, BlockChomp_e chomp);
csubstr _filter_scalar_folded(substr s, size_t indentation, BlockChomp_e chomp);
csubstr _maybe_filter_key_scalar_plain(ScannedScalar const& sc, size_t indentation);
csubstr _maybe_filter_val_scalar_plain(ScannedScalar const& sc, size_t indentation);
csubstr _maybe_filter_key_scalar_squot(ScannedScalar const& sc);
csubstr _maybe_filter_val_scalar_squot(ScannedScalar const& sc);
csubstr _maybe_filter_key_scalar_dquot(ScannedScalar const& sc);
csubstr _maybe_filter_val_scalar_dquot(ScannedScalar const& sc);
csubstr _maybe_filter_key_scalar_literal(ScannedBlock const& sb);
csubstr _maybe_filter_val_scalar_literal(ScannedBlock const& sb);
csubstr _maybe_filter_key_scalar_folded(ScannedBlock const& sb);
csubstr _maybe_filter_val_scalar_folded(ScannedBlock const& sb);
/** @endcond */
private:
void _handle_map_block();
void _handle_seq_block();
void _handle_map_flow();
void _handle_seq_flow();
void _handle_seq_imap();
void _handle_map_json();
void _handle_seq_json();
void _handle_unk();
void _handle_unk_json();
void _handle_usty();
void _handle_flow_skip_whitespace();
void _end_map_blck();
void _end_seq_blck();
void _end2_map();
void _end2_seq();
void _begin2_doc();
void _begin2_doc_expl();
void _end2_doc();
void _end2_doc_expl();
void _maybe_begin_doc();
void _maybe_end_doc();
void _start_doc_suddenly();
void _end_doc_suddenly();
void _end_doc_suddenly__pop();
void _end_stream();
void _set_indentation(size_t indentation);
void _save_indentation();
void _handle_indentation_pop_from_block_seq();
void _handle_indentation_pop_from_block_map();
void _handle_indentation_pop(ParserState const* dst);
void _maybe_skip_comment();
void _maybe_skip_whitespace_tokens();
void _maybe_skipchars(char c);
#ifdef RYML_NO_COVERAGE__TO_BE_DELETED
void _maybe_skipchars_up_to(char c, size_t max_to_skip);
#endif
template<size_t N>
void _skipchars(const char (&chars)[N]);
bool _maybe_scan_following_colon() noexcept;
public:
/** @cond dev */
template<class FilterProcessor> auto _filter_plain(FilterProcessor &C4_RESTRICT proc, size_t indentation) noexcept -> decltype(proc.result());
template<class FilterProcessor> auto _filter_squoted(FilterProcessor &C4_RESTRICT proc) noexcept -> decltype(proc.result());
template<class FilterProcessor> auto _filter_dquoted(FilterProcessor &C4_RESTRICT proc) -> decltype(proc.result());
template<class FilterProcessor> auto _filter_block_literal(FilterProcessor &C4_RESTRICT proc, size_t indentation, BlockChomp_e chomp) noexcept -> decltype(proc.result());
template<class FilterProcessor> auto _filter_block_folded(FilterProcessor &C4_RESTRICT proc, size_t indentation, BlockChomp_e chomp) noexcept -> decltype(proc.result());
/** @endcond */
public:
/** @cond dev */
template<class FilterProcessor> void _filter_nl_plain(FilterProcessor &C4_RESTRICT proc, size_t indentation) noexcept;
template<class FilterProcessor> void _filter_nl_squoted(FilterProcessor &C4_RESTRICT proc) noexcept;
template<class FilterProcessor> void _filter_nl_dquoted(FilterProcessor &C4_RESTRICT proc) noexcept;
template<class FilterProcessor> bool _filter_ws_handle_to_first_non_space(FilterProcessor &C4_RESTRICT proc) noexcept;
template<class FilterProcessor> void _filter_ws_copy_trailing(FilterProcessor &C4_RESTRICT proc) noexcept;
template<class FilterProcessor> void _filter_ws_skip_trailing(FilterProcessor &C4_RESTRICT proc) noexcept;
template<class FilterProcessor> void _filter_dquoted_backslash(FilterProcessor &C4_RESTRICT proc);
template<class FilterProcessor> void _filter_chomp(FilterProcessor &C4_RESTRICT proc, BlockChomp_e chomp, size_t indentation) noexcept;
template<class FilterProcessor> size_t _handle_all_whitespace(FilterProcessor &C4_RESTRICT proc, BlockChomp_e chomp) noexcept;
template<class FilterProcessor> size_t _extend_to_chomp(FilterProcessor &C4_RESTRICT proc, size_t contents_len) noexcept;
template<class FilterProcessor> void _filter_block_indentation(FilterProcessor &C4_RESTRICT proc, size_t indentation) noexcept;
template<class FilterProcessor> void _filter_block_folded_newlines(FilterProcessor &C4_RESTRICT proc, size_t indentation, size_t len) noexcept;
template<class FilterProcessor> size_t _filter_block_folded_newlines_compress(FilterProcessor &C4_RESTRICT proc, size_t num_newl, size_t wpos_at_first_newl) noexcept;
template<class FilterProcessor> void _filter_block_folded_newlines_leading(FilterProcessor &C4_RESTRICT proc, size_t indentation, size_t len) noexcept;
template<class FilterProcessor> void _filter_block_folded_indented_block(FilterProcessor &C4_RESTRICT proc, size_t indentation, size_t len, size_t curr_indentation) noexcept;
/** @endcond */
private:
void _line_progressed(size_t ahead);
void _line_ended();
void _line_ended_undo();
bool _finished_file() const;
bool _finished_line() const;
void _scan_line();
substr _peek_next_line(size_t pos=npos) const;
inline bool _at_line_begin() const
{
return m_evt_handler->m_curr->line_contents.rem.begin() == m_evt_handler->m_curr->line_contents.full.begin();
}
private:
C4_ALWAYS_INLINE bool has_all(ParserFlag_t f) const noexcept { return (m_evt_handler->m_curr->flags & f) == f; }
C4_ALWAYS_INLINE bool has_any(ParserFlag_t f) const noexcept { return (m_evt_handler->m_curr->flags & f) != 0; }
C4_ALWAYS_INLINE bool has_none(ParserFlag_t f) const noexcept { return (m_evt_handler->m_curr->flags & f) == 0; }
static C4_ALWAYS_INLINE bool has_all(ParserFlag_t f, ParserState const* C4_RESTRICT s) noexcept { return (s->flags & f) == f; }
static C4_ALWAYS_INLINE bool has_any(ParserFlag_t f, ParserState const* C4_RESTRICT s) noexcept { return (s->flags & f) != 0; }
static C4_ALWAYS_INLINE bool has_none(ParserFlag_t f, ParserState const* C4_RESTRICT s) noexcept { return (s->flags & f) == 0; }
#ifndef RYML_DBG
C4_ALWAYS_INLINE static void add_flags(ParserFlag_t on, ParserState *C4_RESTRICT s) noexcept { s->flags |= on; }
C4_ALWAYS_INLINE static void addrem_flags(ParserFlag_t on, ParserFlag_t off, ParserState *C4_RESTRICT s) noexcept { s->flags &= ~off; s->flags |= on; }
C4_ALWAYS_INLINE static void rem_flags(ParserFlag_t off, ParserState *C4_RESTRICT s) noexcept { s->flags &= ~off; }
C4_ALWAYS_INLINE void add_flags(ParserFlag_t on) noexcept { m_evt_handler->m_curr->flags |= on; }
C4_ALWAYS_INLINE void addrem_flags(ParserFlag_t on, ParserFlag_t off) noexcept { m_evt_handler->m_curr->flags &= ~off; m_evt_handler->m_curr->flags |= on; }
C4_ALWAYS_INLINE void rem_flags(ParserFlag_t off) noexcept { m_evt_handler->m_curr->flags &= ~off; }
#else
static void add_flags(ParserFlag_t on, ParserState *C4_RESTRICT s);
static void addrem_flags(ParserFlag_t on, ParserFlag_t off, ParserState *C4_RESTRICT s);
static void rem_flags(ParserFlag_t off, ParserState *C4_RESTRICT s);
C4_ALWAYS_INLINE void add_flags(ParserFlag_t on) noexcept { add_flags(on, m_evt_handler->m_curr); }
C4_ALWAYS_INLINE void addrem_flags(ParserFlag_t on, ParserFlag_t off) noexcept { addrem_flags(on, off, m_evt_handler->m_curr); }
C4_ALWAYS_INLINE void rem_flags(ParserFlag_t off) noexcept { rem_flags(off, m_evt_handler->m_curr); }
#endif
private:
void _prepare_locations();
void _resize_locations(size_t sz);
bool _locations_dirty() const;
bool _location_from_cont(Tree const& tree, id_type node, Location *C4_RESTRICT loc) const;
bool _location_from_node(Tree const& tree, id_type node, Location *C4_RESTRICT loc, id_type level) const;
private:
void _reset();
void _free();
void _clr();
#ifdef RYML_DBG
template<class ...Args> void _dbg(csubstr fmt, Args const& C4_RESTRICT ...args) const;
#endif
template<class ...Args> void _err(csubstr fmt, Args const& C4_RESTRICT ...args) const;
template<class ...Args> void _errloc(csubstr fmt, Location const& loc, Args const& C4_RESTRICT ...args) const;
template<class DumpFn> void _fmt_msg(DumpFn &&dumpfn) const;
private:
/** store pending tag or anchor/ref annotations */
struct Annotation
{
struct Entry
{
csubstr str;
size_t indentation;
size_t line;
};
Entry annotations[2];
size_t num_entries;
};
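// illustrative note (not in the original source): while scanning e.g. `- &a !!str foo`,
// the anchor `&a` is stored in m_pending_anchors and the tag `!!str` in m_pending_tags,
// each with its indentation and source line; both are applied once the scalar `foo`
// (or a container starting at that position) is emitted.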
void _add_annotation(Annotation *C4_RESTRICT dst, csubstr str, size_t indentation, size_t line);
void _clear_annotations(Annotation *C4_RESTRICT dst);
#ifdef RYML_NO_COVERAGE__TO_BE_DELETED
bool _handle_indentation_from_annotations();
#endif
bool _annotations_require_key_container() const;
void _handle_annotations_before_blck_key_scalar();
void _handle_annotations_before_blck_val_scalar();
void _handle_annotations_before_start_mapblck(size_t current_line);
void _handle_annotations_before_start_mapblck_as_key();
void _handle_annotations_and_indentation_after_start_mapblck(size_t key_indentation, size_t key_line);
size_t _select_indentation_from_annotations(size_t val_indentation, size_t val_line);
private:
ParserOptions m_options;
csubstr m_file;
substr m_buf;
public:
/** @cond dev */
EventHandler *C4_RESTRICT m_evt_handler;
/** @endcond */
private:
Annotation m_pending_anchors;
Annotation m_pending_tags;
bool m_was_inside_qmrk;
bool m_doc_empty = true;
private:
size_t *m_newline_offsets;
size_t m_newline_offsets_size;
size_t m_newline_offsets_capacity;
csubstr m_newline_offsets_buf;
};
/** @cond dev */
RYML_EXPORT C4_NO_INLINE size_t _find_last_newline_and_larger_indentation(csubstr s, size_t indentation) noexcept;
/** @endcond */
/** @} */
} // namespace yml
} // namespace c4
#if defined(_MSC_VER)
# pragma warning(pop)
#endif
#endif /* _C4_YML_PARSE_ENGINE_HPP_ */
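For review context, here is a rough sketch of how the new engine is driven end to end. `EventHandlerTree`, `ParseEngine` and the `parse_in_place()` overload taking a parser pointer are assumed from the other files in this PR (event_handler_tree.hpp, parse.hpp), so treat this as illustrative rather than canonical usage:

```cpp
#include <c4/yml/event_handler_tree.hpp>
#include <c4/yml/parse_engine.hpp>
#include <c4/yml/parse.hpp>

c4::yml::Tree parse_example()
{
    char src[] = "{foo: bar, seq: [1, 2, 3]}"; // in-place parsing mutates the buffer while filtering scalars
    c4::yml::EventHandlerTree evt_handler;     // event sink that builds a Tree from parse events
    c4::yml::ParseEngine<c4::yml::EventHandlerTree> parser(&evt_handler);
    return c4::yml::parse_in_place(&parser, src);
}
```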

src/c4/yml/parser_state.hpp
@@ -0,0 +1,211 @@
#ifndef _C4_YML_PARSER_STATE_HPP_
#define _C4_YML_PARSER_STATE_HPP_
#ifndef _C4_YML_COMMON_HPP_
#include "c4/yml/common.hpp"
#endif
namespace c4 {
namespace yml {
/** data type for @ref ParserState_e */
using ParserFlag_t = int;
#ifdef RYML_DBG
namespace detail {
csubstr _parser_flags_to_str(substr buf, ParserFlag_t flags);
} // namespace detail
#endif
/** Enumeration of the state flags for the parser */
typedef enum : ParserFlag_t {
RTOP = 0x01 << 0, ///< reading at top level
RUNK = 0x01 << 1, ///< reading unknown state (when starting): must determine whether scalar, map or seq
RMAP = 0x01 << 2, ///< reading a map
RSEQ = 0x01 << 3, ///< reading a seq
FLOW = 0x01 << 4, ///< reading is inside explicit flow chars: [] or {}
BLCK = 0x01 << 5, ///< reading in block mode
QMRK = 0x01 << 6, ///< reading an explicit key (`? key`)
RKEY = 0x01 << 7, ///< reading a scalar as key
RVAL = 0x01 << 9, ///< reading a scalar as val
RKCL = 0x01 << 8, ///< reading the key colon (ie the : after the key in the map)
RNXT = 0x01 << 10, ///< read next val or keyval
SSCL = 0x01 << 11, ///< there's a stored scalar
QSCL = 0x01 << 12, ///< stored scalar was quoted
RSET = 0x01 << 13, ///< the (implicit) map being read is a !!set. @see https://yaml.org/type/set.html
RDOC = 0x01 << 14, ///< reading a document
NDOC = 0x01 << 15, ///< no document mode. a document has ended and another has not started yet.
USTY = 0x01 << 16, ///< reading in unknown style mode - must determine FLOW or BLCK
//! reading an implicit map nested in an explicit seq.
//! eg, {key: [key2: value2, key3: value3]}
//! is parsed as {key: [{key2: value2}, {key3: value3}]}
RSEQIMAP = 0x01 << 17,
} ParserState_e;
/** Helper to control the line contents while parsing a buffer */
struct LineContents
{
substr full; ///< the full line, including newlines on the right
substr stripped; ///< the stripped line, excluding newlines on the right
substr rem; ///< the stripped line remainder; initially starts at the first non-space character
size_t indentation; ///< the number of spaces on the beginning of the line
LineContents() : full(), stripped(), rem(), indentation() {}
void reset_with_next_line(substr buf, size_t offset)
{
RYML_ASSERT(offset <= buf.len);
char const* C4_RESTRICT b = &buf[offset];
char const* C4_RESTRICT e = b;
// get the current line stripped of newline chars
while(e < buf.end() && (*e != '\n' && *e != '\r'))
++e;
RYML_ASSERT(e >= b);
const substr stripped_ = buf.sub(offset, static_cast<size_t>(e - b));
// advance pos to include the first line ending
if(e != buf.end() && *e == '\r')
++e;
if(e != buf.end() && *e == '\n')
++e;
RYML_ASSERT(e >= b);
const substr full_ = buf.sub(offset, static_cast<size_t>(e - b));
reset(full_, stripped_);
}
void reset(substr full_, substr stripped_)
{
full = full_;
stripped = stripped_;
rem = stripped_;
// find the first column where the character is not a space
indentation = stripped.first_not_of(' ');
}
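// illustrative note (not in the original source): given buf == "  foo: bar\r\n" and
// offset == 0, reset_with_next_line() leaves:
//   full        == "  foo: bar\r\n"  (line ending included)
//   stripped    == "  foo: bar"
//   rem         == "  foo: bar"
//   indentation == 2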
C4_ALWAYS_INLINE size_t current_col() const RYML_NOEXCEPT
{
// WARNING: gcc x86 release builds were wrong (eg returning 0
// when the result should be 4 ) when this function was like
// this:
//
//return current_col(rem);
//
// (see below for the full definition of the called overload
// of current_col())
//
// ... so we explicitly inline the code in here:
RYML_ASSERT(rem.str >= full.str);
size_t col = static_cast<size_t>(rem.str - full.str);
return col;
//
// this was happening only on builds specifically with (gcc
// AND x86 AND release); no other builds were having the
// problem: not in debug, not in x64, not in other
// architectures, not in clang, not in visual studio. WTF!?
//
// Enabling debug prints with RYML_DBG made the problem go
// away, so these could not be used to debug the
// problem. Adding prints inside the called current_col() also
// made the problem go away! WTF!???
//
// a prize will be offered to anybody able to explain why this
// was happening.
}
C4_ALWAYS_INLINE size_t current_col(csubstr s) const RYML_NOEXCEPT
{
RYML_ASSERT(s.str >= full.str);
RYML_ASSERT(full.is_super(s));
size_t col = static_cast<size_t>(s.str - full.str);
return col;
}
};
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
struct ParserState
{
ParserFlag_t flags;
id_type level;
id_type node_id; // don't hold a pointer to the node as it will be relocated during tree resizes
csubstr scalar; // TODO remove
bool more_indented;
size_t scalar_col; // the column where the scalar (or its quotes) begin
Location pos;
LineContents line_contents;
size_t indref; ///< the reference indentation in the current block scope
bool has_children;
ParserState() : flags(), level(), node_id(), scalar(), scalar_col(), pos(), line_contents(), indref() {}
void start_parse(const char *file, id_type node_id_)
{
level = 0;
pos.name = to_csubstr(file);
pos.offset = 0;
pos.line = 1;
pos.col = 1;
node_id = node_id_;
more_indented = false;
scalar_col = 0;
scalar.clear();
indref = 0;
has_children = false;
}
void reset_after_push()
{
node_id = NONE;
indref = NONE;
more_indented = false;
++level;
has_children = false;
}
C4_ALWAYS_INLINE void reset_before_pop(ParserState const& to_pop)
{
pos = to_pop.pos;
line_contents = to_pop.line_contents;
}
C4_ALWAYS_INLINE void mark_with_children()
{
has_children = true;
}
public:
C4_ALWAYS_INLINE bool at_line_beginning() const noexcept
{
return line_contents.rem.str == line_contents.full.str;
}
C4_ALWAYS_INLINE bool indentation_eq() const noexcept
{
RYML_ASSERT(indref != npos);
return line_contents.indentation != npos && line_contents.indentation == indref;
}
C4_ALWAYS_INLINE bool indentation_ge() const noexcept
{
RYML_ASSERT(indref != npos);
return line_contents.indentation != npos && line_contents.indentation >= indref;
}
C4_ALWAYS_INLINE bool indentation_gt() const noexcept
{
RYML_ASSERT(indref != npos);
return line_contents.indentation != npos && line_contents.indentation > indref;
}
C4_ALWAYS_INLINE bool indentation_lt() const noexcept
{
RYML_ASSERT(indref != npos);
return line_contents.indentation != npos && line_contents.indentation < indref;
}
};
} // namespace yml
} // namespace c4
#endif /* _C4_YML_PARSER_STATE_HPP_ */
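The flags above are plain bit masks; a minimal sketch of the pattern behind the parser's `has_all()`/`has_any()`/`has_none()` helpers (declared in parse_engine.hpp), using only names from this header:

```cpp
#include <c4/yml/parser_state.hpp>

void flag_sketch(c4::yml::ParserFlag_t flags)
{
    using namespace c4::yml;
    bool block_map_key   = (flags & (RMAP|BLCK|RKEY)) == (RMAP|BLCK|RKEY); // has_all(RMAP|BLCK|RKEY)
    bool in_flow         = (flags & FLOW) != 0;                            // has_any(FLOW)
    bool no_stored_sscl  = (flags & SSCL) == 0;                            // has_none(SSCL)
    (void)block_map_key; (void)in_flow; (void)no_stored_sscl;
}
```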

src/c4/yml/reference_resolver.cpp
@@ -0,0 +1,302 @@
#include "c4/yml/reference_resolver.hpp"
#include "c4/dump.hpp" // this is needed to resolve a function in the next header
#include "c4/yml/common.hpp"
#include "c4/yml/detail/parser_dbg.hpp"
#ifdef RYML_DBG
#include "c4/yml/detail/print.hpp"
#else
#define _c4dbg_tree(...)
#define _c4dbg_node(...)
#endif
namespace c4 {
namespace yml {
id_type ReferenceResolver::count_anchors_and_refs_(id_type n)
{
id_type c = 0;
c += m_tree->has_key_anchor(n);
c += m_tree->has_val_anchor(n);
c += m_tree->is_key_ref(n);
c += m_tree->is_val_ref(n);
c += m_tree->has_key(n) && m_tree->key(n) == "<<";
for(id_type ch = m_tree->first_child(n); ch != NONE; ch = m_tree->next_sibling(ch))
c += count_anchors_and_refs_(ch);
return c;
}
void ReferenceResolver::gather_anchors_and_refs__(id_type n)
{
// insert key refs BEFORE inserting val refs
if(m_tree->has_key(n))
{
if(m_tree->key(n) == "<<")
{
_c4dbgpf("node[{}]: key is <<", n);
if(m_tree->has_val(n))
{
if(m_tree->is_val_ref(n))
{
_c4dbgpf("node[{}]: val ref, inheriting!", n);
m_refs.push({VALREF, n, NONE, NONE, NONE, NONE});
//m_refs.push({KEYREF, n, NONE, NONE, NONE, NONE});
}
else
{
_c4dbgpf("node[{}]: not ref!", n);
}
}
else if(m_tree->is_seq(n))
{
// for merging multiple inheritance targets
// <<: [ *CENTER, *BIG ]
_c4dbgpf("node[{}]: is seq!", n);
for(id_type ich = m_tree->first_child(n); ich != NONE; ich = m_tree->next_sibling(ich))
{
_c4dbgpf("node[{}]: val ref, inheriting multiple: {}", n, ich);
if(m_tree->is_container(ich))
{
detail::_report_err(m_tree->m_callbacks, "ERROR: node {} child {}: refs for << cannot be containers.'", n, ich);
C4_UNREACHABLE_AFTER_ERR();
}
m_refs.push({VALREF, ich, NONE, NONE, n, m_tree->next_sibling(n)});
}
return; // don't descend into the seq
}
else
{
detail::_report_err(m_tree->m_callbacks, "ERROR: node {}: refs for << must be either val or seq", n);
C4_UNREACHABLE_AFTER_ERR();
}
}
else if(m_tree->is_key_ref(n))
{
_c4dbgpf("node[{}]: key ref: '{}'", n, m_tree->key_ref(n));
_RYML_CB_ASSERT(m_tree->m_callbacks, m_tree->key(n) != "<<");
_RYML_CB_CHECK(m_tree->m_callbacks, (!m_tree->has_key(n)) || m_tree->key(n).ends_with(m_tree->key_ref(n)));
m_refs.push({KEYREF, n, NONE, NONE, NONE, NONE});
}
}
// val ref
if(m_tree->is_val_ref(n) && (!m_tree->has_key(n) || m_tree->key(n) != "<<"))
{
_c4dbgpf("node[{}]: val ref: '{}'", n, m_tree->val_ref(n));
RYML_CHECK((!m_tree->has_val(n)) || m_tree->val(n).ends_with(m_tree->val_ref(n)));
m_refs.push({VALREF, n, NONE, NONE, NONE, NONE});
}
// anchors
if(m_tree->has_key_anchor(n))
{
_c4dbgpf("node[{}]: key anchor: '{}'", n, m_tree->key_anchor(n));
RYML_CHECK(m_tree->has_key(n));
m_refs.push({KEYANCH, n, NONE, NONE, NONE, NONE});
}
if(m_tree->has_val_anchor(n))
{
_c4dbgpf("node[{}]: val anchor: '{}'", n, m_tree->val_anchor(n));
RYML_CHECK(m_tree->has_val(n) || m_tree->is_container(n));
m_refs.push({VALANCH, n, NONE, NONE, NONE, NONE});
}
// recurse
for(id_type ch = m_tree->first_child(n); ch != NONE; ch = m_tree->next_sibling(ch))
gather_anchors_and_refs__(ch);
}
void ReferenceResolver::gather_anchors_and_refs_()
{
_c4dbgp("gathering anchors and refs...");
// minimize (re-)allocations by counting first
id_type num_anchors_and_refs = count_anchors_and_refs_(m_tree->root_id());
if(!num_anchors_and_refs)
return;
m_refs.reserve(num_anchors_and_refs);
m_refs.clear();
// now descend through the hierarchy
gather_anchors_and_refs__(m_tree->root_id());
_c4dbgpf("found {} anchors/refs", m_refs.size());
// finally connect the reference list
id_type prev_anchor = NONE;
id_type count = 0;
for(auto &rd : m_refs)
{
rd.prev_anchor = prev_anchor;
if(rd.type.has_anchor())
prev_anchor = count;
++count;
}
_c4dbgp("gathering anchors and refs: finished");
}
id_type ReferenceResolver::lookup_(RefData *C4_RESTRICT ra)
{
RYML_ASSERT(ra->type.is_key_ref() || ra->type.is_val_ref());
RYML_ASSERT(ra->type.is_key_ref() != ra->type.is_val_ref());
csubstr refname;
if(ra->type.is_val_ref())
{
refname = m_tree->val_ref(ra->node);
}
else
{
RYML_ASSERT(ra->type.is_key_ref());
refname = m_tree->key_ref(ra->node);
}
while(ra->prev_anchor != NONE)
{
ra = &m_refs[ra->prev_anchor];
if(m_tree->has_anchor(ra->node, refname))
return ra->node;
}
detail::_report_err(m_tree->m_callbacks, "ERROR: anchor not found: '{}'", refname);
C4_UNREACHABLE_AFTER_ERR();
}
void ReferenceResolver::reset_(Tree *t_)
{
if(t_->callbacks() != m_refs.m_callbacks)
{
m_refs.m_callbacks = t_->callbacks();
}
m_refs.clear();
m_tree = t_;
}
void ReferenceResolver::resolve(Tree *t_)
{
_c4dbgp("resolving references...");
reset_(t_);
_c4dbg_tree("unresolved tree", *m_tree);
gather_anchors_and_refs_();
if(m_refs.empty())
return;
/* from the specs: "an alias node refers to the most recent
* node in the serialization having the specified anchor". So
* we need to start looking upward from ref nodes.
*
* @see http://yaml.org/spec/1.2/spec.html#id2765878 */
_c4dbgp("matching anchors/refs...");
for(id_type i = 0, e = m_refs.size(); i < e; ++i)
{
RefData &C4_RESTRICT refdata = m_refs.top(i);
if( ! refdata.type.is_ref())
continue;
refdata.target = lookup_(&refdata);
}
_c4dbgp("matching anchors/refs: finished");
// insert the resolved references
_c4dbgp("modifying tree...");
id_type prev_parent_ref = NONE;
id_type prev_parent_ref_after = NONE;
for(id_type i = 0, e = m_refs.size(); i < e; ++i)
{
RefData const& C4_RESTRICT refdata = m_refs[i];
_c4dbgpf("instance {}/{}...", i, e);
if( ! refdata.type.is_ref())
continue;
_c4dbgpf("instance {} is reference!", i);
if(refdata.parent_ref != NONE)
{
_c4dbgpf("ref {} has parent: {}", i, refdata.parent_ref);
_RYML_CB_ASSERT(m_tree->m_callbacks, m_tree->is_seq(refdata.parent_ref));
const id_type p = m_tree->parent(refdata.parent_ref);
id_type after;
if(prev_parent_ref != refdata.parent_ref)
{
after = refdata.parent_ref;//prev_sibling(rd.parent_ref_sibling);
prev_parent_ref_after = after;
}
else
{
after = prev_parent_ref_after;
}
prev_parent_ref = refdata.parent_ref;
prev_parent_ref_after = m_tree->duplicate_children_no_rep(refdata.target, p, after);
m_tree->remove(refdata.node);
}
else
{
_c4dbgpf("ref {} has no parent", i, refdata.parent_ref);
if(m_tree->has_key(refdata.node) && m_tree->key(refdata.node) == "<<")
{
_c4dbgpf("ref {} is inheriting", i);
_RYML_CB_ASSERT(m_tree->m_callbacks, m_tree->is_keyval(refdata.node));
const id_type p = m_tree->parent(refdata.node);
const id_type after = m_tree->prev_sibling(refdata.node);
m_tree->duplicate_children_no_rep(refdata.target, p, after);
m_tree->remove(refdata.node);
}
else if(refdata.type.is_key_ref())
{
_c4dbgpf("ref {} is key ref", i);
_RYML_CB_ASSERT(m_tree->m_callbacks, m_tree->is_key_ref(refdata.node));
_RYML_CB_ASSERT(m_tree->m_callbacks, m_tree->has_key_anchor(refdata.target) || m_tree->has_val_anchor(refdata.target));
if(m_tree->has_val_anchor(refdata.target) && m_tree->val_anchor(refdata.target) == m_tree->key_ref(refdata.node))
{
_RYML_CB_CHECK(m_tree->m_callbacks, !m_tree->is_container(refdata.target));
_RYML_CB_CHECK(m_tree->m_callbacks, m_tree->has_val(refdata.target));
const type_bits existing_style_flags = VAL_STYLE & m_tree->_p(refdata.target)->m_type.type;
static_assert((VAL_STYLE >> 1u) == (KEY_STYLE), "bad flags");
m_tree->_p(refdata.node)->m_key.scalar = m_tree->val(refdata.target);
m_tree->_add_flags(refdata.node, KEY | (existing_style_flags >> 1u));
}
else
{
_RYML_CB_CHECK(m_tree->m_callbacks, m_tree->key_anchor(refdata.target) == m_tree->key_ref(refdata.node));
m_tree->_p(refdata.node)->m_key.scalar = m_tree->key(refdata.target);
// keys cannot be containers, so don't inherit container flags
const type_bits existing_style_flags = KEY_STYLE & m_tree->_p(refdata.target)->m_type.type;
m_tree->_add_flags(refdata.node, KEY | existing_style_flags);
}
}
else // val ref
{
_c4dbgpf("ref {} is val ref", i);
_RYML_CB_ASSERT(m_tree->m_callbacks, refdata.type.is_val_ref());
if(m_tree->has_key_anchor(refdata.target) && m_tree->key_anchor(refdata.target) == m_tree->val_ref(refdata.node))
{
_RYML_CB_CHECK(m_tree->m_callbacks, !m_tree->is_container(refdata.target));
_RYML_CB_CHECK(m_tree->m_callbacks, m_tree->has_val(refdata.target));
// keys cannot be containers, so don't inherit container flags
const type_bits existing_style_flags = (KEY_STYLE) & m_tree->_p(refdata.target)->m_type.type;
static_assert((KEY_STYLE << 1u) == (VAL_STYLE), "bad flags");
m_tree->_p(refdata.node)->m_val.scalar = m_tree->key(refdata.target);
m_tree->_add_flags(refdata.node, VAL | (existing_style_flags << 1u));
}
else
{
m_tree->duplicate_contents(refdata.target, refdata.node);
}
}
}
}
_c4dbgp("modifying tree: finished");
// clear anchors and refs
_c4dbgp("clearing anchors/refs");
for(auto const& C4_RESTRICT ar : m_refs)
{
m_tree->rem_anchor_ref(ar.node);
if(ar.parent_ref != NONE)
if(m_tree->type(ar.parent_ref) != NOTYPE)
m_tree->remove(ar.parent_ref);
}
_c4dbgp("clearing anchors/refs: finished");
_c4dbg_tree("resolved tree", *m_tree);
m_tree = nullptr;
_c4dbgp("resolving references: finished");
}
} // namespace yml
} // namespace c4

src/c4/yml/reference_resolver.hpp
@@ -0,0 +1,74 @@
#ifndef _C4_YML_REFERENCE_RESOLVER_HPP_
#define _C4_YML_REFERENCE_RESOLVER_HPP_
#include "c4/yml/tree.hpp"
#include "c4/yml/detail/stack.hpp"
namespace c4 {
namespace yml {
/** @addtogroup doc_ref_utils
* @{
*/
/** Reusable object to resolve references/aliases in the tree. */
struct RYML_EXPORT ReferenceResolver
{
ReferenceResolver() = default;
/** Resolve references: for each reference, look for a matching
* anchor, and copy its contents to the ref node.
*
* This method first does a full traversal of the tree to gather
* all anchors and references in a separate collection, then it
* goes through that collection to locate the names, which it does
* by obeying the YAML standard diktat that "an alias node refers
* to the most recent node in the serialization having the
* specified anchor"
*
* So, depending on the number of anchor/alias nodes, this is a
* potentially expensive operation, with a best-case linear
* complexity (from the initial traversal).
*
* @todo verify sanity against anchor-ref attacks (https://en.wikipedia.org/wiki/Billion_laughs_attack )
*/
void resolve(Tree *t_);
public:
/** @cond dev */
struct RefData
{
NodeType type;
id_type node;
id_type prev_anchor;
id_type target;
id_type parent_ref;
id_type parent_ref_sibling;
};
void reset_(Tree *t_);
void gather_anchors_and_refs_();
void gather_anchors_and_refs__(id_type n);
id_type count_anchors_and_refs_(id_type n);
id_type lookup_(RefData *C4_RESTRICT ra);
public:
Tree *C4_RESTRICT m_tree;
/** We're using this stack purely as an array. */
detail::stack<RefData> m_refs;
/** @endcond */
};
/** @} */
} // namespace yml
} // namespace c4
#endif // _C4_YML_REFERENCE_RESOLVER_HPP_
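A minimal usage sketch, assuming the existing `parse_in_place()` entry point from parse.hpp (not part of this header):

```cpp
#include <c4/yml/parse.hpp>
#include <c4/yml/reference_resolver.hpp>

void resolve_example()
{
    char src[] = "base: &b {x: 1, y: 2}\nderived:\n  <<: *b\n  y: 3\n";
    c4::yml::Tree tree = c4::yml::parse_in_place(src);
    c4::yml::ReferenceResolver resolver;  // reusable: keeps its RefData stack between calls
    resolver.resolve(&tree);
    // after resolving: tree["derived"]["x"] == "1" (merged from &b); the existing
    // tree["derived"]["y"] == "3" is not overwritten (duplicate_children_no_rep)
}
```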

src/c4/yml/std/vector.hpp
@@ -17,7 +17,7 @@ template<class V, class Alloc>
void write(c4::yml::NodeRef *n, std::vector<V, Alloc> const& vec)
{
*n |= c4::yml::SEQ;
for(auto const& v : vec)
for(V const& v : vec)
n->append_child() << v;
}
@@ -26,8 +26,8 @@ bool read(c4::yml::ConstNodeRef const& n, std::vector<V, Alloc> *vec)
{
vec->resize(n.num_children());
size_t pos = 0;
for(auto const ch : n)
ch >> (*vec)[pos++];
for(ConstNodeRef const child : n)
child >> (*vec)[pos++];
return true;
}
@@ -38,10 +38,10 @@ bool read(c4::yml::ConstNodeRef const& n, std::vector<bool, Alloc> *vec)
{
vec->resize(n.num_children());
size_t pos = 0;
bool tmp = false;
for(auto const ch : n)
bool tmp = {};
for(ConstNodeRef const child : n)
{
ch >> tmp;
child >> tmp;
(*vec)[pos++] = tmp;
}
return true;
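For reference, the round trip these overloads enable looks roughly like this (a sketch assuming the usual `Tree`/`NodeRef` streaming operators):

```cpp
#include <vector>
#include <c4/yml/std/vector.hpp>
#include <c4/yml/yml.hpp>

void vector_roundtrip()
{
    std::vector<int> v = {10, 20, 30};
    c4::yml::Tree t;
    t.rootref() << v;     // write() above: the node becomes a SEQ with one child per element
    std::vector<int> out;
    t.crootref() >> out;  // read() above: resizes `out` and fills it from the children
}
```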

src/c4/yml/tag.cpp
@@ -0,0 +1,321 @@
#include "c4/yml/tag.hpp"
#include "c4/yml/tree.hpp"
#include "c4/yml/detail/parser_dbg.hpp"
namespace c4 {
namespace yml {
bool is_custom_tag(csubstr tag)
{
if((tag.len > 2) && (tag.str[0] == '!'))
{
size_t pos = tag.find('!', 1);
return pos != npos && pos > 1 && tag.str[1] != '<';
}
return false;
}
csubstr normalize_tag(csubstr tag)
{
YamlTag_e t = to_tag(tag);
if(t != TAG_NONE)
return from_tag(t);
if(tag.begins_with("!<"))
tag = tag.sub(1);
if(tag.begins_with("<!"))
return tag;
return tag;
}
csubstr normalize_tag_long(csubstr tag)
{
YamlTag_e t = to_tag(tag);
if(t != TAG_NONE)
return from_tag_long(t);
if(tag.begins_with("!<"))
tag = tag.sub(1);
if(tag.begins_with("<!"))
return tag;
return tag;
}
csubstr normalize_tag_long(csubstr tag, substr output)
{
csubstr result = normalize_tag_long(tag);
if(result.begins_with("!!"))
{
tag = tag.sub(2);
const csubstr pfx = "<tag:yaml.org,2002:";
const size_t len = pfx.len + tag.len + 1;
if(len <= output.len)
{
memcpy(output.str , pfx.str, pfx.len);
memcpy(output.str + pfx.len, tag.str, tag.len);
output[pfx.len + tag.len] = '>';
result = output.first(len);
}
else
{
result.str = nullptr;
result.len = len;
}
}
return result;
}
YamlTag_e to_tag(csubstr tag)
{
if(tag.begins_with("!<"))
tag = tag.sub(1);
if(tag.begins_with("!!"))
tag = tag.sub(2);
else if(tag.begins_with('!'))
return TAG_NONE;
else if(tag.begins_with("tag:yaml.org,2002:"))
{
RYML_ASSERT(csubstr("tag:yaml.org,2002:").len == 18);
tag = tag.sub(18);
}
else if(tag.begins_with("<tag:yaml.org,2002:"))
{
RYML_ASSERT(csubstr("<tag:yaml.org,2002:").len == 19);
tag = tag.sub(19);
if(!tag.len)
return TAG_NONE;
tag = tag.offs(0, 1);
}
if(tag == "map")
return TAG_MAP;
else if(tag == "omap")
return TAG_OMAP;
else if(tag == "pairs")
return TAG_PAIRS;
else if(tag == "set")
return TAG_SET;
else if(tag == "seq")
return TAG_SEQ;
else if(tag == "binary")
return TAG_BINARY;
else if(tag == "bool")
return TAG_BOOL;
else if(tag == "float")
return TAG_FLOAT;
else if(tag == "int")
return TAG_INT;
else if(tag == "merge")
return TAG_MERGE;
else if(tag == "null")
return TAG_NULL;
else if(tag == "str")
return TAG_STR;
else if(tag == "timestamp")
return TAG_TIMESTAMP;
else if(tag == "value")
return TAG_VALUE;
else if(tag == "yaml")
return TAG_YAML;
return TAG_NONE;
}
csubstr from_tag_long(YamlTag_e tag)
{
switch(tag)
{
case TAG_MAP:
return {"<tag:yaml.org,2002:map>"};
case TAG_OMAP:
return {"<tag:yaml.org,2002:omap>"};
case TAG_PAIRS:
return {"<tag:yaml.org,2002:pairs>"};
case TAG_SET:
return {"<tag:yaml.org,2002:set>"};
case TAG_SEQ:
return {"<tag:yaml.org,2002:seq>"};
case TAG_BINARY:
return {"<tag:yaml.org,2002:binary>"};
case TAG_BOOL:
return {"<tag:yaml.org,2002:bool>"};
case TAG_FLOAT:
return {"<tag:yaml.org,2002:float>"};
case TAG_INT:
return {"<tag:yaml.org,2002:int>"};
case TAG_MERGE:
return {"<tag:yaml.org,2002:merge>"};
case TAG_NULL:
return {"<tag:yaml.org,2002:null>"};
case TAG_STR:
return {"<tag:yaml.org,2002:str>"};
case TAG_TIMESTAMP:
return {"<tag:yaml.org,2002:timestamp>"};
case TAG_VALUE:
return {"<tag:yaml.org,2002:value>"};
case TAG_YAML:
return {"<tag:yaml.org,2002:yaml>"};
case TAG_NONE:
default:
return {""};
}
}
csubstr from_tag(YamlTag_e tag)
{
switch(tag)
{
case TAG_MAP:
return {"!!map"};
case TAG_OMAP:
return {"!!omap"};
case TAG_PAIRS:
return {"!!pairs"};
case TAG_SET:
return {"!!set"};
case TAG_SEQ:
return {"!!seq"};
case TAG_BINARY:
return {"!!binary"};
case TAG_BOOL:
return {"!!bool"};
case TAG_FLOAT:
return {"!!float"};
case TAG_INT:
return {"!!int"};
case TAG_MERGE:
return {"!!merge"};
case TAG_NULL:
return {"!!null"};
case TAG_STR:
return {"!!str"};
case TAG_TIMESTAMP:
return {"!!timestamp"};
case TAG_VALUE:
return {"!!value"};
case TAG_YAML:
return {"!!yaml"};
case TAG_NONE:
default:
return {""};
}
}
bool TagDirective::create_from_str(csubstr directive_)
{
csubstr directive = directive_;
directive = directive.sub(4);
if(!directive.begins_with(' '))
return false;
directive = directive.triml(' ');
size_t pos = directive.find(' ');
if(pos == npos)
return false;
handle = directive.first(pos);
directive = directive.sub(handle.len).triml(' ');
pos = directive.find(' ');
if(pos != npos)
directive = directive.first(pos);
prefix = directive;
next_node_id = NONE;
_c4dbgpf("%TAG: handle={} prefix={}", handle, prefix);
return true;
}
bool TagDirective::create_from_str(csubstr directive_, Tree *tree)
{
_RYML_CB_CHECK(tree->callbacks(), directive_.begins_with("%TAG "));
if(!create_from_str(directive_))
{
_RYML_CB_ERR(tree->callbacks(), "invalid tag directive");
}
next_node_id = tree->size();
if(tree->size() > 0)
{
const id_type prev = tree->size() - 1;
if(tree->is_root(prev) && tree->type(prev) != NOTYPE && !tree->is_stream(prev))
++next_node_id;
}
_c4dbgpf("%TAG: handle={} prefix={} next_node={}", handle, prefix, next_node_id);
return true;
}
size_t TagDirective::transform(csubstr tag, substr output, Callbacks const& callbacks) const
{
_c4dbgpf("%TAG: handle={} prefix={} next_node={}. tag={}", handle, prefix, next_node_id, tag);
_RYML_CB_ASSERT(callbacks, tag.len >= handle.len);
csubstr rest = tag.sub(handle.len);
_c4dbgpf("%TAG: rest={}", rest);
if(rest.begins_with('<'))
{
rest = rest.offs(1, 1);
_c4dbgpf("%TAG: begins with <. rest={}", rest);
if(rest.begins_with(prefix))
{
_c4dbgpf("%TAG: already transformed! actual={}", rest.sub(prefix.len));
return 0; // return 0 to signal that the tag is local and cannot be resolved
}
}
size_t len = 1u + prefix.len + rest.len + 1u;
size_t numpc = rest.count('%');
if(numpc == 0)
{
if(len <= output.len)
{
output.str[0] = '<';
memcpy(1u + output.str, prefix.str, prefix.len);
memcpy(1u + output.str + prefix.len, rest.str, rest.len);
output.str[1u + prefix.len + rest.len] = '>';
}
}
else
{
// need to decode URI % sequences
size_t pos = rest.find('%');
_RYML_CB_ASSERT(callbacks, pos != npos);
do {
size_t next = rest.first_not_of("0123456789abcdefABCDEF", pos+1);
if(next == npos)
next = rest.len;
_RYML_CB_CHECK(callbacks, pos+1 < next);
_RYML_CB_CHECK(callbacks, pos+1 + 2 <= next);
size_t delta = next - (pos+1);
len -= delta;
pos = rest.find('%', pos+1);
} while(pos != npos);
if(len <= output.len)
{
size_t prev = 0, wpos = 0;
auto appendstr = [&](csubstr s) { memcpy(output.str + wpos, s.str, s.len); wpos += s.len; };
auto appendchar = [&](char c) { output.str[wpos++] = c; };
appendchar('<');
appendstr(prefix);
pos = rest.find('%');
_RYML_CB_ASSERT(callbacks, pos != npos);
do {
size_t next = rest.first_not_of("0123456789abcdefABCDEF", pos+1);
if(next == npos)
next = rest.len;
_RYML_CB_CHECK(callbacks, pos+1 < next);
_RYML_CB_CHECK(callbacks, pos+1 + 2 <= next);
uint8_t val;
if(C4_UNLIKELY(!read_hex(rest.range(pos+1, next), &val) || val > 127))
_RYML_CB_ERR(callbacks, "invalid URI character");
appendstr(rest.range(prev, pos));
appendchar(static_cast<char>(val));
prev = next;
pos = rest.find('%', pos+1);
} while(pos != npos);
_RYML_CB_ASSERT(callbacks, pos == npos);
_RYML_CB_ASSERT(callbacks, prev > 0);
_RYML_CB_ASSERT(callbacks, rest.len >= prev);
appendstr(rest.sub(prev));
appendchar('>');
_RYML_CB_ASSERT(callbacks, wpos == len);
}
}
return len;
}
} // namespace yml
} // namespace c4
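One detail worth noting in `normalize_tag_long(tag, output)` above: when the output buffer is too small, it returns a `csubstr` with a null `str` and the required size in `len`, so callers can retry with a larger buffer. A sketch of that pattern (the `!!mytype` tag is a made-up example):

```cpp
#include <c4/yml/tag.hpp>

void normalize_example()
{
    char small[8];
    c4::csubstr r = c4::yml::normalize_tag_long("!!mytype", small); // known !!tags return a static string instead
    if(r.str == nullptr) // output too small: r.len holds the required length
    {
        char big[64];
        r = c4::yml::normalize_tag_long("!!mytype", big); // r == "<tag:yaml.org,2002:mytype>"
    }
}
```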

src/c4/yml/tag.hpp
@@ -0,0 +1,76 @@
#ifndef _C4_YML_TAG_HPP_
#define _C4_YML_TAG_HPP_
#include <c4/yml/common.hpp>
namespace c4 {
namespace yml {
class Tree;
/** @addtogroup doc_tag_utils
*
* @{
*/
#ifndef RYML_MAX_TAG_DIRECTIVES
/** the maximum number of tag directives in a Tree */
#define RYML_MAX_TAG_DIRECTIVES 4
#endif
/** the integral type necessary to cover all the bits marking node tags */
using tag_bits = uint16_t;
/** a bit mask for marking tags for types */
typedef enum : tag_bits {
TAG_NONE = 0,
// container types
TAG_MAP = 1, /**< !!map Unordered set of key: value pairs without duplicates. @see https://yaml.org/type/map.html */
TAG_OMAP = 2, /**< !!omap Ordered sequence of key: value pairs without duplicates. @see https://yaml.org/type/omap.html */
TAG_PAIRS = 3, /**< !!pairs Ordered sequence of key: value pairs allowing duplicates. @see https://yaml.org/type/pairs.html */
TAG_SET = 4, /**< !!set Unordered set of non-equal values. @see https://yaml.org/type/set.html */
TAG_SEQ = 5, /**< !!seq Sequence of arbitrary values. @see https://yaml.org/type/seq.html */
// scalar types
TAG_BINARY = 6, /**< !!binary A sequence of zero or more octets (8 bit values). @see https://yaml.org/type/binary.html */
TAG_BOOL = 7, /**< !!bool Mathematical Booleans. @see https://yaml.org/type/bool.html */
TAG_FLOAT = 8, /**< !!float Floating-point approximation to real numbers. https://yaml.org/type/float.html */
TAG_INT = 9, /**< !!int Mathematical integers. https://yaml.org/type/int.html */
TAG_MERGE = 10, /**< !!merge Specify one or more mapping to be merged with the current one. https://yaml.org/type/merge.html */
TAG_NULL = 11, /**< !!null Devoid of value. https://yaml.org/type/null.html */
TAG_STR = 12, /**< !!str A sequence of zero or more Unicode characters. https://yaml.org/type/str.html */
TAG_TIMESTAMP = 13, /**< !!timestamp A point in time https://yaml.org/type/timestamp.html */
TAG_VALUE = 14, /**< !!value Specify the default value of a mapping https://yaml.org/type/value.html */
TAG_YAML = 15, /**< !!yaml Keys for encoding YAML in YAML. https://yaml.org/type/yaml.html */
} YamlTag_e;
RYML_EXPORT YamlTag_e to_tag(csubstr tag);
RYML_EXPORT csubstr from_tag(YamlTag_e tag);
RYML_EXPORT csubstr from_tag_long(YamlTag_e tag);
RYML_EXPORT csubstr normalize_tag(csubstr tag);
RYML_EXPORT csubstr normalize_tag_long(csubstr tag);
RYML_EXPORT csubstr normalize_tag_long(csubstr tag, substr output);
RYML_EXPORT bool is_custom_tag(csubstr tag);
struct RYML_EXPORT TagDirective
{
/** Eg `!e!` in `%TAG !e! tag:example.com,2000:app/` */
csubstr handle;
/** Eg `tag:example.com,2000:app/` in `%TAG !e! tag:example.com,2000:app/` */
csubstr prefix;
/** The next node to which this tag directive applies */
id_type next_node_id;
bool create_from_str(csubstr directive_); ///< leaves next_node_id unfilled
bool create_from_str(csubstr directive_, Tree *tree);
size_t transform(csubstr tag, substr output, Callbacks const& callbacks) const;
};
/** @} */
} // namespace yml
} // namespace c4
#endif /* _C4_YML_TAG_HPP_ */
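A quick sketch of the tag directive flow declared here; `get_callbacks()` is assumed to be the existing default-callbacks accessor from common.hpp:

```cpp
#include <c4/yml/tag.hpp>
#include <c4/yml/common.hpp>

void tag_directive_example()
{
    // builtin tags round-trip through the enum:
    //   to_tag("!!int") == TAG_INT, from_tag(TAG_INT) == "!!int"
    c4::yml::TagDirective td;
    td.create_from_str("%TAG !e! tag:example.com,2000:app/"); // handle="!e!", prefix="tag:example.com,2000:app/"
    char buf[64];
    size_t len = td.transform("!e!foo", buf, c4::yml::get_callbacks());
    // when len <= sizeof(buf), buf now starts with "<tag:example.com,2000:app/foo>"
    (void)len;
}
```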

(two file diffs suppressed because they are too large)
src/c4/yml/writer.hpp
@@ -23,18 +23,6 @@ namespace yml {
*/
/** Repeat-Character: a character to be written a number of times. */
struct RepC
{
char c;
size_t num_times;
};
inline RepC indent_to(size_t num_levels)
{
return {' ', size_t(2) * num_levels};
}
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
//-----------------------------------------------------------------------------
@@ -86,13 +74,11 @@ struct WriterFile
++m_pos;
}
inline void _do_write(RepC const rc)
inline void _do_write(const char c, size_t num_times)
{
for(size_t i = 0; i < rc.num_times; ++i)
{
fputc(rc.c, m_file);
}
m_pos += rc.num_times;
for(size_t i = 0; i < num_times; ++i)
fputc(c, m_file);
m_pos += num_times;
}
};
@@ -149,13 +135,11 @@ struct WriterOStream
++m_pos;
}
inline void _do_write(RepC const rc)
inline void _do_write(const char c, size_t num_times)
{
for(size_t i = 0; i < rc.num_times; ++i)
{
m_stream.put(rc.c);
}
m_pos += rc.num_times;
for(size_t i = 0; i < num_times; ++i)
m_stream.put(c);
m_pos += num_times;
}
};
@@ -212,22 +196,16 @@ struct WriterBuf
inline void _do_write(const char c)
{
if(m_pos + 1 <= m_buf.len)
{
m_buf[m_pos] = c;
}
++m_pos;
}
inline void _do_write(RepC const rc)
inline void _do_write(const char c, size_t num_times)
{
if(m_pos + rc.num_times <= m_buf.len)
{
for(size_t i = 0; i < rc.num_times; ++i)
{
m_buf[m_pos + i] = rc.c;
}
}
m_pos += rc.num_times;
if(m_pos + num_times <= m_buf.len)
for(size_t i = 0; i < num_times; ++i)
m_buf[m_pos + i] = c;
m_pos += num_times;
}
};

src/c4/yml/yml.hpp
@@ -4,7 +4,12 @@
#include "c4/yml/tree.hpp"
#include "c4/yml/node.hpp"
#include "c4/yml/emit.hpp"
#include "c4/yml/event_handler_tree.hpp"
#include "c4/yml/parse_engine.hpp"
#include "c4/yml/filter_processor.hpp"
#include "c4/yml/parse.hpp"
#include "c4/yml/preprocess.hpp"
#include "c4/yml/reference_resolver.hpp"
#include "c4/yml/tag.hpp"
#endif // _C4_YML_YML_HPP_

tools/amalgamate.py
@@ -50,9 +50,7 @@ def amalgamate_ryml(filename: str,
with_fastfloat=with_fastfloat,
with_stl=with_stl)
repo = "https://github.com/biojppm/rapidyaml"
defmacro = ryml_defmacro
srcfiles = [
am.cmttext(f"""
ryml_preamble = f"""
Rapid YAML - a library to parse and emit YAML, and do it fast.
{repo}
@@ -61,21 +59,37 @@ DO NOT EDIT. This file is generated automatically.
This is an amalgamated single-header version of the library.
INSTRUCTIONS:
- Include at will in any header of your project
- In one (and only one) of your project source files,
#define {defmacro} and then include this header.
This will enable the function and class definitions in
the header file.
- To compile into a shared library, just define the
preprocessor symbol RYML_SHARED . This will take
care of symbol export/import.
"""),
- Include at will in any header of your project. Because the
amalgamated header file is large, to speed up compilation of
your project, protect the include with its include guard
`_RYML_SINGLE_HEADER_AMALGAMATED_HPP_`, ie like this:
```
#ifndef _RYML_SINGLE_HEADER_AMALGAMATED_HPP_
#include <ryml_all.hpp>
#endif
```
- In one (and only one) of your project source files, #define
{ryml_defmacro} and then include this header. This will enable
the function and class definitions in the header file.
- To compile into a shared library, define the preprocessor symbol
RYML_SHARED before including the header. This will take care of
symbol export/import.
"""
srcfiles = [
am.cmttext(ryml_preamble),
am.cmtfile("LICENSE.txt"),
am.injcode(exports_def_code),
am.onlyif(with_c4core, am.injcode(c4core_def_code)),
am.onlyif(with_c4core, c4core_amalgamated),
"src/c4/yml/export.hpp",
"src/c4/yml/fwd.hpp",
"src/c4/yml/common.hpp",
"src/c4/yml/node_type.hpp",
"src/c4/yml/tag.hpp",
"src/c4/yml/tree.hpp",
"src/c4/yml/node.hpp",
"src/c4/yml/writer.hpp",
@@ -84,16 +98,26 @@ INSTRUCTIONS:
"src/c4/yml/emit.hpp",
"src/c4/yml/emit.def.hpp",
"src/c4/yml/detail/stack.hpp",
"src/c4/yml/filter_processor.hpp",
"src/c4/yml/parser_state.hpp",
"src/c4/yml/event_handler_stack.hpp",
"src/c4/yml/event_handler_tree.hpp",
"src/c4/yml/parse_engine.hpp",
"src/c4/yml/preprocess.hpp",
"src/c4/yml/reference_resolver.hpp",
"src/c4/yml/parse.hpp",
am.onlyif(with_stl, "src/c4/yml/std/map.hpp"),
am.onlyif(with_stl, "src/c4/yml/std/string.hpp"),
am.onlyif(with_stl, "src/c4/yml/std/vector.hpp"),
am.onlyif(with_stl, "src/c4/yml/std/std.hpp"),
"src/c4/yml/common.cpp",
"src/c4/yml/node_type.cpp",
"src/c4/yml/tag.cpp",
"src/c4/yml/tree.cpp",
"src/c4/yml/parse_engine.def.hpp",
"src/c4/yml/reference_resolver.cpp",
"src/c4/yml/parse.cpp",
"src/c4/yml/node.cpp",
"src/c4/yml/preprocess.hpp",
"src/c4/yml/preprocess.cpp",
"src/c4/yml/detail/checks.hpp",
"src/c4/yml/detail/print.hpp",
@@ -109,7 +133,7 @@ INSTRUCTIONS:
re.compile(r'^\s*#\s*include "(c4/.*)".*$'),
re.compile(r'^\s*#\s*include <(c4/.*)>.*$'),
],
definition_macro=defmacro,
definition_macro=ryml_defmacro,
repo=repo,
result_incguard="_RYML_SINGLE_HEADER_AMALGAMATED_HPP_")
result_with_only_first_includes = am.include_only_first(result)