[libvirt] [PATCH 0/3] yajl cleanups

Fixes 'make check' on RHEL 6, which I recently broke, then adds further improvements to the JSON parser. I'm tempted to push patch 1/3 as a build-breaker, but since the other two need review, I'll hold off and we can do all three as a series. Eric Blake (3): json: cope with older yajl semantics json: reject javascript comments json: reject trailing garbage src/util/virjson.c | 45 ++++++++++++++++++++++++++++++++++++++------- tests/jsontest.c | 5 ++++- 2 files changed, 42 insertions(+), 8 deletions(-) -- 2.4.3

Commit ceb496e5 fails on RHEL 6, with yajl 1.0.7, because that version of yajl returns yajl_status_insufficient_data when the parser is waiting for the rest of a token (this enum value was dropped in yajl 2, so we have to wrap it). It also exposes a problem where older yajl silently ignores trailing garbage after a successful parse, so this patch works around that by changing the testsuite. Another more invasive patch can add tighter semantics to json parsing, but this is sufficient for a minimal clean backport. While touching this, fix up our error message cleanup. Yajl documents that error messages produced by yajl_get_error() MUST be cleaned with yajl_free_error(); this is certainly true if we were to pass non-NULL allocator callbacks during yajl_alloc(), but probably harmless in our usage of passing NULL. But better safe than sorry. * src/util/virjson.c (virJSONValueFromString): Allow different error code. Use canonical cleanup of error message. (VIR_YAJL_STATUS_OK): New helper macro. * tests/jsontest.c (mymain): Wrap text to avoid difference in trailing garbage handling Signed-off-by: Eric Blake <eblake@redhat.com> --- src/util/virjson.c | 12 ++++++++---- tests/jsontest.c | 2 +- 2 files changed, 9 insertions(+), 5 deletions(-) diff --git a/src/util/virjson.c b/src/util/virjson.c index 29e2c39..4257b30 100644 --- a/src/util/virjson.c +++ b/src/util/virjson.c @@ -36,9 +36,12 @@ # ifdef WITH_YAJL2 # define yajl_size_t size_t +# define VIR_YAJL_STATUS_OK(status) ((status) == yajl_status_ok) # else # define yajl_size_t unsigned int # define yajl_complete_parse yajl_parse_complete +# define VIR_YAJL_STATUS_OK(status) \ + ((status) == yajl_status_ok || (status) == yajl_status_insufficient_data) # endif #endif @@ -1590,6 +1593,8 @@ virJSONValueFromString(const char *jsonstring) yajl_handle hand; virJSONParser parser = { NULL, NULL, 0 }; virJSONValuePtr ret = NULL; + int rc; + size_t len = strlen(jsonstring); # ifndef WITH_YAJL2 yajl_parser_config cfg = { 1, 1 }; # endif @@ -1611,9 +1616,8 @@ virJSONValueFromString(const char *jsonstring) goto cleanup; } - if (yajl_parse(hand, - (const unsigned char *)jsonstring, - strlen(jsonstring)) != yajl_status_ok || + rc = yajl_parse(hand, (const unsigned char *)jsonstring, len); + if (!VIR_YAJL_STATUS_OK(rc) || yajl_complete_parse(hand) != yajl_status_ok) { unsigned char *errstr = yajl_get_error(hand, 1, (const unsigned char*)jsonstring, @@ -1622,7 +1626,7 @@ virJSONValueFromString(const char *jsonstring) virReportError(VIR_ERR_INTERNAL_ERROR, _("cannot parse json %s: %s"), jsonstring, (const char*) errstr); - VIR_FREE(errstr); + yajl_free_error(hand, errstr); virJSONValueFree(parser.head); goto cleanup; } diff --git a/tests/jsontest.c b/tests/jsontest.c index 34a07ee..8ac0970 100644 --- a/tests/jsontest.c +++ b/tests/jsontest.c @@ -416,7 +416,7 @@ mymain(void) DO_TEST_PARSE("boolean", "true"); DO_TEST_PARSE("null", "null"); DO_TEST_PARSE_FAIL("incomplete keyword", "tr"); - DO_TEST_PARSE_FAIL("overdone keyword", "truest"); + DO_TEST_PARSE_FAIL("overdone keyword", "[ truest ]"); DO_TEST_PARSE_FAIL("unknown keyword", "huh"); DO_TEST_PARSE_FAIL("object with numeric keys", "{ 1:1, 2:1, 3:2 }"); -- 2.4.3

We have been allowing javascript style comments in JSON ever since commit 9428f2c (v0.7.5), but qemu doesn't send them, and they are not strict JSON. Reject them for now; if we can later prove that it is worthwhile, we can reinstate it at that point (or even make it conditional, by adding a bool parameter to the libvirt entry point). * src/util/virjson.c (virJSONValueFromString): Don't enable comment parsing. * tests/jsontest.c (mymain): Test it. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/util/virjson.c | 4 ++-- tests/jsontest.c | 1 + 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/src/util/virjson.c b/src/util/virjson.c index 4257b30..8d12fad 100644 --- a/src/util/virjson.c +++ b/src/util/virjson.c @@ -1596,7 +1596,7 @@ virJSONValueFromString(const char *jsonstring) int rc; size_t len = strlen(jsonstring); # ifndef WITH_YAJL2 - yajl_parser_config cfg = { 1, 1 }; + yajl_parser_config cfg = { 0, 1 }; # endif VIR_DEBUG("string=%s", jsonstring); @@ -1604,7 +1604,7 @@ virJSONValueFromString(const char *jsonstring) # ifdef WITH_YAJL2 hand = yajl_alloc(&parserCallbacks, NULL, &parser); if (hand) { - yajl_config(hand, yajl_allow_comments, 1); + yajl_config(hand, yajl_allow_comments, 0); yajl_config(hand, yajl_dont_validate_strings, 0); } # else diff --git a/tests/jsontest.c b/tests/jsontest.c index 8ac0970..f6c2d84 100644 --- a/tests/jsontest.c +++ b/tests/jsontest.c @@ -418,6 +418,7 @@ mymain(void) DO_TEST_PARSE_FAIL("incomplete keyword", "tr"); DO_TEST_PARSE_FAIL("overdone keyword", "[ truest ]"); DO_TEST_PARSE_FAIL("unknown keyword", "huh"); + DO_TEST_PARSE_FAIL("comments", "[ /* nope */\n1 // not this either\n]"); DO_TEST_PARSE_FAIL("object with numeric keys", "{ 1:1, 2:1, 3:2 }"); DO_TEST_PARSE_FAIL("unterminated object", "{ \"1\":1, \"2\":1, \"3\":2"); -- 2.4.3

Yajl 2 has a nice feature that it can be configured whether to allow multiple JSON objects parsed from a single stream, defaulting to off. And yajl 1.0.12 at least provided a way to tell if all input bytes were parsed, or if trailing bytes remained after a valid JSON object was parsed. But we target RHEL 6 yajl 1.0.7, which has neither of these. So fake it by always parsing '[...]' instead, so that trailing garbage either trips up the array parse, or is easily detected when unwrapping the result. * src/util/virjson.c (virJSONValueFromString): With older json, wrap text to avoid trailing garbage. * tests/jsontest.c (mymain): Add tests for this. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/util/virjson.c | 29 ++++++++++++++++++++++++++++- tests/jsontest.c | 2 ++ 2 files changed, 30 insertions(+), 1 deletion(-) diff --git a/src/util/virjson.c b/src/util/virjson.c index 8d12fad..a33005a 100644 --- a/src/util/virjson.c +++ b/src/util/virjson.c @@ -1597,6 +1597,7 @@ virJSONValueFromString(const char *jsonstring) size_t len = strlen(jsonstring); # ifndef WITH_YAJL2 yajl_parser_config cfg = { 0, 1 }; + virJSONValuePtr tmp; # endif VIR_DEBUG("string=%s", jsonstring); @@ -1616,7 +1617,21 @@ virJSONValueFromString(const char *jsonstring) goto cleanup; } + /* Yajl 2 is nice enough to default to rejecting trailing garbage. + * Yajl 1.0.12 has yajl_get_bytes_consumed to make that detection + * simpler. But we're stuck with yajl 1.0.7 on RHEL 6, which + * happily quits parsing at the end of a valid JSON construct, + * with no visibility into how much more input remains. Wrapping + * things in an array forces yajl to confess the truth. */ +# ifdef WITH_YAJL2 rc = yajl_parse(hand, (const unsigned char *)jsonstring, len); +# else + rc = yajl_parse(hand, (const unsigned char *)"[", 1); + if (VIR_YAJL_STATUS_OK(rc)) + rc = yajl_parse(hand, (const unsigned char *)jsonstring, len); + if (VIR_YAJL_STATUS_OK(rc)) + rc = yajl_parse(hand, (const unsigned char *)"]", 1); +# endif if (!VIR_YAJL_STATUS_OK(rc) || yajl_complete_parse(hand) != yajl_status_ok) { unsigned char *errstr = yajl_get_error(hand, 1, @@ -1638,6 +1653,18 @@ virJSONValueFromString(const char *jsonstring) virJSONValueFree(parser.head); } else { ret = parser.head; +# ifndef WITH_YAJL2 + /* Undo the array wrapping above */ + tmp = ret; + ret = NULL; + if (virJSONValueArraySize(tmp) > 1) + virReportError(VIR_ERR_INTERNAL_ERROR, + _("cannot parse json %s: too many items present"), + jsonstring); + else + ret = virJSONValueArraySteal(tmp, 0); + virJSONValueFree(tmp); +# endif } cleanup: @@ -1650,7 +1677,7 @@ virJSONValueFromString(const char *jsonstring) VIR_FREE(parser.state); } - VIR_DEBUG("result=%p", parser.head); + VIR_DEBUG("result=%p", ret); return ret; } diff --git a/tests/jsontest.c b/tests/jsontest.c index f6c2d84..a363dc0 100644 --- a/tests/jsontest.c +++ b/tests/jsontest.c @@ -419,6 +419,8 @@ mymain(void) DO_TEST_PARSE_FAIL("overdone keyword", "[ truest ]"); DO_TEST_PARSE_FAIL("unknown keyword", "huh"); DO_TEST_PARSE_FAIL("comments", "[ /* nope */\n1 // not this either\n]"); + DO_TEST_PARSE_FAIL("trailing garbage", "[] []"); + DO_TEST_PARSE_FAIL("list without array", "1, 1"); DO_TEST_PARSE_FAIL("object with numeric keys", "{ 1:1, 2:1, 3:2 }"); DO_TEST_PARSE_FAIL("unterminated object", "{ \"1\":1, \"2\":1, \"3\":2"); -- 2.4.3

Since older yajl ignores trailing garbage, a client can cause problems by intentionally ending the wrapper array early. Since we already track nesting, it's not too much harder to reject invalid nesting pops. * src/util/virjson. (_virJSONParser): Add field. (virJSONValueFromString): Set witness. (virJSONParserHandleEndArray): Use it to catch abuse. * tests/jsontest.c (mymain): Test it. Signed-off-by: Eric Blake <eblake@redhat.com> --- Could be squashed with 3/3, if desired. src/util/virjson.c | 7 +++++-- tests/jsontest.c | 1 + 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/src/util/virjson.c b/src/util/virjson.c index a33005a..3c6ed34 100644 --- a/src/util/virjson.c +++ b/src/util/virjson.c @@ -64,6 +64,7 @@ struct _virJSONParser { virJSONValuePtr head; virJSONParserStatePtr state; size_t nstate; + int wrap; }; @@ -1556,7 +1557,7 @@ virJSONParserHandleEndArray(void *ctx) VIR_DEBUG("parser=%p", parser); - if (!parser->nstate) + if (!(parser->nstate - parser->wrap)) return 0; state = &(parser->state[parser->nstate-1]); @@ -1591,7 +1592,7 @@ virJSONValuePtr virJSONValueFromString(const char *jsonstring) { yajl_handle hand; - virJSONParser parser = { NULL, NULL, 0 }; + virJSONParser parser = { NULL, NULL, 0, 0 }; virJSONValuePtr ret = NULL; int rc; size_t len = strlen(jsonstring); @@ -1627,8 +1628,10 @@ virJSONValueFromString(const char *jsonstring) rc = yajl_parse(hand, (const unsigned char *)jsonstring, len); # else rc = yajl_parse(hand, (const unsigned char *)"[", 1); + parser.wrap = 1; if (VIR_YAJL_STATUS_OK(rc)) rc = yajl_parse(hand, (const unsigned char *)jsonstring, len); + parser.wrap = 0; if (VIR_YAJL_STATUS_OK(rc)) rc = yajl_parse(hand, (const unsigned char *)"]", 1); # endif diff --git a/tests/jsontest.c b/tests/jsontest.c index a363dc0..97b9c0a 100644 --- a/tests/jsontest.c +++ b/tests/jsontest.c @@ -421,6 +421,7 @@ mymain(void) DO_TEST_PARSE_FAIL("comments", "[ /* nope */\n1 // not this either\n]"); DO_TEST_PARSE_FAIL("trailing garbage", "[] []"); DO_TEST_PARSE_FAIL("list without array", "1, 1"); + DO_TEST_PARSE_FAIL("parser abuse", "1] [2"); DO_TEST_PARSE_FAIL("object with numeric keys", "{ 1:1, 2:1, 3:2 }"); DO_TEST_PARSE_FAIL("unterminated object", "{ \"1\":1, \"2\":1, \"3\":2"); -- 2.4.3

On Mon, Jun 22, 2015 at 15:01:15 -0600, Eric Blake wrote:
Since older yajl ignores trailing garbage, a client can cause problems by intentionally ending the wrapper array early. Since we already track nesting, it's not too much harder to reject invalid nesting pops.
* src/util/virjson. (_virJSONParser): Add field. (virJSONValueFromString): Set witness. (virJSONParserHandleEndArray): Use it to catch abuse. * tests/jsontest.c (mymain): Test it.
Signed-off-by: Eric Blake <eblake@redhat.com> ---
Could be squashed with 3/3, if desired.
src/util/virjson.c | 7 +++++-- tests/jsontest.c | 1 + 2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/src/util/virjson.c b/src/util/virjson.c index a33005a..3c6ed34 100644 --- a/src/util/virjson.c +++ b/src/util/virjson.c @@ -64,6 +64,7 @@ struct _virJSONParser { virJSONValuePtr head; virJSONParserStatePtr state; size_t nstate; + int wrap;
Boolean?
};
@@ -1556,7 +1557,7 @@ virJSONParserHandleEndArray(void *ctx)
VIR_DEBUG("parser=%p", parser);
- if (!parser->nstate) + if (!(parser->nstate - parser->wrap))
Yuck! This really covers up what's happening here. if ((parser->nstate == 1 && parser->wrap) || (parser->nstate == 0 && !parser->wrap)) It takes two lines but you at least don't cover up the logic.
return 0;
state = &(parser->state[parser->nstate-1]); @@ -1591,7 +1592,7 @@ virJSONValuePtr virJSONValueFromString(const char *jsonstring) { yajl_handle hand; - virJSONParser parser = { NULL, NULL, 0 }; + virJSONParser parser = { NULL, NULL, 0, 0 }; virJSONValuePtr ret = NULL; int rc; size_t len = strlen(jsonstring); @@ -1627,8 +1628,10 @@ virJSONValueFromString(const char *jsonstring) rc = yajl_parse(hand, (const unsigned char *)jsonstring, len); # else rc = yajl_parse(hand, (const unsigned char *)"[", 1); + parser.wrap = 1;
True?
if (VIR_YAJL_STATUS_OK(rc)) rc = yajl_parse(hand, (const unsigned char *)jsonstring, len); + parser.wrap = 0;
False?
if (VIR_YAJL_STATUS_OK(rc)) rc = yajl_parse(hand, (const unsigned char *)"]", 1); # endif
Peter

We already enable the parser option to detect invalid UTF-8, but didn't test it. Also, JSON states that behavior of an object with a duplicated key is undefined; we chose to reject it, but were not testing it. With the enhanced tests in place, we can simplify yajl2 initialization by relying on parser defaults being sane. * src/util/virjson.c (virJSONValueFromString): Simplify. * tests/jsontest.c (mymain): Test more bad usage. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/util/virjson.c | 6 +----- tests/jsontest.c | 2 ++ 2 files changed, 3 insertions(+), 5 deletions(-) diff --git a/src/util/virjson.c b/src/util/virjson.c index 3c6ed34..c123943 100644 --- a/src/util/virjson.c +++ b/src/util/virjson.c @@ -1597,7 +1597,7 @@ virJSONValueFromString(const char *jsonstring) int rc; size_t len = strlen(jsonstring); # ifndef WITH_YAJL2 - yajl_parser_config cfg = { 0, 1 }; + yajl_parser_config cfg = { 0, 1 }; /* Match yajl 2 default behavior */ virJSONValuePtr tmp; # endif @@ -1605,10 +1605,6 @@ virJSONValueFromString(const char *jsonstring) # ifdef WITH_YAJL2 hand = yajl_alloc(&parserCallbacks, NULL, &parser); - if (hand) { - yajl_config(hand, yajl_allow_comments, 0); - yajl_config(hand, yajl_dont_validate_strings, 0); - } # else hand = yajl_alloc(&parserCallbacks, &cfg, NULL, &parser); # endif diff --git a/tests/jsontest.c b/tests/jsontest.c index 97b9c0a..223f867 100644 --- a/tests/jsontest.c +++ b/tests/jsontest.c @@ -422,6 +422,7 @@ mymain(void) DO_TEST_PARSE_FAIL("trailing garbage", "[] []"); DO_TEST_PARSE_FAIL("list without array", "1, 1"); DO_TEST_PARSE_FAIL("parser abuse", "1] [2"); + DO_TEST_PARSE_FAIL("invalid UTF-8", "\"\x80\""); DO_TEST_PARSE_FAIL("object with numeric keys", "{ 1:1, 2:1, 3:2 }"); DO_TEST_PARSE_FAIL("unterminated object", "{ \"1\":1, \"2\":1, \"3\":2"); @@ -430,6 +431,7 @@ mymain(void) DO_TEST_PARSE_FAIL("array of an object with an array as a key", "[ {[\"key1\", \"key2\"]: \"value\"} ]"); DO_TEST_PARSE_FAIL("object with unterminated key", "{ \"key:7 }"); + DO_TEST_PARSE_FAIL("duplicate key", "{ \"a\": 1, \"a\": 1 }"); DO_TEST_FULL("lookup on array", Lookup, "[ 1 ]", NULL, false); -- 2.4.3

On 22.06.2015 22:18, Eric Blake wrote:
Fixes 'make check' on RHEL 6, which I recently broke, then adds further improvements to the JSON parser.
I'm tempted to push patch 1/3 as a build-breaker, but since the other two need review, I'll hold off and we can do all three as a series.
Eric Blake (3): json: cope with older yajl semantics json: reject javascript comments json: reject trailing garbage
src/util/virjson.c | 45 ++++++++++++++++++++++++++++++++++++++------- tests/jsontest.c | 5 ++++- 2 files changed, 42 insertions(+), 8 deletions(-)
ACK series. Michal
participants (3)
-
Eric Blake
-
Michal Privoznik
-
Peter Krempa