@skeeto
Last active November 8, 2022 00:54
Jsonic fuzz tester
// Fuzz test for Jsonic
// $ afl-gcc -m32 -fsanitize=address,undefined fuzz.c jsonic.c
// $ afl-fuzz -m800 -iexamples/heroes -oout ./a.out
// https://github.com/rohanrhu/jsonic
// This is free and unencumbered software released into the public domain.
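#include <stdio.h>
#include <stdlib.h>
#include "jsonic.h"

// Recursively visit every node so the fuzzer exercises both iterators.
// Returns 1 if a JSONIC_NONE node is reached, 0 otherwise.
// No cleanup is done: leaks don't matter for this fuzz test.
static int explore(jsonic_node_t *root, char *buf)
{
    jsonic_node_t *key = 0;
    jsonic_node_t *e = 0;
    switch (root->type) {
    case JSONIC_NONE:
        return 1;
    case JSONIC_OBJECT:
        for (;;) {
            // Pass the previous key back in to continue from it.
            key = jsonic_object_iter_kv_free(buf, root, key);
            if (key->type == JSONIC_NONE) {
                return 0;
            }
            printf("=> %s\n", key->key);
            if (explore(key, buf)) {
                return 1;
            }
        }
    case JSONIC_ARRAY:
        for (;;) {
            // Pass the previous element back in; it starts out null.
            // Index 0 means "the next item after the previous element".
            e = jsonic_array_iter_free(buf, root, e, 0);
            if (e->type == JSONIC_NONE) {
                return 0;
            }
            if (explore(e, buf)) {
                return 1;
            }
        }
    case JSONIC_STRING:
    case JSONIC_NUMBER:
    case JSONIC_BOOLEAN:
    case JSONIC_NULL:
        puts(root->val);
        return 0;
    }
    abort();
}

int main(void)
{
    char *buf = malloc(1<<10);
    int len = fread(buf, 1, (1<<10)-1, stdin);
    buf[len++] = 0;
    buf = realloc(buf, len);  // shrink to fit so ASan can flag reads past the terminator
    return explore(jsonic_get_root(buf), buf);
}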
@skeeto
Author

skeeto commented Nov 7, 2022

Thanks, I switched it over.

@rohanrhu

rohanrhu commented Nov 7, 2022

Meowww it is something like this:

jsonic_node_t* e = NULL;
for (;;) {
    e = jsonic_array_iter_free(
        buf,   // JSON string
        root,  // array node
        e,     // "from" node: the previous element, or NULL to start
        0      // always zero: the index is relative, i.e. the 0th item after the "from" node
    );
    if (e->type == JSONIC_NONE) break;

    // ...
}

Btw are you getting impressive results? I have new ideas about performance, I will implement sooooon.

@skeeto
Author

skeeto commented Nov 7, 2022

Alright, I think I understand now. I just pushed the fix.

> Btw are you getting impressive results?

I was only using this to fuzz test since it's an easy way to uncover issues. The fuzz test isn't sensitive to memory leaks, so that's why I don't waste time cleaning up. As noted, your parser is written robustly, and a couple hours of parallel fuzzing turned up nothing. I haven't fuzzed with this latest fix, but I don't think it would make a difference.

I'd try your benchmarks, but my distribution's (Bullseye) Boost is slightly too old to have json.hpp. I just don't have the patience to wrangle a newer Boost, so I'll just leave it at that.

I wouldn't be surprised to learn that yours is faster since it's less dynamic (a good thing). Or, more specifically, it pushes the dynamism onto the caller, and your benchmark represents the typical case where the caller doesn't introduce that extra dynamism in the first place. The caller is essentially parsing to a known, fixed schema and won't consume just any arbitrary JSON.

That's not far from my own usual approach to JSON, which is to put together the barest of tokenizers for my target schema, then parse exactly that schema and nothing more. The result is tiny, and orders of magnitude faster than a generic JSON parser such as Boost's. A JSON tokenizer is only around 70 LOC, and the rest depends on the complexity of the schema. A couple examples:
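
As a rough illustration of the approach described above (a sketch under assumptions, not the author's code), here is a bare tokenizer followed by a parser for one fixed schema, `{"x": <number>, "y": <number>}`. Every name in it (`enum tok`, `struct lexer`, `next`, `word`, `struct point`, `parse_point`) is hypothetical and made up for this example. The tokenizer only scans: it does not decode string escapes or validate the JSON number grammar, which a real one would need to do.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical token kinds for this sketch.
enum tok { T_ERR, T_EOF, T_LBRACE, T_RBRACE, T_LBRACK, T_RBRACK,
           T_COLON, T_COMMA, T_STRING, T_NUMBER, T_TRUE, T_FALSE, T_NULL };

struct lexer {
    const char *p;    // cursor into the NUL-terminated input
    const char *beg;  // start of the most recent token's text
    size_t len;       // length of the most recent token's text
};

// Match the remainder of a keyword (true/false/null) at the cursor.
static enum tok word(struct lexer *l, const char *rest, enum tok t)
{
    for (; *rest; rest++, l->p++) {
        if (*l->p != *rest) return T_ERR;
    }
    return t;
}

// Return the next token. For strings, beg/len cover the raw bytes
// between the quotes; escapes are skipped over but not decoded.
static enum tok next(struct lexer *l)
{
    while (*l->p == ' ' || *l->p == '\t' || *l->p == '\n' || *l->p == '\r') {
        l->p++;
    }
    l->beg = l->p;
    l->len = 0;
    char c = *l->p;
    if (c == 0) return T_EOF;
    l->p++;
    switch (c) {
    case '{': return T_LBRACE;
    case '}': return T_RBRACE;
    case '[': return T_LBRACK;
    case ']': return T_RBRACK;
    case ':': return T_COLON;
    case ',': return T_COMMA;
    case 't': return word(l, "rue", T_TRUE);
    case 'f': return word(l, "alse", T_FALSE);
    case 'n': return word(l, "ull", T_NULL);
    case '"':
        l->beg = l->p;
        for (; *l->p != '"'; l->p++) {
            if (*l->p == 0) return T_ERR;                     // unterminated
            if (*l->p == '\\' && *++l->p == 0) return T_ERR;  // dangling escape
        }
        l->len = l->p++ - l->beg;
        return T_STRING;
    }
    if (c == '-' || (c >= '0' && c <= '9')) {
        // Loose scan: consumes number-like characters without validating.
        while ((*l->p >= '0' && *l->p <= '9') || *l->p == '.' ||
               *l->p == 'e' || *l->p == 'E' || *l->p == '+' || *l->p == '-') {
            l->p++;
        }
        l->len = l->p - l->beg;
        return T_NUMBER;
    }
    return T_ERR;
}

struct point { double x, y; };

// Parse exactly {"x": <number>, "y": <number>}, keys in either order.
// Returns 1 on success, 0 on anything else.
static int parse_point(const char *json, struct point *out)
{
    struct lexer l = {json, json, 0};
    if (next(&l) != T_LBRACE) return 0;
    for (int seen = 0;;) {
        if (next(&l) != T_STRING || l.len != 1) return 0;
        char key = *l.beg;
        if (next(&l) != T_COLON) return 0;
        if (next(&l) != T_NUMBER) return 0;
        char tmp[32];  // copy so strtod sees only this token's bytes
        if (l.len >= sizeof(tmp)) return 0;
        memcpy(tmp, l.beg, l.len);
        tmp[l.len] = 0;
        double v = strtod(tmp, 0);
        if (key == 'x') { out->x = v; seen |= 1; }
        else if (key == 'y') { out->y = v; seen |= 2; }
        else return 0;
        enum tok t = next(&l);
        if (t == T_RBRACE) return seen == 3 && next(&l) == T_EOF;
        if (t != T_COMMA) return 0;
    }
}
```

A caller would declare a `struct point p` and check the return value of `parse_point(input, &p)`. Anything that does not match the schema (a missing key, an unknown key, a string where a number is expected, trailing input) is rejected rather than represented, which is where the size and speed advantage over a generic parser would come from.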

@rohanrhu

rohanrhu commented Nov 8, 2022

It is nice just ~70 SLOC 🙀
