@catb0t
Created September 12, 2016 14:05
inconsist
canonical-data.json needs standardisation
Hello,
I maintain the Factor track, and I'd like to
automate generation of unit tests for exercises
in my language. Looking at
`exercises/leap/canonical-data.json`, that would seem
to be quite simple. However, many of the
`canonical-data.json` files don't use the set of
keys found in `leap`'s JSON, and this makes them
difficult to automate around.
There are, as far as I can tell, two solutions to
the problems introduced by the inconsistencies.
* Rather than hardcoding the `description`, `input`
and `expected` keys, use a regex / fuzzy find to
group keys into description, input and output.
The main disadvantages of this are twofold: not
only must my code be flimsy, but so must everyone
else's, and all of it is liable to break whenever
anyone adds or renames a key.
* Standardise on a fixed, predictable set of keys
and what their values represent. This makes the jobs
of track maintainers easier, simplifies the code that
reads the data, and future-proofs both the API and
that code.
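To make the trade-off between the two options concrete, here is a rough sketch in Python (not Factor, purely for illustration). The sample cases and the key names `phrase` and `result` are invented and not taken from any real `canonical-data.json`; the regex patterns are guesses of the kind a fuzzy matcher would need.

```python
import re

# Hypothetical test cases. Only the first uses the description/input/expected
# keys; "phrase" and "result" are invented stand-ins for inconsistent keys.
CASES = [
    {"description": "year divisible by 4", "input": 1996, "expected": True},
    {"description": "empty string", "phrase": "", "result": 0},
]

# Guessed patterns: every new spelling of a key means another tweak here.
ROLE_PATTERNS = {
    "description": re.compile(r"desc|name|title", re.I),
    "expected": re.compile(r"expect|output|result", re.I),
}


def group_fuzzy(case):
    """Sort keys into roles by pattern; unmatched keys are assumed to be input."""
    grouped = {"description": {}, "input": {}, "expected": {}}
    for key, value in case.items():
        for role, pattern in ROLE_PATTERNS.items():
            if pattern.search(key):
                grouped[role][key] = value
                break
        else:
            # No pattern matched, so this is only a guess.
            grouped["input"][key] = value
    return grouped


def read_fixed(case):
    """With standardised keys a plain lookup is enough; a missing key raises KeyError."""
    return case["description"], case["input"], case["expected"]
```

The fuzzy version quietly guesses, so a mislabelled key goes unnoticed until a generated test is wrong. The fixed version fails immediately on a file that does not follow the convention.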
I think standardisation would be greatly beneficial,
but before I open a pull request with structural
changes to hundreds of lines of data, I'd like some
feedback.
First, does anyone object to changing the names of
the keys? They're rather haphazard (nearly as if
they had been written for humans to read ): ) and some
exercises are missing `canonical-data.json` altogether.
Consequently, I have difficulty believing there are
programs reading this stuff. (If we make the API more
accessible, perhaps more tracks will automate
generation / regeneration of tests, which would be
positive.)
Second, what keys should be used? I'm thinking
something like:
* For exercises with one input translating to one
output, `description`, `input` and `output`.
* For exercises with multiple inputs / multiple
outputs, `description`, `input_N`, `output_N`.
Note that it would be disadvantageous to use an array
to hold multiple inputs / outputs, because in exercises
where an array is itself a single input or output it
would be hard or impossible to tell the two cases
apart.
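As a sketch of how a generator might read the proposed keys (again in Python for illustration), the hypothetical helper `split_case` below collects `input` / `output` and `input_N` / `output_N` into ordered lists. Numbering from 1 and the two sample cases are assumptions, not part of any existing file.

```python
import re

# Matches the proposed numbered keys, e.g. "input_2" or "output_1".
NUMBERED = re.compile(r"^(input|output)_(\d+)$")


def _ordered(values_by_index):
    """Return the collected values sorted by their numeric suffix."""
    return [values_by_index[i] for i in sorted(values_by_index)]


def split_case(case):
    """Return (description, inputs, outputs) for one case using the proposed keys."""
    inputs, outputs = {}, {}
    for key, value in case.items():
        if key == "input":
            inputs[1] = value
        elif key == "output":
            outputs[1] = value
        else:
            match = NUMBERED.match(key)
            if match:
                target = inputs if match.group(1) == "input" else outputs
                target[int(match.group(2))] = value
    return case["description"], _ordered(inputs), _ordered(outputs)


# Invented sample cases: one input that happens to be an array,
# and one exercise with two inputs and two outputs.
single = {"description": "sum of a list", "input": [1, 2, 3], "output": 6}
multi = {
    "description": "divide with remainder",
    "input_1": 7,
    "input_2": 2,
    "output_1": 3,
    "output_2": 1,
}
```

Here `split_case(single)` yields one input, the list `[1, 2, 3]`, while `split_case(multi)` yields the two inputs `7` and `2`. With a bare array for multiple inputs, those two shapes would look the same.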