@laughinghan
Last active April 15, 2016 21:27

Concise JSON schemas that resemble matching data

Possible name: "JSON Receptor", like how biochemical receptors resemble the molecules they match?

  • Basic example http://json-schema.org/examples.html

    {
      $title: 'Example Schema',
      firstName: 'string',
      lastName: 'string',
      age: 'integer >=0'
    }
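To make the constraint strings concrete, here is a minimal sketch of how a flat schema like the one above might be checked. The helper names `matchesScalar` and `matchesFlatObject` are hypothetical, only a single bound per constraint is handled, and treating every non-`$` key as required is an assumption.

```javascript
// Hypothetical helper: checks a value against a scalar constraint string
// such as 'string', 'integer >=0', or 'number >0'.
function matchesScalar(constraint, value) {
  const [type, bound] = constraint.split(' ');
  if (type === 'string') return typeof value === 'string';
  if (type !== 'integer' && type !== 'number') {
    throw new Error('unsupported type: ' + type);
  }
  if (typeof value !== 'number' || !Number.isFinite(value)) return false;
  if (type === 'integer' && !Number.isInteger(value)) return false;
  if (bound === undefined) return true;
  const m = /^(>=|<=|>|<)(-?\d+(?:\.\d+)?)$/.exec(bound);
  if (!m) throw new Error('unsupported bound: ' + bound);
  const limit = Number(m[2]);
  switch (m[1]) {
    case '>=': return value >= limit;
    case '<=': return value <= limit;
    case '>': return value > limit;
    default: return value < limit;
  }
}

// Hypothetical helper: every non-$ key must be present and match
// (assumption: $-prefixed keys such as $title are metadata).
function matchesFlatObject(schema, data) {
  return Object.keys(schema)
    .filter((key) => !key.startsWith('$'))
    .every((key) => key in data && matchesScalar(schema[key], data[key]));
}

const person = {
  $title: 'Example Schema',
  firstName: 'string',
  lastName: 'string',
  age: 'integer >=0'
};
console.log(matchesFlatObject(person, { firstName: 'Ada', lastName: 'Lovelace', age: 36 })); // true
console.log(matchesFlatObject(person, { firstName: 'Ada', lastName: 'Lovelace', age: -1 })); // false
```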
  • Simple example http://json-schema.org/example1.html

    {
      $title: 'Product',
      $description: 'A product from Acme\'s catalog',
      id: 'integer',
      name: 'string',
      price: 'number >0',
      $optional_tags: ['>=1 unique', 'string']
    }

    Set of products schema:

    [{
      id: 'integer',
      name: 'string',
      price: 'number >0',
      $optional_tags: ['>=1 unique', 'string'],
      $optional_dimensions: {
        length: 'number',
        width: 'number',
        height: 'number',
      },
      $optional_warehouseLocation: {
        $ref: 'http://json-schema.org/geo'
      }
    }]
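A rough sketch of how the `$optional_` prefix and the `['>=1 unique', 'string']` array form could be interpreted. Both helper names are hypothetical, and `matchesArray` assumes primitive items whose schema name equals their `typeof` result (true for `'string'`), so it is not a general item check.

```javascript
// Hypothetical helper: sorts the keys of an object schema into required keys,
// optional keys (with the $optional_ prefix stripped), and other $ directives.
function classifyKeys(schema) {
  const required = [];
  const optional = [];
  const directives = [];
  for (const key of Object.keys(schema)) {
    if (key.startsWith('$optional_')) {
      optional.push(key.slice('$optional_'.length));
    } else if (key.startsWith('$')) {
      directives.push(key);
    } else {
      required.push(key);
    }
  }
  return { required, optional, directives };
}

// Hypothetical helper for the two-element array form [prefix, itemType],
// where prefix looks like '>=1' or '>=1 unique'.
function matchesArray([prefix, itemType], value) {
  const m = /^>=(\d+)( unique)?$/.exec(prefix);
  if (!m) throw new Error('unsupported array prefix: ' + prefix);
  if (!Array.isArray(value) || value.length < Number(m[1])) return false;
  // Set-based uniqueness is only meaningful for primitive items.
  if (m[2] && new Set(value).size !== value.length) return false;
  return value.every((item) => typeof item === itemType);
}

const product = {
  id: 'integer',
  name: 'string',
  price: 'number >0',
  $optional_tags: ['>=1 unique', 'string']
};
console.log(classifyKeys(product));
console.log(matchesArray(product.$optional_tags, ['cold', 'ice'])); // true
console.log(matchesArray(product.$optional_tags, ['ice', 'ice']));  // false
```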
  • Advanced example http://json-schema.org/example2.html

    {
      storage: { $oneOf: [
        '$diskDevice',
        '$diskUUID',
        '$nfs',
        '$tmpfs'
      ]},
      $define_diskDevice: {},
      $define_diskUUID: {},
      $define_nfs: {},
      $define_tmpfs: {}
    }
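One possible reading of the `'$name'` strings above is that each refers to the sibling key `$define_name`. A small sketch under that assumption follows; `resolveRef` and `resolveAll` are hypothetical names, and lookup is limited to the same object.

```javascript
// Hypothetical helper: turns a '$name' reference into the schema stored under
// '$define_name' on the same object. Anything that is not a '$...' string is
// returned unchanged.
function resolveRef(schema, ref) {
  if (typeof ref !== 'string' || !ref.startsWith('$')) return ref;
  const key = '$define_' + ref.slice(1);
  if (!Object.prototype.hasOwnProperty.call(schema, key)) {
    throw new Error('undefined reference: ' + ref);
  }
  return schema[key];
}

// Hypothetical helper: resolves a list of alternatives in one pass.
function resolveAll(schema, refs) {
  return refs.map((ref) => resolveRef(schema, ref));
}

const definitions = {
  $define_diskDevice: {},
  $define_diskUUID: {},
  $define_nfs: {},
  $define_tmpfs: {}
};
const alternatives = resolveAll(definitions, ['$diskDevice', '$diskUUID', '$nfs', '$tmpfs']);
console.log(alternatives.length); // 4
```

Throwing on an unknown reference is a design choice of this sketch: a typo in a `$name` string fails loudly instead of silently matching nothing.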
  • MathSON "Math Symbolic Object Notation" e.g.

    [
      'x=',
      {
        numerator: ['-b±', { sqrt: ['b', { superscript: ['2'] }, '-4ac'] }],
        denominator: ['2a']
      }
    ]
    To skip 
    [
      2, // skip aka retain the 'x='
      {
        numerator: [2, { $delete: '±' }, '+']
      }
    ]
    
    {
      $define
    }
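The patch notation above is not fully specified, so the following is only a guess at its semantics for a single string: a number retains that many characters, `{ $delete: s }` removes `s`, a bare string is inserted, and any remaining text is retained. The function name `applyStringPatch` is hypothetical, and nested patches into objects are not handled.

```javascript
// Hypothetical helper: applies a list of retain/delete/insert operations to a
// string, under the assumed semantics described in the lead-in.
function applyStringPatch(text, ops) {
  let pos = 0;
  let out = '';
  for (const op of ops) {
    if (typeof op === 'number') {
      // Retain: copy the next `op` characters unchanged.
      if (pos + op > text.length) throw new Error('retain past end of text');
      out += text.slice(pos, pos + op);
      pos += op;
    } else if (typeof op === 'string') {
      // Insert: emit new text without consuming input.
      out += op;
    } else if (op !== null && typeof op === 'object' && typeof op.$delete === 'string') {
      // Delete: the text at the cursor must match what is being deleted.
      if (!text.startsWith(op.$delete, pos)) {
        throw new Error('text to delete not found at position ' + pos);
      }
      pos += op.$delete.length;
    } else {
      throw new Error('unsupported operation');
    }
  }
  // Anything after the last operation is retained implicitly.
  return out + text.slice(pos);
}

console.log(applyStringPatch('-b±', [2, { $delete: '±' }, '+'])); // -b+
```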
  • Self-description

    {
      $define_string: 'string /^string( length(>0|<?=1|>=?1|(=|[<>]=?){integer >=2})| {integer >=1}<=length<={integer >=2})?( \/{regex that matches regexes}\/[im])?$/',
      $define_number: 'string /^number( [<>]=?{float}| {float}<=?n<=?{float})?( multipleOf={float >0})?$/',
      $define_integer: 'string /^integer( [<>]=?{integer}| {integer}<=i<={integer})?( multipleOf={integer >=2})?$/',
      $define_array: { $oneOf: [
        { $enum: [[]] },
        ['length=1', '$schema'],
        {
          $define_array_prefix: { $allOf: [
            'string /^(length(>0|<?=1|>=?1|(=|[<>]=?){integer >=2})|{integer >=1}<=length<={integer >=2})( unique)?$/',
            { $not: { $enum: ['length=1 unique'] } }
          ] },
          $oneOf: [
            ['$array_prefix', '$schema'],
            ['$array_prefix', '$schema', {
              $raw_$moreItems: { $oneOf: [{ $enum: [false] }, '$array'] },
              $moreKeys: false
              TODO
            }]
          ]
        },
        ['length>1', '$schema'],
        ['length>0', '$schema', { $moreItems: [TODO] }],
        [{ $enum: ['unique'] }, { $moreItems: ['length>1', '$schema' TODO] }]
      ] },
      $define_object: {
        $optional_$moreKeys: { $enum: [false] }
        TODO
      },
      $define_schema: { $oneOf: [
        { $enum: ['boolean', 'any', null, 'date-time', 'email', 'hostname', 'ipv4', 'ipv6', 'uri'] },
        '$string',
        '$number',
        '$integer',
        '$array',
        '$object'
      ] },
      $oneOf: [ '$schema' ]
    }

    The only requirements not specified by this schema are:

    • the maximum of a range must be strictly greater than the minimum of a range
    • non-tuple arrays must not have { $moreItems: false }
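The first requirement is easy to check outside the schema. A minimal sketch for integer ranges only, with the hypothetical name `integerRangeIsValid`; constraints that are not ranges are treated as trivially valid.

```javascript
// Hypothetical helper: for a constraint of the form 'integer {min}<=i<={max}',
// checks that max is strictly greater than min.
function integerRangeIsValid(constraint) {
  const m = /^integer (-?\d+)<=i<=(-?\d+)/.exec(constraint);
  if (!m) return true; // not a range constraint, nothing to check
  return Number(m[2]) > Number(m[1]);
}

console.log(integerRangeIsValid('integer 1<=i<=10')); // true
console.log(integerRangeIsValid('integer 5<=i<=5'));  // false
console.log(integerRangeIsValid('integer >=0'));      // true
```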

    Note that there are only a handful of ways in which two distinct schemas could be equivalent:

    • exclusive vs inclusive min/max (for every exclusive min/max, there's an equivalent inclusive min/max)
      • (this could've been made more restrictive but it's just so much more natural to say "string length>0" or "integer >100" than "string length>=1" or "integer >=101")
    • float syntax (equivalent ways to represent a number: 10 vs 10.0 vs 1e1 vs 1e+1 vs 1e01)
      • (this could've been made more restrictive but wanted the same float syntax as JSON)
    • regexes (distinct but equivalent regexes (or regex equivalent to a built-in format))
    • "factoring out" schemas ($define_*, $extend_*, or factoring stuff out of each subschema of an $anyOf)
    • $anyOf & friends ($anyOf and $oneOf are often equivalent, and they and $enum or [] can often be switched around:
      • { $oneOf: [{ $enum: [1] }, { $enum: [2] }] } is equivalent to { $enum: [1, 2] }
      • { $oneOf: [['string', 'string'], ['string', 'string', 'string']] } is equivalent to ['2<=length<=3', 'string'])
    • 'boolean', null, and 'integer' are equivalent to { $enum: [true, false] }, { $enum: [null] }, and 'number multipleOf=1', respectively (and hence 'integer multipleOf=2' is equivalent to 'number multipleOf=2')
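Two of the equivalences above can be illustrated with a short sketch: rewriting an exclusive integer bound as the inclusive one, and showing that the listed float spellings denote the same number. The helper name `canonicalIntegerBound` is hypothetical, and it only handles a single bound with no `multipleOf` suffix.

```javascript
// Hypothetical helper: rewrites 'integer >N' as 'integer >=N+1' and
// 'integer <N' as 'integer <=N-1'. Other constraints are returned unchanged.
function canonicalIntegerBound(constraint) {
  const m = /^integer ([<>])(=?)(-?\d+)$/.exec(constraint);
  if (!m) return constraint;
  const [, op, eq, digits] = m;
  if (eq) return constraint; // already inclusive
  const n = Number(digits) + (op === '>' ? 1 : -1);
  return `integer ${op}=${n}`;
}

console.log(canonicalIntegerBound('integer >100')); // integer >=101
console.log(canonicalIntegerBound('integer <5'));   // integer <=4

// The float spellings listed above all parse to the same value.
const spellings = ['10', '10.0', '1e1', '1e+1', '1e01'];
console.log(spellings.every((s) => Number(s) === 10)); // true
```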

    Full regex versions:

      $define_string: 'string /^string( length(>0|<?=1|>=?1|(=|[<>]=?)([2-9]|[1-9]\d+))| [1-9]\d*<=length<=([2-9]|[1-9]\d+))?( \/(?![*+?])(?:[^\r\n\[/\\]|\\.|\[(?:[^\r\n\]\\]|\\.)*\])+\/)?$/',
      $define_number: 'string /^number( [<>]=?-?(0|[1-9]\d*)(\.\d+)?([eE][+-]?\d+)?| -?(0|[1-9]\d*)(\.\d+)?([eE][+-]?\d+)?<=?n<=?-?(0|[1-9]\d*)(\.\d+)?([eE][+-]?\d+)?)?( multipleOf=(0|[1-9]\d*)(\.\d+)?([eE][+-]?\d+)?)?$/',
      $define_integer: 'string /^integer( [<>]=?-?(0|[1-9]\d*)| -?(0|[1-9]\d*)<=i<=-?(0|[1-9]\d*))?( multipleOf=([2-9]|[1-9]\d+))?$/',
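As a quick sanity check, the integer pattern above can be written as a JavaScript regex literal and run against a few constraint strings. The variable name `integerConstraint` and the sample strings are illustrative only.

```javascript
// The pattern from $define_integer, written as a regex literal.
const integerConstraint =
  /^integer( [<>]=?-?(0|[1-9]\d*)| -?(0|[1-9]\d*)<=i<=-?(0|[1-9]\d*))?( multipleOf=([2-9]|[1-9]\d+))?$/;

const samples = [
  'integer',
  'integer >=0',
  'integer -5<=i<=5',
  'integer >100 multipleOf=10',
  'integer multipleOf=1', // rejected: multipleOf must be at least 2
  'integer >=01',         // rejected: no leading zeros
  'integer >= 0'          // rejected: no space inside the bound
];
for (const s of samples) {
  console.log(JSON.stringify(s), integerConstraint.test(s));
}
```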