Intro

The JSON Schema spec says nothing about how validation errors should be constructed (which is absolutely the right decision).

This means that each library author has to figure out how to construct informative errors individually.

This document is meant to sketch out some best practices.

Example

Schema

{
  "properties": {
    "foo": {
      "items": {
        "type": "string"
      }
    }
  }
}

Data

{
  "foo": [ null ]
}

Some Options

(Examples in Haskell)

1. The simplest possible

exampleError :: Bool
exampleError = False

This makes for totally unhelpful error messages.

2. Only reporting the bare minimum needed to derive anything else

Note: I think it's important that the errors returned by individual validators don't know about the schema-level errors they will eventually be used to produce. That way they can be reused in other schemas in the future without causing problems. In this case we achieve that by parameterizing PropertiesValFailure and ItemsValFailure with the schema-level error type (which here ends up being ValidationFailure).

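{-# LANGUAGE OverloadedStrings #-}

import Data.HashMap.Strict (HashMap)
import qualified Data.HashMap.Strict as HashMap
import Data.Text (Text)

-- | Declare the error type for the "properties" validator.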
--
-- It's a hashmap of properties keys to the errors that resulted from them.
--
-- It takes a type argument (`err`) because we want to be agnostic about
-- what errors the schema it's eventually used in can produce. That way
-- it can be used by later JSON Schema specifications as well as the current one.
data PropertiesValFailure err = PropertiesValFailure (HashMap Text err)

-- | Declare the error type for the "items" validator.
--
-- It's a hashmap of indexes to the errors that resulted from the data
-- at that index.
data ItemsValFailure err = ItemsValFailure (HashMap Int err)

-- | Declare the error type for the whole schema.
data ValidationFailure
    = InvalidProperties (PropertiesValFailure ValidationFailure)
    | InvalidItems (ItemsValFailure ValidationFailure)
    -- ^ In a real JSON Schema lib we'd have a lot more errors than this
    -- to handle (allOf, anyOf, etc.) but this is just an example.
    | LeafFailure
    -- ^ LeafFailure is what we use for "type", as well as any other validators
    -- that can't themselves contain errors (like "maximum", "minimum", etc.).
    -- Which validator failed (e.g. "type") can be derived from the starting
    -- schema and the rest of the error message (e.g.
    -- `InvalidProperties (PropertiesValFailure (HashMap.singleton "quux" LeafFailure))`
    -- would mean that the validator that caused the error lives in the
    -- schema object at the "properties/quux" key).

-- | Validation produces a list of `ValidationFailure`s, up to one for each top-level
-- validator.
exampleError :: [ValidationFailure]
exampleError =
    [ InvalidProperties (PropertiesValFailure (
          HashMap.singleton "foo" (
              InvalidItems (ItemsValFailure (HashMap.singleton 0 LeafFailure))
          )
      ))
    ]

This provides all the information we need to reconstruct what happened, which is good! But the messages are still hard to read at a glance. What other info should be included?
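To make "reconstruct what happened" concrete, here is a minimal sketch that walks the nested error and lists the data path of every leaf failure. The helper name failurePaths and the Path alias are hypothetical, and Data.Map with String keys is swapped in for HashMap and Text so the snippet runs with only the libraries that ship with GHC.

```haskell
import Data.List (intercalate)
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- Same shapes as the types above, but using Map/String instead of
-- HashMap/Text (an assumption made to keep this self-contained).
newtype PropertiesValFailure err = PropertiesValFailure (Map String err)
newtype ItemsValFailure err = ItemsValFailure (Map Int err)

data ValidationFailure
    = InvalidProperties (PropertiesValFailure ValidationFailure)
    | InvalidItems (ItemsValFailure ValidationFailure)
    | LeafFailure

-- | A location in the data, as path segments from the root down.
type Path = [String]

-- | Hypothetical helper: the path to every leaf failure in one error.
failurePaths :: ValidationFailure -> [Path]
failurePaths LeafFailure = [[]]
failurePaths (InvalidProperties (PropertiesValFailure m)) =
    [ key : rest | (key, err) <- Map.toList m, rest <- failurePaths err ]
failurePaths (InvalidItems (ItemsValFailure m)) =
    [ show ix : rest | (ix, err) <- Map.toList m, rest <- failurePaths err ]

exampleError :: [ValidationFailure]
exampleError =
    [ InvalidProperties (PropertiesValFailure (
          Map.singleton "foo" (
              InvalidItems (ItemsValFailure (Map.singleton 0 LeafFailure))
          )
      ))
    ]

examplePaths :: [Path]
examplePaths = concatMap failurePaths exampleError

-- Prints one slash-separated path per leaf failure; here: foo/0
main :: IO ()
main = mapM_ (putStrLn . intercalate "/") examplePaths
```

With a path like this in hand, a caller could look up the offending value in the data and the responsible subschema in the schema, which is exactly the extra derivation step that the minimal error type pushes onto its consumers.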

@handrews:

(First side note: I went through the code, fixed errors, changed names, and commented everything. It should be more self-explanatory now.)

Could you include the actual text output that the code is intended to produce?

This actually gets at one of my questions: I'm wondering whether it would be better to produce human-readable messages along with the original errors, or to produce them later. Regardless, in this case we aren't including them originally, so we would have to produce them later with a function of the type ValidationFailure -> String.

In this case the most naive version of this function would produce:

Validation failed:
  In the validator "properties" at the "foo" key:
    In the validator "items" at the 0 index:
      An error was found.
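A minimal sketch of such a naive renderer follows. The names renderLines, renderFailures, and indent are hypothetical, and Data.Map with String keys stands in for HashMap and Text so the snippet needs nothing beyond what ships with GHC. It renders the whole list of failures at once rather than a single ValidationFailure.

```haskell
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- Same shapes as the types above, but using Map/String instead of
-- HashMap/Text (an assumption made to keep this self-contained).
newtype PropertiesValFailure err = PropertiesValFailure (Map String err)
newtype ItemsValFailure err = ItemsValFailure (Map Int err)

data ValidationFailure
    = InvalidProperties (PropertiesValFailure ValidationFailure)
    | InvalidItems (ItemsValFailure ValidationFailure)
    | LeafFailure

-- | Indent every line by two spaces.
indent :: [String] -> [String]
indent = map ("  " ++)

-- | Render one failure as a list of lines, nesting children one level deeper.
renderLines :: ValidationFailure -> [String]
renderLines LeafFailure = ["An error was found."]
renderLines (InvalidProperties (PropertiesValFailure m)) =
    concat
        [ ("In the validator \"properties\" at the " ++ show key ++ " key:")
            : indent (renderLines err)
        | (key, err) <- Map.toList m
        ]
renderLines (InvalidItems (ItemsValFailure m)) =
    concat
        [ ("In the validator \"items\" at the " ++ show ix ++ " index:")
            : indent (renderLines err)
        | (ix, err) <- Map.toList m
        ]

-- | Render all top-level failures under a single header line.
renderFailures :: [ValidationFailure] -> String
renderFailures errs =
    unlines ("Validation failed:" : indent (concatMap renderLines errs))

exampleError :: [ValidationFailure]
exampleError =
    [ InvalidProperties (PropertiesValFailure (
          Map.singleton "foo" (
              InvalidItems (ItemsValFailure (Map.singleton 0 LeafFailure))
          )
      ))
    ]

main :: IO ()
main = putStr (renderFailures exampleError)
```

Running this against the example error reproduces the four lines shown above. Because the renderer only ever sees LeafFailure at the bottom, it has nothing more specific to say than "An error was found."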

A more sophisticated version could look at "foo/0" in both the schema and the data and replace "An error was found." with a specific message such as: type: "null" doesn't match type: "string".

I was going to ask another question here: should the main validation functions distinguish the leaf validators from each other (e.g. should they have TypeValFailure, MaximumValFailure, etc.) or not? But I'm becoming sure that they should, and that having one overall LeafFailure isn't the right answer. While it's true that distinct leaf errors include info that could be derived from the rest of the error, the process of deriving that info is an unnecessary additional complication when making nice error messages.
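As a rough sketch of what distinct leaf failures might look like: the record fields, the InvalidType and InvalidMaximum constructors, and the leafMessages helper are all assumptions made for illustration, and Data.Map with String keys stands in for HashMap and Text.

```haskell
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

newtype PropertiesValFailure err = PropertiesValFailure (Map String err)
newtype ItemsValFailure err = ItemsValFailure (Map Int err)

-- | Hypothetical payload for a failed "type" validator.
data TypeValFailure = TypeValFailure
    { expectedType :: String  -- taken from the schema
    , actualType   :: String  -- observed in the data
    }

-- | Hypothetical payload for a failed "maximum" validator.
data MaximumValFailure = MaximumValFailure
    { maximumAllowed :: Double
    , actualValue    :: Double
    }

-- | Schema-level error with one constructor per leaf validator
-- instead of a single LeafFailure.
data ValidationFailure
    = InvalidProperties (PropertiesValFailure ValidationFailure)
    | InvalidItems (ItemsValFailure ValidationFailure)
    | InvalidType TypeValFailure
    | InvalidMaximum MaximumValFailure

-- | Collect a message for every leaf, with no schema lookup needed.
leafMessages :: ValidationFailure -> [String]
leafMessages (InvalidProperties (PropertiesValFailure m)) =
    concatMap leafMessages (Map.elems m)
leafMessages (InvalidItems (ItemsValFailure m)) =
    concatMap leafMessages (Map.elems m)
leafMessages (InvalidType (TypeValFailure expected actual)) =
    ["type: " ++ show actual ++ " doesn't match type: " ++ show expected]
leafMessages (InvalidMaximum (MaximumValFailure limit actual)) =
    ["maximum: " ++ show actual ++ " is greater than " ++ show limit]

exampleError :: ValidationFailure
exampleError =
    InvalidProperties (PropertiesValFailure (
        Map.singleton "foo" (
            InvalidItems (ItemsValFailure (
                Map.singleton 0 (InvalidType (TypeValFailure "string" "null"))
            ))
        )
    ))

-- Prints: type: "null" doesn't match type: "string"
main :: IO ()
main = mapM_ putStrLn (leafMessages exampleError)
```

The trade-off is some redundancy with the schema in exchange for a renderer that never has to consult the schema or the data.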

seagreen/json_schema_failures.md