@jdolitsky
Last active April 12, 2022 18:14
OCI Reference Types WG Proposal F (by anonymous)

Proposal F

Allow for query of registry schemas.

Description

As the number of different kinds of references increases, it becomes more important to be able to query them intelligently. This proposal is based on Proposal B but could generally be applied to any JSON structure.

Links

Description | Link
GitHub issue where this was first proposed | View
Sample code of a working proof-of-concept | View
Blog post which proves the point | View

Modifications

JSON Schema

Given a new field, reference, as proposed in PROPOSAL_B:

{
  "mediaType": "icecream/scoops.vanilla.v1.json",
  "size": 2345,
  "digest": "sha256:b2b2b2...",
  "reference": { // <----- New field
    "mediaType": "icecream/cone.v1.json",
    "size": 1234,
    "digest": "sha256:a1a1a1..."
  }
}

We expose a new query language to filter a single listing, or to search across all references in the registry.

Registry HTTP API

We will first start with an example. Let's say we use the example in PROPOSAL_B:

GET /v2/<name>/manifests/<ref>/references
GET /v2/products/cones/manifests/neapolitan/references

generally yields:

{
  "manifests": [
    <descriptor1>,
    <descriptor2>,
    ...
  ]
}

or specifically yields:

{
  "manifests": [
    {
      "mediaType": "icecream/scoops.vanilla.v1.json",
      "size": 2345,
      "digest": "sha256:b2b2b2..."
    },
    {
      "mediaType": "icecream/scoops.chocolate.v1.json",
      "size": 2345,
      "digest": "sha256:c3c3c3..."
    },
    {
      "mediaType": "icecream/scoops.strawberry.v1.json",
      "size": 2345,
      "digest": "sha256:d4d4d4..."
    }
  ]
}

We could then filter this down with a query, which would be especially useful if the result is long.

GET /v2/products/cones/manifests/neapolitan/references?q_manifests__mediatype__contains=vanilla
{
  "manifests": [
    {
      "mediaType": "icecream/scoops.vanilla.v1.json",
      "size": 2345,
      "digest": "sha256:b2b2b2..."
    }
  ]
}

However, the query is more useful one level up, across the whole registry:

Find me all references of the type chocolate.

GET /v2/_oci/references?q_manifests__mediatype__contains=chocolate

That might be an angrier query, implementation-wise. A registry implementation would probably want to provide pagination, rate limiting to prevent abuse, and some kind of indexing. There will be no changes to the existing schema, as the query simply allows us to better filter the provided content. The attributes and structure of the current schema are the drivers of the query string.
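As a rough illustration of how a client might assemble the registry-wide query above, here is a minimal Python sketch. The function name `build_query_url` and the host `registry.example.com` are hypothetical; only the `/v2/_oci/references` path and the `q_` parameter shape come from this proposal.

```python
from urllib.parse import urlencode


def build_query_url(base: str, path: str, filters: dict) -> str:
    """Build a query URL from {(attr, ..., comparison): value} filters.

    Hypothetical helper: each key is a tuple such as
    ("manifests", "mediatype", "contains"), joined with "__" and
    prefixed with "q_" as described in this proposal.
    """
    params = {
        "q_" + "__".join(part.lower() for part in parts): str(value)
        for parts, value in filters.items()
    }
    return f"{base.rstrip('/')}{path}?{urlencode(params)}"


url = build_query_url(
    "https://registry.example.com",  # assumed host, not part of the proposal
    "/v2/_oci/references",
    {("manifests", "mediatype", "contains"): "chocolate"},
)
print(url)
```

Pagination and rate limiting are left out here because the proposal does not specify how they would be exposed.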

Query Format

The query format string always starts with q_ and is broken into the following pieces:

GET ?q_<attribute>__<nested-attribute>__<comparison>=vanilla
GET ?q_manifests__mediatype__contains=vanilla

This format mirrors the Django query API. All letters of the query parameter should be provided in lower case (and would be transformed if not), and all queries should be case insensitive. Comparison values can be strings or numeric (e.g., for a size or version). The following comparison values are supported:
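To make the key format concrete, the following Python sketch splits a query parameter into its attribute path and comparison. The names `parse_query_param` and `ParsedQuery` are hypothetical, and rejecting malformed keys with `ValueError` is an assumption rather than something this proposal specifies.

```python
from typing import NamedTuple

# Comparison names supported by this proposal.
COMPARISONS = {"contains", "notcontains", "eq", "ne", "gt", "lt"}


class ParsedQuery(NamedTuple):
    path: tuple      # attribute path, e.g. ("manifests", "mediatype")
    comparison: str  # e.g. "contains"
    value: str       # raw value from the query string


def parse_query_param(key: str, value: str) -> ParsedQuery:
    """Split q_<attribute>__<nested-attribute>__<comparison> into its pieces."""
    key = key.lower()  # keys are transformed to lower case if needed
    if not key.startswith("q_"):
        raise ValueError(f"not a query parameter: {key!r}")
    parts = key[2:].split("__")
    # Need at least one attribute plus a known comparison, and no empty pieces.
    if len(parts) < 2 or "" in parts or parts[-1] not in COMPARISONS:
        raise ValueError(f"malformed query parameter: {key!r}")
    return ParsedQuery(tuple(parts[:-1]), parts[-1], value)


parsed = parse_query_param("q_manifests__mediatype__contains", "vanilla")
print(parsed.path, parsed.comparison, parsed.value)
```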

name | description | example
contains | search for any match that includes a value | ?q_manifests__mediatype__contains=vanilla
notcontains | search for any match that doesn't include a value | ?q_manifests__mediatype__notcontains=vanilla
eq | search for an exact match (case insensitive) | ?q_manifests__mediatype__eq=chocolate
ne | search for values that aren't an exact match | ?q_manifests__mediatype__ne=strawberry
gt | search for values that are greater than the query | ?q_manifests__size__gt=400
lt | search for values that are less than the query | ?q_manifests__size__lt=600
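One way the comparisons in the table could be evaluated is sketched below in Python. The helper `compare` is hypothetical, and treating `gt` and `lt` as purely numeric is an assumption; the proposal also mentions versions, whose ordering it does not define.

```python
def compare(actual, comparison: str, query: str) -> bool:
    """Apply one comparison from the table to a single attribute value."""
    if comparison in ("gt", "lt"):
        # Assumed numeric: float() raises ValueError for non-numeric input.
        left, right = float(actual), float(query)
        return left > right if comparison == "gt" else left < right
    # String comparisons are case insensitive.
    left, right = str(actual).lower(), query.lower()
    if comparison == "contains":
        return right in left
    if comparison == "notcontains":
        return right not in left
    if comparison == "eq":
        return left == right
    if comparison == "ne":
        return left != right
    raise ValueError(f"unsupported comparison: {comparison!r}")


print(compare("icecream/scoops.vanilla.v1.json", "contains", "vanilla"))
print(compare(2345, "gt", "400"))
```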

Registry behavior

  • If a query is valid for none of the items in the data structure, an error response should be returned with a meaningful message.
  • If a query is valid for only some items, the rest can be ignored.
  • Thus an empty response (with 200 OK) means that the query ran successfully and found no matches.
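The three rules above could be sketched as follows for a `contains` query over a list of descriptors. This is a minimal Python sketch under several assumptions: "valid for an item" is taken to mean the item has the queried attribute, the error status is assumed to be 400, the error body shape is invented for illustration, and an empty input list is treated as a successful empty result. The function names are hypothetical.

```python
def lookup(item: dict, name: str):
    """Case-insensitive key lookup; returns (found, value)."""
    for key, value in item.items():
        if key.lower() == name.lower():
            return True, value
    return False, None


def run_contains_query(items: list, attribute: str, needle: str):
    """Return (status, body) for a 'contains' query over descriptors."""
    valid = []
    for item in items:
        found, value = lookup(item, attribute)
        if found:  # items lacking the attribute are ignored
            valid.append((item, value))
    if items and not valid:
        # Query is valid for no item at all: report an error (status assumed).
        return 400, {"error": f"attribute {attribute!r} not found on any item"}
    matches = [
        item for item, value in valid
        if needle.lower() in str(value).lower()
    ]
    return 200, {"manifests": matches}


descriptors = [
    {"mediaType": "icecream/scoops.vanilla.v1.json", "size": 2345},
    {"size": 10},  # no mediaType: ignored rather than rejected
]
print(run_contains_query(descriptors, "mediatype", "vanilla"))
```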

Requirements

The requirements stated for the working group are specific only to a digest or tag. This proposal extends filtering and search to any attribute in the JSON provided by the API, and allows for custom filtering. The closest matches are:

  • As a user, I want to query the registry for stored objects that reference a container image filtering by type (eg. Signature, SBOM, etc) or by annotation (I want to see all signatures from this identity)
  • As a tool writer, I would like to be able to efficiently query artifacts of different types attached to a given digest
  • As a tool writer, I would like to be able to query for a specific artifact based on artifact type and other user defined annotations on the artifact

These use cases are sort of related, if "most updated" can be derived from a search query.

  • As a user, I want to identify the most updated artifact in a registry
  • As a user, I would like to be able to map monotonically increasing product versions to container images so I have an idea of deployment progression