Allow for query of registry schemas.
As the number of different kinds of references increases, it becomes more important to be able to query them intelligently. This proposal is based on Proposal B but could generally be applied to any JSON structure.
Description | Link |
---|---|
GitHub issue where this was first proposed | View |
Sample code of a working proof-of-concept | View |
Blog post which proves the point | View |
Given a new field, reference, as proposed in PROPOSAL_B:
We expose a new query language to filter down a single listing, or across the references in the registry.
We will first start with an example. Let's say we use the example in PROPOSAL_B:
GET /v2/<name>/manifests/<ref>/references
GET /v2/products/cones/manifests/neapolitan/references
generally yields:
{
"manifests": [
<descriptor1>,
<descriptor2>,
...
]
}
or specifically yields:
{
"manifests": [
{
"mediaType": "icecream/scoops.vanilla.v1.json",
"size": 2345,
"digest": "sha256:b2b2b2..."
},
{
"mediaType": "icecream/scoops.chocolate.v1.json",
"size": 2345,
"digest": "sha256:c3c3c3..."
},
{
"mediaType": "icecream/scoops.strawberry.v1.json",
"size": 2345,
"digest": "sha256:d4d4d4..."
}
]
}
We could then filter this down with a query, which would be especially useful if the result is long.
GET /v2/products/cones/manifests/neapolitan/references?q_manifests__mediatype__contains=vanilla
{
"manifests": [
{
"mediaType": "icecream/scoops.vanilla.v1.json",
"size": 2345,
"digest": "sha256:b2b2b2..."
}
]
}
However, it is more useful to query one level up, across the whole registry:
Find me all references of the type chocolate.
GET /v2/_oci/references?q_manifests__mediatype__contains=chocolate
That might be a more demanding query, implementation-wise. The registry implementation would probably want to provide pagination and rate limiting to prevent abuse, along with some kind of indexing. There will be no changes to the existing schema, as the query simply allows us to better filter the provided content. The attributes and structure of the current schema are the drivers of the query string.
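The proposal does not specify how pagination would work. One possible shape, sketched here purely as an assumption, is cursor-style paging keyed on the digest, loosely modeled on the n and last parameters that the distribution spec already uses for tag listings. The function name paginate and its return shape are hypothetical.

```python
def paginate(descriptors, n, last=None):
    """Return (page, next_last): up to n descriptors ordered by digest.

    `last` is the digest of the final descriptor on the previous page;
    `next_last` is None when no further pages remain.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    ordered = sorted(descriptors, key=lambda d: d["digest"])
    if last is not None:
        # Resume strictly after the previous page's final digest.
        ordered = [d for d in ordered if d["digest"] > last]
    page = ordered[:n]
    next_last = page[-1]["digest"] if len(ordered) > n else None
    return page, next_last


# Illustrative digests only; real digests are full sha256 strings.
results = [
    {"digest": "sha256:d4"},
    {"digest": "sha256:b2"},
    {"digest": "sha256:c3"},
]
first_page, cursor = paginate(results, 2)
second_page, end = paginate(results, 2, last=cursor)
```

Keying the cursor on a stable sort order means a page boundary stays valid even if new references are added between requests, which an offset-based scheme would not guarantee.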
The query format string always starts with q_ and is broken into the following pieces:
GET ?q_<attribute>__<nested-attribute>__<comparison>=vanilla
GET ?q_manifests__mediatype__contains=vanilla
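As a minimal sketch of how a client might assemble such a parameter, the Python below joins lower-cased attribute names with double underscores and appends the comparison. The function name build_query_url and its arguments are hypothetical and not part of the proposal.

```python
from urllib.parse import urlencode


def build_query_url(endpoint, attributes, comparison, value):
    """Compose an endpoint URL carrying a single q_ filter parameter."""
    # q_<attribute>__<nested-attribute>__<comparison>, all lower case.
    parts = [a.lower() for a in attributes] + [comparison.lower()]
    key = "q_" + "__".join(parts)
    return endpoint + "?" + urlencode({key: value})


url = build_query_url(
    "/v2/products/cones/manifests/neapolitan/references",
    ["manifests", "mediaType"],
    "contains",
    "vanilla",
)
print(url)
```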
The format mirrors how the Django query API expresses field lookups. All letters of the query parameter should be provided in lower case (and will be transformed if not), and all queries should be case insensitive. Comparison values can be strings or numbers (e.g., for a size or version). The following comparison values are supported:
name | description | example |
---|---|---|
contains | search for any match that includes a value | ?q_manifests__mediatype__contains=vanilla |
notcontains | search for any match that doesn't include a value | ?q_manifests__mediatype__notcontains=vanilla |
eq | search for exact match (case insensitive) | ?q_manifests__mediatype__eq=chocolate |
ne | search for values that aren't of the exact match | ?q_manifests__mediatype__ne=strawberry |
gt | search for values that are greater than the query | ?q_manifests__size__gt=400 |
lt | search for values that are less than the query | ?q_manifests__size__lt=600 |
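The table above could be implemented on the server side roughly as follows. This is a sketch under assumptions rather than a reference implementation: the names COMPARISONS, parse_query_key, normalize, and matches are hypothetical, and numeric query values are assumed to be integers.

```python
import operator

# Comparison names from the table, mapped to two-argument predicates.
COMPARISONS = {
    "contains": lambda field, query: query in field,
    "notcontains": lambda field, query: query not in field,
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "lt": operator.lt,
}


def parse_query_key(key):
    """Split 'q_a__b__cmp' into (['a', 'b'], 'cmp'), lower-casing the key first."""
    key = key.lower()
    if not key.startswith("q_"):
        raise ValueError("query parameters must start with q_")
    *attributes, comparison = key[2:].split("__")
    if not attributes or comparison not in COMPARISONS:
        raise ValueError(f"unsupported query: {key}")
    return attributes, comparison


def normalize(value):
    """Parse integer-looking strings; lower-case other strings for case-insensitive matching."""
    if isinstance(value, str):
        try:
            return int(value)
        except ValueError:
            return value.lower()
    return value


def matches(field_value, comparison, query_value):
    """Apply one comparison; gt/lt on mismatched types raise TypeError."""
    field_value, query_value = normalize(field_value), normalize(query_value)
    if comparison in ("contains", "notcontains"):
        # Substring tests always operate on text.
        field_value, query_value = str(field_value), str(query_value)
    return COMPARISONS[comparison](field_value, query_value)


attributes, comparison = parse_query_key("q_manifests__mediatype__contains")
hit = matches("icecream/scoops.vanilla.v1.json", comparison, "Vanilla")
```

A caller could treat the TypeError raised by an ordering comparison between a string and a number as a sign that the query is not valid for that item.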
- If a query is not valid for any items in the data structure, an error response should be returned with a meaningful message.
- If a query is only valid for some items, the rest can be ignored.
- Thus an empty response (with 200 OK) means that the query ran successfully and found no matches.
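The three rules above can be sketched as a single filtering function. The names InvalidQuery, lookup, and filter_items are hypothetical, and treating a missing attribute or a TypeError from the predicate as "not valid for this item" is an assumption about what validity means here.

```python
class InvalidQuery(Exception):
    """Raised when a query cannot be applied to any item (hypothetical name)."""


def lookup(item, attribute):
    """Case-insensitive key lookup; returns (found, value)."""
    for key, value in item.items():
        if key.lower() == attribute:
            return True, value
    return False, None


def filter_items(items, attribute, predicate):
    """Keep the items whose attribute satisfies predicate.

    Items that lack the attribute, or for which the predicate raises
    TypeError, are skipped. If that happens for every item in a non-empty
    list, the query is rejected as invalid.
    """
    matched, applicable = [], 0
    for item in items:
        found, value = lookup(item, attribute)
        if not found:
            continue
        try:
            keep = predicate(value)
        except TypeError:
            continue
        applicable += 1
        if keep:
            matched.append(item)
    if items and applicable == 0:
        raise InvalidQuery(f"'{attribute}' cannot be queried on any item")
    return matched


manifests = [
    {"mediaType": "icecream/scoops.vanilla.v1.json", "size": 2345},
    {"mediaType": "icecream/scoops.chocolate.v1.json", "size": 2345},
]
vanilla = filter_items(manifests, "mediatype", lambda v: "vanilla" in v.lower())
too_big = filter_items(manifests, "size", lambda v: v > 5000)  # valid query, no matches
```

In an HTTP handler, InvalidQuery would presumably map to an error response with the exception message, while an empty list would be returned with 200 OK.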
The requirements stated for the working group are specific only to a digest or tag. This proposal extends filtering and search to any attribute in the JSON provided by the API, and allows for custom filtering. The closest matches are:
- As a user, I want to query the registry for stored objects that reference a container image filtering by type (eg. Signature, SBOM, etc) or by annotation (I want to see all signatures from this identity)
- As a tool writer, I would like to be able to efficiently query artifacts of different types attached to a given digest
- As a tool writer, I would like to be able to query for a specific artifact based on artifact type and other user defined annotations on the artifact
These use cases are loosely related, provided that "most updated" can be derived from a search query.
- As a user, I want to identify the most updated artifact in a registry
- As a user, I would like to be able to map monotonically increasing product versions to container images so I have an idea of deployment progression