This is a draft of a guide for migrating from Databroker v1.x to Databroker v2.x (currently in prerelease). The data storage does not change; only the way it is accessed changes, so it is possible to run Databroker 1.x and 2.x against the same MongoDB concurrently. Databroker 1.x was effectively a plugin to Intake. Databroker 2.x refactors Databroker as a plugin to Tiled, and drops any dependency on Intake.
Databroker 2.x supports backward-compatible* usage:
```python
from databroker import Broker

db = Broker.named("xyz")
```

as well as access via the Tiled Python API, illustrated below.
*Some methods in Databroker 1.x cannot be supported, but we find that the vast majority of user code runs unchanged.
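As an illustration, common v1.x access patterns like the following generally continue to work. This is a sketch, not a verified example: it assumes `databroker` is installed and that a catalog named "xyz" exists and contains runs, so the import is done lazily inside the function.

```python
def last_run_table(catalog_name="xyz"):
    """Sketch of Databroker 1.x-style usage that still works under 2.x.

    Assumes `databroker` is installed and `catalog_name` refers to a
    configured catalog backed by a live MongoDB.
    """
    from databroker import Broker  # lazy import: needs databroker installed

    db = Broker.named(catalog_name)
    header = db[-1]        # most recent run, v1-style integer indexing
    return header.table()  # primary stream as a pandas DataFrame
```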
By reimagining Databroker as a service we get the following advantages:
- It is possible to access Databroker data from any language, not just Python, via HTTP.
- It is possible to enforce granular access controls.
- Databroker data can be transcoded into many formats.
- Databroker data can be served alongside data, such as analysis results, that does not lend itself to Bluesky's event-based data model.
- For many workloads, it is much faster.
To install the prerelease:

```shell
pip install --upgrade --pre databroker[all]
```
We start with direct (in-process) mode. In this mode, the "server" and the client run in the same process, and data is passed between them via ordinary Python function calls. There is no actual networking. This is useful for debugging.
```yaml
# ~/.config/tiled/profiles/test.yml
xyz:
  direct:
    authentication:
      allow_anonymous_access: true
    trees:
      - tree: databroker.mongo_normalized:Tree.from_uri
        path: /
        args:
          uri: mongodb://{hostname}:{port}/{database}
          asset_registry_uri: mongodb://{hostname}:{port}/{database}  # may be omitted if it's the same as uri above
```

New API (Tiled):

```python
from tiled.client import from_profile

c = from_profile("xyz")
```

Backward-compatible API:
```python
from databroker import Broker

db = Broker.named("xyz")
```

At this stage, you should see the speed benefits.
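With a working profile, Tiled-style access looks roughly like the following sketch. It assumes `tiled` and `databroker` are installed and the catalog holds Bluesky runs; the stream name "primary" and the lazy import are illustrative, not prescriptive.

```python
def read_primary_data(profile_name="xyz"):
    """Sketch: read the most recent run's primary stream via the Tiled client.

    Requires a configured profile backed by a live MongoDB, so the
    import is done lazily inside the function.
    """
    from tiled.client import from_profile  # lazy import: needs tiled installed

    c = from_profile(profile_name)
    run = c[-1]  # most recent run, mirroring Databroker 1.x indexing
    return run["primary"]["data"].read()  # primary stream as an xarray Dataset
```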
All versions of Bluesky/Ophyd have captured the shape and dtype of external (e.g. Area Detector) data. Until now, however, nothing actually relied on that information being correct. As a result, we have only recently discovered and addressed bugs where the wrong shape or dtype was being recorded. If you encounter errors like `BadShapeMetadata`, this is why. Fortunately, there is an automated way to fix this: we have a script that opens each data set, inspects the actual shape, and updates the relevant document(s) in MongoDB to match. It's not quite ready for sharing, but it can be made ready soon.
To get proper security and access control, we need to run a real server. The configuration is similar: everything that was under `direct:` above now goes at the top level.
```yaml
# config.yml
authentication:
  allow_anonymous_access: true
trees:
  - tree: databroker.mongo_normalized:Tree.from_uri
    path: /
    args:
      uri: mongodb://{hostname}:{port}/{database}
      asset_registry_uri: mongodb://{hostname}:{port}/{database}  # may be omitted if it's the same as uri above
```

The config.yml can be placed anywhere. We pass its location to the command below to start the server:

```shell
tiled serve config config.yml
```
And we can connect to it like:
```python
from tiled.client import from_uri

c = from_uri("http://localhost:8000/api")
```
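Before pointing clients at the server, a quick reachability check can save confusion. This helper uses only the Python standard library; the URL default is an assumption based on the address used above.

```python
from urllib.request import urlopen
from urllib.error import URLError


def server_is_up(url="http://localhost:8000/api", timeout=2.0):
    """Return True if an HTTP server responds successfully at `url`.

    Any connection failure or HTTP error status yields False.
    """
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status < 500
    except (URLError, OSError):
        return False
```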
We can update our profile:
```yaml
# ~/.config/tiled/profiles/test.yml
xyz:
  uri: http://localhost:8000/api
```

and now the same client-side usage as before will connect to the actual server instead of running one directly in-process. Exactly as before:

```python
from tiled.client import from_profile

c = from_profile("xyz")
```

Backward-compatible API:

```python
from databroker import Broker

db = Broker.named("xyz")
```