Storage API

The storage API consists of two separate sets of interfaces:

  • Storage Implementation Interface: the main interface to implement in order to create a custom storage backend.
  • Storage Adapter Interface: the interface that upper layers, such as end user applications, use to access the storage backend.

3 supported storage types

  1. Storage backend fully handles all meta data (CERN EOS)
  2. Storage backend provides meta data which is cached (Local Storage, most external)
  3. Storage backend doesn't provide any meta data, only blob storage (Object Storage like S3)

Storage Implementation Interfaces

The responsibilities of this interface are split over 5 interfaces to allow reuse across storage implementations.

WARNING: the resource object SHOULD NOT be the resource obtained when doing a local open. It SHOULD be an abstract class that represents the operations that can be performed on a resource independently of the storage backend. This is needed to avoid passing storage implementation objects to upper layers through the storage interface, as is currently the case with OC.

Storage\Data

Responsible for storing file data; has no knowledge of any metadata besides the file path

  • readStream(string $path): resource
  • writeStream(string $path, resource $data)
  • delete(string $path)
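A minimal PHP sketch of what this interface could look like; the namespace and docblocks are assumptions, only the method names come from the list above (PHP has no resource type declaration, so streams are only documented):

```php
<?php
namespace Storage;

// Hypothetical sketch of Storage\Data: blob access only, no metadata.
interface Data {
    /**
     * Open a read stream for the file at $path.
     * @return resource
     */
    public function readStream($path);

    /**
     * Write the contents of the stream $data to $path.
     * @param resource $data
     */
    public function writeStream($path, $data);

    /**
     * Delete the file at $path.
     */
    public function delete($path);
}
```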

Storage\Tree

Handles directories and file tree operations (list content, rename)

  • exists(string $path): bool
  • newFolder(string $path)
  • deleteFolder(string $path)
  • listFolderContents(string $path)
  • move(string $source, string $target)
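A corresponding sketch for the tree interface; the return shape of listFolderContents is not specified above, so the string[] shown here is an assumption:

```php
<?php
namespace Storage;

// Hypothetical sketch of Storage\Tree: directory handling and tree operations.
interface Tree {
    /** @return bool */
    public function exists($path);

    public function newFolder($path);

    public function deleteFolder($path);

    /** @return string[] paths of the entries inside $path (assumed shape) */
    public function listFolderContents($path);

    /** Rename/move $source to $target. */
    public function move($source, $target);
}
```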

Storage\MetaRead

Provides read access to metadata

  • getMeta(string $path): MetaData
  • getMetaById(string $id): MetaData
  • getFolderContentsMeta(string $path): MetaData[]
  • getFolderContentsMetaById(string $id): MetaData[]
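The MetaData class is referenced but never defined in this document; the fields in the sketch below (size, mtime, etag, ...) and the convention of returning null for unknown entries are assumptions chosen to make the later examples concrete:

```php
<?php
namespace Storage;

// Hypothetical minimal MetaData value object; the real field list is not
// specified in this document.
class MetaData {
    public $id;
    public $path;
    public $size;
    public $mtime;
    public $etag;
    public $mimetype;
}

// Sketch of Storage\MetaRead, following the method list above.
interface MetaRead {
    /** @return MetaData|null */
    public function getMeta($path);

    /** @return MetaData|null */
    public function getMetaById($id);

    /** @return MetaData[] */
    public function getFolderContentsMeta($path);

    /** @return MetaData[] */
    public function getFolderContentsMetaById($id);
}
```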

Storage\MetaWrite

Provides write access to metadata

  • setMeta(string $path, array $data)
  • move(string $source, string $target)
  • remove(string $path)
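And a matching write-side sketch; the shape of the $data array is an assumption:

```php
<?php
namespace Storage;

// Sketch of Storage\MetaWrite, mirroring the list above.
interface MetaWrite {
    /** Set (partial) metadata for $path, e.g. array('mtime' => 1446476820, 'etag' => 'abc'). */
    public function setMeta($path, array $data);

    /** Move the metadata entry from $source to $target. */
    public function move($source, $target);

    /** Remove the metadata entry for $path. */
    public function remove($path);
}
```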

Storage\MetaTree

  • getParentsById(int $id): MetaData[] WHY IS THIS NEEDED ? IT LOOKS LIKE A SQL METADATA STORE IMPLEMENTATION DETAIL
  • traverse(string $path): Traversable<MetaData>
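For completeness, a sketch of the tree-level metadata interface; PHP cannot express Traversable&lt;MetaData&gt;, so the element type is only documented:

```php
<?php
namespace Storage;

// Sketch of Storage\MetaTree.
interface MetaTree {
    /** @return MetaData[] the parent entries of the node with this id */
    public function getParentsById($id);

    /** @return \Traversable traversal over the MetaData entries below $path */
    public function traverse($path);
}
```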

Storage Adapter

The storage adapter takes care of hiding the differences between storage implementation types from the user of the storage interface.

The adapter takes one or more classes which implement the various implementation interfaces.

Different implementations of the storage adapter can be used to add functionality such as metadata caching or spreading data over multiple data stores.
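As a rough sketch of that composition (class layout, property names and constructor order are assumptions, only the interface names come from the sections above), an adapter could be wired together through constructor injection:

```php
<?php
namespace Adapter;

use Storage\Data;
use Storage\Tree;
use Storage\MetaRead;
use Storage\MetaTree;

// Hypothetical sketch: the adapter receives the implementation interfaces it
// needs and delegates each call to the right one.
class Adapter {
    protected $data;
    protected $tree;
    protected $metaRead;
    protected $metaTree;

    public function __construct(Data $data, Tree $tree, MetaRead $metaRead, MetaTree $metaTree) {
        $this->data = $data;
        $this->tree = $tree;
        $this->metaRead = $metaRead;
        $this->metaTree = $metaTree;
    }

    // Every public method simply delegates to the matching implementation
    // instance, for example:
    public function readStream($path) {
        return $this->data->readStream($path);
    }

    public function getMeta($path) {
        return $this->metaRead->getMeta($path);
    }
}
```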

Adapter\Adapter

Adapter for storage implementations which fully manage their own metadata

Requires one Data, Tree, MetaRead and a MetaTree instance

  • readStream(string $path): resource
  • writeStream(string $path, resource $data): resource
  • newFolder(string $path)
  • delete(string $path)
  • rename(string $source, string $target)
  • exists(string $path): bool
  • getMeta(string $path): MetaData
  • getMetaById(int $id): MetaData
  • getFolderContents(string $path): MetaData[]
  • getFolderContentsById(string $id): MetaData[]
  • listParentsById(int $id): int[] WHY IS THIS NEEDED ?
  • traverse(string $path): Traversable<MetaData>
  • getParentsById(int $id): MetaData[] WHY IS THIS NEEDED ?
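A hypothetical view from the consumer side: upper layers only talk to the adapter methods listed above and never see the concrete backend. The helper function, the paths and the MetaData fields used here are assumptions:

```php
<?php
use Adapter\Adapter;

// Hypothetical upper-layer code working purely against the adapter.
function storeReport(Adapter $storage, $localFile) {
    if (!$storage->exists('documents')) {
        $storage->newFolder('documents');
    }

    // Upload the local file as a stream.
    $storage->writeStream('documents/report.txt', fopen($localFile, 'r'));

    // Read back the metadata (field names are assumptions, see MetaData above).
    $meta = $storage->getMeta('documents/report.txt');
    echo $meta->size . ' bytes, etag ' . $meta->etag . "\n";

    // List the folder through the adapter.
    foreach ($storage->getFolderContents('documents') as $entry) {
        echo $entry->path . "\n";
    }
}
```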

Adapter\UpdateMeta extends Adapter\Adapter

Adapter for storage implementations where the metadata needs to be updated manually after write operations

Requires one Data, Tree, MetaRead, MetaWrite and a MetaTree instance
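A sketch of what "updating the metadata manually" could look like, building on the Adapter sketch earlier; the fields written (mtime, etag) and the ETag generation are assumptions:

```php
<?php
namespace Adapter;

use Storage\Data;
use Storage\Tree;
use Storage\MetaRead;
use Storage\MetaTree;
use Storage\MetaWrite;

// Hypothetical sketch of Adapter\UpdateMeta: writes go to the backend first,
// then the metadata store is updated by hand.
class UpdateMeta extends Adapter {
    /** @var MetaWrite */
    protected $metaWrite;

    public function __construct(Data $data, Tree $tree, MetaRead $metaRead,
                                MetaTree $metaTree, MetaWrite $metaWrite) {
        parent::__construct($data, $tree, $metaRead, $metaTree);
        $this->metaWrite = $metaWrite;
    }

    public function writeStream($path, $data) {
        $result = $this->data->writeStream($path, $data);

        // The backend does not update metadata itself, so record the change here.
        $this->metaWrite->setMeta($path, array(
            'mtime' => time(),
            'etag'  => uniqid('', true),
        ));

        return $result;
    }

    public function delete($path) {
        $this->data->delete($path);
        $this->metaWrite->remove($path);
    }
}
```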

Adapter\Caching extends Adapter\UpdateMeta

Adapter for storage implementations where metadata should be cached.

Requires two MetaRead and two Tree instances, plus one Data, one MetaTree and one MetaWrite instance
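A sketch of the caching behaviour, under the assumption that the cached MetaRead returns null on a miss; constructor order and property names are again assumptions:

```php
<?php
namespace Adapter;

use Storage\Data;
use Storage\Tree;
use Storage\MetaRead;
use Storage\MetaTree;
use Storage\MetaWrite;

// Hypothetical sketch of Adapter\Caching: metadata is served from the cached
// MetaRead (e.g. DBCache) and primed from the backend MetaRead on a miss.
class Caching extends UpdateMeta {
    /** @var Tree the backend tree (e.g. Local) */
    protected $backendTree;
    /** @var MetaRead metadata read directly from the backend */
    protected $backendMeta;

    public function __construct(Data $data, Tree $cachedTree, MetaRead $cachedMeta,
                                MetaTree $metaTree, MetaWrite $metaWrite,
                                Tree $backendTree, MetaRead $backendMeta) {
        parent::__construct($data, $cachedTree, $cachedMeta, $metaTree, $metaWrite);
        $this->backendTree = $backendTree;
        $this->backendMeta = $backendMeta;
    }

    public function getMeta($path) {
        // Try the cache first ($this->metaRead is the cached MetaRead).
        $cached = $this->metaRead->getMeta($path);
        if ($cached !== null) {
            return $cached;
        }

        // Cache miss: ask the backend and store the result for next time.
        $meta = $this->backendMeta->getMeta($path);
        $this->metaWrite->setMeta($path, (array) $meta);
        return $meta;
    }
}
```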

Adapter\FullMeta extends Adapter\Adapter

Adapter for storage implementations where some metadata (ETag, mtime) needs to be managed manually

Requires one Data, Tree, MetaRead and a MetaWrite instance

Scanner

Takes one Tree and MetaRead instance as the source and synchronizes them with a MetaRead and MetaWrite instance
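A sketch of such a scanner, assuming that listFolderContents returns child paths, that the target getMeta returns null for unknown entries, that changes are detected via mtime and that folders are recognised by a directory mimetype; none of this is specified above:

```php
<?php
namespace Storage;

// Hypothetical scanner: walks the source tree and copies any metadata that is
// missing or out of date into the target metadata store (e.g. a DBCache).
class Scanner {
    private $sourceTree;
    private $sourceMeta;
    private $targetMeta;
    private $targetMetaWrite;

    public function __construct(Tree $sourceTree, MetaRead $sourceMeta,
                                MetaRead $targetMeta, MetaWrite $targetMetaWrite) {
        $this->sourceTree = $sourceTree;
        $this->sourceMeta = $sourceMeta;
        $this->targetMeta = $targetMeta;
        $this->targetMetaWrite = $targetMetaWrite;
    }

    public function scan($path) {
        foreach ($this->sourceTree->listFolderContents($path) as $childPath) {
            $sourceMeta = $this->sourceMeta->getMeta($childPath);
            $cachedMeta = $this->targetMeta->getMeta($childPath);

            // Only touch the target store when the entry is new or has changed.
            if ($cachedMeta === null || $cachedMeta->mtime !== $sourceMeta->mtime) {
                $this->targetMetaWrite->setMeta($childPath, (array) $sourceMeta);
            }

            // Recurse into subfolders (assumed directory mimetype marker).
            if ($sourceMeta->mimetype === 'httpd/unix-directory') {
                $this->scan($childPath);
            }
        }
    }
}
```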

Example cases

Local (and most external storages)

Local implements Data, Tree and MetaRead, where all metadata is read from the underlying filesystem.

DBCache implements Tree, MetaRead, MetaWrite and MetaTree, with all data stored in the database.

Adapter\Caching reads its metadata from the DBCache, updates the DBCache when needed and reads and writes the files from Local.
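Hypothetical wiring for this case, matching the Caching sketch above; the Local and DBCache class names come from the text, their constructor arguments do not:

```php
<?php
// $databaseConnection is assumed to be an existing DB handle.
$local = new Storage\Local('/srv/owncloud/data');
$cache = new Storage\DBCache($databaseConnection);

$storage = new Adapter\Caching(
    $local,          // Data: file contents live on disk
    $cache, $cache,  // cached Tree and MetaRead (DBCache)
    $cache, $cache,  // MetaTree and MetaWrite (DBCache)
    $local, $local   // backend Tree and MetaRead (Local), used to fill the cache
);
```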

ObjectStore

ObjectStore implements Data and only reads and writes blobs from the object store.

DBCache implements Tree, MetaRead, MetaWrite and MetaTree, with all data stored in the database.

Adapter\FullMeta handles maintaining all metadata in the DBCache

EOS (CERN's storage implementation with full metadata)

EOS implements Data, Tree, MetaRead, MetaWrite and MetaTree and handles all metadata operations itself

Adapter\Adapter only has to pass all operations down to EOS.
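Hypothetical wiring for the EOS case: since one object implements all five interfaces, the same instance fills every role and the adapter just delegates (the EOS class constructor shown here is an assumption):

```php
<?php
// One EOS instance serves as Data, Tree, MetaRead and MetaTree at once.
$eos = new Storage\EOS(array('mgm_url' => 'root://eos.example.cern.ch'));

$storage = new Adapter\Adapter($eos, $eos, $eos, $eos);

// Every call goes straight through to EOS:
$meta = $storage->getMeta('/eos/user/l/labkode/notes.txt');
```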

@icewind1991

@labkode what do you mean with

This is needed to avoid passing storage implementation objects to upper layers through the storage interface, as is currently the case with OC.
