@saswata-dutta
Last active July 17, 2023 11:33
Create a file system in DynamoDB

Index:

  1. prefer a ulid for every file and folder in the index: this avoids rewriting entries in case of a rename
  2. generate the s3 key independently of the file and folder index, so that a move doesn't require s3 moves
  3. limit folder depth to 3 or 5 in both the ui and the back-end: to avoid bulky folder-level operations

DDb Schema:

  • pk: acc-id
  • sk: parentPath + "___" + ulid (the ULID suffix makes each row unique; ULID strings sort by creation time)
  • root is just '/'; there is no row for root.
  • path separator is '/'
  • parentPath is the chain of folder ulids from root down to the containing folder, e.g. /ulid6/ulid8
  • s3 keys can be flat within acc-id: the s3 key is a uuid -> store it in the db row
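
The key layout above can be sketched with a few helpers. This is only an illustration of the schema as described; the function names (`make_sk`, `split_sk`, `child_path`) are hypothetical and not part of the original notes. It relies on ULIDs never containing an underscore, so "___" is unambiguous as a separator.

```python
SEP = "___"  # separator between parentPath and the item's ULID, as in the schema


def make_sk(parent_path: str, item_id: str) -> str:
    """Build the sort key for an item stored directly under parent_path."""
    return f"{parent_path}{SEP}{item_id}"


def split_sk(sk: str) -> tuple[str, str]:
    """Split a sort key back into (parentPath, ulid)."""
    parent_path, _, item_id = sk.rpartition(SEP)
    return parent_path, item_id


def child_path(parent_path: str, folder_id: str) -> str:
    """parentPath used by the children of a folder that lives under parent_path.

    Root is '/', so children of a root-level folder use '/<ulid>', not '//<ulid>'.
    """
    return f"/{folder_id}" if parent_path == "/" else f"{parent_path}/{folder_id}"
```

For example, a file with id ulid9 inside folder3 (ulid8), which sits inside folder2 (ulid6), gets the key `make_sk(child_path(child_path("/", "ulid6"), "ulid8"), "ulid9")`.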
sample:
/
    file1
    file2
    folder1
        file3
        file4
    folder2
        file5
        folder3
            file6
Ddb entries: (not showing other metadata like created, author, modified etc.)
pk:a123, sk:/___ulid1 => {type: file, name: "file1", s3: "url", tags: [t1, t2, t3 up to 5 tags], description: "bla bla 10 words"}
pk:a123, sk:/___ulid2 => {type: file, name: "file2", s3: "url", tags: [t1, t2, t3 up to 5 tags], description: "bla bla 10 words"}

pk:a123, sk:/___ulid3 => {type: folder, name: "folder1"}
pk:a123, sk:/ulid3___ulid4 => {type: file, name: "file3", s3: "url", tags: [t1, t2, t3 up to 5 tags], description: "bla bla 10 words"}
pk:a123, sk:/ulid3___ulid5 => {type: file, name: "file4", s3: "url", tags: [t1, t2, t3 up to 5 tags], description: "bla bla 10 words"}

pk:a123, sk:/___ulid6 => {type: folder, name: "folder2"}
pk:a123, sk:/ulid6___ulid7 => {type: file, name: "file5", s3: "url", tags: [t1, t2, t3 up to 5 tags], description: "bla bla 10 words"}

pk:a123, sk:/ulid6___ulid8 => {type: folder, name: "folder3"}
pk:a123, sk:/ulid6/ulid8___ulid9 => {type: file, name: "file6", s3: "url", tags: [t1, t2, t3 up to 5 tags], description: "bla bla 10 words"}
Queries:

ls dir: ddb.query(pk=accId, sk=begins_with(parentPath + "___"))
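
As a rough sketch, the listing query could be expressed as the request parameters for the low-level DynamoDB `Query` call. The helper name `ls_query_kwargs` and the table name are assumptions made for illustration; the attribute names `pk` and `sk` come from the schema above.

```python
def ls_query_kwargs(table: str, acc_id: str, parent_path: str) -> dict:
    """Request parameters listing the direct children of parent_path.

    The prefix ends with the '___' separator, so rows of nested folders
    (whose parentPath continues with '/') are not matched.
    """
    return {
        "TableName": table,
        "KeyConditionExpression": "#pk = :pk AND begins_with(#sk, :prefix)",
        "ExpressionAttributeNames": {"#pk": "pk", "#sk": "sk"},
        "ExpressionAttributeValues": {
            ":pk": {"S": acc_id},
            ":prefix": {"S": parent_path + "___"},
        },
    }


# Possible usage with boto3 (not imported here):
#   boto3.client("dynamodb").query(**ls_query_kwargs("fs", "a123", "/ulid6"))
```

A large folder may come back in several pages, so a caller would likely need to keep querying with `ExclusiveStartKey` while the response contains `LastEvaluatedKey`.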

create: ddb.updateItem(pk=accId, sk=parentPath + "___" + ulid)

edit details: ddb.updateItem(pk=accId, sk=exactKey_known_from_ui)

del file: ddb.deleteItem(pk=accId, sk=exactKey_known_from_ui)

  • then s3 del too

move file: create in dest folder then delete in current folder

  • no need to move in s3
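
One way to make the create-then-delete pair atomic is to send both writes in a single `TransactWriteItems` request. The notes above do not ask for a transaction, so treat this as an optional variant; the helper name `move_file_items` and the table name are hypothetical. The file keeps its ulid, and only the parentPath part of the sort key changes.

```python
def move_file_items(table: str, acc_id: str, old_sk: str,
                    dest_parent_path: str, attrs: dict) -> list:
    """Build the TransactItems list that moves one file row to dest_parent_path.

    attrs holds the row's non-key attributes in DynamoDB's low-level format,
    e.g. {"name": {"S": "file3"}, "s3": {"S": "url"}}.
    """
    item_id = old_sk.rpartition("___")[2]          # the ulid stays the same
    new_sk = f"{dest_parent_path}___{item_id}"
    new_item = {**attrs, "pk": {"S": acc_id}, "sk": {"S": new_sk}}
    return [
        {"Put": {
            "TableName": table,
            "Item": new_item,
            # refuse to overwrite an existing row at the destination key
            "ConditionExpression": "attribute_not_exists(sk)",
        }},
        {"Delete": {
            "TableName": table,
            "Key": {"pk": {"S": acc_id}, "sk": {"S": old_sk}},
            # fail the whole move if the source row is already gone
            "ConditionExpression": "attribute_exists(sk)",
        }},
    ]


# Possible usage with boto3 (not imported here):
#   client.transact_write_items(TransactItems=move_file_items(...))
```

The s3 attribute is copied unchanged, which is why no object has to move in s3.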

upload new file version: upload to s3 in same location

  • (s3 bucket configured to have say 3~5 versions)

download file: ddb.getItem(pk=accId, sk=exactKey_known_from_ui) => get s3 url

  • prefer sending s3 url and let it download from browser

move/del folder: costly, as every child entry needs to be rewritten or deleted

  • do a recursive ls of the folder and operate asynchronously
  • limit folder depth to say 3~5
  • only eventually consistent: a file created in the old path during a move can be missed -> can prevent by maintaining a folder lock entry in db
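
The key rewriting behind a folder move can be sketched in plain Python over a list of sort keys, assuming the recursive ls has already collected them. The function names are hypothetical. In DynamoDB the descendants could presumably be fetched with two prefix queries: `begins_with(sk, oldPath + "___")` for direct children and `begins_with(sk, oldPath + "/")` for deeper levels.

```python
SEP = "___"


def child_path(parent_path: str, folder_id: str) -> str:
    """parentPath used by the children of a folder that lives under parent_path."""
    return f"/{folder_id}" if parent_path == "/" else f"{parent_path}/{folder_id}"


def rekey_folder_move(sort_keys: list, folder_sk: str, dest_parent_path: str) -> dict:
    """Map old sort key -> new sort key for a folder row and all its descendants."""
    old_parent, _, folder_id = folder_sk.rpartition(SEP)
    old_path = child_path(old_parent, folder_id)        # parentPath of current children
    new_path = child_path(dest_parent_path, folder_id)  # parentPath after the move
    mapping = {folder_sk: f"{dest_parent_path}{SEP}{folder_id}"}
    for sk in sort_keys:
        parent, _, item_id = sk.rpartition(SEP)
        # direct children match exactly; deeper descendants continue with '/'
        if parent == old_path or parent.startswith(old_path + "/"):
            suffix = parent[len(old_path):]
            mapping[sk] = f"{new_path}{suffix}{SEP}{item_id}"
    return mapping
```

Each pair in the returned mapping then becomes one create-and-delete, the same as a single file move. For a folder delete, the keys of the mapping are the rows to remove.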