informal spec for shitty datalog db implemented on a filesystem:

* a directory represents a table
* a file in the directory represents a row
* the filename is a universal hash of the content
* the file/directory extension is the format of the content (e.g. `.ssii`: two strings, two integers)
* row format could be binary or single-line csv (a minimal put/get sketch follows the lists below)

some nice properties:

* needs no container format; leverages existing technology for high-throughput caching and buffering as well as shared network access
* separates unindexed data from index
* create/drop tables by just adding/removing directories
* enumerate a table's rows by just listing its directory
* add/remove rows by just writing/removing files
* rows are content-addressed, therefore:
  * rows are immutable
  * deduplication is automatic, since duplicate content -> same filename
  * corruption can be detected by checking whether the content still matches the filename
* each row gets a creation timestamp and a last-access timestamp from the filesystem (permitting LRU drops)
* alter columns by transitioning file extensions; the migration can be safely resumed after interruption (see the migration sketch below)
* online indices can update from filesystem notifications (see the notification sketch below)
* bonus: permission flags could do something interesting

drawbacks:

* max 64K rows per table; beyond that a HAMT-like structure is needed, i.e. sort rows into subdirectories keyed by the first few digits of the hash (see the sharding sketch below)
* possibly too taxing for SSDs (many tiny files and metadata writes)
* incomplete: applications still need to build indices over the data
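to make the layout concrete, here's a minimal put/get sketch. the hash function, row encoding, and names are assumptions, not part of the spec: SHA-256 as the content hash, single-line csv as the row format, a hypothetical `db/` root, and a `person.ssii` example table.

```python
import csv
import hashlib
import io
from pathlib import Path

ROOT = Path("db")  # hypothetical database root

def encode_row(values) -> bytes:
    """Encode one row as a single CSV line."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(values)
    return buf.getvalue().encode("utf-8")

def put_row(table: str, ext: str, values) -> Path:
    """Write a row; the filename is the hash of the content, so writes are idempotent."""
    content = encode_row(values)
    digest = hashlib.sha256(content).hexdigest()
    table_dir = ROOT / f"{table}.{ext}"
    table_dir.mkdir(parents=True, exist_ok=True)  # creating a table is just creating a directory
    path = table_dir / f"{digest}.{ext}"
    if not path.exists():                         # duplicate content -> same filename -> dedup
        path.write_bytes(content)
    return path

def scan_table(table: str, ext: str):
    """Yield rows by listing the table's directory, verifying content against filename."""
    for path in (ROOT / f"{table}.{ext}").glob(f"*.{ext}"):
        content = path.read_bytes()
        if hashlib.sha256(content).hexdigest() != path.stem:   # corruption check
            raise IOError(f"corrupt row: {path}")
        yield next(csv.reader([content.decode("utf-8")]))      # note: column types are not recovered here

# Example: a "person" table with two strings and two integers (.ssii).
put_row("person", "ssii", ["ada", "lovelace", 1815, 1852])
print(list(scan_table("person", "ssii")))
```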
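the "alter columns by transitioning file extensions" step could look roughly like this: each old-format row is rewritten under its new hash and new extension before the old file is removed, so re-running the loop after a crash simply picks up whatever old-extension files are left. `migrate_table`, the `.ssi` target format, and the column-dropping transform are made up for illustration.

```python
import csv
import hashlib
import io
from pathlib import Path

def _encode(values) -> bytes:
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(values)
    return buf.getvalue().encode("utf-8")

def migrate_table(table_dir: Path, old_ext: str, new_ext: str, transform):
    """Re-encode every row from old_ext to new_ext; safe to re-run after interruption."""
    for path in sorted(table_dir.glob(f"*.{old_ext}")):
        old_values = next(csv.reader([path.read_text()]))
        new_content = _encode(transform(old_values))
        digest = hashlib.sha256(new_content).hexdigest()
        (table_dir / f"{digest}.{new_ext}").write_bytes(new_content)  # write the new row first...
        path.unlink()                                                 # ...then drop the old one

# Example (hypothetical): person.ssii -> person.ssi, dropping the last integer column.
# migrate_table(Path("db/person.ssii"), "ssii", "ssi", lambda v: v[:3])
```

a crash between the write and the unlink is harmless: re-running re-encodes the old row to the same content, hence the same filename, so no duplicate appears.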
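an online index fed by filesystem notifications might look like the following, assuming the third-party `watchdog` package (not named in the spec) and the flat layout from the first sketch; the "index" here is just an in-memory map from one column's value to row filenames.

```python
import csv
from pathlib import Path

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

class ColumnIndex(FileSystemEventHandler):
    """Maintain {value of column `col` -> set of row paths} for one table directory."""

    def __init__(self, col: int):
        self.col = col
        self.index: dict[str, set[str]] = {}

    def on_created(self, event):
        if event.is_directory:
            return
        # a real index would have to tolerate partially written files here
        row = next(csv.reader([Path(event.src_path).read_text()]))
        self.index.setdefault(row[self.col], set()).add(event.src_path)

    def on_deleted(self, event):
        for paths in self.index.values():
            paths.discard(event.src_path)

# Example (hypothetical): index the first column of db/person.ssii.
# idx = ColumnIndex(col=0)
# obs = Observer()
# obs.schedule(idx, "db/person.ssii", recursive=False)
# obs.start()
```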
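for the 64K-rows drawback, the "HAMT-like" fan-out could be sketched as below: rows go into subdirectories named after the first few hex digits of their hash, in the style of git's object store. the 2-digit prefix (256 buckets) is an assumed parameter, not part of the spec.

```python
import hashlib
from pathlib import Path

PREFIX_LEN = 2  # hypothetical fan-out: 2 hex digits = 256 buckets

def row_path(table_dir: Path, content: bytes, ext: str) -> Path:
    """Place a row under <table>/<first PREFIX_LEN hash digits>/<full hash>.<ext>."""
    digest = hashlib.sha256(content).hexdigest()
    bucket = table_dir / digest[:PREFIX_LEN]
    bucket.mkdir(parents=True, exist_ok=True)
    return bucket / f"{digest}.{ext}"

def scan_sharded(table_dir: Path, ext: str):
    """A full table scan now walks one extra directory level."""
    for bucket in sorted(p for p in table_dir.iterdir() if p.is_dir()):
        yield from sorted(bucket.glob(f"*.{ext}"))

# 256 buckets at ~64K entries each keeps roughly 16M rows within the per-directory limit.
# Example (hypothetical):
# path = row_path(Path("db/person.ssii"), b"ada,lovelace,1815,1852\n", "ssii")
# path.write_bytes(b"ada,lovelace,1815,1852\n")
```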