@VictorTaelin
Last active December 16, 2021 19:13
MyDreamDB

There is nothing more annoying than databases. Absolutely all DBs nowadays are based on some kind of pre-determined data structure (tables, documents, key/val stores, whatever) plus some methods to mutate data on them. They're the functional programmer's worst nightmare and one of the few "imperative" things that still permeate Haskell programs. I wonder if there is, anywhere in this world, a single functional-oriented DB.

I'm thinking of an app-centric, append-only-log database. That is, rather than having tables or documents with operations that mutate the database state - like all DBs nowadays do, and which is completely non-functional - it would merely store an immutable history of transactions. You would then derive the app state from a reducer. Let me explain with an example. Suppose we're programming a collective TODO-list application. In order to create a DB, all you need is the specification of your app and a data path:

Local database

import MyDreamDB

data Action = NewTask { user :: String, task :: String, deadline :: Date } deriving Serialize
data State = State [String] deriving Serialize

todoApp :: App Action State
todoApp = App {
  init = State [],
  -- Unwrap the State, prepend the new description, wrap it again.
  next = \ (NewTask user task deadline) (State tasks) ->
    State ((user ++ " must do " ++ task ++ " before " ++ show deadline ++ ".") : tasks)}

app <- localDB "./todos" todoApp :: App Action State

If the DB doesn't exist yet, it is created. Otherwise, the existing data is used. And... that is it! app now contains an object that works exactly like a Haskell value. Of course, the whole DB isn't necessarily loaded in memory; whether it lives in memory or on disk is up to the DB engine.
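
To make the "state is derived from a reducer" idea concrete, here is a minimal pure sketch that needs no MyDreamDB at all: the log is a plain list and the state is a left fold over it. The names initial and replay are made up for this illustration, and Date is simplified to a String, which is an assumption.

```haskell
-- Pure model: the "database" is a list of actions (oldest first),
-- and the state is whatever the reducer computes from that list.
-- Assumption: Date is simplified to a String for this sketch.
type Date = String

data Action = NewTask { user :: String, task :: String, deadline :: Date }
  deriving (Show, Eq)

newtype State = State [String]
  deriving (Show, Eq)

-- Initial state and transition function, mirroring init and next above.
initial :: State
initial = State []

next :: Action -> State -> State
next (NewTask u t d) (State tasks) =
  State ((u ++ " must do " ++ t ++ " before " ++ d ++ ".") : tasks)

-- Deriving the state is a left fold over the log.
replay :: [Action] -> State
replay = foldl (flip next) initial
```

A real engine would presumably cache the fold instead of recomputing it on every load, but the observable result should be the same as replay applied to the whole history.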

Insert / remove

You insert/remove data by merely appending transactions.

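-- deadline :: Date is assumed to be in scope; NewTask takes three fields.
append app $ NewTask "SrPeixinho" "Post my dream DB on /r/haskell" deadline
append app $ NewTask "SrPeixinho" "Shave my beard" deadline
append app $ NewTask "SrPeixinho" "Buy that gift" deadline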

Those will append new items to the list of tasks because it is defined like so, but they could remove, patch, or do anything you want with the DB state.
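
As a rough sketch of how an appended transaction can "remove" something, the pure model below adds a second action. CompleteTask is a hypothetical action that is not part of the app above, and the state is simplified to a list of (user, task) pairs.

```haskell
-- Assumption: CompleteTask is a hypothetical extra action for illustration.
data Action
  = NewTask String String   -- user, task description
  | CompleteTask String     -- description of the task to drop
  deriving (Show, Eq)

type State = [(String, String)]

-- The log only ever grows; "removal" is just what the reducer
-- does to the derived state when it sees a CompleteTask.
next :: Action -> State -> State
next (NewTask u t)    tasks = (u, t) : tasks
next (CompleteTask t) tasks = filter ((/= t) . snd) tasks

replay :: [Action] -> State
replay = foldl (flip next) []
```

The history still contains all three kinds of events after a removal; only the derived state forgets the task.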

Queries

Just use plain Haskell. For example, suppose that you want to get all tasks containing the word post:

postTasks = filter (elem "post" . words) tasks where State tasks = app

And that is it.
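
Since queries are plain Haskell, they can be refined like any other function. A small sketch, assuming the tasks are available as a plain list of descriptions: a case-insensitive variant of the filter, so that a search for "post" also finds "Post". The name tasksMentioning is made up here.

```haskell
import Data.Char (toLower)

-- Keep the descriptions that contain the given word, ignoring case.
tasksMentioning :: String -> [String] -> [String]
tasksMentioning word = filter (elem (lower word) . words . lower)
  where
    lower = map toLower
```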

Migrations

If only State changes, you need to do nothing. For example, suppose you now want to store each task as a tuple (user, task, deadline) instead of a description, as I did previously. Then, go ahead and change State and next:

data State = State [(String, String, Date)]
next = \ (NewTask user task deadline) (State tasks) -> State ((user, task, deadline) : tasks)

The next time you load the DB, the engine notices the change and automagically re-computes the final state based on the log of transactions.
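
Why this can work without a migration script: the log never mentions State, so the same history can be folded with either reducer. A minimal pure sketch, where nextV1, nextV2 and replay are illustrative names and Date is again assumed to be a String:

```haskell
type Date = String

data Action = NewTask String String Date
  deriving (Show, Eq)

-- Version 1 of the state: human-readable descriptions.
nextV1 :: Action -> [String] -> [String]
nextV1 (NewTask u t d) tasks =
  (u ++ " must do " ++ t ++ " before " ++ d ++ ".") : tasks

-- Version 2 of the state: structured tuples.
nextV2 :: Action -> [(String, String, Date)] -> [(String, String, Date)]
nextV2 (NewTask u t d) tasks = (u, t, d) : tasks

-- The same log can be replayed with any reducer and initial state.
replay :: (action -> state -> state) -> state -> [action] -> state
replay step = foldl (flip step)
```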

If Action changes - for example, you decide to store deadline as integers - you just map the old transaction type to the new one.

main = do
  migrate "./todos" $ \ (NewTask user task deadline) -> (NewTask user task (toInteger deadline))
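
In the pure model, such a migration is nothing more than a map over the stored log. A sketch under assumptions: the old deadline is an Int, the new one an Integer, and ActionV1, ActionV2, upgrade and migrateLog are names invented for this example.

```haskell
-- Old and new shapes of the same transaction (hypothetical names).
data ActionV1 = NewTaskV1 String String Int
data ActionV2 = NewTaskV2 String String Integer
  deriving (Show, Eq)

-- Convert one old transaction into the new format.
upgrade :: ActionV1 -> ActionV2
upgrade (NewTaskV1 u t d) = NewTaskV2 u t (toInteger d)

-- Migrating the database means rewriting every entry of the log.
migrateLog :: [ActionV1] -> [ActionV2]
migrateLog = map upgrade
```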

Indexing

Suppose you're querying the number of tasks of a given user too often, and that became a bottleneck. To index it, you just update State and next to include the index structure explicitly.

data State = State {
  tasks :: [(String, String, Date)],
  userTaskCount :: Map String Int}

next (NewTask user task deadline) (State tasks count) = State tasks' count' where
  tasks' = (user, task, deadline) : tasks
  -- insertWith (from Data.Map) starts at 1 for a new user, else adds 1
  count' = insertWith (+) user 1 count

Like with migrations, the DB notices the change and updates the final state. Then you can get the count of any user with a single Map lookup, which is O(log n):

lookup "SrPeixinho" . userTaskCount $ app

Any arbitrary indexing could be performed that way. No DBs, no queries. So easy!


Replication, communication, online Apps

There is one thing more annoying than databases. Communication. Sockets, APIs, HTTP. All of those are required by today's real-time applications and are all a pain in the ass. Suppose I gave you the task to make a real-time online site for our Todo app. How would you do it? Probably, create a RESTful API with tons of methods to serve items of our TODO lists, then create a front-end application in JavaScript/React, then make Ajax requests to poll the tasks, then a websocket server because it was too slow and... STOP! You clearly live in the past. With MyDreamDB, this is what you would do:

main = do
  app <- publicDB "./todos" todoApp :: App Action State 
  renderApp $ "<div>" ++ show app ++ "</div>"

$ ghcjs myApp.hs -o myApp.html
$ swarm up myApp.html
$ chrome "bzz:/hash_of_my_app"

See it? No, you don't. Look at it again. Exactly: by changing one word - from localDB to publicDB - your app is now online. That means any process in the world running the same application will be able to see your transactions. Moreover, by adding another line - a State -> HTML call - I gave a view to our app. Then I compiled that file to HTML, hosted it on decentralized storage (swarm), and opened it in Chrome. What you see on the screen is a real-time TODO-list of countless people in the world. Yes!

No, no, wait - you didn't even provide an IP or anything. How would the DB know how to find processes running the same App?

It hashes the specification of your app, contacts a select number of IPs to find other processes running it and then joins a network of nodes running that app.

But if the DB is public, anyone can join my DB, so they will be able to destroy my data.

No, this is an append-only database. Forgot? No information is ever destroyed.

What about spam? If anyone can join, what is stopping someone from sending tons of transactions and bloating my DB?

Before broadcasting a transaction, the DB creates a small proof-of-work of it - basically, a sufficiently small hash of the App code. Other nodes only accept transactions with enough PoW. This takes time to compute, so you essentially create a "portable" anti-spam measure for a distributed network that replaces the need for fees and an integrated currency.
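
A toy sketch of what such a check could look like. Everything here is an assumption made for illustration: the hash covers the serialized transaction plus a nonce, the names validPoW and mine are invented, and FNV-1a is used only because it fits in a few lines. FNV-1a is not a cryptographic hash, so a real network would need something like SHA-256 instead.

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl')
import Data.Word (Word64)

-- 64-bit FNV-1a over the characters of a string.
-- Stand-in only: NOT cryptographically secure.
fnv1a :: String -> Word64
fnv1a = foldl' step 14695981039346656037
  where
    step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

-- A transaction is accepted when hashing it together with
-- its nonce yields a value below the target.
validPoW :: Word64 -> String -> Int -> Bool
validPoW target tx nonce = fnv1a (tx ++ show nonce) < target

-- Brute-force search: try nonces 0, 1, 2, ... until one is valid.
-- A smaller target means more expected work for the sender.
mine :: Word64 -> String -> Int
mine target tx = until (validPoW target tx) (+ 1) 0
```

The asymmetry is the point: mine may need many attempts, while any receiving node verifies the result with a single call to validPoW.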

OK, but if anyone is able to submit any transaction, they are still able to do anything with the app state.

No; people are only able to do what is encoded in next.
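
In other words, next is the only gate to the state, so the rules of the app live there. A small sketch of a reducer that simply ignores transactions it considers invalid. CompleteTask and the ownership rule are hypothetical, and the user name is merely a claimed identity here; making that claim trustworthy is what signatures would be for.

```haskell
-- Hypothetical actions; the first field is the (claimed) user.
data Action
  = NewTask String String       -- user, task
  | CompleteTask String String  -- user, task
  deriving (Show, Eq)

type State = [(String, String)]

-- Invalid transactions stay in the log but leave the state untouched.
next :: Action -> State -> State
next (NewTask u t) tasks
  | null t    = tasks            -- empty tasks are rejected
  | otherwise = (u, t) : tasks
-- Only the pair (owner, task) matches, so a user can only
-- complete tasks that were created under the same name.
next (CompleteTask u t) tasks = filter (/= (u, t)) tasks

replay :: [Action] -> State
replay = foldl (flip next) []
```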

But what about logins, accounts, passwords? If all my app's info is public, anyone is able to see everyone else's password.

Use digital signatures.

OK, but every info is still public. Some applications simply require private info.

Use encryption.

Someone with tons of CPU power is still able to DDOS my app.

Yes.

Is it efficient enough?

Each application would work as a special-purpose blockchain, and those are often perfectly usable for their specific applications.

So you're telling me that, with MyDreamDB, you could recreate Bitcoin in a bunch of lines of code?

Yes:

import MyDreamDB

type Address = String

data State = State { 
    lastHash :: String,
    balance :: Map Address Integer}

data Action
    = Mine { to :: Address, nonce :: String }
    | Send { sig :: Signature, to :: Address, amount :: Integer }

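bittycoinApp :: App Action State
bittycoinApp = App { init = State "" empty, next = next } where

    -- "Mining" here is merely a means of limiting emission,
    -- it is not necessary for the operation of the network.
    -- Different strategies could be used.
    -- difficultyTarget and reward are placeholder constants, left unspecified;
    -- crediting the miner is an assumed completion of the elided branch.
    next (Mine to nonce) st@(State lastHash balance)
      | newHash < difficultyTarget = State newHash (insertWith (+) to reward balance)
      | otherwise = st                       -- Invalid nonce, ignored
      where
        newHash = sha256 (lastHash ++ nonce)

    -- Send money to someone
    next tx@(Send sig to amount) st@(State lastHash balance)
      | not $ ecVerify sig (show tx) = st                -- Signature doesn't match
      | findWithDefault 0 from balance < amount = st     -- Not enough funds
      | otherwise = State lastHash balance'              -- Tx successful
      where
        from = ecRecover sig -- the transaction sender
        -- Debit the sender, credit the receiver (creating the entry if needed)
        balance' = adjust (subtract amount) from
                 . insertWith (+) to amount
                 $ balance

main = do
    publicDB "./data" bittycoinApp :: App Action State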

Compile and run something like that and you have a perfectly functioning full-node of a digital currency with properties very similar to Bitcoin. Anyone running the same code would connect to the same network. Of course, it might be improved with adjustable difficulty and many other things. But the hardest "blockchain" aspects - decentralization, transactions, consensus, gossip protocols - could and should all be part of the decentralized implementation of MyDreamDB.

Conclusion

That is, honestly, the project I think we lack the most. What do you think?
