There is nothing more annoying than databases. Absolutely all DBs nowadays are based on some kind of pre-determined data structure (tables, documents, key/value stores, whatever) plus some methods to mutate the data in them. They're the functional programmer's worst nightmare and one of the few "imperative" things that still permeate Haskell programs. I wonder whether there is, anywhere in this world, a single functional-oriented DB.
I'm thinking of an app-centric, append-only-log database. That is, rather than having tables or documents with operations that mutate the database state - like all DBs nowadays do, and which is completely non-functional - it would merely store an immutable history of transactions. You would then derive the app state from a reducer. Let me explain with an example. Suppose we're programming a collective TODO-list application. In order to create a DB, all you need is the specification of your app and a data path:
import MyDreamDB
data Action = NewTask { user :: String, task :: String, deadline :: Date } deriving Serialize
data State = State [String] deriving Serialize
todoApp :: App Action State
todoApp = App {
  init = State [],
  next = \ (NewTask user task deadline) (State tasks) ->
    State ((user ++ " must do " ++ task ++ " before " ++ show deadline ++ ".") : tasks)}
app <- localDB "./todos" todoApp :: App Action State
If the DB doesn't exist yet, it is created. Otherwise, the existing data is used. And... that is it! app now contains an object that works exactly like a Haskell value. Of course, the whole DB isn't necessarily loaded in memory; whether it lives in memory or on disk is up to the DB engine.
You insert/remove data by merely appending transactions.
-- someDeadline :: Date stands for any deadline value
append app $ NewTask "SrPeixinho" "Post my dream DB on /r/haskell" someDeadline
append app $ NewTask "SrPeixinho" "Shave my beard" someDeadline
append app $ NewTask "SrPeixinho" "Buy that gift" someDeadline
Those will append new items to the list of tasks because it is defined like so, but they could remove, patch, or do anything you want with the DB state.
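To make the idea concrete, here is a minimal in-memory sketch of the core mechanism: the log is a plain list and the state is a left fold of the reducer over it. Everything here is an assumption made for illustration (the field names `initial` and `step`, the helpers `appendTx` and `replay`, and the deadline being a plain String); a real engine would persist the log and cache the folded state.

```haskell
import Data.List (foldl')

-- Hypothetical model of an app: an initial state plus a pure reducer.
data App action state = App
  { initial :: state                      -- state before any transaction
  , step    :: action -> state -> state   -- plays the role of `next`
  }

-- A transaction of the TODO example (deadline simplified to a String).
data Action = NewTask String String String   -- user, task, deadline

newtype State = State [String]
  deriving (Eq, Show)

todoApp :: App Action State
todoApp = App
  { initial = State []
  , step = \(NewTask u t d) (State ts) ->
      State ((u ++ " must do " ++ t ++ " before " ++ d ++ ".") : ts)
  }

-- Appending never touches older entries; the log only grows.
appendTx :: action -> [action] -> [action]
appendTx tx history = history ++ [tx]

-- The current state is a fold of the reducer over the log, oldest first.
replay :: App action state -> [action] -> state
replay app = foldl' (flip (step app)) (initial app)
```

Since `replay` is pure, the same log always yields the same state, which is what would let the engine decide freely between caching the state and recomputing it.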
Just use plain Haskell. For example, suppose that you want to get all tasks containing the word post:
postTasks = filter (elem "post" . words) app
And that is it.
If only State changes, you need to do nothing. For example, suppose you want to store tasks as a tuple (user, task, deadline) instead of a description, as I did previously. Then, go ahead and change State and next:
data State = State [(String, String, Date)]
next = \ (NewTask user task deadline) (State tasks) -> State ((user, task, deadline) : tasks)
The next time you load the DB, the engine notices the change and automagically re-computes the final state based on the log of transactions.
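A rough sketch of why this works: the log holds only actions, so a new state shape is just the same fold with a different reducer. The names `stepV1`, `stepV2`, `rebuild` and `history`, and the "Friday" deadline, are made up for this example.

```haskell
import Data.List (foldl')

-- Logged transactions; this type is the same in both versions.
data Action = NewTask String String String   -- user, task, deadline

-- Version 1 of the reducer: keep a human-readable description.
stepV1 :: Action -> [String] -> [String]
stepV1 (NewTask u t d) ts = (u ++ " must do " ++ t ++ " before " ++ d ++ ".") : ts

-- Version 2 of the reducer: keep the raw fields as a tuple.
stepV2 :: Action -> [(String, String, String)] -> [(String, String, String)]
stepV2 (NewTask u t d) ts = (u, t, d) : ts

-- Rebuilding a state is the same fold, whatever the reducer is.
rebuild :: (Action -> s -> s) -> s -> [Action] -> s
rebuild step = foldl' (flip step)

-- An example log with a single transaction.
history :: [Action]
history = [NewTask "SrPeixinho" "Shave my beard" "Friday"]
```

Here `rebuild stepV1 [] history` and `rebuild stepV2 [] history` read the same log and produce the old and the new state, respectively.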
If Action changes - for example, you decide to store deadline as an integer - you just map the old transaction type to the new one.
main = do
  migrate "./todos" $ \ (NewTask user task deadline) -> NewTask user task (toInteger deadline)
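Under the hood, a migration like that could be nothing more than a map over the stored log. The sketch below assumes two hypothetical versions of the transaction type and a made-up integer encoding (yyyymmdd) for the deadline; `migrateLog` and `upgrade` are illustrative names, not part of any existing library.

```haskell
-- Hypothetical old format: the deadline is a calendar date.
data Date = Date Int Int Int                 -- year, month, day
data ActionV1 = NewTaskV1 String String Date

-- Hypothetical new format: the deadline is a single integer, yyyymmdd.
data ActionV2 = NewTaskV2 String String Integer
  deriving (Eq, Show)

-- Convert one old transaction into the new format.
upgrade :: ActionV1 -> ActionV2
upgrade (NewTaskV1 u t (Date y m d)) =
  NewTaskV2 u t (toInteger (y * 10000 + m * 100 + d))

-- Migrating the whole log is a map; order and length are preserved.
migrateLog :: (old -> new) -> [old] -> [new]
migrateLog = map
```

After the log is rewritten, the state would be recomputed by replaying the new log through the new reducer.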
Suppose you query the number of tasks of a given user too often, and that becomes a bottleneck. To index it, you just update State and next to include the index structure explicitly.
data State = State {
  tasks :: [(String, String, Date)],
  userTaskCount :: Map String Int}
next (NewTask user task deadline) (State tasks count) = State tasks' count' where
  tasks' = (user, task, deadline) : tasks
  count' = insertWith (+) user 1 count -- Data.Map: start at 1, otherwise add 1
Like with migrations, the DB notices the change and updates the final state. Then you can get the count of any user with a single map lookup (O(log n) for a Data.Map):
lookup "SrPeixinho" . userTaskCount $ app
Any arbitrary indexing could be performed that way. No DBs, no queries. So easy!
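For reference, here is a self-contained version of that index that compiles against the containers package. It is only a sketch: the deadline is simplified to a String, and `step`, `replay` and `countFor` are names chosen for this example.

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as Map

data Action = NewTask String String String   -- user, task, deadline

data State = State
  { tasks         :: [(String, String, String)]
  , userTaskCount :: Map.Map String Int       -- the explicit index
  }

-- The reducer keeps the index in sync with the task list on every transaction.
step :: Action -> State -> State
step (NewTask u t d) (State ts counts) =
  State ((u, t, d) : ts) (Map.insertWith (+) u 1 counts)

-- Fold the whole log into a state, starting from the empty state.
replay :: [Action] -> State
replay = foldl' (flip step) (State [] Map.empty)

-- One map lookup; users without tasks count as 0.
countFor :: String -> State -> Int
countFor u = Map.findWithDefault 0 u . userTaskCount
```

The index costs a little extra work per transaction, in exchange for not scanning the task list on every query.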
There is one thing more annoying than databases: communication. Sockets, APIs, HTTP. All of those are required by today's real-time applications, and all of them are a pain in the ass. Suppose I gave you the task of making a real-time online site for our Todo app. How would you do it? Probably, create a RESTful API with tons of methods to serve items of our TODO lists, then create a front-end application in JavaScript/React, then make Ajax requests to poll the tasks, then a websocket server because it was too slow and... STOP! You clearly live in the past. With MyDreamDB, this is what you would do:
main = do
  app <- publicDB "./todos" todoApp :: App Action State
  renderApp $ "<div>" ++ show app ++ "</div>"
$ ghcjs myApp.hs -o myApp.html
$ swarm up myApp.html
$ chrome "bzz:/hash_of_my_app"
See it? No, you don't. Look at it again. Exactly: by changing one word - from localDB to publicDB - your app is now online. That means any process in the world running the same application will be able to see your transactions. Moreover, by adding another line - a State -> HTML call - I gave a view to our app. Then I compiled that file to HTML, hosted it in a decentralized storage (swarm), and opened it on Chrome. What you see on the screen is a real-time TODO-list of countless people in the world. Yes!
No, no, wait - you didn't even provide an IP or anything. How would the DB know how to find processes running the same App?
It hashes the specification of your app, contacts a select number of IPs to find other processes running it, and then joins a network of nodes running that app.
But if the DB is public, anyone can join my DB, so they will be able to destroy my data.
No, this is an append-only database. Remember? No information is ever destroyed.
What about spam? If anyone can join, what is stopping someone from sending tons of transactions and bloating my DB?
Before broadcasting a transaction, the DB computes a small proof-of-work for it - basically, a nonce that makes a hash of the transaction (together with the App code) sufficiently small. Other nodes only accept transactions with enough PoW. This takes time to compute, so you essentially get a "portable" anti-spam measure for a distributed network, one that replaces the need for fees and an integrated currency.
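A toy sketch of such a proof-of-work check, to show the shape of the idea. Big assumption: FNV-1a, a simple non-cryptographic hash, stands in for a real hash such as SHA-256 so that the example needs nothing outside base; it must not be used for actual spam protection. The names `validPoW` and `mine`, and hashing the transaction text followed by a decimal nonce, are also just illustrative choices.

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl')
import Data.Word (Word64)

-- FNV-1a (64-bit): NOT cryptographic, only a stand-in for SHA-256 here.
fnv1a :: String -> Word64
fnv1a = foldl' mix 14695981039346656037
  where
    mix h c = (h `xor` fromIntegral (ord c)) * 1099511628211

-- A transaction is accepted only if its hash, combined with the nonce,
-- falls below the target. A smaller target means more work on average.
validPoW :: Word64 -> String -> Integer -> Bool
validPoW target tx nonce = fnv1a (tx ++ show nonce) < target

-- Brute-force search: try nonces 0, 1, 2, ... until one is valid.
mine :: Word64 -> String -> Integer
mine target tx = until (validPoW target tx) (+ 1) 0
```

Checking a nonce costs a single hash, while finding one takes on the order of 2^64 / target attempts, and that asymmetry is what makes flooding the network expensive.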
OK, but if anyone is able to submit any transaction, they are still able to do anything with the app state.
No; people are only able to do what is encoded in next.
But what about logins, accounts, passwords? If all my app's info is public, anyone is able to see everyone else's password.
Use digital signatures.
OK, but every info is still public. Some applications simply require private info.
Use encryption.
Someone with tons of CPU power is still able to DDoS my app.
Yes.
Is it efficient enough?
Each application would work as a special-purpose blockchain, and those are often perfectly usable for their specific applications.
So you're telling me that, with MyDreamDB, you could recreate Bitcoin in a bunch of lines of code?
Yes:
import MyDreamDB
type Address = String
data State = State {
  lastHash :: String,
  balance :: Map Address Integer}
data Action
  = Mine { to :: Address, nonce :: String }
  | Send { sig :: Signature, to :: Address, amount :: Integer }
bittycoinApp :: App Action State
bittycoinApp = App { init = State "" empty, next = next } where
  -- "Mining" here is merely a means of limiting emission,
  -- it is not necessary for the operation of the network.
  -- Different strategies could be used.
  next (Mine to nonce) st@(State lastHash balance)
    | hash < X = State hash (insertWith (+) to reward balance) -- reward: an assumed fixed emission
    | otherwise = st -- Not enough work
    where hash = sha256 (lastHash ++ nonce)
  -- Send money to someone
  next tx@(Send sig to amount) st@(State lastHash balance)
    | not $ ecVerify sig (show tx) = st -- Signature doesn't match
    | findWithDefault 0 from balance < amount = st -- Not enough funds
    | otherwise = State lastHash balance' -- Tx successful
    where
      from = ecRecover sig -- the transaction sender
      balance' = adjust (subtract amount) from
               . insertWith (+) to amount
               $ balance
main = do
  publicDB "./data" bittycoinApp :: App Action State
Compile and run something like that and you have a perfectly functioning full node of a digital currency with properties very similar to Bitcoin. Anyone running the same code would connect to the same network. Of course, it could be improved with adjustable difficulty and many other things. But the hardest "blockchain" aspects - decentralization, transactions, consensus, gossip protocols - all could and should be part of the decentralized implementation of MyDreamDB.
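The transfer rule on its own can be tried out today as a plain reducer over a balance map. This sketch deliberately leaves out signatures: the sender is an explicit, trusted `from` field, which a real network could never allow. `Tx`, `Balances` and `applyTx` are names invented for this example.

```haskell
import qualified Data.Map.Strict as Map

type Address  = String
type Balances = Map.Map Address Integer

-- Signature checking is omitted; `from` is simply trusted in this sketch.
data Tx = Send { from :: Address, to :: Address, amount :: Integer }

-- Invalid transactions leave the state untouched instead of failing.
applyTx :: Tx -> Balances -> Balances
applyTx (Send f t a) bal
  | a <= 0                          = bal   -- reject non-positive amounts
  | Map.findWithDefault 0 f bal < a = bal   -- reject overdrafts
  | otherwise = Map.insertWith (+) t a (Map.adjust (subtract a) f bal)
```

Because a rejected transaction is just a no-op, every node can replay the full log, junk included, and still arrive at the same balances.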
That is, honestly, the project I think we lack the most. What do you think?