@intech
Forked from HenriqueLimas/leveldb.md
Created July 15, 2019 01:01
leveldb

An embedded key/value database. It works well as a modular database, and it is a good fit if you want to run applications in the browser as well as in node.

Because it is embedded, it runs in the same process: the database lives inside your program. It was originally built for Chrome.

install

The level package bundles leveldb together with a friendly API, so you can install everything with npm:

npm install level
var level = require('level')
var db = level('test-foo.db')

valueEncoding

var db = level('test-foo.db', { valueEncoding: 'json' })

what leveldb is good for

  • running the same database in node and browser
  • when your data isn't very relational
  • build your own kappa architecture (based on logs, useful for p2p)

level methods

  • db.get()
  • db.put()
  • db.del()
  • db.batch()
  • db.createReadStream()
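
These callback signatures can be sketched with a tiny in-memory stand-in (`memdb` below is a made-up illustration for the shape of the API, not part of level):

```javascript
// A made-up in-memory stand-in that mimics level's callback API.
function memdb () {
  var store = new Map()
  return {
    put: function (key, value, cb) { store.set(key, value); cb(null) },
    get: function (key, cb) {
      if (store.has(key)) cb(null, store.get(key))
      else cb(new Error('NotFound: ' + key))
    },
    del: function (key, cb) { store.delete(key); cb(null) }
  }
}

var db = memdb()
db.put('foo', 123, function (err) {
  db.get('foo', function (err, value) {
    console.log(value) // 123
  })
})
```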

atomicity

Either all operations in a transaction succeed or they all fail. For example, if you create a new user account and also need to store the user's address, you don't want the data to end up in an inconsistent state.

consistency

Atomicity is important for enforcing consistency.

batch

insert multiple records at a time, atomically

db.batch([
	{ type: 'put', key: 'foo', value: 123 },
	{ type: 'put', key: 'bar', value: 456 }
], error => {})

If one operation fails, all the others fail too.

createReadStream

db.createReadStream(opts)

Returns a readable object-mode stream.

  • opts.gte - greater than or equal to
  • opts.gt - greater than
  • opts.lte - less than or equal to
  • opts.lt - less than
  • opts.limit - maximum number of rows
  • opts.reverse - reverse the order
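
Since keys are compared as strings, a pure-JS sketch shows how gt/lt plus a '~' sentinel select every key under a prefix (illustrative only; the real range scan happens inside leveldb):

```javascript
// Keys in leveldb are kept sorted; gt/lt select a contiguous slice.
// '~' (0x7E) sorts after all the other printable characters used in
// these keys, so 'post!' .. 'post!~' captures every 'post!'-prefixed key.
var keys = ['post!a1', 'post!z9', 'user!bob', 'user!substack'].sort()

function range (keys, opts) {
  return keys.filter(function (k) {
    return k > opts.gt && k < opts.lt
  })
}

console.log(range(keys, { gt: 'post!', lt: 'post!~' }))
// [ 'post!a1', 'post!z9' ]
```
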
var level = require('level')
var db = level('batch.db', { valueEncoding: 'json' })

var batch = []

for (var i = 0; i < 10; i++) {
  batch.push({ type: 'put', key: i, value: i * 1000 })
}

db.batch(batch, function (error) {
  if (error) console.error(error)
})
var level = require('level')
var db = level('batch.db', { valueEncoding: 'json' })
var to = require('to2')

db.createReadStream()
  .pipe(to.obj(function (row, enc, next) {
    console.log(row)
    next()
  }))

thinking lexicographically

Keys are sorted as strings, and numeric keys get converted into strings.

Values depend on the valueEncoding, but it's usually best to store small pieces of data, like short strings.
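
Because keys sort as strings, numeric keys compare character by character; zero-padding is a common fix. A pure-JS illustration:

```javascript
// String sort is not numeric sort: '10' < '2' because '1' < '2'.
var keys = ['2', '10', '1']
console.log(keys.slice().sort()) // [ '1', '10', '2' ]

// Zero-padding numeric keys keeps string order and numeric order aligned.
var padded = keys.map(function (k) { return k.padStart(3, '0') })
console.log(padded.sort()) // [ '001', '002', '010' ]
```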

organizing your keys

key/value structure we might use for a user/post system

data.json

[
  { "type": "put", "key": "user!substack", "value": { "bio": "beep boop" } },
  { "type": "put", "key": "post!substack!2012654 454", "value": { "bio": "beep boop" } }
]
var level = require('level')
var db = level('user.db', { valueEncoding: 'json' })
var batch = require('./data.json')
db.batch(batch, err => {
	if (err) console.error(err)
})
var level = require('level')
var db = level('user.db', { valueEncoding: 'json' })
var to = require('to2')

db.createReadStream({ gt: 'post!', lt: 'post!~' })
	.pipe(to.obj((row, enc, next) => {
		console.log(row)
		next()
	}))

secondary indexes

We can use .batch() to create multiple keys for each post:

var crypto = require('crypto')

var now = new Date().toISOString()
var id = crypto.randomBytes(16).toString('hex')
var subkey = now + '!' + id
db.batch([
	{ type: 'put', key: 'post!substack!' + subkey, value: msg },
	{ type: 'put', key: 'post!' + subkey, value: msg }
])

Code to insert users

var level = require('level')
var db = level('posts.db', { valueEncoding: 'json'})

var name = process.argv[2]
db.put('user!' + name, {}, err => {
	if (err) console.error(err)
})

Code to list users

var level = require('level')
var db = level('posts.db', { valueEncoding: 'json'})
var to = require('to2')

db.createReadStream({gt: 'user!',lt: 'user!~'})
	.pipe(to.obj((row, enc, next) => {
		console.log(row.key.split('!')[1])
		next()
	}))

Code to create post

var level = require('level')
var db = level('posts.db', { valueEncoding: 'json'})
var strftime = require('strftime')
var randomBytes = require('crypto').randomBytes

var name = process.argv[2]
var msg = process.argv.slice(3).join(' ')
var time = strftime('%F %T')
// ISO8601 - YYYY-MM-DD HH:MM:SS

var id = randomBytes(16).toString('hex') // unique id

var batch = [
  { key: 'post!' + id, value: { name: name, time: time, body: msg } },
  { key: 'post-name!' + name + '!' + time + '!' + id, value: 0 }, //secondary indexes
  { key: 'post-time!' + time + '!' + name + '!' + id, value: 0 }, // secondary indexes
]

db.batch(batch, err => {
	if (err) console.error(err)
})

Code to list posts

var level = require('level')
var db = level('posts.db', { valueEncoding: 'json'})
var to = require('to2')

db.createReadStream({gt: 'post!',lt: 'post!~'})
	.pipe(to.obj((row, enc, next) => {
		var id = row.key.split('!')[1]
		var name = row.value.name
		var time = row.value.time
		var body = row.value.body
		console.log(`${time} <${name}> ${body}`)
		next()
	}))

Code to list posts by name

var level = require('level')
var db = level('posts.db', { valueEncoding: 'json'})
var to = require('to2')

var name = process.argv[2]
var opts = {
	gt: 'post-name!' + name + '!',
	lt: 'post-name!' + name + '!~'
}

db.createReadStream(opts)
	.pipe(to.obj((row, enc, next) => {
		var id = row.key.split('!').slice(-1)[0]
		db.get('post!' + id, function (err, doc) {
			if (err) return next(err)
			console.log(`${doc.time} <${doc.name}> ${doc.body}`)
			next()
		})
	}))

Code to list posts by time

var level = require('level')
var db = level('posts.db', { valueEncoding: 'json'})
var to = require('to2')

var opts = {
	gt: 'post-time!',
	lt: 'post-time!~'
}

db.createReadStream(opts)
	.pipe(to.obj((row, enc, next) => {
		var id = row.key.split('!').slice(-1)[0]
		db.get('post!' + id, function (err, doc) {
			if (err) return next(err)
			console.log(`${doc.time} <${doc.name}> ${doc.body}`)
			next()
		})
	}))

Libraries for leveldb

subleveldown

Organizes the keyspace into sublevels. Useful when you want a single database with several sub-databases inside it.

var sublevel = require('subleveldown')
var level = require('level')
var db = level('sub.db')

var adb = sublevel(db, 'a')
var bdb = sublevel(db, 'b')

adb.get('count', (err, value) => {
	var n = Number((value || 0)) + 1
	adb.put('count', n, err => {
		if (err) console.error(err)
		else console.log('a:count', n)
	})
})

bdb.get('count', (err, value) => {
	var n = Number((value || 0)) + 1
	bdb.put('count', n, err => {
		if (err) console.error(err)
		else console.log('b:count', n)
	})
})

level-livefeed

Subscribe to a live feed of changes to the database. Useful in combination with websockets.

var level = require('level')
var db = level('live.db')
var liveStream = require('level-livefeed')
var to = require('to2')

liveStream(db)
	.pipe(to.obj((row, enc, next) => {
		console.log(row)
		next()
	}))

setInterval(() => {
	db.put('hello!' + Date.now(), Date.now(), err => {
		if (err) console.error(err)
	})
}, 500)

The library also has options to show only the most recent changes.

Leveldb in the browser

Use require('level-browserify') instead of require('level'). It uses IndexedDB behind the scenes.

main.js

var level = require('level-browserify')
var db = level('whatever', { valueEncoding: 'json'})
var html = require('yo-yo')

var root = document.body.appendChild(document.createElement('div'))
var count = '?'
update()

db.get('count', (err, value) => {
	count = value || 0
	update()
})

function update () {
	html.update(root, html`<div>
		<h1>${count}</h1>
		<button onclick=${onclick}>CLICK ME</button>
	</div>`)
	
	function onclick (ev) {
		count++
		db.put('count', count, err => {
			if (err) console.error(err)
			else update()
		})
	}
}

what to store and not store in level

  • best for tiny documents
  • documents can point at binary data by hash

some good modules for blob storage:

  • content-addressable-blob-store
  • hypercore
  • webtorrent
  • torrent-blob-store

addimg.js

var blobs = require('content-addressable-blob-store')
var store = blobs('img.blob')
var level = require('level')
var db = level('img.db', { valueEncoding: 'json' })

var w = store.createWriteStream(err => {
	if (err) return console.error(err)
	var key = 'img!' + w.key
	var doc = {
		time: Date.now()
	}
	
	db.put(key, doc, err => {
		if (err) console.error(err)
	})
})

process.stdin
	.pipe(w)

listfile.js

var blobs = require('content-addressable-blob-store')
var store = blobs('img.blob')
var level = require('level')
var db = level('img.db', { valueEncoding: 'json' })
var to = require('to2')

db.createReadStream()
	.pipe(to.obj((row, enc, next) => {
		console.log({ key: row.key.split('!')[1], value: row.value })
		next()
	}))

readfile.js

var blobs = require('content-addressable-blob-store')
var store = blobs('img.blob')

var hash = process.argv[2]
store.createReadStream(hash)
	.pipe(process.stdout)

server.js

var http = require('http')
var blobs = require('content-addressable-blob-store')
var store = blobs('img.blob')
var level = require('level')
var db = level('img.db', { valueEncoding: 'json' })
var through = require('through2')

var server = http.createServer((req, res) => {
	if (req.url === '/') {
		res.setHeader('Content-Type', 'text/html')
		db.createReadStream()
			.pipe(through.obj((row, enc, next) => {
				next(null, `<div>
					<h1>${row.key.split('!')[1]}</h1>
					<img src="image/${row.key.split('!')[1]}">
				</div>`)
			}))
			.pipe(res)
	} else if (/^\/image\//.test(req.url)) {
		var hash = req.url.replace(/^\/image\//, '')
		res.setHeader('Content-Type', 'image/jpeg')
		store.createReadStream(hash).pipe(res)
	} else {
		res.end('not found\n')
	}
}).listen(5000)

======

crypto

hashes

Takes a chunk of data and generates a fixed-length digest.

var createHash = require('crypto').createHash
// in the browser: var createHash = require('create-hash')

var stream = createHash(algorithm)

algorithms:

  • sha1
  • sha256
  • sha512
  • md5

A hash lets you verify that data is consistent (it hasn't been altered). In the browser, use the create-hash module.

symmetric ciphers

  • requires a shared secret (like a password)

asymmetric crypto

  • public/private keypairs
  • need to know somebody's public key

random number generator

Secure entropy is needed for generating keys. Math.random() is not secure.

In the browser, use crypto.getRandomValues(new Uint8Array(n)).

don't roll your own crypto

very easy to mess something up

  • replay attacks
  • timing attacks
  • padding/compression oracle attacks
  • side channels
  • downgrade attacks

use libsodium/nacl

  • works in node and the browser
  • require('chloride')
  • uses only good crypto algorithms
  • resists timing attacks

sodium generate keypairs

var sodium = require('chloride')
var kp = sodium.crypto_sign_keypair()

var str = JSON.stringify({
	publicKey: kp.publicKey.toString('hex'),
	secretKey: kp.secretKey.toString('hex')
})

combined vs detached

  • combined mode - output contains the original message + signature
  • detached - output contains only the signature

sodium sign/verify combined

var signedData = sodium.crypto_sign(msg, secretKey)
var opened = sodium.crypto_sign_open(signedData, publicKey)

sodium sign/verify detached

var sig = sodium.crypto_sign_detached(msg, secretKey)
var ok = sodium.crypto_sign_verify_detached(sig, msg, publicKey)

sodium authenticated encryption combined

symmetric cipher with message authentication code (MAC) to prevent tampering

var crypto = require('crypto')
var nonce = crypto.randomBytes(24)
var key = crypto.randomBytes(32)
var cipherText = sodium.crypto_secretbox_easy(msg, nonce, key)
var clearText = sodium.crypto_secretbox_open_easy(cipherText, nonce, key)

secret connection

npm install secret-handshake pull-stream-to-stream

It's useful for establishing a secure connection and guaranteeing who is on the other end.

merkle DAGs

  • hash every document
  • point to other docs by their hash inside the doc itself

examples:

  • git
  • ipfs
  • secure scuttlebutt
  • dat/hypercore

security properties:

  • tamper-proof: changing a doc changes every hash that points at it
  • with signing, docs are also authenticated

you can receive merkle DAG nodes from untrusted peers over gossip protocols

We can create merkle DAGs with shasum

First message of merkle DAGs

echo hello | shasum
echo -e 'i am the second doc\n<HASH of the first message>' | shasum

This creates a unique hash for each message. If the first message changes, its hash changes, which in turn changes every hash downstream of it.

Kappa architecture

enterprise architecture

  • immutable, append-only logs are the source of truth
  • materialized views built from the logs

also good for p2p!

With an append-only log you cannot go back and remove entries; that constraint is important for building robust systems.
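
A tiny sketch of the idea: the log is the source of truth, and a view is just a fold over it (pure JS, illustration only):

```javascript
// Append-only log of events; never edited, only appended to.
var log = [
  { type: 'put', key: 'count', value: 1 },
  { type: 'put', key: 'name', value: 'substack' },
  { type: 'put', key: 'count', value: 2 }
]

// Materialized view: can be rebuilt at any time by replaying the log.
function materialize (log) {
  var view = {}
  log.forEach(function (entry) {
    if (entry.type === 'put') view[entry.key] = entry.value
  })
  return view
}

console.log(materialize(log)) // { count: 2, name: 'substack' }
```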

append-only logs

  • use hashes to build a merkle DAG
  • trivial naive replication: concatenate

hyperlog

append-only merkle DAG log store

  • link to other documents by cryptographic hash
  • hooks for crypto signing and verification.
var level = require('level')
var hyperlog = require('hyperlog')
var log = hyperlog(level('log.db'), { valueEncoding: 'json' })

add.js

var level = require('level')
var db = level('log.db')
var hyperlog = require('hyperlog')
var log = hyperlog(db, { valueEncoding: 'json' })

var msg = process.argv[2]
var links = process.argv.slice(3)
log.add(links, { message: msg, time: Date.now() }, function (err, node) {
	if (err) console.error(err)
	else console.log(node)
})

list.js

var level = require('level')
var db = level('log.db')
var hyperlog = require('hyperlog')
var log = hyperlog(db, { valueEncoding: 'json' })
var to = require('to2')

log.createReadStream()
	.pipe(to.obj(function (row, enc, next) {
		console.log(row)
		next()
	}))

replicate.js

var level = require('level')
var db = level(process.argv[2])
var hyperlog = require('hyperlog')
var log = hyperlog(db, { valueEncoding: 'json' })


process.stdin.pipe(log.replicate()).pipe(process.stdout)

dupsh 'node replicate.js log.db' 'node replicate.js log2.db'

The replication protocol is symmetric, so dupsh can pipe each process's stdout into the other's stdin.

hyperlog-index

build materialized views on top of a hyperlog

var indexer = require('hyperlog-index')
var dex = indexer({
	log: log,
	db: db,
	map: function(row, next) {
		// inspect row and write derived index keys to db here
	}
})

hyperkv

  • p2p key/value store as a materialized view over a hyperlog
  • multi-value register

hyperlog-sodium
