Devdutt Shenoi de-sh

@de-sh
de-sh / cncf_tikv_final_gsoc_submission.md
Last active August 24, 2020 19:09
A summary of achievements made through the Google Summer of Code program with CNCF, in adding a cloud native storage engine to the TiKV project

The project proposal I submitted aimed to take TiKV closer to full cloud native support. Starting from only a rough understanding of the project, I proposed adding support for cloud-based data stores such as AWS S3 to TiKV's storage engine, rocksdb, by making use of the rocksdb-cloud library. Ideas from the team at TiKV helped turn this into a concrete plan, which has been worked on through various PRs to the sister projects associated with the storage component of the database. The main issue being worked on was opened well before the GSoC project started and can be found at tikv/tikv#6506.

During the community bonding period, Yi Wu (@yiwu-arbug) ideated further on the project with Liu Wei (@little-wallace), who compiled a [document

@de-sh
de-sh / week_nine.md
Last active August 6, 2020 15:09
A report of work done during the ninth week of the Google Summer of Code 2020 program

July 26 - August 01: Most of this week went into fixing the build process so that CI runs on the newly added features, fixing how it handles the installation of the AWS SDK, and providing a stable base for starting work on tikv/tikv. Yi also opened an issue (#8367), which I have started solving with a draft PR (#8383). The idea is to progressively add a field called s3 to DbConfig, which will itself contain an enabled flag to mark the use of S3 as a storage engine.
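The shape of that configuration change can be sketched roughly as follows. This is only an illustration under stated assumptions: `S3Config` and `describe_backend` are hypothetical names, the structs are simplified stand-ins rather than TiKV's actual definitions, and the serde and `Configuration` derives used by the real `DbConfig` are left out so that the snippet compiles with the standard library alone.

```rust
// Hypothetical, simplified stand-ins; these are NOT TiKV's real definitions.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct S3Config {
    // Marks whether S3 should be used as a storage engine.
    pub enabled: bool,
}

#[derive(Clone, Debug, Default, PartialEq)]
pub struct DbConfig {
    // All existing DbConfig fields are omitted from this sketch.
    pub s3: S3Config,
}

// Hypothetical helper showing how calling code might branch on the flag.
pub fn describe_backend(cfg: &DbConfig) -> &'static str {
    if cfg.s3.enabled {
        "s3"
    } else {
        "local"
    }
}

fn main() {
    // The derived Default leaves the flag off, so existing setups are unaffected.
    let mut cfg = DbConfig::default();
    println!("{}", describe_backend(&cfg));

    cfg.s3.enabled = true;
    println!("{}", describe_backend(&cfg));
}
```

Nesting the flag inside its own struct leaves room to add further S3 settings later without touching the other fields of the parent config.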

I have been facing a lot of hurdles in building without errors or warnings. I think there is an issue with cargo-nightly.

Update: There was in fact an error with cargo-nightly! The Cargo.lock was being forced to update on every cargo build, which caused a lot of unintended version mismatches. This was spotted in 1.47.0-nightly 2020-07-26 and solved in [1.47.0-nightly 2020-08-05](https:

#[derive(Clone, Serialize, Deserialize, PartialEq, Debug, Configuration)]
#[serde(default)]
#[serde(rename_all = "kebab-case")]
pub struct DbConfig {
    #[config(skip)]
    pub info_log_level: LogLevel,
    #[serde(with = "rocks_config::recovery_mode_serde")]
    #[config(skip)]
    pub wal_recovery_mode: DBRecoveryMode,
    #[config(skip)]
@de-sh
de-sh / week_eight.md
Last active July 29, 2020 12:40
A report of work done during the eighth week of the Google Summer of Code 2020 program

July 19-25: As Yi had decided to have me rewrite the rocksdb-cloud repo as a plugin to tikv/rocksdb, I made some edits to the codebase, modernizing it from tag v5.18.3 so that it compiles and works seamlessly with tikv/rocksdb at v6.4.tikv. The restructured code, which removes everything except the cloud components, is in tikv/rocksdb#182. It required some reworking of tikv/rocksdb, namely making the LogReporter struct public, which is being done in #181.

With respect to adding an interface to these cloud features through the rust-rocksdb FFI code, I was able to restructure it so that we can now create an AWS S3 based instance of a rocksdb env, a [rudimentary test](https://github.com/tikv/rust-rocksdb/blob/45b28cd4c99b7d37380df441

@de-sh
de-sh / week_seven.md
Last active July 29, 2020 12:35
A report of work done during the seventh week of the Google Summer of Code 2020 program

July 12-18: Work on the rust-rocksdb bindings for rocksdb-cloud progressed, with some more code changes added to get compilation to succeed. The code wasn't compiling because certain changes in rocksdb were not carried out equally between the forks tikv/rocksdb~6.4.tikv and rockset/rocksdb-cloud~v6.7.3. These include remote compaction, namely PluggableCompactionParam and PluggableCompactionResult defined in pluggable_compaction.h, which we plan to include in the future but aren't pursuing from the get-go. So we have decided to use rockset/rocksdb-cloud~v5.18.3 instead.

As the work continued, I was able to make the changes necessary for this, have included them in local an

@de-sh
de-sh / week_six.md
Last active July 29, 2020 05:44
A report of work done in the sixth week of the Google Summer of Code 2020 program

July 5-11: Work on the rust-rocksdb-cloud bindings progressed with experimentation in writing an interface, a ccloud wrapper inspired by the crocksdb C ABI of [rust-rocksdb]. It doesn't seem to be a working option, and I am not entirely sure this is the way forward, so I have asked Yi to reconsider creating an entirely separate interface that is not contained within the same crate as the rocksdb interface. Some of the trial interfaces I have created include a layer of code that effectively invokes CloudEnv::NewAwsEnv(), but there are compilation issues.

@de-sh
de-sh / cndb-se.md
Last active July 19, 2020 18:04
Cloud Native Database Storage Engines

In the cloud native world, databases are more than just stores of data: they act as data exchanges, transporters and, in some form, data processors. In this scheme of things, a database is in effect composed of microservices to make it truly cloud native. Log-structured merge trees are a standard data structure that fits well in this realm of high-volume, high-velocity 'data engines', as I like to call them.

One of the databases that many people refer to in this field is Google's BigTable, well known in the space for having started a conversation. But we are not going to talk about that; instead, we will look at a product born out of Facebook, rocksdb. MyRocks is a database that works at scale and is distributed, building on top of the standards set by MySQL, with a core written in C++ known as rocksdb. Rocksdb-Cloud is a set of tools added on top of this engine that greatly extends its capabilities and adds the ability to work in a public cloud setting, makin

@de-sh
de-sh / week_five.md
Last active July 29, 2020 05:35
A report of work done in week five of the Google Summer of Code 2020 program

June 28 - July 4: I have been unable to keep notes on my activities as the weeks progress, due to a hectic load from the experimental nature of the work allotted to me, but it seems I am not doing badly and have been going in the right direction. I have no experience with compiling C++ code at the scale of the rocksdb-cloud project, but the CMakeLists.txt I have written seems to be somewhat working, judging from the edits that @yiwu-arbug has committed to the repo. I will take notes and document the process here once it is done.

@de-sh
de-sh / week_four.md
Last active July 22, 2020 08:22
A report of work done in week four of the Google Summer of Code 2020 program

June 21-27: Conversed with Yi Wu by video call. I have internal exams going on, but I have started work on the tasks he pointed out on rust-rocksdb#514.

@de-sh
de-sh / fold_rs.md
Last active June 19, 2020 16:36
Summing a vector into a single value

Rust's syntax is driven by readability, and a lot of attention has gone into simple interfaces for a functional style of computation: map, filter and fold let you operate on the items of an iterator, which often makes the program more readable.
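As a minimal sketch of how these adapters read in practice, the snippet below sums a vector with `fold`, and then sums the squares of its even elements by chaining `filter` and `map` before the `fold`. The function names `sum` and `sum_even_squares` are made up for this illustration, and the `|...|` closures passed to each adapter are explained in the next section.

```rust
// Fold every element into a single running total, starting from 0.
fn sum(values: &[i32]) -> i32 {
    values.iter().fold(0, |acc, v| acc + v)
}

// Keep only the even values, square them, then fold them into a total.
fn sum_even_squares(values: &[i32]) -> i32 {
    values
        .iter()
        .filter(|v| **v % 2 == 0)
        .map(|v| v * v)
        .fold(0, |acc, v| acc + v)
}

fn main() {
    let numbers = vec![1, 2, 3, 4];
    println!("{}", sum(&numbers)); // prints 10
    println!("{}", sum_even_squares(&numbers)); // prints 20 (4 + 16)
}
```

The standard library also offers `Iterator::sum` for the plain case. `fold` is used here because it makes the accumulator explicit and works for any combining operation.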

Introducing Closures

A closure, a concept popularized in functional programming, is a function bundled together with the values it operates on, its 'environment'. We are using an anonymous function in our program, which is passed around as a value rather than being called in place. To define an anonymous function, or closure, in Rust, we place the arguments between a pair of | characters and define the function's behaviour right after.

let inc = |i| i+1;
println!("{}", inc(1));
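The `inc` closure above does not use anything from its surroundings. To illustrate the 'environment' part, here is a small sketch in which a closure reads a variable defined outside its own argument list. `make_adder` is a hypothetical helper written for this example.

```rust
// Returns a closure that remembers `step` after this function has returned.
// `move` transfers ownership of `step` into the closure's environment.
fn make_adder(step: i32) -> impl Fn(i32) -> i32 {
    move |i| i + step
}

fn main() {
    let step = 10;
    // This closure borrows `step` from the enclosing scope.
    let add_step = |i| i + step;
    println!("{}", add_step(1)); // prints 11

    let add_five = make_adder(5);
    println!("{}", add_five(1)); // prints 6
}
```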