Cassandra
Reference:
https://opensourceconnections.com/blog/2013/07/24/understanding-how-cql3-maps-to-cassandras-internal-data-structure-sets-lists-and-maps/
// Column-Family
-----------------------------------------------------------------------------------------------
ID  Last   First  Bonus
1   Doe    John   8000
2   Smith  Jane   4000
3   Beck   Sam    1000
Cassandra is a partitioned row store. Rows are organized into tables with a required primary key.
Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner.
Cassandra automatically repartitions as machines are added to and removed from the cluster.
Row store means that, like a relational database, Cassandra organizes data by rows and columns.
In a row-oriented database management system, the data would be stored like this:
1,Doe,John,8000;2,Smith,Jane,4000;3,Beck,Sam,1000;
In a column-oriented database management system, the data would be stored like this:
1,2,3;Doe,Smith,Beck;John,Jane,Sam;8000,4000,1000;
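The two layouts above can be reproduced with a short sketch. The `rows` list and both function names are illustrative only, and Python is used purely for demonstration; this is not how any real engine serializes data on disk.

```python
# Sample table from above: (ID, Last, First, Bonus)
rows = [
    (1, "Doe", "John", 8000),
    (2, "Smith", "Jane", 4000),
    (3, "Beck", "Sam", 1000),
]

def row_oriented(rows):
    """Serialize one record after another."""
    return "".join(",".join(str(v) for v in row) + ";" for row in rows)

def column_oriented(rows):
    """Serialize one column after another."""
    columns = zip(*rows)  # transpose rows into columns
    return "".join(",".join(str(v) for v in col) + ";" for col in columns)

print(row_oriented(rows))
print(column_oriented(rows))
```

The only difference between the two functions is the transpose step, which is why a row store favors reading whole records and a column store favors scanning one attribute across many records.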
Cassandra is essentially a column-family store: each row key maps to its own set of named columns.
Cassandra would store the above data as:
"Bonuses" : {
row1 : { "ID":1, "Last":"Doe", "First":"John", "Bonus":8000},
row2 : { "ID":2, "Last":"Smith", "First":"Jane", "Bonus":4000}
...
}
// [ BASIC PRIMARY KEY ]   RowKey: PARTITION_KEY_VALUE   column = FIELD_NAME   value = FIELD_VALUE
-----------------------------------------------------------------------------------------------
RowKey: 1
=> (column=, value=, timestamp=1374546754299000)
=> (column=field2, value=00000002, timestamp=1374546754299000)
=> (column=field3, value=00000003, timestamp=1374546754299000)
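The basic mapping above can be sketched as a small flattening function. This is a toy model, not Cassandra code: `to_cells` is a hypothetical helper, and the partition key column name `field1` is an assumption, since the dump shows only its value as the RowKey. The cell with an empty column name is modelled as a row marker.

```python
def to_cells(row, key_column):
    """Flatten one logical row into (row_key, [(column_name, value), ...])."""
    row_key = row[key_column]
    cells = [("", None)]  # row marker: empty column name, empty value
    for name in sorted(row):  # keep columns ordered by name, as in the dump
        if name != key_column:
            cells.append((name, row[name]))
    return row_key, cells

# 'field1' as the partition key column is an assumed name.
row_key, cells = to_cells({"field1": 1, "field2": 2, "field3": 3}, "field1")
print("RowKey:", row_key)
for name, value in cells:
    print(f"=> (column={name}, value={'' if value is None else value})")
```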
// [ COMPOSITE PRIMARY KEY ]   RowKey: PARTITION_KEY_VALUE
//                             column = CLUSTERING_KEY_VALUE:FIELD_NAME   value = FIELD_VALUE
-----------------------------------------------------------------------------------------------
RowKey: softwaredoug
=> (column=2013-07-13 08:21:54-0400:, value=, timestamp=1374673155373000)
=> (column=2013-07-13 08:21:54-0400:lat, value=4218a5e3, timestamp=1374673155373000)
=> (column=2013-07-13 08:21:54-0400:long, value=c29d1917, timestamp=1374673155373000)
=> (column=2013-07-13 08:21:54-0400:tweet, value=486176696e67206368657374207061696e2e, timestamp=1374673155373000)
=> (column=2013-07-21 12:15:27-0400:, value=, timestamp=1374673155407000)
=> (column=2013-07-21 12:15:27-0400:lat, value=42185f3b, timestamp=1374673155407000)
=> (column=2013-07-21 12:15:27-0400:long, value=c29d2560, timestamp=1374673155407000)
=> (column=2013-07-21 12:15:27-0400:tweet, value=486176696e67206368657374207061696e2e, timestamp=1374673155407000)
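To read a dump like the one above, it helps to split each column name back into its clustering value and field name and to decode the hex values. The sketch below assumes that `lat` and `long` are 4-byte big-endian floats and that `tweet` is UTF-8 text; those types are inferred from the values, not stated in the dump, and both helper names are hypothetical.

```python
import struct

def split_column(name):
    """Split 'CLUSTERING_VALUE:FIELD_NAME' on the last colon."""
    clustering, _, field = name.rpartition(":")
    return clustering, field

def decode(field, hex_value):
    """Decode a hex-dumped value; the type chosen per field is an assumption."""
    raw = bytes.fromhex(hex_value)
    if field in ("lat", "long"):
        return struct.unpack(">f", raw)[0]  # 4-byte big-endian float
    return raw.decode("utf-8")  # treat everything else as text

cells = [
    ("2013-07-13 08:21:54-0400:lat", "4218a5e3"),
    ("2013-07-13 08:21:54-0400:long", "c29d1917"),
    ("2013-07-13 08:21:54-0400:tweet", "486176696e67206368657374207061696e2e"),
]
for name, hex_value in cells:
    clustering, field = split_column(name)
    print(clustering, field, decode(field, hex_value))
```

Splitting on the last colon matters here because the clustering value is a timestamp that itself contains colons.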
// [ COMPOSITE PRIMARY KEY, MULTIPLE PARTITION AND CLUSTERING KEYS ]
//                             RowKey: PARTITION_KEY[1]_VALUE:PARTITION_KEY[2]_VALUE
//                             column = CLUSTERING_KEY[1]_VALUE:CLUSTERING_KEY[2]_VALUE:FIELD_NAME   value = FIELD_VALUE
-----------------------------------------------------------------------------------------------
RowKey: partitionVal1:partitionVal2
=> (column=clusterVal1:clusterVal2:, value=, timestamp=1374630892473000)
=> (column=clusterVal1:clusterVal2:normalfield1, value=6e6f726d616c56616c31, timestamp=1374630892473000)
=> (column=clusterVal1:clusterVal2:normalfield2, value=6e6f726d616c56616c32, timestamp=1374630892473000)
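The naming rule for multiple partition and clustering keys can be sketched as plain string joins. `storage_layout` is a hypothetical helper, and the field values `normalVal1` and `normalVal2` are what the hex values in the dump decode to when read as UTF-8 text, which is an assumption about the column type.

```python
def storage_layout(partition_values, clustering_values, fields):
    """Return (row_key, {column_name: hex_value}) for one logical row."""
    row_key = ":".join(partition_values)
    prefix = ":".join(clustering_values) + ":"
    columns = {prefix: ""}  # row marker: clustering prefix only, empty value
    for name, value in fields.items():
        columns[prefix + name] = value.encode("utf-8").hex()
    return row_key, columns

row_key, columns = storage_layout(
    ["partitionVal1", "partitionVal2"],
    ["clusterVal1", "clusterVal2"],
    {"normalfield1": "normalVal1", "normalfield2": "normalVal2"},
)
print("RowKey:", row_key)
for name, value in columns.items():
    print(f"=> (column={name}, value={value})")
```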
// [ MAP ]   column = MAP_FIELD_NAME:KEY   value = VALUE_OF_KEY
-----------------------------------------------------------------------------------------------
RowKey: scott
=> (column=, value=, timestamp=1374684062860000)
=> (column=phonenumbers:bill, value='555-7382', timestamp=1374684062860000)
=> (column=phonenumbers:jane, value='555-8743', timestamp=1374684062860000)
=> (column=phonenumbers:patricia, value='555-4326', timestamp=1374684062860000)
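A map, under this layout, is just one cell per key. The sketch below models that with a hypothetical `map_cells` helper; the replacement number `555-0000` is made up for illustration. In this toy model, changing one map entry touches a single cell rather than rewriting the whole map.

```python
def map_cells(field_name, mapping):
    """One cell per map entry, named FIELD_NAME:KEY and ordered by key."""
    return {f"{field_name}:{key}": value for key, value in sorted(mapping.items())}

cells = map_cells(
    "phonenumbers",
    {"jane": "555-8743", "bill": "555-7382", "patricia": "555-4326"},
)
for name, value in cells.items():
    print(f"=> (column={name}, value='{value}')")

# Updating one key only replaces that key's cell (hypothetical new number).
updated = dict(cells)
updated["phonenumbers:bill"] = "555-0000"
```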
// [ LIST ]   column = LIST_FIELD_NAME:UUID   value = FIELD_VALUE
// The UUIDs keep the entries in order. Inserts are fast; deletes are slow (they require a scan).
-----------------------------------------------------------------------------------------------
RowKey: john
=> (column=, value=, timestamp=1374687324950000)
=> (column=friends:26017c10f48711e2801fdf9895e5d0f8, value='doug', timestamp=1374687206993000)
=> (column=friends:26017c11f48711e2801fdf9895e5d0f8, value='patricia', timestamp=1374687206993000)
=> (column=friends:26017c12f48711e2801fdf9895e5d0f8, value='scott', timestamp=1374687206993000)
=> (column=friends:6c504b60f48711e2801fdf9895e5d0f8, value='matt', timestamp=1374687324950000)
=> (column=friends:6c504b61f48711e2801fdf9895e5d0f8, value='eric', timestamp=1374687324950000)
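The ordering and deletion behaviour of the list layout can be illustrated with Python's standard `uuid` module. The sketch assumes the UUIDs in the dump are version 1 (time-based) and orders cells by their embedded timestamp; `ordered_values` and `delete_by_value` are hypothetical helpers, not Cassandra APIs.

```python
import uuid

# Cells from the dump above, deliberately shuffled: UUID hex -> value.
cells = {
    "6c504b61f48711e2801fdf9895e5d0f8": "eric",
    "26017c11f48711e2801fdf9895e5d0f8": "patricia",
    "6c504b60f48711e2801fdf9895e5d0f8": "matt",
    "26017c10f48711e2801fdf9895e5d0f8": "doug",
    "26017c12f48711e2801fdf9895e5d0f8": "scott",
}

def ordered_values(cells):
    """Order list entries by the timestamp embedded in each UUID."""
    keys = sorted(cells, key=lambda h: uuid.UUID(hex=h).time)
    return [cells[k] for k in keys]

def delete_by_value(cells, value):
    """Removing by value has to inspect every cell, hence the scan."""
    return {k: v for k, v in cells.items() if v != value}

print(ordered_values(cells))
print(ordered_values(delete_by_value(cells, "scott")))
```

Appending only needs a fresh UUID with a later timestamp, so no existing cell is read, while removing a value means looking at every cell to find the matching one.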
// [ SET ]   column = SET_FIELD_NAME:VALUE   value = EMPTY
-----------------------------------------------------------------------------------------------
RowKey: john
=> (column=, value=, timestamp=1374688135443000)
=> (column=friends:'doug', value=, timestamp=1374688108307000)
=> (column=friends:'eric', value=, timestamp=1374688135443000)
=> (column=friends:'matt', value=, timestamp=1374688135443000)
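Because a set member lives in the column name and the value stays empty, uniqueness and ordering fall out of the column names themselves. A minimal sketch, with `set_cells` as a hypothetical helper:

```python
def set_cells(field_name, values):
    """One cell per distinct member: the member is in the name, the value is empty."""
    names = sorted({f"{field_name}:'{v}'" for v in values})
    return [(name, "") for name in names]

# Adding 'doug' twice still yields a single cell.
cells = set_cells("friends", ["matt", "doug", "eric", "doug"])
for name, value in cells:
    print(f"=> (column={name}, value={value})")
```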