Created September 22, 2011 22:37
Brisk + Cassandra + get_slice
Brisk's Hive lets you transpose a Cassandra row into a table of (row_key, column_name, value).
With this change we can also leverage Cassandra's get_slice so that only
a subset of a row's columns is returned. Very useful when using Cassandra indexes (wide rows).
See the pull request: https://github.com/riptano/hive/pull/3
Let's say you have a wide row index.
Instead of running:
SELECT * FROM MyTable WHERE a > x AND b < y;
You can do:
CREATE EXTERNAL TABLE MyDB.MyTmpIdx(key string, col_value int, foreign_key string)
STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
WITH SERDEPROPERTIES (
  "cassandra.cf.name" = "MyCF",
  "cassandra.columns.mapping" = ":key,:column,:value")
TBLPROPERTIES (
  "cassandra.ks.name" = "MyKS",
  "cassandra.slice.predicate.range.start" = "50",
  "cassandra.slice.predicate.range.finish" = "800",
  "cassandra.slice.predicate.range.reversed" = "false",
  "cassandra.slice.predicate.range.comparator" = "org.apache.cassandra.db.marshal.IntegerType");
SELECT MyTable.* FROM MyTable LEFT SEMI JOIN MyTmpIdx ON (MyTable.key = MyTmpIdx.foreign_key);
Boom, you only get a subset of the columns, sliced directly by Cassandra, before the mapping even begins :)
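To make the idea concrete, here is a rough Python sketch of what the slice plus the semi join boil down to. It is only a simulation, not Brisk or Cassandra code: the `get_slice` function, the `index_row` and `my_table` data, and the keys `k1`..`k5` are all made up for illustration. It assumes the index row's column names are integers kept in sorted order (as with the IntegerType comparator above) and that the slice bounds are inclusive, as in Cassandra's SliceRange.

```python
from bisect import bisect_left, bisect_right


def get_slice(columns, start, finish, reversed_=False):
    """Return the (name, value) columns whose name lies in the slice range.

    `columns` must be sorted by column name. Both bounds are inclusive.
    For a reversed slice the caller passes start >= finish, and the
    columns come back in descending order.
    """
    names = [name for name, _ in columns]
    low, high = (finish, start) if reversed_ else (start, finish)
    selected = columns[bisect_left(names, low):bisect_right(names, high)]
    return selected[::-1] if reversed_ else selected


# Hypothetical wide row index: column name = indexed value,
# column value = key of the row in MyTable.
index_row = [(10, "k1"), (50, "k2"), (300, "k3"), (800, "k4"), (900, "k5")]

# Hypothetical contents of MyTable, keyed by row key.
my_table = {
    "k1": {"a": 10},
    "k2": {"a": 50},
    "k3": {"a": 300},
    "k4": {"a": 800},
    "k5": {"a": 900},
}

# Same bounds as the TBLPROPERTIES above: start = 50, finish = 800.
sliced = get_slice(index_row, 50, 800)

# LEFT SEMI JOIN: keep only the MyTable rows whose key shows up
# as a foreign_key in the sliced index.
foreign_keys = {value for _, value in sliced}
result = {key: row for key, row in my_table.items() if key in foreign_keys}
```

The point of the sketch is that the range filter is applied to the sorted index columns first, so only the matching foreign keys ever reach the join.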
Pull request here!
https://github.com/riptano/hive/pull/3