Last active
October 15, 2019 02:34
public class Event {
    @PartitionKey(0) public UUID accountId;
    @PartitionKey(1) public String yearMonthDay;
    @ClusteringColumn public UUID eventId;
    //other data...
}
public static void sampleUsage() {
    // We want to query ONLY data from three years ago for a set of accounts, so we will generate the matching tokens somehow.
    // Also note that one token will likely yield many Events.
    Set<UUID> accounts = getRelevantAccounts();
    List<String> dateRange = generateDateRange("2016-01-01", "2017-01-01");
    // Note: Token is not serializable; we can represent it with a custom class wrapping byte arrays.
    PCollection<Token> tokensToQuery = p.apply(generateMyTokens(accounts, dateRange));
    PCollection<Event> events = tokensToQuery.apply(CassandraIO.<Event>readAll("Select * from Event where token(accountId, yearMonthDay) = ?"));
    // The query above, or even just the table, could be specified with the builder pattern; this is just an example.
}
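/*
sampleUsage() leaves generateDateRange and the serializable token wrapper undefined. A minimal sketch of both follows,
using only the JDK. Everything here is an assumption made for illustration: the class names DateRanges and
SerializableToken are hypothetical, yearMonthDay is assumed to be an ISO yyyy-MM-dd string, the end date is treated
as exclusive, and tokens are assumed to be Murmur3 (64-bit) values. None of this is part of CassandraIO.
*/

```java
import java.io.Serializable;
import java.nio.ByteBuffer;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: expands a date range into one yearMonthDay string per day.
final class DateRanges {
    private DateRanges() {}

    // Returns every day in [startInclusive, endExclusive) formatted as yyyy-MM-dd.
    static List<String> generateDateRange(String startInclusive, String endExclusive) {
        LocalDate start = LocalDate.parse(startInclusive);
        LocalDate end = LocalDate.parse(endExclusive);
        if (end.isBefore(start)) {
            throw new IllegalArgumentException("end must not be before start");
        }
        List<String> days = new ArrayList<>();
        for (LocalDate d = start; d.isBefore(end); d = d.plusDays(1)) {
            days.add(d.toString()); // LocalDate.toString() is ISO-8601 for these years
        }
        return days;
    }
}

// Hypothetical wrapper: carries a token as raw bytes so it can travel through a pipeline.
final class SerializableToken implements Serializable {
    private static final long serialVersionUID = 1L;
    private final byte[] bytes;

    private SerializableToken(byte[] bytes) {
        this.bytes = bytes.clone(); // defensive copy keeps the wrapper immutable
    }

    // Assumes the Murmur3 partitioner, whose tokens fit in a signed 64-bit long.
    static SerializableToken ofMurmur3(long value) {
        return new SerializableToken(ByteBuffer.allocate(Long.BYTES).putLong(value).array());
    }

    long asMurmur3() {
        return ByteBuffer.wrap(bytes).getLong();
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof SerializableToken && Arrays.equals(bytes, ((SerializableToken) o).bytes);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(bytes);
    }
}
```

/*
With an exclusive end date, generateDateRange("2016-01-01", "2017-01-01") covers exactly the calendar year 2016,
one partition-key component per day.
*/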
/*
Currently CassandraIO queries over the entire token range and allows for filtering. If we want to exclude tens or
hundreds of thousands of primary keys, that won't work well, so instead I'm proposing a way to supply a list of Tokens to query.
Similar to how CassandraIO currently bunches up token range queries as a List<List<Query>>, we can do the same under the hood, ideally grouping by the node
that owns the token.
I believe it would also be possible, under the hood, to make the current implementation use something similar: it would take a PCollection<TokenRange>
and a query, and in the case proposed above each token range would span only one actual token.
*/
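/*
The "group by the node that owns the token" idea can be sketched without any driver classes. This is only an
illustration under stated assumptions: tokens are Murmur3 longs, the ring is handed in as a map from each node's
ring token to a node name, and a node with ring token T owns the range (previous token, T], wrapping around at the
end of the ring. TokenGrouping, ownerOf and groupByOwner are hypothetical names, not CassandraIO or driver APIs;
a real implementation would read ownership from the cluster metadata instead.
*/

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical helper: buckets query tokens by the node assumed to own them.
final class TokenGrouping {
    private TokenGrouping() {}

    // Finds the owner of a token: the node with the smallest ring token >= token,
    // or the first node on the ring when the token lies past the last ring token.
    static String ownerOf(long token, NavigableMap<Long, String> ring) {
        if (ring.isEmpty()) {
            throw new IllegalArgumentException("ring must not be empty");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(token);
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // Groups tokens per owning node so each group can be queried as one batch.
    static Map<String, List<Long>> groupByOwner(Iterable<Long> tokens, NavigableMap<Long, String> ring) {
        Map<String, List<Long>> groups = new TreeMap<>();
        for (long token : tokens) {
            groups.computeIfAbsent(ownerOf(token, ring), node -> new ArrayList<>()).add(token);
        }
        return groups;
    }
}
```

/*
Each resulting group plays the role of one inner list in the List<List<Query>> batching described above, except
that every query in the group targets a single token rather than a token range.
*/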