Partitions vs. Topics for Sharding
Changing topics:
cryptogoth
I’m considering using 64 different topics, listings0 through listings63, instead of 64 partitions. This would let us run multiple ML processors in parallel, as consumer groups for the same shard.
We can have the KS apps actually running on the Solr machines themselves, along with the reducer for a particular shard.
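A minimal sketch of that per-shard-topic layout, assuming the listings0..listings63 naming above; the broker address and group naming are placeholders. Note that a consumer group can only run one active consumer per partition, so each shard topic would itself need multiple partitions to get intra-shard parallelism:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ShardTopicConsumer {
    public static void main(String[] args) {
        int shard = Integer.parseInt(args[0]); // 0..63

        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        // One consumer group per shard: every ML processor for this shard joins
        // the same group, and Kafka balances the topic's partitions across them.
        props.put("group.id", "ml-processor-shard" + shard); // hypothetical group name
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("listings" + shard));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // ML processing for this shard's listings would go here.
                }
            }
        }
    }
}
```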
hyungoo [5:50 PM]
@cryptogoth would we need more than 64-way parallelism?
[5:50]
also, if we want to parallelize, we can keep it a single topic and have 64*n partitions | |
cryptogoth | |
[6:23 PM] | |
@hyungoo we would benefit from more than 64-way parallelism. For example, 4-way parallelism within each of the 64 shards (256-way) | |
[6:23] | |
if we are running a backfill of ML processor jobs, that could be beneficial | |
hyungoo [6:23 PM] | |
i see | |
[6:23] | |
i’m fine with having 256 partitions | |
cryptogoth | |
[6:23 PM] | |
The design in the roadmap and design document uses 64 partitions, but there are some difficulties (edited) | |
hyungoo [6:23 PM] | |
but i don’t see why we’d need 64 topics, though | |
[6:24] | |
i’d have 1 topic with 256 partitions instead
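One hedged way to realize that single-topic layout while preserving shard locality is a custom producer partitioner that reserves 4 of the 256 partitions for each of the 64 shards. The class name, key format, and shard-extraction logic below are illustrative assumptions, not anything decided in this thread:

```java
import java.util.Map;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;

// Illustrative partitioner: reserves 4 of a topic's 256 partitions for each
// of the 64 shards, so one topic keeps the shard layout while still allowing
// 4-way parallelism per shard.
public class ShardRangePartitioner implements Partitioner {
    private static final int PARTITIONS_PER_SHARD = 4;

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        // Assumes non-null keys of the form "shardId:listingId" (hypothetical).
        int shard = Integer.parseInt(((String) key).split(":")[0]);
        int offset = Math.floorMod(key.hashCode(), PARTITIONS_PER_SHARD);
        return shard * PARTITIONS_PER_SHARD + offset;
    }

    @Override public void close() {}

    @Override public void configure(Map<String, ?> configs) {}
}
```

A producer would opt in with props.put("partitioner.class", ShardRangePartitioner.class.getName()), and shard s's records would then land only on partitions 4s through 4s+3.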
cryptogoth | |
[6:24 PM] | |
* We cannot auto-balance within a partition. Consumers outside of a group would have to hardcode / subscribe to that partition, and Kafka could not help distribute load between them
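To make that concrete (a sketch; the topic name and shard-to-partition indexing are assumptions): with one partition per shard, a standalone consumer must pin itself via assign(), and Kafka never rebalances a manual assignment:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class PinnedPartitionConsumer {
    public static void main(String[] args) {
        int shard = Integer.parseInt(args[0]); // 0..63

        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        // No group.id: this consumer stands alone, outside group management.
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // assign() hardcodes this consumer to its shard's partition. Kafka
            // never rebalances a manual assignment: a second consumer making
            // the same assign() call duplicates the work instead of sharing it.
            consumer.assign(Collections.singletonList(new TopicPartition("listings", shard)));
            // subscribe() with a group.id would balance load automatically, but
            // only at whole-partition granularity - two members of a group can
            // never split a single partition between them.
        }
    }
}
```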
hyungoo [6:25 PM] | |
oh | |
[6:25] | |
yeah | |
[6:25] | |
having 1 topic with 256 partitions would not have any issue with that?
cryptogoth | |
[6:25 PM] | |
So if we have 4 ML processor jobs for shard 1, they couldn’t help consume partition 1 in an auto-balanced way | |
hyungoo [6:26 PM] | |
why can’t we have 4 partitions for each shard/column? | |
[6:26] | |
hm actually | |
[6:26] | |
back to a more important question | |
[6:27] | |
why do we think 64 partitions are not parallel enough?
[6:27] | |
your points are true, but i’d like to avoid that unless we actually need to | |
cryptogoth | |
[6:27 PM] | |
64 partitions would probably be parallel enough.
hyungoo [6:27 PM] | |
is it the indexing part? the KS reducing part? or the ML inference part?
cryptogoth | |
[6:28 PM] | |
the ML inference part is where I think it would be useful to have a separate topic for each shard | |
hyungoo [6:29 PM] | |
hm | |
[6:29] | |
that’d actually be one of the biggest bottlenecks
[6:30] | |
but we could tackle it in different ways | |
[6:30] | |
we could have just one consumer for the ML inference part and scale cicero servers up/down
[6:30] | |
or we could let one consumer talk to a single cicero server (possibly a local one) | |
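A rough sketch of that second option, pairing each consumer with a co-located cicero server. The /infer endpoint, topic, and group names are purely hypothetical; cicero's real interface isn't described in this conversation:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class LocalInferenceConsumer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("group.id", "ml-inference");            // hypothetical group name
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        HttpClient cicero = HttpClient.newHttpClient();
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("listings")); // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Forward each record to the co-located cicero server. The
                    // endpoint below is purely hypothetical.
                    HttpRequest request = HttpRequest.newBuilder()
                            .uri(URI.create("http://localhost:8080/infer"))
                            .POST(HttpRequest.BodyPublishers.ofString(record.value()))
                            .build();
                    cicero.send(request, HttpResponse.BodyHandlers.ofString());
                }
            }
        }
    }
}
```

Scaling then becomes a matter of adding consumer/cicero pairs, which is why the partition count caps the parallelism.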
cryptogoth | |
[6:30 PM] | |
true. interesting | |
hyungoo [6:30 PM] | |
and scale the number of consumers - in this case, we may need 64+ partitions
[6:31] | |
I have a feeling 64 cicero servers would be good for our first implementation, though
[6:32] | |
unless we move on to cloud or k8s soon | |
cryptogoth | |
[6:33 PM] | |
in my current KTables demo i’ll stick with partitions for now | |
[6:33] | |
right now we only have 4 prod cicero servers (blackbird05-blackbird08) that i’m aware of | |
hyungoo [6:34 PM] | |
sounds good to me | |
[6:36] | |
keeping things simple would help us get our first n-to-n version working quickly | |
cryptogoth | |
[7:43 PM] | |
:thumbsup: