@awreece
Created June 4, 2013 18:27
[08:43:33] <awreece> what is a good benchmark for memcached?
[08:44:27] <awreece> https://github.com/antirez/mc-benchmark appears to be several years old, and last I checked there was some disagreement about it
[08:44:55] <awreece> in particular, I want to experiment with a new storage engine
[11:02:50] <dormando> awreece: github.com/dormando/mc-crusher
[11:05:28] <awreece> cool. I'll try this
[11:05:57] <awreece> does it make any attempt to simulate realistic application workloads, or is the general wisdom "every workload is different" ?
[11:06:43] <awreece> (I notice "it only works well with small values")
[11:08:08] <dormando> most workloads are different. mc-crusher lets you set up config files for different types
[11:08:23] <dormando> it's mostly designed to stress the storage engine locks/etc. find the breaking points.
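For reference, mc-crusher configs are one connection template per line, as comma-separated key=value options. The option names below follow the examples bundled in the repo's conf/ directory, but treat them as approximate and verify against the repo; a mixed read-heavy/write stress might look something like:

```
# many readers, a few writers against the same key space
# (option names per mc-crusher's bundled conf/ examples; approximate)
send=ascii_get,recv=blind_read,conns=50,key_prefix=foo,key_count=200000
send=ascii_set,recv=blind_read,conns=5,key_prefix=foo,key_count=200000,value_size=100
```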
[11:08:48] <awreece> ok
[11:09:00] <awreece> for reference, I'm looking into integrating the work from this paper https://www.cs.cmu.edu/~dga/papers/memc3-nsdi2013.pdf
[11:09:25] <awreece> hopefully as an alternate engine
[11:09:26] <dho> ha
[11:10:24] <dormando> awreece: I think I remember that paper... IIRC some part of it says "writes aren't really that important"
[11:10:33] <dormando> so it wasn't super interesting to me. it's a clever idea though
[11:10:33] <awreece> I don't know if the community has discussed it earlier or not; my impression from talking to the author was that the code wasn't actually prepared to contribute back
[11:10:42] <dormando> yeah it never is
[11:10:46] <awreece> IIRC it said "resizes don't happen"
[11:10:48] <dormando> people write papers like that all the time
[11:10:54] <dormando> resizes was another thing
[11:11:09] <dormando> but it also said theoretical write speed is low
[11:11:10] <awreece> my goal is to actually prototype it
[11:11:16] <dormando> unless I'm confusing that with one of the other 20 papers
[11:11:17] <dho> dormando: yes, that's the read-heavy workload paper
[11:11:17] <awreece> really?
[11:11:34] <dormando> yeah it's sorta buried in there
[11:11:39] <dho> dormando: the one I was showing you a few months ago
[11:12:17] <dormando> some of them don't work with slab rebalance too
[11:12:32] <awreece> I've talked to the paper author and have his blessing to work on it
[11:12:40] <dormando> cool. good luck
[11:12:57] <awreece> when I get farther, should I also email the mailing list?
[11:12:57] <dho> it falls over for write-heavy workloads
[11:13:01] <dormando> yeah
[11:13:05] <awreece> ok
[11:13:26] <dho> I would say it would be very nice if those algorithms could be used dynamically.
[11:13:36] <dho> or at least swapped out.
[11:13:37] <awreece> dho: perhaps I didn't read the paper that carefully, but I never got the impression it was slower on writes than existing solutions
[11:13:49] <dho> awreece: it's definitely in there, sec
[11:13:52] <awreece> dho: well, I was planning to make a memcached engine
[11:14:11] <awreece> so it should theoretically be easy to swap out
[11:14:59] <dho> "When more write traffic is generated, performance of cuckoo hash tables declines because Inserts are serialized and more Lookups happen concurrently"
[11:15:09] <dho> additionally, it serializes all writes
[11:15:18] <dormando> so does the current one :P
[11:15:43] <awreece> "it allows only one writer at a time—a tradeoff we accept as our target workloads are read-heavy"
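A toy sketch of the design being quoted, in the spirit of the MemC3 paper: every key has two candidate buckets, lookups probe both, and inserts are serialized behind a single writer lock while they displace ("kick") entries along a chain. All names and parameters here are illustrative, not MemC3's actual code (which is C, with lock-free versioned lookups):

```python
import threading
import zlib

class CuckooTable:
    """Toy 2-choice cuckoo hash table with a single writer lock."""

    def __init__(self, nbuckets=8, slots=4, max_kicks=64):
        self.nbuckets = nbuckets
        self.slots = slots              # slots per bucket (4-way, as in MemC3)
        self.max_kicks = max_kicks      # bound on the displacement chain
        self.buckets = [[] for _ in range(nbuckets)]
        self.write_lock = threading.Lock()  # only one writer at a time

    def _bucket(self, key, seed):
        # deterministic hash so a key's two bucket choices are stable
        return zlib.crc32(repr((seed, key)).encode()) % self.nbuckets

    def get(self, key):
        # Readers probe both candidate buckets and never take the write
        # lock (MemC3 guards this with version counters instead).
        for b in (self._bucket(key, 1), self._bucket(key, 2)):
            for k, v in self.buckets[b]:
                if k == key:
                    return v
        return None

    def set(self, key, value):
        with self.write_lock:           # inserts are serialized here
            candidates = (self._bucket(key, 1), self._bucket(key, 2))
            for b in candidates:        # update in place if already present
                for i, (k, _) in enumerate(self.buckets[b]):
                    if k == key:
                        self.buckets[b][i] = (key, value)
                        return True
            for b in candidates:        # easy case: a candidate has room
                if len(self.buckets[b]) < self.slots:
                    self.buckets[b].append((key, value))
                    return True
            # Both full: kick a victim along its alternate buckets. This
            # chain is why writes scale poorly -- the lock stays held for
            # the whole (potentially long) displacement walk.
            b = candidates[0]
            for _ in range(self.max_kicks):
                self.buckets[b].append((key, value))
                key, value = self.buckets[b].pop(0)   # displace oldest entry
                if self._bucket(key, 1) == b:
                    b = self._bucket(key, 2)          # victim's other bucket
                else:
                    b = self._bucket(key, 1)
                if len(self.buckets[b]) < self.slots:
                    self.buckets[b].append((key, value))
                    return True
            return False                # chain too long: table needs a resize
```

The point of the sketch is the shape of the tradeoff, not fidelity: gets touch no lock at all, while every set funnels through one lock whose hold time grows with the kick chain.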
[11:15:56] <awreece> yeah
[11:16:07] <awreece> my impression of the current code was that it serialized writers
[11:16:16] <dormando> it is, but the lock is really short
[11:16:21] <dho> anyhoo. just some thoughts. i don't actually use memcached, i just sit in here to torment dormando
[11:16:21] <dormando> so something complex + serialized will lose to it
[11:16:22] <awreece> yeah, this could be held for a while
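A sketch of the baseline dormando is defending: the stock engine also serializes writes, but behind a lock held only for one short bucket update (roughly the role of memcached's global cache lock around its chained hash). Something "complex + serialized", holding its writer lock for a long kick chain, can lose to this even though both serialize. Names here are illustrative, not memcached's actual internals:

```python
import threading

class ChainedTable:
    """Toy chained hash table with one short-hold global lock."""

    def __init__(self, nbuckets=1024):
        self.buckets = [[] for _ in range(nbuckets)]
        self.cache_lock = threading.Lock()

    def set(self, key, value):
        b = hash(key) % len(self.buckets)
        with self.cache_lock:           # critical section: one chain walk
            chain = self.buckets[b]
            for i, (k, _) in enumerate(chain):
                if k == key:
                    chain[i] = (key, value)
                    return
            chain.append((key, value))

    def get(self, key):
        b = hash(key) % len(self.buckets)
        with self.cache_lock:
            for k, v in self.buckets[b]:
                if k == key:
                    return v
            return None
```

With enough buckets the chains stay short, so the lock is held for only a few comparisons per write; that short hold time is what a longer serialized section has to beat.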
[11:16:29] <dho> and watch him yell at people
[11:16:31] <dormando> dho: shush
[11:16:31] <awreece> also, hi dho
[11:16:37] <dho> also hi
[11:16:38] <awreece> I mostly see you in #go-nuts
[11:16:49] <dho> yarp
[11:16:53] <dho> i work with dormando
[11:16:58] <dho> so obligatory hard time
[11:16:59] <dormando> awreece: feel free to engineer whatevs. just be aware that it's hard for me to "swap in the default" engine despite dozens of people screaming at me to do so :P
[11:17:29] <awreece> do we have real benchmarks to be able to say stuff like "our target workloads are read heavy' or whatnot?
[11:17:41] <dormando> I'm not sure what you mean
[11:17:42] <dho> you can configure mccrusher to effectively make those claims
[11:17:57] <dormando> well.. yeah.
[11:18:00] <dho> if that's what you're asking
[11:18:02] <awreece> well, I don't use memcached in production, but I'm sure there exist people that do :P
[11:18:07] <awreece> the people who would complain
[11:18:09] <dormando> with mc-crusher you can just run some readers and some writers
[11:18:10] <awreece> do they have numbers?
[11:18:15] <dormando> I don't know
[11:18:18] <awreece> traces?
[11:18:22] <awreece> ok, thats fair
[11:18:29] <dho> there are people who have write-heavy workloads
[11:18:36] <dormando> some people do massive imports when memcached starts
[11:18:39] <dormando> or daily imports and stuff
[11:18:40] <awreece> my first question is absolutely "what do workloads look like?"
[11:18:45] <awreece> hrm
[11:18:48] <dormando> workloads look like everything and all things under the sun
[11:18:51] <dho> there are people who have 10% hit rates
[11:19:01] <dho> but it's still a win, because it's still better than 0%
[11:19:03] <awreece> ... why are they using memcached
[11:19:05] <dormando> so I only ever improve the performance of all edge cases
[11:19:06] <awreece> ha, ok
[11:19:08] <dormando> I never reduce it
[11:19:19] <dormando> ie; I can't increase memory usage
[11:19:23] <dormando> and I can't slow down writes in favor of reads
[11:19:29] <dormando> I only just chip away at making them scale better
[11:19:38] <awreece> mreh
[11:19:47] <dormando> that's what alternative storage engines were supposed to solve
[11:19:55] <dormando> but we've had issues getting people to test that branch
[11:20:11] <awreece> it took me a while to find the engine branch
[11:20:22] <dho> that's a good approach ;)
[11:20:23] <dormando> yeah. the stupid engine branch is behind 1.4 for perf patches
[11:20:30] <dormando> so bench 1.4
[11:20:34] <dormando> then write your engine into 1.5 or something
[11:20:52] <dormando> if you're looking to be sure that "write speed below this level is fine" you'll never find that
[11:21:00] <awreece> could you repeat that one more time?
[11:21:02] <dormando> there're tons of companies who use the thing who'll never post anything anywhere ever
[11:21:14] <awreece> you want me to submit code against 1.5?
[11:21:14] <dormando> awreece: the engine branch is behind the stable tree
[11:21:17] <dormando> it is slower
[11:21:19] <awreece> yeah, I saw that
[11:21:21] <awreece> oh, really
[11:21:30] <dormando> yes
[11:21:32] <awreece> ... why?
[11:21:44] <dormando> because every time I beg for help testing the engine branch nobody does it
[11:21:50] <dormando> so I had to keep developing on -stable
[11:21:57] <dormando> because this project is cursed and I hate it
[11:22:01] <awreece> :(
[11:22:23] <dormando> the engine tree is fine, I just never got enough attention to it to line it up with memcached itself
[11:22:32] <dormando> so the engine tree appears in mysql, couchbase, etc
[11:22:41] <dormando> with storage engines which have different requirements and expectations
[11:23:05] <dormando> anyway I gotta run
[11:23:06] <dormando> have fun
[11:23:09] <awreece> thanks