Created January 20, 2013 20:01
MongoDB: Map/Reduce vs Aggregation framework on 1M docs. Basically, this just counts how many times each search query was made (only about 250K of the requests had a search query). Aggregation turned out to be far faster: roughly 83x in the filtering case, by wall-clock time.
1000000 documents
249819 documents with non-empty 'search_query'

                           user     system      total        real
simple map/reduce      0.270000   0.010000   0.280000 (353.701946)
simple aggregation     0.180000   0.010000   0.190000 (  8.049170)
filtering map/reduce   0.250000   0.010000   0.260000 (337.955130)
filtering aggregation  0.150000   0.010000   0.160000 (  4.095468)
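The speedup can be read straight off the "real" column. As a rough sketch (the `REAL_SECONDS` hash and the `speedup` helper are names made up for this illustration, not part of the benchmark script), the ratios work out like this:

```ruby
# Wall-clock ("real") seconds copied from the benchmark output above.
REAL_SECONDS = {
  "simple"    => { map_reduce: 353.701946, aggregation: 8.049170 },
  "filtering" => { map_reduce: 337.955130, aggregation: 4.095468 }
}

# Speedup = map/reduce time divided by aggregation time.
# (Hypothetical helper, used only for this illustration.)
def speedup(times)
  times[:map_reduce] / times[:aggregation]
end

REAL_SECONDS.each do |name, times|
  puts format("%-10s aggregation is %.1fx faster", name, speedup(times))
end
```

The user/system/total columns are nearly identical across all four rows because they only measure the Ruby client process. The actual work happens inside the MongoDB server, so only the real (wall-clock) time is meaningful here.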
require 'benchmark'
require File.expand_path('../config/environment', __FILE__)

puts "#{RequestEvent.count} documents"
puts "#{RequestEvent.ne("data.search_query" => nil).count} documents with non-empty 'search_query'"
puts

# Simple map/reduce: count occurrences of every search_query value
def simple_map_reduce
  map = %Q{
    function() {
      emit(this.data.search_query, { count: 1 });
    }
  }

  reduce = %Q{
    function(key, values) {
      // Start at 0: every emitted value already carries its own count.
      var result = { count: 0 };
      values.forEach(function(value) {
        result.count += value.count;
      });
      return result;
    }
  }

  RequestEvent.all.map_reduce(map, reduce).out(inline: 1).to_a
end

# Simple aggregation
def simple_aggregation
  pipeline = [{ "$group" => { "_id" => "$data.search_query", "count" => { "$sum" => 1 } } }]
  RequestEvent.collection.aggregate(pipeline)
end

# Map/reduce, exclude blank fields
def filtering_map_reduce
  map = %Q{
    function() {
      // Skip documents whose search_query is missing or empty.
      if (this.data.search_query)
        emit(this.data.search_query, { count: 1 });
    }
  }

  reduce = %Q{
    function(key, values) {
      // Start at 0: every emitted value already carries its own count.
      var result = { count: 0 };
      values.forEach(function(value) {
        result.count += value.count;
      });
      return result;
    }
  }

  RequestEvent.all.map_reduce(map, reduce).out(inline: 1).to_a
end

# Aggregate, exclude blank fields
def filtering_aggregation
  pipeline = [
    { "$match" => { "data.search_query" => { "$ne" => nil } } },
    { "$group" => { "_id" => "$data.search_query", "count" => { "$sum" => 1 } } }
  ]
  RequestEvent.collection.aggregate(pipeline)
end

Benchmark.bm do |x|
  x.report("simple map/reduce") { simple_map_reduce }
  x.report("simple aggregation") { simple_aggregation }
  x.report("filtering map/reduce") { filtering_map_reduce }
  x.report("filtering aggregation") { filtering_aggregation }
end
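
A note on the reduce functions: MongoDB may call reduce more than once for the same key and feed earlier reduce output back in as input, so the accumulator's starting value matters. The following is a minimal pure-Ruby sketch of that behaviour. It does not touch MongoDB; `reduce_seeded_with` and `batched_reduce` are hypothetical stand-ins invented for this illustration, and the batch size of 3 is arbitrary.

```ruby
# Ruby stand-in for the JavaScript reduce function, with a configurable seed.
def reduce_seeded_with(seed, values)
  values.reduce({ count: seed }) do |result, value|
    { count: result[:count] + value[:count] }
  end
end

# Emulate the server reducing in batches and then re-reducing the partials.
def batched_reduce(seed, values, batch_size)
  partials = values.each_slice(batch_size).map do |batch|
    reduce_seeded_with(seed, batch)
  end
  reduce_seeded_with(seed, partials)
end

# Six documents that all emitted { count: 1 } for the same search_query.
emitted = Array.new(6) { { count: 1 } }

puts batched_reduce(0, emitted, 3)[:count] # prints 6
puts batched_reduce(1, emitted, 3)[:count] # prints 9
```

With a seed of 0 the result equals the number of emitted values no matter how they are batched. With a seed of 1, every reduce call adds one extra, so the counts drift upward.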