@soloman1124
Created March 3, 2015 00:26
Generate a CSV of the hourly event distribution over the past few days
require 'aws-sdk' # v1 SDK (AWS:: namespace)
require 'active_support/all'
require 'csv'
require 'time' # Time.parse

# Start of the reporting window; counts are bucketed per hour from here to now.
EVENTS_FROM = 2.days.ago

EVENT_TYPES = %w(
  page_view user_identify clicked_give_now_cta page_received_donation_-_supporter
  post_created_-_supporter page_shared_-_supporter viewed_supporter_page viewed_sign_up_page
  commented_on_online_donation_-_supporter joined_team_-_supporter
  user_signed_in page_updated_-_supporter user_alias page_created_-_supporter
  viewed_home_page left_team_-_supporter signed_up_-_supporter user_initiated_connection_to_app_-_supporter
  viewed_fundraising_goal_step requested_to_join_team_-_supporter donation_held_-_supporter
  kinesis_consumer_health_check user_connected_to_app_-_supporter completed_order user_signed_up
)

dynamo_db = AWS::DynamoDB::Client.new(api_version: '2012-08-10')

# Count events called `name` whose `time` lies between from_time and to_time
# (inclusive), following pagination via last_evaluated_key.
def count(dynamo_db, name, from_time, to_time, last_key: nil)
  options = {
    table_name: 'EventStore',
    index_name: 'NameTimeIndex',
    select: 'COUNT',
    key_conditions: {
      'name' => {
        comparison_operator: 'EQ',
        attribute_value_list: [{ 'S' => name }]
      },
      'time' => {
        comparison_operator: 'BETWEEN',
        attribute_value_list: [{ 'N' => from_time.to_i.to_s }, { 'N' => to_time.to_i.to_s }]
      }
    }
  }
  options[:exclusive_start_key] = last_key if last_key

  # Retry throttled queries with a linearly growing delay.
  result = nil
  5.times do |t|
    begin
      result = dynamo_db.query(options)
      break
    rescue AWS::DynamoDB::Errors::ProvisionedThroughputExceededException
      puts "try again in #{t + 1} seconds..."
      sleep(t + 1)
    end
  end
  # Fail loudly rather than return nil when every attempt was throttled.
  raise "query for [#{name}] was still throttled after 5 attempts" unless result

  total = result[:count]
  if result[:last_evaluated_key]
    total += count(dynamo_db, name, from_time, to_time, last_key: result[:last_evaluated_key])
  end
  total
end

# Hourly column labels ("YYYY-MM-DD HH:00:00") from `from` up to now.
def time_keys(from: EVENTS_FROM)
  keys = []
  at = from
  while at <= Time.now
    keys << at.strftime("%Y-%m-%d %H:00:00")
    at += 1.hour
  end
  keys
end

# One row per event name, one column per hourly slice; missing slices become 0.
def write_to_csv(event_stats_table, time_columns)
  CSV.open("./events_stats.csv", "wb") do |csv|
    csv << ['Event Names'].concat(time_columns)
    event_stats_table.each do |event_name, entry|
      count_infos = time_columns.map { |key| entry.fetch(key, 0) }
      csv << [event_name].concat(count_infos)
    end
  end
end

time_columns = time_keys()
event_stats_table = {}

begin
  EVENT_TYPES.each do |event_name|
    puts "start counting [#{event_name}]..."
    entry =
      time_columns.each_with_object({}) do |time_slice, hash|
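        from_time = Time.parse(time_slice)
        # BETWEEN is inclusive, so stop one second short of the next slice to
        # avoid counting boundary events twice (assumes `time` holds whole epoch seconds).
        to_time = from_time + 1.hour - 1
        hash[time_slice] = count(dynamo_db, event_name, from_time, to_time)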
      end
    p entry
    event_stats_table[event_name] = entry
  end
ensure
  # Write whatever was collected, even if counting was interrupted.
  p event_stats_table
  write_to_csv(event_stats_table, time_columns)
end
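
The script's `count` method combines two concerns: retrying throttled queries and following `last_evaluated_key` across pages. As a rough sketch only, the same idea can be written as a loop with a separate retry helper. The names `with_backoff`, `paginated_count`, and `ThrottledError` are hypothetical and not part of the gist or the AWS SDK; `client` is assumed to be any object whose `query` returns a hash with `:count` and an optional `:last_evaluated_key`.

```ruby
# Hypothetical stand-in for the SDK's throttling exception, so the sketch
# runs without AWS credentials.
class ThrottledError < StandardError; end

# Run the block, retrying on `retry_on` with a linear delay (1s, 2s, 3s, ...).
# Re-raises the error once `attempts` tries have been used up.
def with_backoff(attempts: 5, retry_on: ThrottledError, sleeper: ->(s) { sleep(s) })
  attempts.times do |t|
    begin
      return yield
    rescue retry_on
      raise if t == attempts - 1
      sleeper.call(t + 1)
    end
  end
end

# Sum :count over every page of a query, passing each page's
# :last_evaluated_key as the next request's :exclusive_start_key.
def paginated_count(client, options, **backoff)
  total = 0
  start_key = nil
  loop do
    page_options = start_key ? options.merge(exclusive_start_key: start_key) : options
    page = with_backoff(**backoff) { client.query(page_options) }
    total += page[:count]
    start_key = page[:last_evaluated_key]
    break unless start_key
  end
  total
end
```

With the real client, one would presumably pass `retry_on: AWS::DynamoDB::Errors::ProvisionedThroughputExceededException`. The loop avoids growing the call stack on heavily paginated results, and an exhausted retry budget surfaces as an exception instead of a missing count.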