class CustomMailExporter
  require 'fileutils'

  attr_accessor :service, :target_users, :target_start_date, :target_end_date,
                :path, :errors, :filename_counter, :emails_found

  # Dates should be in Date or DateTime format. Users should be an array.
  def initialize(service, target_users, target_start_date, target_end_date)
    @service = service
    @target_users = target_users.map(&:downcase)
    @target_start_date = target_start_date
    @target_end_date = target_end_date
    @path = "/mnt/#{service.id}/"
    @errors = []
    @filename_counter = 0
    @emails_found = 0
  end

  # Writes every matching email to disk. Returns true only if no errors were recorded.
  def process
    FileUtils.mkdir_p(path) unless File.directory?(path)
    service.metadatum_class.find_each(service.id) do |datum|
      if target_user?(datum.from) && date_in_range?(datum.date)
        write_contents_to_file(datum)
        @emails_found += 1
      end
    end
    return true if errors.count == 0
  end

  # Extracts the bare address from a header such as "Name <user@example.com>".
  def convert_to_email_address(from)
    from.split("<").last.split(">").first
  end

  def target_user?(from)
    target_users.include?(convert_to_email_address(from).downcase)
  end

  def date_in_range?(date)
    date > target_start_date && date < target_end_date
  end

  def write_contents_to_file(datum)
    begin
      path_and_name = path
      if datum.respond_to?(:content_filename) && datum.content_filename.present?
        path_and_name += datum.content_filename.gsub(/[.<>:"\/\\|\?\*']/, "")
        #use gsub for problem characters in subjects
      else
        path_and_name += filename_counter.to_s
        # An explicit receiver is required: a bare `filename_counter += 1`
        # would assign a new local variable instead of calling the setter.
        self.filename_counter += 1
      end
      path_and_name += ".eml" unless path_and_name.include? ".eml"
      File.open(path_and_name, 'wb') do |f|
        datum.content { |chunk| f << chunk }
      end
    rescue => e
      errors << [datum.key, e]
    end
  end
end
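A note on the counter in the fallback branch: in Ruby, a bare `name += 1` inside an instance method assigns a local variable instead of calling the attr_accessor setter, so the increment needs an explicit receiver (`self.filename_counter += 1`) or the instance variable (`@filename_counter += 1`). Here is a minimal sketch of the difference. The `Counter` class and its method names are made up for illustration and are not part of the exporter.

```ruby
# Hypothetical class, used only to illustrate setter vs. local assignment.
class Counter
  attr_accessor :count

  def initialize
    @count = 0
  end

  # Calls the count= setter, so @count really changes.
  def bump_with_self
    self.count += 1
  end

  # `count += 1` is parsed as `count = count + 1` on a new local variable,
  # which is nil at that point, so nil + 1 raises NoMethodError.
  def bump_without_self
    count += 1
  rescue NoMethodError => e
    e.class
  end
end

counter = Counter.new
counter.bump_with_self
bare_result = counter.bump_without_self
```

In the exporter that error would be caught by the `rescue` and pushed onto `errors`, so the fallback file would never be written.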
Also, regarding the date_in_range? method: the comparison you're making, date > target_start_date && date < target_end_date, suggests that target_start_date and target_end_date are non-inclusive. Is this desirable?
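If inclusive bounds turn out to be what you want, `Comparable#between?` is one option. This is only a sketch: the two helper methods take the bounds as arguments so they can run on their own, and the sample dates are made up.

```ruby
# Exclusive bounds, matching the current comparison in the gist.
def date_in_range_exclusive?(date, start_time, end_time)
  date > start_time && date < end_time
end

# Inclusive bounds: between? returns true when date equals either bound.
def date_in_range_inclusive?(date, start_time, end_time)
  date.between?(start_time, end_time)
end

# Hypothetical bounds, for illustration only.
start_time = Time.utc(2015, 1, 1)
end_time   = Time.utc(2015, 1, 31)

on_the_boundary = start_time
exclusive_result = date_in_range_exclusive?(on_the_boundary, start_time, end_time)
inclusive_result = date_in_range_inclusive?(on_the_boundary, start_time, end_time)
```

With the exclusive version, an email stamped exactly at the start or end time is skipped; with the inclusive version it is exported.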
Minor FYI: S3Datum has a method for writing content to a file; see Concerns::S3Datum#write_content_to_file for details.
Regarding lines 46-52, I don't think that you need to worry about Datums missing or not responding to content_filename. The content_filename method is required by the S3Datum interface for files to be stored in S3. Without it, we wouldn't be able to store or fetch content for them anyway. Also, with regard to
path_and_name += datum.content_filename.gsub(/[.<>:"\/\\|\?\*']/, "")
#use gsub for problem characters in subjects
it seems like the GoogleMailDatum and GoogleMailRestModels::CannonicalDatum classes both use the message_id to generate the filename, not the subject, so the gsub call may not be completely necessary, though I guess it's possible that I'm missing something.
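For reference, here is a quick sketch of what that gsub does to a message_id-style name. The sample filename is made up; I have not checked what content_filename actually returns for these classes.

```ruby
# Same character class as in write_contents_to_file.
problem_characters = /[.<>:"\/\\|\?\*']/

# Hypothetical filename, for illustration only.
sample_filename = "<abc.123@mail.example.com>.eml"

cleaned = sample_filename.gsub(problem_characters, "")
cleaned += ".eml" unless cleaned.include?(".eml")
```

Because "." is in the character class, every dot is stripped, including the one in an existing extension, so the later ".eml" check appends the extension again.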
Overall, I think it's good. Nice job!
With regard to the date_in_range? method, you should make sure that you're dealing with consistent objects. The comment above your initialize method suggests that target_start_date and target_end_date should be Date or DateTime objects, but the GoogleMailDatum#date method returns a Time object. The differences between these types could lead to unexpected behavior when you try to make comparisons, as you do in date_in_range?. This script would be much more reliable and easier to work with if you just chose one type of Time object to use throughout the script. I would suggest using Time over Date or DateTime, since Time is the one used by the Datum class.
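One way to do that would be to coerce the bounds to Time once, up front, so every later comparison is Time against Time. A minimal sketch, assuming the bounds arrive as Date, DateTime, or Time; `normalize_to_time` is a hypothetical helper name and the sample values are made up.

```ruby
require 'date'

# Hypothetical helper: Date, DateTime, and Time all respond to to_time
# once the date library is loaded.
def normalize_to_time(value)
  value.to_time
end

# Sample inputs of mixed types, for illustration only.
start_time   = normalize_to_time(Date.new(2015, 1, 1))
end_time     = normalize_to_time(DateTime.new(2015, 2, 1))
message_time = Time.new(2015, 1, 15, 9, 30, 0)

# All three values are now Time objects, so the comparison is well defined.
in_range = message_time.between?(start_time, end_time)
```

Note that Date#to_time gives midnight in the local time zone, so it's worth deciding whether the bounds are meant to be local or UTC.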