@dsilfen-handy
Last active July 7, 2016 21:24
Helpful checklist when writing rake tasks

Rake it 'Til you make it

Tips on making resilient rake tasks for data cleanup

Your script should have the following features:

  1. Print output and show progress on stdout, so you can follow the run
  2. Provide a limit argument, so you can test on a small subset
  3. Provide an offset argument, so you can skip already processed rows
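
The three features above can be sketched in plain Ruby, independent of Rake. `parse_batch_args` and `each_with_progress` are hypothetical helper names, not part of Rake or Rails. Rake passes task arguments as strings (or `nil` when omitted), so the sketch converts them explicitly and treats a missing limit as "no limit".

```ruby
# Hypothetical helpers; the names are illustrative and not part of Rake.

# Rake hands task arguments over as strings, or nil when omitted.
# A missing limit becomes nil ("no limit"); a missing offset becomes 0.
def parse_batch_args(limit, offset)
  limit_given = !(limit.nil? || limit.to_s.empty?)
  offset_given = !(offset.nil? || offset.to_s.empty?)
  {
    limit: (limit_given ? Integer(limit.to_s, 10) : nil),
    offset: (offset_given ? Integer(offset.to_s, 10) : 0)
  }
end

# Applies offset and limit to an array of rows, then yields each row
# while printing a progress line to the given IO (stdout by default).
# Returns the number of rows in the batch.
def each_with_progress(rows, limit: nil, offset: 0, out: $stdout)
  batch = rows.drop(offset)
  batch = batch.first(limit) if limit
  batch.each_with_index do |row, index|
    out.puts "PROCESSING #{index + 1}/#{batch.size}"
    yield row
  end
  batch.size
end
```

Passing `limit: 2, offset: 1` over four rows would visit only the second and third rows, which makes it cheap to rehearse a task on a small slice before a full run.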

Keep in mind while writing:

  1. Ensure idempotency, so multiple runs do not process the same data multiple times.
  2. Keep the task block small. Use methods for business logic, the task should just provide simple iteration.
  3. ActiveRecord calls add up. Be mindful of where bottlenecks may occur.
  4. Consider dropping down to raw SQL for mass insertions/updates.
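
As a rough illustration of the last two points, the sketch below builds one `UPDATE` statement per batch of ids instead of saving each row through ActiveRecord. `bulk_mark_processed_sql` is a hypothetical helper. The `albums` table and `processed` column are assumptions inferred from the example task, and the `TRUE`/`FALSE` literals assume a database that accepts them (PostgreSQL and MySQL do).

```ruby
# Hypothetical helper: returns an array of SQL strings, one per batch.
# Table and column names ("albums", "processed") are assumptions.
def bulk_mark_processed_sql(ids, batch_size: 1000)
  # Integer() raises ArgumentError on non-numeric strings, so arbitrary
  # text can never be interpolated into the statement.
  safe_ids = ids.map { |id| Integer(id) }
  safe_ids.each_slice(batch_size).map do |batch|
    # "AND processed = FALSE" keeps the statement idempotent: rerunning
    # it does not touch rows that were already updated.
    "UPDATE albums SET processed = TRUE " \
      "WHERE id IN (#{batch.join(', ')}) AND processed = FALSE"
  end
end
```

In a Rails app each statement could then be passed to `ActiveRecord::Base.connection.execute`. Note that raw SQL bypasses model validations and callbacks, so this only fits updates that do not depend on them.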

Before you run:

  1. Have a plan for ensuring the results are as expected
  2. Ensure it runs on a local dataset
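
One possible shape for such a plan is to capture row counts before and after the run and compare them with the number of rows the task reported as successful. `backfill_discrepancy` is a hypothetical helper; in a Rails app the counts might come from a query such as `Album.where(processed: false).count`, but here they are plain integers.

```ruby
# Hypothetical check: returns 0 when the number of unprocessed rows
# dropped by exactly the number of reported successes. A non-zero
# result means the run needs a closer look.
def backfill_discrepancy(unprocessed_before:, unprocessed_after:, succeeded:)
  expected_after = unprocessed_before - succeeded
  unprocessed_after - expected_after
end
```

A positive result suggests rows reported as processed were not actually updated. A negative result suggests something else changed the data during the run.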

While it's running & after it's completed:

  1. Keep an eye on the process for unexpected errors.
  2. Spot check the output occasionally to confirm everything is working.
  3. Ask yourself: will this task ever be used again? If the answer is no, why not delete it?
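
To make unexpected errors easier to spot, a task can rescue failures per record and print a tally at the end. The sketch below is one way to do that. `process_with_summary` is a hypothetical wrapper, and the block passed to it is assumed to return a truthy value on success.

```ruby
# Hypothetical wrapper: runs the block for every record, rescues
# per-record errors so one bad row does not abort the whole run,
# and returns a tally that can be spot checked afterwards.
def process_with_summary(records, out: $stdout)
  summary = { succeeded: 0, failed: 0, errors: [] }
  records.each do |record|
    begin
      if yield(record)
        summary[:succeeded] += 1
      else
        summary[:failed] += 1
      end
    rescue StandardError => e
      summary[:failed] += 1
      summary[:errors] << "#{record.inspect}: #{e.class}: #{e.message}"
      out.puts "ERROR #{summary[:errors].last}"
    end
  end
  out.puts "DONE succeeded=#{summary[:succeeded]} failed=#{summary[:failed]}"
  summary
end
```

Rescuing `StandardError` rather than `Exception` leaves interrupts such as Ctrl-C untouched, so the task can still be stopped by hand.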

**Example backfill task**
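namespace :backfill do
  # Usage: rake "backfill:albums[100,0]"
  # Depending on :environment loads the Rails app so the Album model is available.
  task :albums, [:limit, :offset] => :environment do |_task, args|
    # Rake passes arguments as strings, or nil when omitted.
    # A nil limit means "no limit"; nil.to_i makes a missing offset 0.
    limit = args.limit ? args.limit.to_i : nil
    offset = args.offset.to_i
    # Load the batch once so the total does not cost a second query.
    data = get_data_for_backfill(limit, offset).to_a
    total = data.size

    data.each_with_index do |album, index|
      puts "PROCESSING #{index + 1}/#{total}, ALBUM ID #{album.id}"
      backfill_album(album)
    end
  end

  # Idempotent: an album that has already been processed is skipped.
  def backfill_album(album)
    if album.processed?
      puts "ALREADY BEEN PROCESSED #{album.id}"
    else
      was_successful = album.process!
      if was_successful
        puts "SUCCESSFULLY PROCESSED #{album.id}"
      else
        puts "FAILED TO PROCESS #{album.id}"
      end
    end
  end

  # Unprocessed albums in a stable order, so limit and offset
  # page through the rows deterministically.
  def get_data_for_backfill(limit, offset)
    Album
      .where(processed: false)
      .order(:id)
      .limit(limit)
      .offset(offset)
  end
end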