@heycarsten
Created April 15, 2013 16:46
My answers to some questions from @hvt on Twitter about LCBO API and Ruby.

@hvt asks: How’d you create your JSON structure? How did you parse LCBO data to show “result”:{..} KVPs Working off an HTML file currently

How'd you create your JSON structure?

I build a hash in Ruby and then serialize it to JSON and send that to the client. Here's an extremely basic example of this in a Rails controller:

class ProductsController < ApplicationController

  def index
    @products = Product.all
    render json: { result: @products }
  end

end

To be clear, this is not how I'm doing it in LCBO API. There, I have a library I built that generates the response hash from the parameters provided in the query string, adding things like pagination metadata and associated objects. My controllers look more like this:

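class ProductsController < ApplicationController

  # respond_with needs the supported formats declared at the class level
  respond_to :json

  def index
    # Product.query is my own library method; it builds the result from the query string
    respond_with @products = Product.query(params)
  end

end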

But under the hood, it's basically doing the same thing. You can check out the documentation for ActionController::Responder, ActionController::MimeResponds, and ActiveModel::Serializers::JSON for more information about how Rails deals with object serialization.
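
To make "the same thing" a bit more concrete, here is a minimal plain-Ruby sketch of building a response hash with pagination metadata and serializing it to JSON. It is not LCBO API's actual code: the Product struct, the build_response method, and the "pager" key and its fields are all assumptions made up for illustration. Only the "result" key comes from the question above.

```ruby
require 'json'

# Hypothetical stand-in for a database record; not part of LCBO API.
Product = Struct.new(:id, :name, :price_in_cents)

# Build the response hash: one page of records plus pagination metadata.
# The "pager" key and its field names are illustrative assumptions.
def build_response(records, page:, per_page:)
  total_pages = (records.size / per_page.to_f).ceil
  offset = (page - 1) * per_page
  page_records = records[offset, per_page] || []

  {
    pager: {
      current_page: page,
      records_per_page: per_page,
      total_record_count: records.size,
      total_pages: total_pages
    },
    result: page_records.map(&:to_h)
  }
end

products = [
  Product.new(1, 'Example Lager', 1295),
  Product.new(2, 'Example IPA', 1450),
  Product.new(3, 'Example Stout', 1375)
]

# In a Rails controller this hash is what you would hand to `render json:`.
response = build_response(products, page: 2, per_page: 2)
json = JSON.generate(response)
puts json
```

The point is only that the JSON structure is an ordinary Ruby hash right up until the last step, so anything you can compute in Ruby can be added to the response before it is serialized.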

Nowadays there are some nice libraries for serializing objects in a Rails app. My favorite is ActiveModel::Serializers, which is part of the Rails API project. If I were to build LCBO API today, I would definitely use it to build my serialized objects instead of rolling my own solution.
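
The general idea behind a serializer object can be sketched in plain Ruby. This is deliberately not the ActiveModel::Serializers API, just a hand-rolled illustration of the pattern. The Product struct, the ProductSerializer class, and the internal_notes field are hypothetical.

```ruby
require 'json'

# Hypothetical model with one field we do not want to expose.
Product = Struct.new(:id, :name, :internal_notes)

# A plain-Ruby serializer object (NOT the ActiveModel::Serializers API).
# It owns the decision about which attributes end up in the JSON.
class ProductSerializer
  ATTRIBUTES = %i[id name].freeze

  def initialize(product)
    @product = product
  end

  # Return a plain hash containing only the whitelisted attributes.
  def as_json
    ATTRIBUTES.to_h { |attr| [attr, @product.public_send(attr)] }
  end
end

product = Product.new(1, 'Example Lager', 'do not publish')
json = JSON.generate(result: ProductSerializer.new(product).as_json)
puts json
```

Keeping this logic in its own class means the model and the controller do not need to know what the public representation looks like.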

How did you parse LCBO data to show “result”:{..} KVPs Working off an HTML file currently

Well, hopefully my answer above helps with the "result":{...} part of your question. As to how I turned the HTML pages on LCBO.com into an easily consumable RESTful API, I can describe that as a series of steps:

  1. Crawl the HTML pages
  2. Parse the HTML into Ruby data structures
  3. Store the parsed data and serve it over HTTP
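
Step 2 can be sketched roughly like this. The HTML fragment is made up and much tidier than the real LCBO.com markup, and parse_product and its field names are hypothetical. A regex keeps the sketch dependency-free, but for real pages a proper HTML parser is the safer choice.

```ruby
# Made-up, table-based HTML fragment; the real LCBO.com markup differs.
HTML = <<~PAGE
  <table>
    <tr><td>Name:</td><td>EXAMPLE LAGER</td></tr>
    <tr><td>Price:</td><td>$12.95</td></tr>
    <tr><td>Volume:</td><td>473 mL</td></tr>
  </table>
PAGE

# Turn label/value table rows into a Ruby hash with typed values.
def parse_product(html)
  # Each match is a [label, value] pair taken from one table row.
  rows = html.scan(%r{<tr>\s*<td>(.*?)</td>\s*<td>(.*?)</td>\s*</tr>}m)
  fields = rows.to_h do |label, value|
    [label.strip.delete_suffix(':').downcase, value.strip]
  end

  {
    name: fields['name'],
    # Store money as integer cents to avoid floating point drift later.
    price_in_cents: (fields['price'].delete('$').to_f * 100).round,
    volume_in_milliliters: fields['volume'].to_i
  }
end

product = parse_product(HTML)
p product
```

Once a page is reduced to a hash like this, storing it and serving it as JSON is the easy part.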

I actually have the code that does steps 1 and 2 open sourced and on GitHub. It's all pretty straightforward, but there are a few things that make parsing the HTML on LCBO.com non-trivial and make the library more complex than would be ideal:

  • Inconsistent character encoding
  • Table-based and whitespace-based layouts: There are no meaningful CSS classes, so sometimes I need to resort to checking for sequences of line breaks to differentiate data.
  • Inconsistent naming conventions: IPA vs I.P.A., EVERTHING IS UPPERCASE, Speling Mastakes.

All of these things add up to more complexity, so it's probably not the greatest learning experience if you don't know where to start. Also, it's been 4 years since I wrote most of that code, and I'd like to think that I could simplify it quite a bit if I were to go on a refactoring spree.
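
The kind of cleanup those quirks call for can be sketched as follows. This is a simplified illustration and not code from the LCBO gem: the VARIANTS and ACRONYMS tables and both method names are assumptions.

```ruby
# Known spelling variants mapped to one canonical form (illustrative only).
VARIANTS = { 'I.P.A.' => 'IPA' }.freeze
# Tokens that should stay uppercase after the ALL CAPS cleanup.
ACRONYMS = %w[IPA].freeze

# Treat the bytes as UTF-8 and replace anything invalid, so later string
# operations do not raise on inconsistently encoded input.
def clean_encoding(raw)
  raw.dup.force_encoding(Encoding::UTF_8).scrub('?')
end

# Collapse runs of whitespace and line breaks, unify variants, undo ALL CAPS.
def normalize_name(raw)
  words = clean_encoding(raw).split
  words.map do |word|
    canonical = VARIANTS.fetch(word.upcase, word.upcase)
    ACRONYMS.include?(canonical) ? canonical : canonical.capitalize
  end.join(' ')
end

puts normalize_name("EXAMPLE   I.P.A.\n\n")
puts normalize_name('example ipa')
```

Both calls print the same canonical name, which is what you want before the data goes into a database. Replacing invalid bytes with "?" is the bluntest possible encoding fix; transcoding from the page's actual encoding would preserve more.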

Step 3 is what LCBO API does. It uses the LCBO gem to crawl LCBO.com and stores that data in a database; then, using Rails, I serve that data to users of the API as serialized JSON. A scheduled job runs daily, starting the crawler and collecting all of the updated store, product, and inventory information. After this is done, some calculations are performed, a CSV snapshot is taken and uploaded to Amazon S3, and the cycle continues for another day.
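
The shape of that daily cycle can be sketched in a few lines. Everything here is a stand-in: the crawl method returns canned data instead of calling the LCBO gem, a Hash plays the role of the database, the price-per-liter figure is just an invented example of a calculation, and the S3 upload is left as a comment.

```ruby
require 'csv'

# Hypothetical stand-in for the crawler; returns canned records.
def crawl
  [
    { id: 1, name: 'Example Lager', price_in_cents: 1295, volume_in_milliliters: 473 },
    { id: 2, name: 'Example IPA', price_in_cents: 1450, volume_in_milliliters: 355 }
  ]
end

# One invented example of a derived value computed after the crawl.
def add_calculations(product)
  per_liter = product[:price_in_cents] * 1000 / product[:volume_in_milliliters]
  product.merge(price_per_liter_in_cents: per_liter)
end

# Serialize every record into one CSV document, header row first.
def csv_snapshot(products)
  CSV.generate do |csv|
    csv << products.first.keys
    products.each { |product| csv << product.values }
  end
end

# Crawl, store, calculate, snapshot. A scheduler would call this daily.
def run_daily_cycle(database)
  crawl.each { |product| database[product[:id]] = add_calculations(product) }
  csv_snapshot(database.values) # the real job would upload this to S3
end

database = {}
snapshot = run_daily_cycle(database)
puts snapshot
```

Keying the store by id means that running the cycle again overwrites yesterday's record for each product instead of duplicating it.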
