@hvt asks: How’d you create your JSON structure? How did you parse LCBO data to show “result”:{..} KVPs? Working off an HTML file currently.
I build a hash in Ruby and then serialize it to JSON and send that to the client. Here's an extremely basic example of this in a Rails controller:
class ProductsController < ApplicationController
  def index
    @products = Product.all
    # Wrap the records in a hash; Rails serializes it to JSON.
    render json: { result: @products }
  end
end
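Outside of Rails, the same idea can be sketched with nothing but Ruby's standard library. This is only an illustration, not code from LCBO API, and the product fields (`id`, `name`, `price_in_cents`) are made up for the example:

```ruby
require 'json'

# Hypothetical product records; in a Rails app these would come from the database.
products = [
  { id: 1, name: 'Example Lager', price_in_cents: 250 },
  { id: 2, name: 'Example IPA',   price_in_cents: 325 }
]

# Wrap the records in a hash under the "result" key, then serialize the hash.
response_body = JSON.generate({ result: products })

puts response_body
```

The client receives a single JSON object whose `"result"` key holds the array of products.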
To be clear, this is not how I'm doing it in LCBO API. There, I have a library I built that generates the response hash based on the parameters provided in the query string, adding things like pagination metadata and associated objects to the response. My controllers look more like this:
class ProductsController < ApplicationController
  def index
    respond_with @products = Product.query(params)
  end
end
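To give a rough idea of what "generating the hash from query string parameters" can mean, here is a minimal sketch in plain Ruby. It is not the LCBO API library; `build_response`, the `page` and `per_page` parameters, and the keys under `pager` are all hypothetical names chosen for the example:

```ruby
require 'json'

# Hypothetical stand-in for a query library: it reads paging parameters
# (as strings, the way they arrive in a query string) and returns a hash
# containing one page of records plus pagination metadata.
def build_response(records, params)
  page        = [params.fetch('page', 1).to_i, 1].max
  per_page    = params.fetch('per_page', 20).to_i.clamp(1, 100)
  total_pages = (records.length / per_page.to_f).ceil

  {
    pager: {
      current_page: page,
      records_per_page: per_page,
      total_record_count: records.length,
      total_pages: total_pages
    },
    # each_slice splits the records into pages; a page past the end is empty.
    result: records.each_slice(per_page).to_a[page - 1] || []
  }
end

records  = (1..45).map { |i| { id: i, name: "Product #{i}" } }
response = build_response(records, { 'page' => '3', 'per_page' => '20' })

puts JSON.pretty_generate(response)
```

The controller stays small because everything about shaping the response lives in one place, which is the point of pushing this work into a library.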
But under the hood, it's basically doing the same thing. You can check out the documentation for ActionController::Responder, ActionController::MimeResponds, and ActiveModel::Serializers::JSON for more information about how Rails deals with object serialization.
Nowadays there are some nice libraries for serializing objects in a Rails app. My favorite is ActiveModel::Serializers, which is part of the Rails API project. If I were to build LCBO API today, I would definitely use it to build my serialized objects instead of rolling my own solution.
Well, hopefully my answer above helps with the "result":{...} part of your question. As to how I turned the HTML pages on LCBO.com into an easily consumable RESTful API, I can describe that as a series of steps:
1. Crawl the HTML pages
2. Parse the HTML into Ruby data structures
3. Store the parsed data and serve it over HTTP
I actually have the code that does steps 1 and 2 open sourced on GitHub. It's all pretty straightforward, but there are a few things that make parsing the HTML on LCBO.com non-trivial and make the library more complex than would be ideal:
- Inconsistent character encoding
- Table-based and whitespace-based layouts: There are no meaningful CSS classes, so sometimes I need to resort to checking for sequences of line breaks to differentiate data.
- Inconsistent naming conventions: IPA vs I.P.A., EVERTHING IS UPPERCASE, Speling Mastakes.
All of these things add up to more complexity, so it's probably not the greatest learning experience if you don't know where to start. Also, it's been 4 years since I wrote most of that code, and I'd like to think that I could simplify it quite a bit if I were to go on a refactoring spree.
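To make the three problems listed above a bit more concrete, here is a small sketch of the kind of cleanup they call for. These helpers are hypothetical and are not taken from the LCBO gem; the Latin-1 fallback, the line-break rule, and the `ACRONYMS` list are assumptions made for the example:

```ruby
# Hypothetical list of words that should stay uppercase after normalization.
ACRONYMS = %w[IPA VQA].freeze

# Return a valid UTF-8 string. This sketch assumes that any bytes which
# are not valid UTF-8 are ISO-8859-1 (Latin-1), which may not hold in general.
def to_utf8(raw)
  utf8 = raw.dup.force_encoding(Encoding::UTF_8)
  return utf8 if utf8.valid_encoding?

  raw.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# Split whitespace-laid-out text into fields wherever a blank line
# (two or more line breaks in a row) separates them.
def split_fields(text)
  text.split(/\n\s*\n/).map(&:strip).reject(&:empty?)
end

# Collapse dotted abbreviations (I.P.A. becomes IPA), then capitalize each
# word unless it is a known acronym.
def normalize_name(name)
  collapsed = name.gsub(/\b(?:[A-Za-z]\.){2,}/) { |abbr| abbr.delete('.').upcase }
  collapsed.split.map do |word|
    ACRONYMS.include?(word.upcase) ? word.upcase : word.capitalize
  end.join(' ')
end

puts to_utf8("Ch\xE2teau".b)
puts normalize_name('MILL STREET I.P.A.')
p split_fields("750 mL bottle\n\n\nMade in: Ontario")
```

Rules like these are easy to write one at a time; the complexity comes from how many of them a real site ends up needing.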
As for step 3, this is what LCBO API does. It uses the LCBO gem to crawl LCBO.com and stores that data in a database; then, using Rails, I serve that data to users of the API as serialized JSON. A scheduled job runs daily, starting the crawler and collecting all of the updated store, product, and inventory information. After this is done, some calculations are performed, a CSV snapshot is taken and uploaded to Amazon S3, and the cycle continues for another day.
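The snapshot part of that daily job can be sketched with Ruby's `csv` library. This is an assumption about the general shape, not the actual LCBO API code; `csv_snapshot` and the inventory fields are hypothetical, and the upload to S3 is left out:

```ruby
require 'csv'

# Hypothetical snapshot step: turn stored records (hashes that share the
# same keys) into CSV text, with a header row taken from the first record.
def csv_snapshot(records)
  return '' if records.empty?

  headers = records.first.keys
  CSV.generate do |csv|
    csv << headers.map(&:to_s)
    records.each { |record| csv << record.values_at(*headers) }
  end
end

inventories = [
  { product_id: 1, store_id: 10, quantity: 24 },
  { product_id: 2, store_id: 10, quantity: 6 }
]

puts csv_snapshot(inventories)
```

The resulting string is what would be written to a file and uploaded once the crawl and calculations have finished.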