@arkadijs
Last active August 29, 2015 14:18
Reactive programming workshop http://ldn.lv/events/220739388

Reactive workshop

Today we're building an Instagram image scraper.

Instagram has an API to poll for recent media and get each item's attributes, including URL and location. Your program will send that information to a pre-cooked web UI for display.

Step zero: Necessary evil

Please log in to Instagram, go to https://instagram.com/developer/ and create an app (Manage Client > Register a New Client) to obtain the Client ID that is required to call Instagram's API. You may put http://rxdisplay.neueda.lv/ into Website and http://rxdisplay.neueda.lv/oauth into OAuth redirect uri. Any other URLs would do too.

Join the workshop chat at https://gitter.im/arkadijs/reactive-workshop to receive updates and code snippets, and to brag about your accomplishments. After each step a solution will be posted to get you back on track, just in case.

Step one: Query Instagram API

Use an HTTP client to request JSON from the Tags API (https://api.instagram.com/v1/tags/$tag/media/recent?client_id=$client_id&count=10), then parse it and start an Observable stream. The stream should contain the image URL and location coordinates, if any. Search the web for popular tags. Use Observable's interval(), from(), just(), create() methods and an HTTP client of your choice. There is a good chance you will enjoy the from(Future|Promise) call.

Use subscribe() to print the data.
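Before wiring anything into an Observable, it can help to get URL building and JSON parsing right on their own. The sketch below is a rough illustration in Python (standard library only), not the workshop's solution. It assumes the response wraps items in a top-level "data" array, with the thumbnail under images.thumbnail.url and an optional "location" object. Check that layout against a real response. The names recent_media_url, parse_media and SAMPLE are made up for this example.

```python
import json
from urllib.parse import quote, urlencode

API_ROOT = "https://api.instagram.com/v1"


def recent_media_url(tag, client_id, count=10):
    """Build the Tags API URL given in the step description."""
    query = urlencode({"client_id": client_id, "count": count})
    return f"{API_ROOT}/tags/{quote(tag, safe='')}/media/recent?{query}"


def parse_media(body, tag):
    """Turn a response body into a list of {tag, url, location} dicts.

    ASSUMED layout (verify against a real response):
    {"data": [{"images": {"thumbnail": {"url": ...}}, "location": {...} or null}]}
    """
    media = []
    for item in json.loads(body).get("data", []):
        media.append({
            "tag": tag,
            "url": item["images"]["thumbnail"]["url"],
            "location": item.get("location"),  # None when the item has no location
        })
    return media


# Hypothetical sample response, trimmed to the assumed fields.
SAMPLE = json.dumps({"data": [
    {"images": {"thumbnail": {"url": "https://example.com/a.jpg"}},
     "location": {"latitude": 51.5, "longitude": -0.08}},
    {"images": {"thumbnail": {"url": "https://example.com/b.jpg"}},
     "location": None},
]})

if __name__ == "__main__":
    print(recent_media_url("tbt", "YOUR_CLIENT_ID"))
    for media in parse_media(SAMPLE, "tbt"):
        print(media)
```

Each dict returned by parse_media is one element you would push into the stream. The list as a whole is what from() or just() would receive.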

Instagram's limit is 5000 API requests per hour per client ID or access token.
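That limit translates directly into a lower bound for the interval() period. As a back-of-the-envelope sketch, assuming one request per tag per polling round (the helper name min_poll_interval is made up here):

```python
def min_poll_interval(tags, limit_per_hour=5000):
    """Smallest period in seconds between polling rounds that stays within the limit.

    Assumes each round issues exactly one request per tag.
    """
    return 3600.0 * tags / limit_per_hour


if __name__ == "__main__":
    for n in (1, 5, 20):
        print(n, "tag(s):", min_poll_interval(n), "s")
```

In practice you would poll noticeably slower than this bound to leave headroom for retries and experiments.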

Step two: Flatten the array

Depending on your approach (from() or just(), with or without a Future/Promise), you may end up with an Observable of Media or an Observable of List of Media. You need an Observable of Media for the next step.

Try flatMap() instead of map(). Try Observable.merge() (flatten).

Optional sidetrack: It is essential to understand how to construct Observables from scratch. Having an Observable of List of Media to play with is a perfect opportunity. Try Observable.create() and/or a (Replay)Subject to bridge the lists into a single-item Observable.
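To see what map(), flatMap() and create() do without any library in the way, here is a toy Observable in plain Python. It is a deliberately simplified sketch, not the ReactiveX API: it is synchronous, has no on_error/on_completed and no unsubscription, and every name in it (the Observable class, from_iterable, collect) is defined here purely for illustration.

```python
class Observable:
    """A toy push-based stream: subscribing runs producer(on_next)."""

    def __init__(self, producer):
        self._producer = producer

    @staticmethod
    def create(producer):
        # producer is a function that receives on_next and calls it per item.
        return Observable(producer)

    @staticmethod
    def from_iterable(items):
        def producer(on_next):
            for item in items:
                on_next(item)
        return Observable.create(producer)

    def subscribe(self, on_next):
        self._producer(on_next)

    def map(self, fn):
        # One item in, one item out.
        return Observable(
            lambda on_next: self.subscribe(lambda x: on_next(fn(x))))

    def flat_map(self, fn):
        # fn returns an Observable per item; every inner item is forwarded downstream.
        return Observable(
            lambda on_next: self.subscribe(lambda x: fn(x).subscribe(on_next)))


def collect(observable):
    """Subscribe and gather everything that was emitted (works because the toy is synchronous)."""
    out = []
    observable.subscribe(out.append)
    return out


# Two hypothetical polling results, each a list of media URLs.
batches = Observable.from_iterable([["a.jpg", "b.jpg"], ["c.jpg"]])

nested = collect(batches.map(list))                          # still an Observable of lists
flat = collect(batches.flat_map(Observable.from_iterable))   # an Observable of single items

if __name__ == "__main__":
    print(nested)
    print(flat)
```

The flat_map line is the whole trick: each list is turned into its own small Observable, and the inner items are merged into one outer stream. A real Rx implementation does the same while also handling errors, completion and concurrency.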

Step three: Send data to UI

We have a UI ready for you at http://rxdisplay.neueda.lv/. POST a JSON document like the following to http://rxdisplay.neueda.lv/in:

{
    "tag":"tbt",
    "url":"https://scontent.cdninstagram.com/....jpg",
    "location":{
        "latitude":51.504976275,
        "longitude":-0.087847965,
        "id":225481160,
        "name":"The Shard London"
    },
    "participant":"change-me"
}

Send the 150x150px thumbnail URL. location is optional. participant distinguishes your feed on the projector's screen. You can debug your personal feed by opening http://rxdisplay.neueda.lv/?participant=change-me.
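A minimal sketch of building and sending that document with Python's standard library follows. The Content-Type header and the 10-second timeout are assumptions, since the step does not say what the server expects. build_payload and post_media are names made up for this example.

```python
import json
import urllib.request

# Endpoint as read from the step text above.
DISPLAY_URL = "http://rxdisplay.neueda.lv/in"


def build_payload(tag, thumbnail_url, participant, location=None):
    """Assemble the JSON document; location is left out entirely when unknown."""
    payload = {"tag": tag, "url": thumbnail_url, "participant": participant}
    if location is not None:
        payload["location"] = location
    return payload


def post_media(payload, endpoint=DISPLAY_URL, timeout=10):
    """POST one media item and return the HTTP status code.

    ASSUMPTION: the server accepts application/json request bodies.
    """
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    item = build_payload(
        "tbt", "https://example.com/thumb.jpg", "change-me",
        location={"latitude": 51.504976275, "longitude": -0.087847965},
    )
    print(json.dumps(item, indent=4))
    # post_media(item)  # uncomment to actually send
```

In the reactive pipeline, post_media would be the body of the subscriber (or of a flatMap step if you want the HTTP response back in the stream).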

Step four: Deduplicate the data

Instagram's /media/recent query may return images already pulled in a previous run. Also, an image may have multiple tags. Apply Observable.distinct() to filter out duplicates. Verify the filtering works. Split the stream by issuing multiple subscriptions, then count the elements of the deduplicated and unfiltered streams.

Use count() (size). If the numbers don't match, check replay() and observeOn().
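The check itself is simple to model outside Rx. The plain-Python sketch below mimics what distinct() with a key selector does and compares the two counts. Using the image URL as the key is an assumption made for this example (a media ID would work equally well), and the sample data is invented.

```python
def distinct(items, key=lambda item: item):
    """Yield each item whose key has not been seen before, preserving order."""
    seen = set()
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            yield item


# Hypothetical results of two consecutive polls; b.jpg and c.jpg overlap.
first_poll = [{"url": "a.jpg"}, {"url": "b.jpg"}, {"url": "c.jpg"}]
second_poll = [{"url": "b.jpg"}, {"url": "c.jpg"}, {"url": "d.jpg"}]

# The list plays the role of a replayed source: both counts see the same items.
raw = first_poll + second_poll
unique = list(distinct(raw, key=lambda media: media["url"]))

raw_count = len(raw)
unique_count = len(unique)

if __name__ == "__main__":
    print("unfiltered:", raw_count, "deduplicated:", unique_count)
```

Note that the set of seen keys grows without bound on an endless stream, which is also true of a naive distinct() on a long-running Observable.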

Ultraviolence mode: Add your own Rx operator

Create a Smooth operator, like Sample and Debounce, but with (1) no event loss and (2) an internal adaptive trigger that tracks the incoming rate and smoothly adapts the outgoing rate to keep it steady:

123........45......67890123.......4 =>
1...2...3...4..5...6.7.8.9.0.1.2.3.4
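One possible starting point, and by no means the intended solution, is to prototype the pacing logic offline before touching timers and schedulers. The sketch below takes a list of arrival timestamps and computes when each event would be emitted. It assumes an exponential moving average of inter-arrival gaps as the "adaptive trigger". The function name smooth_schedule and the alpha parameter are inventions of this example.

```python
def smooth_schedule(arrivals, alpha=0.3):
    """Return one emission time per arrival (no event loss, order preserved).

    arrivals: non-decreasing timestamps in seconds.
    alpha:    weight of the newest inter-arrival gap in the moving average
              (ASSUMED estimator; other rate trackers are possible).
    """
    emissions = []
    interval = None       # current estimate of the outgoing period
    prev_arrival = None
    last_emit = None
    for t in arrivals:
        if prev_arrival is not None:
            gap = t - prev_arrival
            # First gap seeds the estimate; later gaps are blended in.
            interval = gap if interval is None else alpha * gap + (1 - alpha) * interval
        prev_arrival = t
        if last_emit is None or interval is None:
            emit = t  # nothing to pace against yet
        else:
            # Never emit before the event arrived, never closer than the estimated period.
            emit = max(t, last_emit + interval)
        emissions.append(emit)
        last_emit = emit
    return emissions


if __name__ == "__main__":
    bursts = [0, 0, 0, 9, 9, 9, 18, 18, 18]
    for arrived, emitted in zip(bursts, smooth_schedule(bursts)):
        print(f"arrived {arrived:>4}  emitted {emitted:.2f}")
```

This model fixes each emission time when the event arrives and cannot smooth the very first burst, because it has no rate history yet. A live operator would keep a queue and re-arm a timer as the estimate changes, and that is where the real work of this exercise lies.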

Code starters

Complete solutions

References and manuals

  1. Instagram API
  2. ReactiveX

Cloud IDE

In case your development environment of choice is not with you today, or you want to try ReactiveX on an unfamiliar platform, it may be beneficial to use a Cloud IDE. Here is the list.
