Proposed and run by Andreas Schmidt, Nokia
Based on his design around the Nokia Places API
- Picked JSON, no support for XML
- Added ?accept=application/json to the URL in the browser for a raw response
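
As a rough illustration of the `?accept=application/json` override noted above, the following Ruby sketch appends the parameter to a URL. The endpoint and the helper name `with_accept` are hypothetical, since these notes do not give the real Places API URL. Note that Ruby's form encoder writes the slash as `%2F`, which a server decodes to the same value.

```ruby
require 'uri'

# Hypothetical endpoint, for illustration only (not the real Places API URL).
BASE_URL = 'https://places.example.com/v1/places/123'

# Return the URL with an `accept` query parameter appended.
# Existing query parameters are kept; a previous `accept` value is replaced.
def with_accept(url, media_type = 'application/json')
  uri = URI.parse(url)
  params = URI.decode_www_form(uri.query || '')
  params.reject! { |key, _value| key == 'accept' }
  params << ['accept', media_type]
  uri.query = URI.encode_www_form(params)
  uri.to_s
end

puts with_accept(BASE_URL)
```

Pasting the resulting URL into a browser address bar is one way to ask for the raw JSON response without setting an `Accept` header, assuming the server honors the parameter as described in the notes.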
| #!/bin/bash | |
| ### Copyright 2010 Manuel Carrasco Moñino. (manolo at apache.org) | |
| ### Copyright 2016 Patrick Double (pat at patdouble.com) | |
| ### | |
| ### Licensed under the Apache License, Version 2.0. | |
| ### You may obtain a copy of it at | |
| ### http://www.apache.org/licenses/LICENSE-2.0 | |
| ### | |
| ### A library for shell scripts which creates reports in jUnit format. |
To install scikit-learn easily, run the following command:
curl https://gist.githubusercontent.com/dacamo76/4780765/raw/c3779996d8f6b13caaaa48d33aa1585684c7f8e6/scikit-learn-install.sh | sh
Please review the shell script before running it to make sure it does nothing harmful to your machine.
| #!/usr/bin/env bash | |
| # Downloading the data set with these steps takes a long time. | |
| # First, get the list of available NQuad files to download. | |
| wget http://webdatacommons.org/2012-08/stats/files.list | |
| # We're only interested in the microdata set right now since that seems to be where schema.org/Book is used more. So create a file list | |
| cat files.list | grep html-microdata > microdata_files.list | |
| # OK, this will take a while depending on your connection. Let it run overnight. | |
| wget -i microdata_files.list |
| #!/usr/bin/env ruby | |
| # a quick, simple script to partially parse output from https://github.com/trivio/common_crawl_index/blob/master/bin/remote_read | |
| # and output subdomains in order of count | |
| url_counts = {} | |
| total_urls = 0 | |
| File.readlines(ARGV[0]).each do |line| | |
| url = line.split(' ').first | |
| reverse_hostname = url.split('/').first |
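
The script above is cut off after it extracts the reversed hostname. A minimal sketch of how the counting step might continue is shown here. It assumes each input line starts with a URL whose host is written in reverse order (for example `com.example.www/index.html:http`). The method name `count_hosts` and the sample lines are made up for illustration and are not taken from the original script.

```ruby
# Count hostnames from lines whose first field is a URL with a reversed host.
# Assumption: the first whitespace-separated field looks like
# "com.example.www/path:http"; the remaining fields are ignored.
def count_hosts(lines)
  counts = Hash.new(0)
  lines.each do |line|
    url = line.split(' ').first
    next if url.nil?                      # skip blank lines
    reverse_hostname = url.split('/').first
    next if reverse_hostname.nil? || reverse_hostname.empty?
    # "com.example.www" -> "www.example.com"
    hostname = reverse_hostname.split('.').reverse.join('.')
    counts[hostname] += 1
  end
  # Highest count first; ties broken alphabetically for stable output.
  counts.sort_by { |host, count| [-count, host] }
end

# Made-up sample input, for illustration only.
sample = [
  'com.example.www/a:http 1 2',
  'com.example.www/b:http 3 4',
  'com.example.blog/:http 5 6',
  ''
]

count_hosts(sample).each { |host, count| puts "#{count}\t#{host}" }
```

To run it against a real dump, `File.readlines(ARGV[0])` could be passed in place of `sample`, as in the original script.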
| class StaticResource < Webmachine::Resource | |
| def encodings_provided | |
| {"gzip" => :encode_gzip, "identity" => :encode_identity} | |
| end | |
| def allowed_methods | |
| %W[GET] | |
| end |
| source 'https://rubygems.org' | |
| gem 'webmachine' | |
| gem 'unicorn' |
| require 'bundler/setup' | |
| require 'roar/representer/json' | |
| require 'roar/representer/feature/hypermedia' | |
| require 'webmachine' | |
| class Product | |
| include Roar::Representer::JSON | |
| include Roar::Representer::Feature::Hypermedia | |
| property :name |