Nicolas Couture ncouture

@ncouture
ncouture / setup.md
Created December 16, 2015 18:47 — forked from xrstf/setup.md
Nutch 2.3 + ElasticSearch 1.4 + HBase 0.94 Setup

Info

This guide sets up a non-clustered Nutch crawler that stores its data in HBase. It does not cover setting up Hadoop and related components, only the bare minimum needed to crawl and index websites on a single machine.

Terms

  • Nutch - the crawler (fetches and parses websites)
  • HBase - filesystem storage for Nutch (Hadoop component, basically)
@ncouture
ncouture / tidy.conf
Last active July 29, 2019 20:05 — forked from paultreny/tidy.conf
The config file I use for tidy-html5. $ tidy -config <path/to/tidy.conf> input.html > output.html
# Example tidy-html5 configuration file.
fix-uri: yes
keep-time: yes
hide-comment: yes
indent-cdata: yes
omit-optional-tags: yes
output-encoding: utf8
indent: auto
indent-spaces: 2
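
The command line in the description above can also be driven from a script. The following is a minimal sketch, assuming the `tidy` executable is on `PATH`; the helper names `build_tidy_command` and `run_tidy` are hypothetical and are not part of the gist.

```python
import subprocess


def build_tidy_command(config_path, input_path):
    """Return the argv list for: tidy -config <config_path> <input_path>."""
    return ["tidy", "-config", config_path, input_path]


def run_tidy(config_path, input_path, output_path):
    """Run tidy and write its stdout to output_path; return the exit status.

    check=True is deliberately not used: tidy reports warnings through a
    non-zero exit status, so the caller decides how to treat the result.
    """
    with open(output_path, "wb") as out:
        result = subprocess.run(
            build_tidy_command(config_path, input_path), stdout=out
        )
    return result.returncode
```

Calling `run_tidy("tidy.conf", "input.html", "output.html")` mirrors the shell redirect shown in the description.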
@ncouture
ncouture / setup.py
Created May 5, 2016 20:10 — forked from mmerickel/setup.py
SSLOnlyMiddleware
from setuptools import setup

# Register the middleware as a PasteDeploy filter named "ssl_only".
setup(
    entry_points={
        'paste.filter_app_factory': [
            'ssl_only = myapp.middlewares.ssl_only:make_filter',
        ],
    }
)
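
The entry point above names a `make_filter` factory in `myapp.middlewares.ssl_only`, but the preview does not show it. The following is a rough sketch of what such a factory could look like, not the gist author's implementation. It assumes the standard `paste.filter_app_factory` calling convention `(app, global_conf, **local_conf)` and redirects any request whose `wsgi.url_scheme` is not `https`. The redirect status and the way the URL is rebuilt are illustrative choices.

```python
def make_filter(app, global_conf, **local_conf):
    """Hypothetical filter factory: wrap `app` so plain-HTTP requests are redirected."""

    def ssl_only(environ, start_response):
        # Pass HTTPS requests straight through to the wrapped application.
        if environ.get("wsgi.url_scheme") == "https":
            return app(environ, start_response)

        # Rebuild the request URL with an https scheme (simplified: a port
        # in the Host header is kept as-is).
        host = environ.get("HTTP_HOST") or environ.get("SERVER_NAME", "localhost")
        path = environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", "")
        query = environ.get("QUERY_STRING", "")
        location = "https://" + host + path + ("?" + query if query else "")

        start_response(
            "301 Moved Permanently",
            [("Location", location), ("Content-Length", "0")],
        )
        return [b""]

    return ssl_only
```

With the entry point registered, a PasteDeploy configuration could refer to the filter as `egg:myapp#ssl_only`, assuming the distribution is named `myapp`.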
@ncouture
ncouture / pyquery-spider.py
Created November 6, 2016 13:34
This is a re-implementation using PyQuery instead of XPaths for the Scrapy spider tutorial found here: http://doc.scrapy.org/en/latest/intro/tutorial.html#our-first-spider
from scrapy.spider import BaseSpider
# Requires this patch:
# https://github.com/joehillen/scrapy/commit/6301adcfe9933b91b3918a93387e669165a215c9
from scrapy.selector import PyQuerySelector
class DmozSpiderPyQuery(BaseSpider):
    name = "pyquery"
    allowed_domains = ["dmoz.org"]
    start_urls = [
@ncouture
ncouture / reverse-tether.sh
Created March 15, 2017 17:12 — forked from chrisob/reverse-tether.sh
Android reverse tethering over bridged SSH tap interface
#!/bin/bash
# Based on the scripts written by class101 of xda-developers.com:
# http://forum.xda-developers.com/showpost.php?p=57490025&postcount=205
#
# This script enables a secure tunnel for your android phone to "reverse tether"
# and access the internet/a private network via the following steps:
#
# 1. Establish a level 2 (TAP) tunnel from your local host to a remote server via SSH (tap0)
# 2. Establish a level 3 interface between your local host and your android phone via USB (usb0)
@ncouture
ncouture / face_identification.ipynb
Created April 19, 2018 18:57 — forked from onidzelskyi/face_identification.ipynb
A simple Python app that shows an easy way to identify a particular face among other faces
import subprocess
import argparse
import base64
import json
"""
Currently uses Google's cloud speech API
"""
from googleapiclient import discovery
@ncouture
ncouture / compound-query.js
Created May 6, 2018 15:35 — forked from ademilter/compound-query.js
Firestore #firebase
var citiesRef = db.collection("cities");
var query = citiesRef.where("capital", "==", true);
citiesRef.where("state", "==", "CA");
citiesRef.where("population", "<", 100000);
citiesRef.where("name", ">=", "San Francisco");
citiesRef
.where("state", "==", "CA")