Gary Wu garywu

# cheap imitation of %pycat
from IPython.display import display, HTML
from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import HtmlFormatter
def show(filename):
    text = !cat $filename
    text = '\n'.join(text)
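
The preview above stops before the highlighting step. A minimal sketch of how the joined text might be turned into HTML with Pygments follows; the helper name `render_source_html` and the plain `<pre>` fallback are assumptions for illustration, not part of the original gist.

```python
import html


def render_source_html(source: str) -> str:
    """Return an HTML fragment that shows `source` as Python code."""
    try:
        from pygments import highlight
        from pygments.lexers import PythonLexer
        from pygments.formatters import HtmlFormatter
    except ImportError:
        # Assumption: fall back to plain escaped text when Pygments is missing.
        return "<pre>" + html.escape(source) + "</pre>"
    # noclasses=True inlines the styles, so no separate stylesheet is needed.
    return highlight(source, PythonLexer(), HtmlFormatter(noclasses=True))
```

In a notebook, the result could be passed to `display(HTML(...))`, with the file contents read by `!cat` as above or by ordinary file I/O.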

List all exited containers

docker ps -q -f status=exited

remove all exited containers

docker rm $(docker ps -q -f status=exited)

Delete all containers

docker rm $(docker ps -a -q)

Delete all images

http://www.winwaed.com/blog/2012/04/09/calculating-word-statistics-from-the-gutenberg-corpus/
Following on from the previous article about scanning text files for word statistics, I shall extend this to use real large corpora. First we shall use this script to create statistics for the entire Gutenberg English language corpus. Next I shall do the same with the entire English language Wikipedia.
Project Gutenberg is a non-profit project to digitize books with expired copyright. It is noted for including large numbers of classic texts and novels. All of the texts are available for free online. Although the site could be spidered, this is strongly discouraged due to the limited resources of the project. Instead, a CD or DVD image can be downloaded (or purchased), or you can create a mirror. Instructions for doing this are here, and include the following recommended shell command:
rsync -avHS --delete --delete-after --exclude '*cache/generated' [email protected]::gutenberg /home/ftp/pub/mirrors/gutenberg
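
The word-statistics step that the excerpt refers to is not shown in this preview. As a rough sketch of the idea, assuming a simple lowercase tokenization rule (the regular expression and both function names are hypothetical, not the blog's script):

```python
import re
from collections import Counter

# Assumption: a "word" is a run of ASCII letters and apostrophes.
WORD_RE = re.compile(r"[a-z']+")


def word_counts(text):
    """Count word occurrences in a single text, ignoring case."""
    return Counter(WORD_RE.findall(text.lower()))


def corpus_counts(texts):
    """Merge per-text counts into one Counter for the whole corpus."""
    total = Counter()
    for text in texts:
        total.update(word_counts(text))
    return total
```

For a mirrored corpus, `texts` could be a generator that reads one file at a time, so the whole collection never has to sit in memory.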
garywu / hidden folder and files
Created February 16, 2017 23:43
hidden folder/files
Cmd + Shift + . shows hidden folders and files (names with a leading .) in the file-open dialog
garywu / update-wordpress-page
Created February 10, 2017 20:52
update-wordpress-page: updates a WordPress page with the contents of the input file
#!/usr/bin/env python
#
# update-wordpress-page: updates a Wordpress page with the contents
# of the input file
#
# Described on http://free-electrons.com/blog/automated-wp-page-updates/
# Download: git.free-electrons.com/training-scripts/tree/wordpress
#
# Usage: update-wordpress-page url-base post-id content-file
#
garywu / !jupyter-kernel-gateway-hello-world
Last active February 8, 2017 05:05
jupyter-notebook-kernel-gateway-hello-world
Jupyter notebook as microservice with kernel gateway and docker
garywu / pg-pong.py
Created February 6, 2017 13:23 — forked from karpathy/pg-pong.py
Training a Neural Network ATARI Pong agent with Policy Gradients from raw pixels
""" Trains an agent with (stochastic) Policy Gradients on Pong. Uses OpenAI Gym. """
import numpy as np
import pickle  # Python 3; the original used cPickle (Python 2)
import gym
# hyperparameters
H = 200 # number of hidden layer neurons
batch_size = 10 # every how many episodes to do a param update?
learning_rate = 1e-4
gamma = 0.99 # discount factor for reward
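
To illustrate what the discount factor `gamma` does, here is a minimal sketch of computing discounted returns in plain Python. The function name `discounted_returns` is hypothetical, and this is a generic version: it does not include any Pong-specific handling the full script may apply.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every time step t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns
```

With `gamma` close to 1, a reward at the end of an episode still contributes noticeably to the returns of much earlier steps; smaller values make the agent weight near-term rewards more heavily.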
#Mining YouTube using Python & performing social media analysis (on ALS ice bucket challenge)
#https://www.analyticsvidhya.com/blog/2014/09/mining-youtube-python-social-media-analysis/
#complete Python script to mine YouTube data. Just replace your key and keyword you want to search
from apiclient.discovery import build #pip install google-api-python-client
from apiclient.errors import HttpError #pip install google-api-python-client
from oauth2client.tools import argparser #pip install oauth2client
import pandas as pd #pip install pandas
import matplotlib.pyplot as plt #pip install matplotlib
//https://hackernoon.com/cracking-nut-nodejs-express-block-get-remote-request-client-ip-address-e4cdfa461add#.2faupv29q
var express = require('express')
var app = express()
// Part1, defining blacklist
var BLACKLIST =['192.0.0.1'];
// Part2, Getting client IP
var getClientIp = function(req) {
  var ipAddress = req.connection.remoteAddress;