I wrote this four years ago; use this command instead:
$ docker rmi $(docker images -q -f dangling=true)
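Docker 1.13 and later also ship `docker image prune`, which removes dangling images in a single step.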
import nltk

with open('sample.txt', 'r') as f:
    sample = f.read()

sentences = nltk.sent_tokenize(sample)
tokenized_sentences = [nltk.word_tokenize(sentence) for sentence in sentences]
tagged_sentences = [nltk.pos_tag(sentence) for sentence in tokenized_sentences]
# batch_ne_chunk was renamed to ne_chunk_sents in NLTK 3
chunked_sentences = nltk.ne_chunk_sents(tagged_sentences, binary=True)
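For reference, a minimal sketch of pulling the entity strings back out of the chunked trees; with binary=True every entity subtree is labeled NE, and extract_entities is just an illustrative helper name:

def extract_entities(tree):
    # collect the leaves of every 'NE' subtree as one entity string
    return [' '.join(word for word, tag in subtree.leaves())
            for subtree in tree.subtrees()
            if subtree.label() == 'NE']

for tree in chunked_sentences:
    print(extract_entities(tree))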
from geventwebsocket.handler import WebSocketHandler
from gevent.pywsgi import WSGIServer
from flask import Flask, request, render_template

app = Flask(__name__)

@app.route('/')
def index():
    return render_template('index.html')
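The websocket imports only pay off once the app is actually served through gevent; a minimal sketch of the usual wiring (the port is an assumption):

# serve the Flask app through gevent so websocket upgrades work
http_server = WSGIServer(('', 5000), app, handler_class=WebSocketHandler)
http_server.serve_forever()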
(function() {

  //
  // Iterates over an array of numbers and returns the sum. Example:
  //
  //   _.sum([1, 2, 3]) => 6
  //
  _.sum = function(obj) {
    // use _.isArray so the mixin doesn't depend on jQuery
    if (!_.isArray(obj) || obj.length === 0) return 0;
    return _.reduce(obj, function(sum, n) {
      return sum + n;
    }, 0);
  };

})();
test:
	clear
	nosetests --with-coverage --cover-package name_utils test_name_utils.py

clean:
	find -regex '.*\.pyc' -exec rm {} \;
	find -regex '.*~' -exec rm {} \;

.PHONY: test clean
import os

PATH = 'path/to/my/blueprints/directory'
BLUEPRINT = 'the_blueprint'

def import_file(path, name=None):
    """Import a file with the given name and path."""
    # use the imp module to do the actual import
    import imp
    name = name or os.path.split(path)[-1].replace(".", "_")
    return imp.load_source(name, path)
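A sketch of how this might be used to register the blueprint on the app, assuming the imported module exposes a flask.Blueprint under the name stored in BLUEPRINT (the views.py filename is made up):

module = import_file(os.path.join(PATH, 'views.py'))
app.register_blueprint(getattr(module, BLUEPRINT))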
// works on selected text
var selection = window.getSelection().toString();
if (selection) {
    selection = selection.match(/[^\r\n]+/g)
        .map(function(a) { return "* " + a; })
        .join("\n");
    prompt("Here's your markdown:", selection);
}
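Paste it into the browser console or save it as a bookmarklet; anything that gives it a DOM selection works.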
import numpy as np
import scipy.sparse as sp
import hat_trie

from sklearn.feature_extraction.text import CountVectorizer, _make_int_array

class HatTrieCountVectorizer(CountVectorizer):

    def _count_vocab(self, raw_documents, fixed_vocab):
        """Create sparse feature matrix, and vocabulary where fixed_vocab=False"""
import elasticsearch
from math import log

def tfidf_matrix(es, index, doc_type, fields, size=10, bulk=500, query=dict(match_all={})):
    """Generate tfidf for `size` documents of `index`/`doc_type`.

    All `fields` need to have the mapping "term_vector": "yes".
    This is the consuming version (i.e. get everything at once).

    :param es: elasticsearch client
    """
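A sketch of calling it, assuming a local cluster and an index whose fields carry the "term_vector": "yes" mapping (the index, type, and field names are made up):

es = elasticsearch.Elasticsearch()  # defaults to localhost:9200
matrix = tfidf_matrix(es, index='articles', doc_type='article',
                      fields=['body'], size=100)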