@nz
nz / Delete all documents in a Solr index using curl.md
Last active November 13, 2024 01:24
Delete all documents in a Solr index using curl
# http://wiki.apache.org/solr/FAQ#How_can_I_delete_all_documents_from_my_index.3F
# http://wiki.apache.org/solr/UpdateXmlMessages#Updating_a_Data_Record_via_curl

curl "http://index.websolr.com/solr/a0b1c2d3/update?commit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>*:*</query></delete>'
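
For scripts that cannot shell out to curl, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official Solr client: the helper names `build_delete_all_request` and `delete_all` are hypothetical, and the URL is the placeholder from the curl command above.

```python
import urllib.request

# Placeholder URL copied from the curl example; substitute your own index.
SOLR_UPDATE_URL = "http://index.websolr.com/solr/a0b1c2d3/update?commit=true"


def build_delete_all_request(update_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST that deletes every document."""
    body = b"<delete><query>*:*</query></delete>"
    return urllib.request.Request(
        update_url,
        data=body,
        headers={"Content-Type": "text/xml"},
        method="POST",
    )


def delete_all(update_url: str) -> int:
    """Send the delete request and return the HTTP status code."""
    with urllib.request.urlopen(build_delete_all_request(update_url)) as resp:
        return resp.status


# Building the request has no side effects; nothing is sent until delete_all() runs.
request = build_delete_all_request(SOLR_UPDATE_URL)
```

Calling `delete_all(SOLR_UPDATE_URL)` sends the request. Keeping the build step separate from the send step makes it easy to inspect the method, headers, and body before wiping an index.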

I'm amused at the traction this little gist is getting on Google! I would be remiss not to point out that six+ years later I'm still helping thousands of companies on a daily basis with their search index management, by providing managed Solr as a service over at Websolr, and hosted Elasticsearch at Bonsai. Check us out if you'd like an expert helping hand at Solr and Elasticsearch hosting, ops and support!

@alexbowe
alexbowe / nltk-intro.py
Created March 21, 2011 12:59
Demonstration of extracting key phrases with NLTK in Python
import nltk
text = """The Buddha, the Godhead, resides quite as comfortably in the circuits of a digital
computer or the gears of a cycle transmission as he does at the top of a mountain
or in the petals of a flower. To think otherwise is to demean the Buddha...which is
to demean oneself."""
# Used when tokenizing words
sentence_re = r'''(?x) # set flag to allow verbose regexps
([A-Z])(\.[A-Z])+\.? # abbreviations, e.g. U.S.A.
@KirkWylie
KirkWylie / YieldCurve.R
Created November 3, 2011 16:37
Yield Curve Plotting Using OpenGamma and R
##
# Copyright (C) 2011 - present by OpenGamma Inc. and the OpenGamma group of companies
#
# Please see distribution for license.
##
# Loads the time-series for yield curve data points to construct a 3D "curve over time" graph.
# Curve tickers
tickers <- c ("US00O/N Index", "US0001W Index", "US0002W Index", "US0001M Index", "US0002M Index", "US0003M Index", "USSW2 Curncy", "USSW3 Curncy", "USSW4 Curncy", "USSW5 Curncy", "USSW6 Curncy", "USSW7 Curncy", "USSW8 Curncy", "USSW9 Curncy", "USSW10 Curncy", "USSW15 Curncy", "USSW20 Curncy", "USSW25 Curncy", "USSW30 Curncy")
@purcell
purcell / completion.js
Created November 28, 2011 16:54
Cached autocompletion of wikipedia page titles with jquery ui autocomplete plugin
function cached_completer(completer) {
  // Results are keyed by search term so repeated lookups skip the remote call.
  var cache = {};
  return function(request, response) {
    if (request.term in cache) {
      response(cache[request.term]);
    } else {
      completer(request, function(resp) {
        cache[request.term] = resp;
        response(resp);
      });
    }
  };
}
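
The same caching wrapper can be sketched in Python to show the pattern independently of jQuery UI. This is an illustrative port, not part of the original gist: `fake_completer`, `calls`, and `received` are hypothetical stand-ins for a real remote lookup and its callback.

```python
from typing import Callable, Dict, List

Response = Callable[[List[str]], None]
Completer = Callable[[str, Response], None]


def cached_completer(completer: Completer) -> Completer:
    """Wrap a callback-style completer so each term is fetched only once."""
    cache: Dict[str, List[str]] = {}

    def wrapped(term: str, response: Response) -> None:
        if term in cache:
            response(cache[term])
            return

        def remember(results: List[str]) -> None:
            # Store the results before handing them to the caller.
            cache[term] = results
            response(results)

        completer(term, remember)

    return wrapped


# Hypothetical completer that records how often it is actually invoked.
calls: List[str] = []


def fake_completer(term: str, response: Response) -> None:
    calls.append(term)
    response([term + "_page"])


received: List[List[str]] = []
lookup = cached_completer(fake_completer)
lookup("solr", received.append)
lookup("solr", received.append)  # served from the cache
```

After both lookups, `calls` holds a single entry while `received` holds two identical result lists, which is the behaviour the wrapper is meant to provide.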
@hellerbarde
hellerbarde / latency.markdown
Created May 31, 2012 13:16 — forked from jboner/latency.txt
Latency numbers every programmer should know

L1 cache reference ......................... 0.5 ns
Branch mispredict ............................ 5 ns
L2 cache reference ........................... 7 ns
Mutex lock/unlock ........................... 25 ns
Main memory reference ...................... 100 ns
Compress 1K bytes with Zippy ............. 3,000 ns  =   3 µs
Send 2K bytes over 1 Gbps network ....... 20,000 ns  =  20 µs
SSD random read ........................ 150,000 ns  = 150 µs
Read 1 MB sequentially from memory ..... 250,000 ns  = 250 µs
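
To compare the entries listed here at a glance, the following sketch expresses each one as a multiple of an L1 cache reference and converts nanoseconds to microseconds. The values are copied from the table; the names `LATENCY_NS`, `relative_to_l1`, and `to_microseconds` are hypothetical helpers for illustration.

```python
# Latencies in nanoseconds, copied from the table above.
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "Branch mispredict": 5,
    "L2 cache reference": 7,
    "Mutex lock/unlock": 25,
    "Main memory reference": 100,
    "Compress 1K bytes with Zippy": 3_000,
    "Send 2K bytes over 1 Gbps network": 20_000,
    "SSD random read": 150_000,
    "Read 1 MB sequentially from memory": 250_000,
}


def relative_to_l1(name: str) -> float:
    """How many L1 cache references fit into one occurrence of `name`."""
    return LATENCY_NS[name] / LATENCY_NS["L1 cache reference"]


def to_microseconds(ns: float) -> float:
    """Convert nanoseconds to microseconds (1 µs = 1,000 ns)."""
    return ns / 1_000


for name, ns in LATENCY_NS.items():
    print(f"{name:<36} {ns:>9,} ns  {relative_to_l1(name):>9,.0f}x L1")
```

Read this way, a main memory reference costs about 200 L1 cache references, which is the kind of ratio the table is meant to make memorable.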

@yanofsky
yanofsky / LICENSE
Last active October 17, 2024 22:49
A script to download all of a user's tweets into a csv
This is free and unencumbered software released into the public domain.
Anyone is free to copy, modify, publish, use, compile, sell, or
distribute this software, either in source code form or as a compiled
binary, for any purpose, commercial or non-commercial, and by any
means.
In jurisdictions that recognize copyright laws, the author or authors
of this software dedicate any and all copyright interest in the
software to the public domain. We make this dedication for the benefit
@bsweger
bsweger / useful_pandas_snippets.md
Last active November 13, 2024 19:55
Useful Pandas Snippets

A personal diary of DataFrame munging over the years.

Data Types and Conversion

Convert Series datatype to numeric (will error if column has non-numeric values)
(h/t @makmanalp)
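
A minimal sketch of that conversion, assuming `pd.to_numeric` is the call being described. The DataFrame and the column names `price` and `code` are made up for illustration. The `errors="coerce"` variant at the end is an addition not mentioned above; it turns unparseable values into NaN instead of raising.

```python
import pandas as pd

# Hypothetical data: "price" is fully numeric text, "code" contains a bad value.
df = pd.DataFrame({"price": ["1", "2.5", "3"], "code": ["7", "x", "9"]})

# Convert a Series of numeric strings to a numeric dtype.
df["price"] = pd.to_numeric(df["price"])

# With the default errors="raise", a non-numeric value raises ValueError.
try:
    pd.to_numeric(df["code"])
    conversion_failed = False
except ValueError:
    conversion_failed = True

# Optional alternative: replace unparseable values with NaN instead of erroring.
coerced = pd.to_numeric(df["code"], errors="coerce")
```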

def next_digit(value, base):
    return value + str(sum(int(a)*b for a,b in zip(value, base))%11%10)

def make_valid(value, ap2, base):
    return next_digit(next_digit(value, base), ap2+base)

def is_valid_cpf(cpf):
    return make_valid(cpf[:9], [0], [1,2,3,4,5,6,7,8,9]) == cpf

def is_valid_cnpj(cnpj):
'''From Coding Train
https://youtu.be/BAejnwN4Ccw
3/2/2017
Added Genetic Algorithm
4/27/2017
'''
import random
cities = [];
import pandas as pd
import pandas_datareader.data as web
import numpy as np
import datetime
from scipy.optimize import minimize
TOLERANCE = 1e-10
def _allocation_risk(weights, covariances):