Cedric Hurst divideby0

  • Spantree Technology Group, LLC
  • Chicago, IL
#!/usr/bin/python
# Retrieves a comma-delimited list of unicast hosts for Elasticsearch instances
# running on Marathon, based on the application name.
import sys, getopt, json, requests


def exit_invalid_args():
    # Print the usage string and exit with status 2 (invalid arguments).
    print('usage: marathon-es-hosts.py -m <marathon_url> -a <app_name>')
    sys.exit(2)
divideby0 / first_name.synonyms.txt
Last active May 20, 2016 18:46
First Name Synonyms for Elasticsearch / Solr / Lucene (generated from Freebase name variations)
aadu => ado
aaliyah => aleaseya, alea, aliya, aliyah, allyiah, alia, aleeya, aleah
aaron => arron, aron, ahron
aarti => arti
aayesha => ayesha, ayasha, ayse, ayisha
aba => abod, abony, abos
abagail => abbe, abby, abbie, abilio, gail, abbey, abigale, abigail, abihail, abigayle
abba => abbot, abbott, absa, abbe
abbe => abba, abbot, abbott, absa, abbey, abby, abigail, abihail, abagail, abigayle, abilio, gail, abigale, abbie
abbey => abbe, abilio, gail, abigale, abigail, abihail, abagail, abigayle, abbie, abby
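To make the `term => synonym, synonym, ...` layout above concrete, here is a minimal Python sketch of how such lines could be read into a dictionary. The function names are hypothetical and not part of the gist, and the sketch assumes a single term on the left-hand side of each `=>`, as in the lines shown; the full Solr synonym syntax allows more than that.

```python
def parse_synonym_line(line):
    """Split one 'term => a, b, c' line into (term, [a, b, c])."""
    left, sep, right = line.partition("=>")
    if not sep:
        raise ValueError("expected 'term => synonyms', got: %r" % line)
    term = left.strip()
    # Drop empty entries so a trailing comma does not produce "" synonyms.
    synonyms = [s.strip() for s in right.split(",") if s.strip()]
    return term, synonyms


def load_synonyms(lines):
    """Build a {term: [synonyms]} dict, skipping blank and '#' comment lines."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        term, synonyms = parse_synonym_line(line)
        mapping[term] = synonyms
    return mapping
```

For example, `load_synonyms(["aaron => arron, aron, ahron"])` maps `"aaron"` to the three variants listed on that line.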
➜ curl -L --verbose http://7io.r.mailjet.com/link/x09v/o7huxq8/1/Ws4pdbrgrFEEui3rR6oQbw/aHR0cDovL3JzdHlsZS5tZS9uL2t4YzJkamFuZQ
* Adding handle: conn: 0x7fe942804000
* Adding handle: send: 0
* Adding handle: recv: 0
* Curl_addHandleToPipeline: length: 1
* - Conn 0 (0x7fe942804000) send_pipe: 1, recv_pipe: 0
* About to connect() to 3 port 80 (#0)
* Trying 0.0.0.3...
* Failed to connect to 0.0.0.3: No route to host
* couldn't connect to host at 3:80

In the stemming section of Mapping & Analysis, we refer to a stemmed language called "Lovins" and make an educated guess that this is the language spoken in Lovinia. Lovinia is not a country. Lovins is actually an English-language stemming algorithm, and Snowball includes an implementation of it. The Lovins option works with English text only.

http://snowball.tartarus.org/algorithms/lovins/festschrift.html
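As a hedged illustration of where the Lovins option would be set, the sketch below builds index settings that use Elasticsearch's `stemmer` token filter with `language` set to `lovins`. The filter and analyzer names (`english_lovins`, `lovins_english`) are made up for this example, and the exact settings accepted may vary by Elasticsearch version.

```python
import json


def lovins_index_settings(filter_name="english_lovins",
                          analyzer_name="lovins_english"):
    """Return index settings with a custom analyzer that applies Lovins stemming.

    The filter and analyzer names are hypothetical placeholders.
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # 'stemmer' token filter configured for the Lovins algorithm.
                    filter_name: {"type": "stemmer", "language": "lovins"}
                },
                "analyzer": {
                    analyzer_name: {
                        "type": "custom",
                        "tokenizer": "standard",
                        # Lowercase first, then stem.
                        "filter": ["lowercase", filter_name],
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that could be sent when creating an index.
    print(json.dumps(lovins_index_settings(), indent=2))
```

Because the stemmer is English-only, an analyzer like this is only sensible for fields known to hold English text.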

In response to a performance optimization question during Q&A, we explained that the mlockall option reserves the full amount of heap on startup. While this is true, the more important effect is that the option locks the process's allocated memory in RAM, so it is never swapped or paged out to disk.
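One way to confirm that memory locking actually took effect is to inspect the node info API. The sketch below assumes the response of `GET /_nodes/process` has already been fetched and parsed, and that each node carries a boolean `mlockall` under its `process` object; the function name and the sample node data are invented for illustration.

```python
def nodes_without_mlockall(nodes_info):
    """Return the sorted names of nodes whose process info does not report mlockall.

    `nodes_info` is assumed to be the parsed JSON body of GET /_nodes/process.
    """
    unlocked = []
    for node_id, node in nodes_info.get("nodes", {}).items():
        # Treat a missing flag the same as False: locking is unconfirmed.
        if not node.get("process", {}).get("mlockall", False):
            unlocked.append(node.get("name", node_id))
    return sorted(unlocked)


if __name__ == "__main__":
    # Invented sample data in the assumed response shape.
    sample = {
        "nodes": {
            "abc": {"name": "es-1", "process": {"mlockall": True}},
            "def": {"name": "es-2", "process": {"mlockall": False}},
        }
    }
    print(nodes_without_mlockall(sample))
```

A node that shows up in this list is usually worth checking for a memlock limit that is too low for the configured heap.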

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:*",
        "elasticloadbalancing:*",
        "cloudwatch:*",
        "autoscaling:*",
PUPPET_DIR=/etc/puppet/
yum -y --nogpgcheck install "http://yum.puppetlabs.com/el/${RELEASE}/products/x86_64/puppetlabs-release-6-7.noarch.rpm"
yum update -y
yum -y groupinstall "Development Tools"
yum -q -y makecache
yum -q -y install git puppet ruby-rdoc ruby-devel rubygems
gem install rubygems-update
mkdir -p "$PUPPET_DIR"
gem install librarian-puppet
{
  "_index": "introcloud",
  "_type": "attendee",
  "_id": "511c1f6bf44daf677e1d1ffa",
  "_version": 1,
  "exists": true,
  "_source": {
    "userId": "53FA4F27-5C81-4EC9-A2D4-0123383BCD3E",
    "firstName": "Jane",...
    "events": [
[
{
"name": "Albany Park",
"slug": "albany-park",
"geoJson": {"type":"MultiPolygon","coordinates":[[[[-87.724211,41.975689],[-87.724081,41.975576],[-87.723951,41.975578],[-87.723773,41.97558],[-87.723544,41.975583],[-87.723355,41.975585],[-87.722786,41.975592],[-87.721267,41.975605],[-87.720888,41.975608],[-87.720859,41.974088],[-87.720825,41.974085],[-87.720636,41.974088],[-87.720561,41.974079],[-87.720501,41.974061],[-87.72037,41.97401],[-87.720264,41.973981],[-87.720135,41.973962],[-87.719949,41.97389],[-87.719723,41.973819],[-87.719616,41.973775],[-87.719252,41.973625],[-87.719147,41.973578],[-87.719074,41.973535],[-87.718948,41.973461],[-87.718826,41.973396],[-87.718714,41.973346],[-87.718388,41.973226],[-87.718366,41.973218],[-87.718291,41.973195],[-87.718069,41.973097],[-87.717935,41.973023],[-87.717737,41.972915],[-87.717479,41.972781],[-87.717195,41.972637],[-87.717132,41.972604],[-87.716961,41.972517],[-87.716548,41.97228],[-87.716203,41.972086],[-87.715963,41.971978],[-87.715918,41.97
divideby0 / solr-vs-elasticsearch.md
Last active April 8, 2016 22:04
Solr vs Elasticsearch

Note: This article was greatly influenced by a much more comprehensive overview published by Sematext.

In my opinion, Solr is more mature but Elasticsearch has a few bleeding-edge features which might be worth the transition depending on what you're trying to do.

Elasticsearch is more schemaless than Solr. It also supports nested documents, multi-fields, storing multiple types in the same index, and cross-index searching.

From a management standpoint, the latest versions of Solr (4.x) use ZooKeeper ensembles for clustering, whereas Elasticsearch