@karmi
Last active September 4, 2017 02:01
Simple tag cloud with ElasticSearch `terms` facet
# (Re)create the index
curl -X DELETE "http://localhost:9200/tagcloud"
curl -X PUT "http://localhost:9200/tagcloud" -d '{
  "settings" : {
    "index" : {
      "number_of_shards" : 1,
      "number_of_replicas" : 0
    }
  }
}'
# Insert the data
curl -X POST "http://localhost:9200/tagcloud/document" -d '{ "body" : "Ruby is a dynamic, reflective, general-purpose object-oriented programming language that combines syntax inspired by Perl with Smalltalk-like features. Ruby originated in Japan during the mid-1990s and was first developed and designed by Yukihiro \"Matz\" Matsumoto. It was influenced primarily by Perl, Smalltalk, Eiffel, and Lisp." }'
curl -X POST "http://localhost:9200/tagcloud/document" -d '{ "body" : "Erlang is a general-purpose concurrent, garbage-collected programming language and runtime system. The sequential subset of Erlang is a functional language, with strict evaluation, single assignment, and dynamic typing. For concurrency it follows the Actor model. It was designed by Ericsson to support distributed, fault-tolerant, soft-real-time, non-stop applications. It supports hot swapping, thus code can be changed without stopping a system." }'
curl -X POST "http://localhost:9200/tagcloud/document" -d '{ "body" : "JavaScript, also known as ECMAScript, is a prototype-based, object-oriented scripting language that is dynamic, weakly typed and has first-class functions. It is also considered a functional programming language like Scheme and OCaml because it has closures and supports higher-order functions." }'
curl -X POST "http://localhost:9200/tagcloud/_refresh"
# Queries
curl -X POST "http://localhost:9200/tagcloud/_search?search_type=count&pretty=true" -d '
{
  "query" : {
    "match_all" : {}
  },
  "facets" : {
    "tagcloud" : {
      "terms" : { "field" : "body", "size" : 25 }
    }
  }
}'
ebuildy commented Jun 30, 2012

Nice snippet! Is there a way to remove some words (such as "also", "a", "is", ...)?

@serpent403

Yes, it is possible to specify a set of terms that should be excluded from the terms facet results:

{
  "query" : {
    "match_all" : { }
  },
  "facets" : {
    "tagcloud" : {
      "terms" : {
        "field" : "body",
        "exclude" : ["a", "also", . . . ]
      }
    }
  }
}

See the documentation: http://www.elasticsearch.org/guide/reference/api/search/facets/terms-facet.html
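
For anyone who wants to paste this straight into a terminal, here is a rough sketch of the full request with `exclude`, using only the three words from the question above as the stop list. The variable name `EXCLUDE_QUERY` and the helper `tagcloud_without_stopwords` are made up for illustration; the index, field and URL are the ones from the gist, and it assumes an old Elasticsearch version that still supports facets.

```shell
# Request body: the gist's terms facet plus an "exclude" list.
# The three excluded words come from the question above; extend as needed.
EXCLUDE_QUERY='{
  "query" : { "match_all" : {} },
  "facets" : {
    "tagcloud" : {
      "terms" : {
        "field" : "body",
        "size" : 25,
        "exclude" : ["a", "also", "is"]
      }
    }
  }
}'

# Hypothetical helper: send the body to the local node used in the gist.
tagcloud_without_stopwords() {
  curl -s -X POST "http://localhost:9200/tagcloud/_search?search_type=count&pretty=true" \
       -d "$EXCLUDE_QUERY"
}

# Usage (needs a running node with the "tagcloud" index):
#   tagcloud_without_stopwords
```

The excluded terms have to match the indexed tokens, so with the default analyzer they should be lowercase.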

@brycemcd

Note to future searchers: facets are deprecated in favor of aggregations, as shown in the documentation.
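
A rough equivalent of the gist's query using a `terms` aggregation might look like the sketch below. It is untested against any particular release and assumes the same `tagcloud` index and `body` field. The names `AGGS_QUERY` and `tagcloud_aggs` are made up for illustration. A top-level `"size" : 0` stands in for `search_type=count`, and the `Content-Type` header is included because newer versions require it.

```shell
# Request body: a terms aggregation named "tagcloud" over the "body" field.
# "size" : 0 at the top level suppresses hits, so only buckets come back.
AGGS_QUERY='{
  "size" : 0,
  "query" : { "match_all" : {} },
  "aggs" : {
    "tagcloud" : {
      "terms" : { "field" : "body", "size" : 25 }
    }
  }
}'

# Hypothetical helper: send the body to the local node used in the gist.
tagcloud_aggs() {
  curl -s -X POST "http://localhost:9200/tagcloud/_search?pretty=true" \
       -H "Content-Type: application/json" \
       -d "$AGGS_QUERY"
}

# Usage (needs a running node with the "tagcloud" index):
#   tagcloud_aggs
```

The buckets should come back under `aggregations.tagcloud.buckets`, each with a `key` and a `doc_count`. On recent versions an analyzed text field such as `body` will most likely need `fielddata` enabled in its mapping before a terms aggregation on it is accepted.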
