This tutorial is an Elasticsearch bootcamp. Elasticsearch is a fully distributed, scalable search server based on Apache Lucene. Companies like Foursquare, SoundCloud, GitHub and hundreds more use it to power search and analytics in their applications.
At the end of the day, you’ll:
- know the most important concepts and terminology of search engines
- have a deep understanding of Elasticsearch
- apply Elasticsearch to build search applications
- analyze and resolve common problems with Elasticsearch
No prior experience with search or Elasticsearch is required. This tutorial is especially useful for people who already use Elasticsearch for logging and want to learn how to use some of its more advanced features.
Overview of full-text search (1 hour)
- why another datastore?
- theory: information retrieval
- vector space model
- inverted indices
- index construction
- computing scores
- evaluation: precision and recall
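As a rough preview of the topics in this part, the following sketch builds a toy inverted index, ranks documents with a simple tf-idf sum, and computes precision and recall for a result list. The corpus, the function names and the scoring formula are illustrative assumptions; this is not how Lucene or Elasticsearch actually computes scores.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus, for illustration only.
docs = {
    1: "elasticsearch is a search server",
    2: "lucene is a search library",
    3: "mysql is a relational database",
}

def build_inverted_index(docs):
    """Map each term to {doc_id: term frequency in that document}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(text.split()).items():
            index[term][doc_id] = tf
    return index

def search(index, n_docs, query):
    """Rank documents by the sum of tf * idf over the query terms."""
    scores = defaultdict(float)
    for term in query.split():
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in the corpus
        idf = math.log(n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))

def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

index = build_inverted_index(docs)
results = search(index, len(docs), "search server")
```

Here document 1 matches both query terms and ranks above document 2. If only document 1 is judged relevant, `precision_recall(results, [1])` shows the usual trade-off: everything relevant was found, but half of what was returned is not relevant.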
Getting started with ES (30 mins)
- differences between Lucene / Solr / Elasticsearch
- downloading and installing
- distributed features: sharding, replication, fault tolerance
- architecture: indices, types, routing, nodes
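To make sharding and routing a little more concrete: Elasticsearch picks a primary shard for a document by hashing a routing value (by default the document ID) and taking it modulo the number of primary shards. The sketch below mimics that idea only. `pick_shard` is a hypothetical helper, and `zlib.crc32` is a stand-in for the hash function Elasticsearch uses internally.

```python
import zlib
from collections import defaultdict

def pick_shard(routing_key: str, num_primary_shards: int) -> int:
    """Illustrative routing: hash(routing_key) % num_primary_shards.

    crc32 is an assumption chosen because it is deterministic and in the
    standard library; it is not Elasticsearch's real hash.
    """
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards

def distribute(doc_ids, num_primary_shards):
    """Group document ids by the shard they would be routed to."""
    shards = defaultdict(list)
    for doc_id in doc_ids:
        shards[pick_shard(doc_id, num_primary_shards)].append(doc_id)
    return dict(shards)

# Hypothetical document ids spread over 5 primary shards.
layout = distribute([f"doc-{i}" for i in range(20)], 5)
```

Because the shard is derived from the number of primary shards, the same key always lands on the same shard only while that number stays fixed, which is one reason the primary shard count is chosen when an index is created.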
Search (1 hour)
- mappings and datatypes
- configuring analyzers, tokenizers
- query DSL and API overview
- search types: term, prefix, fuzzy, etc.
- sorting, facets, filters, highlighting
- advanced: geo-bound search, more-like-this
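For a feel of what the query DSL looks like, here is a sketch that assembles a search request body as a plain Python dictionary: a full-text `match` clause, an exact `term` filter, a sort and highlighting. The field names (`title`, `status`, `published_at`) and the helper `build_search_body` are hypothetical, and the exact DSL syntax differs between Elasticsearch versions, so treat this as an approximation to check against the documentation for your version.

```python
import json

def build_search_body(text: str, status: str, size: int = 10) -> dict:
    """Build an illustrative search body; field names are assumptions."""
    return {
        "query": {
            "bool": {
                # Analyzed full-text match; contributes to the score.
                "must": [{"match": {"title": text}}],
                # Exact-value filter; restricts results without scoring.
                "filter": [{"term": {"status": status}}],
            }
        },
        "sort": [{"published_at": {"order": "desc"}}],
        "highlight": {"fields": {"title": {}}},
        "size": size,
    }

body = build_search_body("distributed search", "published")
payload = json.dumps(body, indent=2)  # JSON text to send to a _search endpoint
```

The resulting JSON would typically be sent as the body of a request to an index's `_search` endpoint.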
Other features (20 mins)
- percolation, scripting, parent-child documents, rivers
Production (30 mins)
- data flow: pulling data from MySQL for indexing
- security & audit
- performance tuning
- cluster API for health, node state, etc.
- monitoring, alerting, backups, etc.