#Preface
These are a number of questions I've distilled over my first week of using Elasticsearch.
When ingesting data into Elasticsearch, should you use a single index for each application and then use types within that index to break up the data?
My current setup is like this: a `project` index, with `users` and `leagues` types within that index.
If you are storing documents in two different types, how do you deal with documents in different types which contain the same data? For example, a `User` document which contains a `League`, and then a `League` document which contains the `User`.
Is it more beneficial to store one large document, or a greater number of smaller documents? If I need to analyse `Users` in `Leagues`, it makes sense to include all the leagues which the user is in. The opposite method raises the further question of how to relate documents inside Elasticsearch if they are normalised.
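To make the two options concrete, here is a minimal Python sketch of the document shapes being compared. The field names (`league_ids`, `leagues`) and the sample data are hypothetical, chosen only for illustration, and nothing here talks to Elasticsearch itself.

```python
# Hypothetical source rows, as they might come out of MySQL.
league = {"id": 10, "name": "Sunday League"}
user = {"id": 1, "name": "Alice", "league_ids": [10]}


def denormalised_user(user, leagues_by_id):
    """Build one large user document with full league objects embedded."""
    return {
        "id": user["id"],
        "name": user["name"],
        "leagues": [leagues_by_id[i] for i in user["league_ids"]],
    }


def normalised_user(user):
    """Build a small user document that only references leagues by ID.

    The application would then need a second lookup to resolve them.
    """
    return {
        "id": user["id"],
        "name": user["name"],
        "league_ids": list(user["league_ids"]),
    }


big_doc = denormalised_user(user, {league["id"]: league})
small_doc = normalised_user(user)
```

With the embedded shape, a change to a league's name means every user document containing that league has to be reindexed. With the ID-only shape, the league is updated once, but relating the documents becomes the application's job.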
My main use case is a web application, so I need to sync my data from MySQL into Elasticsearch on a regular basis, so that any data I pull from Elasticsearch is consistent with the MySQL data the admin team are seeing in the CMS.
My current setup is that I have a bulk import command-line task to import documents from MySQL into Elasticsearch, and then I have a listener which listens for `afterSave` and `afterDelete` events, which then update the matching Elasticsearch document.
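The sync flow described above can be sketched in Python as follows. It only builds request bodies in the newline-delimited JSON format that the Elasticsearch bulk API accepts, and it sends nothing. The function names are hypothetical stand-ins for the import task and the `afterSave`/`afterDelete` listeners, and the `_type` field assumes an Elasticsearch version that still supports mapping types.

```python
import json

INDEX = "project"  # index name from the setup described above


def index_action(doc_type, row):
    """Return the two bulk lines (metadata, then source) that index one row."""
    meta = {"index": {"_index": INDEX, "_type": doc_type, "_id": row["id"]}}
    return [json.dumps(meta), json.dumps(row)]


def delete_action(doc_type, row_id):
    """Return the single bulk line that deletes one document."""
    meta = {"delete": {"_index": INDEX, "_type": doc_type, "_id": row_id}}
    return [json.dumps(meta)]


def bulk_body(lines):
    """Join bulk lines; the bulk API expects a trailing newline."""
    return "\n".join(lines) + "\n"


def build_bulk_import(doc_type, rows):
    """Body for the full import task: one index action per MySQL row."""
    lines = []
    for row in rows:
        lines.extend(index_action(doc_type, row))
    return bulk_body(lines)


def after_save(doc_type, row):
    """Hypothetical listener: reindex the saved row."""
    return bulk_body(index_action(doc_type, row))


def after_delete(doc_type, row_id):
    """Hypothetical listener: remove the deleted row's document."""
    return bulk_body(delete_action(doc_type, row_id))


import_body = build_bulk_import("users", [{"id": 1, "name": "Alice"}])
delete_body = after_delete("users", 1)
```

Using the MySQL primary key as the document `_id` is what lets the listeners address the matching document directly: a save overwrites it and a delete removes it, with no search needed first.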
As my documents share certain parts of each other's documents, how can I build my mappings so that the shared parts are defined once in my code, rather than being repeated in every mapping?
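One possible approach, sketched here in Python, is to define each shared fragment once as plain data and compose the per-type mappings from those fragments. The field names and field types below are assumptions for illustration, not taken from my real schema.

```python
import copy

# Shared fragments, each defined once (hypothetical fields and types).
USER_FIELDS = {
    "id": {"type": "integer"},
    "name": {"type": "text"},
}
LEAGUE_FIELDS = {
    "id": {"type": "integer"},
    "name": {"type": "text"},
}


def users_mapping():
    """Mapping for user documents, embedding the league fields as an object."""
    return {
        "properties": {
            **copy.deepcopy(USER_FIELDS),
            "leagues": {"properties": copy.deepcopy(LEAGUE_FIELDS)},
        }
    }


def leagues_mapping():
    """Mapping for league documents, embedding the user fields as an object."""
    return {
        "properties": {
            **copy.deepcopy(LEAGUE_FIELDS),
            "users": {"properties": copy.deepcopy(USER_FIELDS)},
        }
    }


users = users_mapping()
leagues = leagues_mapping()
```

The deep copies mean a caller can tweak one generated mapping without silently changing the shared fragment that the other mapping is built from.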
[Should an application use a single index?](http://stackoverflow.com/a/14554767/234451)