Elasticsearch supports vector search, but its implementation essentially assumes that all vector data resides in RAM, specifically in off-heap memory. Until recently there was no way to know how much memory an index storing vector data actually required, but starting with v9.1, metrics related to vector data can be obtained.
This article introduces how to obtain these metrics and their meanings. Additionally, we compare the metrics when storing vectors with four types of index options: Flat, HNSW, Int8 HNSW, and BBQ HNSW, and verify the impact of each index option on RAM.
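As a reference point for the comparison, the four index options correspond to the `index_options.type` setting of a `dense_vector` field. The sketch below shows a mapping using `hnsw`; swapping the type for `flat`, `int8_hnsw`, or `bbq_hnsw` produces the other three variants (the index name and the dimension count of 384 are illustrative assumptions, not values used in this article's measurements):

```json
PUT /my-vector-index
{
  "mappings": {
    "properties": {
      "embedding": {
        "type": "dense_vector",
        "dims": 384,
        "index": true,
        "similarity": "cosine",
        "index_options": {
          "type": "hnsw"
        }
      }
    }
  }
}
```

Note that the quantized variants (`int8_hnsw`, `bbq_hnsw`) keep the same mapping shape and differ only in the `type` value.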
Vector data in Elasticsearch is stored in off-heap memory. Off-heap refers to native memory areas outside the JVM heap. By using off-heap memory, Elasticsearch/Lucene can handle large amounts of vector data efficiently. However, because this memory is managed separately from the JVM heap, it does not appear in the usual JVM memory usage metrics. Therefore, off-heap memory usage must be obtained through dedicated metrics.
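One way to retrieve these dedicated metrics is through the node stats API, which includes a `dense_vector` section under the indices stats. A minimal sketch, assuming a locally running cluster (the exact field names of the off-heap breakdown added in v9.1 should be checked against your cluster's actual response):

```json
GET /_nodes/stats/indices?filter_path=nodes.*.indices.dense_vector
```

The `filter_path` parameter simply trims the response to the relevant section; omitting it returns the full node stats document.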