openshift logging elasticsearch tenant log size retention braindump
h2. Summary
*As* a VSHN customer user
*I want* to search and visualise my logs
*So that* I can check my application health and get help debugging it.
h2. Context
APPUiO Public has 2.4 TB of log volumes for ~10 days of retention.
h2. Alternatives
h3. Elastic Cloud on Kubernetes
I did not find much information on retention.
h3. Graylog
Uses ES under the hood.
h3. Loki
Loki has a great GUI (Grafana Explore view) that is similar to Kibana.
Loki has [time-based retention per tenant and stream|]. It supports [flexible tenant usage, e.g. a customer label|].
Loki has the [APIs|] to implement size-based retention with a custom manager. We created a prototype of this at my previous company.
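Size-based retention with a custom manager could work roughly like this: sum the stored bytes per tenant and, when a tenant exceeds its quota, delete the oldest data first. The Python sketch below covers only the planning step. The {{Chunk}} record, the quota mapping, and the way sizes would be obtained from Loki are assumptions for illustration; this is not Loki's API and not the prototype mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Hypothetical record of one stored block of logs for a tenant."""
    tenant: str
    end_ts: int      # Unix timestamp of the newest entry in the chunk
    size_bytes: int


def plan_deletions(chunks, quota_bytes):
    """Return the chunks to delete so that each tenant fits its quota.

    Oldest chunks are dropped first. Tenants without a quota entry are
    left untouched.
    """
    by_tenant = {}
    for chunk in chunks:
        by_tenant.setdefault(chunk.tenant, []).append(chunk)

    to_delete = []
    for tenant, items in by_tenant.items():
        quota = quota_bytes.get(tenant)
        if quota is None:
            continue
        used = sum(c.size_bytes for c in items)
        # Walk from oldest to newest until usage is within the quota.
        for chunk in sorted(items, key=lambda c: c.end_ts):
            if used <= quota:
                break
            to_delete.append(chunk)
            used -= chunk.size_bytes
    return to_delete
```

With three 40-byte chunks and a 100-byte quota, only the oldest chunk ends up in the plan. A real manager would then have to issue the deletions and wait for them to take effect.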
h2. Notes
Kibana/ES [Access Control|]. Roles [here|].
{{app}} index is pretty [hard-coded|].
The size plugin (mapper-size) is not installed.
Calculating size from a query (requires the plugin):
{code}
PUT my-index-000002
{
  "mappings": {
    "_doc": {
      "_size": { "enabled": true }
    }
  }
}

PUT /my-index-000002/_doc/1
{ "text": "This is a document" }

PUT /my-index-000002/_doc/2
{ "text": "This is another document" }

GET my-index-000002/_search
{
  "query": {
    "range": {
      "_size": { "gt": 10 }
    }
  },
  "aggs": {
    "sizes": {
      "terms": { "field": "_size", "size": 10 }
    }
  },
  "sort": [
    { "_size": { "order": "desc" } }
  ],
  "script_fields": {
    "size": { "script": "doc['_size']" }
  }
}
{code}
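If the {{_size}} field were available, log volume per namespace could be computed with a {{sum}} aggregation nested inside a {{terms}} aggregation. The Python sketch below builds such a request body and reads the response. The field name {{kubernetes.namespace_name}} and the response shape used in {{parse_sizes}} are assumptions, and nothing here was run against our cluster. Note that {{_size}} records the size of the original {{_source}}, so the totals approximate ingested bytes rather than disk usage.

```python
import json


def size_by_namespace_query(max_namespaces=100):
    """Build a search body that sums the `_size` field per namespace.

    Assumes documents carry `kubernetes.namespace_name` and that the
    `_size` field is enabled on the index (mapper-size plugin).
    """
    return {
        "size": 0,  # only aggregations are needed, no hits
        "aggs": {
            "namespaces": {
                "terms": {
                    "field": "kubernetes.namespace_name",
                    "size": max_namespaces,
                },
                "aggs": {"bytes": {"sum": {"field": "_size"}}},
            }
        },
    }


def parse_sizes(response):
    """Map namespace -> total source bytes from the aggregation response."""
    buckets = response["aggregations"]["namespaces"]["buckets"]
    return {b["key"]: int(b["bytes"]["value"]) for b in buckets}


if __name__ == "__main__":
    print(json.dumps(size_by_namespace_query(), indent=2))
```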
OCP has a function to prune logs by namespace; this could be extended for custom retention policies.
* With a per-tenant index it would be possible to limit storage size indirectly:
** Roll over every n gigabytes
** [Delete indexes|] with a [count filter|].
* [Curator has been removed with no replacement|]
* It is not possible to limit indexes by size with the logging operator.
* Log ingestion in bytes by namespace:
{code}
sum (
  label_replace(increase(log_collected_bytes_total[24h]), "znamespace", "$1", "path", ".*_(.*)_.*")
) by (znamespace)
{code}
* Newer ES versions have an option to check index sizes; it does not work in our ES.
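The {{label_replace}} call in the ingestion query takes the namespace from the container log file name. The Python sketch below mimics that extraction, assuming the {{path}} label follows the usual {{<pod>_<namespace>_<container>-<id>.log}} layout; the example path is made up.

```python
import re

# Same pattern as in the PromQL query. Prometheus anchors regexes at both
# ends, which re.fullmatch reproduces.
PATH_PATTERN = re.compile(r".*_(.*)_.*")


def namespace_from_path(path):
    """Return the namespace part of a container log path, or None."""
    match = PATH_PATTERN.fullmatch(path)
    return match.group(1) if match else None


if __name__ == "__main__":
    example = "/var/log/containers/myapp-1-abcde_customer-a_myapp-0123abcd.log"
    print(namespace_from_path(example))  # customer-a
```

The pattern relies on the file name containing exactly two underscores, which holds as long as pod, namespace, and container names contain none.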
h2. Out of Scope
h2. Further links
h2. Acceptance criteria
* Elasticsearch logging is enabled
* RBAC rules so customers can view their logs
* Announce in [|]
h2. Implementation Ideas
1. Install the mapper-size plugin, [custom index rollover, deletion scripts with sizemapper query|].
2. Patch the fluentd deployment to produce one index per customer, then add custom index rollover and deletion scripts.
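With one index series per customer, rolling over at a fixed size and keeping a fixed number of indexes caps storage at roughly rollover size times index count. The Python sketch below picks the indexes a deletion script would remove. The {{app-<customer>-<sequence>}} naming scheme and all numbers are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical naming scheme: app-<customer>-<6-digit rollover sequence>
INDEX_NAME = re.compile(r"^app-(?P<customer>.+)-(?P<seq>\d{6})$")


def indexes_to_delete(index_names, keep):
    """Return the oldest indexes per customer beyond the newest `keep`."""
    by_customer = defaultdict(list)
    for name in index_names:
        match = INDEX_NAME.match(name)
        if match:  # ignore indexes that do not follow the scheme
            by_customer[match["customer"]].append((int(match["seq"]), name))

    doomed = []
    for entries in by_customer.values():
        entries.sort()  # ascending sequence number, oldest first
        surplus = len(entries) - keep
        if surplus > 0:
            doomed.extend(name for _, name in entries[:surplus])
    return sorted(doomed)


def storage_cap_gb(rollover_gb, keep):
    """Approximate per-customer cap: rollover size times kept indexes."""
    return rollover_gb * keep
```

For example, rolling over every 10 GB and keeping 5 indexes would bound a customer at roughly 50 GB. The bound is approximate because the active index can grow past the rollover size before the rollover happens.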
