jcantrill’s gists

jcantrill / gist:860bc7f54ba2674e1d93a421ee01dade

Created June 5, 2026 19:18

	# Plan: Add Optional Label Provider to FileLineTooBigError

	## Context

	When the kubernetes_logs source encounters log lines that exceed `max_line_bytes` or `max_merged_line_bytes`, it emits a `component_errors_total` metric. Currently, this metric only includes generic labels (`error_code`, `error_type`, `stage`, and component labels added automatically via tracing context).

	The problem is that when troubleshooting issues with oversized log lines, operators cannot identify which pod, namespace, or container is generating the problematic logs without correlating timestamps with verbose error logs.

	The Kubernetes log file path already contains all the needed metadata in its structure:
	```

jcantrill / gist:99d913f0def1b87719b402078e258a4c

Created April 28, 2023 18:01

fluentd positionfile.conf

	<system>
	log_level info
	</system>

	<source>
	@type tail
	path /loopfs/in/*.log
	pos_file /loopfs/in/my.pos
	<parse>
	@type csv

jcantrill / gist:4dafdf19ef3acea1e716fb4fdb787e9d

Created April 28, 2023 18:01

fluentd rotation test

	require 'file-tail'

	source_dir = ARGV.length > 0 ? ARGV[0] : '/tmp/loopfs/test'
	no_of_sources = ARGV.length > 1 ? ARGV[1].to_i : 1
	msg_size = ARGV.length > 2 ? ARGV[2].to_i : 1

	pos_file = "#{source_dir}/my.pos"

	running = true
	unwatched = "".rjust(16,'f')

jcantrill / gist:f8d4a8216628c41ffaca248d2d430f6a

Created January 21, 2020 14:22

++ oc -n openshift-logging exec -c elasticsearch elasticsearch-cdm-4tng2d4i-1-b9b8878d7-mchvt -- curl -ks '"https://elasticsearch-metrics.openshift-logging.svc:60000/_prometheus/metrics"' '-H"Authorization:' Bearer 'eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJvcGVuc2hpZnQtbG9nZ2luZyIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJ1bmF1dGhvcml6ZWQtc2EtMTI2OTEtdG9rZW4ta2h2Z20iLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoidW5hdXRob3JpemVkLXNhLTEyNjkxIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiMjU5MWI0OTgtM2M1OS0xMWVhLWIxNDMtMDJmNzEyYTc2YTc2Iiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Om9wZW5zaGlmdC1sb2dnaW5nOnVuYXV0aG9yaXplZC1zYS0xMjY5MSJ9.L6vjQdB-CTaK2bOXVeXl-6ObRBa5BqTGJytB_BSeMqqEXteA8RHkCq0ke4wj57j3jTtAFbTHKMpfT1oELlSuVE7Agz8XWg2TlGKbXfyxYvUxqs1GhCX5yyqskjLy4D8Iz2eLMAaY7gQZNne-9pAegQTA_iS36rHeQxHOJgOvmjiBSfANNk43jhsamRzoVsmubT9_xMlCDAXN-_qqfIZPFucM0Qn7pHr_CH

jcantrill / gist:8b9507ad0e48e35bd5f14f70ed6a062a

Created January 24, 2019 19:43

	CN=system.logging.rsyslog,OU=OpenShift,O=Logging
	|- indices:
	|-/
	|-CRUD
	|-CREATE_INDEX
	|-CRUD
	|--- cluster:
	|-CREATE_INDEX
	|-CRUD
	|-cluster:monitor/*

jcantrill / images

Created September 5, 2018 18:17

get the images

	#!/bin/bash

	pod=$1
	echo "DCs"
	echo "----"
	oc get dc -n logging -o yaml | grep image: | sort | uniq

	echo "DSs"
	echo "----"
	oc get ds -n logging -o yaml | grep image: | sort | uniq

jcantrill / gist:eca85af2057b84642510cc086f1e5b97

Created August 28, 2018 13:22

Standing up OpenShift using 'oc cluster up' and Ansible

	Overview
	At the time of writing, 'oc cluster up --logging' or its 3.11 equivalent is broken. The following instructions use 'oc cluster up' followed by Ansible to install logging. They are generally valid for any OpenShift release from 3.5 to 3.11.

	Environment
	These instructions are based on using:

	Host: CentOS 7 on libvirt

	Mem: 8G

jcantrill / fluent-logs

Created July 25, 2018 13:08

Get the logs of the fluent pods

	#!/bin/bash
	pod=${1:-}
	if [ -z "${pod}" ]; then
	pod=$(oc get pods -l component=fluentd -o jsonpath='{.items[*].metadata.name}')
	fi
	for p in ${pod}; do
	echo ">>>>>>>><<<<<<<<<<<<<"
	echo " ${p}"
	echo ">>>>>>>><<<<<<<<<<<<<"
	oc logs "$p"

jcantrill / delete-index-patterns

Last active August 27, 2018 17:00

This script finds the old index patterns that match the 'project.*.*.*.*' and removes them from the .kibana index

	#!/bin/bash -e
	POD=$1
	SIZE="${SIZE:-1000}"
	index=".kibana"
	oc exec -n logging -c elasticsearch "$POD" -- es_util --query="$index/index-pattern/_search?pretty&stored_fields=_id&size=$SIZE" | grep id | grep 'project\..*' | cut -d ':' -f 2 | cut -d '"' -f 2 | paste -sd " " > patterns
	echo '' > payload
	for p in $(cat patterns); do
	  echo "{\"delete\":{\"_index\":\"${index}\", \"_type\":\"index-pattern\", \"_id\":\"$p\"}}" # >> payload
	done
	cat payload

jcantrill / check-fluent-connectivity-to-es

Last active July 25, 2018 13:00

This script checks the ability of the fluent pods to connect to Elasticsearch

	#!/bin/sh

	pods=${1:-"--all"}
	# Drop the pod argument only if one was given; the rest is passed to curl.
	[ $# -gt 0 ] && shift
	if [ "${pods}" = "--all" ]; then
	pods=$(oc get pods -l component=fluentd -o jsonpath='{.items[*].metadata.name}')
	fi

	for p in $pods; do
	output=$(oc exec $p -- curl --silent -q https://logging-es:9200/ --key /etc/fluent/keys/key --cacert /etc/fluent/keys/ca --cert /etc/fluent/keys/cert "$@")

Jeff Cantrill jcantrill