Clayton Coleman smarterclayton

https://prometheus-openshift-devops-monitor.44fs.preview.openshiftapps.com/graph?g0.range_input=30m&g0.expr=up&g0.tab=0&g1.range_input=30m&g1.expr=sum+without+(instance)+(rate(apiserver_request_count%7Bcode!~%222.%3F.%3F%22%7D%5B5m%5D)+%3E+0)&g1.tab=0&g2.range_input=30m&g2.expr=sum+without+(kubernetes_io_hostname%2Cinstance%2Cnew_node)+(rate(kubelet_docker_operations_errors%5B5m%5D)+%3E+0)&g2.tab=0&g3.range_input=1h&g3.expr=container_memory_rss%7Bcontainer_name%3D%22prometheus%22%7D&g3.tab=0&g4.range_input=1h&g4.expr=count+by+(gitVersion)+(openshift_build_info)&g4.tab=0
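The graph URL above packs one PromQL expression per panel into URL-encoded `gN.expr` query parameters, which are hard to read in that form. A minimal sketch of decoding them with the Python standard library follows; the helper name `panel_expressions` and the host `prometheus.example.com` are illustrative assumptions, and the sample URL carries only three of the panels from the link above.

```python
from urllib.parse import urlsplit, parse_qs


def panel_expressions(url):
    """Return {panel_index: PromQL expression} for a Prometheus /graph URL."""
    params = parse_qs(urlsplit(url).query)
    exprs = {}
    for key, values in params.items():
        # Panel parameters are named "g<N>.expr"; other keys hold range/tab settings.
        if key.startswith("g") and key.endswith(".expr"):
            index = int(key[1:-len(".expr")])
            exprs[index] = values[0]
    return exprs


# Hypothetical host; the query string reuses three panels from the link above.
url = (
    "https://prometheus.example.com/graph?"
    "g0.range_input=30m&g0.expr=up&g0.tab=0"
    "&g3.range_input=1h"
    "&g3.expr=container_memory_rss%7Bcontainer_name%3D%22prometheus%22%7D&g3.tab=0"
    "&g4.range_input=1h"
    "&g4.expr=count+by+(gitVersion)+(openshift_build_info)&g4.tab=0"
)

for index, expr in sorted(panel_expressions(url).items()):
    print(index, expr)
```

`parse_qs` handles both the percent-escapes and the `+`-for-space encoding, so the expressions come back in the form they would be typed into the Prometheus expression box.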
(pprof) top30
3.47s of 7.15s total (48.53%)
Dropped 416 nodes (cum <= 0.04s)
Showing top 30 nodes out of 248 (cum >= 0.85s)
      flat  flat%   sum%        cum   cum%
         0     0%     0%      7.04s 98.46%  runtime.goexit
     3.43s 47.97% 47.97%      3.44s 48.11%  syscall.Syscall
         0     0% 47.97%      2.38s 33.29%  io/ioutil.ReadDir
         0     0% 47.97%      2.20s 30.77%  os.(*File).Readdir
         0     0% 47.97%      2.20s 30.77%  os.(*File).readdir
https://prometheus-openshift-devops-monitor.d800.free-int.openshiftapps.com/graph?g0.range_input=1h&g0.expr=container_memory_rss%7Bcontainer_name%3D%22prometheus%22%7D&g0.tab=0&g1.range_input=1h&g1.expr=drop_common_labels(sum+without+(instance)+(rate(apiserver_request_count%7Bcode!~'2%5C%5Cd%5C%5Cd'%7D%5B3m%5D)))&g1.tab=0&g2.range_input=1h&g2.expr=sum+by+(gitVersion)+(openshift_build_info)&g2.tab=0&g3.range_input=1h&g3.expr=drop_common_labels(sort_desc(sum+without+(cpu)+(rate(container_cpu_usage_seconds_total%5B5m%5D)))+%3E+0.1)&g3.tab=0&g4.range_input=1h&g4.expr=sum+without+(instance)+(rate(kubelet_docker_operations_errors%5B5m%5D))+%3E+0&g4.tab=0&g5.range_input=1h&g5.expr=drop_common_labels(sort_desc(sum+without+(instance)+(rate(apiserver_request_count%7Bverb%3D~%22PUT%7CPOST%7CPATCH%7CDELETE%22%7D%5B5m%5D))))&g5.tab=0
○ oc get --raw /metrics | grep grpc
# HELP etcd_network_client_grpc_received_bytes_total The total number of bytes received from grpc clients.
# TYPE etcd_network_client_grpc_received_bytes_total counter
etcd_network_client_grpc_received_bytes_total 57586
# HELP etcd_network_client_grpc_sent_bytes_total The total number of bytes sent to grpc clients.
# TYPE etcd_network_client_grpc_sent_bytes_total counter
etcd_network_client_grpc_sent_bytes_total 1.7420058e+07
# HELP grpc_client_handled_total Total number of RPCs completed by the client, regardless of success or failure.
# TYPE grpc_client_handled_total counter
grpc_client_handled_total{grpc_code="OK",grpc_method="LeaseGrant",grpc_service="etcdserverpb.Lease",grpc_type="unary"} 2
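The `/metrics` output above is in the Prometheus text exposition format: `# HELP` and `# TYPE` comment lines followed by `name{labels} value` samples. As a rough sketch of reading such output programmatically, the hypothetical helper `parse_metrics` below splits each sample into name, raw label string, and float value. It assumes samples carry no trailing timestamp and does not handle escaped characters inside label values, so it is not a complete parser.

```python
def parse_metrics(text):
    """Parse Prometheus text-format samples into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Assumption: no trailing timestamp, so the value is the last field.
        series, value = line.rsplit(" ", 1)
        name, _, labels = series.partition("{")
        samples.append((name, labels.rstrip("}"), float(value)))
    return samples


# Sample lines copied from the output above.
text = """\
# TYPE etcd_network_client_grpc_received_bytes_total counter
etcd_network_client_grpc_received_bytes_total 57586
etcd_network_client_grpc_sent_bytes_total 1.7420058e+07
grpc_client_handled_total{grpc_code="OK",grpc_method="LeaseGrant",grpc_service="etcdserverpb.Lease",grpc_type="unary"} 2
"""

for name, labels, value in parse_metrics(text):
    print(name, labels, value)
```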
TASK [Gathering Facts] *********************************************************
Thursday 03 August 2017 20:18:52 +0000 (0:00:00.141) 0:03:56.731 *******
fatal: [ci-claytontest-ig-n-q7zh]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Warning: Permanently added '104.154.198.205' (ECDSA) to the list of known hosts.\r\nTraceback (most recent call last):\n File \"<stdin>\", line 136, in <module>\nNameError: name 'temp_path' is not defined\n", "module_stdout": "", "msg": "MODULE FAILURE", "rc": 1}
fatal: [ci-claytontest-ig-n-d06s]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Warning: Permanently added '35.188.123.22' (ECDSA) to the list of known hosts.\r\nTraceback (most recent call last):\n File \"<stdin>\", line 136, in <module>\nNameError: name 'temp_path' is not defined\n", "module_stdout": "", "msg": "MODULE FAILURE", "rc": 1}
fatal: [ci-claytontest-ig-n-l7s2]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Warning: Permanently added '146.148.10
- hosts: localhost
  tasks:
  - stat: path=/usr/share/ansible/openshift-ansible/playbooks/common/openshift-cluster/std_include.yml
    register: std_include
    become: no
- include: /usr/share/ansible/openshift-ansible/playbooks/common/openshift-cluster/std_include.yml
  when: hostvars['localhost']['std_include'] is defined and hostvars['localhost']['std_include'].stat.exists
allowDisabledDocker: false
apiVersion: v1
authConfig:
  authenticationCacheSize: 1000
  authenticationCacheTTL: 5m
  authorizationCacheSize: 1000
  authorizationCacheTTL: 5m
dnsBindAddress: 10.192.208.5:53
dnsDomain: cluster.local
dnsIP: 10.192.208.5
smarterclayton / gist:fd22a00423115d19aef049d6336ef918
Last active October 31, 2017 14:55
heavy string allocation in iptables code on 3.7 node
### Not that busy node
Showing nodes accounting for 51472970, 64.09% of 80315852 total
Dropped 1872 nodes (cum <= 401579)
Showing top 30 nodes out of 277
      flat  flat%   sum%        cum   cum%
         0     0%     0%   80185072 99.84%  runtime.goexit /usr/lib/golang/src/runtime/asm_amd64.s
  39878766 49.65% 49.65%   39878766 49.65%  runtime.rawstringtmp /usr/lib/golang/src/runtime/string.go
         0     0% 49.65%   35614612 44.34%  runtime.slicebytetostring /usr/lib/golang/src/runtime/string.go
         0     0% 49.65%   22676400 28.23%  github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/proxy/iptables.(*Proxier).syncProxyRules /builddir/build/BUILD/atomic-openshift-git-0.ee33a8f/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/proxy/iptables/proxier.go
TASK [openshift_excluder : Detecting Atomic Host Operating System] ****************************************************************************************************************
Thursday 02 November 2017 17:58:07 +0000 (0:00:00.085) 0:27:15.831 *****
ok: [origin-ci-ig-n-n7xn] => {"changed": false, "failed": false, "stat": {"exists": false}}
TASK [openshift_excluder : Debug r_openshift_excluder_enable_docker_excluder] *****************************************************************************************************
Thursday 02 November 2017 17:58:08 +0000 (0:00:00.643) 0:27:16.474 *****
ok: [origin-ci-ig-n-n7xn] => {
    "r_openshift_excluder_enable_docker_excluder": true
}
PLAY [Drain and upgrade nodes] ****************************************************************************************************************************************************
TASK [Mark node unschedulable] ****************************************************************************************************************************************************
Thursday 02 November 2017 20:30:19 +0000 (0:00:01.323) 0:19:03.378 *****
changed: [origin-ci-ig-n-3hjt -> 35.184.74.68] => {"attempts": 1, "changed": true, "failed": false, "results": {"cmd": "/bin/oc adm manage-node origin-ci-ig-n-3hjt --schedulable=False", "nodes": [{"name": "origin-ci-ig-n-3hjt", "schedulable": false}], "results": "NAME STATUS AGE VERSION\norigin-ci-ig-n-3hjt Ready,SchedulingDisabled 64d v1.7.0+695f48a16f\n", "returncode": 0}, "state": "present"}
TASK [Drain Node for Kubelet upgrade] *********************************************************************************************