
@jhaubrich
Created July 30, 2015 15:51

NOC Maint hosed graphite

GLaDOS

./noc_down_graphite.png

Data is missing from the period when the NOC went down.

The populators all tap tcp://atlas:50000.

waas@GLaDOS:/home/waas  $ cat /etc/supervisor/conf.d/populators.conf 

[program:chi2]
user = waas
command = /opt/tools/docker/graphite/populators/chi2.py --tap tcp://atlas:50000
environment = PYTHONPATH='/opt/tools/lib'
redirect_stderr=true
autorestart = true

[program:gus]
user = waas
command = /opt/tools/docker/graphite/populators/gus.py --tap tcp://atlas:50000
environment = PYTHONPATH='/opt/tools/lib'
redirect_stderr=true
autorestart = true

[program:agc]
user = waas
command = /opt/tools/docker/graphite/populators/agc.py --tap tcp://atlas:50000
environment = PYTHONPATH='/opt/tools/lib'
redirect_stderr=true
autorestart = true

[program:rg1]
user = waas
command = /opt/tools/docker/graphite/populators/rg1.py --tap tcp://atlas:50000
environment = PYTHONPATH='/opt/tools/lib'
redirect_stderr=true
autorestart = true
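
Since every populator depends on the same tap endpoint, a quick reachability check from GLaDOS can rule out a plain network problem. The following is a minimal sketch, not part of the populators: `parse_tap` and `tap_reachable` are hypothetical helper names, and it only confirms that a TCP connection opens, not that data is flowing.

```python
import socket
from urllib.parse import urlparse


def parse_tap(url):
    """Split a tap URL such as tcp://atlas:50000 into (host, port)."""
    parsed = urlparse(url)
    if parsed.scheme != "tcp" or parsed.hostname is None or parsed.port is None:
        raise ValueError("expected tcp://host:port, got %r" % url)
    return parsed.hostname, parsed.port


def tap_reachable(url, timeout=3.0):
    """Return True if a TCP connection to the tap endpoint can be opened."""
    host, port = parse_tap(url)
    try:
        # create_connection resolves the name and connects; any failure is an OSError
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `tap_reachable("tcp://atlas:50000")` on GLaDOS would return False if the name does not resolve, the port is closed, or the connection times out.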

Atlas

The Docker containers are running fine.

waas@ATLAS:/home/waas  $ docker ps

CONTAINER ID        IMAGE                       COMMAND                CREATED             STATUS              PORTS                                                                        NAMES
# 91eee924168c        brewery-shadow:latest       /usr/bin/supervisord   8 days ago          Up 5 days           127.0.0.1:2021->22/tcp, 0.0.0.0:50021->50000/tcp, 0.0.0.0:50121->50100/tcp   esnoc1              
# 11dff62e8dd8        brewery-shadow:latest       /usr/bin/supervisord   8 days ago          Up 5 days           127.0.0.1:2022->22/tcp, 0.0.0.0:50022->50000/tcp, 0.0.0.0:50122->50100/tcp   esnoc2              
# 59137399041b        brewery-shadowlite:latest   /usr/bin/supervisord   8 days ago          Up 5 days           127.0.0.1:2553->22/tcp, 0.0.0.0:50153->50100/tcp, 0.0.0.0:55553->50000/tcp   backfill_sl         
# b84ddca7a83e        brewery-tcsmon:latest       /usr/bin/supervisord   8 days ago          Up 5 days           127.0.0.1:2012->22/tcp, 0.0.0.0:50012->50000/tcp, 0.0.0.0:50112->50100/tcp   noc2                
# 6cebf9afa47e        brewery-tcsmon:latest       /usr/bin/supervisord   8 days ago          Up 5 days           127.0.0.1:2011->22/tcp, 0.0.0.0:50011->50000/tcp, 0.0.0.0:50111->50100/tcp   noc1                
# ed360c1bc079        brewery-tcsmon:latest       /usr/bin/supervisord   8 days ago          Up 5 days           127.0.0.1:2001->22/tcp, 0.0.0.0:50001->50000/tcp, 0.0.0.0:50101->50100/tcp   poc1                
# 50dd1bdd952a        brewery-tcsmon:latest       /usr/bin/supervisord   8 days ago          Up 5 days           127.0.0.1:2031->22/tcp, 0.0.0.0:50031->50000/tcp, 0.0.0.0:50131->50100/tcp   sl1                 
# 9f6491c39d98        brewery-tcsmon:latest       /usr/bin/supervisord   8 days ago          Up 5 days           127.0.0.1:2002->22/tcp, 0.0.0.0:50002->50000/tcp, 0.0.0.0:50102->50100/tcp   poc2                
# ca5a9577b5fd        brewery-gold:latest         /usr/bin/supervisord   8 days ago          Up 5 days           127.0.0.1:2555->22/tcp, 0.0.0.0:50155->50100/tcp, 0.0.0.0:55555->50000/tcp   backfill            
# b7f430dc0f39        brewery-tcsmon:latest       /usr/bin/supervisord   8 days ago          Up 5 days           127.0.0.1:2032->22/tcp, 0.0.0.0:50032->50000/tcp, 0.0.0.0:50132->50100/tcp   sl2                 
c05b5f7ed8b6        brewery-gold:latest         /usr/bin/supervisord   8 days ago          Up 5 days           127.0.0.1:2000->22/tcp, 0.0.0.0:50000->50000/tcp, 0.0.0.0:50100->50100/tcp   gold                
# 9ea95f0ad1a8        brewery-shadowlite:latest   /usr/bin/supervisord   8 days ago          Up 5 days           127.0.0.1:2033->22/tcp, 0.0.0.0:50033->50000/tcp, 0.0.0.0:50133->50100/tcp   gold_sl
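
The uncommented row is the one that matters: gold is the only container publishing host port 50000, which is the port the populators tap. As a rough sketch, the PORTS column can be parsed to confirm that mechanically; `parse_ports` and `publishes` are hypothetical helpers, and the regular expression only covers the IPv4 `ip:port->port/proto` form seen in the listing.

```python
import re

# Matches entries such as 0.0.0.0:50000->50000/tcp
PORT_RE = re.compile(
    r"(?P<host_ip>[\d.]+):(?P<host_port>\d+)->(?P<container_port>\d+)/(?P<proto>\w+)"
)


def parse_ports(ports_field):
    """Turn a docker ps PORTS field into (host_ip, host_port, container_port, proto) tuples."""
    return [
        (m.group("host_ip"), int(m.group("host_port")),
         int(m.group("container_port")), m.group("proto"))
        for m in PORT_RE.finditer(ports_field)
    ]


def publishes(ports_field, host_port):
    """Return True if the field maps the given host port on any host address."""
    return any(hp == host_port for _, hp, _, _ in parse_ports(ports_field))


# PORTS values copied from the gold and backfill rows of the listing
gold = "127.0.0.1:2000->22/tcp, 0.0.0.0:50000->50000/tcp, 0.0.0.0:50100->50100/tcp"
backfill = "127.0.0.1:2555->22/tcp, 0.0.0.0:50155->50100/tcp, 0.0.0.0:55555->50000/tcp"
```

Here `publishes(gold, 50000)` is true, while backfill exposes its container port 50000 on host port 55555 instead.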

The gold brewery listens on eth7.

gold:
    iface: eth7
    addr: 10.69.69.7/32
    ssh: 2000
    hopper: 50100
    keg: 50000
    docker_image: brewery-gold

Supervisor looks good.

waas@ATLAS:/home/waas  $ sudo supervisorctl

[sudo] password for waas: 
# brewery:esnoc1_harvester         RUNNING    pid 7950, uptime 5 days, 20:34:51
# brewery:esnoc2_harvester         RUNNING    pid 7971, uptime 5 days, 20:34:51
brewery:gold_harvester           RUNNING    pid 7973, uptime 5 days, 20:34:51
# brewery:gold_sl_harvester        RUNNING    pid 7963, uptime 5 days, 20:34:51
# brewery:noc1_harvester           RUNNING    pid 7949, uptime 5 days, 20:34:51
# brewery:noc2_harvester           RUNNING    pid 7977, uptime 5 days, 20:34:51
# brewery:poc1_harvester           RUNNING    pid 7969, uptime 5 days, 20:34:51
# brewery:poc2_harvester           RUNNING    pid 7960, uptime 5 days, 20:34:51
# brewery:sl1_harvester            RUNNING    pid 7955, uptime 5 days, 20:34:51
# brewery:sl2_harvester            RUNNING    pid 7967, uptime 5 days, 20:34:51
# collector:es1_noc1               RUNNING    pid 9222, uptime 5 days, 20:19:47
# collector:es1_noc2               RUNNING    pid 9223, uptime 5 days, 20:19:44
collector:gold                   RUNNING    pid 9214, uptime 5 days, 20:20:14
# collector:noc1                   RUNNING    pid 9217, uptime 5 days, 20:20:09
# collector:noc2                   RUNNING    pid 9219, uptime 5 days, 20:20:07
# collector:poc1                   RUNNING    pid 9220, uptime 5 days, 20:20:03
# collector:poc2                   RUNNING    pid 9221, uptime 5 days, 20:20:01
# collector:sl1_noc1               FATAL      Exited too quickly (process log may have details)
# collector:sl1_noc2               FATAL      Exited too quickly (process log may have details)
deduplifier                      RUNNING    pid 7924, uptime 5 days, 20:34:51
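
With this many entries it is easy to miss a process that is not RUNNING. A small sketch that scans pasted `supervisorctl status` output for other states follows; `unhealthy` is a hypothetical helper, and the leading `# ` it strips is only the marker these notes use for out-of-scope rows, not something supervisorctl prints.

```python
# Process states that supervisor reports
STATES = {"STOPPED", "STARTING", "RUNNING", "BACKOFF",
          "STOPPING", "EXITED", "FATAL", "UNKNOWN"}


def unhealthy(status_text):
    """Return (name, state) pairs for processes that are not RUNNING."""
    problems = []
    for line in status_text.splitlines():
        # Drop the "# " marker used in these notes, then split name / state / detail
        fields = line.lstrip("# ").split(None, 2)
        if len(fields) >= 2 and fields[1] in STATES and fields[1] != "RUNNING":
            problems.append((fields[0], fields[1]))
    return problems
```

Run against the listing above, this would report only the two `collector:sl1_*` entries in the FATAL state.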

deduplifier

waas@ATLAS:/home/waas  $ cat /etc/supervisor/conf.d/dedup.conf 
[program:deduplifier]
command =  python /opt/pcaptools/dedup/dedup.py
autorestart = true
autostart = true

Reads from eth[1-4], writes to eth7.

# create outbound interface socket
out_if = socket.socket(socket.AF_PACKET, socket.SOCK_RAW)
out_if.bind(("eth7", 0))

#open input interfaces
fd1=pcap.pcap(name='eth1', promisc=True) #POC1
fd2=pcap.pcap(name='eth2', promisc=True) #POC2
fd3=pcap.pcap(name='eth3', promisc=True) #NOC1
fd4=pcap.pcap(name='eth4', promisc=True) #NOC2
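
Only the socket setup of dedup.py is shown here, so the actual deduplication logic is unknown. As an illustration of the general idea (forward a packet seen on several inputs only once), here is one possible sketch; `RecentPacketFilter`, the SHA-1 digest, and the capacity are all assumptions, not what dedup.py does.

```python
import hashlib
from collections import OrderedDict


class RecentPacketFilter:
    """Remember digests of the most recent packets and flag repeats."""

    def __init__(self, capacity=4096):
        self.capacity = capacity
        self._seen = OrderedDict()  # digest -> None, oldest first

    def is_new(self, packet_bytes):
        """Return True the first time a packet is seen within the window."""
        digest = hashlib.sha1(packet_bytes).digest()
        if digest in self._seen:
            # Repeat: refresh its position so it stays in the window
            self._seen.move_to_end(digest)
            return False
        self._seen[digest] = None
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)  # evict the oldest digest
        return True
```

In a loop over the four input interfaces, a packet would be sent out on eth7 only when `is_new(packet)` returns True. The bounded window keeps memory flat, at the cost of letting a duplicate through if it arrives after its digest has been evicted.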

Gold Harvester

waas@ATLAS:/home/waas  $ cat /etc/supervisor/conf.d/brewery.conf 
[group:brewery]
programs=gold_harvester, poc1_harvester, poc2_harvester, noc1_harvester, noc2_harvester, esnoc1_harvester, esnoc2_harvester, sl1_harvester, sl2_harvester, gold_sl_harvester

[program:gold_harvester]
command=python /opt/pcaptools/waas_brewery/waas_brewery/socket_harvester.py -i eth7 --hopper localhost:50100
redirect_stderr=true
autorestart=false
stopasgroup=true

# [program:poc1_harvester]
# command=python /opt/pcaptools/waas_brewery/waas_brewery/socket_harvester.py -i eth1 --hopper localhost:50101
# redirect_stderr=true
# autorestart=false
# stopasgroup=true

# [program:poc2_harvester]
# command=python /opt/pcaptools/waas_brewery/waas_brewery/socket_harvester.py -i eth2 --hopper localhost:50102
# redirect_stderr=true
# autorestart=false
# stopasgroup=true

# [program:noc1_harvester]
# command=python /opt/pcaptools/waas_brewery/waas_brewery/socket_harvester.py -i eth3 --hopper localhost:50111
# redirect_stderr=true
# autorestart=false
# stopasgroup=true

# [program:noc2_harvester]
# command=python /opt/pcaptools/waas_brewery/waas_brewery/socket_harvester.py -i eth4 --hopper localhost:50112
# redirect_stderr=true
# autorestart=false
# stopasgroup=true

# [program:esnoc1_harvester]
# command=python /opt/pcaptools/waas_brewery/waas_brewery/socket_harvester.py -i eth5 --hopper localhost:50121
# redirect_stderr=true
# autorestart=false
# stopasgroup=true

# [program:esnoc2_harvester]
# command=python /opt/pcaptools/waas_brewery/waas_brewery/socket_harvester.py -i eth6 --hopper localhost:50122
# redirect_stderr=true
# autorestart=false
# stopasgroup=true

# [program:sl1_harvester]
# command=python /opt/pcaptools/waas_brewery/waas_brewery/socket_harvester.py -i eth8 --hopper localhost:50131
# redirect_stderr=true
# autorestart=false
# stopasgroup=true

# [program:sl2_harvester]
# command=python /opt/pcaptools/waas_brewery/waas_brewery/socket_harvester.py -i eth9 --hopper localhost:50132
# redirect_stderr=true
# autorestart=false
# stopasgroup=true

# [program:gold_sl_harvester]
# command=python /opt/pcaptools/waas_brewery/waas_brewery/socket_harvester.py -i eth8 --hopper localhost:50133
# redirect_stderr=true
# autorestart=false
# stopasgroup=true

collector.conf

Not relevant, but included for completeness.

waas@ATLAS:/home/waas  $ cat /etc/supervisor/conf.d/collector.conf 
[group:collector]
programs = poc1, poc2, noc1, noc2, es1_noc1, es1_noc2, gold, sl1_noc1, sl1_noc2
priority = 999

[program:poc1]
user = _cat-pcap
command = /usr/sbin/tcpdump -ni eth1 -G 300 -w "/wmd/source/%%Y/%%m/%%d/cat/pcap/field/poc/field_poc1_%%Y-%%m-%%dT%%H:%%M:%%SZ.pcap"
redirect_stderr=true
autorestart = true
autostart = true

[program:poc2]
user = _cat-pcap
command = /usr/sbin/tcpdump -ni eth2 -G 300 -w "/wmd/source/%%Y/%%m/%%d/cat/pcap/field/poc/field_poc2_%%Y-%%m-%%dT%%H:%%M:%%SZ.pcap"
redirect_stderr=true
autorestart = true
autostart = true


[program:noc1]
user = _cat-pcap
command = /usr/sbin/tcpdump -ni eth3 -G 300 -w "/wmd/source/%%Y/%%m/%%d/cat/pcap/field/noc/field_noc1_%%Y-%%m-%%dT%%H:%%M:%%SZ.pcap"
redirect_stderr=true
autorestart = true
autostart = true

[program:noc2]
user = _cat-pcap
command = /usr/sbin/tcpdump -ni eth4 -G 300 -w "/wmd/source/%%Y/%%m/%%d/cat/pcap/field/noc/field_noc2_%%Y-%%m-%%dT%%H:%%M:%%SZ.pcap"
redirect_stderr=true
autorestart = true
autostart = true


[program:es1_noc1]
user = _cat-pcap
command = /usr/sbin/tcpdump -ni eth5 -G 300 -w "/wmd/source/%%Y/%%m/%%d/cat/pcap/es1/noc/es1_noc1_%%Y-%%m-%%dT%%H:%%M:%%SZ.pcap"
redirect_stderr=true
autorestart = true
autostart = true

[program:es1_noc2]
user = _cat-pcap
command = /usr/sbin/tcpdump -ni eth6 -G 300 -w "/wmd/source/%%Y/%%m/%%d/cat/pcap/es1/noc/es2_noc2_%%Y-%%m-%%dT%%H:%%M:%%SZ.pcap"
redirect_stderr=true
autorestart = true
autostart = true

[program:gold]
user = _cat-pcap
command = /usr/sbin/tcpdump -ni eth7 -G 300 -w "/wmd/source/%%Y/%%m/%%d/cat/pcap/field/gold/field_gold_%%Y-%%m-%%dT%%H:%%M:%%SZ.pcap"
redirect_stderr=true
autorestart = true
autostart = true

[program:sl1_noc1]
user = _cat-pcap
command = /usr/sbin/tcpdump -ni eth8 -G 300 -w "/wmd/source/%%Y/%%m/%%d/cat/pcap/sl1/noc/sl1_noc1_%%Y-%%m-%%dT%%H:%%M:%%SZ.pcap"
redirect_stderr=true
autorestart = true
autostart = true

[program:sl1_noc2]
user = _cat-pcap
command = /usr/sbin/tcpdump -ni eth9 -G 300 -w "/wmd/source/%%Y/%%m/%%d/cat/pcap/sl1/noc/sl1_noc2_%%Y-%%m-%%dT%%H:%%M:%%SZ.pcap"
redirect_stderr=true
autorestart = true
autostart = true
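
Because the collectors rotate files every 300 seconds (`-G 300`) and stamp each filename with its start time, the pcap filenames themselves can show when capture stopped. A minimal sketch, assuming the filename pattern above; `capture_gaps` and the 30-second slack are hypothetical choices.

```python
import re
from datetime import datetime, timedelta

# Pulls the timestamp out of names like field_gold_2015-07-30T10:00:00Z.pcap
STAMP_RE = re.compile(r"_(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z\.pcap$")


def capture_gaps(filenames, rotation=timedelta(seconds=300),
                 slack=timedelta(seconds=30)):
    """Return (previous_start, next_start) pairs spaced wider than rotation + slack."""
    stamps = sorted(
        datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S")
        for m in (STAMP_RE.search(name) for name in filenames)
        if m  # ignore names that do not match the pattern
    )
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > rotation + slack]
```

Passing it the listing of a day's `field/gold` directory would return the intervals where no new file was started. A gap may also just mean no packets arrived on the interface, so it is a pointer for where to look rather than proof of a fault.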