Chris Armstrong carmstrong

@carmstrong
carmstrong / add-remove-node.md
Last active August 29, 2015 14:07
Adding/removing deis-store hosts

Adding and removing nodes from the cluster

Most Deis components handle new machines just fine. Care has to be taken when removing machines from the cluster, however, since the deis-store components act as the backing store for all the stateful data Deis needs to function properly.

Note that these instructions follow the Ceph documentation for removing monitors and removing OSDs. Should these instructions differ significantly from the Ceph documentation, the Ceph documentation should be followed, and a PR to update this documentation would be much appreciated.

Because Ceph monitors use the Paxos algorithm, the cluster must always have enough monitors running to form a majority (monitors required : total monitors): 1:1, 2:3, 3:4, 3:5, 4:6, etc. Whenever possible, add a new node to the cluster before removing an old one.
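
The majority figures above follow from a simple rule: a strict majority of n monitors is floor(n/2) + 1. The sketch below is a minimal illustration of that arithmetic only. The function names `quorum_size` and `tolerated_failures` are hypothetical helpers, not part of Ceph or Deis.

```python
def quorum_size(monitors: int) -> int:
    """Smallest number of monitors that forms a strict majority."""
    if monitors < 1:
        raise ValueError("a cluster needs at least one monitor")
    return monitors // 2 + 1


def tolerated_failures(monitors: int) -> int:
    """How many monitors can be lost while a majority is still reachable."""
    return monitors - quorum_size(monitors)


if __name__ == "__main__":
    # Print "required:total" pairs in the same form as the text above.
    for total in range(1, 7):
        print(f"{quorum_size(total)}:{total} "
              f"(can lose {tolerated_failures(total)})")
```

Going from 3 to 4 monitors does not increase the number of failures the cluster can tolerate, which is one reason to check these numbers before removing a host.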

This documentatio

```
core@timdeisSC-coreos-0 ~ $ sudo vi /run/systemd/system/etcd.service.d/20-cloudinit.conf
[Service]
We trust you have received the usual lecture from the local System
Administrator. It usually boils down to these three things:
#1) Respect the privacy of others.
#2) Think before you type.
#3) With great power comes great responsibility.
timdeisSC-coreos-1 core # ls /etc/sudoers.d
waagent
timdeisSC-coreos-1 core # cat /etc/sudo.conf
# CoreOS /etc/sudo.conf
# Use an alternative path for the default sudoers file.
Plugin sudoers_policy sudoers.so sudoers_file=/usr/share/baselayout/sudoers
Plugin sudoers_io sudoers.so
Dec 09 03:11:44 ip-10-21-2-96.ec2.internal fleetd[602]: INFO manager.go:218: Writing systemd unit iconic-airfield_v3.web.5.service (809b)
Dec 09 03:11:44 ip-10-21-2-96.ec2.internal fleetd[602]: INFO manager.go:142: Instructing systemd to reload units
Dec 09 03:11:44 ip-10-21-2-96.ec2.internal fleetd[602]: INFO reconcile.go:274: AgentReconciler completed task: type=LoadUnit job=iconic-airfield_v3.web.5.service reason="unit scheduled here but not loaded"
Dec 09 03:11:45 ip-10-21-2-96.ec2.internal fleetd[602]: INFO manager.go:78: Triggered systemd unit iconic-airfield_v3.web.5.service start: job=10566
Dec 09 03:11:45 ip-10-21-2-96.ec2.internal fleetd[602]: INFO reconcile.go:274: AgentReconciler completed task: type=StartUnit job=iconic-airfield_v3.web.5.service reason="unit currently loaded but desired state is launched"
Dec 09 03:24:40 ip-10-21-2-96.ec2.internal fleetd[602]: INFO manager.go:89: Triggered systemd unit iconic-airfield_v3.web.5.service stop: job=12043
Dec 09 03:24:40 ip-10-21-2-96.ec2.internal fle
carmstrong / deis-project-touchers.rb
Created December 18, 2014 21:36
Deis project activity by user
#!/usr/bin/env ruby
require 'csv'
require 'octokit'
date_since = ARGV.first
if date_since.nil? || date_since.empty?
  date_since = '2013-07-22'
end
Jan 06 01:38:17 ip-10-21-2-53.ec2.internal systemd[1]: Started etcd.
Jan 06 01:38:17 ip-10-21-2-53.ec2.internal etcd[828]: [etcd] Jan 6 01:38:17.772 WARNING | Failed to statfs: no such file or directory
Jan 06 01:38:17 ip-10-21-2-53.ec2.internal etcd[828]: [etcd] Jan 6 01:38:17.772 INFO | Discovery via https://discovery.etcd.io using prefix /343364e84d83902415ca331e6e7bb8c4.
Jan 06 01:38:18 ip-10-21-2-53.ec2.internal etcd[828]: [etcd] Jan 6 01:38:18.294 INFO | Discovery found peers [http://10.21.2.51:7001 http://10.21.1.137:7001]
Jan 06 01:38:18 ip-10-21-2-53.ec2.internal etcd[828]: [etcd] Jan 6 01:38:18.294 INFO | Discovery fetched back peer list: [10.21.2.51:7001 10.21.1.137:7001]
Jan 06 01:38:18 ip-10-21-2-53.ec2.internal etcd[828]: [etcd] Jan 6 01:38:18.295 INFO | b607ffcb8bc849c7ad6651d5559d1aa1 attempted to join via 10.21.2.51:7001 failed: fail checking join version: Client Internal Error (Get http://10.21.2.51:7001/version: dial tcp 10
Jan 06 01:38:18 ip-10-21-2-53.ec2.intern
diff --git a/contrib/ec2/gen-json.py b/contrib/ec2/gen-json.py
index 166c385..d49afa8 100755
--- a/contrib/ec2/gen-json.py
+++ b/contrib/ec2/gen-json.py
@@ -14,7 +14,7 @@ FORMAT_EPHEMERAL_VOLUME = '''
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/sbin/wipefs -f /dev/xvdb
- ExecStart=/usr/sbin/mkfs.ext4 /dev/xvdb
+ ExecStart=/usr/sbin/mkfs.ext3 /dev/xvdb
carmstrong / etcd-20rc1.diff
Last active August 29, 2015 14:13
etcd 2.0rc1 on CoreOS
diff --git a/contrib/coreos/user-data.example b/contrib/coreos/user-data.example
index 6c37afc..12308f8 100644
--- a/contrib/coreos/user-data.example
+++ b/contrib/coreos/user-data.example
@@ -5,12 +5,10 @@ coreos:
# generate a new token for each unique cluster from https://discovery.etcd.io/new
# uncomment the following line and replace it with your discovery URL
# discovery: https://discovery.etcd.io/12345693838asdfasfadf13939923
- addr: $private_ipv4:4001
- peer-addr: $private_ipv4:7001
carmstrong / daemon1
Last active August 29, 2015 14:20
Ceph hang on start
2015-05-05 23:25:06.330716 7f7e82697700 0 -- 10.132.253.121:6801/1 >> 10.132.162.15:6801/1 pipe(0x6829000 sd=71 :6801 s=0 pgs=0 cs=0 l=0 c=0x426ffa0).accept connect_seq 2 vs existing 1 state standby
2015-05-05 23:25:06.355427 7f7e82d9e700 0 -- 10.132.253.121:6801/1 >> 10.132.253.118:6801/1 pipe(0x6824000 sd=108 :6801 s=0 pgs=0 cs=0 l=0 c=0x4270aa0).accept connect_seq 2 vs existing 1 state standby
2015-05-05 23:25:06.363901 7f7e837a1700 0 -- 10.132.253.121:6801/1 >> 10.132.162.16:6801/1 pipe(0x6b65000 sd=150 :6801 s=0 pgs=0 cs=0 l=0 c=0x4270100).accept connect_seq 2 vs existing 1 state standby
2015-05-05 23:30:51.366301 7f7e82d9e700 0 -- 10.132.253.121:6801/1 >> 10.132.253.118:6801/1 pipe(0x6824000 sd=108 :6801 s=2 pgs=12 cs=3 l=0 c=0x4270ec0).fault with nothing to send, going to standby
2015-05-05 23:30:51.384787 7f7e837a1700 0 -- 10.132.253.121:6801/1 >> 10.132.162.16:6801/1 pipe(0x6b65000 sd=150 :6801 s=2 pgs=12 cs=3 l=0 c=0x4270680).fault with nothing to send, going to standby
2015-05-05 23:30:51.3855
carmstrong / gist:b58c0446be10a3d21713
Created May 13, 2015 23:23
deis-publisher strace
deis-01:~# strace -p 16787 &
deis-01:~# Process 16787 attached
epoll_wait(4, {{EPOLLOUT, {u32=2343648864, u64=140430594358880}}}, 128, -1) = 1
epoll_wait(4, {{EPOLLIN|EPOLLOUT, {u32=2343648864, u64=140430594358880}}}, 128, -1) = 1
futex(0x8cc560, FUTEX_WAKE, 1) = 1
read(3, "HTTP/1.1 200 OK\r\nContent-Type: a"..., 4096) = 2992
epoll_wait(4, {}, 128, 0) = 0
futex(0x8cc560, FUTEX_WAKE, 1) = 1
epoll_ctl(4, EPOLL_CTL_DEL, 3, 7ffd66f5fdb4) = 0
close(3) = 0