Trying out etcd, the backend storage of HM9000

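This gist is the Day 8 entry of the Cloud Foundry Advent Calendar 2013.

Yesterday's entry was, of all things, a YouTube video, but I can't go that far, so I'll stick to business as usual (just a gist).

Back to the topic.

As I wrote in my previous entry, HM9000 is a Golang implementation of Health Manager, one of the core components of Cloud Foundry.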

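(I really should have written this last time.) The role of Health Manager in Cloud Foundry is as follows.

At regular intervals, it
- collects information (※1) about the running state of user apps from the DEAs,
- compares it with the desired state of the user apps stored in the CCDB (Cloud Controller Database), and
- when the desired state and the collected state disagree, asks Cloud Controller to correct the discrepancy (※2):
  - too few instances ⇒ asks Cloud Controller to issue a start request
  - too many instances ⇒ asks Cloud Controller to issue a stop request

That is the whole job.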
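The comparison step described above can be sketched roughly as follows. This is a minimal illustration, not Health Manager's actual code; the function name `plan_correction` and the shape of its return value are made up for this example.

```python
def plan_correction(desired, running):
    """Decide what to ask Cloud Controller for, for a single app.

    Returns a (action, count) pair: ("start", n), ("stop", n) or ("none", 0).
    Hypothetical helper, for illustration only.
    """
    if running < desired:
        return ("start", desired - running)   # too few instances
    if running > desired:
        return ("stop", running - desired)    # too many instances
    return ("none", 0)                        # states already match


print(plan_correction(desired=3, running=1))
print(plan_correction(desired=3, running=5))
print(plan_correction(desired=3, running=3))
```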

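＃ For details, see e.g. an article I wrote some time ago. That article is old, so the NATS subject names and the method names in the source code no longer match the current ones, but the essence of the behavior has not changed.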

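Now, note the behavior in (※2). This operation is not idempotent, so if the same request is mistakenly issued more than once, the number of instances increases or decreases more than necessary. For this reason, Health Manager has so far been the only (※3) core component of Cloud Foundry that was not clustered.

＃※3: The NATS server was not clustered either, but strictly speaking the NATS server is not a core component (it is positioned as an external tool), so I wrote "the only core component".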

In HM9000, the information related to (※1) and (※2) is stored in a backend data store, and sharing this data among multiple Health Manager processes makes clustering possible.
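To see why duplicate requests are harmful and how shared state can help, here is a toy sketch. A plain dict stands in for the backend data store, and `apply_start` and `send_once` are hypothetical names; HM9000's real logic is not shown in this article and is certainly more involved.

```python
def apply_start(running, count):
    """A relative 'start N more instances' request: not idempotent."""
    return running + count


# Two uncoordinated managers both see 1 of 3 instances and both ask for 2 more.
uncoordinated = 1
for _ in range(2):
    uncoordinated = apply_start(uncoordinated, 2)
print(uncoordinated)  # 5, two more than the desired 3


def send_once(store, key):
    """Record a pending request in the shared store; True only for the first caller."""
    if key in store:
        return False
    store[key] = "pending"
    return True


shared = {}          # stand-in for the shared data store (assumption)
coordinated = 1
for _ in range(2):   # the same two managers, now consulting the store first
    if send_once(shared, "app-1/start/2"):
        coordinated = apply_start(coordinated, 2)
print(coordinated)  # 3
```

A real data store would need the check-and-record step to be atomic across processes; the dict here only illustrates the idea.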

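According to the official README, "etcd or ZooKeeper can be used" as this backend data store. So (the preamble got long, but here at last is the main topic) this time I decided to actually run etcd.

The etcd README opens by describing it as "a highly-available key value store for shared configuration and service discovery". I take "highly-available" here to mean, concretely, that it can be made redundant, i.e. clustered. If the backend data store used to cluster HM9000 could not itself be clustered, the data store would become the single point of failure, which would defeat the purpose, so this is only natural.

Below, I follow that README and actually run it.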

Build

First, clone etcd from GitHub and build it.

cd $GOPATH
git clone https://github.com/coreos/etcd.git src/github.com/coreos/etcd
cd src/github.com/coreos/etcd/
./build

The build finishes in an instant.

Standalone execution

First, I try running a single instance.

Create a directory for storing data.

mkdir /tmp/etcd1.data

Start the etcd binary that was built.

./etcd -data-dir /tmp/etcd1.data -name machine1 -addr 0.0.0.0:4001 &

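-addr is the address/port on which etcd listens for access from clients (curl etc.). It is optional, and 127.0.0.1:4001 is used when it is omitted; however, considering use as a backend for Health Manager, I explicitly specify 0.0.0.0:4001 here so that etcd listens on both localhost and the external IP address. The external IP address of this test machine (hereafter test machine 1) is 192.168.14.111.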

[etcd] Dec  7 16:03:58.879 INFO      | etcd server [name machine1, listen on 0.0.0.0:4001, advertised url http://0.0.0.0:4001]
[etcd] Dec  7 16:03:58.880 INFO      | raft server [name machine1, listen on 127.0.0.1:7001, advertised url http://127.0.0.1:7001]

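On a successful start, output like the above should appear on the console. etcd occupies the console when started normally, so it is a good idea to append & at startup to run it in the background.

Now let's put in some data (see the curl man page etc. for details of the curl options).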

curl -L http://127.0.0.1:4001/v2/keys/key1 -X PUT -d value="value1"

Data is inserted with PUT, not POST. Updates also use PUT, so write operations are consolidated into PUT.
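For reference, the same PUT can also be built from a program. The sketch below uses only Python's standard library and builds the request without sending it; `build_put` is a name made up here, and the URL layout simply mirrors the curl command above.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_put(base_url, key, value):
    """Build (but do not send) a PUT to the v2 keys endpoint used above."""
    # Form-encoded body, the counterpart of curl's -d value="..."
    body = urlencode({"value": value}).encode("ascii")
    return Request(f"{base_url}/v2/keys/{key}", data=body, method="PUT")


req = build_put("http://127.0.0.1:4001", "key1", "value1")
print(req.get_method(), req.full_url, req.data)
```

Passing the returned object to `urllib.request.urlopen` would send it, assuming an etcd instance is listening on that address. The same function covers both creation and update, since both are PUTs.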

{"action":"set","node":{"key":"/key1","value":"value1","modifiedIndex":2,"createdIndex":2}}

When the write succeeds, output like the above is returned. This alone does not tell us whether the value was really written, so let's read it back to check.

curl -L http://127.0.0.1:4001/v2/keys/key1 -X GET

{"action":"get","node":{"key":"/key1","value":"value1","modifiedIndex":2,"createdIndex":2}}

Sure enough, the value written earlier was read back.

Just in case, let's also read it via the external IP address.
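When checking such responses from a script rather than by eye, the JSON body can be parsed directly. A minimal sketch, assuming the response text has already been obtained (for example from curl); `read_node` is a hypothetical helper name and the sample is the response shown above.

```python
import json


def read_node(response_text):
    """Return the 'node' object of a successful v2 keys response."""
    return json.loads(response_text)["node"]


resp = ('{"action":"get","node":{"key":"/key1","value":"value1",'
        '"modifiedIndex":2,"createdIndex":2}}')
node = read_node(resp)
print(node["key"], node["value"], node["modifiedIndex"])  # /key1 value1 2
```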

curl -L http://192.168.14.111:4001/v2/keys/key1 -X GET

{"action":"get","node":{"key":"/key1","value":"value1","modifiedIndex":2,"createdIndex":2}}

It was read without any problem.

Next, let's update the data. This time I PUT via the external IP address.

curl -L http://192.168.14.111:4001/v2/keys/key1 -X PUT -d value="new value"

{"action":"set","node":{"key":"/key1","prevValue":"value1","value":"new value","modifiedIndex":3,"createdIndex":3}}

Check whether it was updated.

curl -L http://192.168.14.111:4001/v2/keys/key1 -X GET

{"action":"get","node":{"key":"/key1","value":"new value","modifiedIndex":3,"createdIndex":3}}

curl -L http://127.0.0.1:4001/v2/keys/key1 -X GET

{"action":"get","node":{"key":"/key1","value":"new value","modifiedIndex":3,"createdIndex":3}}

It has indeed been updated.

Finally, deletion. This time via localhost.

curl -L http://127.0.0.1:4001/v2/keys/key1 -X DELETE

{"action":"delete","node":{"key":"/key1","prevValue":"new value","modifiedIndex":4,"createdIndex":3}}

Confirm that it has been deleted.

curl -L http://127.0.0.1:4001/v2/keys/key1 -X GET

{"errorCode":100,"message":"Key Not Found","cause":"/key1","index":4}

curl -L http://192.168.14.111:4001/v2/keys/key1 -X GET

{"errorCode":100,"message":"Key Not Found","cause":"/key1","index":4}
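A client usually wants to treat "Key Not Found" differently from other failures. The sketch below works on the response text only; `value_or_none` is a hypothetical name, and the only error code it knows about is the 100 that appears in the responses above.

```python
import json

KEY_NOT_FOUND = 100  # errorCode seen in the responses above


def value_or_none(response_text):
    """Return the stored value, None if the key is absent; raise on other errors."""
    body = json.loads(response_text)
    if "errorCode" in body:
        if body["errorCode"] == KEY_NOT_FOUND:
            return None
        raise RuntimeError(f'{body["errorCode"]}: {body.get("message")}')
    return body["node"]["value"]


found = ('{"action":"get","node":{"key":"/key1","value":"new value",'
         '"modifiedIndex":3,"createdIndex":3}}')
missing = '{"errorCode":100,"message":"Key Not Found","cause":"/key1","index":4}'
print(value_or_none(found))    # new value
print(value_or_none(missing))  # None
```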

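Cluster configuration

Next, I try a cluster configuration. Although I call it a cluster, I could only get hold of two machines this time, so the machines used are the following two.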

192.168.14.111/22 (machine1)
192.168.15.149/22 (machine2)

First, start etcd on machine1.

./etcd -data-dir /tmp/etcd1.data -name machine1 -addr 0.0.0.0:4001 -peer-addr 0.0.0.0:7001 &

[etcd] Dec  7 23:02:23.980 INFO      | etcd server [name machine1, listen on 0.0.0.0:4001, advertised url http://0.0.0.0:4001]
[etcd] Dec  7 23:02:24.303 INFO      | URLs:  / machine1 (http://127.0.0.1:7001)
[etcd] Dec  7 23:02:24.304 WARNING   | the entire cluster is down! this peer will restart the cluster.
[etcd] Dec  7 23:02:24.304 INFO      | raft server [name machine1, listen on 127.0.0.1:7001, advertised url http://127.0.0.1:7001]

A WARNING is shown, but for now I go ahead and start etcd on machine2.

./etcd -data-dir /tmp/etcd2.data -name machine2 -addr 0.0.0.0:4002 -peer-addr 0.0.0.0:7002 -peers 192.168.14.111:7001 &

Specifying machine1's peer address as the argument of -peers should make the etcd instances talk peer-to-peer. Or so I thought...

[etcd] Dec  7 23:17:31.464 INFO      | etcd server [name machine2, listen on :4002, advertised url http://0.0.0.0:4002]
[etcd] Dec  7 23:17:31.466 WARNING   | cannot join to cluster via given peers, retry in 10 seconds
[etcd] Dec  7 23:17:41.467 WARNING   | cannot join to cluster via given peers, retry in 10 seconds
[etcd] Dec  7 23:17:51.468 WARNING   | cannot join to cluster via given peers, retry in 10 seconds
[etcd] Dec  7 23:18:01.468 CRITICAL  | Cannot join the cluster via given peers after 3 retries

[1]+  Exit 1                  ./etcd -data-dir /tmp/etcd2.data -name machine2 -addr 0.0.0.0:4002 -peer-addr 0.0.0.0:7002 -peers 192.168.14.111:7001

After retrying the connection three times, it died.

Looking again at the messages from when machine1 started:

[etcd] Dec  7 23:02:24.304 WARNING   | the entire cluster is down! this peer will restart the cluster.
[etcd] Dec  7 23:02:24.304 INFO      | raft server [name machine1, listen on 127.0.0.1:7001, advertised url http://127.0.0.1:7001]

Apparently, because no other peer exists, it has started up listening only on localhost.

After some investigation, I found that starting it with -force (or -f) makes it listen on the external IP address as well.

./etcd -data-dir /tmp/etcd1.data -name machine1 -addr 0.0.0.0:4001 -peer-addr 0.0.0.0:7001 -force &

[etcd] Dec  7 23:29:07.341 INFO      | etcd server [name machine1, listen on 0.0.0.0:4001, advertised url http://0.0.0.0:4001]
[etcd] Dec  7 23:29:07.342 INFO      | raft server [name machine1, listen on 0.0.0.0:7001, advertised url http://0.0.0.0:7001]

In this state, start machine2 once more.

./etcd -data-dir /tmp/etcd2.data -name machine2 -addr 0.0.0.0:4002 -peer-addr 0.0.0.0:7002 -peers 192.168.14.111:7001 &

[etcd] Dec  7 23:30:44.815 INFO      | etcd server [name machine2, listen on :4002, advertised url http://0.0.0.0:4002]
[etcd] Dec  7 23:30:44.820 INFO      | raft server [name machine2, listen on :7002, advertised url http://0.0.0.0:7002]

This time it seems to have started properly.

Now, let's put in some data.

curl -L http://192.168.14.111:4001/v2/keys/key1 -X PUT -d value="new value"

curl -L http://192.168.14.111:4001/v2/keys/key1 -X PUT -d value="value1"
raft: Command timeout

It doesn't work. I tried various things after this, but in the end I could not get a cluster using the external IP addresses to work. I'd like to try again when I get the chance.
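One thing that stands out in the logs above is that the raft server advertises http://0.0.0.0:7001, an address that another host cannot connect to. Whether this is what caused the timeout was not verified in this experiment; the sketch below only shows how such an advertised URL could be spotted in a log line. Both function names are hypothetical.

```python
import ipaddress
import re
from urllib.parse import urlsplit


def advertised_url(log_line):
    """Extract the 'advertised url' from an etcd startup log line, or None."""
    m = re.search(r"advertised url (\S+?)\]", log_line)
    return m.group(1) if m else None


def reachable_from_other_hosts(url):
    """False when the URL's host is an unspecified (0.0.0.0) or loopback address."""
    host = urlsplit(url).hostname
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return True  # a host name rather than an IP; cannot be judged here
    return not (ip.is_unspecified or ip.is_loopback)


line = ("[etcd] Dec  7 23:29:07.342 INFO      | raft server [name machine1, "
        "listen on 0.0.0.0:7001, advertised url http://0.0.0.0:7001]")
url = advertised_url(line)
print(url, reachable_from_other_hosts(url))
```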

Reluctantly changing course for now, I try the localhost cluster configuration as described in the README.

./etcd -data-dir /tmp/etcd1.data -name machine1 -addr 127.0.0.1:4001 -peer-addr 127.0.0.1:7001 &

[etcd] Dec  8 00:24:41.309 INFO      | etcd server [name machine1, listen on 127.0.0.1:4001, advertised url http://127.0.0.1:4001]
[etcd] Dec  8 00:24:41.310 INFO      | raft server [name machine1, listen on 127.0.0.1:7001, advertised url http://127.0.0.1:7001]

./etcd -data-dir /tmp/etcd2.data -name machine1 -addr 127.0.0.1:4002 -peer-addr 127.0.0.1:7002 -peers 127.0.0.1:7001 &

[etcd] Dec  8 00:25:30.308 INFO      | etcd server [name machine1, listen on 127.0.0.1:4002, advertised url http://127.0.0.1:4002]
[etcd] Dec  8 00:25:30.309 INFO      | raft server [name machine1, listen on 127.0.0.1:7002, advertised url http://127.0.0.1:7002]

./etcd -data-dir /tmp/etcd3.data -name machine1 -addr 127.0.0.1:4003 -peer-addr 127.0.0.1:7003 -peers 127.0.0.1:7001 &

[etcd] Dec  8 00:25:50.276 INFO      | etcd server [name machine1, listen on 127.0.0.1:4003, advertised url http://127.0.0.1:4003]
[etcd] Dec  8 00:25:50.277 INFO      | raft server [name machine1, listen on 127.0.0.1:7003, advertised url http://127.0.0.1:7003]

They started without problems.

Put in some data. The target is machine1.

curl -L http://127.0.0.1:4001/v2/keys/key1 -X PUT -d value="value1"

{"action":"set","node":{"key":"/key1","value":"value1","modifiedIndex":2,"createdIndex":2}}

This is OK too.

Read the data. First from machine1.

curl -L http://127.0.0.1:4001/v2/keys/key1 -X GET

{"action":"get","node":{"key":"/key1","value":"value1","modifiedIndex":2,"createdIndex":2}}

OK.

Next, from machine2.

curl -L http://127.0.0.1:4002/v2/keys/key1 -X GET

{"errorCode":100,"message":"Key Not Found","cause":"/key1","index":0}

Huh? That didn't work.

From machine3.

curl -L http://127.0.0.1:4003/v2/keys/key1 -X GET

{"errorCode":100,"message":"Key Not Found","cause":"/key1","index":0}

No good either.

Let's get the cluster information.

curl -L http://127.0.0.1:4001/v2/keys/_etcd/machines -X GET

{"action":"get","node":{"key":"/_etcd/machines","dir":true,"nodes":[{"key":"/_etcd/machines/machine1","value":"raft=http://127.0.0.1:7001&etcd=http://127.0.0.1:4001","modifiedIndex":1,"createdIndex":1}],"modifiedIndex":1,"createdIndex":1}}

Only machine1 is there.

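Looking back at the startup logs, the etcd instances started on ports [4002,7002] and [4003,7003] are both named machine1. In fact, these are instances that I once started with -name machine1 by mistake, then stopped on noticing the mistake, and restarted with -name machine2 and -name machine3 respectively. etcd apparently saves its configuration in the data-dir once it has been started, so the previous configuration can persist even when it is restarted with different settings. This time, I deleted the contents of the data-dir of machine2 and machine3 and restarted them with the correct settings.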
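A small helper can make this kind of naming mistake less likely by deriving the name, data-dir and ports of each local node from a single index. This is only a sketch built from the flags that appear in this article; `node_args` is a hypothetical name, and it assumes the /tmp/etcdN.data and 400N/700N layout used above.

```python
def node_args(index, first_peer_port=7001):
    """Command line (as a list) for the index-th local node, counting from 1."""
    args = [
        "./etcd",
        "-data-dir", f"/tmp/etcd{index}.data",
        "-name", f"machine{index}",
        "-addr", f"127.0.0.1:{4000 + index}",
        "-peer-addr", f"127.0.0.1:{7000 + index}",
    ]
    if index > 1:
        # Later nodes join through the first node's peer address.
        args += ["-peers", f"127.0.0.1:{first_peer_port}"]
    return args


for i in (1, 2, 3):
    print(" ".join(node_args(i)))
```

It does not clear a data-dir left over from an earlier run; that still has to be done by hand as described above.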

Get the cluster state once more.

curl -L http://127.0.0.1:4001/v2/keys/_etcd/machines -X GET

{"action":"get","node":{"key":"/_etcd/machines","dir":true,"nodes":[{"key":"/_etcd/machines/machine1","value":"raft=http://127.0.0.1:7001&etcd=http://127.0.0.1:4001","modifiedIndex":1,"createdIndex":1},{"key":"/_etcd/machines/machine2","value":"raft=http://127.0.0.1:7002&etcd=http://127.0.0.1:4002","modifiedIndex":4,"createdIndex":4},{"key":"/_etcd/machines/machine3","value":"raft=http://127.0.0.1:7003&etcd=http://127.0.0.1:4003","modifiedIndex":5,"createdIndex":5}],"modifiedIndex":1,"createdIndex":1}}

This time machine2 and machine3 show up properly.
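The machine list is long to read by eye, so it can help to reduce it to names and URLs. A minimal sketch, assuming the listing has the shape shown above, where each value is a query-string-like pair of raft and etcd URLs; `cluster_members` is a hypothetical name and the sample is shortened to two machines.

```python
import json
from urllib.parse import parse_qs


def cluster_members(response_text):
    """Map machine name -> {'raft': url, 'etcd': url} from a /_etcd/machines listing."""
    members = {}
    for node in json.loads(response_text)["node"].get("nodes", []):
        name = node["key"].rsplit("/", 1)[-1]   # last path segment is the name
        fields = parse_qs(node["value"])        # "raft=...&etcd=..."
        members[name] = {k: v[0] for k, v in fields.items()}
    return members


listing = ('{"action":"get","node":{"key":"/_etcd/machines","dir":true,"nodes":['
           '{"key":"/_etcd/machines/machine1",'
           '"value":"raft=http://127.0.0.1:7001&etcd=http://127.0.0.1:4001",'
           '"modifiedIndex":1,"createdIndex":1},'
           '{"key":"/_etcd/machines/machine2",'
           '"value":"raft=http://127.0.0.1:7002&etcd=http://127.0.0.1:4002",'
           '"modifiedIndex":4,"createdIndex":4}],'
           '"modifiedIndex":1,"createdIndex":1}}')
for name, urls in sorted(cluster_members(listing).items()):
    print(name, urls["raft"], urls["etcd"])
```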

Put in data, then get it.

curl -L http://127.0.0.1:4001/v2/keys/key1 -X PUT -d value="value1"

{"action":"set","node":{"key":"/key1","prevValue":"value1","value":"value1","modifiedIndex":3,"createdIndex":3}}

curl -L http://127.0.0.1:4001/v2/keys/key1 -X GET

{"action":"get","node":{"key":"/key1","value":"value1","modifiedIndex":3,"createdIndex":3}}

curl -L http://127.0.0.1:4002/v2/keys/key1 -X GET

{"action":"get","node":{"key":"/key1","value":"value1","modifiedIndex":3,"createdIndex":3}}

curl -L http://127.0.0.1:4003/v2/keys/key1 -X GET

{"action":"get","node":{"key":"/key1","value":"value1","modifiedIndex":3,"createdIndex":3}}

This time it seems fine.

Let's update from machine2.

curl -L http://127.0.0.1:4002/v2/keys/key1 -X PUT -d value="value2"

{"action":"set","node":{"key":"/key1","prevValue":"value1","value":"value2","modifiedIndex":6,"createdIndex":6}}

Read from each machine.

curl -L http://127.0.0.1:4001/v2/keys/key1 -X GET

{"action":"get","node":{"key":"/key1","value":"value2","modifiedIndex":6,"createdIndex":6}}

curl -L http://127.0.0.1:4002/v2/keys/key1 -X GET

{"action":"get","node":{"key":"/key1","value":"value2","modifiedIndex":6,"createdIndex":6}}

curl -L http://127.0.0.1:4003/v2/keys/key1 -X GET

{"action":"get","node":{"key":"/key1","value":"value2","modifiedIndex":6,"createdIndex":6}}

No problem here either.
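The "read from every machine and compare" check done by hand above could be automated along these lines. The sketch takes response bodies that were already fetched from each node; `nodes_agree` is a hypothetical name.

```python
import json


def nodes_agree(responses):
    """True when every response carries the same value and modifiedIndex."""
    seen = set()
    for text in responses:
        node = json.loads(text)["node"]
        seen.add((node["value"], node["modifiedIndex"]))
    return len(seen) == 1


same = ('{"action":"get","node":{"key":"/key1","value":"value2",'
        '"modifiedIndex":6,"createdIndex":6}}')
stale = ('{"action":"get","node":{"key":"/key1","value":"value1",'
         '"modifiedIndex":3,"createdIndex":3}}')
print(nodes_agree([same, same, same]))   # True
print(nodes_agree([same, stale, same]))  # False
```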

Next, delete from machine3.

curl -L http://127.0.0.1:4003/v2/keys/key1 -X DELETE

{"action":"delete","node":{"key":"/key1","prevValue":"value2","modifiedIndex":7,"createdIndex":6}}

Likewise, look it up from each machine.

curl -L http://127.0.0.1:4003/v2/keys/key1 -X GET

{"errorCode":100,"message":"Key Not Found","cause":"/key1","index":7}

curl -L http://127.0.0.1:4002/v2/keys/key1 -X GET

{"errorCode":100,"message":"Key Not Found","cause":"/key1","index":7}

curl -L http://127.0.0.1:4001/v2/keys/key1 -X GET

{"errorCode":100,"message":"Key Not Found","cause":"/key1","index":7}

It no longer exists, whichever machine you look from.

Next, I put in data and then take etcd instances down. First, the insertion.

curl -L http://127.0.0.1:4001/v2/keys/xyz -X PUT -d value="ABC"

{"action":"set","node":{"key":"/xyz","value":"ABC","modifiedIndex":10,"createdIndex":10}}

Right now it can be read from every machine.

curl -L http://127.0.0.1:4001/v2/keys/xyz -X GET

{"action":"get","node":{"key":"/xyz","value":"ABC","modifiedIndex":10,"createdIndex":10}}

curl -L http://127.0.0.1:4002/v2/keys/xyz -X GET

{"action":"get","node":{"key":"/xyz","value":"ABC","modifiedIndex":10,"createdIndex":10}}

curl -L http://127.0.0.1:4003/v2/keys/xyz -X GET

{"action":"get","node":{"key":"/xyz","value":"ABC","modifiedIndex":10,"createdIndex":10}}

After taking machine1 down,

curl -L http://127.0.0.1:4001/v2/keys/xyz -X GET

curl: (7) couldn't connect to host

machine1 cannot be reached, but

curl -L http://127.0.0.1:4002/v2/keys/xyz -X GET

{"action":"get","node":{"key":"/xyz","value":"ABC","modifiedIndex":10,"createdIndex":10}}

curl -L http://127.0.0.1:4003/v2/keys/xyz -X GET

{"action":"get","node":{"key":"/xyz","value":"ABC","modifiedIndex":10,"createdIndex":10}}

the data was retrieved properly from machine2 and machine3.

Next, taking machine2 down as well,

curl -L http://127.0.0.1:4001/v2/keys/xyz -X GET

curl: (7) couldn't connect to host

curl -L http://127.0.0.1:4002/v2/keys/xyz -X GET

curl: (7) couldn't connect to host

curl -L http://127.0.0.1:4003/v2/keys/xyz -X GET

{"action":"get","node":{"key":"/xyz","value":"ABC","modifiedIndex":10,"createdIndex":10}}

This time the data could be retrieved only from machine3.
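etcd replicates data with Raft, which requires a majority of the cluster members to agree before a write is committed. The arithmetic is sketched below with hypothetical function names. In a 3-node cluster with only machine3 left the majority is lost; the GET above still returned the stored value, but whether a PUT would succeed in this state was not tested here.

```python
def quorum(cluster_size):
    """Smallest number of members that forms a majority."""
    return cluster_size // 2 + 1


def can_commit_writes(cluster_size, alive):
    """True when enough members are alive to commit a new write."""
    return alive >= quorum(cluster_size)


for alive in (3, 2, 1):
    print(alive, "of 3 alive ->", can_commit_writes(3, alive))
```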

Now start machine1 again and have it join the cluster.

./etcd -data-dir /tmp/etcd1.data -name machine1 -addr 127.0.0.1:4001 -peer-addr 127.0.0.1:7001 -peers 127.0.0.1:7003 -force

[etcd] Dec  8 01:41:20.668 INFO      | etcd server [name machine1, listen on 127.0.0.1:4001, advertised url http://127.0.0.1:4001]
[etcd] Dec  8 01:41:21.019 WARNING   | cannot join to cluster via given peers, retry in 10 seconds
[etcd] Dec  8 01:41:31.371 WARNING   | cannot join to cluster via given peers, retry in 10 seconds
[etcd] Dec  8 01:41:41.722 WARNING   | cannot join to cluster via given peers, retry in 10 seconds
[etcd] Dec  8 01:41:51.722 CRITICAL  | Cannot join the cluster via given peers after 3 retries

!? That failed. So I started it without -peers, just as when I first started it.

./etcd -data-dir /tmp/etcd1.data -name machine1 -addr 127.0.0.1:4001 -peer-addr 127.0.0.1:7001

[etcd] Dec  8 01:49:51.079 INFO      | etcd server [name machine1, listen on 127.0.0.1:4001, advertised url http://127.0.0.1:4001]
[etcd] Dec  8 01:49:51.161 INFO      | URLs:  / machine1 (http://127.0.0.1:7001,http://127.0.0.1:7002,http://127.0.0.1:7003)
[etcd] Dec  8 01:49:51.512 WARNING   | the entire cluster is down! this peer will restart the cluster.
[etcd] Dec  8 01:49:51.513 INFO      | raft server [name machine1, listen on 127.0.0.1:7001, advertised url http://127.0.0.1:7001]

!? This time it started.

Check the cluster information.

curl -L http://127.0.0.1:4001/v2/keys/_etcd/machines -X GET

{"action":"get","node":{"key":"/_etcd/machines","dir":true,"nodes":[{"key":"/_etcd/machines/machine1","value":"raft=http://127.0.0.1:7001&etcd=http://127.0.0.1:4001","modifiedIndex":1,"createdIndex":1},{"key":"/_etcd/machines/machine2","value":"raft=http://127.0.0.1:7002&etcd=http://127.0.0.1:4002","modifiedIndex":2,"createdIndex":2},{"key":"/_etcd/machines/machine3","value":"raft=http://127.0.0.1:7003&etcd=http://127.0.0.1:4003","modifiedIndex":3,"createdIndex":3}],"modifiedIndex":1,"createdIndex":1}}

curl -L http://127.0.0.1:4003/v2/keys/_etcd/machines -X GET

{"action":"get","node":{"key":"/_etcd/machines","dir":true,"nodes":[{"key":"/_etcd/machines/machine1","value":"raft=http://127.0.0.1:7001&etcd=http://127.0.0.1:4001","modifiedIndex":1,"createdIndex":1},{"key":"/_etcd/machines/machine2","value":"raft=http://127.0.0.1:7002&etcd=http://127.0.0.1:4002","modifiedIndex":4,"createdIndex":4},{"key":"/_etcd/machines/machine3","value":"raft=http://127.0.0.1:7003&etcd=http://127.0.0.1:4003","modifiedIndex":5,"createdIndex":5}],"modifiedIndex":1,"createdIndex":1}}

It seems to be working.

Let's get, from machine1, the data that was put in before the stop and restart.

curl -L http://127.0.0.1:4001/v2/keys/xyz -X GET

{"action":"get","node":{"key":"/xyz","value":"ABC","modifiedIndex":10,"createdIndex":10}}

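Got it. The data appears to have been restored properly, although this check alone cannot tell whether it came from the other nodes or from machine1's own data-dir, which was kept across the restart.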

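Summary

This has gotten quite long, so I'll wrap up here.

etcd has a few quirks in how a cluster is configured, but a cluster can be put together fairly easily, and the reliability of its data redundancy looks high. What remains of interest is how well it performs as a backend for Health Manager. If I have time, I'd like to look into that in the future.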

nota-ja/gist:7847007
