Created
July 3, 2012 01:08
[14:38]*** Ines_S ([email protected]) has joined channel #ey-vitrue | |
<Ines_S> hello | |
<banzaiman> sessions: no resque processes running on this one. | |
[14:40]<banzaiman> sessions: is there one that has resque running? I checked resque10, resque8 and resque9, but I don't see any resque process yet. | |
[14:41]<banzaiman> sessions: so, this failure to reload unicorn must be coming from somewhere else. | |
<banzaiman> deploy error is somehow getting stuck on resque9. | |
[14:43]<Drewzar> Hey Ines_S | |
[14:44]<Drewzar> LuVitrue: is also here with us. | |
<Ines_S> hey Drewzar | |
<Drewzar> Were you able to find anything? | |
[14:45]<banzaiman> sessions: so… | |
<Ines_S> going to fetch your chef recipes, you guys have an even more custom cluster than we normally find | |
<banzaiman> sessions: i-c0ef0ca0 is the one that had the issue. | |
<banzaiman> sessions: how is resque supposed to be started on this? monit is not getting set up, it seems. | |
<sessions> banzaiman yes, all the nodes labeled resque should have resque running | |
[14:46]<sessions> yeah, it should be monit | |
[14:49]<banzaiman> sessions: hmm. | |
[14:50]<banzaiman> resque10, resque12, resque9, resque11, resque13, resque8, resque3 are not running any resque process. | |
[14:51]<sessions> hmm... that could be a problem | |
<Drewzar> Ines_S: Yea, sorry I didn't build that env, otherwise I'd be more of a help =\ | |
<Ines_S> no worries, it will just take a little time for me to understand what you guys were trying to do | |
[14:52]<Ines_S> the last time I saw your environment it was just a 6 node replica set | |
<sessions> banzaiman i guess i'll need to deploy... there is no current directory | |
<Ines_S> btw did you get a chance to look at your production cluster? | |
<Ines_S> looks like it's missing a config server | |
[14:53]<Drewzar> I haven't looked in to that but I'm pretty sure the app is working. | |
<Drewzar> (as no one is screaming) :) | |
<sessions> banzaiman deploying now to see if it drops the current directory on those nodes | |
<LuVitrue> On staging, I don't remember we had a standalone config server. | |
[14:56]<Ines_S> Drewzar: the app is working because the shards can be written/read from | |
<Ines_S> but if you need to rebalance | |
<Ines_S> the operation will fail | |
[14:57]<Drewzar> So we need another config server in prod? | |
<Drewzar> Do we have to take the whole mongo cluster to do that? | |
<Drewzar> take it down* | |
[14:59]<LuVitrue> Looks like we are missing one configserver on production, too. | |
<Ines_S> yeah | |
<Ines_S> you're missing config 2 | |
<Ines_S> so the metadata in the cluster is read only | |
<Ines_S> things "look ok" but you need 3 for a fully functional cluster | |
<Ines_S> so talk to me about your staging environment | |
[15:00]<Ines_S> my guess is that you are doing a shard consisting of a 2 node replica set | |
<Ines_S> is that correct? | |
<Ines_S> then you are missing an arbiter | |
<banzaiman> sessions: let me know how it goes | |
[15:01]<sessions> ok | |
[15:02]<sessions> banzaiman just an update: it's on the step right after the "permanently added...." | |
[15:03]<LuVitrue> Ines_S: you mean mongos? | |
<banzaiman> sessions: do you use engineyard gem? | |
[15:04]<sessions> yeah, but i used the dashboard in this case and am tailing the log on app-master | |
<banzaiman> ssh key warnings are mostly noise, so it may or may not be relevant. | |
<banzaiman> ok. | |
[15:05]<sessions> it's actually been a long time.... seems to be much much longer than normal | |
[15:07]<banzaiman> sessions: it appears to fail at the same spot with the same unicorn master. | |
<sessions> hmm | |
[15:09]<banzaiman> not sure what's going on there. | |
<banzaiman> the master seems functional. | |
<sessions> yeah the app is fine | |
<LuVitrue> Not sure how it was setup, but an arbiter doesn't require a dedicated server, | |
[15:10]<LuVitrue> it's probably running from some other node? | |
[15:18]<Ines_S> but it's still a process | |
<Ines_S> that needs to be up | |
[15:19]<Ines_S> requerying your instances for status | |
<sessions> banzaiman any ideas? | |
[15:20]<banzaiman> I'm looking at the unicorn masters that failed to reload at the moment | |
<banzaiman> there appear to be a few of those. | |
[15:21]<Ines_S> ohh so you had a shard of only a master-server | |
<Ines_S> crater i-9847e9f5 util [mongo1] ~ # cat /etc/mongodb/master.conf | |
<Ines_S> dbpath = /db/mongodb/master | |
<Ines_S> logpath = /var/log/mongodb/master.log | |
<Ines_S> logappend = true | |
[15:24]<Ines_S> I'm attempting to restart the mongod processes of your data nodes | |
<Ines_S> if this works your staging env may be ok | |
[15:27]<Ines_S> alright | |
<Ines_S> looks like it's up | |
<Ines_S> crater i-b05cf2dd util [mongo2] ~ # ps aux | grep mongo | |
<Ines_S> root 3540 0.0 0.7 89804 12924 ? Sl Jun30 0:02 /usr/local/mongodb/bin/mongod --configsvr --config /etc/mongodb/config.conf | |
<Ines_S> root 17406 0.0 1.6 174816 29648 pts/1 Sl 15:23 0:00 /usr/local/mongodb/bin/mongod --config /etc/mongodb/master.conf | |
<Ines_S> root 17468 0.0 0.0 1796 604 pts/1 S+ 15:26 0:00 grep --colour=auto mongo | |
<Ines_S> crater i-b05cf2dd util [mongo2] ~ # /usr/local/mongodb/bin/mongo | |
<Ines_S> MongoDB shell version: 2.0.0 | |
<Ines_S> connecting to: test | |
<Ines_S> > show dbs; | |
<Ines_S> local 0.203125GB | |
<Ines_S> publisher 1.452880859375GB | |
<Ines_S> > | |
[15:28]<banzaiman> sessions: can I try restarting unicorn on one? | |
<LuVitrue> looks like it's just that mongod was down | |
[15:29]<Ines_S> yep | |
<Ines_S> try another deploy? | |
<sessions> banzaiman yes | |
<Ines_S> I'm updating the ticket | |
<LuVitrue> cool, thanks. | |
[15:31]<LuVitrue> Drewzar: we need to fix the production mongo servers | |
[15:32]<banzaiman> ah | |
<banzaiman> sessions: | |
<banzaiman> app i-1b41cf7e current # cat config/database.yml | |
<banzaiman> production: | |
<banzaiman> adapter: mysql2 | |
<banzaiman> database: 'emcee' | |
<banzaiman> username: 'deploy' | |
<banzaiman> password: 'q2wC5vdJll' | |
<banzaiman> host: 'ec2-50-16-120-246.compute-1.amazonaws.com' | |
<banzaiman> reconnect: true | |
<banzaiman> but mysql2 is not available. | |
<banzaiman> your app is using mysql, it appears. | |
<sessions> banzaiman... shouldn't we be using mysql instead of mysql2? | |
<LuVitrue> use mysql2 | |
[15:33]<LuVitrue> We have upgraded about two months ago. | |
<LuVitrue> wait, | |
<LuVitrue> this is for emcee? | |
<banzaiman> Gemfile is not reflecting that. | |
<banzaiman> yes, emcee. | |
<LuVitrue> nvmd | |
[15:34]<banzaiman> ok | |
<banzaiman> why are we choosing mysql2, I wonder… | |
<sessions> banzaiman i think this is a recent change | |
<banzaiman> sessions: on your part, or ours? | |
[15:35]<Drewzar> LuVitrue: Ines_S: so publisher staging is back up and running? | |
[15:36]<LuVitrue> at least the mongodb connection is okay now. | |
<sessions> yours | |
<banzaiman> forcing mysql restarts unicorn, it appears. | |
<Ines_S> yeah Drewzar, you need to confirm that you can deploy correctly though | |
<Drewzar> LuVitrue: okay, lets wait till sessions and banzaiman stop playing with prod before we start another thing. | |
[15:37]<LuVitrue> I am redeploying staging now. | |
[15:38]<Drewzar> okay, let us know :) | |
<sessions> banzaiman force restart rather than hot reload? | |
[15:39]<banzaiman> sessions: if the app's database.yml is messed up, I don't want to restart unicorn. | |
<sessions> i see | |
<sessions> LuVitrue: do you see a problem with using the mysql2 gem with emcee? | |
[15:40]<banzaiman> sessions: and all of them, except the one that I just manually edited, has mysql2. | |
<LuVitrue> I haven't tested mysql2 with emcee. | |
<sessions> banzaiman strange, because the application is actually running | |
<sessions> on the app nodes | |
<LuVitrue> I thought you were talking about Publisher. | |
[15:41]<banzaiman> sessions: I am puzzled, too. :-S | |
[15:43]<sessions> banzaiman so it appears as the app will work on mysql2 | |
<sessions> actually i don't know | |
[15:44]<Drewzar> Want me to pull Eli in here? | |
<Ines_S> hey guys any questions regarding the mongo cluster on Publisher? | |
<sessions> seems as though the app is though b/c it's in the db.yml folder in the current directory | |
[15:45]<Ines_S> I may step out for a bit and work on a different ticket | |
<Drewzar> Ines_S: in order to add a config server we're going to need to reboot the mongo cluster, correct? | |
[15:47]<Ines_S> yeah, you will need to stop the entire cluster | |
<Ines_S> change the startup scripts | |
<Ines_S> here let me give you a link | |
<Drewzar> Okay we'll need to plan downtime for that. | |
<Ines_S> likely best done with a scheduled maintenance window | |
[15:48]<Drewzar> yeah | |
<Drewzar> I'll work with sessions on that. | |
<Ines_S> http://www.mongodb.org/display/DOCS/Changing+Config+Servers | |
<Ines_S> ok, depending on the time I may be able to join you | |
<Drewzar> Sweet thanks | |
<Drewzar> It's usually done Thursdays at 11P | |
<Ines_S> how about you open a ticket to start planning the maintenance | |
<banzaiman> sessions: I think it is actually using mysql on the app master. | |
<banzaiman> app_master i-b251d5dd ~ # lsof | grep emcee | grep mysql | |
<banzaiman> rubyee 8271 deploy mem REG 65,145 91230 165638 /data/emcee/shared/bundled_gems/ruby/1.8/gems/mysql-2.8.1/lib/mysql_api.so | |
<banzaiman> rubyee 11435 deploy mem REG 65,145 91230 165638 /data/emcee/shared/bundled_gems/ruby/1.8/gems/mysql-2.8.1/lib/mysql_api.so | |
[15:49]<Ines_S> we can document what needs to be done | |
<banzaiman> rubyee 11436 deploy mem REG 65,145 91230 165638 /data/emcee/shared/bundled_gems/ruby/1.8/gems/mysql-2.8.1/lib/mysql_api.so | |
<banzaiman> rubyee 11438 deploy mem REG 65,145 91230 165638 /data/emcee/shared/bundled_gems/ruby/1.8/gems/mysql-2.8.1/lib/mysql_api.so | |
<banzaiman> rubyee 11446 deploy mem REG 65,145 91230 165638 /data/emcee/shared/bundled_gems/ruby/1.8/gems/mysql-2.8.1/lib/mysql_api.so | |
<Ines_S> and have someone handy to assist if needed | |
<Drewzar> Ines_S: I'll open a ticket once we have a date set | |
<Ines_S> ok guys I'm stepping out | |
<Ines_S> cool Drewzar | |
<Drewzar> Thanks so much :) | |
<Ines_S> nice to meet you all | |
<Ines_S> welcome |
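
Editor's sketch (not part of the original conversation): the check banzaiman ran by hand, comparing the adapter named in database.yml with the MySQL gem that lsof shows loaded in the running processes, could be automated roughly as follows. This is a minimal sketch under stated assumptions: the helper names `loaded_gems` and `adapter_matches` are hypothetical, the regular expression assumes gem directories of the form `/gems/<name>-<version>/` as seen in the lsof output above, and it assumes the adapter name equals the gem name (`mysql` or `mysql2`). Gem names containing hyphens are not handled.

```python
import re

# Assumed path layout, taken from the lsof output in the log:
#   .../gems/<name>-<version>/lib/...
GEM_PATH = re.compile(r"/gems/(?P<gem>[A-Za-z0-9_]+)-(?P<version>\d[\w.]*)/")


def loaded_gems(lsof_output):
    """Return the set of (gem, version) pairs whose files appear in lsof output."""
    found = set()
    for line in lsof_output.splitlines():
        match = GEM_PATH.search(line)
        if match:
            found.add((match.group("gem"), match.group("version")))
    return found


def adapter_matches(adapter, gems):
    """True if a loaded gem has the same name as the database.yml adapter."""
    return any(name == adapter for name, _ in gems)


# One line copied from the lsof output in the log.
sample = (
    "rubyee 8271 deploy mem REG 65,145 91230 165638 "
    "/data/emcee/shared/bundled_gems/ruby/1.8/gems/mysql-2.8.1/lib/mysql_api.so"
)

gems = loaded_gems(sample)
print(gems)
print(adapter_matches("mysql2", gems))
```

With the sample line, the only gem found is mysql 2.8.1, so the check against the `mysql2` adapter fails. That matches what banzaiman observed: the running processes were still using the older mysql gem while database.yml named mysql2.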