@minimum2scp
Last active December 21, 2015 21:29
The RAID1 array on my home server degraded
 6043  2012-10-20 13:41  0:00  sudo mdadm --detail
 6044  2012-10-20 13:42  0:00  sudo mdadm --remove /dev/md0 /dev/sdb1
 6045  2012-10-20 13:43  0:00  apt-cache show gpart
 6046  2012-10-20 13:44  0:00  sudo apt-get install parted
 6047  2012-10-20 13:45  0:00  sudo parted
 6050  2012-10-20 13:51  0:00  sudo mdadm --create /dev/md1 --level=1 --raid-devices=2 missing /dev/sdb2
 6051  2012-10-20 13:52  0:00  sudo mdadm --detail --scan
 6052  2012-10-20 13:52  0:00  sudo pvcreate /dev/md1
 6053  2012-10-20 13:52  0:00  sudo vgextend vg3 /dev/md1
 6054  2012-10-20 13:52  0:00  run-help sudo
 6055  2012-10-20 13:53  0:00  run-help pvmove
 6056  2012-10-20 13:53  0:00  pvmove -h
 6057  2012-10-20 13:54  0:00  sudo pvmove -h
 6058  2012-10-20 13:55  0:00  sudo pvmove -v /dev/vg3/testdomU-disk /dev/md0 /dev/md1
 6059  2012-10-20 13:55  0:00  sudo pvmove -v /dev/mapper/vg3-testdomU--disk /dev/md0 /dev/md1
 6060  2012-10-20 13:55  0:00  sudo pvmove -v testdomU-disk /dev/md0 /dev/md1
 6061  2012-10-20 13:55  0:00  sudo pvmove -v -n testdomU-disk /dev/md0 /dev/md1
 6062  2012-10-20 13:58  0:00  sudo pvmove -v /dev/md0 /dev/md1
 6063  2012-10-20 14:00  0:00  sudo lvs
 6064  2012-10-20 14:04  0:00  lvdisplay --help
 6065  2012-10-20 14:04  0:00  run-help lvdisplay
 6066  2012-10-20 14:04  0:00  man lvm
 6067  2012-10-20 14:04  0:00  man lvdisplay
 6068  2012-10-20 14:14  0:00  man lvs
 6069  2012-10-20 14:51  0:00  df /backup/asp
 6070  2012-10-20 14:51  0:00  ll /tmp
 6071  2012-10-20 15:17  0:00  sudo vgreduce
 6072  2012-10-20 15:17  0:00  sudo vgreduce -h
 6073  2012-10-20 15:17  0:00  ls /dev/vg3/
 6074  2012-10-20 15:17  0:00  sudo vgreduce vg3 /dev/md0
 6075  2012-10-20 15:20  0:00  apt-cache search gpart
 6076  2012-10-20 15:23  0:00  sudo mdadm --detail /dev/md0
 6077  2012-10-20 15:23  0:00  sudo mdadm --detail /dev/md1
 6078  2012-10-20 15:23  0:00  ls /dev/sd*
 6079  2012-10-20 15:23  0:00  sudo mdadm --add /dev/md1 /dev/sda2
 6080  2012-10-20 15:24  0:00  lv /etc/mdadm/mdadm.conf
 6081  2012-10-20 15:27  0:00  df -T
 6082  2012-10-20 15:27  0:00  sudo update-initramfs -u -a
 6083  2012-10-20 15:29  0:00  ll /lib/mod
 6084  2012-10-20 15:30  0:00  ll /lib
 6085  2012-10-20 15:30  0:00  cat /boot/grub/device.map
 6087  2012-10-20 22:01  0:00  dpkg -l grub\*
 6088  2012-10-20 22:01  0:00  ls /var/cache/apt/archives/grub*
 6089  2012-10-20 22:01  0:00  apt-cache policy grub2
 6090  2012-10-20 22:02  0:00  apt-holds.rb -a grub2 grub-pc grub-common
 6091  2012-10-20 22:08  0:00  apt-holds.rb -r grub2 grub-pc grub-common
 6092  2012-10-20 22:08  0:00  sudo apt-get dist-upgrade
 6093  2012-10-20 22:13  0:00  sudo apt-get remove --purge linux-image-3.2.0-1-amd64 linux-image-3.2.0-2-amd64
 6094  2012-10-20 22:13  0:00  ls /lib/modules
 6095  2012-10-20 22:14  0:00  sudo apt-get remove --purge xen-hypervisor-4.0-amd64 -s
 6096  2012-10-20 22:14  0:00  dpkg -l xen-qemu\*
 6097  2012-10-20 22:14  0:00  apt-cache search qemu dm
 6098  2012-10-20 22:14  0:00  sudo apt-get autoclean
 6099  2012-10-20 22:15  0:00  sudo vi /etc/default/grub
 6100  2012-10-20 22:18  0:00  sudo mdadm --fail /dev/md0 /dev/sdb1
 6101  2012-10-20 22:19  0:00  sudo mdadm --fail /dev/md1 /dev/sdb1
 6102  2012-10-20 22:19  0:00  sudo mdadm --fail /dev/md1 /dev/sdb2
 6103  2012-10-20 22:19  0:00  sudo mdadm --remove /dev/md1 /dev/sdb2
 6104  2012-10-20 22:21  0:00  sudo parted /dev/sda
 6105  2012-10-20 22:23  0:00  sudo mkfs.ext4 /dev/sdb2
 6106  2012-10-20 22:29  0:00  ll /dev/sd*
 6107  2012-10-20 22:30  0:00  man mdadm
 6108  2012-10-20 22:31  0:00  sudo mdadm --zero-superblock /dev/sdb2
 6109  2012-10-20 22:31  0:00  sudo mdadm --zero-superblock /dev/md0
 6110  2012-10-20 22:33  0:00  sudo parted /dev/sdb print
 6111  2012-10-20 22:33  0:00  man parted
 6112  2012-10-20 22:34  0:00  sudo parted /dev/sdb 'unit s; print'
 6113  2012-10-20 22:34  0:00  echo -e "unit s\nprint" | sudo parted /dev/sdb
 6114  2012-10-20 22:36  0:00  echo -e "unit s\nprint\nq" | sudo parted /dev/sda
 6115  2012-10-20 22:36  0:00  sudo pvremove /dev/md0
 6117  2012-10-20 22:37  0:00  echo -e "unit s\nprint\nq" | sudo parted /dev/sdb
 6118  2012-10-20 22:37  0:00  sudo parted /dev/sdb
 6119  2012-10-20 22:40  0:00  ls /dev/sdb*
 6120  2012-10-20 22:40  0:00  sudo mkfs.reiserfs /dev/sdb1
 6121  2012-10-20 22:43  0:00  sudo mdadm --stop /dev/md0
 6122  2012-10-20 22:43  0:00  sudo mdadm --create /dev/md0 --level=1 --raid-disks=2 missing /dev/sdb2
 6123  2012-10-20 22:44  0:00  sudo mdadm --readwrite /dev/md0
 6124  2012-10-20 22:45  0:00  sudo pvcreate /dev/md0
 6125  2012-10-20 22:45  0:00  sudo vgextend vg3 /dev/md0
 6126  2012-10-20 22:45  0:00  sudo vgs
 6127  2012-10-20 22:45  0:00  screen
 6128  2012-10-20 22:45  0:00  sudo pvmove /dev/md1 /dev/md0
 6129  2012-10-20 23:04  0:00  free -m
 6130  2012-10-20 23:06  0:00  psql -U rsyslog -h localhost rsyslog
 6131  2012-10-20 23:45  0:00  screen -xR
 6132  2012-10-20 23:58  0:00  sudo vgreduce vg3 /dev/md1
 6133  2012-10-20 23:58  0:00  sudo pvremove /dev/md1
 6134  2012-10-20 23:59  0:00  sudo mdadm --stop /dev/md1
 6135  2012-10-20 23:59  0:00  sudo pvs
 6137  2012-10-20 23:59  0:00  sudo mdadm --misc --zero-superblock /dev/sda2
 6138  2012-10-20 23:59  0:00  sudo mdadm --add /dev/md0 /dev/sda2
 6139  2012-10-21 00:02  0:00  sudo xm tpo
 6141  2012-10-21 18:30  0:00  sudo update-initramfs -u -k all
 6142  2012-10-21 23:50  0:00  watch cat /proc/mdstat
From: mdadm monitoring <[email protected]>
To: [email protected]
Subject: Fail event on /dev/md0:bigbrother
Message-Id: <[email protected]>
Date: Thu, 29 Aug 2013 01:12:46 +0900 (JST)

This is an automatically generated mail message from mdadm
running on bigbrother

A Fail event had been detected on md device /dev/md0.

It could be related to component device /dev/sdb2.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid1] 
md0 : active raid1 sda2[2] sdb2[1](F)
      976627264 blocks super 1.2 [2/1] [U_]
      
unused devices: <none>
tsuyoshi@bigbrother% cat /proc/mdstat
Personalities : [raid1] 
md0 : active raid1 sda2[2] sdb2[1](F)
      976627264 blocks super 1.2 [2/1] [U_]
      
unused devices: <none>
tsuyoshi@bigbrother% sudo mdadm -D /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Sat Oct 20 22:43:54 2012
     Raid Level : raid1
     Array Size : 976627264 (931.38 GiB 1000.07 GB)
  Used Dev Size : 976627264 (931.38 GiB 1000.07 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Thu Aug 29 01:49:07 2013
          State : active, degraded 
 Active Devices : 1
Working Devices : 1
 Failed Devices : 1
  Spare Devices : 0

           Name : bigbrother:0  (local to host bigbrother)
           UUID : a7dc949a:38213694:bad6ab03:db64d1d3
         Events : 61474

    Number   Major   Minor   RaidDevice State
       2       8        2        0      active sync   /dev/sda2
       1       0        0        1      removed

       1       8       18        -      faulty spare   /dev/sdb2
Aug 29 01:12:39 bigbrother kernel: [ 6205.034049] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Aug 29 01:12:39 bigbrother kernel: [ 6205.052635] ata4.00: irq_stat 0x40000001
Aug 29 01:12:39 bigbrother kernel: [ 6205.070880] ata4.00: failed command: READ DMA EXT
Aug 29 01:12:39 bigbrother kernel: [ 6205.088840] ata4.00: cmd 25/00:08:48:19:d5/00:00:19:00:00/e0 tag 0 dma 4096 in
Aug 29 01:12:39 bigbrother kernel: [ 6205.088840]          res 71/04:04:9d:00:32/00:00:00:00:00/e0 Emask 0x1 (device error)
Aug 29 01:12:39 bigbrother kernel: [ 6205.124308] ata4.00: status: { DRDY DF ERR }
Aug 29 01:12:39 bigbrother kernel: [ 6205.141664] ata4.00: error: { ABRT }
Aug 29 01:12:39 bigbrother kernel: [ 6205.185540] ata4.00: both IDENTIFYs aborted, assuming NODEV
Aug 29 01:12:39 bigbrother kernel: [ 6205.185544] ata4.00: revalidation failed (errno=-2)
Aug 29 01:12:39 bigbrother kernel: [ 6205.202272] ata4: hard resetting link
Aug 29 01:12:39 bigbrother kernel: [ 6205.708070] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 29 01:12:40 bigbrother kernel: [ 6205.751867] ata4.00: both IDENTIFYs aborted, assuming NODEV
Aug 29 01:12:40 bigbrother kernel: [ 6205.751870] ata4.00: revalidation failed (errno=-2)
Aug 29 01:12:44 bigbrother kernel: [ 6210.724090] ata4: hard resetting link
Aug 29 01:12:45 bigbrother kernel: [ 6211.232080] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 29 01:12:45 bigbrother kernel: [ 6211.290372] ata4.00: both IDENTIFYs aborted, assuming NODEV
Aug 29 01:12:45 bigbrother kernel: [ 6211.290376] ata4.00: revalidation failed (errno=-2)
Aug 29 01:12:45 bigbrother kernel: [ 6211.305794] ata4.00: disabled
Aug 29 01:12:45 bigbrother kernel: [ 6211.320815] ata4: EH complete
Aug 29 01:12:45 bigbrother kernel: [ 6211.335416] sd 3:0:0:0: [sdb] Unhandled error code
Aug 29 01:12:45 bigbrother kernel: [ 6211.339406] sd 3:0:0:0: [sdb]  
Aug 29 01:12:45 bigbrother kernel: [ 6211.339406] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 29 01:12:45 bigbrother kernel: [ 6211.339406] sd 3:0:0:0: [sdb] CDB: 
Aug 29 01:12:45 bigbrother kernel: [ 6211.339406] Read(10): 28 00 19 d5 19 48 00 00 08 00
Aug 29 01:12:45 bigbrother kernel: [ 6211.339406] end_request: I/O error, dev sdb, sector 433396040
Aug 29 01:12:45 bigbrother kernel: [ 6211.339406] md/raid1:md0: sdb2: rescheduling sector 433129800
Aug 29 01:12:45 bigbrother kernel: [ 6211.339406] sd 3:0:0:0: [sdb] Unhandled error code
Aug 29 01:12:45 bigbrother kernel: [ 6211.443577] sd 3:0:0:0: [sdb]  
Aug 29 01:12:45 bigbrother kernel: [ 6211.455826] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 29 01:12:45 bigbrother kernel: [ 6211.468112] sd 3:0:0:0: [sdb] CDB: 
Aug 29 01:12:45 bigbrother kernel: [ 6211.479949] Write(10): 2a 00 00 00 10 08 00 00 01 00
Aug 29 01:12:45 bigbrother kernel: [ 6211.491678] end_request: I/O error, dev sdb, sector 4104
Aug 29 01:12:45 bigbrother kernel: [ 6211.503072] end_request: I/O error, dev sdb, sector 4104
Aug 29 01:12:45 bigbrother kernel: [ 6211.507066] md: super_written gets error=-5, uptodate=0
Aug 29 01:12:45 bigbrother kernel: [ 6211.507066] md/raid1:md0: Disk failure on sdb2, disabling device.
Aug 29 01:12:45 bigbrother kernel: [ 6211.507066] md/raid1:md0: Operation continuing on 1 devices.
Aug 29 01:12:45 bigbrother kernel: [ 6211.545737] sd 3:0:0:0: [sdb] Unhandled error code
Aug 29 01:12:45 bigbrother kernel: [ 6211.555757] sd 3:0:0:0: [sdb]  
Aug 29 01:12:45 bigbrother kernel: [ 6211.565374] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 29 01:12:45 bigbrother kernel: [ 6211.574959] sd 3:0:0:0: [sdb] CDB: 
Aug 29 01:12:45 bigbrother kernel: [ 6211.584139] Read(10): 28 00 1a 0f 01 90 00 00 80 00
Aug 29 01:12:45 bigbrother kernel: [ 6211.593150] end_request: I/O error, dev sdb, sector 437191056
Aug 29 01:12:45 bigbrother kernel: [ 6211.601916] md/raid1:md0: sdb2: rescheduling sector 436924816
Aug 29 01:12:45 bigbrother kernel: [ 6211.610505] md/raid1:md0: sdb2: rescheduling sector 436924904
Aug 29 01:12:45 bigbrother kernel: [ 6211.638525] md/raid1:md0: redirecting sector 433129800 to other mirror: sda2
Aug 29 01:12:45 bigbrother kernel: [ 6211.726230] md/raid1:md0: redirecting sector 436924816 to other mirror: sda2
Aug 29 01:12:46 bigbrother kernel: [ 6211.832685] md/raid1:md0: redirecting sector 436924904 to other mirror: sda2
Aug 29 01:12:46 bigbrother kernel: [ 6211.840044] RAID1 conf printout:
Aug 29 01:12:46 bigbrother kernel: [ 6211.840046]  --- wd:1 rd:2
Aug 29 01:12:46 bigbrother kernel: [ 6211.840047]  disk 0, wo:0, o:1, dev:sda2
Aug 29 01:12:46 bigbrother kernel: [ 6211.840049]  disk 1, wo:1, o:0, dev:sdb2
Aug 29 01:12:46 bigbrother kernel: [ 6211.865362] RAID1 conf printout:
Aug 29 01:12:46 bigbrother kernel: [ 6211.865367]  --- wd:1 rd:2
Aug 29 01:12:46 bigbrother kernel: [ 6211.865370]  disk 0, wo:0, o:1, dev:sda2
Aug 29 01:12:46 bigbrother mdadm[2900]: Fail event detected on md device /dev/md0, component device /dev/sdb2
Aug 29 01:19:37 bigbrother hddtemp[2724]: /dev/sdb: ST31000333AS: no sensor
Aug 29 01:23:52 bigbrother kernel: [ 6878.074435] sd 3:0:0:0: [sdb] Unhandled error code
Aug 29 01:23:52 bigbrother kernel: [ 6878.093093] sd 3:0:0:0: [sdb]  
Aug 29 01:23:52 bigbrother kernel: [ 6878.111192] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 29 01:23:52 bigbrother kernel: [ 6878.129240] sd 3:0:0:0: [sdb] CDB: 
Aug 29 01:23:52 bigbrother kernel: [ 6878.147122] Read(10): 28 00 00 00 00 00 00 00 20 00
Aug 29 01:23:52 bigbrother kernel: [ 6878.164791] end_request: I/O error, dev sdb, sector 0
Aug 29 01:23:52 bigbrother kernel: [ 6878.181994] Buffer I/O error on device sdb, logical block 0
Aug 29 01:23:52 bigbrother kernel: [ 6878.199003] Buffer I/O error on device sdb, logical block 1
Aug 29 01:23:52 bigbrother kernel: [ 6878.199028] sd 3:0:0:0: [sdb] Unhandled error code
Aug 29 01:23:52 bigbrother kernel: [ 6878.199029] sd 3:0:0:0: [sdb]  
Aug 29 01:23:52 bigbrother kernel: [ 6878.199030] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 29 01:23:52 bigbrother kernel: [ 6878.199031] sd 3:0:0:0: [sdb] CDB: 
Aug 29 01:23:52 bigbrother kernel: [ 6878.199035] Read(10): 28 00 00 00 00 00 00 00 08 00
Aug 29 01:23:52 bigbrother kernel: [ 6878.199036] end_request: I/O error, dev sdb, sector 0
Aug 29 01:23:52 bigbrother kernel: [ 6878.199038] Buffer I/O error on device sdb, logical block 0
Aug 29 01:23:52 bigbrother kernel: [ 6878.199111] sd 3:0:0:0: [sdb] Unhandled error code
Aug 29 01:23:52 bigbrother kernel: [ 6878.199112] sd 3:0:0:0: [sdb]  
Aug 29 01:23:52 bigbrother kernel: [ 6878.199113] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 29 01:23:52 bigbrother kernel: [ 6878.199114] sd 3:0:0:0: [sdb] CDB: 
Aug 29 01:23:52 bigbrother kernel: [ 6878.199117] Read(10): 28 00 74 70 6d a8 00 00 08 00
Aug 29 01:23:52 bigbrother kernel: [ 6878.199118] end_request: I/O error, dev sdb, sector 1953525160
Aug 29 01:23:52 bigbrother kernel: [ 6878.199119] Buffer I/O error on device sdb, logical block 244190645
Aug 29 01:23:52 bigbrother kernel: [ 6878.199131] sd 3:0:0:0: [sdb] Unhandled error code
Aug 29 01:23:52 bigbrother kernel: [ 6878.199132] sd 3:0:0:0: [sdb]  
Aug 29 01:23:52 bigbrother kernel: [ 6878.199133] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 29 01:23:52 bigbrother kernel: [ 6878.199134] sd 3:0:0:0: [sdb] CDB: 
Aug 29 01:23:52 bigbrother kernel: [ 6878.199137] Read(10): 28 00 74 70 6d a8 00 00 08 00
Aug 29 01:23:52 bigbrother kernel: [ 6878.199138] end_request: I/O error, dev sdb, sector 1953525160
Aug 29 01:23:52 bigbrother kernel: [ 6878.199139] Buffer I/O error on device sdb, logical block 244190645
Aug 29 01:23:52 bigbrother kernel: [ 6878.199195] sd 3:0:0:0: [sdb] Unhandled error code
Aug 29 01:23:52 bigbrother kernel: [ 6878.199196] sd 3:0:0:0: [sdb]  
Aug 29 01:23:52 bigbrother kernel: [ 6878.199197] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 29 01:23:52 bigbrother kernel: [ 6878.199198] sd 3:0:0:0: [sdb] CDB: 
Aug 29 01:23:52 bigbrother kernel: [ 6878.199200] Read(10): 28 00 00 00 00 00 00 00 08 00
Aug 29 01:23:52 bigbrother kernel: [ 6878.199201] end_request: I/O error, dev sdb, sector 0
Aug 29 01:23:52 bigbrother kernel: [ 6878.199202] Buffer I/O error on device sdb, logical block 0
Aug 29 01:23:52 bigbrother kernel: [ 6878.576273] Buffer I/O error on device sdb, logical block 2
Aug 29 01:23:52 bigbrother kernel: [ 6878.585873] Buffer I/O error on device sdb, logical block 3
Aug 29 01:24:09 bigbrother kernel: [ 6895.111231] sd 3:0:0:0: [sdb] Unhandled error code
Aug 29 01:24:09 bigbrother kernel: [ 6895.119853] sd 3:0:0:0: [sdb]  
Aug 29 01:24:09 bigbrother kernel: [ 6895.128048] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 29 01:24:09 bigbrother kernel: [ 6895.136107] sd 3:0:0:0: [sdb] CDB: 
Aug 29 01:24:09 bigbrother kernel: [ 6895.143879] Read(10): 28 00 00 00 00 00 00 00 08 00
Aug 29 01:24:09 bigbrother kernel: [ 6895.156076] end_request: I/O error, dev sdb, sector 0
Aug 29 01:24:09 bigbrother kernel: [ 6895.163357] Buffer I/O error on device sdb, logical block 0
Aug 29 01:24:09 bigbrother kernel: [ 6895.170563] sd 3:0:0:0: [sdb] Unhandled error code
Aug 29 01:24:09 bigbrother kernel: [ 6895.177378] sd 3:0:0:0: [sdb]  
Aug 29 01:24:09 bigbrother kernel: [ 6895.183882] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 29 01:24:09 bigbrother kernel: [ 6895.190269] sd 3:0:0:0: [sdb] CDB: 
Aug 29 01:24:09 bigbrother kernel: [ 6895.196672] Read(10): 28 00 74 70 6d a8 00 00 08 00
Aug 29 01:24:09 bigbrother kernel: [ 6895.203135] end_request: I/O error, dev sdb, sector 1953525160
Aug 29 01:24:09 bigbrother kernel: [ 6895.209570] Buffer I/O error on device sdb, logical block 244190645
Aug 29 01:24:09 bigbrother kernel: [ 6895.216162] sd 3:0:0:0: [sdb] Unhandled error code
Aug 29 01:24:09 bigbrother kernel: [ 6895.222638] sd 3:0:0:0: [sdb]  
Aug 29 01:24:09 bigbrother kernel: [ 6895.229051] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 29 01:24:09 bigbrother kernel: [ 6895.235675] sd 3:0:0:0: [sdb] CDB: 
Aug 29 01:24:09 bigbrother kernel: [ 6895.242263] Read(10): 28 00 00 00 00 00 00 00 08 00
Aug 29 01:24:09 bigbrother kernel: [ 6895.248961] end_request: I/O error, dev sdb, sector 0
Aug 29 01:24:09 bigbrother kernel: [ 6895.255577] Buffer I/O error on device sdb, logical block 0
Aug 29 01:29:38 bigbrother hddtemp[2724]: /dev/sdb: ST31000333AS: no sensor
Aug 29 01:39:38 bigbrother hddtemp[2724]: /dev/sdb: ST31000333AS: 227 C

Something is badly wrong.

The disk that went bad yesterday was sdb, and it was completely inaccessible all day, yet now a mystery array called md127 has appeared.

tsuyoshi@bigbrother% cat /proc/mdstat
Personalities : [raid1] 
md127 : active raid1 sdb2[1]
      976627264 blocks super 1.2 [2/1] [_U]
      
md0 : active raid1 sda2[2]
      976627264 blocks super 1.2 [2/1] [U_]
      
unused devices: <none>
tsuyoshi@bigbrother% sudo mdadm -D /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Sat Oct 20 22:43:54 2012
     Raid Level : raid1
     Array Size : 976627264 (931.38 GiB 1000.07 GB)
  Used Dev Size : 976627264 (931.38 GiB 1000.07 GB)
   Raid Devices : 2
  Total Devices : 1
    Persistence : Superblock is persistent

    Update Time : Fri Aug 30 00:22:22 2013
          State : clean, degraded 
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           Name : bigbrother:0  (local to host bigbrother)
           UUID : a7dc949a:38213694:bad6ab03:db64d1d3
         Events : 86355

    Number   Major   Minor   RaidDevice State
       2       8        2        0      active sync   /dev/sda2
       1       0        0        1      removed
tsuyoshi@bigbrother% sudo mdadm -D /dev/md127
/dev/md127:
        Version : 1.2
  Creation Time : Sat Oct 20 22:43:54 2012
     Raid Level : raid1
     Array Size : 976627264 (931.38 GiB 1000.07 GB)
  Used Dev Size : 976627264 (931.38 GiB 1000.07 GB)
   Raid Devices : 2
  Total Devices : 1
    Persistence : Superblock is persistent

    Update Time : Fri Aug 30 00:22:40 2013
          State : clean, degraded 
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           Name : bigbrother:0  (local to host bigbrother)
           UUID : a7dc949a:38213694:bad6ab03:db64d1d3
         Events : 60257

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       18        1      active sync   /dev/sdb2
tsuyoshi@bigbrother% sudo mdadm -E /dev/sda2
/dev/sda2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : a7dc949a:38213694:bad6ab03:db64d1d3
           Name : bigbrother:0  (local to host bigbrother)
  Creation Time : Sat Oct 20 22:43:54 2012
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 1953258895 (931.39 GiB 1000.07 GB)
     Array Size : 976627264 (931.38 GiB 1000.07 GB)
  Used Dev Size : 1953254528 (931.38 GiB 1000.07 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 2c68f378:2d07822a:99dddea9:9e0d553b

    Update Time : Fri Aug 30 00:34:06 2013
       Checksum : be723f47 - correct
         Events : 86561


   Device Role : Active device 0
   Array State : A. ('A' == active, '.' == missing)
tsuyoshi@bigbrother% sudo mdadm -E /dev/sdb2
/dev/sdb2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : a7dc949a:38213694:bad6ab03:db64d1d3
           Name : bigbrother:0  (local to host bigbrother)
  Creation Time : Sat Oct 20 22:43:54 2012
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 1953254800 (931.38 GiB 1000.07 GB)
     Array Size : 976627264 (931.38 GiB 1000.07 GB)
  Used Dev Size : 1953254528 (931.38 GiB 1000.07 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 61a1d1a7:62f05cd8:34925d72:eee95789

    Update Time : Fri Aug 30 00:33:40 2013
       Checksum : b1b07d6a - correct
         Events : 60683


   Device Role : Active device 1
   Array State : .A ('A' == active, '.' == missing)

sda2 has ended up in md0 and sdb2 in md127, each running as a degraded one-disk mirror.
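
To see which half is further ahead, the `Events` counters in the `mdadm -E` output above can be compared. The following is a minimal sketch, not something from the original session: `events_of` and `newer_member` are hypothetical helper names. A higher counter only means md recorded more updates on that member; when both halves have been written to, as seems to be the case here, it does not by itself say which copy holds the data you want.

```shell
#!/bin/sh
# Hypothetical helper: print the "Events" counter found in `mdadm -E`
# output read from stdin.
events_of() {
    awk -F' : ' '/^ *Events : /{ print $2; exit }'
}

# Hypothetical helper: given name/count pairs for two members, print the
# name of the member with the higher event count, or "equal".
#   $1 name A, $2 events A, $3 name B, $4 events B
newer_member() {
    if [ "$2" -gt "$4" ]; then
        echo "$1"
    elif [ "$2" -lt "$4" ]; then
        echo "$3"
    else
        echo "equal"
    fi
}

# Example use on a live system (needs root, so left commented out):
#   a=$(sudo mdadm -E /dev/sda2 | events_of)
#   b=$(sudo mdadm -E /dev/sdb2 | events_of)
#   newer_member /dev/sda2 "$a" /dev/sdb2 "$b"
```

With the counters shown above (86561 on sda2, 60683 on sdb2), `newer_member` names /dev/sda2.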

LVM, of all things, appears to have picked up md127.

tsuyoshi@bigbrother% sudo pvdisplay
  Found duplicate PV NPqoOslaVLsrDpKNam0b6WBAes5Vkyf3: using /dev/md127 not /dev/md0
  --- Physical volume ---
  PV Name               /dev/md127
  VG Name               vg3
  PV Size               931.38 GiB / not usable 1.56 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              238434
  Free PE               77744
  Allocated PE          160690
  PV UUID               NPqoOs-laVL-srDp-KNam-0b6W-BAes-5Vkyf3
   

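I thought about stopping /dev/md127, but that looks like a bad idea while LVM is using it.

Shutting down and pulling /dev/sdb would probably be the better approach, but physically moving the server is a hassle, so I'll sort this out with the disk still attached.

I tried to boot Debian Live, but the machine won't boot from the USB stick (were the front USB ports dead?).

So instead I edited /etc/lvm/lvm.conf so that LVM only uses /dev/md0, then rebooted.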

filter = [ "a|/dev/md0$|", "r/.*/" ]
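
As I understand the lvm.conf filter rules, the patterns are tried in order and the first one that matches a device path decides whether it is accepted (`a`) or rejected (`r`). The sketch below only emulates that decision for this one filter with `grep -E`; `filter_decision` is a hypothetical name, it is not LVM code, and it looks only at the single path passed in.

```shell
#!/bin/sh
# Hypothetical emulation of: filter = [ "a|/dev/md0$|", "r/.*/" ]
# Prints "accept" or "reject" for the device path given as $1.
filter_decision() {
    if printf '%s\n' "$1" | grep -Eq '/dev/md0$'; then
        echo accept   # matched the accept pattern "a|/dev/md0$|" first
    else
        echo reject   # everything else is caught by "r/.*/"
    fi
}

# After editing lvm.conf, the PVs that LVM actually sees can be listed
# with (needs root, so left commented out):
#   sudo pvs -o pv_name,vg_name,pv_uuid
```

The `$` anchor matters here: without it, a longer path that merely starts with /dev/md0 would also be accepted.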

With both /dev/md0 and /dev/md127 present and LVM apparently using only /dev/md127, I wondered how much good this would do, but it seems to have worked: LVM now recognizes /dev/md0 as the PV.

tsuyoshi@bigbrother% sudo pvdisplay
  --- Physical volume ---
  PV Name               /dev/md0
  VG Name               vg3
  PV Size               931.38 GiB / not usable 1.56 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              238434
  Free PE               77744
  Allocated PE          160690
  PV UUID               NPqoOs-laVL-srDp-KNam-0b6W-BAes-5Vkyf3
   

Other information:

tsuyoshi@bigbrother% cat /proc/mdstat
Personalities : [raid1] 
md127 : active (auto-read-only) raid1 sdb2[1]
      976627264 blocks super 1.2 [2/1] [_U]
      
md0 : active raid1 sda2[2]
      976627264 blocks super 1.2 [2/1] [U_]
      
unused devices: <none>
tsuyoshi@bigbrother% sudo mdadm -D /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Sat Oct 20 22:43:54 2012
     Raid Level : raid1
     Array Size : 976627264 (931.38 GiB 1000.07 GB)
  Used Dev Size : 976627264 (931.38 GiB 1000.07 GB)
   Raid Devices : 2
  Total Devices : 1
    Persistence : Superblock is persistent

    Update Time : Fri Aug 30 01:16:10 2013
          State : clean, degraded 
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           Name : bigbrother:0  (local to host bigbrother)
           UUID : a7dc949a:38213694:bad6ab03:db64d1d3
         Events : 87075

    Number   Major   Minor   RaidDevice State
       2       8        2        0      active sync   /dev/sda2
       1       0        0        1      removed
tsuyoshi@bigbrother% sudo mdadm -D /dev/md127
/dev/md127:
        Version : 1.2
  Creation Time : Sat Oct 20 22:43:54 2012
     Raid Level : raid1
     Array Size : 976627264 (931.38 GiB 1000.07 GB)
  Used Dev Size : 976627264 (931.38 GiB 1000.07 GB)
   Raid Devices : 2
  Total Devices : 1
    Persistence : Superblock is persistent

    Update Time : Fri Aug 30 01:08:36 2013
          State : clean, degraded 
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           Name : bigbrother:0  (local to host bigbrother)
           UUID : a7dc949a:38213694:bad6ab03:db64d1d3
         Events : 60857

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       18        1      active sync   /dev/sdb2
tsuyoshi@bigbrother% sudo mdadm -E /dev/sda2
/dev/sda2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : a7dc949a:38213694:bad6ab03:db64d1d3
           Name : bigbrother:0  (local to host bigbrother)
  Creation Time : Sat Oct 20 22:43:54 2012
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 1953258895 (931.39 GiB 1000.07 GB)
     Array Size : 976627264 (931.38 GiB 1000.07 GB)
  Used Dev Size : 1953254528 (931.38 GiB 1000.07 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 2c68f378:2d07822a:99dddea9:9e0d553b

    Update Time : Fri Aug 30 01:16:40 2013
       Checksum : be724b59 - correct
         Events : 87097


   Device Role : Active device 0
   Array State : A. ('A' == active, '.' == missing)
tsuyoshi@bigbrother% sudo mdadm -E /dev/sdb2
/dev/sdb2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : a7dc949a:38213694:bad6ab03:db64d1d3
           Name : bigbrother:0  (local to host bigbrother)
  Creation Time : Sat Oct 20 22:43:54 2012
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 1953254800 (931.38 GiB 1000.07 GB)
     Array Size : 976627264 (931.38 GiB 1000.07 GB)
  Used Dev Size : 1953254528 (931.38 GiB 1000.07 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 61a1d1a7:62f05cd8:34925d72:eee95789

    Update Time : Fri Aug 30 01:08:36 2013
       Checksum : b1b08648 - correct
         Events : 60857


   Device Role : Active device 1
   Array State : .A ('A' == active, '.' == missing)

Stop /dev/md127, wipe the md superblock on /dev/sdb2, and add it back to /dev/md0.

tsuyoshi@bigbrother% sudo mdadm --stop /dev/md127
mdadm: stopped /dev/md127
tsuyoshi@bigbrother% sudo mdadm --misc --zero-superblock /dev/sdb2
tsuyoshi@bigbrother% sudo mdadm -E /dev/sdb2
mdadm: No md superblock detected on /dev/sdb2.
tsuyoshi@bigbrother% sudo mdadm --add /dev/md0 /dev/sdb2
mdadm: added /dev/sdb2
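
Since `--zero-superblock` is destructive, one possible precaution is to confirm beforehand that the partition no longer belongs to any running array. This is a rough sketch under that assumption and was not part of the session above; `is_md_member` is a hypothetical helper that reads mdstat-format text from stdin.

```shell
#!/bin/sh
# Hypothetical guard: exit status 0 if the partition named in $1 (e.g. sdb2)
# is listed as a member of any array in mdstat-format text on stdin.
is_md_member() {
    grep -Eq "^md[0-9]+ : .*[[:space:]]$1\[[0-9]+\]"
}

# Example use on a live system (left commented out):
#   if is_md_member sdb2 < /proc/mdstat; then
#       echo "sdb2 is still part of an array; stop that array first" >&2
#   else
#       sudo mdadm --misc --zero-superblock /dev/sdb2
#   fi
```

A member marked faulty, such as `sdb2[1](F)`, still counts as listed, so the guard errs on the side of refusing.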

Added it back. Checking the state:

tsuyoshi@bigbrother% cat /proc/mdstat
Personalities : [raid1] 
md0 : active raid1 sdb2[3] sda2[2]
      976627264 blocks super 1.2 [2/1] [U_]
      [>....................]  recovery =  0.0% (288192/976627264) finish=112.9min speed=144096K/sec
      
unused devices: <none>
tsuyoshi@bigbrother% sudo mdadm -D /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Sat Oct 20 22:43:54 2012
     Raid Level : raid1
     Array Size : 976627264 (931.38 GiB 1000.07 GB)
  Used Dev Size : 976627264 (931.38 GiB 1000.07 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Fri Aug 30 01:18:10 2013
          State : clean, degraded, recovering 
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1

 Rebuild Status : 0% complete

           Name : bigbrother:0  (local to host bigbrother)
           UUID : a7dc949a:38213694:bad6ab03:db64d1d3
         Events : 87149

    Number   Major   Minor   RaidDevice State
       2       8        2        0      active sync   /dev/sda2
       3       8       18        1      spare rebuilding   /dev/sdb2
tsuyoshi@bigbrother% sudo mdadm -E /dev/sda2
/dev/sda2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : a7dc949a:38213694:bad6ab03:db64d1d3
           Name : bigbrother:0  (local to host bigbrother)
  Creation Time : Sat Oct 20 22:43:54 2012
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 1953258895 (931.39 GiB 1000.07 GB)
     Array Size : 976627264 (931.38 GiB 1000.07 GB)
  Used Dev Size : 1953254528 (931.38 GiB 1000.07 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 2c68f378:2d07822a:99dddea9:9e0d553b

    Update Time : Fri Aug 30 01:18:15 2013
       Checksum : be754bef - correct
         Events : 87153


   Device Role : Active device 0
   Array State : AA ('A' == active, '.' == missing)
tsuyoshi@bigbrother% sudo mdadm -E /dev/sdb2
/dev/sdb2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x2
     Array UUID : a7dc949a:38213694:bad6ab03:db64d1d3
           Name : bigbrother:0  (local to host bigbrother)
  Creation Time : Sat Oct 20 22:43:54 2012
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 1953254800 (931.38 GiB 1000.07 GB)
     Array Size : 976627264 (931.38 GiB 1000.07 GB)
  Used Dev Size : 1953254528 (931.38 GiB 1000.07 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
Recovery Offset : 0 sectors
          State : active
    Device UUID : 48a300f8:a065d1e6:41b3b6e7:c135bd05

    Update Time : Fri Aug 30 01:18:04 2013
       Checksum : 211d33b - correct
         Events : 87144


   Device Role : Active device 1
   Array State : AA ('A' == active, '.' == missing)

Restore /etc/lvm/lvm.conf to its original contents.

tsuyoshi@bigbrother% sudo etckeeper vcs checkout lvm/lvm.conf

The RAID1 rebuild will take a few hours, so I'll wait and see how it goes.
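
While waiting, progress can be read from the `recovery = ...%` line in /proc/mdstat. Here is a small sketch for pulling out just the percentage; `recovery_percent` is a hypothetical helper name, and it only handles the `recovery` wording seen in the output above, not other activities such as a resync.

```shell
#!/bin/sh
# Hypothetical helper: print the recovery percentage found in
# mdstat-format text on stdin; prints nothing if no recovery is running.
recovery_percent() {
    sed -n 's/.*recovery = *\([0-9.]*\)%.*/\1/p'
}

# Example use on a live system (left commented out):
#   recovery_percent < /proc/mdstat
# Alternatives: `watch cat /proc/mdstat` to follow it interactively, or
# `sudo mdadm --wait /dev/md0` to block until the rebuild has finished.
```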
