Last active
February 12, 2019 14:52
| {{lowercase title}} | |
| [[Category:Hardware detection and troubleshooting]] | |
| badblocks is a program to test storage devices for bad blocks. | |
| On an HDD, a damaged sector should be retired as a whole. A sector is a subdivision of a track on a storage device; a sector that has gone bad is permanently damaged and can no longer be used (a bad sector can have adverse effects ranging from changing a letter in a text file to causing a binary program to crash with a segmentation fault). | |
| [[S.M.A.R.T.]] (Self-Monitoring, Analysis, and Reporting Technology) is built into the hardware of almost every HDD still in use and can in some cases retire defective sectors automatically. However, it only waits passively for errors, whereas ''badblocks'' actively writes simple patterns to every block of a device, then reads them back and checks them to find damaged areas (just as memtest86* does with RAM). | |
| This can be done in a destructive write mode that effectively [[Securely_wipe_disk|wipes]] the device (back up your data first!), in a non-destructive read-write mode (a backup is advisable here as well) or in a read-only mode. | |
| == Usage == | |
| ''badblocks'' is part of the {{Pkg|e2fsprogs}} package. | |
| Usage: | |
| badblocks [ -svwnfBX ] [ -b block-size ] [ -c blocks_at_once ] [ -e max_bad_blocks ] | |
| [ -d read_delay_factor ] [ -i input_file ] [ -o output_file ] [ -p num_passes ] [ -t test_pattern ] | |
| device [ last-block ] [ first-block ] | |
| == Storage Device Fidelity == | |
| Although no firm rule has been set, it is commonly held that a new drive should have zero bad sectors. Over time, bad sectors will develop on a storage device. They can be made known to the file system so that they are avoided, but continued use of the drive usually leads to additional bad sectors, which are typically the harbinger of the drive's death. Replacing the device is recommended. | |
| == Comparisons with Other Programs == | |
| The typically recommended practice for testing a storage device for bad sectors is to use the manufacturer's testing program; most manufacturers provide one. The main reason is that manufacturers usually build their own standards into these programs, so the program can tell you whether the drive needs to be replaced. The caveat is that some manufacturers' programs do not print full test results and tolerate a certain number of bad sectors, reporting only pass or fail. Manufacturer programs are, however, generally quicker than ''badblocks'', sometimes by a fair amount. | |
| == Testing for Bad Sectors == | |
| To test for bad sectors in Linux the program ''badblocks'' is typically used. ''badblocks'' has several different modes to be able to detect bad sectors. | |
| === read-write Test (warning:destructive) === | |
| This test is primarily for testing new drives and is a read-write test. As the pattern is written to every accessible block the device effectively gets [[Securely_wipe_disk|wiped]]. Default is an extensive test with four passes using four different patterns: 0xaa (10101010), 0x55 (01010101), 0xff (11111111) and 0x00 (00000000). For some devices this will take a couple of days to complete. | |
| {{hc|# badblocks -wsv /dev/<device>| | |
| Checking for bad blocks in read-write mode | |
| From block 0 to 488386583 | |
| Testing with pattern '''0xaa''': done | |
| Reading and comparing: done | |
| Testing with pattern '''0x55''': done | |
| Reading and comparing: done | |
| Testing with pattern '''0xff''': 22.93% done, 4:09:55 elapsed. (0/0/0 errors) | |
| [...] | |
| Testing with pattern '''0x00''': done | |
| Reading and comparing: done | |
| Pass completed, 0 bad blocks found. (0/0/0 errors)}} | |
| Options: | |
| ; -w: do a destructive write test | |
| ; -s: show progress-bar | |
| ; -v: be "verbose": print status messages and the number of read errors, write errors and data corruptions | |
| Additional options you might consider: | |
| ; -p <number>: repeat the scan until no new bad blocks are found in <number> consecutive passes (the default of 0 stops after the first pass) | |
| ; -o </path/to/output-file>: print bad sectors to <output-file> instead of stdout | |
| ; -t <test_pattern> : Specify a pattern. See below. | |
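Because the write test wipes the device, it can help to confirm that the target is not mounted before running it. The following is a minimal sketch, not part of ''badblocks'' itself: the function names {{ic|device_in_use}} and {{ic|safe_write_test}} are hypothetical, the check is a deliberately conservative prefix match on the first column of {{ic|/proc/mounts}}, and the wrapper only prints the command instead of running it.

```shell
# device_in_use DEVICE [MOUNTS_FILE]
# Succeeds (exit 0) if DEVICE, or any name beginning with DEVICE such as one of
# its partitions, appears as a mount source in MOUNTS_FILE (default: /proc/mounts).
device_in_use() {
    awk -v dev="$1" 'index($1, dev) == 1 { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# safe_write_test DEVICE [MOUNTS_FILE]   (hypothetical helper, dry run only)
# Prints the destructive command for an unmounted device, refuses otherwise.
safe_write_test() {
    if device_in_use "$1" "${2:-/proc/mounts}"; then
        echo "refusing: $1 appears to be mounted" >&2
        return 1
    fi
    # Only echoed here; run the printed command by hand once the target is verified.
    echo badblocks -wsv "$1"
}
```

The prefix match errs on the side of refusing (for example, {{ic|/dev/sdb}} also matches a mounted {{ic|/dev/sdb1}}), which seems the safer default for a destructive test.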
| ==== define specific test pattern ==== | |
| '''From the manpage:''' "The <test_pattern> may either be a numeric value between 0 and ULONG_MAX-1 inclusive [...]." | |
| {{Expansion}} | |
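To make the numeric patterns more concrete, the sketch below prints the bit representation of the four default byte values listed earlier (0xaa, 0x55, 0xff, 0x00). The helper name {{ic|byte_to_bits}} is an assumption made for illustration; it only converts numbers and does not touch any device.

```shell
# byte_to_bits VALUE: print an 8-bit value (for example 0xaa) as a string of 0s and 1s.
byte_to_bits() {
    value=$(( $1 ))          # shell arithmetic accepts hexadecimal constants
    bits=""
    i=0
    while [ "$i" -lt 8 ]; do
        bits="$(( value % 2 ))$bits"   # prepend the lowest bit
        value=$(( value / 2 ))
        i=$(( i + 1 ))
    done
    printf '%s\n' "$bits"
}

for pattern in 0xaa 0x55 0xff 0x00; do
    printf '%s = %s\n' "$pattern" "$(byte_to_bits "$pattern")"
done
```

The first two patterns are bitwise complements of each other, so between them every bit is written once as 1 and once as 0.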
| ===== random pattern ===== | |
| Badblocks can be made to repeatedly write a single "random pattern" with the {{ic|-t random}} option. | |
| {{hc|# badblocks -wsv -t random /dev/<device>| | |
| Checking for bad blocks in read-write mode | |
| From block 0 to 488386583 | |
| Testing with '''random pattern''': done | |
| Reading and comparing: done | |
| Pass completed, 0 bad blocks found. (0/0/0 errors)}} | |
| {{Warning|This is not secure for cryptographic purposes. A "random pattern" is a contradiction in itself. As badblocks does not (like [[Securely_wipe_disk#.2Fdev.2Furandom|urandom]]) apply sophisticated procedures to reuse entropy but simply repeats one "random pattern" it should not be used where random data would be needed, e.g. for [[Securely_wipe_disk#Preparations_for_block_device_encryption|block device encryption]].}} | |
| === read-write Test (non-destructive) === | |
| This test is designed for devices with data already on them. A non-destructive read-write test makes a backup of the original content of a sector before testing with a single random pattern and then restoring the content from the backup. This is a single pass test and is useful as a general maintenance test. | |
| {{hc|# badblocks -nsv /dev/<device>| | |
| Checking for bad blocks in non-destructive read-write mode | |
| From block 0 to 488386583 | |
| Checking for bad blocks (non-destructive read-write test) | |
| Testing with '''random pattern''': done | |
| Pass completed, 0 bad blocks found. (0/0/0 errors)}} | |
| The {{ic|-n}} option signifies a non-destructive read-write test. | |
| == Have File System Incorporate Bad Sectors == | |
| For a filesystem to avoid bad sectors, it has to know about them. | |
| === During Filesystem Check === | |
| Incorporating bad sectors can be done using the filesystem check utility ({{ic|fsck}}). {{ic|fsck}} can be told to use ''badblocks'' during a check. To do a '''read-write''' (non-destructive) test and have the bad sectors made known to the filesystem run: | |
| # fsck -vcck /dev/<device-PARTITION> | |
| The {{ic|-cc}} option tells {{ic|fsck}} to run ''badblocks'' in '''non-destructive''' read-write test mode, {{ic|-v}} tells {{ic|fsck}} to show its output, and {{ic|-k}} preserves the bad sectors that were detected previously. | |
| To do a '''read-only''' test (not recommended): | |
| # fsck -vck /dev/<device-PARTITION> | |
| === Before Filesystem Creation === | |
| Alternately, this can be done before filesystem creation. | |
| If badblocks is run without the {{ic|-o}} option bad sectors will only be printed to stdout. | |
| Example output for read errors in the beginning of the disk: | |
| {{hc|# badblocks -wsv /dev/<drive>| | |
| [...] | |
| Testing with pattern '''0xff''': done | |
| Reading and comparing: | |
| [...] | |
| 37584 | |
| 37585 0.84% done, 7:31:08 elapsed. (0/0/527405 errors) | |
| 37586 | |
| [...] | |
| done | |
| Testing with pattern '''0x00''': | |
| Reading and comparing: | |
| [...] | |
| 37584 | |
| 37585 | |
| [...] | |
| done | |
| Pass completed, 527405 bad blocks found. (0/0/527405 errors)}} | |
| To pass the bad block list from ''badblocks'' to the filesystem conveniently, write it to a file: | |
| {{hc|# badblocks -wsv '''-o''' /root/<badblocks.txt> /dev/<device>| | |
| Checking for bad blocks in read-write mode | |
| From block 0 to 488386583 | |
| Testing with pattern '''0xaa''': done | |
| Reading and comparing: 6.36% done, 0:51 elapsed. (0/0/14713 errors) | |
| [...] | |
| Testing with pattern '''0x00''': done | |
| Reading and comparing: done | |
| Pass completed, 527405 bad blocks found. (0/0/527405 errors)}} | |
| Then (re-)create the file system with the information: | |
| # mkfs.<filesystem-type> '''-l''' /root/<badblocks.txt> /dev/<device> | |
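Before handing the list to {{ic|mkfs}}, it can be useful to see how the bad blocks cluster. The sketch below assumes the output file contains one block number per line in ascending order, as in the example output above; the function name {{ic|badblock_ranges}} is hypothetical.

```shell
# badblock_ranges FILE: collapse a sorted list of block numbers (one per line)
# into "first-last" ranges, printing one range or single block per line.
badblock_ranges() {
    awk '
        function emit(a, b) { if (a == b) print a; else print a "-" b }
        !NF            { next }                          # skip blank lines
        !seen          { start = prev = $1; seen = 1; next }
        $1 == prev + 1 { prev = $1; next }               # still inside the current run
                       { emit(start, prev); start = prev = $1 }
        END            { if (seen) emit(start, prev) }
    ' "$1"
}

# Example call, using the file name from the text:
#   badblock_ranges /root/<badblocks.txt>
```

A few long runs suggest a localised defect, whereas many scattered single blocks point to a more general problem with the device.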
| {{Note|The meaning of {{ic|0/0/527405}} errors is <number of read errors>/<number of write errors>/<number of corruption errors>.}} | |
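The three counters can be split apart in a script. This is a minimal sketch that assumes the status line has the {{ic|(R/W/C errors)}} form shown in the example output; {{ic|error_counts}} is a hypothetical helper name.

```shell
# error_counts LINE: print "read write corruption" taken from "(R/W/C errors)".
error_counts() {
    printf '%s\n' "$1" |
        sed -n 's|.*(\([0-9]*\)/\([0-9]*\)/\([0-9]*\) errors).*|\1 \2 \3|p'
}

line='Pass completed, 527405 bad blocks found. (0/0/527405 errors)'
error_counts "$line" | {
    read -r read_errors write_errors corruption_errors
    echo "read=$read_errors write=$write_errors corruption=$corruption_errors"
}
```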
| ==== Block size ==== | |
| {{Merge|Securely wipe disk#Block size|Block size alignment is not specific to this tiny section. Other Arch Wiki Articles already do cover this up. Search for it and cover everything up on a [[Block size]] page.}} | |
| First find the file system's '''block size'''. For example, for ext2/ext3/ext4 filesystems: | |
| # dumpe2fs /dev/<device-PARTITION> | grep 'Block size' | |
| Then pass this value to ''badblocks'': | |
| # badblocks -b <block size> | |
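The two steps can be combined in a script. The sketch below assumes the {{ic|Block size:}} line format printed by ''dumpe2fs''; the function name {{ic|block_size_from_dumpe2fs}} is hypothetical, and the real invocation is left as a comment because it needs root and an actual partition.

```shell
# block_size_from_dumpe2fs: read dumpe2fs output on stdin, print the block size in bytes.
block_size_from_dumpe2fs() {
    awk -F: '$1 == "Block size" { gsub(/[^0-9]/, "", $2); print $2; exit }'
}

# Demonstration on a sample of the relevant output lines:
printf 'Block count:              122096646\nBlock size:               4096\n' |
    block_size_from_dumpe2fs

# On a real system (as root), something along these lines:
#   bs=$(dumpe2fs -h /dev/<device-PARTITION> 2>/dev/null | block_size_from_dumpe2fs)
#   badblocks -b "$bs" -nsv /dev/<device-PARTITION>
```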
| == References == | |
| * [http://www.pcguide.com/ts/x/comp/hdd/errorsBadSectors-c.html My hard disk has bad sectors or is developing bad sectors over time] |
| All modern hard drives can report their current state via [[SMART]] attributes. These values describe various parameters of the hard disk and can indicate the disk's remaining lifespan or any possible errors. In addition, various SMART tests can be performed to detect hardware problems on the disk. This article describes how such tests can be performed on Linux using '''[[smartctl]] (Smartmontools)'''. | |
| == Installation of Smartmontools == | |
| The Smartmontools can be installed on Ubuntu using the package sources: | |
| <source lang="bash"> | |
| sudo apt-get install smartmontools | |
| </source> | |
| To check that the hard disk supports SMART and that SMART is enabled, use the following command (in this example for the hard disk /dev/sdc): | |
| <source lang="bash"> | |
| sudo smartctl -i /dev/sdc | |
| </source> | |
| Example Output: | |
| <source lang="text"> | |
| smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.5.0-39-generic] (local build) | |
| Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net | |
| === START OF INFORMATION SECTION === | |
| Model Family: Western Digital RE4 Serial ATA | |
| Device Model: WDC WD5003ABYX-01WERA1 | |
| Serial Number: WD-WMAYP5453158 | |
| LU WWN Device Id: 5 0014ee 00385d526 | |
| Firmware Version: 01.01S02 | |
| User Capacity: 500,107,862,016 bytes [500 GB] | |
| Sector Size: 512 bytes logical/physical | |
| Device is: In smartctl database [for details use: -P show] | |
| ATA Version is: 8 | |
| ATA Standard is: Exact ATA specification draft version not indicated | |
| Local Time is: Mon Sep 2 14:06:57 2013 CEST | |
| SMART support is: Available - device has SMART capability. | |
| SMART support is: Enabled | |
| </source> | |
| The last two lines are the most important as these indicate whether SMART support is available and enabled. | |
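In a script, these two lines can be checked automatically. The following is a minimal sketch that assumes the {{ic|SMART support is:}} wording shown in the example output; {{ic|smart_ready}} is a hypothetical helper name.

```shell
# smart_ready: read "smartctl -i" output on stdin; succeed only if SMART support
# is reported as both available and enabled.
smart_ready() {
    awk '
        /^SMART support is:[ \t]*Available/ { available = 1 }
        /^SMART support is:[ \t]*Enabled/   { enabled = 1 }
        END { exit !(available && enabled) }
    '
}

# On a real system:
#   sudo smartctl -i /dev/sdc | smart_ready && echo "SMART is available and enabled"
```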
| == Available Tests == | |
| SMART offers two different tests, according to specification type, for ATA and SCSI devices.<ref>[http://www.t10.org/ftp/t10/document.99/99-179r0.pdf Hard Drive Self-tests] (t10.org)</ref> Each of these tests can be performed in two modes: | |
| * Foreground Mode | |
| * Background Mode | |
| In '''Background Mode''' the '''priority of the test is low''', which means the normal instructions continue to be processed by the hard disk. If the hard drive is busy, the test is paused and resumes once the load drops, so there is '''no interruption of the operation'''.<br /> | |
| In '''Foreground Mode''' all commands will be answered during the test with a "CHECK CONDITION" status. Therefore, this mode is only recommended when the hard disk is not used. '''In principle, the background mode is the preferred mode.''' | |
| === ATA/SCSI Tests === | |
| ==== Short Test ==== | |
| The goal of the short test is the rapid identification of a defective hard drive. The maximum run time of the short test is therefore 2 minutes. The test is divided into three segments, which cover the following areas: | |
| * '''Electrical Properties''': The controller tests its own electronics, and since this is specific to each manufacturer, it cannot be explained exactly what is being tested. It is conceivable, for example, to test the internal RAM, the read/write circuits or the head electronics. | |
| * '''Mechanical Properties''': The exact sequence of the servos and the positioning mechanism to be tested is also specific to each manufacturer. | |
| * '''Read/Verify''': A certain area of the disk is read and the data in it is verified. The size and position of this area are also specific to each manufacturer. | |
| ==== Long Test ==== | |
| The long test was designed as the final test in production and is the same as '''the short test''' with '''two differences''': there is '''no time restriction''', and in the Read/Verify segment the '''entire disk is checked''' rather than just a section. The long test can, for example, be used to confirm the results of the short test. | |
| === ATA specified Tests === | |
| All tests listed here are only available for ATA hard drives. | |
| ==== Conveyance Test ==== | |
| This test can be performed to determine damage during transport of the hard disk within just a few minutes. | |
| ==== Select Tests ==== | |
| During a selective test, only the specified range of LBAs is checked. The LBAs to be scanned are specified in one of the following formats: | |
| <source lang="bash"> | |
| sudo smartctl -t select,10-20 /dev/sdc #LBA 10 to LBA 20 (incl.) | |
| sudo smartctl -t select,10+11 /dev/sdc #LBA 10 to LBA 20 (incl.) | |
| </source> | |
| It is also possible to scan multiple ranges (up to 5): | |
| <source lang="bash"> | |
| sudo smartctl -t select,0-10 -t select,5-15 -t select,10-20 /dev/sdc | |
| </source> | |
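The two range notations shown above are equivalent: {{ic|START+COUNT}} covers {{ic|COUNT}} LBAs beginning at {{ic|START}}, so the last LBA is {{ic|START + COUNT - 1}}. As a small illustrative sketch (the helper name {{ic|span_to_range}} is hypothetical):

```shell
# span_to_range SPAN: convert "START+COUNT" into the equivalent "START-END".
span_to_range() {
    start=${1%%+*}      # text before the "+"
    count=${1#*+}       # text after the "+"
    echo "$start-$(( start + count - 1 ))"
}

span_to_range 10+11     # prints 10-20
```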
| == Test procedure with smartctl == | |
| Before performing a test, the approximate duration of the various tests can be displayed using the following command: | |
| <source lang="bash"> | |
| sudo smartctl -c /dev/sdc | |
| </source> | |
| Example output: | |
| <source lang="text"> | |
| [...] | |
| Short self-test routine | |
| recommended polling time: ( 2) minutes. | |
| Extended self-test routine | |
| recommended polling time: ( 83) minutes. | |
| Conveyance self-test routine | |
| recommended polling time: ( 5) minutes. | |
| [...] | |
| </source> | |
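The polling times can also be extracted for use in a script. This sketch assumes the two-line layout shown in the example output (test name on one line, polling time on the next); {{ic|polling_minutes}} is a hypothetical helper name.

```shell
# polling_minutes: read "smartctl -c" output on stdin and print "<test>: <minutes>".
polling_minutes() {
    awk '
        /self-test routine/ { name = $1 }        # Short, Extended or Conveyance
        /recommended polling time:/ && name != "" {
            line = $0
            gsub(/[^0-9]/, "", line)             # keep only the digits
            print name ": " line
            name = ""
        }
    '
}

# On a real system:
#   sudo smartctl -c /dev/sdc | polling_minutes
```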
| The following command starts the desired test (in Background Mode): | |
| <source lang="bash"> | |
| sudo smartctl -t <short|long|conveyance|select> /dev/sdc | |
| </source> | |
| It is also possible to perform an "offline" test.<ref>[http://sourceforge.net/apps/trac/smartmontools/wiki/test_offline Smartmontools Wiki] (sourceforge.net)</ref> However, only the standard self-test (Short Test) is performed. | |
| Example output: | |
| <source lang="text"> | |
| smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.5.0-39-generic] (local build) | |
| Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net | |
| === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION === | |
| Sending command: "Execute SMART Short self-test routine immediately in off-line mode". | |
| Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful. | |
| Testing has begun. | |
| Please wait 2 minutes for test to complete. | |
| Test will complete after Mon Sep 2 15:32:30 2013 | |
| Use smartctl -X to abort test. | |
| </source> | |
| To perform the tests in Foreground Mode a "-C" must be added to the command. | |
| <source lang="bash"> | |
| sudo smartctl -t <short|long|conveyance|select> -C /dev/sdc | |
| </source> | |
| == Viewing the Test Results == | |
| In general, the test results are included in the output of the following command: | |
| <source lang="bash"> | |
| sudo smartctl -a /dev/sdc | |
| </source> | |
| Example: | |
| <source lang="text"> | |
| [...] | |
| SMART Self-test log structure revision number 1 | |
| Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error | |
| # 1 Short offline Completed without error 00% 2089 - | |
| # 2 Extended offline Completed without error 00% 2087 - | |
| # 3 Short offline Completed without error 00% 2084 - | |
| SMART Selective self-test log data structure revision number 1 | |
| SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS | |
| 1 0 0 Not_testing | |
| 2 0 0 Not_testing | |
| 3 0 0 Not_testing | |
| 4 0 0 Not_testing | |
| 5 0 0 Not_testing | |
| Selective self-test flags (0x0): | |
| After scanning selected spans, do NOT read-scan remainder of disk. | |
| If Selective self-test is pending on power-up, resume after 0 minute delay. | |
| [...] | |
| </source> | |
| The following command can also be used if only the test results should be displayed: | |
| <source lang="bash"> | |
| sudo smartctl -l selftest /dev/sdc | |
| </source> | |
| Example output: | |
| <source lang="text"> | |
| smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.5.0-39-generic] (local build) | |
| Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net | |
| === START OF READ SMART DATA SECTION === | |
| SMART Self-test log structure revision number 1 | |
| Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error | |
| # 1 Short offline Completed without error 00% 2089 - | |
| # 2 Extended offline Completed without error 00% 2087 - | |
| # 3 Short offline Completed without error 00% 2084 - | |
| </source> | |
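For unattended checks, the self-test log can be filtered for entries that did not finish cleanly. The following is a minimal sketch that assumes the log layout shown above, where every entry starts with {{ic|#}} and a number; {{ic|failed_selftests}} is a hypothetical helper name. It also reports tests that were aborted or are still running, since it only looks for the absence of "Completed without error".

```shell
# failed_selftests: read "smartctl -l selftest" output on stdin and print every
# log entry whose status is not "Completed without error".
# Exit status is 1 if at least one such entry was found, 0 otherwise.
failed_selftests() {
    awk '
        /^# *[0-9]+ / && !/Completed without error/ { print; bad = 1 }
        END { exit bad + 0 }
    '
}

# On a real system:
#   sudo smartctl -l selftest /dev/sdc | failed_selftests || echo "check the disk"
```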
| == References == | |
| <references/> | |
| [[Category:Smartmontools]] | |
| [[de:SMART Tests mit smartctl]] |