Sunday, January 17, 2010

The Ins and Outs of Fsck - Dealing with corrupt filesystems

fsck is used to check and resolve problems with filesystems. If you have corruption on one or more filesystems then read on. Fsck does not test the disk hardware itself - under Solaris use format(8) for that.
Above all remember this:
You MUST NOT run fsck on a mounted filesystem

If you're in a hurry, skip down to Interacting with Fsck.

Most people's first experience with fsck comes after their system has crashed and they're faced with cryptic and daunting questions from it. This is unfortunate because they're probably under considerable pressure to get the system running again and don't know what to do. If you're new to Unix and responsible for one or more systems I would encourage you to find an unimportant workstation and experiment with fsck a little - umount a filesystem and fsck it. If the machine doesn't have any data on it you could pull the power and see what happens when the machine reboots...
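Before experimenting, it helps to have a guard for the rule above about never running fsck on a mounted filesystem. The following is a minimal sketch, not a standard tool: is_mounted is a hypothetical helper name, and it assumes the mount table lists the block device in its first column (/etc/mnttab on Solaris, /proc/mounts on Linux).

```shell
#!/bin/sh
# is_mounted DEVICE [TABLE]
# Succeeds (exit 0) if DEVICE is listed as a mounted device in TABLE.
# TABLE defaults to /etc/mnttab (Solaris); pass /proc/mounts on Linux.
is_mounted() {
    dev=$1
    table=${2:-/etc/mnttab}
    # fsck is normally given the raw device (/dev/rdsk/...), but the mount
    # table lists the block device (/dev/dsk/...), so map one to the other.
    blk=$(printf '%s\n' "$dev" | sed 's|/rdsk/|/dsk/|')
    # The first whitespace-separated field of each line is the device.
    awk -v d="$blk" '$1 == d { found = 1 } END { exit !found }' "$table"
}

# Example use (commented out so that nothing runs by accident):
# if is_mounted /dev/rdsk/c0t3d0s3; then
#     echo "still mounted - umount it first"
# else
#     fsck /dev/rdsk/c0t3d0s3
# fi
```

The check only compares device names, so it will not notice a filesystem mounted through a different path to the same device.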

This FAQ focuses on Solaris, though most of it is also applicable to other Unix variants, including Linux.

How fsck normally works

Unix, any Unix, will refuse to mount a filesystem that was not unmounted cleanly. This is because it may be corrupt and mounting a corrupt filesystem will likely cause the system to crash.

When the system boots all filesystems are checked to see whether they are Clean. The term simply means that the filesystem was unmounted properly after its last use. If the filesystem is Dirty then fsck will be called in to check it out in more detail. Some Unix variants such as Linux will also run fsck after the filesystem has been mounted N times - N is the maximum mount count.
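On Linux ext2/3/4 filesystems the mount counters can be read with tune2fs -l. The sketch below assumes that output format ("Mount count:" and "Maximum mount count:" lines) and uses a hypothetical helper name, check_due; a maximum of -1 or 0 is treated as "count-based checking disabled".

```shell
#!/bin/sh
# check_due
# Reads `tune2fs -l` style output on stdin and succeeds (exit 0) when the
# mount count has reached the maximum mount count.
check_due() {
    awk -F': *' '
        $1 == "Mount count"         { count = $2 }
        $1 == "Maximum mount count" { max = $2 }
        # A maximum of -1 or 0 means the count-based check is disabled.
        END { exit !(max > 0 && count >= max) }
    '
}

# Example use (the device name is only an illustration):
# tune2fs -l /dev/sda1 | check_due && echo "fsck is due at next boot"
```

On Solaris the nearest equivalent is fsck -m, which checks whether a filesystem is fit to be mounted without repairing anything.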

Modern Unix systems run fsck automatically in what is known as Preen mode. In this mode fsck will fix minor problems that do not result in data loss - such as the Clean/Dirty state flag. If it finds any problems that may result in data loss it will flip into Interactive mode - this is how most people first encounter fsck.
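A boot script decides what to do after a preen run by looking at fsck's exit status. The sketch below decodes the status bits documented for the Linux fsck(8) wrapper; Solaris fsck uses a different set of exit codes, so treat this as Linux-only. The function name describe_fsck_status is made up for the example.

```shell
#!/bin/sh
# describe_fsck_status CODE
# Prints one line for each bit set in a Linux fsck(8) exit status.
describe_fsck_status() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    [ $(( code & 1 )) -ne 0 ] && echo "errors corrected"
    [ $(( code & 2 )) -ne 0 ] && echo "system should be rebooted"
    [ $(( code & 4 )) -ne 0 ] && echo "errors left uncorrected"
    [ $(( code & 8 )) -ne 0 ] && echo "operational error"
    [ $(( code & 16 )) -ne 0 ] && echo "usage or syntax error"
    [ $(( code & 32 )) -ne 0 ] && echo "cancelled by user request"
    [ $(( code & 128 )) -ne 0 ] && echo "shared-library error"
    return 0
}

# Example use (the device name is only an illustration):
# fsck -p /dev/sda1; describe_fsck_status $?
```

A status with the "errors left uncorrected" bit set is the case where you end up answering fsck's questions by hand.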

Interacting with Fsck

When you first encounter fsck it seems as though only people with a PhD in computer science should be dealing with it - the messages are that cryptic.

It's really not that hard; tell someone to deal with the panicking users, close the door, and turn your phone off. You need to concentrate on this.

Take note of these points:

You must not mount a corrupt filesystem.
Some systems (including older Solaris systems) will let you mount a corrupt filesystem after fsck has been run on it. Doing so will almost certainly cause the system to crash later, and the corruption may then be even worse.
Most interaction with fsck consists of answering Yes or No.
You are answering a series of questions that, in essence, mean 'Shall I fix this corruption?'. Newcomers are inclined to answer No because they don't understand the implications. If you answer No even once, the filesystem corruption may not be cleared, and you must run fsck again.

Minor Corruption

I define minor corruption as where you've not lost data, but fsck can't tell.
An example of fsck encountering a minor corruption is shown below:

sun (ksh) # fsck /dev/rdsk/c0t3d0s3
** /dev/rdsk/c0t3d0s3
** Last mounted on /usr
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
UNREF FILE I=343651 OWNER=root MODE=100644
SIZE=0 MTIME=Jun 13 09:43 2003
CLEAR? y

** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? y

25947 files, 588044 used, 133186 free (11674 frags, 15189 blocks, 1.6%
fragmentation)

Here fsck found an unreferenced file - that's an inode with no directory entry pointing to it. There's no name on the file because filenames are stored in directories. The only information shown is the inode number (I=343651), size, ownership, permissions and modification time. This inode refers to a file that is empty. Also as the Inode number is a high one it's very unlikely that this file is important - we answer Y (yes) to the CLEAR? question.

The superblock's free block count ("FREE BLK COUNT") will almost always be wrong if fsck made any modification to the filesystem in earlier phases. We answer Y (yes) to tell fsck to correct it.

Fsck's preen mode could not be expected to resolve this problem automatically - it is possible that an empty file could be significant. We made a judgement here, as you may have to.
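The MODE value in the UNREF FILE message above is the inode's mode in octal: the leading digits give the file type and the trailing digits the permissions. A small sketch for decoding it follows; inode_type is a hypothetical helper, and the type values are the usual Unix S_IFMT codes.

```shell
#!/bin/sh
# inode_type MODE
# Prints the file type and permissions encoded in the octal MODE value
# that fsck reports (for example 100644 or 40755).
inode_type() {
    # A leading 0 makes the shell read the value as octal. Mask 0170000
    # keeps the file-type bits, mask 07777 keeps the permission bits.
    type=$(printf '%o' $(( 0$1 & 0170000 )))
    perms=$(printf '%o' $(( 0$1 & 07777 )))
    case $type in
        10000)  kind="fifo" ;;
        20000)  kind="character device" ;;
        40000)  kind="directory" ;;
        60000)  kind="block device" ;;
        100000) kind="regular file" ;;
        120000) kind="symbolic link" ;;
        140000) kind="socket" ;;
        *)      kind="unknown type" ;;
    esac
    printf '%s, permissions %s\n' "$kind" "$perms"
}

# Example use:
# inode_type 100644
```

A mode of 0 has no type bits at all and is reported as "unknown type", which is a hint that the inode itself has been zeroed.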

Mid-Level Corruption

If you get to this point then you have lost at least one and possibly several files. If you're lucky you've only lost a few files that were open when the system crashed. At worst you've lost several directories and with them all the files in them. Send out for the backup tape, you're going to need it.

The following example of corruption, showing loss of real data, has been abridged for inclusion here:

** /dev/rdsk/c0t1d0s6
** Last Mounted on
** Phase 1 - Check Blocks and Sizes
UNKNOWN FILE TYPE I=97
CLEAR? yes

UNALLOCATED I=10 OWNER=root MODE=0
SIZE=0 MTIME=Jan 1 07:00 1970
NAME=?

REMOVE? yes

** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
UNREF DIR I=213509 OWNER=root MODE=40755
SIZE=512 MTIME=Mar 13 17:16 1999
RECONNECT? yes

** Phase 4 - Check Reference Counts
LINK COUNT DIR I=35722 OWNER=bin MODE=40755
SIZE=512 MTIME=Mar 13 17:24 1999 COUNT 5 SHOULD BE 4
ADJUST? yes

** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? yes

2683 files, 164403 used, 504020 free (1804 frags, 62777 blocks,
0.2% fragmentation)

***** FILE SYSTEM WAS MODIFIED *****

Phase#1 shows that we've lost two files; we have no idea of their size or contents. The I=10 entry is particularly suspect because all its values are zero - root is UID 0, and 1st Jan 1970 equates to an epoch time of 0. I=10 is a very low inode number - in general the lower the number the more serious the problem. Both lost inodes could have been directories - there is no way of knowing. There has been serious corruption of the inode table here.

Phase#3 reveals an unconnected directory. This is a directory that is not listed in any other directory. That should only be true of the root inode, which I=213509 certainly is not. The 'RECONNECT? yes' causes fsck to make an entry in the filesystem's lost+found directory; the name will be '#213509'. Once the filesystem is mounted you can 'cd /lost+found/#213509' (relative to its mount point) and investigate what the directory contains and possibly identify where in the filesystem it should be.
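Once the filesystem is mounted, a quick survey of lost+found shows what fsck reconnected. This is a minimal sketch that assumes the '#<inode>' naming described above; list_recovered is a hypothetical helper name.

```shell
#!/bin/sh
# list_recovered DIR
# Prints "inode<TAB>kind<TAB>path" for every '#<inode>' entry in DIR.
list_recovered() {
    for entry in "$1"/\#*; do
        # If nothing matches, the pattern is left as-is; skip it.
        [ -e "$entry" ] || continue
        name=${entry##*/}      # strip the directory part
        inode=${name#?}        # drop the leading '#'
        if [ -d "$entry" ]; then kind=dir; else kind=file; fi
        printf '%s\t%s\t%s\n' "$inode" "$kind" "$entry"
    done
}

# Example use (the mount point is only an illustration):
# list_recovered /export/home/lost+found
```

From there, ls -l on each directory and file(1) on each file are usually enough to work out where an entry used to live.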

Phase#4 shows a directory with an incorrect link count. The inode holding the directory has a link count of 5, but fsck could only find 4 directory entries pointing to it. This is probably the least serious error shown on this run.

The filesystem is probably safe to mount, though to be 100% sure you ought to fsck it again.

Assuming this is the only corrupt filesystem you can either 'exit' single user mode, or simply reboot the machine.

After the machine boots you need to decide what to do with this filesystem. This is a judgement call that you must make and which depends on many factors outside the scope of this FAQ. Personally, faced with the above fsck results, and unless the filesystem was totally unimportant, I would consider the overall level of damage sufficient to warrant a full restore.

You shouldn't spend too long trying to fix this level of corruption. If more than half a dozen files have gone west you need to be considering restoring the whole filesystem from backup.

Severe Corruption

At this level you may have lost the entire filesystem. It's really a case of seeing what you can salvage rather than getting the filesystem back on its feet. If it's a filesystem that the system does not need in order to boot then you might consider removing it from /etc/vfstab (/etc/fstab on Linux) so that you can boot the system multi-user.
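Rather than deleting the entry outright, it is safer to comment it out so it can be restored later. The sketch below assumes the usual column layout (mount point in column 3 of /etc/vfstab, column 2 of a Linux /etc/fstab); comment_out_mount is a hypothetical helper that writes the edited table to stdout and never touches the original.

```shell
#!/bin/sh
# comment_out_mount FILE MOUNTPOINT [FIELD]
# Writes FILE to stdout with the entry for MOUNTPOINT commented out.
# FIELD is the column holding the mount point: 3 for /etc/vfstab (the
# default), 2 for a Linux /etc/fstab.
comment_out_mount() {
    awk -v mp="$2" -v col="${3:-3}" '
        # Leave existing comments alone; prefix the matching entry with #.
        $0 !~ /^#/ && $col == mp { print "#" $0; next }
        { print }
    ' "$1"
}

# Example use (keep a copy of the original first):
# cp /etc/vfstab /etc/vfstab.orig
# comment_out_mount /etc/vfstab.orig /export/home > /etc/vfstab
```

Writing to stdout means you can inspect the result before it replaces the live file.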

If you run fsck on what you consider to be a 'good' filesystem, and see something like this, then you have severe corruption:

sun# fsck /dev/rdsk/c0t0d0s1
** /dev/rdsk/c0t1d0s1 (NO WRITE)
BAD SUPER BLOCK: MAGIC NUMBER WRONG
USE AN ALTERNATE SUPER-BLOCK TO SUPPLY NEEDED INFORMATION;
eg. fsck [-F ufs] -o b=# [special ...]
where # is the alternate super block. SEE fsck_ufs(1M).

fsck did not identify this partition as containing a filesystem. Double check that you entered the correct device file, assuming you did...

Using alternate superblocks

When you create a filesystem with newfs it pumps out a long list of numbers - super-block locations. The super-block contains key information about a filesystem; without it you don't have a usable filesystem. Solaris creates a backup super-block in every cylinder group and there is always one at block #32. Try this, who knows...

sun (ksh) # fsck -o b=32 /dev/rdsk/c0t1d0s6
Alternate super block location: 32.
** /dev/rdsk/c0t1d0s6
** Last Mounted on
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? y


2746 files, 169956 used, 498467 free (1051 frags, 62177 blocks,
0.1% fragmentation)
***** FILE SYSTEM WAS MODIFIED *****

Well, that doesn't happen very often! It looks like the superblock itself was the only thing corrupted. It lives at the start of the disk, so perhaps something wrote there?

Officially you are supposed to record the super-block numbers when you create a filesystem, but no-one ever does. Assuming the filesystem was created with default parameters you can get a list of super-block backups by running newfs with the '-N' option:

sun (ksh) # newfs -N /dev/rdsk/c0t1d0s1
/dev/rdsk/c0t1d0s1: 237000 sectors in 50 cylinders of 20 tracks,
237 sectors
115.7MB in 4 cyl groups (16 c/g, 37.03MB/g, 17792 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
32, 76112, 152192, 228272,

This yields the next superblock backup at block 76112. You can try this if you weren't as lucky as me, though to be honest if things are that bad it's probably a waste of time.
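If you do want to work through the backups in order, the list can be pulled out of the newfs -N output mechanically. This is a rough sketch that assumes the output format shown above, with the block numbers following the "super-block backups" line; superblock_backups is a hypothetical helper name.

```shell
#!/bin/sh
# superblock_backups
# Reads `newfs -N` output on stdin and prints one backup super-block
# number per line.
superblock_backups() {
    awk '
        # Once the header has been seen, every number on the following
        # lines is a backup location: turn separators into spaces and
        # print each field.
        seen { gsub(/[^0-9]+/, " "); for (i = 1; i <= NF; i++) print $i }
        /super-block backups/ { seen = 1 }
    '
}

# Example use: print the fsck commands to try, one per backup.
# newfs -N /dev/rdsk/c0t1d0s1 | superblock_backups | while read b; do
#     echo "fsck -F ufs -o b=$b /dev/rdsk/c0t1d0s1"
# done
```

The example only prints the commands; run them by hand, one at a time, so you can judge each result before moving on to the next backup.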
