I forgot to mention a few more points. 1) fsck -y isn't designed, at least in single-pass mode, to ensure the filesystem is completely fixed (see next point). It is designed to make sure that further writes to the fs don't make it worse, among other things.
2) After running fsck, if it reports "FILE SYSTEM MODIFIED", you must run fsck again to make sure you don't have further errors. Keep re-running it until you no longer get that message; only then are all the problems with the fs fixed.
3) If the list of messages includes a note about a hard link count being wrong, and the count is too few, the -y fix *will* delete the file. This, on a file that would have been totally recoverable had you not chosen the -y option.
4) In the particular case at hand, where there are multiply linked files, you may also wind up writing over the top of a good file with bad data if you choose not to clone the file, hence losing it. The opposite is also true.
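Points 1 and 2 amount to a loop. Here's a minimal sketch (the helper name is mine, and the device is a placeholder; per fsck(8), the exit status is a bit mask where bit 0 means "filesystem errors corrected"):

```shell
#!/bin/sh
# fsck_until_clean CMD...: run CMD repeatedly until its exit status no
# longer has bit 0 ("filesystem errors corrected") set.
# Returns the final exit status.
fsck_until_clean() {
    while :; do
        "$@"
        st=$?
        # Stop once no corrections were made (bit 0 clear).
        [ $((st & 1)) -eq 0 ] && return $st
    done
}

# Usage (placeholder device -- only ever run on an UNMOUNTED filesystem):
#   fsck_until_clean fsck -y /dev/sdXN
```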
Lastly, my notes come from the linux-ext4 mailing list, but what do those guys know anyway?
Jack
Would you mind posting references to the conversations? A lot of this is new information to me, and I've been running ext* file systems for years. Most of the documentation I've found is pretty old.
Matt
On Sun, Sep 5, 2010 at 2:02 AM, Jack quiet_celt@yahoo.com wrote:
You would have to go to the linux-ext4 mailing list archive and do a search. I'm only working from notes (copy-and-pastes with source notation, actually) that I made of the conversations.
As far as 1 and 2 go, this should be documented in many places (google). Some filesystem errors can mask other issues, and new problems may appear after running a first-pass fsck. Generally you want to re-run fsck until it reports no errors. Here's a reference that explains it quite succinctly (look for "was modified"): http://support.apple.com/kb/ts1417
As far as point 3 goes, that should be obvious and easily verified, either with any Linux reference that covers hard links, or by creating a file, making hard links to it, and testing. When a file is hard-linked, the system keeps track of the number of references to it; once the counter reaches zero the file is released. A hard link creates additional names for one file: you can delete the original name and the links are still good. A soft link, by contrast, just points at a name; if you delete the original file the link points to nothing. So if you have a multiply hard-linked file where the count is 1 but should be 2, and fsck -y deletes one of the references, the count goes to zero and poof, the file is gone.
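The reference counting is easy to watch from a shell (GNU stat's %h format prints the hard-link count; the temp-file names are mine):

```shell
#!/bin/sh
# Watch the hard-link reference count as names are added and removed.
d=$(mktemp -d)

echo hello > "$d/a"
ln "$d/a" "$d/b"          # second name, same inode

stat -c %h "$d/a"         # prints 2: two names reference the inode
rm "$d/a"                 # drops the count, does NOT free the data
stat -c %h "$d/b"         # prints 1: file still fully intact
cat "$d/b"                # prints hello

rm -r "$d"
```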
Item 4 is more obscure; I found it out from the ext4 mailing list. The other information is findable there as well.
Furthermore, if you run fsck on a mounted file system, you must cleanly reboot to ensure the changes take effect. This is widely documented (one source: Running Linux, O'Reilly).
Lastly, if fsck fails because it can't find a valid superblock to check against (i.e. the main superblock is corrupted) and you're using an ext2/3/4 filesystem, you can run fsck.extN -f -b <block> <device>, where N is 2, 3, or 4 and <block> is the location of a backup superblock, typically a multiple of the blocks-per-group count plus 1 (8193 with 1 KiB blocks, the blocks-per-group count being 8192). The -f is required because the superblock copy may appear "clean". On ext2 and later filesystems every block group has a copy of the superblock (with the sparse_super feature, only some groups do).
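You can rehearse the whole recovery on a scratch image file with no real disk involved (this is a sketch; the image size and 1 KiB block size are my choices, which put the first backup superblock at block 8193; mke2fs -n on a real device would print the actual backup locations without writing anything):

```shell
#!/bin/sh
# Rehearse superblock recovery on a scratch image (no real disk touched).
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1024 count=16384 2>/dev/null

# Make an ext2 fs with 1 KiB blocks; -F allows a regular file, -q quiets it.
mke2fs -q -F -b 1024 "$img"

# Destroy the primary superblock (it lives 1024 bytes into the device).
dd if=/dev/zero of="$img" bs=1024 seek=1 count=1 conv=notrunc 2>/dev/null

# Repair from the first backup: block 8193 = blocks-per-group (8192) + 1.
# -B supplies the block size, since the primary superblock is gone.
e2fsck -y -f -b 8193 -B 1024 "$img"

rm "$img"
```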
Jack
--- On Sun, 9/5/10, Matthew Copple mcopple@kcopensource.org wrote:
From: Matthew Copple mcopple@kcopensource.org
Subject: Re: Ubuntu 9.04 with file issues (Cross posted by intent - we're supposed to compare notes on such problems - I hope.)
To: kclug@kclug.org
Date: Sunday, September 5, 2010, 11:06 AM
Okay, this is just my take on this situation, being a casual Linux user and sysadmin.
fsck -y should only be run in very specific cases. Even so, the situations Jack seems to be describing indicate (at least to me) that there is far more wrong with your computer than just a corrupted file system. It sounds more like a hardware problem than a software one (RAM, processor, or even the drive).
There is ZERO need to be dramatic about a command-line tool. fsck -y is perfectly fine and safe to run, provided you have not made any major system changes, and under the following conditions:
1) You cannot afford downtime (i.e. this is a live, production system)
1.a) You have no CURRENT, UP-TO-DATE backup
2) You have no backups
3) Last but not least... sensitive data.
Now granted, if you have sensitive data on your computer, you had better be backing it up on a regular basis; that is a no-brainer. If fsck -y is deleting files, then it is most likely due to a hard failure in the system (this is *NOT* a program failure in many cases I've seen).
If someone is running fsck to clean a file system, whatever the reason, it is likely they do not have a current backup. My recommendation is, first and foremost: BACK UP THE SYSTEM... then restore from backup once you have a clean system to work from.
This drama is really unnecessary.
From one Linux user who prefers drama-free lists... :-) (Yeah, I know... wishful thinking!)