Adventures with FSCK
Backups are always a good idea… this week I lost two hard disks. With the one in the PC, I was able to recover all the data and transfer it to a remote server. The other was in my Ubuntu machine, the Tambaqui, my workhorse.
I arrived at work in the morning and saw that the computer was turned off. Since the Tambaqui also acts as a small server, I usually leave it running. I powered the fish back on and it booted normally (I always say that only what was working perfectly before will give you problems :-), otherwise nobody would come up with that worn-out excuse: yesterday it was fine!). Looking around after the boot, I saw several files with strange names full of ??? and !!!!… a sign of file system problems, but fsck swore everything was okay :-)
I then ran init 1 to drop into single-user mode, and there I ran fsck from the Ubuntu menu… which detected several problems. When the check finished, I rebooted and, to my surprise, the disk was empty!
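For anyone repeating this step, a read-only pass shows what fsck would complain about before it changes anything on disk. What follows is only a minimal sketch, assuming an ext2/ext3/ext4 partition and the e2fsck tool from e2fsprogs; the function name check_readonly and the device /dev/sda1 in the usage line are placeholders, not details of the Tambaqui.

```shell
# check_readonly TARGET
# Read-only file system check. TARGET is a partition or an image file.
check_readonly() {
    target=$1

    # Refuse to check anything that is currently mounted (Linux: /proc/mounts).
    if awk -v t="$target" '$1 == t { found = 1 } END { exit !found }' /proc/mounts 2>/dev/null; then
        echo "$target is mounted; unmount it or boot from a live CD first" >&2
        return 1
    fi

    # -n  open the file system read-only and answer "no" to every question
    # -f  force a full check even when the file system is flagged as clean
    e2fsck -n -f "$target"
}

# Hypothetical usage from a live CD (a real device needs root):
#   check_readonly /dev/sda1
```

The -f flag is there because a file system flagged as clean is normally skipped, which may be one reason a first fsck reports that everything is okay.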
Since I had already lost one HD the day before, I said: not again! :-( I saw that the root directory had gone missing. I only needed two directories, where I keep all my data, a practice I’ve followed for years. Well, I took a quick look at the disk and confirmed it was indeed empty. The machine would now boot only from the Ubuntu CD, but I had no trouble mounting the partition from there. I was getting ready to reinstall Ubuntu when I noticed that the partition wasn’t empty after all.
I never really believed in lost+found. Every time I had problems with Linux, I found only fragments of my files in it. But this time things were different. Among the fragments there were directories with names like #xxxxxx, thousands of them. Soon I discovered where my files had gone.
Then I started the following procedure to search for my files:
ls -laR lost+found > bigls.txt
Fsck had detached my directories from the tree and left them all at one level! I generated bigls.txt simply so I could search calmly for what could be saved. The final file was over 40MB! But that’s nothing for less. Using / to search, I could navigate the little ls monster I had created and soon spotted what needed to be saved.
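The same search can be scripted instead of done by hand in less. This is a rough sketch rather than the procedure used here: the helper name find_in_bigls is made up, it assumes the listing came from ls -laR as above, and it only prints the last word of each name, so names containing spaces come out truncated.

```shell
# find_in_bigls PATTERN [LISTING]
# Print "directory: name" for every entry in an `ls -laR` listing whose line matches PATTERN.
find_in_bigls() {
    pattern=$1
    listing=${2:-bigls.txt}
    awk -v pat="$pattern" '
        /:$/      { dir = substr($0, 1, length($0) - 1); next }  # header such as "lost+found/#123456:"
        /^total / { next }                                       # per-directory size summary
        NF == 0   { next }                                       # blank separator lines
        $0 ~ pat  { print dir ": " $NF }                         # $NF is the last word of the name
    ' "$listing"
}

# Hypothetical usage: which #xxxxxx directories hold Tomcat configuration?
#   find_in_bigls 'server\.xml' bigls.txt
```

Each hit names the #xxxxxx directory that holds the match, which is the piece of information needed to decide what to copy.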
After that, it was easy to move the most important directories to a network disk. I took the directory names from bigls.txt, which I kept open with less in another window. Simple and practical. I’m still redoing installations on the machine because I couldn’t recover much of my /etc (tomcat with ldap is a pain).
In this new installation, I’ll leave an rsync running to the network disk… at least for data and /etc :-)
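A minimal sketch of what such an rsync job might look like, assuming the network disk is already mounted; the function name backup_dirs and both paths in the usage line are placeholders, not the actual setup.

```shell
# backup_dirs DEST SRC...
# Copy each SRC directory into DEST, keeping permissions and timestamps.
backup_dirs() {
    dest=$1
    shift
    for src in "$@"; do
        # -a: archive mode (recursive, keeps permissions, times, symlinks)
        # No --delete on purpose: files that vanish from the source stay in the backup.
        # Trailing slash stripped from SRC, so /etc lands in DEST/etc.
        rsync -a "${src%/}" "$dest"/ || return 1
    done
}

# Hypothetical usage, for example from a nightly cron job:
#   backup_dirs /mnt/netdisk/tambaqui /home/me/data /etc
```

Leaving out --delete is a deliberate choice for a story like this one: a mirror that also deletes could propagate a damaged or emptied directory to the backup.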
As for the other disk, the one running Windows XP on my old notebook (2 HDs lost in 2 months and counting!)… I was also able to mount the disk, which no longer booted, using the Ubuntu CD. It all became very simple because Ubuntu already identifies the Windows disk and mounting it is ultra easy: just click on the corresponding volume in the menu. I like using the command line, but I have to confess that this time I enjoyed using the graphical tools in Ubuntu. I opened two file browser windows, one on the Windows partition and another on a remote disk via Samba. The copy was smooth, just dragging and dropping! Who would have thought we would one day do this on Linux :-)
A program that helps diagnose boot problems on Windows is testdisk. It detects partitions and helps rebuild the boot sector, among other things. Too bad it’s not available through apt-get for Ubuntu 9.04. But the link above offers a download. Just download and unpack the binary version: it contains a statically linked executable that is ready to run.
Another program for the toolkit is SmartMonTools. It reads the S.M.A.R.T. information from your hard drive and helps you work out what happened. Of course, it’s better to use this tool before having problems, but my Windows XP decided to die without a last breath. I arrived in the morning and it was already dead, but without data loss. You never know if it’s a virus problem or simply a hardware defect that you hadn’t noticed. These utilities let you run tests even post-mortem. Of course, that depends on the state of your hard drive, because if it doesn’t boot or isn’t even detected… forget about it.
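As a rough illustration of the SmartMonTools side, the sketch below pulls the overall verdict out of the smartctl health report. It assumes an ATA/SATA disk, where that report contains a line reading "SMART overall-health self-assessment test result: ..."; the helper name smart_health and the device /dev/sda are placeholders.

```shell
# smart_health: read `smartctl -H` output on stdin and print the overall verdict.
# Exits non-zero when no verdict line is found (for example, SMART unsupported).
smart_health() {
    awk -F': *' '
        /overall-health self-assessment test result/ { print $2; found = 1 }
        END { exit !found }
    '
}

# Hypothetical usage (needs root):
#   smartctl -H /dev/sda | smart_health
#   smartctl -a /dev/sda          # full attribute table and error log
#   smartctl -t short /dev/sda    # start a short self-test
```

A passing verdict is not a guarantee of a healthy disk, so the full attribute table is usually worth a look as well.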