Originally published on February 22, 2005 at newsforge.com
It finally happened. One of the production web servers, running Red Hat
Enterprise Linux (RHEL), was in trouble. Big trouble. On the surface,
it seemed to be running fine, serving up web pages normally. While
performing some routine maintenance, I happened to run an "ls" command
in the root directory and it returned an empty listing. No directories
or files. Nothing.
Where are the files?
From the root directory, I tried a "cd boot" and successfully changed
into the /boot directory. There, "ls" listed files as usual, so the data
was still present and the empty root listing pointed to file system
damage.
Checking the system message log showed hundreds of these foreboding
entries:
Dec 21 01:05:01 linux01 kernel: EXT3-fs error
(device cciss0(104,1)): ext3_new_block: Allocating block
in system zone - block = 96
Errors in the file system (ext3) can't be a good sign. I decided the
best course of action was to boot into rescue mode and run a file system
check (fsck) to clear up any problems. The fsck
turned up a lot of bad directory entries and offered to repair them.
Since there were so many, I ran the command again with the automatic
repair switch. After some time and a lot of scary messages, the repair
finished. I held my breath and rebooted. The boot loader failed to
find the kernel. Another reboot into rescue mode showed that the file
system was now clean, a little too clean. Now there really were no
files left. Only the /lost+found directory survived the repair and it
was empty.
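For reference, the log check that first exposed the problem can be
reduced to a one-line grep. This is a minimal sketch, assuming a
syslog-style file such as /var/log/messages; the function name
count_ext3_errors is made up for illustration.

```shell
#!/bin/sh
# Count ext3 error entries in a syslog-style log file.
# The default path is an assumption; adjust it for your system.
count_ext3_errors() {
    logfile=${1:-/var/log/messages}
    # grep -c prints the number of matching lines. It exits non-zero
    # when nothing matches, so "|| true" keeps the status clean.
    grep -c 'EXT3-fs error' "$logfile" || true
}

# Usage:
#   count_ext3_errors /var/log/messages
```

Any count above zero deserves a closer look before the damage spreads.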
Want to Get Away?
In eight years of running Linux, I had never needed to completely
restore a production system. Most restores were for individual files
or directories. While I had confidence in my backup system, that
confidence was put to the ultimate test. In moments like this, it is
evident how important a well tested backup system is.
The system runs a shell script via cron to do a full backup every night.
I prefer full backups to incremental or differential ones whenever
possible. If the amount of data you have to back up is too large, you
may have to use other backup strategies. In this case, the entire
system and all content take up about 10 GB, easily fitting on a single
DLT tape.
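Whether a full backup still fits on one tape is easy to check before
the nightly job runs. The sketch below is one way to do it, not part of
the original scripts: fits_on_tape is a hypothetical helper, and the
40 GB capacity in the usage comment is a placeholder rather than a
figure for any particular drive.

```shell
#!/bin/sh
# Report whether a directory tree fits within a capacity given in KB.
# Returns 0 if it fits, non-zero otherwise.
fits_on_tape() {
    dir=$1
    capacity_kb=$2
    # du -sk prints the total size in 1 KB units followed by the path
    used_kb=$(du -sk "$dir" | awk '{ print $1 }')
    echo "used: ${used_kb} KB, capacity: ${capacity_kb} KB"
    [ "$used_kb" -le "$capacity_kb" ]
}

# Usage (placeholder capacity of 40 GB expressed in KB):
#   fits_on_tape / $((40 * 1024 * 1024)) || echo "Too big for one tape"
```

The check ignores hardware compression, so it errs on the safe side.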
The backup script uses tar. The archive is not compressed in software;
the tape drive compresses it in hardware. While there are plenty
of fancy backup systems available, both open source and commercial, I
rely on the simple, tried and true, GNU version of tar. Combined with a
shell script, I have a backup system that is easy to understand and use.
The admins of non-Linux servers at my site use an expensive, name brand
commercial backup system, but their backups are not nearly as reliable.
On the other hand, tar does not offer some of the advanced features that
may be needed in certain environments.
Here is the backup script I used:
#!/bin/sh
# Backup
# This script backs up all server data to tape.

# Delete old mysql dumps
rm -f -r /backup/mysql
mkdir /backup/mysql

# Dump all mysql databases
mysqldump --add-drop-table -A --user=root \
    --password=xxxx > /backup/mysql/databases.sql

# Initialize the tape drive
if /bin/mt -f "/dev/nst0" tell > /dev/null 2>&1
then
    # Some drives require zeroing the data before
    # they can be overwritten.
    /bin/mt -f "/dev/nst0" rewind > /dev/null 2>&1
    /bin/dd if=/dev/zero of="/dev/nst0" \
        bs=32k count=1 > /dev/null 2>&1
    /bin/mt -f "/dev/nst0" rewind > /dev/null 2>&1
else
    echo "Backup aborted: No tape loaded"
    exit 1
fi

# Do backup
/bin/tar --create --verbose --preserve \
    --ignore-failed-read --file=/dev/nst0 / \
    > /backup/filelist.txt

# Add completion date to filelist
echo "Backup complete on " `date` \
    >> /backup/filelist.txt

/bin/mt -f "/dev/nst0" rewind
/bin/mt -f "/dev/nst0" eject
The "mt" (magnetic tape) command is used to control SCSI tape drives.
A couple of things in the script are worth noting. First, it dumps all
MySQL databases to a text file. Although MyISAM format databases in
MySQL can be restored directly from tape, it is safer to restore the
databases from a dump, because the table files may be changing while tar
copies them.
The script also saves the list of files backed up to a log file
(/backup/filelist.txt). When tar runs, it sends the file names it is
copying to standard out and the script redirects them to the log file.
You need the file list to do an efficient bare metal restore from tape
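(more on that later). Notice the --preserve option in the tar command.
GNU tar documents it as shorthand for --preserve-permissions plus
--same-order. Tar records file permissions in the archive either way, so
the option matters most at restore time.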
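The file list mechanism can be tried without a tape drive by pointing
tar at an ordinary file. This is a rough sketch that assumes GNU tar,
which prints the verbose listing on standard output when the archive
goes to --file. The name backup_with_filelist and its arguments are
hypothetical.

```shell
#!/bin/sh
# Archive a directory to a file and log the member names, the same
# way the backup script does with the tape device.
# Usage: backup_with_filelist SOURCE_DIR ARCHIVE FILELIST
backup_with_filelist() {
    src=$1
    archive=$2
    filelist=$3
    # With --file set, --verbose writes one member name per line to
    # standard output, which is redirected into the file list.
    tar --create --verbose --file="$archive" -C "$src" . \
        > "$filelist" || return 1
    # Append a completion stamp, as the backup script does
    echo "Backup complete on $(date)" >> "$filelist"
}
```

Keep the archive and the file list outside the source directory, or tar
will try to back up its own output.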
The backup and restore scripts shown are useful when backing up to a
single tape drive. Do with them what you will. As always, no technical
support is provided.
Just Like Starting Over
Since the system would not boot, the first step was to get at least a
minimal system running so I could run a restore. I dug out my RHEL CDs
and started the install process. There is a handy option during the
install, at the bottom of the package selection screen, called "Minimal
Install". I chose that option to save time during the install and
because I planned to overwrite the system later when I restored it from
tape.
One tricky point about a full restore is that the directories need to be
restored in the order they exist on the tape. The tape drive is a
serial device that can't read backward. So, if the first file you
restore is halfway through the tape, you can't go back and restore
something from the first part of the tape without rewinding it and
running another restore. That's why the filelist.txt log file is so
important. Of course, the filelist.txt file was destroyed with the rest
of the system so that is the first file that had to be restored. The
backup tape from the previous night was still on site (our off-site
rotations happen once a week). Once I restored the filelist.txt file, I
browsed through the list to determine the order that the directories
were written to the tape. Then, I placed that list in the restore
script below.
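Browsing the list by eye works, but the directory order can also be
pulled out mechanically. This is a small sketch, assuming each entry in
filelist.txt starts with "/" (as it does when tar is asked to back up
"/") and that directories show up with at least one more "/" after their
name. The function name top_level_order is made up.

```shell
#!/bin/sh
# Print top-level directories in the order they first appear in a
# tar file list. Usage: top_level_order FILELIST
top_level_order() {
    # Keep only lines that look like absolute paths, which skips the
    # "Backup complete on ..." stamp at the end of the list.
    grep '^/' "$1" |
    # Split on "/": field 1 is empty, field 2 is the top-level name.
    # NF >= 3 means something follows the name, so it is a directory.
    awk -F/ 'NF >= 3 && $2 != "" && !seen[$2]++ { print $2 }'
}

# Usage:
#   top_level_order /backup/filelist.txt
```

The output can be pasted straight into the tar command line of a restore
script.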
Here is the restore script:
#!/bin/sh
# Restore everything
# This script restores all system files from tape.

# Initialize the tape drive
if /bin/mt -f "/dev/nst0" tell > /dev/null 2>&1
then
    # Rewind before restore
    /bin/mt -f "/dev/nst0" rewind > /dev/null 2>&1
else
    echo "Restore aborted: No tape loaded"
    exit 1
fi

# Do restore
# The directory order must match the order on the tape.
# Member names on the tape are relative (tar strips the
# leading "/"), so extract from the root directory.
cd / || exit 1
/bin/tar --extract --verbose --preserve \
    --file=/dev/nst0 var etc root usr lib boot \
    bin home sbin backup
# note: in many cases, these directories don't need
# to be restored: initrd opt misc tmp mnt

# Rewind tape when done
/bin/mt -f "/dev/nst0" rewind
The list of directories to restore is passed as parameters to tar in
the script. Just as in the backup script, it is important to use the
--preserve switch so that file permissions are restored to the way they
were before the backup. I could have just restored the / directory, but
there were a couple of directories I wanted to exclude so I decided to
be explicit about what to restore. If you want to use this script for
your own restores, be sure the list of directories matches the order
they were backed up on your system.
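A selective restore is worth rehearsing against an archive file and a
scratch directory before trying it on a live system. The sketch below
assumes GNU tar. The helper restore_dirs and its arguments are
hypothetical, and it uses --preserve-permissions, the permission half of
--preserve.

```shell
#!/bin/sh
# Extract only the named top-level directories from an archive into
# a target directory. Anything not named stays in the archive.
# Usage: restore_dirs ARCHIVE TARGET_DIR DIR...
restore_dirs() {
    archive=$1
    target=$2
    shift 2
    mkdir -p "$target" || return 1
    # -C switches to the target before extracting. List the
    # directories in the same order they appear in the archive.
    tar --extract --preserve-permissions --file="$archive" \
        -C "$target" "$@"
}

# Usage (scratch paths are placeholders):
#   restore_dirs /tmp/backup.tar /tmp/restore-test var etc
```

Leaving a directory off the argument list is all it takes to exclude it.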
Although it is listed in the restore script, I removed the /boot
directory from my restore. The reason is that I suspected my file
system problem was related to a kernel upgrade I had done three days
earlier. By not restoring the /boot directory, the system would
continue to use the stock kernel that shipped on the CDs until I
upgraded it. I also wanted to exclude the /tmp directory and a few
other directories that I knew were not important.
The restore ran for a long time, but uneventfully. Finally, I rebooted
the system, reloaded the MySQL databases from the dumps, and the system
was fully restored and working perfectly. Just over four hours elapsed
from total meltdown to complete restore. I believe I could trim at
least an hour off that time if I had to do it a second time.
Postmortem
I filed a bug report with Red Hat
Bugzilla, but I could only provide log files from the day before the
crash. All core files and logs from the day of the crash were lost when
I tried to repair the file system. I exchanged posts with a Red Hat
engineer, but we were not able to precisely nail down the cause. I
suspect the problem was either in the RAID driver code or ext3 code. I
should note that the server is a relatively new HP ProLiant server with
an Intel hyperthreaded Pentium 4 processor. Because the Linux kernel
sees a hyperthreaded processor as a dual processor, I was using an SMP
kernel when the problem arose. I reasoned that I might squeeze a few
percentage points of performance out of the SMP kernel. This bug may
only manifest when running on a hyperthreaded processor in SMP mode. I
don't have a spare server to try to recreate it.
After the restore, I went back to the uniprocessor kernel and have not
yet patched it back up to its previous level. Happily, the ext3
error has not returned. I scan the logs every day, but it has been well
over a month since the restore and there are still no signs of trouble.
I am looking forward to my next full restore, probably some time in 2013.

This work is licensed under a
Creative Commons Attribution-NonCommercial 2.5 License.