Backup shell script - change and version info (also see similar comments at end of script)

Version 4.30, 1-Jan-2005:
Backup version 4.30 is a bugfix release for version 4.20. There are two known bugs in version 4.20:

(1) Diagnosed and reported by Scott Hamilton, who also submitted a working fix - thanks, Scott:
When performing compressed-mode backups via -g (gzip) or -b (bzip2) with backup 4.20, the archive files are created without the identifying .gz or .bz2 string in their names. This can cause havoc during a subsequent recovery if you aren't aware of it, since attempting to recover from such a file set "as is" can produce a garbage result.

A forward/backward compatible workaround, guaranteed to work with such mis-named archives on any flavour of Unix, is to specify -g or -b together with -r during recovery to force the correct decompression method. Versions 4.30 and later allow this.

If you do wish to stage a recovery from a disc set created by backup 4.20 and you're not entirely sure whether it was gzipped or bzipped on the day, use the "file" command to find out for certain - eg:
file <archive-chunk-name>
If the result of this command is something like
<archive-chunk-name>: gzip compressed data, from Unix
you'll need to do the recovery via backup -rg rather than just backup -r. Likewise, a bzip2 result calls for backup -rb.
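
The check above can also be scripted. The following is only a sketch: the function names recovery_flags and recovery_flags_for are made up for this illustration, and the -rb case is inferred from the -g/-b description above rather than copied from the script.

```shell
# Map a description line, as printed by "file", to the recovery flags.
recovery_flags() {
    case "$1" in
        *"gzip compressed data"*)  echo "-rg" ;;   # gzipped chunk
        *"bzip2 compressed data"*) echo "-rb" ;;   # bzipped chunk (assumed flag pairing)
        *)                         echo "-r"  ;;   # anything else: plain recovery
    esac
}

# Convenience wrapper: run "file" on one archive chunk and map its output.
recovery_flags_for() {
    recovery_flags "$(file "$1")"
}
```

Running recovery_flags_for on the first chunk of a set prints the flags to hand to backup for that set.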

You could alternatively, of course, copy the archive pieces onto a large hard drive, rename them and recover from that copy. Or you could even re-burn the renamed copies onto a new set of CDs. But I felt obliged to provide a "command line" fix at the very minimum.

Note that version 4.20 will have no problems in recovering from compressed archives created with other versions, as these will be named correctly. There are likewise no problems with recovery from uncompressed archives via version 4.20, but you should replace it anyway because of another bug:

(2) The other bug introduced in 4.20 causes it to ask for recovery CD/DVDs twice. The script reads disc 1 normally, but from disc 2 onwards it requests each disc twice. Just leaving the disc in the tray and hitting "Enter" will cause everything to proceed normally - but it's disconcerting and can potentially cause one or more skipped heartbeats plus a measurable increase in blood pressure, something we can all do without.

Background: This second bug came about via some restructuring of the 'archive name-checking' code after I discovered that it was possible to insert dodgy discs from a different set during a recovery and fool the script. For example, you could insert disc 1, but then give it disc 2 from another set. It would continue on as if nothing had happened, albeit with a "garbage" complaint in the error log. I did some restructuring to ensure that it couldn't be fooled like this - except that after moving the 'nextcd' function call into an inner 'while-loop' to provide that improved checking, I omitted to remove the original call from further back.
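
To make the double prompt concrete, here is a toy model of the corrected structure. Only the name 'nextcd' comes from the script; load_disc, the prompts counter and the idea of passing disc labels as arguments are stand-ins invented so that the sketch runs without a CD drive.

```shell
prompts=0

# Stand-in for the script's 'nextcd', which asks the user for the next disc.
# Here it only counts how often the user would have been prompted.
nextcd() {
    prompts=$((prompts + 1))
}

# load_disc EXPECTED LABEL...
# EXPECTED is the label the next disc must carry; each LABEL simulates one
# disc the user inserts. Returns 0 as soon as a matching disc turns up.
load_disc() {
    expected=$1
    shift
    while [ $# -gt 0 ]; do
        nextcd                              # the single call, inside the checking loop
        [ "$1" = "$expected" ] && return 0  # right disc: carry on with recovery
        shift                               # wrong disc: go round and ask again
    done
    return 1                                # ran out of simulated discs
}
```

In this model, leaving a second nextcd call ahead of the loop would push the prompt count one higher for every disc, which is the behaviour described above.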

How could such a snafu get past testing? I can only assume that it was late at night and I shrugged my shoulders and figured that I was having some read problems with my CD drive with the test set.

Re bug #1 - well, I must admit that I've never used compression for my own backups, purely in the interests of speed and CPU load. I do always do a test recovery off a compressed CD set before posting any new version of the script, but it looks like I'll also need to make sure I create a fresh compressed test archive set as well. No big deal - CDs are cheap enough these days (pity dual-layer DVDs still aren't).

Anyway, in a nutshell, even if you aren't using compressed backups, you should still replace version 4.20.

Nov 15, 2004:
Backup version 4.20. Two new parameters added by Bradley E. Maggard: -D <dir> now specifies the backup working directory, and -B <filename> specifies the backup-exclusions file. As is normal with all parameters for the backup script, both are optional - only use them if you need them. As Brad explained:

"I have 2 machines in our home office. I run Linux on mine and Samba to mount the other Windows machine in order to back it up. I run two sequential commands to backup each machine separately:

backup -D /var/backups/WinMachine -d /mnt/WinMachine -g -B bex.WinMachine
backup -D /var/backups/LinuxMachine -d / -g -B bex.LinuxMachine"

Thanks Brad. I've taken your modified version and slightly rearranged it to allow the exclusions file (via -B) to be specified relative to BKUP_DIR (if you give a simple file name), or absolute (if it includes a path component). But your idea was the main thing, and well worth including.
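
The lookup rule for -B can be sketched as below. The function name resolve_bex is invented for this example; only BKUP_DIR and the rule itself come from the description above, and the real script may well implement it differently.

```shell
# resolve_bex NAME: print the path at which the exclusions file is looked up.
resolve_bex() {
    case "$1" in
        */*) printf '%s\n' "$1" ;;                 # has a path component: use as given
        *)   printf '%s/%s\n' "$BKUP_DIR" "$1" ;;  # simple name: relative to BKUP_DIR
    esac
}
```

With BKUP_DIR set to /var/backups/LinuxMachine, resolve_bex bex.LinuxMachine prints /var/backups/LinuxMachine/bex.LinuxMachine, while a name containing a slash is printed unchanged.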

I've also added the -s (size) parameter to allow the chunk size to be set to suit DVDs - 4.7, 8.54, 9.4 and 17.08 GB. This also required upgrading 'fsplit' to version 2.2 to get past its 2GB file-size limit. It assumes your OS has 'big file' support, eg: elf2 or above for Linux. As always, see bottom of script for more details, or type backup -h for a quick usage summary.
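
For planning purposes, the number of discs a backup will need is just the archive size divided by the chunk size, rounded up. The helper below is a hypothetical illustration, not part of the script; it takes both sizes in whole megabytes, which is an assumption made here and says nothing about the argument format that -s itself expects.

```shell
# discs_needed ARCHIVE_MB CHUNK_MB: number of chunks (discs), rounding up.
discs_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}
```

For example, discs_needed 12000 4700 prints 3: two full 4.7 GB discs plus a partly filled third.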

Jan 29, 2003:
Backup version 4.0. Allow for compression via -g (gzip) or -b (bzip2) in the backup creation phase (recovery and the other modes are unchanged). Move the exceptions list out of the main script into a separate file (bex). Plus other minor improvements (see bottom of script for full details).

Sep 29, 2002:
Backup version 3.41. As suggested by Stephen Goodenough (UK), the eject command is too general. It should read "eject $DEV" to ensure that systems with >1 CD drive do the right thing when the script needs to change CDs. Thanks, Stephen.

Aug 11, 2002:
Backup version 3.40. Records brief run details in an accumulating log file - nominally "Log.backup", nominally in BKUP_DIR (and yeh, I know - too many different log files).

Aug 5, 2002:
Backup version 3.32. Panic not - I merely improved and/or corrected one or two comments in the script. No functionality change at all from version 3.31.

June 22, 2002:
Backup version 3.31. Recovery or archive list mode (-r/-t/-T) can now take one or more file parameters to override the local CD drive as the archive recovery source.

April 13, 2002:
Backup version 3.20. Absolute path exclusions in SED_ARGS beginning with ^/ were having no effect when backing up sub-trees using the new -d flag. Added a new constant ABS_REL as a separate sed RE and rearranged the code slightly to correct.
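
The underlying problem is easy to reproduce: when a sub-tree is backed up, the file list holds names relative to the sub-tree root, so a pattern anchored at ^/ never matches any of them. The sketch below is not the ABS_REL fix itself, whose details are not given here. It assumes the list consists of ./-prefixed names (as produced by running find from inside the tree) and uses a literal prefix comparison in awk instead of a sed RE. The function name exclude_abs is made up for the example.

```shell
# exclude_abs ROOT EXCLUDED_DIR
# stdin:  ./-prefixed names relative to ROOT, one per line
# stdout: the same names, minus EXCLUDED_DIR and everything beneath it
exclude_abs() {
    awk -v root="${1%/}" -v ex="$2" '
        {
            abs = root substr($0, 2)    # "./tmp/a" becomes ROOT "/tmp/a"
            if (abs == ex || index(abs, ex "/") == 1) next
            print                       # keep the original relative name
        }'
}
```

Because the comparison is done on the reconstructed absolute name, the same absolute exclusion works whether the tree root is / or a sub-tree such as /mnt/WinMachine.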

Rewrote the -p (pattern-matching) retrieval section on this web page to provide some concrete examples. Added some discussion as to the differences between the GNU cpio and other (commercial Unix) versions re pattern matching syntax.

April 5, 2002:
Backup vers 3.15. Added recovery functionality to the script (via -r). This uses a sneaky FIFO scheme as dreamt up by Jean-Francois Landry (Canada) that now permits recovery without the need for copying the CD chunks back to the hard drive first. (Type "backup -h" to see the expanded command set.)
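
The general idea of a FIFO-based recovery can be sketched as follows. This is not Jean-Francois's actual scheme, only a minimal model of it: the function name recover_via_fifo and the READER variable are invented here, the chunks are read from ordinary files rather than from a sequence of CDs, and the default cpio options (-i extract, -d create directories, -m keep modification times) are an assumption chosen for illustration.

```shell
# recover_via_fifo FIFO CHUNK...
# Feeds the chunks, in order, through a named pipe to a single reader, so the
# reader sees one continuous archive and nothing is reassembled on disk.
recover_via_fifo() {
    fifo=$1
    shift
    mkfifo "$fifo" || return 1
    cat "$@" > "$fifo" &                  # writer: stream the chunks in order
    ${READER:-cpio -i -d -m} < "$fifo"    # reader: unpack straight from the pipe
    status=$?
    wait                                  # let the background writer finish
    rm -f "$fifo"
    return "$status"
}
```

Setting READER=cat turns the function into a simple concatenator, which is a handy way to check the plumbing before trusting it with a real archive.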

Update this page to reflect the changes; also archive the old page and provide a link to it from this page.

December 5, 2001:
Backup vers 2.6. Remove another bug in backup script introduced in version 2.5 (arrgghhhh ...) - a typo in FIND_ARGS was causing deltas to run as full backups. Again it's thanks to Justin Noack, JN Computer Care for quickly pointing this bug out. (And let's hope that this is the last change.)

November 21, 2001:
Backup vers 2.5. Remove a bug in backup script which affected a restore into an empty directory. The find command for archive creation was being used with an arg of -type f, and this was excluding directory information (UID, GID and permissions) from the archive. The result was that a restore into an empty directory would create all sub-directories with UID.GID = root.root and perms of 700. (File perms were okay.) Thanks to Justin Noack, JN Computer Care and Coloma Community Schools.
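
The effect of -type f on the file list can be seen with two small helpers. Both function names are invented for this illustration, and dropping the -type f test is only one plausible way of getting directories back into the list; the script's real find arguments are not reproduced here.

```shell
# File list as the buggy version would build it: regular files only, so no
# directory entry (and hence no directory owner or mode) reaches cpio.
list_files_only() {
    ( cd "$1" && find . -type f -print )
}

# File list without the -type f restriction: directories are named as well,
# so their ownership and permissions can be recorded in the archive.
list_everything() {
    ( cd "$1" && find . -print )
}
```

Comparing the two outputs on any tree with sub-directories shows the entries that were missing from the archives.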

October 8, 2001:
Add a new section to this web page called "Verifying your first run". Plus other minor tidy-ups to improve readability.

June 18, 2001:
Add a new section to this web page called "Making intermediate (delta) backups".

June 10, 2001:
Backup vers 2.4. Add '-H crc' flag to cpio command in script to checksum everything and to enable > 64K i-nodes. Update the recovery strategy sections of this page to overcome the 2GB file size limit of some systems (eg: Linux).

June 9, 2001:
Add some more 'exclusion' directories to script after installing Mandrake 7.2

March 11, 2001:
Updated fsplit to take stdin, so the backup script now pipes the output of cpio through fsplit without the need to generate an intermediate (full cpio archive) first. (And just try to do that with NT or Win2000 or whatever they call it this year :-)
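
The shape of that pipeline can be sketched with the standard split utility standing in for fsplit, since fsplit's own options are not shown on this page. The function name chunk_stream, the 650m size and the output prefix in the usage comment are all assumptions made for the example.

```shell
# chunk_stream SIZE PREFIX
# Reads an archive stream on stdin and writes it out as PREFIXaa, PREFIXab, ...
# in pieces of at most SIZE bytes, with no intermediate full-size archive.
chunk_stream() {
    split -b "$1" - "$2"
}

# Possible use, with cpio writing the archive to stdout:
#   find . -print | cpio -o -H crc | chunk_stream 650m /var/backups/chunk.
```

Concatenating the pieces in name order gives back the original stream.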