Thursday, December 31, 2009

Backing up like the cool kids (linux)

So you want to back up a bunch of files, or a few very large files, to DVD. Should be simple enough, right? Unless you have files larger than a single DVD, or you want some redundancy in your backup, i.e. if you lose a DVD you can still get all your data back.

The solution to this problem is fairly simple.

You will need the following...
1) Enough free disk space to hold all the data + 4GB. I recommend NOT using the disk you are backing up from. If you have eSATA attached externals like I do they work FANTASTICALLY!
2) mdadm
3) sha256sum [OPTIONAL]

Method:
To start out we need to make the block files that will be written to the DVDs. Normally you will see people making these with dd, and that is how we will be doing it too. Unlike most, however, we will make use of the seek option, which reduces the creation time from minutes to seconds.

However, before we start making images we need to determine how much space we need. For this example we will be backing up 20GB of data. This means we need 6 block files at 4GB each. The attentive among you will realize that this many block files is enough to hold 24GB of data. That is correct! However, one block's worth of that space is used for parity, which allows us to lose any single DVD and still get back all our DATA.
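If you want to work out the block count for a different amount of data, here is a minimal sketch of the arithmetic. The function name blocks_needed is made up for this post, and it ignores filesystem overhead, so leave yourself a little headroom.

```shell
# Hypothetical helper: number of block files for data_gb of data when each
# block file holds block_gb, plus one extra block's worth for parity.
blocks_needed() {
    data_gb=$1
    block_gb=$2
    # Integer ceiling of data_gb / block_gb, then add 1 for parity.
    echo $(( (data_gb + block_gb - 1) / block_gb + 1 ))
}

blocks_needed 20 4   # prints 6, matching the example above
blocks_needed 21 4   # prints 7, since 21GB does not fit in five data blocks
```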

To create the block files you will need to run the following commands
cd /my/backup/scratch/directory
dd if=/dev/zero bs=1024 seek=4194303 count=1 of=block0
dd if=/dev/zero bs=1024 seek=4194303 count=1 of=block1
dd if=/dev/zero bs=1024 seek=4194303 count=1 of=block2
dd if=/dev/zero bs=1024 seek=4194303 count=1 of=block3
dd if=/dev/zero bs=1024 seek=4194303 count=1 of=block4
dd if=/dev/zero bs=1024 seek=4194303 count=1 of=block5

What's going on here: We are using dd to create block files of 4GB in size. The seek option tells dd to skip that many blocks in the out file (of=block#) before writing, so the file is extended to just under 4GB and then count=1 blocks are written at the end. The benefit of using seek is that it takes much less time than writing out 4GB of zeros. The result is a sparse file, so the disk space is only really used up as the array fills with data.
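If typing the same dd line six times bothers you, the idea can be wrapped up roughly like this. It is only a sketch: make_block is a name I made up, and the size is given in KiB to match bs=1024.

```shell
# Hypothetical helper: create a sparse file of size_kib KiB by seeking to the
# last 1 KiB block and writing a single block of zeros there.
make_block() {
    name=$1
    size_kib=$2
    dd if=/dev/zero of="$name" bs=1024 seek=$((size_kib - 1)) count=1 2>/dev/null
}

# Size check for the values used above: (seek + count) * bs bytes.
echo $(( (4194303 + 1) * 1024 ))   # 4294967296 bytes, i.e. 4GB

# Usage (commented out so nothing is created by accident):
# for i in 0 1 2 3 4 5; do make_block "block$i" 4194304; done
```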

So now we have our block files, let's use 'em. We need to make Linux turn these block files into virtual disks. For this we will use losetup.
First we need to make sure that none of the loop0-5 devices are in use. To check if a loop device is currently in use run the command
sudo losetup /dev/loop#
replacing # with the number of the device (0-5).
Assuming none of the devices are in use we can run the following commands
sudo losetup /dev/loop0 block0
sudo losetup /dev/loop1 block1
sudo losetup /dev/loop2 block2
sudo losetup /dev/loop3 block3
sudo losetup /dev/loop4 block4
sudo losetup /dev/loop5 block5

Now that we have our virtual block devices, let's combine them into one large raid5 array so we can begin our backup. To do this we will employ the Linux software raid provided by mdadm.
Before you create an mdadm disk you should check and see which IDs are free. Most likely if you have never used mdadm before /dev/md0 will be free. But it's always good to make sure.
cat /proc/mdstat
will print out all the active mdadm disks. I have a raid1 running on /dev/md0 in my system so this outputs
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sdb1[0] sda1[1]
      488383936 blocks [2/2] [UU]
Therefore I can't use /dev/md0 and must use /dev/md1. On your system /dev/md0 is likely free, so we will continue assuming /dev/md0 is free.
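If you would rather not eyeball the output, something like the following can pick the first unused name for you. This is a rough sketch: first_free_md is a name I made up, and the upper limit of 32 is an arbitrary assumption.

```shell
# Hypothetical helper: read /proc/mdstat-style text on stdin and print the
# first /dev/mdN name (N = 0..31) that does not appear as an active array.
first_free_md() {
    # Collect the mdN names that start a line; empty if there are none.
    used=$(grep -o '^md[0-9][0-9]*' || true)
    n=0
    while [ "$n" -lt 32 ]; do
        if ! printf '%s\n' "$used" | grep -qx "md$n"; then
            echo "/dev/md$n"
            return 0
        fi
        n=$((n + 1))
    done
    return 1
}

# Usage (commented out since /proc/mdstat only exists with md support loaded):
# first_free_md < /proc/mdstat
```

Fed the mdstat output shown above, it should answer /dev/md1.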

So let's create that raid5 disk at /dev/md0
sudo mdadm --create /dev/md0 -n 6 -l 5 -x 0 -f --assume-clean /dev/loop[0-5]

What's going on here?:
--create - create a new disk at /dev/md0
-n 6 - six "physical disks" /dev/loop[0-5]
-l 5 - level 5 raid or raid5
-x 0 - no spare disks - by default mdadm will spare one of your disks. WE DON'T WANT THIS
-f - force - mdadm will bitch if you have -x 0 and don't specify force
--assume-clean - mdadm will attempt to sync the virtual disks and we don't want it to
/dev/loop[0-5] - expands to /dev/loop0 /dev/loop1 /dev/loop2 ...  /dev/loop5

This should create a new disk /dev/md0 and running cat /proc/mdstat should list it.

Now let's shove a file system on our new virtual array. I like XFS because it formats FAST.
sudo mkfs.xfs /dev/md0

Lastly we need to mount this disk somewhere so we can put files on it.
sudo mkdir /mnt/backup
sudo mount /dev/md0 /mnt/backup

FILE COPY TIME - go get some tea...
Copy your files to /mnt/backup

TIME PASSES!

Once our backup is complete we need to break apart our array so we can copy the parts to DVDs
Unmount the virtual disk
sudo umount /dev/md0
Stop the mdadm array
sudo mdadm --stop /dev/md0
Turn off the loop devices
sudo losetup -d /dev/loop[0-5]

Now if all went well we should have 6 files sitting around waiting to be backed up.
BUT WAIT.
[OPTIONAL STEP - this takes quite a bit of time]
If you're paranoid like me you will want some assurance later on that your files did not get corrupted when you go to restore them.
sha256sum block* > sha256sums
will generate sha256 sums for all the block files and dump them into the sha256sums file. COPY THIS FILE TO EVERY DVD YOU MAKE!
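It does not hurt to check the sums file right after making it, before anything goes near a DVD. A minimal sketch, assuming the block files are named block0 to block5 as created earlier (checksum_blocks is a made-up name):

```shell
# Hypothetical helper: write checksums for every block* file in a directory
# into a sha256sums file there, then verify them immediately.
checksum_blocks() {
    (
        cd "$1" || exit 1
        # --quiet suppresses the per-file OK lines; failures are still reported.
        sha256sum block* > sha256sums && sha256sum -c --quiet sha256sums
    )
}

# Usage (commented out; point it at your scratch directory):
# checksum_blocks /my/backup/scratch/directory
```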

OK! Now we have our block files and possibly a sha256sums file.
Start BURNIN!
Put a blank DVD into the drive and burn the first block file. If you made a sums file copy it as well.
Do this for each block file.

Once you have copied all the block files you can delete them.

YOU'RE DONE!

------------------- Restore --------------------------
OH GOD! [Insert catastrophe here] HAS OCCURRED! I need to restore my backup!
Easy as pie I say!
First get your backup DVDs
Copy the block files off of the DVDs onto a spare hard drive
If all went well you're lucky; if not, no worries, assuming you only lost one block file.
If you generated a sha256sums file copy that too and run sha256sum -c sha256sums
If any of the files are corrupt DELETE them. If more than one file is corrupt, or you lost a disk and also had a corrupt file, try recopying the block file. If it still fails you're likely hosed but you can try the following steps... maybe...
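To get just the names of the block files that are corrupt or missing, the check output can be filtered. A rough sketch (bad_blocks is a made-up name; it relies on GNU sha256sum printing "FAILED" next to bad or unreadable files):

```shell
# Hypothetical helper: print the names of files listed in sha256sums that
# fail verification or cannot be read, one per line.
bad_blocks() {
    (
        cd "$1" || exit 1
        # LC_ALL=C keeps the FAILED marker untranslated.
        { LC_ALL=C sha256sum -c sha256sums 2>/dev/null || true; } |
            awk -F': ' '/: FAILED/ { print $1 }'
    )
}

# Usage (commented out; point it at the directory holding the copied files):
# bad_blocks /my/restore/directory
```

If this prints more than one name you are in the "likely hosed" case.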

Now we need to reverse the process stated before with some modifications.
1) Create the loop devices - if any were lost omit the line to create its loop device
sudo losetup /dev/loop0 block0
sudo losetup /dev/loop1 block1
sudo losetup /dev/loop2 block2
sudo losetup /dev/loop3 block3
sudo losetup /dev/loop4 block4
sudo losetup /dev/loop5 block5

2) Assemble the array. Remember CHECK AND MAKE SURE that /dev/md0 is free and if not pick another number.
sudo mdadm --examine /dev/loop0
This will print a bunch of info. The bit you want is the UUID field. Copy the uuid of the disk.
sudo mdadm --assemble /dev/md0 --uuid=COPIED UUID

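Assuming you did not lose more than one disk, got the UUID correct, and md0 was free this should have created /dev/md0. If a block file is missing and the array gets assembled but not started, adding --run to the assemble command should start it with the missing disk.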

3) sudo mount /dev/md0 /mnt/backup

Yay! Now you can copy your backed up files.

TIME PASSES

Now that you have restored your files you can remove your copied block files - If you lost a block file / disk please check the section below this before running these commands!
sudo umount /dev/md0
sudo mdadm --stop /dev/md0
sudo losetup -d /dev/loop[0-5]
rm block*

And you're done!

--------- Even more! Oh no when doing a restore one of my disks / block files was damaged -------
Fear not fair citizen!
True, you DO have good reason to be worried, because if you lose one more disk all is lost! But we can regenerate the lost disk!
Let's assume we lost block4
1) create a new block4
dd if=/dev/zero bs=1024 seek=4194303 count=1 of=block4

2) make a loop device for it
sudo losetup /dev/loop4 block4

3) add it to the damaged md array
sudo mdadm --add /dev/md0 /dev/loop4

4) Wait for the array to resync - this will take quite some time!
You can check the progress by running
cat /proc/mdstat

5) Once the resync has completed you can disassemble the array (unmount, stop the array, and turn off the loop devices as before). You will need to copy the new block files and sum file (if you created one) to new DVDs. The regenerated block4 file WILL NOT WORK with the old block files. You need to make A COMPLETELY NEW set of DVDs.
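For the curious, the reason a resync can rebuild a lost block file at all is that raid5 parity is an XOR of the data. Here is a toy sketch with one made-up byte per disk. It ignores striping and how mdadm actually lays parity out across the disks, so treat it as an illustration only.

```shell
# One made-up byte from each of the five data blocks' worth of space.
d0=170; d1=204; d2=15; d3=51; d4=85

# The parity byte is the XOR of all the data bytes.
parity=$(( d0 ^ d1 ^ d2 ^ d3 ^ d4 ))

# Pretend d4 was lost: XOR the survivors with the parity to get it back.
recovered=$(( d0 ^ d1 ^ d2 ^ d3 ^ parity ))

echo "lost=$d4 recovered=$recovered"   # both values are 85
```

The same trick does not work with two bytes missing, which is why losing a second disk is fatal.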

Well that's it friends. If you have questions or comments please feel free to contact me or leave a comment... you know, for comments.
