SmartFile provides a backup client for Windows. However, if you have Linux servers, it is just as important to back them up as well. Since SmartFile provides FTP access to your space, this task can be easily accomplished with some tools you likely already have installed.
This article will detail the steps to perform a simple, safe, encrypted backup directly to the SmartFile servers. At the end of the article a script will be provided that you can simply install onto your system to perform nightly backups.
Tar takes its name from its historical purpose of creating Tape ARchives; backing up to a tape would usually involve using tar. Tar has many features, including compression, and is great for performing backups of Linux systems. Not only can it write to a tape, but also to a file on disk. Further, it can write the archive to stdout, so it can be fed into another program.
OpenSSL is an open source library and command line application that is capable of performing myriad encryption tasks. It is basically the swiss army knife of encryption for Linux systems. For our purposes, we will use it to encrypt our backup file before sending it to the FTP server. By default openssl will read input from stdin and output to stdout. This is perfect for our purposes.
cURL is a network client that is URL driven. It allows uploading to or downloading from FTP or HTTP servers. For us, the main feature cURL provides is the ability to stream data directly to a file on an FTP server. Let me explain: while most FTP clients will let you upload a file from your file system to an FTP server, this requires that the file already exist on your disk. What we want for our backup is a way to "stream" the backup file directly to the FTP server without touching the local disk. cURL provides this with the -T option. If - is passed as the file name to -T, the file data is read from stdin.
Now that we are familiar with the tools, let’s take a look at how we will use them all together. Linux allows multiple commands to be chained together by piping the output (stdout) of one command on to the input (stdin) of another command. The | or pipe character is used for this purpose. Thus at a high level, we will be doing the following.
tar | openssl | curl
Tar will create the backup of our system, openssl will then encrypt that backup and curl will transfer it to the FTP server, all without creating any temporary files that we would otherwise need to be cleaned up later.
All that remains is to determine what parameters each of the above commands needs to be given to get the behavior we want.
Tar – Parameters.
To create an archive, you use the c option. To compress the archive using Bzip2, you use the j option. Since we want to back up the entire system, our tar command thus far is.
tar cj /
By omitting any option to save the archive to disk, tar will by default output it to stdout. This allows us to pass the archive data to the next program in our chain without saving it to disk.
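Here is a small dry run of that idea, using a throwaway temporary directory rather than /. The file name and directory are just illustrative; the `f -` option makes the write-to-stdout behavior explicit, which is equivalent but works the same on every tar build.

```shell
# Build a tiny tree and tar it to stdout, piping the archive
# straight into a second tar that lists its contents.
tmp=$(mktemp -d)
echo 'hello' > "$tmp/hello.txt"

# c = create, j = bzip2 compression, f - = write the archive to stdout
listing=$(tar cjf - -C "$tmp" . | tar tjf -)

echo "$listing"
rm -rf "$tmp"
```

The second tar never sees a file on disk; it reads the archive entirely from the pipe, which is exactly how openssl will receive it in our chain.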
There are certain directories within your Linux system that should not be backed up. Some examples are:
/proc – The proc file system is provided by the Linux kernel and contains information about running programs.
/sys – The sys file system is provided by the kernel and contains information about hardware.
/dev – The dev file system consists of device nodes, which represent Linux device drivers.
Backing up the above directories would be folly, as they are provided by the kernel, and some of them (e.g. /dev/zero) are effectively infinite in size. So, the second set of parameters we will pass to tar will exclude these file systems.
tar cj / --exclude=/proc --exclude=/dev --exclude=/sys
You may also wish to exclude /mnt, as generally you will have other file systems mounted there. These may be remote file systems that are already being backed up via other means. Of course, /mnt may contain file systems that you wish to back up. Your system configuration will dictate your choice here.
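You can see --exclude at work on a throwaway tree instead of /. The directory names below simply mimic the real layout; note that the exclude pattern matches the member names as tar records them, so with -C and . they are relative.

```shell
# Demonstrate --exclude on a temporary tree instead of /.
tmp=$(mktemp -d)
mkdir -p "$tmp/etc" "$tmp/proc"
echo 'keep me' > "$tmp/etc/config"
echo 'skip me' > "$tmp/proc/status"

# Exclude the proc subtree, mirroring --exclude=/proc in the real command.
listing=$(tar cjf - -C "$tmp" --exclude='./proc' . | tar tjf -)

echo "$listing"
rm -rf "$tmp"
```

The listing contains ./etc/config but nothing under ./proc, because excluding a directory also excludes everything inside it.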
OpenSSL – Parameters.
We want openssl to perform encryption, so we pass it the enc command. Also, I have opted to use the aes-256 algorithm in CBC mode, so we must pass that as well. Finally, openssl requires a key to perform our encryption. This key will be derived from a passphrase; the derivation uses a salt value, so we provide that option too. We will store the passphrase in a file, so that openssl can retrieve it from there.
openssl enc -aes-256-cbc -salt -pass file:/etc/backup-key
And we can create the key by doing the following.
echo 'This is my backup key!' > /etc/backup-key
chmod 400 /etc/backup-key
Of course, you are well-advised to use something other than the example key above.
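To convince yourself the key file works, you can round-trip a small message through the same encrypt step and its matching decrypt. This sketch uses a throwaway key file in place of /etc/backup-key; the plaintext is arbitrary.

```shell
# Round-trip a message through openssl using a key stored in a file.
keyfile=$(mktemp)
echo 'This is my backup key!' > "$keyfile"
chmod 400 "$keyfile"

plaintext='backup test data'
decrypted=$(echo "$plaintext" \
  | openssl enc -aes-256-cbc -salt -pass "file:$keyfile" \
  | openssl enc -d -aes-256-cbc -pass "file:$keyfile")

echo "$decrypted"
rm -f "$keyfile"
```

If the output matches the input, the key file is being read correctly; a wrong or unreadable key file would make the decrypt step fail with a bad-decrypt error instead.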
cURL – Parameters.
Now, the final step in our backup procedure is to actually transfer the file to SmartFile. We will do this using cURL and the FTP protocol. cURL is driven by URLs, so we must provide one.
curl ftp://www.smartfile.com/backup/
This tells curl to connect to www.smartfile.com and move into the backup directory. However, if the backup directory does not exist, curl will fail. Thus we will ask curl to create it for us if it does not exist.
curl --ftp-create-dirs ftp://www.smartfile.com/backup/
Now, as I alluded to before, we want curl to upload the data that it receives from its stdin. This is achieved by using the -T option like so.
curl --ftp-create-dirs -T - ftp://www.smartfile.com/backup/
If we want to use SSL, there are a couple of other options to provide. I suggest skipping SSL if you are already encrypting the backup file. However, if you want to use SSL, you would use the following parameters.
curl --ftp-create-dirs --ftp-ssl --ftp-ssl-reqd --insecure -T - ftp://www.smartfile.com/backup/
We are almost done. The final bit of information that curl needs is a username and password. We could have provided them as part of the URL, but that would expose our credentials to anyone snooping on the machine while the backup is running. It is safer to place the credentials in a file and instruct curl to retrieve them from it. cURL is capable of doing this using a .netrc file. You can create the .netrc file like so.
echo "machine www.smartfile.com login <username> password <password>" > ~/.netrc
chmod 400 ~/.netrc
Of course, replace <username> and <password> with your username and password respectively. Now we instruct cURL to use our new .netrc file.
curl --ftp-create-dirs --ftp-ssl --ftp-ssl-reqd --insecure --netrc -T - ftp://www.smartfile.com/backup/
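Before wiring in the real curl command, you can dry-run the assembled tar | openssl chain locally. In this sketch a temporary directory stands in for /, a throwaway key file stands in for /etc/backup-key, and a plain file redirect stands in for the curl -T - upload; only those stand-ins differ from the real pipeline.

```shell
# End-to-end dry run of the backup pipeline.
src=$(mktemp -d)
keyfile=$(mktemp)
out=$(mktemp)
echo 'important data' > "$src/data.txt"
echo 'This is my backup key!' > "$keyfile"

# tar | openssl, with a file redirect standing in for curl -T -.
tar cjf - -C "$src" . \
  | openssl enc -aes-256-cbc -salt -pass "file:$keyfile" \
  > "$out"

# Verify the stream restores: decrypt and list the archive.
listing=$(openssl enc -d -aes-256-cbc -pass "file:$keyfile" -in "$out" | tar tjf -)
echo "$listing"
rm -rf "$src" "$keyfile" "$out"
```

In the real backup you would replace the final redirect with the curl invocation shown above; curl reads the encrypted stream from its stdin and nothing ever touches the local disk.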
Putting it all together.
Now that you understand the basic building blocks of our backup-to-FTP solution, please allow me to provide you with a working script. This script was written and tested on CentOS 5.4. Some of the utilities used are out of date; for example, the version of curl available from the CentOS repositories uses some deprecated options, so on other distributions you may need to adjust them. You will need to edit the configuration section of the script if you want to customize the behavior.
To install and use this backup script follow the steps below.
Download the script from the following location and ensure it is executable.
wget https://www.smartfile.com/downloads/smartfile-backup.sh -O /usr/local/bin/smartfile-backup.sh
chmod +x /usr/local/bin/smartfile-backup.sh
Customize the configuration section.
Create your key and .netrc files as directed above.
Finally, schedule it to run with cron. The example below will run at midnight every night.
0 0 * * * /usr/local/bin/smartfile-backup.sh
You can also run the script manually to ensure it works properly.
/bin/bash -x /usr/local/bin/smartfile-backup.sh
Restoring from a backup.
To restore the backup, or to retrieve individual files from it, you can follow the steps below.
Download the backup file.
Decrypt the backup file.
Use tar to extract what you need.
Download the backup file.
You can either use the SmartFile web interface or FTP to retrieve the file.
Decrypt the backup file.
You can use OpenSSL to decrypt the file. The following command line would do the trick.
openssl enc -d -aes-256-cbc -salt -pass pass:'This is my backup key!' -in full-2010-06-03.tar.bz2 -out full-2010-06-03.tar.bz2.dec
Use tar to extract what you need.
You can either extract the entire archive or just a portion of it. Below are commands to perform either task. For more information, read the tar man page.
tar xjf full-2010-06-03.tar.bz2.dec -C /tmp/restore
tar xjf full-2010-06-03.tar.bz2.dec -C /tmp/restore /path/to/file
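You can also skip the intermediate .dec file entirely by piping the decrypt step straight into tar, the mirror image of the backup pipeline. The sketch below builds a small encrypted sample archive first so it is self-contained; with a real backup you would point openssl at your downloaded backup file and key instead.

```shell
# Streaming restore: decrypt straight into tar, no intermediate file.
work=$(mktemp -d)
key="$work/key"
echo 'This is my backup key!' > "$key"
mkdir -p "$work/src" "$work/restore"
echo 'restored content' > "$work/src/file.txt"

# Make a small encrypted backup to restore from.
tar cjf - -C "$work/src" . \
  | openssl enc -aes-256-cbc -salt -pass "file:$key" > "$work/backup.enc"

# Decrypt and extract in one stream.
openssl enc -d -aes-256-cbc -pass "file:$key" -in "$work/backup.enc" \
  | tar xjf - -C "$work/restore"

restored=$(cat "$work/restore/file.txt")
echo "$restored"
rm -rf "$work"
```

This restores without ever writing the decrypted archive to disk, which is handy when the archive is large or the machine is short on free space.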
** Note **
You may receive the following warning during extraction:
bzip2: (stdin): trailing garbage after EOF ignored
This seems harmless; you can get rid of it by either writing the archive to disk before transfer or by using gzip instead of bzip2. The archive still decompresses fine, but tar apparently emits some additional padding when using bzip2 and writing to stdout. I personally still use bzip2 and stdout, as the advantages (better compression ratio, no temporary disk space required) outweigh the disadvantages.