Smart rotating backup with rsync

I wrote a small script that backs up all my important files to a remote machine via rsync. Cron runs the script every day. During each run, yesterday's backup is moved into a folder named after yesterday's day of the week, and the current backup is stored in the folder Today. I therefore keep 8 snapshots: one for today and one for each of the last 7 days.

The script gzips all data with the --rsyncable option and then transfers it efficiently via rsync, through a secure SSH tunnel, to the remote machine. I use an rsync setting that not only transfers just the differences between today's and yesterday's backup, but also moves yesterday's backup into yesterday's snapshot folder, all in a single rsync command.
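The rotation itself can be sketched in a few lines. This is only an illustration, assuming GNU date (for the -d option); snapshot_dir_for is a hypothetical helper that is not part of my script.

```shell
#!/bin/bash
# Hypothetical helper: given a run date (YYYY-MM-DD), print the snapshot
# folder that receives the previous backup, i.e. yesterday's weekday name.
# Assumes GNU date; LC_ALL=C and TZ=UTC keep the output predictable.
snapshot_dir_for() {
    LC_ALL=C TZ=UTC date -d "$1 1 day ago" +%a
}

# Walk through one week of runs and show where the old content of Today lands.
for run_date in 2024-01-01 2024-01-02 2024-01-03 2024-01-04 \
                2024-01-05 2024-01-06 2024-01-07; do
    echo "run on $run_date: previous Today -> $(snapshot_dir_for "$run_date")"
done
```

Each weekday folder is overwritten once a week, which is why the scheme never holds more than seven dated snapshots plus Today.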

Preparation

The script is written in bash, so it starts with

# rsync.sh
#!/bin/bash

Then I define the local backup sources as a list of absolute paths to directories. Don't forget the trailing slash on each path.

# rsync.sh
LOCAL_SOURCES=('/root/' '/www/' '/home/' '/etc/nginx/' '/etc/php5/fpm/')
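If you want the script to catch a malformed entry early, a pre-flight check along these lines could help. It is a sketch only: check_sources is a hypothetical helper, and the sample list deliberately contains two bad entries.

```shell
#!/bin/bash
# Example list with two deliberate mistakes (relative path, missing slash).
LOCAL_SOURCES=('/root/' '/www/' 'home/' '/etc/nginx')

# Hypothetical helper: report entries that are not absolute paths or that
# lack the trailing slash. Returns non-zero if any entry is malformed.
check_sources() {
    local bad=0 source
    for source in "$@"; do
        if [[ $source != /* ]]; then
            echo "not an absolute path: $source" >&2
            bad=1
        elif [[ $source != */ ]]; then
            echo "missing trailing slash: $source" >&2
            bad=1
        fi
    done
    return "$bad"
}

if check_sources "${LOCAL_SOURCES[@]}"; then
    echo "all sources look fine"
else
    echo "please fix LOCAL_SOURCES before running the backup"
fi
```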

I also define a local directory where the backup is prepared. The directory keeps the data from the last run until it is overwritten by the next backup, so it can also be used for quick access to the backup data on the local machine.

# rsync.sh
LOCAL_PATH="/var/backups/backup_full"

I also dump the MySQL databases into the backup, so next come the database credentials

# rsync.sh
MYSQL_USER="******"
MYSQL_PASSWORD_FILE="$HOME/.passwd/db.pass"  # use $HOME: a ~ inside quotes is not expanded

The .pass file is a plaintext file that only its owner can read. I created it in my home folder, in a hidden .passwd directory. The commands for creating it could look like this

$ command line
cd ~                                     # go to my home folder
mkdir .passwd                            # create .passwd directory
cd .passwd                               # go to .passwd folder
touch db.pass                            # creates new db.pass
echo "my_secret_db_password" >> db.pass  # writes your password to file
chmod 400 db.pass                        # changes permissions
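With the commands above, the file exists for a moment with default permissions before chmod runs. A variant that sets a restrictive umask first could look like the sketch below. The temporary demo_home directory is only a stand-in for the home folder so the commands can be tried without touching real files; the variable names are my own.

```shell
#!/bin/bash
# Stand-in for the home folder; in practice this would be "$HOME".
demo_home=$(mktemp -d)
secret_dir="$demo_home/.passwd"
secret_file="$secret_dir/db.pass"

(
    umask 077                      # new files and dirs: owner access only
    mkdir -p "$secret_dir"         # created with mode 700
    printf '%s\n' "my_secret_db_password" > "$secret_file"   # mode 600
    chmod 400 "$secret_file"       # read-only, owner only
)

ls -ld "$secret_dir" "$secret_file"
# Remove the demo directory when done: rm -rf "$demo_home"
```

The subshell keeps the changed umask from leaking into the rest of the session.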

Then we have to define our remote machine. I connect via localhost because the remote server is reached through a secure SSH tunnel.

#rsync.sh
REMOTE_USER="******"
REMOTE_PASSWORD_FILE="/root/.passwd/remote.pass"
REMOTE_PATH="path_to_backup_dir_on_remote_without_leading_or_trailing_slash"
REMOTE_DESTINATION=`date +%a -d "yesterday"` # get yesterdays name of day of the week
RSYNC_SERVER="--password-file=$REMOTE_PASSWORD_FILE rsync://$REMOTE_USER@localhost/$REMOTE_USER/$REMOTE_PATH"
RSYNC_OPTIONS="--progress --force --ignore-errors --delete-excluded --delete --backup --backup-dir=/$REMOTE_PATH/$REMOTE_DESTINATION -a"

The most important setting is the last line, RSYNC_OPTIONS.

  • The --delete and --delete-excluded options tell rsync to delete files on the remote if they were deleted locally (by default, rsync keeps files on the remote even after they are deleted locally).
  • The --backup and --backup-dir options make rsync move the previous (yesterday's) version of each file it replaces or deletes into a directory on the remote, in our case the directory named after yesterday's day of the week.
  • For more info see rsync documentation.
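Because RSYNC_SERVER and RSYNC_OPTIONS are plain strings, the script relies on the shell splitting them into words when they are used unquoted, which breaks as soon as a path contains a space. One alternative is to hold the arguments in bash arrays. The sketch below only prints the resulting command; backup_user and backups/server1 are placeholder values I made up, and RSYNC_AUTH and RSYNC_URL are names that do not appear in my script.

```shell
#!/bin/bash
REMOTE_USER="backup_user"                        # placeholder value
REMOTE_PASSWORD_FILE="/root/.passwd/remote.pass"
REMOTE_PATH="backups/server1"                    # placeholder value
REMOTE_DESTINATION="Mon"                         # the script derives this from date

RSYNC_AUTH=(--password-file="$REMOTE_PASSWORD_FILE")
RSYNC_URL="rsync://$REMOTE_USER@localhost/$REMOTE_USER/$REMOTE_PATH"
RSYNC_OPTIONS=(
    -a --progress --force --ignore-errors
    --delete --delete-excluded
    --backup --backup-dir="/$REMOTE_PATH/$REMOTE_DESTINATION"
)

# Print the command instead of running it; %q shows each argument quoted.
printf '%q ' rsync "${RSYNC_OPTIONS[@]}" "${RSYNC_AUTH[@]}" \
    /var/backups/backup_full/ "$RSYNC_URL/Today"
echo
```

Each array element stays a single argument no matter what characters it contains, so every expansion can be quoted.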

Processing

Now that all our inputs are defined, we can start wiring up the backup process.

First, the script checks whether our local working backup directory exists and, if not, creates it

# rsync.sh
[ -d "$LOCAL_PATH" ] || (echo "local backup directory does not exist, creating $LOCAL_PATH" && mkdir -p "$LOCAL_PATH")

Then I dump and gzip my database

# rsync.sh
archive="$LOCAL_PATH/mysql.sql.gz"
echo "dumping MySQL databases to $archive" && /usr/bin/mysqldump -u"$MYSQL_USER" -p"$(cat "$MYSQL_PASSWORD_FILE")" --all-databases | gzip --rsyncable > "$archive"

The trick here is the --rsyncable option for gzip. It produces slightly larger archives, but they can be synced efficiently via rsync. Without this option, a small change in the input can alter most of the compressed output, so rsync finds few matching blocks between the old and new archive and ends up transferring nearly all the data, which wastes the whole advantage of rsync.
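Not every gzip build accepts --rsyncable, so a cautious script could probe for it first and fall back to plain gzip. A minimal sketch, with GZIP_FLAGS being a name I made up for this example:

```shell
#!/bin/bash
# Probe whether the installed gzip accepts --rsyncable; fall back to
# plain gzip if it does not.
GZIP_FLAGS=()
if printf 'probe' | gzip --rsyncable > /dev/null 2>&1; then
    GZIP_FLAGS+=(--rsyncable)
fi

# Compress a small sample and check that it decompresses back intact.
sample="some data that would normally come from tar or mysqldump"
archive=$(mktemp)
printf '%s\n' "$sample" | gzip "${GZIP_FLAGS[@]}" > "$archive"
restored=$(gunzip -c "$archive")

echo "gzip flags used: ${GZIP_FLAGS[*]:-none}"
[ "$restored" = "$sample" ] && echo "round trip ok"
```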

Next I gzip all the defined backup sources into the working directory

# rsync.sh
for source in "${LOCAL_SOURCES[@]}"; do
    [[ $source =~ ([^\/]*)\/?$ ]] # use a regular expression to extract the last directory name from the path; it becomes the archive name
    archive="$LOCAL_PATH/${BASH_REMATCH[1]}.tar.gz"
    echo "compress $source to $archive" && tar -c "$source" | gzip --rsyncable > "$archive"
done
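The same archive name can be derived without a regular expression, using bash parameter expansion. This is just an alternative sketch; archive_name_for is a hypothetical helper, not something my script defines.

```shell
#!/bin/bash
# Hypothetical helper: derive the archive name from the last path component.
archive_name_for() {
    local path=${1%/}            # drop one trailing slash, if present
    echo "${path##*/}.tar.gz"    # keep what follows the last remaining slash
}

for source in '/root/' '/www/' '/home/' '/etc/nginx/' '/etc/php5/fpm/'; do
    echo "$source -> $(archive_name_for "$source")"
done
```

Whichever way the name is extracted, two sources that share their last directory name (say /etc/nginx/ and /opt/nginx/) would map to the same archive, and the later one would overwrite the earlier one.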

Now all our backup sources are gzipped and gathered in the $LOCAL_PATH directory. We just transfer the files to the remote and we are done.

First, add some paths to the PATH environment variable to simplify the following commands

# rsync.sh
export PATH=$PATH:/bin:/usr/bin:/usr/local/bin

Then we need a little hack. Suppose it's Tuesday. Because we have not run the backup yet, the Today directory still contains the data from Monday. Rsync will move that data from Today to the Mon directory, but the Mon directory still holds last week's snapshot, so we need to empty it first. To do that via rsync, we create a new, empty directory in our home folder and sync it to the Mon snapshot directory. Thanks to the --delete option, the Mon directory is fully synced with the empty directory and is therefore emptied.

# rsync.sh
# the following lines clear last week's snapshot
EMPTY_DIR="__tempemptydir"
[ -d $HOME/$EMPTY_DIR ] || mkdir $HOME/$EMPTY_DIR
rsync --delete -a $HOME/$EMPTY_DIR/ $RSYNC_SERVER/$REMOTE_DESTINATION
rmdir $HOME/$EMPTY_DIR

Note: the script creates and destroys __tempemptydir in my home folder. If you use, or plan to use, a directory with that name in your home folder, change the value of the EMPTY_DIR variable.
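A way to avoid the fixed directory name altogether is mktemp, which creates a uniquely named empty directory. The sketch below tries the trick on local stand-in folders, so no server is needed; all paths are illustrative, and the rsync step is skipped if rsync is not installed.

```shell
#!/bin/bash
# Local stand-in for the remote snapshot folder.
work=$(mktemp -d)
mkdir "$work/Mon"
echo "last week's data" > "$work/Mon/home.tar.gz"

# Unique, guaranteed-empty directory instead of a fixed name in $HOME.
empty_dir=$(mktemp -d "$work/empty.XXXXXX")

if command -v rsync > /dev/null 2>&1; then
    # Same idea as in the script: sync an empty dir over the old snapshot.
    rsync --delete -a "$empty_dir/" "$work/Mon/"
    remaining=$(ls -A "$work/Mon")
    echo "files left in Mon: ${remaining:-none}"
else
    echo "rsync not installed, skipping the demo"
fi

rm -rf "$work"    # removes the snapshot stand-in and the empty dir
```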

And these are the last and most important lines

# rsync.sh
# now the actual transfer
TODAY_DIR="Today"
rsync $RSYNC_OPTIONS $LOCAL_PATH/ $RSYNC_SERVER/$TODAY_DIR

All the details are stored in the variables explained earlier. If you run the script daily, the Today directory on the remote still contains the previous backup, i.e. yesterday's. During the transfer, rsync moves every file it is about to replace or delete into the remote directory named after yesterday's day of the week, and then updates Today to match the current local backup. Since rsync sends only the differences between yesterday's and today's files, very little data travels over the network.
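The effect of --backup and --backup-dir can be tried locally before pointing the script at a real server. The following sketch uses throwaway local folders in place of the remote; the file names and contents are made up, and the rsync step is skipped if rsync is not installed.

```shell
#!/bin/bash
work=$(mktemp -d)
mkdir "$work/local" "$work/Today"

# State before Tuesday's run: Today still holds Monday's files.
echo "monday"   > "$work/Today/home.tar.gz"
echo "obsolete" > "$work/Today/old.tar.gz"
# Fresh local backup prepared on Tuesday (old.tar.gz no longer exists).
echo "tuesday"  > "$work/local/home.tar.gz"

if command -v rsync > /dev/null 2>&1; then
    rsync -a --delete --backup --backup-dir="$work/Mon" \
        "$work/local/" "$work/Today/"
    today_content=$(cat "$work/Today/home.tar.gz")
    mon_content=$(cat "$work/Mon/home.tar.gz")
    mon_files=$(ls "$work/Mon")
    echo "Today/home.tar.gz: $today_content"
    echo "Mon/home.tar.gz:   $mon_content"
    echo "files in Mon:" $mon_files
fi

rm -rf "$work"
```

After the run, Today should hold Tuesday's archive, while Mon should hold both Monday's version of the replaced file and the file that was deleted locally.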

Before you run the script, make sure the remote folder for backups, which I define in the REMOTE_PATH variable, already exists. Rsync creates only a single level of directories, so it will create Today as well as the Mon to Sun directories for you, but if you set the remote path to backup and there is no backup directory on the remote, the script fails.

Complete script

Here is my complete script

#!/bin/bash

#**************************************
# Config local
#
LOCAL_SOURCES=('/root/' '/www/' '/home/' '/etc/nginx/' '/etc/php5/fpm/')
LOCAL_PATH="/var/backups/backup_full"
#**************************************
# Config MySQL Database
#
MYSQL_USER="******"
MYSQL_PASSWORD_FILE="$HOME/.passwd/db.pass"
#**************************************
# Config remote
#
REMOTE_USER="******"
REMOTE_PASSWORD_FILE="/root/.passwd/wdisk.pass"
REMOTE_PATH="path_to_backup_dir_on_remote"
REMOTE_DESTINATION=`date +%a -d "yesterday"`
RSYNC_SERVER="--password-file=$REMOTE_PASSWORD_FILE rsync://$REMOTE_USER@localhost/$REMOTE_USER/$REMOTE_PATH"
RSYNC_OPTIONS="--progress --force --ignore-errors --delete-excluded --delete --backup --backup-dir=/$REMOTE_PATH/$REMOTE_DESTINATION -a"

[ -d "$LOCAL_PATH" ] || (echo "local backup directory does not exist, creating $LOCAL_PATH" && mkdir -p "$LOCAL_PATH")

archive="$LOCAL_PATH/mysql.sql.gz"
echo "dumping MySQL databases to $archive" && /usr/bin/mysqldump -u"$MYSQL_USER" -p"$(cat "$MYSQL_PASSWORD_FILE")" --all-databases | gzip --rsyncable > "$archive"

for source in "${LOCAL_SOURCES[@]}"; do
    [[ $source =~ ([^\/]*)\/?$ ]]
    archive="$LOCAL_PATH/${BASH_REMATCH[1]}.tar.gz"
    echo "compress $source to $archive" && tar -c "$source" | gzip --rsyncable > "$archive"
done

export PATH=$PATH:/bin:/usr/bin:/usr/local/bin

# the following lines clear last week's snapshot
EMPTY_DIR="__tempemptydir"
[ -d $HOME/$EMPTY_DIR ] || mkdir $HOME/$EMPTY_DIR
rsync --delete -a $HOME/$EMPTY_DIR/ $RSYNC_SERVER/$REMOTE_DESTINATION
rmdir $HOME/$EMPTY_DIR

# now the actual transfer
TODAY_DIR="Today"
rsync $RSYNC_OPTIONS $LOCAL_PATH/ $RSYNC_SERVER/$TODAY_DIR

And this is my cron setting

$ command line
crontab -e

# daily rsync backup at 03:17 AM
17 03 * * * /bin/bash /root/rsync.sh

Put it to good use.