Incremental backups with rsync
A cron job on a server rsyncs all files from different locations to a single folder. Then that folder gets versioned and backed up again, so that we can go back in time. Optionally, everything can be encrypted and sent to cloud storage for additional safety.
So how does it work? First of all, some configuration. Let's make a backup_config.sh:
#!/bin/bash
BACKUPDISK=/home/backupdisk
#the backup folder
BACKUPDIR=$BACKUPDISK/backup
#folders by date and time
ARCHIVEDIR=$BACKUPDISK/archive
RSYNC_LOCK_FILE=/tmp/rsync.lock
#this script will populate the archive
INCREMENTAL=/home/s2/bin/backup/incremental.sh
#how much the disk can be full before deleting old archives
MAX_PERCENT_USED=85
Then we need to actually back up some stuff to $BACKUPDIR. rsync_black.sh is the script that backs up my Windows desktop PC (its name is black). The script connects to the remote PC over SSH and backs up the .gnupg, .ssh, AppData/Roaming/copyq and AppData/Roaming/Mozilla/Firefox/Profiles folders to $BACKUPDIR/black/Users/s2/:
#!/bin/bash
. `dirname $0`/backup_config.sh
host=black
(
flock -xn 200
if [ $? != 0 ]; then exit 0; fi;
ping -c1 $host>/dev/null 2>&1
if [ $? != 0 ]; then exit 0; fi
mkdir -p $BACKUPDIR/black
mkdir -p $BACKUPDIR/black/Users/s2/ && /home/s2/bin/backup/rsync-wrapper.sh \
-R -aqz --numeric-ids --delete-after --ignore-missing-args --exclude '*.lock' \
-e 'ssh' s2@$host:.gnupg \
:.ssh \
:AppData/Roaming/copyq \
:AppData/Roaming/Mozilla/Firefox/Profiles \
$BACKUPDIR/black/Users/s2/
$INCREMENTAL
) 200>$RSYNC_LOCK_FILE
The script uses rsync-wrapper.sh, because sometimes files on the running desktop vanish while being backed up, and we want to ignore this error. So we don't call rsync directly, but wrap it with rsync-wrapper.sh:
#!/usr/bin/env bash
REAL_RSYNC=/usr/bin/rsync
IGNOREEXIT=24
IGNOREOUT='^(file has vanished: |rsync warning: some files vanished before they could be transferred)'
# If someone installs this as "rsync", make sure we don't affect a server run.
for arg in "${@}"; do
if [[ "$arg" == --server ]]; then
exec $REAL_RSYNC "${@}"
exit $? # Not reached
fi
done
set -o pipefail
# This filters stderr without merging it with stdout:
{ $REAL_RSYNC "${@}" 2>&1 1>&3 3>&- | grep -E -v "$IGNOREOUT"; ret=${PIPESTATUS[0]}; } 3>&1 1>&2
if [[ $ret == $IGNOREEXIT ]]; then
ret=0
fi
exit $ret
Like rsync_black.sh above, we can create more scripts like that to back up other computers. For example, rsync_31337.it.sh backs up folders on a remote server:
#!/bin/bash
. `dirname $0`/backup_config.sh
(
flock -xn 200
if [ $? != 0 ]; then exit 0; fi;
mkdir -p $BACKUPDIR/31337.it
rsync -R -aqz --numeric-ids --delete-after -e 'ssh -p 22022' \
--exclude /home/n3wz/n3wz/run/solr/solr-5.3.1/server/solr/fcku/data \
root@vps.31337.it:/home/vmail \
:/home/s2 \
:/root \
:/home/n3wz \
:/etc/letsencrypt \
:/etc/postfix \
:/etc/postgresql \
:/etc/postgresql-common \
:/etc/apache2 \
:/etc/news \
:/etc/prosody \
:/etc/dovecot \
:/etc/default \
:/etc/dkimkeys \
:/etc/opendkim.conf \
:/etc/opendmarc.conf \
:/var/spool/cron/crontabs \
:/var/www \
$BACKUPDIR/31337.it/
$INCREMENTAL
) 200>$RSYNC_LOCK_FILE
Now we have all the folders from all our remote computers in $BACKUPDIR. We need to version them. $INCREMENTAL is responsible for that. This is incremental.sh:
#!/bin/bash
. `dirname $0`/backup_config.sh
CURBACKUPDATE=`date "+%F_%T"`
BACKUPSUBDIR=`date "+%Y/%m/%d"`
mkdir -p $ARCHIVEDIR/$BACKUPSUBDIR &&
nice -n 10 rsync -aq --inplace --numeric-ids \
--link-dest=$ARCHIVEDIR/backup_current \
$BACKUPDIR/ \
$ARCHIVEDIR/$BACKUPSUBDIR/backup_$CURBACKUPDATE &&
(cd $ARCHIVEDIR && rm backup_current && ln -sf $BACKUPSUBDIR/backup_$CURBACKUPDATE backup_current)
In $ARCHIVEDIR we create a lot of folders named with date and time, each containing our backup at a given point in time, so we can recover everything in case of data loss.
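For example, to restore a single folder from a given snapshot, we can rsync it back to the machine it came from. A minimal sketch, where the snapshot date and destination are only examples:
# restore the Firefox profiles from one snapshot back to black
# (example snapshot path; pick the date and time you need)
rsync -aqz --numeric-ids -e 'ssh' \
  /home/backupdisk/archive/2024/02/01/backup_2024-02-01_04:32:45/black/Users/s2/AppData/Roaming/Mozilla/Firefox/Profiles/ \
  s2@black:AppData/Roaming/Mozilla/Firefox/Profiles/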
When our backup disk reaches $MAX_PERCENT_USED, we need to clean up by deleting the oldest archives. To do that, we use remove_old.sh:
#!/bin/bash
. `dirname $0`/backup_config.sh
get_used_space() {
local inodes=`df -i $BACKUPDISK|grep 'dev'|awk '{print $5}'|sed 's/%//'`
local space=`df $BACKUPDISK|grep 'dev'|awk '{print $5}'|sed 's/%//'`
if [ $inodes -gt $space ]; then
used_space=$inodes
else
used_space=$space
fi
echo used space is $used_space
}
check_used_space() {
get_used_space
if [ $used_space -lt $MAX_PERCENT_USED ]; then
echo used space is less than $MAX_PERCENT_USED
echo removing empty directories
sudo find $ARCHIVEDIR/????/?? -depth -maxdepth 1 -empty -type d -exec rm -rf {} \;
sudo find $ARCHIVEDIR/???? -depth -maxdepth 2 -empty -type d -exec rm -rf {} \;
echo exiting
exit 0
fi
}
(
flock -xn 200
if [ $? != 0 ]; then exit 0; fi;
check_used_space
for stuff in `ls -tr1d $ARCHIVEDIR/????/??/??`; do
echo deleting $stuff
sudo rm -rf $stuff
check_used_space
done
) 200>$RSYNC_LOCK_FILE
With all these scripts in one folder, we have everything in place. I keep them in my home dir, in /home/s2/bin/backup:
/home/s2/bin/backup
|
├─ backup_config.sh
├─ incremental.sh
├─ remove_old.sh
├─ rsync_31337.it.sh
├─ rsync_black.sh
├─ rsync_home.sh
├─ rsync_laptop.sh
└─ rsync-wrapper.sh
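rsync_home.sh and rsync_laptop.sh follow the same pattern as rsync_black.sh. As a rough sketch of what rsync_home.sh could look like (the source folders below are placeholders, not the real ones):
#!/bin/bash
. `dirname $0`/backup_config.sh
(
flock -xn 200
if [ $? != 0 ]; then exit 0; fi;
# local folders, no ssh needed (placeholder paths)
mkdir -p $BACKUPDIR/home && rsync -R -aq --numeric-ids --delete-after \
/home/s2/Documents \
/home/s2/.config \
$BACKUPDIR/home/
$INCREMENTAL
) 200>$RSYNC_LOCK_FILE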
So, now if we run rsync_31337.it.sh, all the files from the computer 31337.it get copied over to our $BACKUPDIR, and incremental.sh will then create our backup folder in archive, representing our files at this specific point in time. $ARCHIVEDIR will look like this:
/home/backupdisk/archive/
├── 2023
│ ├── 04
│ │ ├── 23
│ │ │ ├── backup_2023-04-23_16:31:34
│ │ │ └── backup_2023-04-23_18:30:29
│ │ ├── 24
│ │ │ └── backup_2023-04-24_10:40:45
[...]
├── 2024
│ ├── 01
│ │ ├── 01
│ │ │ └── backup_2024-01-01_04:30:12
│ │ └── 29
│ │ └── backup_2024-01-29_04:32:43
│ └── 02
│ ├── 01
│ │ └── backup_2024-02-01_04:32:45
│ └── ...
└── backup_current -> 2024/02/27/backup_2024-02-27_04:30:19
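Thanks to --link-dest, files that did not change between runs are hard links into the previous snapshot, so each new dated folder takes up almost no extra space. We can check that by comparing inodes across two snapshots (the file below is just an example path):
# an unchanged file has the same inode and a link count > 1 in both snapshots
stat -c '%i %h %n' \
  /home/backupdisk/archive/2024/01/01/backup_2024-01-01_04:30:12/31337.it/etc/postfix/main.cf \
  /home/backupdisk/archive/2024/01/29/backup_2024-01-29_04:32:43/31337.it/etc/postfix/main.cf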
Now we just need to run them periodically with cron:
#backup
30 4 * * * sudo /home/s2/bin/backup/rsync_home.sh
30 */2 * * * sudo /home/s2/bin/backup/rsync_black.sh
45 */2 * * * sudo /home/s2/bin/backup/remove_old.sh >/dev/null
Done.
Optionally, we could compress and encrypt our $BACKUPDIR or $ARCHIVEDIR, and upload everything to some cloud storage for additional safety.
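A minimal sketch of that last step, using tar, gpg and rclone; the recipient key and the mycloud:backups remote are placeholders, not part of the actual setup:
#!/bin/bash
. `dirname $0`/backup_config.sh
# tar up the latest backup, encrypt it, and push it to a cloud remote
# (backup@example.org and mycloud:backups are placeholders)
TODAY=`date +%F`
tar -C $BACKUPDISK -czf - backup | \
  gpg --encrypt --recipient backup@example.org --output /tmp/backup_$TODAY.tar.gz.gpg
rclone copy /tmp/backup_$TODAY.tar.gz.gpg mycloud:backups/ && rm /tmp/backup_$TODAY.tar.gz.gpg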