Backup script for block devices


piknew


Hi, if anybody would like to use it - please see the attached script for backing up block devices. I used it when doing the backups for these topics: http://forum.armbian.com/index.php/topic/2073-orange-pi-plus-2e-upgrade-to-520/ and http://forum.armbian.com/index.php/topic/1681-migrate-from-orange-pi-plus-2-to-orange-pi-plus-2e/

 

Requirements:

  • dd, gzip, ssh client (these should be available by default on every system)
  • lsblk, mount (should be available on every system)
  • the pv tool; you may install it with "apt-get install pv"

The script identifies the list of available block devices. A device is offered only if it is a block device with at least one partition and nothing on it is mounted at the time the script runs.
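
Roughly, the detection could work like this (a sketch of the idea only, not the script's actual code; '-e 1,7' excludes ram and loop devices):

for dev in $(lsblk -dno NAME -e 1,7); do                        # disks only, no partitions
    parts=$(lsblk -no TYPE "/dev/$dev" | grep -c part)          # how many partitions?
    mounted=$(lsblk -no MOUNTPOINT "/dev/$dev" | grep -cv '^$') # how many mounted filesystems?
    [ "$parts" -ge 1 ] && [ "$mounted" -eq 0 ] && echo "/dev/$dev"
done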

 

When the user chooses the block device to back up, the script asks for some parameters of the receiving system (you may keep the defaults or enter your own; the defaults can be changed in the related conf file).

 

So the script reads the stream of data with dd, passes it to pv to display progress, pipes it to gzip and finally pipes it into an ssh session as specified by the parameters.
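
The whole chain boils down to something like this (a minimal sketch; device, user, host and target path are placeholders - the script takes the real values from its parameters/conf file):

dd if=/dev/mmcblk0 bs=1M | pv -s "$(blockdev --getsize64 /dev/mmcblk0)" | gzip -c | \
    ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \
    user@backuphost "cat > /data/backup-image/backup.img.gz"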

 

Remark: even though ssh is used to transfer the stream of data, there is NO SECURITY in terms of host verification, because of these options:

 

-o StrictHostKeyChecking=no

-o UserKnownHostsFile=/dev/null

 

I am wondering whether this one is unnecessary (I haven't tested it; my tests were executed only with private/public key authentication):

-o PasswordAuthentication=no

 

The default archive name contains the date and the processor serial id, or the hostname if the former is not available. I have tested this script only on Orange Pi Plus 2, Orange Pi PC and Orange Pi Plus 2E.
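
The name could be built roughly like this (a sketch of the naming scheme, not the script's actual code; on ARM boards the serial comes from /proc/cpuinfo):

serial=$(awk '/^Serial/ {print $3}' /proc/cpuinfo)                  # e.g. 1435503102ce64000028
name="${serial:-$(hostname)}_$(date +%Y-%m-%d)_$(date +%s).img.gz"  # hostname as fallback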

 

Please feel free to use it.

 

backup_block_device.tar.gz

V1 - initial version.

backup_block_device-v1.tar.gz

 

V2:

  • Added choice of compression methods: none, gzip, bzip2, xz
  • An NFO file is created together with the image.

backup_block_device-v2.tar.gz

 

Script is also available here:

https://code.online-mp.pl/svn/public/backup_block_device/trunk/



Just as a recommendation: Using SSH might create a bottleneck here since strong ciphers might get negotiated:

gzip -c | ssh -c arcfour

should work on/between all systems (edit: will work only on outdated systems -- see below). On a quad-core H3 maybe using '7za a -t7z -bd -m0=lzma2 -mx=3 -mfb=64 -md=32m -ms=on' is the better idea (since multithreaded, leading to better compression, therefore increasing network throughput) (edit: useless, since 7-zip acts multi-threaded only when the input is a file).

 

Edit: Instead of the BS assumptions I made above, some numbers below. I tested 7-zip, pbzip2 (multi-threaded even if data comes from stdin) and gzip. Most important lesson: it depends on how full the device or the partitions on it are, since the 3 tools perform differently when they have to compress real data vs. 'unused space' (zeroed out in the best case). I tested on an OPi Plus 2E running kernel 4.8 with fixed cpufreq settings (1296 MHz). The numbers are: throughput with real data, then throughput with 'zeroed out' device contents (I used 'dd if=/dev/mmcblk2 bs=10M | pv -s 15267840K' to watch what's happening):

7zr a -bd -t7z -m0=lzma -mx=1 -mfb=64 -md=32m -ms=on: 0.x /  9 MB/s
gzip -c:                                              2.x / 17 MB/s
pbzip2 -1 -c:                                         4.x / 40 MB/s

7-zip and gzip are single-threaded when data is supplied via stdin and max out one CPU core; pbzip2 is multi-threaded and maxes out all CPU cores. Because of the latter, if compression happens on the H3 device I would strongly recommend pbzip2 with the fastest compression setting. But even then, data that can be read from eMMC at 80 MB/s (78 MB/s measured here) trickles out at just 4.x MB/s because compression becomes the bottleneck - so compressing on the H3 device itself only seems useful when the device is almost empty.

 

Possible conclusions:

  • on Fast Ethernet equipped systems 'pbzip2 -1' on the H3 board might show the best results
  • on GbE systems (when the target is also GbE capable and pretty fast) sending uncompressed data through the network and compressing on the target seems like a good idea
  • the pbzip2 numbers above are the result of sending compressed data to /dev/null. In case 'expensive' encryption is also used (SSH) the numbers will be lower (not the case with 7-zip and gzip since they're single-threaded)
  • in any case it helps compression a lot if empty space is zeroed out beforehand, especially after intensive usage. With mounted rootfs e.g. 'dd if=/dev/zero of=$HOME/zeroes bs=10M || rm $HOME/zeroes'

I guess it is not so compatible?

admin@PKSERVER:~$ ssh -c arcfour localhost
no matching cipher found: client arcfour server aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com,chacha20-poly1305@openssh.com
admin@PKSERVER:~$

I think it would be better to leave the "negotiation" alone (some algorithms should match).

 


I think it would be better to leave the "negotiation" alone (some algorithms should match).

 

Yes, you're right. Here are both the reason and some interesting numbers: https://blog.famzah.net/2015/06/26/openssh-ciphers-performance-benchmark-update-2015/ (we do a lot of server migrations, and when transferring 40 TB of data between Xeon systems arcfour vs. 'default' often made a difference of more than a day; I just now realize how outdated the software versions at our customers' locations are :) ).

 

Anyway: by using more efficient compression the size of the data stream can be reduced, so in case encryption is the bottleneck (most probably not on Fast Ethernet devices) throughput will increase (and that applies to Fast Ethernet devices too, since there the network is the bottleneck). To free the H3 from encryption entirely the 'none' cipher could be used - but not with any distro packages anymore - so piping through netcat would be an alternative, as sketched below.
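
A netcat variant could look like this (a sketch; host, port and device are placeholders, and depending on the netcat flavour the listener is started with 'nc -l -p 3333' or 'nc -l 3333'):

nc -l -p 3333 > backup.img.gz                            # on the receiving host
dd if=/dev/mmcblk0 bs=1M | gzip -c | nc backuphost 3333  # on the board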


Well ...

 

- you don't provide a usage manual for your script.

- raw archives need a full understanding of partitions and filesystems!

- when you copy and compress a fs, you lose some crucial information: the size of the partition needed to restore it. You need to store that somehow, because otherwise you will end up with archives you no longer know the purpose of, and that require long and tedious operations to restore or even just identify.

- it is always much more difficult to restore and manage archives than to make them ...


- when you copy and compress a fs

 

Huh? The script acts on devices instead, and only allows choosing those that are not currently mounted (very good). At least that's what I got from looking through it. But I agree, a self-contained short description/documentation would be great.


Well ...

 

- you don't provide a usage manual for your script.

- raw archives need a full understanding of partitions and filesystems!

- when you copy and compress a fs, you lose some crucial information: the size of the partition needed to restore it. You need to store that somehow, because otherwise you will end up with archives you no longer know the purpose of, and that require long and tedious operations to restore or even just identify.

- it is always much more difficult to restore and manage archives than to make them ...

 

1. Right. But I think it is quite simple to use. I guess an additional README file should be enough...

2. Right. But I guess when you back up a block device you know what you are doing. However, a restore script would be quite useful here (see also point 4).

3. Hmm, maybe like this? The file "1435503102ce64000028_2016-09-19_1474302693.img.gz" is the backup I created last time.

root@PKSERVER:/data/backup-image# (gunzip -c 1435503102ce64000028_2016-09-19_1474302693.img.gz 2>/dev/null | dd bs=512 skip=0 count=1 of=temp) && fdisk -lu temp && rm temp
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.00436383 s, 117 kB/s

Disk temp: 512 B, 512 bytes, 1 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x576a94d9

Device Boot Start      End  Sectors  Size Id Type
temp1        2048 30230303 30228256 14.4G 83 Linux

root@PKSERVER:/data/backup-image#

4. Actually it is very easy - the same pipeline the other way round: ssh, gunzip and dd...
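
Something along these lines (a sketch; user, host and target device are placeholders, and the target device must of course be unmounted):

ssh user@backupserver "cat /data/backup-image/1435503102ce64000028_2016-09-19_1474302693.img.gz" | \
    gunzip -c | dd of=/dev/mmcblk0 bs=1M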


For "slices and overlays" I have no experience (so you may elaborate). What I can guess that overlay file must exist on mounted rw block device (or any other rw device/network).

 

The script has a very precise purpose - to create an image of an unmounted block device (a disk with partitions). Here is an example showing that loop devices with partitions will not be offered for backup:

root@PKSERVER:/data/temp/device# ll
total 7696
-rw-r--r-- 1 root root 268435456 Sep 21 11:08 block_device.img
root@PKSERVER:/data/temp/device# losetup -f -P block_device.img
root@PKSERVER:/data/temp/device# lsblk
NAME         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda            8:0    1 115.5G  0 disk
└─sda1         8:1    1 115.5G  0 part /data
loop0          7:0    0   256M  0 loop
└─loop0p1    259:0    0 252.9M  0 loop
mmcblk0boot0 179:16   0     4M  1 disk
mmcblk0boot1 179:32   0     4M  1 disk
mmcblk0      179:0    0  14.6G  0 disk
└─mmcblk0p1  179:1    0  14.4G  0 part /
root@PKSERVER:/data/temp/device# backup_block_device.sh
List of block devices:
NAME         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda            8:0    1 115.5G  0 disk
└─sda1         8:1    1 115.5G  0 part /data
loop0          7:0    0   256M  0 loop
└─loop0p1    259:0    0 252.9M  0 loop
mmcblk0boot0 179:16   0     4M  1 disk
mmcblk0boot1 179:32   0     4M  1 disk
mmcblk0      179:0    0  14.6G  0 disk
└─mmcblk0p1  179:1    0  14.4G  0 part /

No devices available!

root@PKSERVER:/data/temp/device# fdisk -lu /dev/loop0

Disk /dev/loop0: 256 MiB, 268435456 bytes, 524288 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x66d10c8d

Device       Boot Start    End Sectors   Size Id Type
/dev/loop0p1       2048 520000  517953 252.9M 83 Linux

root@PKSERVER:/data/temp/device#


Just a quick note: in the meantime I came to the conclusion that 7-zip is the wrong tool on any SBC, and that it depends both on the contents of the block device (real data vs. unused/zeroed space) and on whether the SBC and the backup host have GbE or not. Please see the updated post #2 above ('pbzip2 -1' might be useful, but not in every situation).


For "slices and overlays" I have no experience (so you may elaborate). What I can guess that overlay file must exist on mounted rw block device (or any other rw device/network).

 

The script has very precise purpose - create an image of unmounted block device (disk with partitions). Here is an example that loop devices with partitions will not be visible:

 

root@PKSERVER:/data/temp/device# ll
total 7696
-rw-r--r-- 1 root root 268435456 Sep 21 11:08 block_device.img
root@PKSERVER:/data/temp/device# losetup -f -P block_device.img
root@PKSERVER:/data/temp/device# lsblk
NAME         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda            8:0    1 115.5G  0 disk
└─sda1         8:1    1 115.5G  0 part /data
loop0          7:0    0   256M  0 loop
└─loop0p1    259:0    0 252.9M  0 loop
mmcblk0boot0 179:16   0     4M  1 disk
mmcblk0boot1 179:32   0     4M  1 disk
mmcblk0      179:0    0  14.6G  0 disk
└─mmcblk0p1  179:1    0  14.4G  0 part /
root@PKSERVER:/data/temp/device# backup_block_device.sh
List of block devices:
NAME         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda            8:0    1 115.5G  0 disk
└─sda1         8:1    1 115.5G  0 part /data
loop0          7:0    0   256M  0 loop
└─loop0p1    259:0    0 252.9M  0 loop
mmcblk0boot0 179:16   0     4M  1 disk
mmcblk0boot1 179:32   0     4M  1 disk
mmcblk0      179:0    0  14.6G  0 disk
└─mmcblk0p1  179:1    0  14.4G  0 part /

No devices available!

root@PKSERVER:/data/temp/device# fdisk -lu /dev/loop0

Disk /dev/loop0: 256 MiB, 268435456 bytes, 524288 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x66d10c8d

Device       Boot Start    End Sectors   Size Id Type
/dev/loop0p1       2048 520000  517953 252.9M 83 Linux

root@PKSERVER:/data/temp/device#

You know, all I said is because I am a moron: although I have professional experience on the matter, I always make archives and then forget what they were for and whether it is useful to keep them.

 

I just had a quick look at your script and I don't want to say stupid things. AFAIK, a block device is just a pointer to a driver instance that **can** handle block operations on a raw file/device and cache blocks. It can as well point to an Oracle DB file, a filesystem, a full disk with boot block, partition table and partitions, a part of a RAID/LVM system, a filesystem that is used to compose another filesystem, or your /etc/passwd copied to a partition. I don't even know what lsblk would report in the last case, but I would guess "part/data".

 

Nevertheless, you will certainly know what you used your disks or partitions for. But if you are just half as forgetful as I am, you won't remember it next year. So I just recommend saving some information when you create big archives that will encumber your disks and may need very different handling to restore. This is practical experience speaking: I lost (or may have lost - I don't even know) data because I had not documented well the structure and drives used in a RAID/LVM installation some years ago. Physical archives are more difficult to manage than logical ones (files, or db exports).

 

BTW, in 1435503102ce64000028_2016-09-19_1474302693, what is 1474302693? Is that the compressed size (which you can already tell from the file size)? You can of course retrieve the partition size by decompressing through wc, but there is useful information that you can only store by entering it through the keyboard.

Link to comment
Share on other sites

BTW, in 1435503102ce64000028_2016-09-19_1474302693, what is 1474302693? Is that the compressed size (which you can already tell from the file size)? You can of course retrieve the partition size by decompressing through wc, but there is useful information that you can only store by entering it through the keyboard.

 

This is the result of the command date +%s - so... seconds since 1970-01-01 00:00:00 UTC. I wanted to differentiate files even if I create a few backups during one day. In my case:

root@PKSERVER:~# date --date='@1474302693'
Mon Sep 19 18:31:33 CEST 2016

The original compression method is gzip (deflate) - so based on the resulting file it is quite difficult to determine the size of the "file inside". Currently you must fully gunzip the archive to get the size of the data. I will follow tkaiser's advice and maybe modify the script to use other compression methods.
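
For the record: 'gzip -l' does print an uncompressed size, but the gzip header stores that value modulo 4 GiB, so for images of this size the only reliable way really is to decompress and count, e.g.:

gunzip -c 1435503102ce64000028_2016-09-19_1474302693.img.gz | wc -c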

 

I acknowledge your point (but I still think that for SBCs there is not much point talking about RAID/LVM etc. - if you have a configuration like that then you should definitely use another, dedicated backup method, not a simple interactive script).

 

Basically it would be quite simple to create an additional .nfo file with some other details that can be gathered during the process and may be helpful when e.g. mounting the image file later or when trying to restore the image onto another block device.


...

if you have a configuration like that then you should definitely use another, dedicated backup method, not a simple interactive script).

...

 

In fact, I now avoid complicated configurations. I am an adept of the KISS philosophy (keep it simple, stupid).

 

BTW (@tkaiser), what about compressed filesystems for archives? Compression, reliability, performance? (Not for physical archives but for rsync copies.)


Edit: Added pigz numbers for .gz and .zip

 

BTW (@tkaiser), what about compressed filesystems for archives? Compression, reliability, performance? (Not for physical archives but for rsync copies.)

 

One approach is to simply pipe the whole device's contents through netcat to a beefy Linux/Solaris/FreeBSD host where btrfs/ZFS with maximum compression is the target. With GbE equipped boards you get maximum performance and maximum disk space savings. It's reliable too, since both filesystems do checksumming. In fact it's the best way to do it, especially when later used together with rsync or btrfs send/receive: http://forum.armbian.com/index.php/topic/1331-armbian-sd-card-backup/?p=10071
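
On the receiving side the target filesystem just needs compression enabled, e.g. (a sketch; device, mountpoint and dataset names are placeholders):

mount -o compress-force=zlib /dev/sdb1 /mnt/backup   # btrfs: transparent compression
zfs set compression=gzip-9 tank/backup               # ZFS equivalent on a dataset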

 

But in case you want to spare your eMMC and have only Fast Ethernet on your board (PC Plus for example), a small Armbian installation on an SD card using the USB mass storage gadget module (g_mass_storage) could be convenient. Shut down your device, insert the SD card, connect the Micro USB OTG port to a USB port of the host and then simply share the whole /dev/mmcblk1 device as a USB disk (instructions like here).
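
On the board this boils down to a single command (a sketch; the eMMC device node is an assumption, and ro=1 exports it read-only, which is exactly what you want for a backup):

modprobe g_mass_storage file=/dev/mmcblk1 removable=1 ro=1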

 

To finish the testing done along the way today: this is how the eMMC of my OPi Plus 2E looks (containing a legacy Armbian installation that is a few weeks old):

/dev/mmcblk2p1   15G  2.4G   11G  18% /mnt/emmc

So it's not even 20% used, and I never added/changed/deleted that much. I filled all the empty space with zeroes, which improved copying speeds just a little (that might look totally different if your filesystem has been quite busy: then doing a device backup in a situation as above, where 82% should be free, means dealing with probably 60% or even more 'real data' that is merely marked as unused by the filesystem in question).

 

Anyway: I dropped 7-zip now (way too heavy for these little gems), also had a look at lz4 (apt-get install liblz4-tool) and pigz (apt-get install pigz), and measured overall performance piping my eMMC through the different compressors to /dev/null:

gzip -c:            10.4 MB/s  1065 MB
pbzip2 -1 -c:       15.2 MB/s  1033 MB
lz4 - -z -c -9 -B4: 18.0 MB/s  1276 MB
pigz -c:            25.2 MB/s  1044 MB
pigz --zip -c:      25.2 MB/s  1044 MB

Nice: lz4 acts single-threaded, so in case you want to pipe this through the network using SSH there are still 3 CPU cores free. It's also the fastest, with speeds of up to 50 MB/s when only zeroes are compressed. Not so nice: inefficient compression/size. And pigz looks like the best all-rounder (way faster than pbzip2 while the size is almost the same).

 

So in case it's about a Fast Ethernet H3 device that should do device backups through the network, it absolutely depends on the ratio of 'real data' vs. empty/zeroed space. Too lazy to test this out now.

 

In case anyone here has an H3 device with an installation that is rather old and a filesystem that is or was filled above 90% --> you're the person who could provide numbers for that!


I will follow tkaiser's advice and maybe modify the script to use other compression methods.

 

Another suggestion is to rely on btrfs on the target host and better use keyless SSH authentication. Then you can do some funny things to save space - btrfs has some nice features - if you stop compressing on the H3 device altogether (which is the best variant on GbE equipped devices, as we've seen).

 

Simply use filesystem compression on the backup target (mount -o compress) and, before you create a new whole-device image, copy an already existing one with the special 'cp --reflink=always' trick.

 

Given you have one 1435503102ce64000028_2016-09-19_1474302693.img image (16 GB in size but, due to btrfs' transparent file compression, only needing the space really required after applying zlib compression - maybe 1.5 GB) and want to do another one 2 hours later:

cp --reflink=always 1435503102ce64000028_2016-09-19_1474302693.img 1435503102ce64000028_2016-09-19_1474309893.img

The cp takes 0.01 seconds; the new file looks as if it's 16 GB in size, but you still only occupy 1.5 GB on disk since both files share all their chunks. Say 100 MB have changed at the source in the meantime: when you now send the contents of the eMMC into 1435503102ce64000028_2016-09-19_1474309893.img (which takes some time, since the whole contents have to be transmitted through the network) only the changed chunks are written anew, and of course compressed on the fly. So if the new stuff compresses at a 2:1 ratio, after the whole operation you have just 50 MB less free space on your target filesystem while it looks like there are 32 GB of files lying around.

 

Using modern filesystems you can implement stuff like versioning or 'incremental' cloning at the filesystem level while dealing with uncompressed device images ready to be burned every time.

 

'Work smarter not harder' (won't happen anytime soon; everyone uses the tools he already used last century ;) )


A new version can be downloaded (please see the first post). I had some issues related to 7z compression, but I guess xz will behave the same (lzma).

 

Additionally, please see the results of a backup test with all compression methods. I executed the tests on an Orange Pi PC (100 Mbit Ethernet):

 

nfo.zip

logs.zip

 

Result files:

root@PKSERVER:/data/temp# ll 543550200b0e340000cd*
-rw-r--r-- 1 root root 15523119104 Sep 22 19:19 543550200b0e340000cd_2016-09-22_1474563274.img
-rw-r--r-- 1 root root        1678 Sep 22 19:19 543550200b0e340000cd_2016-09-22_1474563274.nfo
-rw-r--r-- 1 root root  1029735811 Sep 22 19:52 543550200b0e340000cd_2016-09-22_1474565396.img.gz
-rw-r--r-- 1 root root        1678 Sep 22 19:52 543550200b0e340000cd_2016-09-22_1474565396.nfo
-rw-r--r-- 1 root root   962251987 Sep 22 20:48 543550200b0e340000cd_2016-09-22_1474567109.img.bz2
-rw-r--r-- 1 root root        1678 Sep 22 20:48 543550200b0e340000cd_2016-09-22_1474567109.nfo

I guess the best choice is still gzip. If you select none, then the bottleneck is the network speed. For bzip2 it is somehow acceptable, but xz... forget it - the timing was unacceptable (and I interrupted this test):

 549MiB 0:16:18 [ 746KiB/s] [====>                                       ]  3% ETA 7:02:56

 

Using modern filesystems you can implement stuff like versioning or 'incremental' cloning at the filesystem level while dealing with uncompressed device images ready to be burned every time.

 

'Work smarter not harder' (won't happen anytime soon; everyone uses the tools he already used last century ;) )

 

'Work smarter not harder' is definitely the way to go. But automating and optimizing transport and compression of block devices solves what? Intelligent versioning/cloning of filesystems has been standard, proven Linux practice with rsync and ext3 filesystems for ages, and btrfs hopefully continues that tradition. The proof of any backup is not in backing up your data: your surprises start when trying to RESTORE. Zillions are wasted on backup concepts for lack of actual experience in restoring.


Automating and optimizing transport and compression of block devices solves what?

 

It was an interesting exercise (at least for me -- I made some experiments with lz4-tools and pbzip2 and tried to share the results here as usual).

 

There is a use case for stuff like that: efficient device cloning as a measure of precaution, or even to 'move the installation'. Of course this should become part of the main distro later, and stuff like device selection should not be the result of re-inventing the wheel but should use routines from nand_sata_install (which is known to work on every one of the 40 SBC we currently support). But that would require refactoring Armbian's routines (maybe a library that can be called by different tools then named nand_emmc_usb_sata_install, device_backup, installation_mover, whatever).

 

So please don't discourage people from making experiments like that, even if the unaltered results are of no use to you. Discussions like this are helpful IMO to gain some basic knowledge and to move in a direction where we might be able to provide a more generic backup/clone functionality for all our users.


So please don't discourage people from making experiments like that, even if the unaltered results are of no use to you. Discussions like this are helpful IMO to gain some basic knowledge and to move in a direction where we might be able to provide a more generic backup/clone functionality for all our users.

 

I love to encourage people making experiments and sharing their results. Basic knowledge is gained by learning and working with the basics. Users tend to trip over automated and undocumented functionality. Spending time getting familiar with Linux basics is needed anyway for Armbian users; the paradigm shift from Windows to Linux cannot be avoided by automating away the basics.

 

The clone functionality is provided by dd.

 

The restore functionality can be provided by a reasonably managed backup/restore system allowing access to unaltered, versioned images of filesystems. A versioned backup is made by rsync referencing an earlier backup; storage and runtime are minimized by using hardlinks. Need a file from three weeks ago? Go fetch it from the archive. Need to restore a complete system? Get it from the archive.
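
The classic pattern looks roughly like this (a sketch; paths, dates and host are placeholders - files unchanged since the previous snapshot become hardlinks into it and cost almost no additional space):

rsync -aH --delete --link-dest=/backup/2016-09-22 root@board:/ /backup/2016-09-23/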

 

The beauty of this approach? It is proven and simple, and has few dependencies on resources (any old or new, physical or virtual Linux system will do).


The source code is here (anonymous access to browse the code; I can grant rw access if needed - just send me a PM):

 

https://code.online-mp.pl/svn/public/

 

BTW & OT: additionally I put up the code of my own tool "backup". I designed and wrote it to support my requirements for backing up from a NAS to external drives. The process is stateful, e.g. sqlite is used to keep the internal state of the backup. I have been running it for more than a year (3 times a day).

[/share/USBDisk1/logs] # /share/MD0_DATA/backup/backup
backup, version: [0.1.0], build: [Sep 13 2016 10:33:55], usage:
backup [-h]        - display this help message
backup <options>   - parameters as specified below:
<-f config_file>   - (required, single)   - list of source <-> destination directories for synchronization
<-d database_path> - (required, single)   - path to database file (sqlite3) which will be used as information storage
[-l log_dir]       - (optional, single)   - path to directory where log file will be written (default is empty = log to stdout/stderr)
[-L log_level]     - (optional, single)   - logging level, one of: DEBUG, INFO, NOTICE, WARNING, ERROR, CRITICAL (default is NOTICE)
[-p pid_file]      - (optional, single)   - path to file which will be created with process identifier of backup operation
[-X exclude_dir]   - (optional, multiple) - name of directory to be skipped during synchronization
[-x exclude_file]  - (optional, multiple) - name of file to be skipped during synchronization
[-M check_meta]    - (optional, single)   - flag (bitmask) to specify additional checking for file comparison (1 - by mode, 2 - by owner's uid, 4 - by owner's gid; default is 0 = no check)
[-c cleanup_wait]  - (optional, single)   - determine after how many backup sessions files or directories will be deleted from destination (default is 0 = immediate)
[-T file_system]   - (optional, single)   - enable compatibility for specified file system (available options: NTFS)

And some logs... I am executing it from the NAS:

[/share/USBDisk1/logs] # head -30 backup.20160923.110005.13573.log
2016-09-23|11:00:11|554474|13573|NOTICE|Starting application [backup] (version: [0.1.0], build: [Sep 13 2016 10:33:55], PID = [13573])
2016-09-23|11:00:11|554901|13573|NOTICE|Using executable [/share/MD0_DATA/backup/backup]
2016-09-23|11:00:11|555059|13573|NOTICE|Using logger level [NOTICE] (2) and logger [/share/USBDisk1/logs/backup.20160923.110005.13573.log]
2016-09-23|11:00:11|556175|13573|NOTICE|Using PID file: [/share/USBDisk1/temp/backup.pid]
2016-09-23|11:00:11|556366|13573|NOTICE|Setting up of file and directory creation mask (umask) to [000], previous umask = [022]
2016-09-23|11:00:11|570426|13573|NOTICE|Using database (v3.14.2): [/share/USBDisk1/misc/backup.dat]
2016-09-23|11:00:11|570659|13573|NOTICE|Using configuration file: [/share/USBDisk1/conf/backup.cfg]
2016-09-23|11:00:11|570859|13573|NOTICE|Using destination directory create depth level: 2 [create]
2016-09-23|11:00:11|571050|13573|NOTICE|Excluding directory name: [.upload_cache]
2016-09-23|11:00:11|571187|13573|NOTICE|Excluding directory name: [.Qsync]
2016-09-23|11:00:11|571320|13573|NOTICE|Excluding directory name: [.@__thumb]
2016-09-23|11:00:11|571452|13573|NOTICE|Excluding directory name: [@Recycle]
2016-09-23|11:00:11|571581|13573|NOTICE|Excluding file name: [Thumbs.db]
2016-09-23|11:00:11|571708|13573|NOTICE|Excluding file name: [.bash_history]
2016-09-23|11:00:11|571851|13573|NOTICE|Using check meta flag [0] (check by mode = no, check by owner's uid = no, check by owner's gid = no)
2016-09-23|11:00:11|572016|13573|NOTICE|Using clean-up wait of value: 12 [postponed clean-up]
2016-09-23|11:00:11|572158|13573|NOTICE|Using compatibility for [NTFS] file system
2016-09-23|11:00:11|572342|13573|NOTICE|Using TEMP database directory: [/share/USBDisk1/misc]
2016-09-23|11:00:11|572556|13573|NOTICE|Successfully connected to TEMP database
2016-09-23|11:00:11|573079|13573|NOTICE|Successfully connected to [/share/USBDisk1/misc/backup.dat] database
2016-09-23|11:00:21|095328|13573|NOTICE|Using session identifier [1801]
2016-09-23|11:00:21|338993|13573|NOTICE|Synchronization of [/mnt/HDA_ROOT/.config] => [/share/USBDisk1/data/backup/scheduled/etc/config] is started
2016-09-23|11:00:21|405905|13573|NOTICE|Scanning: BASE = [/mnt/HDA_ROOT/.config], PATH = [.]
2016-09-23|11:00:21|453484|13573|NOTICE|Scanning: BASE = [/mnt/HDA_ROOT/.config], PATH = [license]
2016-09-23|11:00:21|464243|13573|NOTICE|Scanning: BASE = [/mnt/HDA_ROOT/.config], PATH = [license/.lc]
2016-09-23|11:00:21|485333|13573|NOTICE|Scanning: BASE = [/mnt/HDA_ROOT/.config], PATH = [license/.gnupg]
2016-09-23|11:00:21|487405|13573|NOTICE|Scanning: BASE = [/mnt/HDA_ROOT/.config], PATH = [license/.req]
2016-09-23|11:00:21|498122|13573|NOTICE|Scanning: BASE = [/mnt/HDA_ROOT/.config], PATH = [.hd_info]
2016-09-23|11:00:21|521208|13573|NOTICE|Scanning: BASE = [/mnt/HDA_ROOT/.config], PATH = [php.d]
2016-09-23|11:00:21|537634|13573|NOTICE|Scanning: BASE = [/mnt/HDA_ROOT/.config], PATH = [cloudconnector]


Did you see my first post? I am not trying to solve anything. I wanted to share my script for making backups of block devices - maybe somebody wants to use it (which I stated in my post). In my last post I shared another tool which I have been using to back up my NAS for many months without any problems (its purpose and way of working are totally different).

 

The discussion now goes far, far away from the main topic (and I see from your posts that it has started to focus on backup in general). I have use cases which my tools cover 100%:

1. Backup of a block device for easy migration and/or restore.

2. Backup executed at filesystem level, with some features that e.g. rsync does not have.

 

Again: I am not having any issues with these tools I have written and want to share with the community. Of course I am glad to hear about any feature that may be worth having, or even that there is some bug :mellow: .

 

 

@piknew

 

Could you please detail the use case you're trying to solve? From what I get from a quick glance, you're automating backup of NTFS directories.

 

No, NTFS is the destination. When you try to back up files from ext2/3/4 to NTFS you will encounter problems with e.g. NTFS streams and some other entities which are not allowed or have a different meaning on each filesystem.

 

So, I have an external USB HDD (2 TB) which I am using as a backup device for some important data from my NAS. I didn't want to format it (I mean the external drive) as ext2/3/4 because this way I can easily insert it into any Windows machine and immediately have access to any of the stored files.

Additionally I didn't want certain system files to be backed up (that is the reason the exclude options were implemented). I also wanted a "grace period" in case a user accidentally deletes a file from the NAS (the backup tool will delete the file as well, but only after the grace period defined by [-c cleanup_wait]). That's the short story...


Addendum: we can now also do eMMC backups from a Windows, Linux or OS X host simply by connecting the H3 device via Micro USB to a USB port of the host. The H3 device then acts as a card or eMMC reader: http://forum.armbian.com/index.php/topic/2125-armbian-for-orange-pi-does-not-boot/?p=16455

 

In my setup (OS X on a rather slow MacBook) I was limited to 14.7 MiB/s USB transfer speed (piping to /dev/null). I didn't care (maybe some tweaks are needed to get this faster with our new tool) and tested 4 compression tools:

7-zip      779 MB       20m40        12MiB/s
pbzip2     990 MB       16m59      14.6MiB/s
xz         765 MB       34m55       7.1MiB/s
zip       1066 MB       20m40        12MiB/s

The winner performance-wise is pbzip2 (not bzip2 - the 'p' is for 'parallel'; this is a special bzip2 version running multithreaded even when data comes in on stdin!); regarding size it's 7-zip (which also acts somewhat multithreaded even in stdin mode).

 

Details below:

 

 

bash-3.2# time dd if=/dev/rdisk3 bs=1m | pv -s 15267840K | pbzip2 -9 -b40qk -c >emmc.bz2
14910+0 records in
14910+0 records out
15634268160 bytes transferred in 1019.126745 secs (15340848 bytes/sec)
14.6GiB 0:16:59 [14.6MiB/s] [====================================================================================================>] 100%            

real	16m59.163s
user	9m12.303s
sys	0m50.981s

bash-3.2# du -sh emmc.bz2 
967M	emmc.bz2

bash-3.2# time dd if=/dev/rdisk3 bs=1m | pv -s 15267840K | 7za a -t7z -m0=lzma -mx=9 -mfb=64 -md=32m -ms=on -si emmc.7z
7-Zip (A) [64] 9.20  Copyright (c) 1999-2010 Igor Pavlov  2010-11-18
p7zip Version 9.20 (locale=utf8,Utf16=on,HugeFiles=on,4 CPUs)
Creating archive emmc.7z

Compressing  [Content]    0%   9MiB 0:00:01 [8.96MiB/s]
14910+0 records in
14910+0 records out
15634268160 bytes transferred in 1820.663478 secs (8587127 bytes/sec)
    0%14.6GiB 0:30:20 [8.19MiB/s] [====================================================================================================>] 100%            
      

Everything is Ok

real	30m20.741s
user	33m13.561s
sys	1m12.921s

bash-3.2# du -sh emmc.7z 
760M	emmc.7z

bash-3.2# time dd if=/dev/rdisk3 bs=1m | pv -s 15267840K | zip -9 >emmc.zip
  adding: -  12MiB 0:00:01 [  12MiB/s]
14910+0 records in
14910+0 records out
15634268160 bytes transferred in 1240.301981 secs (12605211 bytes/sec)
14.6GiB 0:20:40 [  12MiB/s] [====================================================================================================>] 100%            
 (deflated 93%)

real	20m40.314s
user	6m5.263s
sys	0m24.822s



bash-3.2# time dd if=/dev/rdisk3 bs=1m | pv -s 15267840K | xz -9 >emmc.xz
14910+0 records in
14910+0 records out
15634268160 bytes transferred in 2095.000087 secs (7462658 bytes/sec)
14.6GiB 0:34:55 [7.12MiB/s] [====================================================================================================>] 100%            

real	34m55.086s
user	22m32.800s
sys	0m36.705s

bash-3.2# ls -la emmc.*
-rw-r--r--  1 root  staff   797420766 27 Sep 22:34 emmc.7z
-rw-r--r--  1 root  staff  1013703474 27 Sep 21:54 emmc.bz2
-rw-r--r--  1 root  staff   783585672 27 Sep 23:32 emmc.xz
-rw-r--r--  1 root  staff  1091182330 27 Sep 22:55 emmc.zip 

 

 

 

Important note: the results are somewhat meaningless, since it has to be checked whether we can improve USB transfer speeds (the eMMC I tested gets close to 80 MB/s locally). If that's possible, the two tools that can compress multithreaded when data comes in on stdin might show way better performance numbers.


Can you please take a look (lines 106-148 of the script) and suggest whether everything is optimal (I see that in your tests you are using -9, which requires a lot of CPU)?

 

svn co https://code.online-mp.pl/svn/public/backup_block_device/trunk/backup_block_device

 

An additional question relates to 7z - I have 7z and 7za, but I noticed that sometimes you have been using 7zr. Why so?

admin@PKSERVER:~$ which 7z
/usr/bin/7z
admin@PKSERVER:~$ which 7za
/usr/bin/7za
admin@PKSERVER:~$ which 7zr
admin@PKSERVER:~$ ll /usr/bin/7z*
-rwxr-xr-x 1 root root 40 Jun  8 18:07 /usr/bin/7za
-rwxr-xr-x 1 root root 39 Jun  8 18:07 /usr/bin/7z
admin@PKSERVER:~$ apt list | grep 7z

WARNING: apt does not have a stable CLI interface yet. Use with caution in scripts.

p7zip/stable,stable 9.20.1~dfsg.1-4.1+deb8u2 armhf
p7zip-full/stable,stable,now 9.20.1~dfsg.1-4.1+deb8u2 armhf [installed]
p7zip-rar/stable 9.20.1~ds.1-3 armhf
admin@PKSERVER:~$

To clarify - are the command lines compatible between 7z, 7za & 7zr? And would it be (please note the final -so):

7z a -t7z -m0=lzma -mx=9 -mfb=64 -md=32m -ms=on -si -so

An additional question relates to 7z - I have 7z and 7za, but I noticed that sometimes you have been using 7zr. Why so?

 

For whatever reason there exist two packages on Debian/Ubuntu:

root@orangepiplus2e:~# dpkg-query -S /usr/bin/7z*
p7zip-full: /usr/bin/7z
p7zip-full: /usr/bin/7za
p7zip: /usr/bin/7zr

Regarding .7z both packages are identical; simply check the output of 'apt-cache show p7zip' and 'apt-cache show p7zip-full' to see the differences.

 

Please be aware that my last test was done on an i5 MacBook running OS X, accessing the eMMC through USB (the MacBook is not that strong, but has plenty of CPU power compared to any SBC, especially when it's about single-threaded performance). On any SBC, pigz, pbzip2 or lz4-tools are the only options if you care at all about speed.

 

The problem with most if not all efficient compressors is that they perform multi-threaded only if the input is a file or a list of files, and that they have severe problems when the input comes in on stdin (from dd, pv or whatever). It should also be noted that none of the 4 archives I created could be processed by Etcher in the next step. My current testing is meant as basic research to improve the situation with Etcher (getting support for .7z, or Armbian maybe switching to .xz in the future... and being able to do device backups with Etcher too).

 

Edit: I forgot to test through pigz on the H3 device before -- added numbers to post #14 above.

Link to comment
Share on other sites

Edit: I forgot to test through pigz on the H3 device before -- added numbers to post #14 above.

 

Thanks. pigz is much more efficient; it is using all cores:

 

[attached screenshot post-1705-0-36708200-1475088371_thumb.png: all CPU cores in use]

 

I have updated my script to use pigz (and pbzip2) if available (in preference to standard gzip and bzip2).

 

See the last commit (rev. 17): https://code.online-mp.pl/svn/public/backup_block_device/trunk/

 

Terms of Use - Privacy Policy - Guidelines