piknew (Author) Posted September 21, 2016:
Hi, if anybody would like to use it, please see the attached script for backing up block devices. I used it for the backups in these topics: http://forum.armbian.com/index.php/topic/2073-orange-pi-plus-2e-upgrade-to-520/ and http://forum.armbian.com/index.php/topic/1681-migrate-from-orange-pi-plus-2-to-orange-pi-plus-2e/
Requirements:
- dd, gzip, ssh client (these should be available by default on every system)
- lsblk, mount (should be available on every system)
- the pv tool; you may install it with "apt-get install pv"
The script identifies the list of available block devices. The conditions are that a device must have at least one partition and must not be mounted at the moment of script execution. When the user chooses a block device to back up, the script asks for some parameters of the receiving system (you may accept the defaults or enter your own; the defaults can be changed in the related conf file). The script then reads the data stream with dd, passes it to pv to display progress, pipes it to gzip, and finally pipes it into an ssh session as specified by the parameters.
Remark: even though ssh is used to transfer the data stream, there is NO SECURITY related to it, because of these options: -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null
I am wondering whether this one is unnecessary (I haven't tested it; tests have been executed only with private/public key authentication): -o PasswordAuthentication=no
The default archive name contains the date and the processor serial id, or the hostname if the former is not available. I have tested this script only on Orange Pi Plus 2, Orange Pi PC and Orange Pi Plus 2E. Please feel free to use it.
Attachment: backup_block_device.tar.gz
V1 - initial version: backup_block_device-v1.tar.gz
V2 - added a choice of compression methods (none, gzip, bzip2, xz); an NFO file is created together with the image: backup_block_device-v2.tar.gz
The script is also available here: https://code.online-mp.pl/svn/public/backup_block_device/trunk/
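For orientation, the heart of the script boils down to a pipeline like the following. This is a minimal sketch, not the script's actual code; the device, host and destination path are placeholder assumptions:

#!/bin/bash
# Sketch of the dd | pv | gzip | ssh pipeline described above.
DEVICE=/dev/mmcblk0                        # unmounted block device to back up
TARGET=user@backuphost                     # receiving system
NAME="$(hostname)_$(date +%F_%s).img.gz"   # hostname/date/seconds naming scheme

dd if="$DEVICE" bs=1M \
  | pv -s "$(blockdev --getsize64 "$DEVICE")" \
  | gzip -c \
  | ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \
        "$TARGET" "cat > /data/backup-image/$NAME"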
tkaiser Posted September 21, 2016:
Just as a recommendation: using SSH might create a bottleneck here, since strong ciphers might get negotiated. My two original suggestions turned out to be wrong: 'gzip -c | ssh -c arcfour' will work only on outdated systems (see below), and using '7za a -t7z -bd -m0=lzma2 -mx=3 -mfb=64 -md=32m -ms=on' on a quad-core H3 (multithreaded, therefore better compression and higher network throughput) is useless, since 7-zip acts multi-threaded only when the input is a file.
Edit: instead of the BS assumptions I made, some numbers below. I tested 7-zip, pbzip2 (multi-threaded even if data comes from stdin) and gzip. Most important lesson: it depends on how full the device or the partitions on it are, since the 3 tools perform differently when compressing real data vs. 'unused space' (zeroed out in the best case). I tested on an OPi Plus 2E running kernel 4.8 with fixed cpufreq settings (1296 MHz). The numbers are throughput with real data, then throughput with 'zeroed out' device contents (I used 'dd if=/dev/mmcblk2 bs=10M | pv -s 15267840K' to watch what's happening):
7zr a -bd -t7z -m0=lzma -mx=1 -mfb=64 -md=32m -ms=on: 0.x / 9 MB/s
gzip -c: 2.x / 17 MB/s
pbzip2 -1 -c: 4.x / 40 MB/s
7-zip and gzip are single-threaded when data is supplied via stdin and max out one CPU core; pbzip2 is multi-threaded and maxes out all CPU cores. Because of the latter, if compression happens on the H3 device I would strongly recommend pbzip2 with the fastest compression setting. But since data that can be read from eMMC at 80 MB/s (78 MB/s measured here) then trickles out at just 4.x MB/s, compression becomes the bottleneck, so compressing on the H3 device itself seems useful only when the device is almost empty. Possible conclusions:
- on Fast Ethernet equipped systems, 'pbzip2 -1' on the H3 board might show the best results (see the sketch after this list)
- on GbE systems (when the target is also GbE capable and reasonably fast), sending uncompressed data through the network and compressing on the target seems like a good idea. The pbzip2 numbers above are the result of sending compressed data to /dev/null; if 'expensive' encryption (SSH) is also used, the numbers will be lower (not the case with 7-zip and gzip, since they're single-threaded)
- in any case it helps compression a lot if empty space is zeroed out beforehand, especially after intensive usage; with the rootfs mounted, e.g. 'dd if=/dev/zero of=$HOME/zeroes bs=10M || rm $HOME/zeroes'
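To make the Fast Ethernet recommendation concrete, here is a hedged sketch of the pbzip2 variant; the host and destination path are assumptions, not taken from the tests above:

# fastest measured variant when compressing on a Fast Ethernet H3 board:
# pbzip2 -1 uses all four cores at the fastest setting, -c writes to stdout
dd if=/dev/mmcblk2 bs=10M \
  | pbzip2 -1 -c \
  | ssh user@backuphost 'cat > /backup/emmc.img.bz2'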
piknew (Author) Posted September 21, 2016:
I guess it is not so compatible?
admin@PKSERVER:~$ ssh -c arcfour localhost
no matching cipher found: client arcfour server aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com,chacha20-poly1305@openssh.com
admin@PKSERVER:~$
I think it would be better to leave the "negotiation" alone (some algorithms should match).
tkaiser Posted September 21, 2016:
"I think it would be better to leave the 'negotiation' alone (some algorithms should match)."
Yes, you're right. Here are both the reason and some interesting numbers: https://blog.famzah.net/2015/06/26/openssh-ciphers-performance-benchmark-update-2015/ (we do a lot of server migrations, and when transferring 40 TB of data between Xeon systems, arcfour vs. 'default' often made a difference of more than a day; I just realized how outdated the software versions at our customers' locations are). Anyway: by using a more efficient compression the size of the data stream can be reduced, so if encryption is a bottleneck (most probably not on Fast Ethernet devices) throughput will increase (and that applies to Fast Ethernet devices too, since there the network is the bottleneck). To free the H3 from encryption overhead the 'none' cipher could be used, but not with any distro packages anymore, so piping through netcat would be an alternative too.
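A sketch of the netcat alternative (port, host and file names are assumptions, and option syntax differs between netcat variants):

# on the receiving host (traditional netcat; BSD netcat uses 'nc -l 2222'):
nc -l -p 2222 > backup.img.gz
# on the H3 board -- no encryption at all, so only use this on a trusted LAN:
dd if=/dev/mmcblk2 bs=1M | gzip -c | nc backuphost 2222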
arox Posted September 21, 2016:
Well...
- you don't provide a usage manual for your script.
- raw archives need a full understanding of partitions and filesystems!
- when you copy and compress a fs, you lose some crucial information: the size of the partition needed to restore it. You need to store that somehow, because otherwise you will end up with archives you no longer know the purpose of, and that require long and tedious operations to restore or even simply identify.
- it is always much more difficult to restore and manage archives than to make them...
tkaiser Posted September 21, 2016:
"- when you copy and compress a fs"
Huh? The script acts on devices instead, and only allows choosing those that are not currently mounted (very good). At least that's what I got when looking through it. But I agree, a self-contained short description/documentation would be great.
piknew (Author) Posted September 21, 2016:
(replying to arox's points above)
1. Right. But I think it is quite simple to use. I guess an additional README file should be enough...
2. Right. But I guess that when you back up a block device, you know what you are doing. However, a restore script would be quite good here (see also point 4).
3. Hmm, maybe like this? The file "1435503102ce64000028_2016-09-19_1474302693.img.gz" is the backup I created last time.
root@PKSERVER:/data/backup-image# (gunzip -c 1435503102ce64000028_2016-09-19_1474302693.img.gz 2>/dev/null | dd bs=512 skip=0 count=1 of=temp) && fdisk -lu temp && rm temp
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.00436383 s, 117 kB/s
Disk temp: 512 B, 512 bytes, 1 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x576a94d9
Device Boot Start End Sectors Size Id Type
temp1 2048 30230303 30228256 14.4G 83 Linux
root@PKSERVER:/data/backup-image#
4. Actually it is very easy. The same thing in reverse: ssh, gunzip and dd with a pipe...
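A hedged sketch of the restore path mentioned in point 4 (host, file name and target device are examples; the target device must be unmounted):

# pull the archive from the backup host, decompress and write it back:
ssh user@PKSERVER 'cat /data/backup-image/IMAGE.img.gz' \
  | gunzip -c \
  | dd of=/dev/mmcblk0 bs=1M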
arox Posted September 21, 2016:
Note that you can just as well be saving data that represents slices or overlays. If you don't save as much information as possible, you might as well delete any archive older than a fortnight...
piknew (Author) Posted September 21, 2016:
With "slices and overlays" I have no experience (so you may elaborate). What I can guess is that an overlay file must exist on a mounted rw block device (or any other rw device/network). The script has a very precise purpose: create an image of an unmounted block device (a disk with partitions). Here is an example showing that loop devices with partitions will not be offered:
root@PKSERVER:/data/temp/device# ll
total 7696
-rw-r--r-- 1 root root 268435456 Sep 21 11:08 block_device.img
root@PKSERVER:/data/temp/device# losetup -f -P block_device.img
root@PKSERVER:/data/temp/device# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 1 115.5G 0 disk
└─sda1 8:1 1 115.5G 0 part /data
loop0 7:0 0 256M 0 loop
└─loop0p1 259:0 0 252.9M 0 loop
mmcblk0boot0 179:16 0 4M 1 disk
mmcblk0boot1 179:32 0 4M 1 disk
mmcblk0 179:0 0 14.6G 0 disk
└─mmcblk0p1 179:1 0 14.4G 0 part /
root@PKSERVER:/data/temp/device# backup_block_device.sh
List of block devices:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 1 115.5G 0 disk
└─sda1 8:1 1 115.5G 0 part /data
loop0 7:0 0 256M 0 loop
└─loop0p1 259:0 0 252.9M 0 loop
mmcblk0boot0 179:16 0 4M 1 disk
mmcblk0boot1 179:32 0 4M 1 disk
mmcblk0 179:0 0 14.6G 0 disk
└─mmcblk0p1 179:1 0 14.4G 0 part /
No devices available!
root@PKSERVER:/data/temp/device# fdisk -lu /dev/loop0
Disk /dev/loop0: 256 MiB, 268435456 bytes, 524288 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x66d10c8d
Device Boot Start End Sectors Size Id Type
/dev/loop0p1 2048 520000 517953 252.9M 83 Linux
root@PKSERVER:/data/temp/device#
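For illustration, the selection rule described above (type 'disk', at least one partition, nothing mounted) could be expressed roughly like this; this is a sketch of the idea, not the script's actual code:

# list disks that have at least one partition and no mounted partition;
# loop devices and partitionless disks (e.g. mmcblkXboot0/1) drop out:
lsblk -rno NAME,TYPE,MOUNTPOINT | awk '
  $2 == "disk" { d = $1; parts[d] = 0; mounted[d] = 0 }
  $2 == "part" { parts[d]++; if ($3 != "") mounted[d] = 1 }
  END { for (d in parts) if (parts[d] > 0 && !mounted[d]) print "/dev/" d }'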
tkaiser Posted September 21, 2016:
Just a quick note: in the meantime I came to the conclusion that 7-zip is the wrong tool when used on any SBC, and that the choice depends both on the contents of the block device (real data vs. unused/zeroed space) and on whether the SBC and the backup host have GbE or not. Please see the updated post #2 above ('pbzip2 -1' might be useful, but not in every situation).
arox Posted September 21, 2016:
(quoting piknew's loop-device example above)
You know, all I said is because I am a moron, and although I have professional experience on the matter, I always make archives and then forget what they were for and whether it is useful to keep them. I only had a quick look at your script and I don't want to say stupid things. AFAIK, a block device is just a pointer to a driver instance that can handle block operations on a raw file/device and cache blocks. It can just as well point to an Oracle DB file, a filesystem, a full disk with boot block, partition table and partitions, part of a RAID/LVM system, a filesystem that is used to compose another filesystem, or your /etc/passwd copied to a partition. I don't even know what lsblk would report in the last case, but I would guess "part/data". Nevertheless, you will certainly know right now what you used your disks or partitions for. But if you are just half as forgetful as I am, you won't remember it next year. So I simply recommend saving information whenever you create big archives that will encumber your disks and may need very different handling to restore. It is only practical experience: I lost (or may have lost, I don't even know) data because I had not documented well the structure and drives used in a RAID/LVM installation some years ago. Physical archives are more difficult to manage than logical ones (files, or db exports).
BTW, in 1435503102ce64000028_2016-09-19_1474302693, what is 1474302693? Is that the compressed size (which you already know from the file size)? You can of course retrieve the partition size by decompressing through wc, but there is useful information that you can only store by entering it via the keyboard.
piknew (Author) Posted September 21, 2016:
"BTW, in 1435503102ce64000028_2016-09-19_1474302693, what is 1474302693?"
This is the result of the command 'date +%s', so: seconds since 1970-01-01 00:00:00 UTC. I wanted to differentiate the files even if I create a few backups during one day. In my case:
root@PKSERVER:~# date --date='@1474302693'
Mon Sep 19 18:31:33 CEST 2016
The original compression method is gzip (deflate), so it is quite difficult to determine the size of the "file inside" from the resulting file; currently you must fully gunzip the archive to get the size of the data. I will follow tkaiser's advice and maybe modify the script to use other compression methods. I acknowledge your point (but I still think that for SBCs there is not much point in covering RAIDs/LVMs etc.; if you have a configuration like that then you should definitely use another, dedicated backup method, not a simple interactive script). Basically, it would be quite simple to create an additional .nfo file with some other details that can be gathered during the process and may be helpful when e.g. mounting the image file later, or when trying to restore the image onto another block device.
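Such an .nfo file could be generated alongside the image roughly like this; the field selection here is an assumption for illustration, not necessarily the format the script ended up using ($DEVICE and $NAME as in the pipeline sketch in the first post):

{
  echo "created: $(date -R)"
  echo "device:  $DEVICE"
  echo "size:    $(blockdev --getsize64 "$DEVICE") bytes"
  fdisk -lu "$DEVICE"               # records the partition table for restore
} > "${NAME}.nfo"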
arox Posted September 21, 2016:
"... if you have a configuration like that then you should definitely use another, dedicated backup method, not a simple interactive script..."
In fact, I now avoid complicated configurations. I am an adept of the KISS philosophy (keep it simple, stupid). BTW (@tkaiser), what about compressed filesystems for archives? Compression, reliability, performance? (Not for physical archives but for rsync copies.)
tkaiser Posted September 21, 2016:
Edit: added pigz numbers for .gz and .zip.
"BTW (@tkaiser), what about compressed filesystems for archives? Compression, reliability, performance? (Not for physical archives but for rsync copies.)"
One approach is to simply pipe the whole device's contents through netcat to a beefy Linux/Solaris/FreeBSD host where btrfs/ZFS with maximum compression is the target. With GbE equipped boards you get both maximum performance and maximum disk space savings. It's reliable too, since both filesystems do checksumming. In fact it's the best way to do it, especially when later combined with rsync or btrfs send/receive: http://forum.armbian.com/index.php/topic/1331-armbian-sd-card-backup/?p=10071
But in case you want to back up your eMMC and have only Fast Ethernet on your board (PC Plus for example), a small Armbian installation on SD card using the USB mass storage gadget module (g_mass_storage) could be convenient (a sketch follows below this post): shut down your device, insert the SD card, connect the Micro USB OTG port to a USB port of the host and then simply share the whole /dev/mmcblk1 device as a USB disk (instructions like here).
To finish the testing I did along the way today: this is how the eMMC of my OPi Plus 2E looks (containing a legacy Armbian installation that is a few weeks old):
/dev/mmcblk2p1 15G 2.4G 11G 18% /mnt/emmc
So it's not even 20% used and I never added/changed/deleted that much. I filled all empty space with zeroes, which improved copying speeds just a little (that might look totally different if your filesystem is quite busy; doing a device backup in such a situation, where 82% should be free, can mean dealing with 60% or even more 'real data' that is merely marked as unused by the filesystem in question). Anyway: I dropped 7-zip now (way too heavy for these little gems), had a look at lz4 (apt-get install liblz4-tool) and pigz (apt-get install pigz), and measured overall performance piping my eMMC through the different compressors to /dev/null:
gzip -c: 10.4 MB/s, 1065 MB
pbzip2 -1 -c: 15.2 MB/s, 1033 MB
lz4 - -z -c -9 -B4: 18.0 MB/s, 1276 MB
pigz -c: 25.2 MB/s, 1044 MB
pigz --zip -c: 25.2 MB/s, 1044 MB
Nice: lz4 acts single-threaded, so if you want to pipe this through the network using SSH there are still 3 CPU cores free; it's also the fastest, with speeds of up to 50 MB/s when only zeroes are compressed. Not so nice: inefficient compression/size. And pigz looks like the best all-rounder (way faster than pbzip2 while the size is almost the same).
So in case a Fast Ethernet H3 device should do device backups through the network, it absolutely depends on the amount of 'real data' vs. empty/zeroed space. Too lazy to test this out now. In case anyone here has an H3 device with an installation that is rather old and a filesystem that is or was filled above 90% --> you're the person who could provide numbers for that!
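The g_mass_storage part might look like this on the SBC; a sketch, with the eMMC device name being board-specific (check lsblk first):

# boot the SBC from SD card, then export the whole eMMC read-only over the
# Micro USB OTG port; the board then shows up as a USB disk on the host:
modprobe g_mass_storage file=/dev/mmcblk1 removable=1 ro=1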
tkaiser Posted September 22, 2016:
"I will follow tkaiser's advice and maybe modify the script to use other compression methods."
Another suggestion is to rely on btrfs on the target host, and better use keyless SSH authentication. Then you can do some funny things to save space, since btrfs has some nice features, if you stop compressing on the H3 device entirely (which is the best variant on GbE equipped devices, as we've seen). Simply use filesystem compression on the backup target (mount -o compress) and, before you create a new whole-device image, copy an already existing one with the special 'cp --reflink=always' trick. Given that you have one 1435503102ce64000028_2016-09-19_1474302693.img image (16 GB in size, but due to btrfs' transparent file compression only needing the space really required after zlib compression, maybe 1.5 GB) and want to make another one 2 hours later:
cp --reflink=always 1435503102ce64000028_2016-09-19_1474302693.img 1435503102ce64000028_2016-09-19_1474309893.img
The cp takes 0.01 seconds; the new file looks as if it's 16 GB in size, but you still only consume 1.5 GB on disk, since both files still share all chunks. 100 MB might have changed at the source in the meantime, so when you now send the contents of the eMMC into 1435503102ce64000028_2016-09-19_1474309893.img (which takes some time, since the whole contents have to be transmitted through the network), only changed chunks will be written anew, compressed on the fly of course. So if the new data compresses at a 2:1 ratio, after the whole operation you only lose another 50 MB on the target filesystem, while it looks as if 32 GB of files are lying around. Using modern filesystems you can implement stuff like versioning or 'incremental' cloning at the filesystem level while dealing with uncompressed device images ready to be burned at any time. 'Work smarter not harder' (won't happen anytime soon, everyone uses the tools they already used last century).
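A compact sketch of one cycle of this scheme; paths and host are assumptions, and the target directory must be on btrfs mounted with compression (e.g. -o compress=zlib):

cd /data/backup-image
OLD=1435503102ce64000028_2016-09-19_1474302693.img
NEW=1435503102ce64000028_2016-09-19_1474309893.img

cp --reflink=always "$OLD" "$NEW"   # instant; both names share all extents

# refresh NEW with the current device contents; note that extents only stay
# shared where blocks are not rewritten, so the less the writer touches
# unchanged regions, the more sharing survives:
ssh root@h3-device 'dd if=/dev/mmcblk2 bs=1M' | dd of="$NEW" bs=1M conv=notrunc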
piknew (Author) Posted September 22, 2016:
A new version can be downloaded (please see the first post). I had some issues related to 7z compression, but I guess xz will be the same (lzma). Additionally, please see the results of a backup test with all compression methods; I executed the tests on an Orange Pi PC (100 Mbit ethernet): nfo.zip logs.zip
Result files:
root@PKSERVER:/data/temp# ll 543550200b0e340000cd*
-rw-r--r-- 1 root root 15523119104 Sep 22 19:19 543550200b0e340000cd_2016-09-22_1474563274.img
-rw-r--r-- 1 root root 1678 Sep 22 19:19 543550200b0e340000cd_2016-09-22_1474563274.nfo
-rw-r--r-- 1 root root 1029735811 Sep 22 19:52 543550200b0e340000cd_2016-09-22_1474565396.img.gz
-rw-r--r-- 1 root root 1678 Sep 22 19:52 543550200b0e340000cd_2016-09-22_1474565396.nfo
-rw-r--r-- 1 root root 962251987 Sep 22 20:48 543550200b0e340000cd_2016-09-22_1474567109.img.bz2
-rw-r--r-- 1 root root 1678 Sep 22 20:48 543550200b0e340000cd_2016-09-22_1474567109.nfo
I guess the best choice is still gzip. If you select none, then the bottleneck is the speed of the network. bzip2 is somewhat acceptable, but xz... forget it, the timing was unacceptable (and I interrupted that test):
549MiB 0:16:18 [ 746KiB/s] [====> ] 3% ETA 7:02:56
tkaiser Posted September 22, 2016:
"bzip2 is somewhat acceptable"
Why use bzip2 on a quad-core device when pbzip2 is available? I've been talking all the time only about the multi-threaded variant, since it's way faster.
rodolfo Posted September 22, 2016:
"Using modern filesystems you can implement stuff like versioning or 'incremental' cloning at the filesystem level... 'Work smarter not harder' (won't happen anytime soon, everyone uses the tools they already used last century)."
'Work smarter not harder' is definitely the way to go. But what does automating and optimizing the transport and compression of block devices actually solve? Intelligent versioning/cloning of file systems has been standard, proven Linux practice with rsync and ext3 filesystems for ages, and btrfs hopefully continues that tradition. The proof of any backup is not backing up your data: your surprises start when trying to RESTORE. Zillions are wasted on backup concepts for lack of actual experience in restoring.
arox Posted September 22, 2016:
"everyone uses the tools they already used last century"
Well, tools are only part of the problem, in particular when people don't know if or why they need them. (And in that case, tools tend to become the problem.)
tkaiser Posted September 23, 2016:
"Automating and optimizing transport and compression of block devices solves what?"
It was an interesting exercise (at least for me -- I made some experiments with lz4-tools and pbzip2 and tried to share the results here, as usual). And there is a use case for stuff like that: efficient device cloning as a measure of precaution, or even moving an installation. Of course this should become part of the main distro later, and stuff like device selection should not be the result of re-inventing the wheel but should use routines from nand_sata_install (which is known to work on every one of the 40 SBCs we currently support). But that would require refactoring Armbian's routines (maybe into a library that can be called by different tools then named nand_emmc_usb_sata_install, device_backup, installation_mover, whatever). So please don't discourage people from making experiments like this, even if the unaltered results are of no use to you. Discussions like this are helpful IMO to gather some basic knowledge and to move in a direction where we might be able to provide a more generic backup/clone functionality for all our users.
rodolfo Posted September 23, 2016:
"So please don't discourage people from making experiments like this, even if the unaltered results are of no use to you. Discussions like this are helpful IMO..."
I love to encourage people to make experiments and share their results. Basic knowledge is gained by learning and working with the basics. Users tend to trip over automated and undocumented functionality. Spending time on getting familiar with Linux basics is needed anyway for Armbian users; the paradigm shift from Windows to Linux cannot be avoided by automating away the basics.
The clone functionality is provided by dd. The restore functionality can be provided by a reasonably managed backup/restore system allowing access to unaltered, versioned images of filesystems. A versioned backup is made by rsync referencing an earlier backup; storage and runtime are minimized by using hardlinks (see the sketch after this post). Need a file from three weeks ago? Go fetch it from the archive. Need to restore a complete system? Get it from the archive. The beauty of this approach? It is proven and simple, and has few dependencies on resources (any old or new, physical or virtual Linux system will do).
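A hedged sketch of such a hardlink-versioned rsync archive (paths are assumptions):

SRC=/mnt/emmc/
DST=/data/archive
SNAP=$(date +%F_%H%M)

# unchanged files become hardlinks into the previous snapshot, so each run
# only costs the space of what actually changed:
rsync -aHx --delete --link-dest="$DST/latest" "$SRC" "$DST/$SNAP"
rm -f "$DST/latest" && ln -s "$SNAP" "$DST/latest"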
piknew (Author) Posted September 23, 2016:
The source code is here (anonymous access to browse the code; I can grant rw access to somebody if needed, just send me a PM): https://code.online-mp.pl/svn/public/
BTW & OT: additionally I put up the code of my own tool "backup". I designed and wrote it to cover my requirements for backing up from a NAS to external drives. The process is stateful, e.g. sqlite is used to keep the internal state of the backup. I have been running it for more than 1 year (3 times a day).
[/share/USBDisk1/logs] # /share/MD0_DATA/backup/backup
backup, version: [0.1.0], build: [Sep 13 2016 10:33:55], usage:
backup [-h] - display this help message
backup <options> - parameters as specified below:
<-f config_file> - (required, single) - list of source <-> destination directories for synchronization
<-d database_path> - (required, single) - path to database file (sqlite3) which will be used as information storage
[-l log_dir] - (optional, single) - path to directory where log file will be written (default is empty = log to stdout/stderr)
[-L log_level] - (optional, single) - logging level, one of: DEBUG, INFO, NOTICE, WARNING, ERROR, CRITICAL (default is NOTICE)
[-p pid_file] - (optional, single) - path to file which will be created with process identifier of backup operation
[-X exclude_dir] - (optional, multiple) - name of directory to be skipped during synchronization
[-x exclude_file] - (optional, multiple) - name of file to be skipped during synchronization
[-M check_meta] - (optional, single) - flag (bitmask) to specify additional checking for file comparison (1 - by mode, 2 - by owner's uid, 4 - by owner's gid; default is 0 = no check)
[-c cleanup_wait] - (optional, single) - determine after how many backup sessions files or directories will be deleted from destination (default is 0 = immediate)
[-T file_system] - (optional, single) - enable compatibility for specified file system (available options: NTFS)
And some logs...
I am executing it from the NAS:
[/share/USBDisk1/logs] # head -30 backup.20160923.110005.13573.log
2016-09-23|11:00:11|554474|13573|NOTICE|Starting application [backup] (version: [0.1.0], build: [Sep 13 2016 10:33:55], PID = [13573])
2016-09-23|11:00:11|554901|13573|NOTICE|Using executable [/share/MD0_DATA/backup/backup]
2016-09-23|11:00:11|555059|13573|NOTICE|Using logger level [NOTICE] (2) and logger [/share/USBDisk1/logs/backup.20160923.110005.13573.log]
2016-09-23|11:00:11|556175|13573|NOTICE|Using PID file: [/share/USBDisk1/temp/backup.pid]
2016-09-23|11:00:11|556366|13573|NOTICE|Setting up of file and directory creation mask (umask) to [000], previous umask = [022]
2016-09-23|11:00:11|570426|13573|NOTICE|Using database (v3.14.2): [/share/USBDisk1/misc/backup.dat]
2016-09-23|11:00:11|570659|13573|NOTICE|Using configuration file: [/share/USBDisk1/conf/backup.cfg]
2016-09-23|11:00:11|570859|13573|NOTICE|Using destination directory create depth level: 2 [create]
2016-09-23|11:00:11|571050|13573|NOTICE|Excluding directory name: [.upload_cache]
2016-09-23|11:00:11|571187|13573|NOTICE|Excluding directory name: [.Qsync]
2016-09-23|11:00:11|571320|13573|NOTICE|Excluding directory name: [.@__thumb]
2016-09-23|11:00:11|571452|13573|NOTICE|Excluding directory name: [@Recycle]
2016-09-23|11:00:11|571581|13573|NOTICE|Excluding file name: [Thumbs.db]
2016-09-23|11:00:11|571708|13573|NOTICE|Excluding file name: [.bash_history]
2016-09-23|11:00:11|571851|13573|NOTICE|Using check meta flag [0] (check by mode = no, check by owner's uid = no, check by owner's gid = no)
2016-09-23|11:00:11|572016|13573|NOTICE|Using clean-up wait of value: 12 [postponed clean-up]
2016-09-23|11:00:11|572158|13573|NOTICE|Using compatibility for [NTFS] file system
2016-09-23|11:00:11|572342|13573|NOTICE|Using TEMP database directory: [/share/USBDisk1/misc]
2016-09-23|11:00:11|572556|13573|NOTICE|Successfully connected to TEMP database
2016-09-23|11:00:11|573079|13573|NOTICE|Successfully connected to [/share/USBDisk1/misc/backup.dat] database
2016-09-23|11:00:21|095328|13573|NOTICE|Using session identifier [1801]
2016-09-23|11:00:21|338993|13573|NOTICE|Synchronization of [/mnt/HDA_ROOT/.config] => [/share/USBDisk1/data/backup/scheduled/etc/config] is started
2016-09-23|11:00:21|405905|13573|NOTICE|Scanning: BASE = [/mnt/HDA_ROOT/.config], PATH = [.]
2016-09-23|11:00:21|453484|13573|NOTICE|Scanning: BASE = [/mnt/HDA_ROOT/.config], PATH = [license]
2016-09-23|11:00:21|464243|13573|NOTICE|Scanning: BASE = [/mnt/HDA_ROOT/.config], PATH = [license/.lc]
2016-09-23|11:00:21|485333|13573|NOTICE|Scanning: BASE = [/mnt/HDA_ROOT/.config], PATH = [license/.gnupg]
2016-09-23|11:00:21|487405|13573|NOTICE|Scanning: BASE = [/mnt/HDA_ROOT/.config], PATH = [license/.req]
2016-09-23|11:00:21|498122|13573|NOTICE|Scanning: BASE = [/mnt/HDA_ROOT/.config], PATH = [.hd_info]
2016-09-23|11:00:21|521208|13573|NOTICE|Scanning: BASE = [/mnt/HDA_ROOT/.config], PATH = [php.d]
2016-09-23|11:00:21|537634|13573|NOTICE|Scanning: BASE = [/mnt/HDA_ROOT/.config], PATH = [cloudconnector]
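Reading the options back out of that log, the invocation behind it presumably looked about like this (reconstructed from the log lines above, not copied from an actual crontab):

/share/MD0_DATA/backup/backup \
  -f /share/USBDisk1/conf/backup.cfg \
  -d /share/USBDisk1/misc/backup.dat \
  -l /share/USBDisk1/logs \
  -p /share/USBDisk1/temp/backup.pid \
  -X .upload_cache -X .Qsync -X .@__thumb -X @Recycle \
  -x Thumbs.db -x .bash_history \
  -c 12 -T NTFS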
rodolfo Posted September 23, 2016:
@piknew Could you please detail the use case you're trying to solve? From a quick glance it looks like you're automating the backup of NTFS directories.
piknew (Author) Posted September 23, 2016:
Have you seen my first post? I am not trying to solve anything: I wanted to share my script for making backups of block devices, in case somebody wants to use it (which is what I wrote in that post). In my last post I shared another tool, which I have been using to back up my NAS for many months without any problems (its purpose and way of working are totally different). The discussion has now drifted far away from the main topic (and I see from your posts that it has started to focus on backup in general). And I have use cases which my tools cover 100%:
1. backup of a block device for easy migration and/or restore
2. backup executed on a filesystem with some features which e.g. rsync does not have
Again: I am not having any issues with these tools; I have written them and want to share them with the community. Of course I am glad to hear about anything that might be a worthwhile feature, or even about a bug.
"@piknew Could you please detail the use case you're trying to solve? From a quick glance it looks like you're automating the backup of NTFS directories."
No, NTFS is the destination. When you try to back up files from ext2/3/4 to NTFS you will encounter problems with e.g. NTFS streams and some other entities which are not allowed or have a different meaning on each filesystem. So: I have an external USB HDD (2 TB) which I use as a backup device for some important data from my NAS. I didn't want to format the external drive as ext2/3/4, because this way I can plug it into any Windows machine and have immediate access to any of the stored files. Additionally, I didn't want certain system files to be backed up (which is the reason the exclude options were implemented). And I wanted a "grace period" in case a user deletes a file from the NAS by accident (the backup tool will delete this file as well, but only after the grace period defined by [-c cleanup_wait]). That is the short story...
rodolfo Posted September 23, 2016:
Ok - thanks for the info.
tkaiser Posted September 27, 2016:
Addendum: we can now also do eMMC backups from a Windows, Linux or OS X host, simply by connecting the H3 device via Micro USB to a USB port of the host. The H3 device then acts as a card or eMMC reader: http://forum.armbian.com/index.php/topic/2125-armbian-for-orange-pi-does-not-boot/?p=16455
In my setup (OS X with a rather slow MacBook) I was limited to 14.7 MiB/s USB transfer speed (piping to /dev/null). I didn't care (maybe some tweaks are needed to get this faster with our new tool) and tested through 4 compression tools:
7-zip: 779 MB, 30m20, 8.2 MiB/s
pbzip2: 990 MB, 16m59, 14.6 MiB/s
xz: 765 MB, 34m55, 7.1 MiB/s
zip: 1066 MB, 20m40, 12 MiB/s
The winner when it's about performance is pbzip2 (not bzip2 -- the 'p' is for 'parallel': this is a special bzip2 version running multithreaded even when data is on stdin!); regarding size it's 7-zip (which also acts somewhat multithreaded even in stdin mode). Details below:
bash-3.2# time dd if=/dev/rdisk3 bs=1m | pv -s 15267840K | pbzip2 -9 -b40qk -c >emmc.bz2
14910+0 records in
14910+0 records out
15634268160 bytes transferred in 1019.126745 secs (15340848 bytes/sec)
14.6GiB 0:16:59 [14.6MiB/s] [===>] 100%
real 16m59.163s
user 9m12.303s
sys 0m50.981s
bash-3.2# du -sh emmc.bz2
967M emmc.bz2
bash-3.2# time dd if=/dev/rdisk3 bs=1m | pv -s 15267840K | 7za a -t7z -m0=lzma -mx=9 -mfb=64 -md=32m -ms=on -si emmc.7z
7-Zip (A) [64] 9.20 Copyright (c) 1999-2010 Igor Pavlov 2010-11-18
p7zip Version 9.20 (locale=utf8,Utf16=on,HugeFiles=on,4 CPUs)
Creating archive emmc.7z
Compressing [Content] 9MiB 0:00:01 [8.96MiB/s]
14910+0 records in
14910+0 records out
15634268160 bytes transferred in 1820.663478 secs (8587127 bytes/sec)
14.6GiB 0:30:20 [8.19MiB/s] [===>] 100%
Everything is Ok
real 30m20.741s
user 33m13.561s
sys 1m12.921s
bash-3.2# du -sh emmc.7z
760M emmc.7z
bash-3.2# time dd if=/dev/rdisk3 bs=1m | pv -s 15267840K | zip -9 >emmc.zip
adding: - 12MiB 0:00:01 [ 12MiB/s]
14910+0 records in
14910+0 records out
15634268160 bytes transferred in 1240.301981 secs (12605211 bytes/sec)
14.6GiB 0:20:40 [ 12MiB/s] [===>] 100% (deflated 93%)
real 20m40.314s
user 6m5.263s
sys 0m24.822s
bash-3.2# time dd if=/dev/rdisk3 bs=1m | pv -s 15267840K | xz -9 >emmc.xz
14910+0 records in
14910+0 records out
15634268160 bytes transferred in 2095.000087 secs (7462658 bytes/sec)
14.6GiB 0:34:55 [7.12MiB/s] [===>] 100%
real 34m55.086s
user 22m32.800s
sys 0m36.705s
bash-3.2# ls -la emmc.*
-rw-r--r-- 1 root staff 797420766 27 Sep 22:34 emmc.7z
-rw-r--r-- 1 root staff 1013703474 27 Sep 21:54 emmc.bz2
-rw-r--r-- 1 root staff 783585672 27 Sep 23:32 emmc.xz
-rw-r--r-- 1 root staff 1091182330 27 Sep 22:55 emmc.zip
Important note: the results are somewhat meaningless, since it still has to be checked whether we can improve the USB transfer speeds (the eMMC I tested gets close to 80 MB/s locally). If that works out, the two tools that can compress multithreaded when data is on stdin might show much better performance numbers.
piknew (Author) Posted September 28, 2016:
Can you please take a look at lines 106-148 and suggest whether everything is optimal there (I see that in your test you are using -9, which requires a lot of CPU)?
svn co https://code.online-mp.pl/svn/public/backup_block_device/trunk/backup_block_device
An additional question relates to 7z: I have 7z and 7za. I noticed that sometimes you have been using 7zr. Why so?
admin@PKSERVER:~$ which 7z
/usr/bin/7z
admin@PKSERVER:~$ which 7za
/usr/bin/7za
admin@PKSERVER:~$ which 7zr
admin@PKSERVER:~$ ll /usr/bin/7z*
-rwxr-xr-x 1 root root 40 Jun 8 18:07 /usr/bin/7za
-rwxr-xr-x 1 root root 39 Jun 8 18:07 /usr/bin/7z
admin@PKSERVER:~$ apt list | grep 7z
WARNING: apt does not have a stable CLI interface yet. Use with caution in scripts.
p7zip/stable,stable 9.20.1~dfsg.1-4.1+deb8u2 armhf
p7zip-full/stable,stable,now 9.20.1~dfsg.1-4.1+deb8u2 armhf [installed]
p7zip-rar/stable 9.20.1~ds.1-3 armhf
admin@PKSERVER:~$
To clarify: are the command lines compatible between 7z, 7za and 7zr? And it would then be (please note the trailing -so):
7z a -t7z -m0=lzma -mx=9 -mfb=64 -md=32m -ms=on -si -so
tkaiser Posted September 28, 2016:
"An additional question relates to 7z: I have 7z and 7za. I noticed that sometimes you have been using 7zr. Why so?"
For whatever reason there exist two packages on Debian/Ubuntu:
root@orangepiplus2e:~# dpkg-query -S /usr/bin/7z*
p7zip-full: /usr/bin/7z
p7zip-full: /usr/bin/7za
p7zip: /usr/bin/7zr
Regarding .7z both packages behave identically; simply check the output of 'apt-cache show p7zip' and 'apt-cache show p7zip-full' to see the differences. Please be aware that my last test was done on an i5 MacBook running OS X, accessing the eMMC through USB (the MacBook isn't that strong, but has plenty of CPU power compared to any SBC, especially when it's about single-threaded performance). On any SBC, pigz, pbzip2 or lz4-tools are the only options if you care at all about speed. The problem with most if not all efficient compressors is that they perform multi-threaded only if the input is a file or a list of files, and that they have severe problems when the input is on stdin (coming from dd, pv or whatever).
It should also be noted that none of the 4 archives I created could be processed by Etcher in the next step. My current testing is meant as basic research to improve the situation with Etcher (getting support for .7z, or Armbian maybe switching to .xz in the future... and being able to do device backups with Etcher too).
Edit: I forgot to test pigz on the H3 device before -- added the numbers to post #14 above.
piknew (Author) Posted September 28, 2016:
"Edit: I forgot to test pigz on the H3 device before -- added the numbers to post #14 above."
Thanks. pigz is much more efficient; it uses all cores. I have updated my script to use pigz (and pbzip2) if available (in preference to standard gzip and bzip2). See the last commit (rev. 17): https://code.online-mp.pl/svn/public/backup_block_device/trunk/
earth08 Posted October 4, 2016:
How do I back up my whole board, so that in case anything goes wrong I can just restore it?