Koen Vervloesem
Posted February 19, 2021 (edited)

I have installed Armbian 21.02 (Ubuntu 20.04 Focal) with Linux kernel 5.10.16 and OpenZFS 2.0.1-1 on my Helios64. I created a three-disk encrypted RAIDZ pool with the following command:

sudo zpool create -f -o ashift=12 -O compression=lz4 -O atime=off -O acltype=posixacl -O xattr=sa -O encryption=aes-256-gcm -O keylocation=prompt -O keyformat=passphrase tank raidz1 /dev/disk/by-id/foobar1 /dev/disk/by-id/foobar2 /dev/disk/by-id/foobar3

Then I ran a single sequential read benchmark with fio on /tank:

cd /tank
sudo fio --name=single-sequential-read --ioengine=posixaio --rw=read --bs=1m --size=16g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1

This results in a speed of 62 MB/s, much lower than I would expect. I created exactly the same pool with three disks of the exact same model (same batch) on Ubuntu 20.04 with Linux kernel 5.4.0 and OpenZFS 0.8.3 on an HPE ProLiant MicroServer Gen10, and the same benchmark reaches 232 MB/s there.

Does anyone have the same experience of this low encrypted ZFS performance on the Helios64? Is this because of the encryption? Looking at the output of top during the benchmark, the CPU appears to be heavily loaded, while on the HPE machine CPU usage is much lower.

Edited February 19, 2021 by Koen Vervloesem (disk path)
Gareth Halfacree
Posted February 19, 2021

1 hour ago, Koen Vervloesem said:
Does anyone have the same experience of this low encrypted ZFS performance on the Helios64? Is this because of the encryption? Looking at the output of top during the benchmark, the CPU seems to be taxed much, while on the HPE machine CPU usage is much less.

Your MicroServer has either an Opteron or Ryzen processor in it, either of which is considerably more powerful than the Arm-based RK3399.

As a quick test, I ran OpenSSL benchmarks for AES-256-CBC on my Ryzen 2700X desktop, an older N54L MicroServer, and the Helios64, with a block size of 8192 bytes.

Helios64: 68411.39kB/s.
N54L: 127620.44kB/s.
Desktop: 211711.31kB/s.

From that, you can see the Helios64 CPU is your bottleneck: 68,411.39kB/s is about 67MB/s, or within shouting distance of your 62MB/s real-world throughput - and that's just encryption, without the LZ4 compression overhead.
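For anyone who wants to redo the unit conversion above, here is a minimal sketch. It assumes the OpenSSL figures are in units of 1000 bytes per second (which is how `openssl speed` labels its output); the function names `to_mb_s` and `to_mib_s` are made up for illustration. The "about 67MB/s" figure quoted above corresponds to dividing the kB/s number by 1024, so it sits between the two results below.

```shell
#!/bin/sh
# Assumption: the input is an OpenSSL "speed" figure in 1000s of bytes per second.

# Decimal megabytes per second (1 MB = 1,000,000 bytes).
to_mb_s() {
    awk -v kb="$1" 'BEGIN { printf "%.1f\n", kb * 1000 / 1000000 }'
}

# Binary mebibytes per second (1 MiB = 1,048,576 bytes), the unit fio prints as MiB/s.
to_mib_s() {
    awk -v kb="$1" 'BEGIN { printf "%.1f\n", kb * 1000 / 1048576 }'
}

# Usage:
#   to_mb_s 68411.39
#   to_mib_s 68411.39
```

With the Helios64 figure, `to_mb_s 68411.39` gives 68.4 and `to_mib_s 68411.39` gives 65.2. Whichever unit you pick, the raw AES rate is in the same range as the 62 MB/s measured by fio.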
grek
Posted February 20, 2021

Hey,

I have a ZFS mirror on top of LUKS-encrypted disks, plus an SSD cache:

root@helios64:~# zpool status
  pool: data
 state: ONLINE
  scan: scrub repaired 0B in 05:27:58 with 0 errors on Sun Feb 14 20:06:49 2021
config:

        NAME           STATE     READ WRITE CKSUM
        data           ONLINE       0     0     0
          mirror-0     ONLINE       0     0     0
            sda-crypt  ONLINE       0     0     0
            sdb-crypt  ONLINE       0     0     0
        cache
          sdc2         ONLINE       0     0     0

Here are my results. It looks like it is faster to put LUKS underneath ZFS:

fio --name=single-sequential-read --ioengine=posixaio --rw=read --bs=1m --size=16g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
single-sequential-read: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=posixaio, iodepth=1
fio-3.12
Starting 1 process
single-sequential-read: Laying out IO file (1 file / 16384MiB)
Jobs: 1 (f=1): [R(1)][100.0%][r=122MiB/s][r=122 IOPS][eta 00m:00s]
single-sequential-read: (groupid=0, jobs=1): err= 0: pid=9237: Sat Feb 20 10:13:39 2021
  read: IOPS=113, BW=114MiB/s (119MB/s)(6818MiB/60002msec)
    slat (usec): min=2, max=2613, avg=17.74, stdev=32.25
    clat (usec): min=1313, max=442050, avg=8762.93, stdev=15815.09
     lat (usec): min=1324, max=442069, avg=8780.67, stdev=15815.08
    clat percentiles (msec):
     |  1.00th=[    3],  5.00th=[    3], 10.00th=[    3], 20.00th=[    3],
     | 30.00th=[    3], 40.00th=[    3], 50.00th=[    4], 60.00th=[    7],
     | 70.00th=[   10], 80.00th=[   12], 90.00th=[   15], 95.00th=[   21],
     | 99.00th=[   77], 99.50th=[   95], 99.90th=[  228], 99.95th=[  257],
     | 99.99th=[  443]
   bw (  KiB/s): min=22528, max=147456, per=99.97%, avg=116320.82, stdev=25332.39, samples=120
   iops        : min=   22, max=  144, avg=113.51, stdev=24.79, samples=120
  lat (msec)   : 2=0.51%, 4=50.89%, 10=18.77%, 20=24.49%, 50=3.15%
  lat (msec)   : 100=1.77%, 250=0.32%, 500=0.07%
  cpu          : usr=0.55%, sys=0.28%, ctx=6854, majf=7, minf=74
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=6818,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=114MiB/s (119MB/s), 114MiB/s-114MiB/s (119MB/s-119MB/s), io=6818MiB (7149MB), run=60002-60002msec
ShadowDance
Posted February 20, 2021

Native ZFS encryption speed is not optimal on ARM and is limited by CPU speed on the Helios64. The optimizations that have gone in are limited to amd64-based architectures and require CPU features not available on ARM. Another consideration is that, because of the lack of CPU features, the CPU will be heavily loaded during encrypted reads and writes, meaning there are fewer resources available for other tasks. The problem isn't AES though, which is fully supported by the RK3399; it's GCM. This means that you can do full disk encryption via LUKS and run ZFS without encryption on top -- this is what I do. It's the best we can have at the moment and for the foreseeable future; to my knowledge, nobody is currently working on ARM encryption optimizations for ZFS.

Edit: This may be of interest: https://github.com/openzfs/zfs/issues/10347
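As a rough illustration of the LUKS-underneath-ZFS layout described above, here is a dry-run sketch that only prints the commands rather than running them. The function name `plan_luks_zfs` and the `-crypt` mapping suffix (borrowed from the `sda-crypt` naming in the zpool status output earlier in the thread) are assumptions for illustration, and the pool options are simply the non-encryption ones from the original `zpool create` command. This is not a tested procedure: `cryptsetup luksFormat` destroys existing data on the target device.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the commands for a pool whose vdevs
# are LUKS mappings, with native ZFS encryption left off.
# Hypothetical helper; usage: plan_luks_zfs POOL DISK [DISK...]
plan_luks_zfs() {
    pool="$1"
    shift
    vdevs=""
    for disk in "$@"; do
        # Mapping name is an assumption, e.g. foobar1 -> foobar1-crypt
        name="$(basename "$disk")-crypt"
        echo "cryptsetup luksFormat $disk"
        echo "cryptsetup open $disk $name"
        vdevs="$vdevs /dev/mapper/$name"
    done
    # Same layout as the original pool (raidz1), minus the encryption properties.
    echo "zpool create -o ashift=12 -O compression=lz4 -O atime=off $pool raidz1$vdevs"
}

# Usage:
#   plan_luks_zfs tank /dev/disk/by-id/foobar1 /dev/disk/by-id/foobar2 /dev/disk/by-id/foobar3
```

Printing the plan first makes it easy to review the device paths before touching any disk. The mappings also have to be opened again after every boot before the pool can be imported, which this sketch does not cover.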
Koen Vervloesem (Author)
Posted February 20, 2021

Thanks all for the confirmation and the interesting numbers! ZFS with LUKS seems to be the way to go then for optimal performance.