Encrypted OpenZFS performance


Koen Vervloesem

I have installed Armbian 21.02 (Ubuntu 20.04 Focal) with Linux kernel 5.10.16 and OpenZFS 2.0.1-1 on my Helios64, and created a three-disk encrypted RAIDZ pool with the following command:

 

sudo zpool create -f -o ashift=12 -O compression=lz4 -O atime=off -O acltype=posixacl -O xattr=sa -O encryption=aes-256-gcm -O keylocation=prompt -O keyformat=passphrase tank raidz1 /dev/disk/by-id/foobar1 /dev/disk/by-id/foobar2 /dev/disk/by-id/foobar3
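As a quick sanity check after creation, the dataset properties can be confirmed with `zfs get` (assuming the pool is named `tank` as in the command above):

```shell
# Confirm the pool actually got the intended properties;
# 'tank' is the pool name from the create command above.
zfs get encryption,keyformat,compression,atime,xattr tank
```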

 

Then I did a single sequential read benchmark with fio on /tank:

 

cd /tank
sudo fio --name=single-sequential-read --ioengine=posixaio --rw=read --bs=1m --size=16g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1

 

This results in a speed of 62 MB/s, much lower than I would expect.

 

I created exactly the same pool with three disks of the exact same model (same batch) on Ubuntu 20.04 with Linux kernel 5.4.0 and OpenZFS 0.8.3 on an HPE ProLiant MicroServer Gen10, and the same benchmark there reaches 232 MB/s.

 

Does anyone have the same experience of low encrypted ZFS performance on the Helios64? Is this because of the encryption? Looking at the output of top during the benchmark, the CPU seems to be heavily taxed, while on the HPE machine CPU usage is much lower.

Edited by Koen Vervloesem (reason: disk path)


1 hour ago, Koen Vervloesem said:

Does anyone have the same experience of this low encrypted ZFS performance on the Helios64? Is this because of the encryption? Looking at the output of top during the benchmark, the CPU seems to be taxed much, while on the HPE machine CPU usage is much less.

 

Your MicroServer has either an Opteron or a Ryzen processor in it, either of which is considerably more powerful than the Arm-based RK3399.

 

As a quick test, I ran OpenSSL benchmarks for AES-256-CBC (8192-byte block size) on my Ryzen 2700X desktop, an older N54L MicroServer, and the Helios64.

 

Helios64: 68411.39kB/s.

N54L: 127620.44kB/s.

Desktop: 211711.31kB/s.

 

From that, you can see the Helios64 CPU is your bottleneck: 68,411.39kB/s is about 67MB/s, or within shouting distance of your 62MB/s real-world throughput - and that's just encryption, without the LZ4 compression overhead.
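For anyone who wants to repeat the comparison, these numbers come from OpenSSL's built-in benchmark; the exact invocation below is a reconstruction of the test described above:

```shell
# Benchmark AES-256-CBC through the EVP interface (uses hardware
# acceleration where available). OpenSSL reports throughput for
# several block sizes, including the 8192-byte one quoted above.
openssl speed -evp aes-256-cbc
```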

 


Hey,

I have a ZFS mirror running on top of LUKS encryption, plus an SSD cache:

root@helios64:~# zpool status
  pool: data
 state: ONLINE
  scan: scrub repaired 0B in 05:27:58 with 0 errors on Sun Feb 14 20:06:49 2021
config:
    NAME           STATE     READ WRITE CKSUM
    data           ONLINE       0     0     0
      mirror-0     ONLINE       0     0     0
        sda-crypt  ONLINE       0     0     0
        sdb-crypt  ONLINE       0     0     0
    cache
      sdc2         ONLINE       0     0     0

 

Here are my results. It looks like running ZFS on top of LUKS is faster than native ZFS encryption:

 


 fio --name=single-sequential-read --ioengine=posixaio --rw=read --bs=1m --size=16g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
single-sequential-read: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=posixaio, iodepth=1
fio-3.12
Starting 1 process
single-sequential-read: Laying out IO file (1 file / 16384MiB)
Jobs: 1 (f=1): [R(1)][100.0%][r=122MiB/s][r=122 IOPS][eta 00m:00s]
single-sequential-read: (groupid=0, jobs=1): err= 0: pid=9237: Sat Feb 20 10:13:39 2021
  read: IOPS=113, BW=114MiB/s (119MB/s)(6818MiB/60002msec)
    slat (usec): min=2, max=2613, avg=17.74, stdev=32.25
    clat (usec): min=1313, max=442050, avg=8762.93, stdev=15815.09
     lat (usec): min=1324, max=442069, avg=8780.67, stdev=15815.08
    clat percentiles (msec):
     |  1.00th=[    3],  5.00th=[    3], 10.00th=[    3], 20.00th=[    3],
     | 30.00th=[    3], 40.00th=[    3], 50.00th=[    4], 60.00th=[    7],
     | 70.00th=[   10], 80.00th=[   12], 90.00th=[   15], 95.00th=[   21],
     | 99.00th=[   77], 99.50th=[   95], 99.90th=[  228], 99.95th=[  257],
     | 99.99th=[  443]
   bw (  KiB/s): min=22528, max=147456, per=99.97%, avg=116320.82, stdev=25332.39, samples=120
   iops        : min=   22, max=  144, avg=113.51, stdev=24.79, samples=120
  lat (msec)   : 2=0.51%, 4=50.89%, 10=18.77%, 20=24.49%, 50=3.15%
  lat (msec)   : 100=1.77%, 250=0.32%, 500=0.07%
  cpu          : usr=0.55%, sys=0.28%, ctx=6854, majf=7, minf=74
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=6818,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=114MiB/s (119MB/s), 114MiB/s-114MiB/s (119MB/s-119MB/s), io=6818MiB (7149MB), run=60002-60002msec
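For reference, a layout like the one in the `zpool status` output above can be built roughly as follows. This is only a sketch: the device names are examples, and `luksFormat` destroys all data on its target.

```shell
# WARNING: destructive sketch. /dev/sda, /dev/sdb and /dev/sdc2 are
# example devices; adjust to your hardware.

# Encrypt the two data disks with LUKS and open them under the
# mapping names used in the pool above.
cryptsetup luksFormat /dev/sda
cryptsetup luksFormat /dev/sdb
cryptsetup open /dev/sda sda-crypt
cryptsetup open /dev/sdb sdb-crypt

# Build an *unencrypted* ZFS mirror on the LUKS mappings,
# then add the SSD partition as a cache (L2ARC) vdev.
zpool create -o ashift=12 -O compression=lz4 data mirror \
    /dev/mapper/sda-crypt /dev/mapper/sdb-crypt
zpool add data cache /dev/sdc2
```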
 

 


Native ZFS encryption speed is not optimal on ARM and is limited by CPU speed on the Helios64. The optimizations that have gone into OpenZFS are limited to amd64-based architectures and require CPU features not available on ARM. Another consideration: because of the lack of those CPU features, the CPU will be heavily loaded during encrypted reads and writes, meaning there are fewer resources available for other tasks. The problem isn't AES though, which is fully supported by the RK3399 -- it's GCM. This means that you can do full-disk encryption via LUKS and run ZFS without encryption on top -- this is what I do. It's the best we can have at the moment and for the foreseeable future; to my knowledge, nobody is currently working on ARM encryption optimizations for ZFS.
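A quick way to see which crypto extensions a given CPU actually advertises is to check the kernel's feature flags. Besides `aes` (AES instructions), the one that matters for GCM is `pmull`, the carry-less multiply used by fast GHASH implementations:

```shell
# Print the CPU feature flags: the line is labelled 'Features' on ARM
# and 'flags' on x86. Look for 'aes' and 'pmull' in the output.
grep -m1 -E '^(Features|flags)' /proc/cpuinfo
```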

 

Edit: This may be of interest: https://github.com/openzfs/zfs/issues/10347

