djurny Posted April 4, 2021

@jsr my bad for the filename. Will check on my box as soon as I can.

Can you also run cryptsetup benchmark? I wonder if the names in crypto and the ones used by cryptsetup might have changed somehow. Can you also share the syslog of your box? The Marvell crypto drivers are also announced there, so something might be off.

(Note: I'm not an expert in this crypto business, so I might be barking up the wrong tree.)

Regards
jsr Posted April 5, 2021

Hi @djurny, here you go:

cryptsetup benchmark:

    # Tests are approximate using memory only (no storage IO).
    PBKDF2-sha1       258524 iterations per second for 256-bit key
    PBKDF2-sha256     364595 iterations per second for 256-bit key
    PBKDF2-sha512     179550 iterations per second for 256-bit key
    PBKDF2-ripemd160  191345 iterations per second for 256-bit key
    PBKDF2-whirlpool   28370 iterations per second for 256-bit key
    argon2i       4 iterations, 71599 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
    argon2id      4 iterations, 73566 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
    #     Algorithm |       Key |      Encryption |      Decryption
            aes-cbc        128b        95.1 MiB/s        98.1 MiB/s
        serpent-cbc        128b        27.6 MiB/s        30.7 MiB/s
        twofish-cbc        128b        40.3 MiB/s        45.4 MiB/s
            aes-cbc        256b        90.7 MiB/s        93.1 MiB/s
        serpent-cbc        256b        28.0 MiB/s        31.3 MiB/s
        twofish-cbc        256b        41.2 MiB/s        46.2 MiB/s
            aes-xts        256b        62.7 MiB/s        54.9 MiB/s
        serpent-xts        256b        29.7 MiB/s        31.1 MiB/s
        twofish-xts        256b        44.0 MiB/s        45.7 MiB/s
            aes-xts        512b        47.7 MiB/s        41.5 MiB/s
        serpent-xts        512b        29.8 MiB/s        31.1 MiB/s
        twofish-xts        512b        44.0 MiB/s        45.7 MiB/s

Interesting, I do see CESA interrupts count up from the cryptsetup benchmark:

    48:      24618          0     GIC-0  51 Level     f1090000.crypto
    49:          0          0     GIC-0  52 Level     f1090000.crypto

syslog: https://pastebin.com/AacFP5R8

Thanks!
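One way to make the interrupt check in this post repeatable is to snapshot the counters before and after a workload and compare them. The following is only a sketch: the helper names `cesa_irq_counts` and `irq_delta` are made up for illustration, and the device label `f1090000.crypto` is taken from the output above and may differ on other boards.

```python
def cesa_irq_counts(interrupts_text, device="f1090000.crypto"):
    """Return {irq_number: total_count_across_cpus} for lines naming `device`.

    `interrupts_text` is expected to look like /proc/interrupts.
    The default device label is an assumption copied from this thread.
    """
    counts = {}
    for line in interrupts_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[-1] != device:
            continue
        irq = fields[0].rstrip(":")
        total = 0
        # The per-CPU counters are the leading numeric fields after the IRQ number.
        for field in fields[1:]:
            if not field.isdigit():
                break
            total += int(field)
        counts[irq] = total
    return counts


def irq_delta(before, after):
    """Per-IRQ difference between two snapshots from cesa_irq_counts()."""
    return {irq: after[irq] - before.get(irq, 0) for irq in after}


# The two interrupt lines quoted in the post above.
SAMPLE = """\
 48:      24618          0     GIC-0  51 Level     f1090000.crypto
 49:          0          0     GIC-0  52 Level     f1090000.crypto
"""

if __name__ == "__main__":
    print(cesa_irq_counts(SAMPLE))
```

On a live system the text would come from reading /proc/interrupts once before and once after the workload. A delta of zero during real disk I/O on the encrypted device would suggest that the engine is not being used for that traffic.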
gprovost (Author) Posted April 6, 2021

I observe the same behavior. It seems to be linked to the introduction of the ESSIV kernel module in Linux kernel 5.4.

When creating and opening a new LUKS2 device, I can still see the interrupt count of the crypto engine increasing. However, when exercising the mounted encrypted device, I no longer see the interrupt count increase.

What I realized is that a module named essiv now gets loaded, and its /proc/crypto entry suggests that it bypasses the CESA crypto engine: the essiv instance is backed by the generic software drivers rather than by mv-cbc-aes. We need to find a way to force dm-crypt to use marvell_cesa.

    root@helios4:~# cat /proc/crypto
    name         : essiv(cbc(aes),sha256)
    driver       : essiv(cbc(aes-generic),sha256-generic)
    module       : essiv
    priority     : 100
    refcnt       : 1
    selftest     : passed
    internal     : no
    type         : skcipher
    async        : no
    blocksize    : 16
    min keysize  : 16
    max keysize  : 32
    ivsize       : 16
    chunksize    : 16
    walksize     : 16

    [...]

    name         : cbc(aes)
    driver       : mv-cbc-aes
    module       : marvell_cesa
    priority     : 300
    refcnt       : 1
    selftest     : passed
    internal     : no
    type         : skcipher
    async        : yes
    blocksize    : 16
    min keysize  : 16
    max keysize  : 32
    ivsize       : 16
    chunksize    : 16
    walksize     : 16
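To check which driver backs a given algorithm name without reading the whole listing by eye, the /proc/crypto text can be split into records and sorted by priority. This is a rough sketch, not an official tool: the function names `parse_proc_crypto` and `implementations` are invented here, and the sample is an abridged copy of the listing in the post above.

```python
def parse_proc_crypto(text):
    """Split /proc/crypto-style text into one dict per registered implementation."""
    entries, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current record.
            if current:
                entries.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:  # lines without a colon (such as "[...]") are skipped
            current[key.strip()] = value.strip()
    if current:
        entries.append(current)
    return entries


def implementations(entries, name):
    """Implementations registered under `name`, highest priority first."""
    matches = [e for e in entries if e.get("name") == name]
    return sorted(matches, key=lambda e: int(e.get("priority", 0)), reverse=True)


# Abridged from the listing in the post above (only a few fields kept).
SAMPLE = """\
name         : essiv(cbc(aes),sha256)
driver       : essiv(cbc(aes-generic),sha256-generic)
module       : essiv
priority     : 100
async        : no

name         : cbc(aes)
driver       : mv-cbc-aes
module       : marvell_cesa
priority     : 300
async        : yes
"""

if __name__ == "__main__":
    for entry in parse_proc_crypto(SAMPLE):
        print(entry["name"], "->", entry["driver"], "(priority", entry["priority"] + ")")
```

In the sample, `cbc(aes)` resolves to `mv-cbc-aes`, while the `essiv(cbc(aes),sha256)` instance names `aes-generic` in its driver string, which is the mismatch described in the post.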
djurny Posted April 10, 2021

On 4/6/2021 at 12:10 PM, gprovost said:
> I observe the same behavior. It seems to be linked to the introduction of the ESSIV kernel module in Linux kernel 5.4. When creating and opening a new LUKS2 device, I can still see the interrupt of the crypto engine increasing. However, when exercising the mounted encrypted device, I don't see any more increase of the interrupt. What I realized is that there is this module essiv that gets loaded, and it seems to suggest it bypasses the CESA crypto engine. We need to find a way to force dm-crypt to use marvell_cesa [..]

I also see the same. I tried simple things, like blacklisting the essiv kernel module, but that prevents cryptsetup from working at all. The essiv kernel module also seems to depend on the authenc kernel module. I'm not sure how it was before, as I don't have any box running an older kernel at the moment:

    filename:       /lib/modules/5.10.21-mvebu/kernel/crypto/essiv.ko
    [..]
    depends:        authenc
    intree:         Y
    name:           essiv
    [..]

Without authenc, cryptsetup starts complaining:

    Apr  7 22:52:04 localhost kernel: [  126.468108] device-mapper: ioctl: error adding target to table
    Apr  7 22:53:36 localhost kernel: [  218.613317] essiv: Unknown symbol crypto_authenc_extractkeys (err -2)

Regards
m4110c Posted September 8, 2021

Hi there,

I have a Helios 4 (last batch) and just wanted to ask whether CESA is still not functional with the current Buster release. I'm in the midst of a new setup and wonder if I should even try to use CESA.

Also, one question for clarification: here in the forums, and also in the Kobol wiki, it seems to me that people are saying that CPU crypto is superior to CESA in terms of performance...

As I'm quite noobish regarding this, I would be very grateful for any recommendations.

Thanks in advance!
m4110c Posted September 8, 2021

I should add that I'm using the NAS for storing large files that are almost only written, medium files that are written once and read from time to time, and small files that are read and written often.
m4110c Posted September 8, 2021

On 7/11/2020 at 7:50 AM, sfx2000 said:
> With CESA - many assumptions based on Armada 38x - where things in scope looked very good on ARM-V7A - both the Marvell cores and the later ARM-a9's...
> Armada is mixed up here... let's just say that 38x has massive bandwidth inside the chip - MV3720/Mochi doesn't...
> MV3720 - always a tradeoff - more CPU and throughput, or offload to the CESA units, which is probably similar to other "off-loads" for dedicated accelerators
> Personally - I would not recommend CESA here. I would recommend core here as the is CPU/MEM, and CESA is limited, and so is the bus within the chip itself.

I don't understand this exactly. The poster above you found these values:

- Plain access to disk: 180 MB/s
- LUKS2 + CESA: 140 MB/s
- LUKS2 (no CESA): 52 MB/s

So I wonder how/why you think not using CESA is better here. From my point of view this seems to be a contradiction. Or is there something I don't see?
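To put the three figures quoted in this post into proportion, the arithmetic can be sketched as follows. The numbers are copied from the post; the dictionary keys and the helper name `relative_throughput` are illustrative only, and nothing is known here about the workload behind the measurements.

```python
def relative_throughput(baseline_mb_s, measured_mb_s):
    """Fraction of the baseline throughput that a measurement reaches."""
    return measured_mb_s / baseline_mb_s


# Values in MB/s as quoted in the post above.
results = {"plain": 180, "luks2_cesa": 140, "luks2_no_cesa": 52}

if __name__ == "__main__":
    for label, value in results.items():
        share = relative_throughput(results["plain"], value)
        print(f"{label}: {share:.0%} of plain disk access")
    speedup = results["luks2_cesa"] / results["luks2_no_cesa"]
    print(f"CESA vs. no CESA: {speedup:.1f}x")
```

With these inputs, LUKS2 with CESA reaches roughly 78% of plain disk throughput and LUKS2 without CESA roughly 29%, so the CESA path is about 2.7 times faster in that measurement.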
m4110c Posted September 8, 2021

Double post, sorry. (Why can't I delete my own posts?)
jsr Posted September 9, 2021

11 hours ago, m4110c said:
> I have a Helios 4 (last batch) and just wanted to ask if CESA is still not functional with the current Buster release?

I'm on Armbian 21.08.2 Buster with Linux 5.10.60-mvebu and CESA is not functional.

Regards,
James
m4110c Posted September 9, 2021

11 minutes ago, jsr said:
> I'm on Armbian 21.08.2 Buster with Linux 5.10.60-mvebu and CESA is not functional.
> Regards, James

Thanks! That means it makes no sense to use aes-cbc-essiv:sha256 anymore. The Kobol team is out of business, so there will be no fix for this... I guess I'll run a cryptsetup benchmark and just use the fastest cipher it finds.
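Picking the fastest cipher from `cryptsetup benchmark` output can be sketched as a small parser. This is an illustration under assumptions: the row format is inferred from the benchmark output posted earlier in this thread, the names `parse_cipher_rows` and `fastest` are made up, and ranking by the slower of the two directions is just one possible choice.

```python
import re

# One table row, e.g. "aes-cbc  128b  95.1 MiB/s  98.1 MiB/s".
ROW = re.compile(
    r"^\s*(?P<cipher>[a-z0-9]+-[a-z]+)\s+(?P<key>\d+)b\s+"
    r"(?P<enc>[\d.]+)\s+MiB/s\s+(?P<dec>[\d.]+)\s+MiB/s\s*$"
)


def parse_cipher_rows(text):
    """Return (cipher, key_bits, enc_mib_s, dec_mib_s) tuples from benchmark text."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append((m["cipher"], int(m["key"]), float(m["enc"]), float(m["dec"])))
    return rows


def fastest(rows, key_bits=None):
    """Row with the best worst-case throughput, optionally for one key size."""
    candidates = [r for r in rows if key_bits is None or r[1] == key_bits]
    return max(candidates, key=lambda r: min(r[2], r[3]))


# A subset of the rows posted earlier in this thread.
SAMPLE = """\
#     Algorithm |       Key |      Encryption |      Decryption
        aes-cbc        128b        95.1 MiB/s        98.1 MiB/s
        aes-cbc        256b        90.7 MiB/s        93.1 MiB/s
        aes-xts        256b        62.7 MiB/s        54.9 MiB/s
    twofish-xts        256b        44.0 MiB/s        45.7 MiB/s
        aes-xts        512b        47.7 MiB/s        41.5 MiB/s
    twofish-xts        512b        44.0 MiB/s        45.7 MiB/s
"""

if __name__ == "__main__":
    rows = parse_cipher_rows(SAMPLE)
    print("overall:", fastest(rows))
    print("512-bit:", fastest(rows, key_bits=512))
```

Two caveats apply. The benchmark runs in memory only, as its own header says, so it may not reflect what dm-crypt achieves on a real disk. Speed is also only one criterion: in XTS mode the key is split into two halves, so the `256b` and `512b` rows are not directly comparable to the CBC rows with the same number.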
m4110c Posted September 9, 2021

1 hour ago, m4110c said:
> Thanks! That means, it makes no sense to even use aes-cbc-essiv:sha256 anymore. Kobol Team is out of business so there will be no fix regarding this... I guess I'll do a cryptsetup benchmark and just use the fastest cipher it finds.

Strangely, when running `cryptsetup benchmark` I can see that the interrupt is triggered. Doesn't that mean that it's using CESA?

Also, I wonder why there is interrupt activity only on 48 and none on 49. It seems the distribution to both CPUs is not working?
sfx2000 Posted September 13, 2021

On 9/8/2021 at 4:03 PM, m4110c said:
> So I wonder how/why you think not using CESA is better here. From my point of view this seems to be a contradiction. Or is there something I don't see?

Armada 38x is fairly decent with CESA in specific use cases - and it's perhaps worth the effort to get it up and running (Armbian, if I recall, doesn't enable it by default).

MV3720 is a different chip - and there, it's better to skip the CESA units and go with software on the cores, IMHO...
m4110c Posted September 13, 2021

11 hours ago, sfx2000 said:
> Armada 38x is fairly decent with CESA in specific use cases - and it's worth the effort perhaps to get it up and running (armbian, if I recall, doesn't enable it by default) MV3720 is a different chip - and there, it's better to skip the CESA units, and go with software on the cores, IMHO...

Aaah ok, now I get it. You were referring to the MV3720. That makes sense. Thanks for the update!
Pali Posted October 3, 2021

A38x has 32-bit ARM cores. A3720 has 64-bit A53 ARM cores. Compared to 32-bit cores, 64-bit ARM cores have additional instructions for crypto operations. Based on some tests, it seems that some crypto done via the 64-bit ARM cores is really faster than crypto done on CESA on the same SoC. So it is best to run your own tests and measure whether CESA or the ARM cores are faster.

Think of it this way: the 64-bit ARM CPU itself has a kind of crypto accelerator engine.
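Before measuring, it can help to check whether the cores advertise crypto instructions at all, since the ARMv8 Cryptography Extension is an optional feature. The sketch below looks for the corresponding flags in /proc/cpuinfo-style text. The function name `cpu_crypto_features` is invented, and both sample "Features" lines are illustrative examples, not output captured from the boards discussed in this thread.

```python
# Flags that Linux reports on ARM when the crypto instructions are present.
ARMV8_CRYPTO_FLAGS = ("aes", "pmull", "sha1", "sha2")


def cpu_crypto_features(cpuinfo_text):
    """Return the crypto-related flags found in the 'Features' lines."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            found.update(value.split())
    return [flag for flag in ARMV8_CRYPTO_FLAGS if flag in found]


# Illustrative only: a 64-bit core with the crypto extension ...
SAMPLE_64BIT = "Features\t: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid\n"
# ... and a 32-bit ARMv7 core without it.
SAMPLE_32BIT = "Features\t: half thumb fastmult vfp edsp neon vfpv3 tls vfpd32\n"

if __name__ == "__main__":
    print("64-bit sample:", cpu_crypto_features(SAMPLE_64BIT))
    print("32-bit sample:", cpu_crypto_features(SAMPLE_32BIT))
```

An empty result means AES runs in plain software on the cores, which is the situation where an offload engine such as CESA is more likely to pay off. Actual throughput still has to be measured, as the post says.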
sfx2000 Posted November 1, 2021

Exactly - MV3720 running in 64-bit is fairly impressive all told, which is why I mentioned that one might be better off running on the cores rather than using CESA on that chip.

On Armada 38X and XP it's the other way around: CESA is much better than running on the cores, a benefit of those chips being focused on networking/communications processing rather than on applications.

Depending on needs, of course - the main benefit of running CESA is the offload from the cores, so they are available for other tasks.