tkaiser (Author), posted February 27, 2018

USB3 anomalies / problems

On 15.2.2018 at 10:50 PM, tkaiser said:
In the meantime I created one RAID0 out of 4 SSDs (as can be seen in the picture above -- 2 x USB3, 2 x SATA) and let the iozone test repeat:

When I tested this almost 2 weeks ago I did not pay close enough attention to the crappy write performance: 470 MB/s with 4 SSDs in parallel, attached to all SATA and USB3 ports, is just horribly low given that we have a 'per port' and a 'per port group' limitation of around 390 MB/s. What we should have seen is 650+ MB/s, taking the overhead into account. So 470 MB/s was already an indication that something is wrong.

Fortunately, in the meantime an ODROID community member tested various mirror attempts with 2 Seagate USB3 disks and reported 'RAID 0 doubles disk IO' while in reality showing exactly the opposite: none of his three mirror attempts (mdraid, lvm and btrfs) reported write performance exceeding 50 MB/s, which is insanely low for a RAID0 made out of two 3.5" disks (such lousy numbers are usually not even possible with 2 USB2 disks on separate USB2 ports).

So let's take a look again: EVO840 and EVO750, both in JMS567 enclosures, one on each USB3 port. I simply created an mdraid RAID0 and measured sequential performance with 'taskset -c 5 iozone -e -I -a -s 500M -r 16384k -i 0 -i 1':

        kB  reclen   write  rewrite    read  reread
    512000   16384   85367    85179  312532  315012

Yep, there's something seriously wrong when accessing two USB3 disks in parallel. Only 85 MB/s write and 310 MB/s read is way too low, especially for rather fast SSDs. 'iostat 1' output shows that each disk remains at ~83 tps (transactions per second) when writing: https://pastebin.com/CvgA3ggQ

OK, let's try to get a clue about what's bottlenecking. I removed the RAID0 and formatted both SSDs as ext4.
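For anyone who wants to redo the arithmetic on the RAID0 row above, here is a minimal Python sketch that turns one iozone result row into MB/s. It assumes the column order quoted in the post (kB, reclen, write, rewrite, read, reread) and divides by 1000, which matches the '85 MB/s' and '310 MB/s' reading used here. The function name parse_iozone_row is made up for illustration.

```python
# Minimal sketch: convert one iozone result row (values in kB/s) into MB/s.
# Column order is assumed from the header quoted in the post:
#   kB reclen write rewrite read reread
COLUMNS = ("write", "rewrite", "read", "reread")


def parse_iozone_row(row: str) -> dict:
    """Return throughput per test in MB/s (kB/s divided by 1000)."""
    fields = [int(f) for f in row.split()]
    # First two fields are file size and record length in kB; the rest are rates.
    _size_kb, _reclen_kb, *rates = fields
    if len(rates) != len(COLUMNS):
        raise ValueError("expected 4 throughput columns, got %d" % len(rates))
    return {name: rate / 1000 for name, rate in zip(COLUMNS, rates)}


# The RAID0 row from the post (two USB3 SSDs as mdraid RAID0).
raid0 = parse_iozone_row("512000 16384 85367 85179 312532 315012")
for name, mb_per_s in raid0.items():
    print(f"{name:8s} {mb_per_s:7.1f} MB/s")
```

Both figures stay well below the ~390 MB/s per-port limit mentioned above, which is what makes the result suspicious.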
First tests with only one SSD active at a time:

                kB  reclen   write  rewrite    read  reread
    EVO840  512000   16384  378665   382100  388932  392917
    EVO750  512000   16384  386473   385902  377608  383549

Now starting both iozone runs at the same time (the iozone tasks were of course sent to different CPU cores to avoid CPU bottlenecks, and the same applies to IRQs: that's /proc/interrupts after test execution):

                kB  reclen   write  rewrite    read  reread
    EVO840  512000   16384  243482   215862  192638  160677
    EVO750  512000   16384  214356   252474  168322  195164

So there is still some sort of limitation, but at least it's not as severe as in the mirror modes, where all accesses to the two USB connected disks happen exactly in parallel.

Looking closer, we see another USB3 problem long known from N1's little sibling ROCK64 (any RK3328 device is a much nearer relative to N1 than any of the other ODROIDs):

[ 3.433165] xhci-hcd xhci-hcd.7.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 3.433183] xhci-hcd xhci-hcd.7.auto: @00000000efc59440 00000000 00000000 1b000000 01078001
[ 3.441152] xhci-hcd xhci-hcd.8.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 3.441171] xhci-hcd xhci-hcd.8.auto: @00000000efc7e440 00000000 00000000 1b000000 01078001
[ 11.363314] xhci-hcd xhci-hcd.7.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 11.376118] xhci-hcd xhci-hcd.7.auto: @00000000efc59e30 00000000 00000000 1b000000 01078001
[ 11.385567] xhci-hcd xhci-hcd.8.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 11.395145] xhci-hcd xhci-hcd.8.auto: @00000000efc7ec30 00000000 00000000 1b000000 01078000
[ 465.710783] usb 8-1: new SuperSpeed USB device number 3 using xhci-hcd
[ 465.807944] xhci-hcd xhci-hcd.8.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 465.817503] xhci-hcd xhci-hcd.8.auto: @00000000efc7ea90 00000000 00000000 1b000000 01078001
[ 468.601895] usb 6-1: new SuperSpeed USB device number 3 using xhci-hcd
[ 468.671876] xhci-hcd xhci-hcd.7.auto: ERROR Transfer event for disabled endpoint or incorrect stream ring
[ 468.671881] xhci-hcd xhci-hcd.7.auto: @00000000efc591f0 00000000 00000000 1b000000 01078001

I updated bootloader and kernel this morning and have no idea whether this was introduced (again?) just recently or already existed before:

root@odroid:~# dpkg -l | egrep "odroid|bootini"
ii bootini 20180226-8 arm64 boot.ini and its relatives for ODROID-N1
ii linux-odroidn1 4.4.112-16 arm64 Linux kernel for ODROID-N1

But I guess there is still a lot of room for improvement when it comes to XHCI/USB3, the BSP kernel and RK3399.

Edit: Strangely, when I tested USB3 right after receiving the N1 two weeks ago, the RAID0 results weren't that low. Now I remember what happened back then: I immediately discovered that the coherent pool size was too low and increased it to 2MB (the setting is lost every time the 'bootini' package is updated). And guess what: that does the trick. I added 'coherent_pool=2M' to the kernel cmdline and we're back at normal performance, though there's still a ~390 MB/s overall limitation.
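Since the 'coherent_pool=2M' setting disappears again with every 'bootini' update, it may help to check whether it is still active. Below is a minimal Python sketch that parses a kernel command line string and returns the coherent pool size in bytes. The function name coherent_pool_bytes and the example command line are made up for illustration (only 'coherent_pool=2M' comes from the post); on a running system the string would be read from /proc/cmdline.

```python
import re

# Size suffixes as used in kernel parameters (K, M, G); no suffix means bytes.
UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def coherent_pool_bytes(cmdline: str):
    """Return the coherent_pool size in bytes, or None if the parameter is absent."""
    for token in cmdline.split():
        match = re.fullmatch(r"coherent_pool=(\d+)([kKmMgG]?)", token)
        if match:
            return int(match.group(1)) * UNITS[match.group(2).lower()]
    return None


# Hypothetical command line; only coherent_pool=2M is taken from the post.
example = "console=ttyS0 root=/dev/sda1 coherent_pool=2M"
size = coherent_pool_bytes(example)
if size is None:
    print("coherent_pool not set, kernel default applies")
else:
    print(f"coherent_pool = {size} bytes")

# On a live system one could use instead:
#   with open("/proc/cmdline") as f:
#       size = coherent_pool_bytes(f.read())
```

If the function returns None after a package update, the parameter has been dropped and needs to be added to the kernel cmdline again.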
tkaiser (Author), posted March 7, 2018

On 15.2.2018 at 8:56 PM, tkaiser said:
N1's SATA implementation and how it's 'advertised' (rootfs on SATA) pose another challenge but this is something for a later post (the sh*tshow known from 'SD cards' the last years now arriving at a different product category called 'SSD').

I decided to address these reliability issues over there in Hardkernel's forum instead: https://forum.odroid.com/viewtopic.php?f=149&t=30125&p=217733#p217733
tkaiser (Author), posted June 24, 2018

And... already canceled: https://www.cnx-software.com/2018/06/24/odroid-n1-canceled-due-to-ram-supply-issues-odroid-n2-coming-later-this-year/

So, looking back, playing around with early N1 development samples served the purpose of generating some RK3399 specific numbers that are valid for all other RK3399 designs as well.
constantius, posted June 25, 2018 (edited)

On the Hardkernel forum they announced that the ODROID N1 is cancelled. The reason is that Samsung and Hynix ended production of the 1GB RAM modules, so they are not able to reach 4GB of RAM. Other makers will have trouble too. They introduced the ODROID N2. It should come December 18. CPU: Amlogic 922, quad-core Cortex-A75 plus quad-core Cortex-A55. It will be the first single board computer with ARMv8.2.

Edited June 25, 2018 by zador.blood.stained: Merged here
tkaiser (Author), posted June 25, 2018

1 hour ago, constantius said:
They introduced Odroid N2. It should come December 18. CPU Amlogic 922 Quad core cortex A75 Quad core Cortex A55

Today nobody (outside Hardkernel) knows for sure whether they will use the S922 (other SoC vendors like HiSilicon also exist), and nobody (outside Amlogic) knows for sure which CPU cores are inside the S922.