amazingfate Posted December 8, 2024 Just checked the Ubuntu Noble image from https://www.armbian.com/armsom-sige7/, which has the 6.1.84 kernel. The NVMe disk is detected. I did not run a stress test on it, but simply mounting a partition and writing some data to it works fine.
AaronNGray Posted December 8, 2024 Author @going I am testing the kernel and the PCIe M.2 controller. The NVMe is fine and will not suffer from this number of rewrites.
AaronNGray Posted December 9, 2024 Author @eselarm @Igor Unfortunately, on the BananaPi-M7 a self-built 6.1.84-vendor-rk35xx kernel (Linux bananapim7) still has PCIe M.2 NVMe stability issues when running `stress --verbose --hdd 4`; they appear after 5 or so minutes.
@amazingfate Can you please run `stress --verbose --hdd 4` for 20 minutes, just to verify that the Sige7 is stable? If so, I might get one myself.
@going I would much appreciate your input on this. The USB-C monitoring device I got records only maximums, not minimums, so it is very hard to gain any useful information from it. Even when monitoring with a test meter that records minimums, valid minimums are hard to obtain. Ideally a separate RPi I2C monitoring device is required to know what is going on.
Armbian Debian seems very unstable, especially at startup, with Firefox tabs dying on startup and Chromium not even starting.
Building 6.1.84-vendor-rk35xx and 6.12.2 Ubuntu Noble images for further testing...
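Since the failure only shows up after several minutes under load, it can help to capture kernel messages while the stress run is in progress. The following is a minimal sketch, assuming the failure leaves traces in the kernel log. The function name `filter_storage_errors` and the match patterns are illustrative choices, not something taken from this thread.

```shell
# Hypothetical helper: keep only kernel log lines that mention NVMe, PCIe
# or I/O errors. Reads log text on stdin.
# The patterns are assumptions; widen or narrow them as needed.
filter_storage_errors() {
    # --line-buffered keeps output flowing when stdin is a live stream;
    # "|| true" avoids a non-zero status when nothing matches.
    grep -Ei --line-buffered 'nvme|pcie|i/o error' || true
}

# Example, in a second terminal while `stress --verbose --hdd 4` runs:
#   sudo dmesg --follow | filter_storage_errors | tee nvme-errors.log
```

Saving the filtered lines to a file means the evidence may survive even if the desktop session becomes unusable during the test.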
amazingfate Posted December 9, 2024 I don't use my NVMe disk as the root partition, so I used fio to test the I/O performance:
$ sudo fio --name=randwrite --ioengine=sync --rw=write --bs=4k --numjobs=1 --size=1G --runtime=1200s --time_based --directory=/mnt
randwrite: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=1
fio-3.36
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][eta 00m:00s]
randwrite: (groupid=0, jobs=1): err= 0: pid=4208: Tue Dec 10 00:35:12 2024
  write: IOPS=47.6k, BW=186MiB/s (195MB/s)(219GiB/1205176msec); 0 zone resets
    clat (nsec): min=875, max=3507.7M, avg=3209.46, stdev=612372.83
     lat (nsec): min=1166, max=3507.7M, avg=3291.55, stdev=612373.49
    clat percentiles (nsec):
     |  1.00th=[ 1464],  5.00th=[ 1752], 10.00th=[ 2040], 20.00th=[ 2040],
     | 30.00th=[ 2040], 40.00th=[ 2320], 50.00th=[ 2320], 60.00th=[ 2320],
     | 70.00th=[ 2320], 80.00th=[ 2320], 90.00th=[ 2640], 95.00th=[ 3504],
     | 99.00th=[ 5280], 99.50th=[ 5536], 99.90th=[ 9664], 99.95th=[53504],
     | 99.99th=[91648]
   bw (  KiB/s): min= 384, max=1467567, per=100.00%, avg=734206.83, stdev=482602.67, samples=626
   iops        : min= 96, max=366891, avg=183551.56, stdev=120650.66, samples=626
  lat (nsec)   : 1000=0.01%
  lat (usec)   : 2=9.80%, 4=86.66%, 10=3.45%, 20=0.03%, 50=0.01%
  lat (usec)   : 100=0.04%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
  lat (msec)   : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2000=0.01%, >=2000=0.01%
  cpu          : usr=2.85%, sys=14.76%, ctx=193337, majf=0, minf=21
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,57409537,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
  WRITE: bw=186MiB/s (195MB/s), 186MiB/s-186MiB/s (195MB/s-195MB/s), io=219GiB (235GB), run=1205176-1205176msec
Disk stats (read/write):
  nvme0n1: ios=0/234874, sectors=0/457824096, merge=0/198, ticks=0/281679979, in_queue=281774755, util=92.39%
Igor Posted December 9, 2024
2 hours ago, AaronNGray said: still has PCIe M2 NVMe stability issues
If you didn't fix the PCI driver, it must have the same issues. But it's good to know (and sad to hear) that the upgrade doesn't fix this.
2 hours ago, AaronNGray said: Armbian Debian seems very unstable especially at startup or with FireFox tabs dying on startup and Chromium not even starting up.
Again, you want to go against our recommendations. This is (close to) embedded Linux. You have to use what we tell you to use, or learn the hard way... Debian is in worse condition (only on this platform) and there is nothing we can do about it, except removing it. Chromium has been broken for a few weeks and will remain broken until Debian fixes it, while Chromium for Armbian Ubuntu is compiled and provided by us, and it works. A workaround is installing the previous version; you will still lack video acceleration, but it won't crash.
2 hours ago, AaronNGray said: Building a 6.1.84-vendor-rk35xx and 6.12.2 Ubuntu Nobel builds for further testing...
Remember to include the right extensions: ENABLE_EXTENSIONS: "v4l2loopback-dkms,mesa-vpu" or the experience will be bad.
AaronNGray Posted December 9, 2024 Author
Quote: $ sudo fio --name=randwrite --ioengine=sync --rw=write --bs=4k --numjobs=1 --size=1G --runtime=1200s --time_based --directory=/mnt
randwrite: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=1
@amazingfate This fails repeatedly, at unpredictable points, on 6.1.84-vendor-rk35xx with my build of Armbian Debian on the BananaPi-M7.
AaronNGray Posted December 9, 2024 Author Hi @Igor, just to let you know, what I am actually after is a headless target machine with 32GB of RAM and 1/2 TB of NVMe SSD. The only reason I wanted a GUI and browser was for easy reporting of state. To fill you in, I am trying to build a green, low-power ARM solution for a number of climate-change-related projects: serving websites, a social media system, and a dedicated crawler and search engine for scientific fields. I have been using Debian since 2017 on BananaPi-M3s; before then, two pairs of dual rsync'ed Fujitsu-Siemens Primergy 470 servers and then, separately, dual rsync'ed HP DL 140 G3 servers, running from Red Hat Fedora Core 4 in 2006 to Fedora 22 in 2016. I had 36 hours of downtime due to internet outages and a power outage, and one hour which was my fault. Edited December 9, 2024 by AaronNGray
AaronNGray Posted December 9, 2024 Author The BananaPi-M7 NVMe issue is also present in the original Ubuntu Desktop 22.04.3 release with the v5.10.160 kernel from Armbian, and is detectable with fio. Edited December 9, 2024 by AaronNGray
AaronNGray Posted December 9, 2024 Author Armbian 24.11.1 Ubuntu Noble GNOME - BananaPi-M7
$ uname -a
Linux bananapim7 6.1.75-vendor-rk35xx #1 SMP Tue Nov 12 08:48:32 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux
$ sudo fio --name=randwrite --ioengine=sync --rw=write --bs=4k --numjobs=1 --size=1G --runtime=120s --time_based --directory=/mnt/nvme0n1p1
randwrite: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=1
fio-3.36
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][w=932MiB/s][w=239k IOPS][eta 00m:00s]
randwrite: (groupid=0, jobs=1): err= 0: pid=5992: Mon Dec 9 23:30:24 2024
  write: IOPS=161k, BW=628MiB/s (659MB/s)(74.0GiB/120602msec); 0 zone resets
    clat (usec): min=3, max=484604, avg= 4.52, stdev=419.98
     lat (usec): min=3, max=484605, avg= 4.61, stdev=419.98
    clat percentiles (nsec):
     |  1.00th=[ 3504],  5.00th=[ 3504], 10.00th=[ 3504], 20.00th=[ 3504],
     | 30.00th=[ 3792], 40.00th=[ 3792], 50.00th=[ 3792], 60.00th=[ 3792],
     | 70.00th=[ 3792], 80.00th=[ 4080], 90.00th=[ 4640], 95.00th=[ 4640],
     | 99.00th=[ 6432], 99.50th=[ 8160], 99.90th=[11712], 99.95th=[13376],
     | 99.99th=[19584]
   bw (  KiB/s): min= 3848, max=991048, per=100.00%, avg=663199.83, stdev=303212.13, samples=234
   iops        : min= 962, max=247762, avg=165799.98, stdev=75803.01, samples=234
  lat (usec)   : 4=77.06%, 10=22.68%, 20=0.25%, 50=0.01%, 100=0.01%
  lat (usec)   : 250=0.01%, 500=0.01%, 750=0.01%
  lat (msec)   : 4=0.01%, 10=0.01%, 50=0.01%, 100=0.01%, 250=0.01%
  lat (msec)   : 500=0.01%
  cpu          : usr=11.32%, sys=79.07%, ctx=41238, majf=0, minf=9
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,19398657,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
  WRITE: bw=628MiB/s (659MB/s), 628MiB/s-628MiB/s (659MB/s-659MB/s), io=74.0GiB (79.5GB), run=120602-120602msec
$ sudo fio --name=randwrite --ioengine=sync --rw=write --bs=4k --numjobs=1 --size=1G --runtime=600s --time_based --directory=/mnt/nvme0n1p1
randwrite: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=1
fio-3.36
Starting 1 process
fio: io_u error on file /mnt/nvme0n1p1/randwrite.0.0: Input/output error: write offset=0, buflen=4096
fio: pid=6257, err=5/file:io_u.c:1896, func=io_u error, error=Input/output error
randwrite: (groupid=0, jobs=1): err= 5 (file:io_u.c:1896, func=io_u error, error=Input/output error): pid=6257: Mon Dec 9 23:38:19 2024
  write: IOPS=37.5k, BW=146MiB/s (154MB/s)(6144MiB/41945msec); 0 zone resets
    clat (usec): min=3, max=354943, avg= 4.95, stdev=494.19
     lat (usec): min=3, max=354944, avg= 5.05, stdev=494.19
    clat percentiles (nsec):
     |  1.00th=[ 3504],  5.00th=[ 3504], 10.00th=[ 3504], 20.00th=[ 3504],
     | 30.00th=[ 3792], 40.00th=[ 3792], 50.00th=[ 3792], 60.00th=[ 3792],
     | 70.00th=[ 4384], 80.00th=[ 4640], 90.00th=[ 4960], 95.00th=[ 5856],
     | 99.00th=[ 8160], 99.50th=[11328], 99.90th=[21120], 99.95th=[24960],
     | 99.99th=[36096]
   bw (  KiB/s): min=44894, max=943952, per=100.00%, avg=599182.00, stdev=308987.51, samples=21
   iops        : min=11223, max=235988, avg=149795.48, stdev=77246.82, samples=21
  lat (usec)   : 4=62.59%, 10=36.72%, 20=0.57%, 50=0.11%, 100=0.01%
  lat (usec)   : 500=0.01%, 750=0.01%
  lat (msec)   : 500=0.01%
  cpu          : usr=2.61%, sys=19.77%, ctx=79, majf=0, minf=21
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,1572865,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
  WRITE: bw=146MiB/s (154MB/s), 146MiB/s-146MiB/s (154MB/s-154MB/s), io=6144MiB (6442MB), run=41945-41945msec
Still getting the same issue :(
@amazingfate I am wondering, since the Sige7 seems AFAICT to be produced by SINOVOIP BPi (there's a BPi logo on one I have seen), whether this may be a board version issue? https://www.armsom.org/sige7 Edited December 9, 2024 by AaronNGray
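In the two runs above, the 120-second job completed while the 600-second job aborted with an I/O error after roughly 42 seconds (run=41945msec), so the failure point varies. One way to make such tests repeatable is to step through increasing durations and stop at the first failure. This is a minimal sketch: `run_escalating`, `fio_write_test` and the `TARGET_DIR` variable are hypothetical names introduced here, and it assumes fio exits with a non-zero status when a job hits an I/O error. Note also that the job is named `randwrite`, but `--rw=write` requests sequential writes.

```shell
# run_escalating CMD DURATION...
# Calls CMD once per duration (passed as its only argument) and stops
# at the first run that returns a non-zero exit status.
run_escalating() {
    cmd=$1
    shift
    for secs in "$@"; do
        if ! "$cmd" "$secs"; then
            echo "failed at ${secs}s"
            return 1
        fi
    done
    echo "all runs passed"
}

# Hypothetical wrapper around the fio command used in this thread.
# TARGET_DIR is an assumed variable; it defaults to /mnt.
fio_write_test() {
    sudo fio --name=randwrite --ioengine=sync --rw=write --bs=4k \
        --numjobs=1 --size=1G --runtime="${1}s" --time_based \
        --directory="${TARGET_DIR:-/mnt}"
}

# Example:
#   TARGET_DIR=/mnt/nvme0n1p1 run_escalating fio_write_test 120 600 1200
```

Keeping the test command as a parameter makes it easy to swap in `stress` or a different fio job without changing the loop.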
amazingfate Posted December 10, 2024 The BPi M7 is produced by ArmSoM, and BPi is an investor in ArmSoM, so there is a BPi logo on the board.
AaronNGray Posted December 10, 2024 Author @amazingfate Would you look at your board to see if there is any sign of a version number or any other codes on it, please? Also, would you mind either giving a link to your OS image or uploading it to Google Drive so I can try running it? I will take my board out of its case and examine it properly first thing tomorrow. Thank you for all your help! Edited December 10, 2024 by AaronNGray
amazingfate Posted December 10, 2024 I'm sorry, the image I downloaded from https://www.armbian.com/armsom-sige7/ is running mainline kernel 6.12.1. I did not check the 6.1 vendor kernel. My board is at home and I don't remember the board version. I will test with the 6.1 kernel again when I get home today. BTW, what kind of NVMe SSD are you using?
Igor Posted December 10, 2024
9 hours ago, AaronNGray said: I have been using Debian since 2017 on BananaPi-M3's, before then two pairs of dual rsync'ed Fujitsu-Siemens Primergy 470 then separately dual rsync'ed HP DL 140 G3's servers from 2006 RedHat Fedora Core 4 to Fedora 22 in 2016. I had 36 hours of downtime due to internet outages and a power outage, and one hour which was my fault
Debian on a PC and Debian on an SBC are quite different. Userspace, the less relevant part of the OS, comes from Debian, yes. Ubuntu userspace is, on the other hand, more polished, usually with more up-to-date versions of userspace libraries. TL;DR: it's the better choice. In the process of OS assembly we remove the Canonical stuff, which means our Ubuntu is more or less the same as Debian, just more polished. The kernel is responsible for most of the troubles people report on this forum or have with those boards, and that has little to do with Debian and Ubuntu. Their kernels mostly don't even boot. And when they do, people only have significantly more trouble interacting with hardware: low-level instability such as you are experiencing.
9 hours ago, AaronNGray said: headless target machine
Then a mainline-based kernel should be what you are looking for. I have two such machines (Rock 5b) with SSD drives, running GitHub runners. We assemble images on them. They have been in production for about a year, running an old test version of the mainline kernel (Linux 6.7.0-rc1-edge-rockchip-rk3588): https://paste.armbian.com/fibacokoji
eselarm Posted December 10, 2024 @AaronNGray As amazingfate hints, running a 6.12 mainline-based kernel would likely get you further, and faster, in figuring out why the NVMe/PCIe storage fails. It is an RK3588, so pretty standard. It might then simply be that your specific type of NVMe (and/or its internal firmware variant/version) is the cause of the problems. I have been following many efforts to get a Pi5 running with NVMe, and not every NVMe is guaranteed to work. The problem with the Pi5 is the extra flat cable, the interface board and the powering of it; this is N/A for RK3588 SBCs with an on-board M.2 M-key slot.
If you only need headless/server in the end, use that. And maybe not noble but bookworm based. I spent several hours yesterday discovering that my carefully prepared noble image for the Rock3A is unstable w.r.t. the SD card at boot, while the same kernel version with bookworm has no issues. That is another topic and another board, but still maybe a hint. I currently have no clue (yet) why the noble install ran fine in a VM on an RK3588S (pinned to 2x Cortex-A55) and not on the real RK3568 (4x Cortex-A55, many more HW peripherals). It seems to be something like the U-Boot config/version.
going Posted December 10, 2024
On 06.12.2024 at 22:11, AaronNGray said: I have installed a Samsung 970 EVO Plus 2 TB PCIe NVMe M.2 Internal Solid State Drive (SSD) (MZ-V7S2T0) device on the board.
Are we discussing this particular NVMe here? If so, please publish the diagnostics from smartctl. This will help shed some light on some issues. There are two unpleasant possibilities here.
1) An old and actively used disk with a large number of failed memory cells. The internal disk controller remaps exhausted memory cells during write operations, which greatly reduces the write speed. Disk manufacturers warn about this.
2) The internal disk controller works well only with a few file systems such as FAT, NTFS and ISO. I have two such devices, an SD card and an SSD drive from Samsung. On EXT4 and BTRFS file systems, the speed of read and write operations is greatly reduced. This is my personal experience, and it applies to devices that were manufactured in 2008 and 2010.
going Posted December 10, 2024
2 hours ago, eselarm said: Then it might simply be that your specific type of NVME (and/or its internal firmware variant/version) might be the cause of the problems. I have been following many efforts getting a Pi5 running with NVME and not every NVME is guaranteed to work. Problem with Pi5 is that extra flatcable and interface board and powering that is well, this is N/A for RK3588 SBCs with M.2 M-key on-board.
I would like to ask for a little more detail about this problem. Perhaps the translation was inaccurate.
AaronNGray Posted December 10, 2024 Author @eselarm I have several Samsung 970 EVO Plus NVMe M.2 SSDs: https://www.samsung.com/uk/memory-storage/nvme-ssd/970-evo-plus-nvme-m-2-ssd-2tb-mz-v7s2t0bw/ I also have an old SK Hynix HFS256GD9TNG-L2A0A BA that I will test with.
AaronNGray Posted December 10, 2024 Author @going Not sure but the NVMe might be the common factor.
AaronNGray Posted December 10, 2024 Author @amazingfate I have a v1.1 board, which seems to be the most common version AFAICT.
AaronNGray Posted December 10, 2024 Author @eselarm I have several Samsung 970 EVO Plus NVMe M.2 SSDs: https://www.samsung.com/uk/memory-storage/nvme-ssd/970-evo-plus-nvme-m-2-ssd-2tb-mz-v7s2t0bw/ I also have an old SK Hynix HFS256GD9TNG-L2A0A BA that I will test with once it's backed up. Can people please say what make of M.2 NVMe they are using with their RK3588 SBCs?
eselarm Posted December 10, 2024 @AaronNGray I have the 500G variant in my NanoPi-R6C:
Model: "Samsung SSD 970 EVO Plus 500GB" FW Rev: 1B2QEXM7
You can use the tool 'nvme list' to see your details. This 970 EVO Plus series is well known; I don't expect it to be a source of trouble. A year ago a Raspberry Pi engineer mentioned fantastic speed on a Pi5 (via some adaptor board that was available at that time). I think it was PCIe v3 mode, which is not formally supported on a Pi5 but generally works. But that is just some side info. Your board has 4-lane PCIe v3, formally supported by the RK3588. My RK3588S-based NanoPi-R6C has only 1-lane PCIe v2, and I have not seen any issues in dmesg.
Have you already booted/tested with 6.12.2 mainline or later? You don't need to build it; it is available via apt from the Armbian beta repo. Make sure the correct files are in /boot (or wherever your U-Boot looks at power-on). Edited December 10, 2024 by eselarm
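The lane count and PCIe generation mentioned above can be compared with what the link actually negotiated. The following is a rough sketch, assuming `lspci` from pciutils is installed and that its verbose output contains `LnkCap:` (capability) and `LnkSta:` (negotiated status) lines. The helper name `pcie_link_lines` is made up for this example.

```shell
# Hypothetical helper: reduce `lspci -vv` output (stdin) to the PCIe link
# capability and link status lines, with leading whitespace removed.
pcie_link_lines() {
    grep -E 'LnkCap:|LnkSta:' | sed 's/^[[:space:]]*//'
}

# Example:
#   sudo lspci -vv | pcie_link_lines
#   sudo nvme list        # model and firmware revision, as suggested above
```

A status line reporting a lower speed or width than the capability line would be worth mentioning when reporting the problem, although this thread does not establish that link training was involved here.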
amazingfate Posted December 10, 2024 I have a v1.1 board. I have now booted a noble CLI image with the 6.1.84 vendor kernel. The 120s fio test is fine:
$ sudo fio --name=randwrite --ioengine=sync --rw=write --bs=4k --numjobs=1 --size=1G --runtime=120s --time_based --directory=/mnt
randwrite: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=1
fio-3.36
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][w=1079MiB/s][w=276k IOPS][eta 00m:00s]
randwrite: (groupid=0, jobs=1): err= 0: pid=4531: Wed Dec 11 01:49:08 2024
  write: IOPS=229k, BW=895MiB/s (939MB/s)(105GiB/120116msec); 0 zone resets
    clat (nsec): min=1166, max=1465.6M, avg=3015.61, stdev=345679.23
     lat (nsec): min=1458, max=1465.6M, avg=3105.60, stdev=345687.63
    clat percentiles (nsec):
     |  1.00th=[ 1752],  5.00th=[ 1752], 10.00th=[ 1752], 20.00th=[ 2040],
     | 30.00th=[ 2040], 40.00th=[ 2040], 50.00th=[ 2040], 60.00th=[ 2320],
     | 70.00th=[ 3216], 80.00th=[ 3216], 90.00th=[ 3504], 95.00th=[ 4080],
     | 99.00th=[ 6112], 99.50th=[ 7584], 99.90th=[ 11072], 99.95th=[ 13376],
     | 99.99th=[749568]
   bw (  KiB/s): min=49304, max=1457152, per=100.00%, avg=929145.40, stdev=313692.78, samples=236
   iops        : min=12326, max=364288, avg=232286.35, stdev=78423.29, samples=236
  lat (usec)   : 2=15.55%, 4=79.07%, 10=5.21%, 20=0.15%, 50=0.01%
  lat (usec)   : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
  lat (msec)   : 100=0.01%, 2000=0.01%
  cpu          : usr=15.40%, sys=77.45%, ctx=52410, majf=0, minf=22
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,27525121,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
  WRITE: bw=895MiB/s (939MB/s), 895MiB/s-895MiB/s (939MB/s-939MB/s), io=105GiB (113GB), run=120116-120116msec
Disk stats (read/write):
  nvme0n1: ios=1/113583, sectors=8/216264040, merge=0/248, ticks=9/45093762, in_queue=45095689, util=90.05%
I'm testing with a 256G NVMe SSD.
AaronNGray Posted December 10, 2024 Author @amazingfate You need to run it for at least 10 minutes initially for the fault to occur.
amazingfate Posted December 11, 2024 After 20 minutes of fio testing, the NVMe disk is still fine:
$ sudo fio --name=randwrite --ioengine=sync --rw=write --bs=4k --numjobs=1 --size=1G --runtime=1200s --time_based --directory=/mnt
[sudo] password for jfliu:
randwrite: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=1
fio-3.36
Starting 1 process
Jobs: 1 (f=1): [W(1)][93.4%][w=93.4MiB/s][w=23.9k IOPS][eta 01m:19s]
Jobs: 1 (f=1): [W(1)][100.0%][eta 00m:00s]
randwrite: (groupid=0, jobs=1): err= 0: pid=5569: Wed Dec 11 10:26:57 2024
  write: IOPS=49.7k, BW=194MiB/s (203MB/s)(227GiB/1200478msec); 0 zone resets
    clat (nsec): min=1166, max=3770.1M, avg=18646.17, stdev=2441229.11
     lat (nsec): min=1458, max=3770.1M, avg=18735.33, stdev=2441230.02
    clat percentiles (nsec):
     |  1.00th=[ 1464],  5.00th=[ 1752], 10.00th=[ 2040],
     | 20.00th=[ 2040], 30.00th=[ 2320], 40.00th=[ 2640],
     | 50.00th=[ 2640], 60.00th=[ 2928], 70.00th=[ 3216],
     | 80.00th=[ 3216], 90.00th=[ 4080], 95.00th=[ 4640],
     | 99.00th=[ 7904], 99.50th=[ 10560], 99.90th=[8716288],
     | 99.95th=[9109504], 99.99th=[9240576]
   bw (  KiB/s): min= 24, max=1543457, per=100.00%, avg=227861.36, stdev=262064.21, samples=2093
   iops        : min= 6, max=385864, avg=56965.30, stdev=65516.07, samples=2093
  lat (usec)   : 2=7.53%, 4=80.62%, 10=11.27%, 20=0.41%, 50=0.03%
  lat (usec)   : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.13%, 20=0.01%, 50=0.01%
  lat (msec)   : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2000=0.01%, >=2000=0.01%
  cpu          : usr=3.12%, sys=17.15%, ctx=251826, majf=0, minf=12
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,59630438,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
  WRITE: bw=194MiB/s (203MB/s), 194MiB/s-194MiB/s (203MB/s-203MB/s), io=227GiB (244GB), run=1200478-1200478msec
Disk stats (read/write):
  nvme0n1: ios=0/265014, sectors=0/414735560, merge=0/734, ticks=0/1115350653, in_queue=1115646197, util=98.45%
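The bandwidth figure fio prints can be cross-checked from the `issued rwts` count, the 4 KiB block size and the run time. A small sketch of that arithmetic follows; the function name `bw_mib_per_s` is invented here, and the inputs are the numbers from the fio reports in this thread.

```shell
# bw_mib_per_s WRITES BLOCK_BYTES RUNTIME_MS
# Average write bandwidth in MiB/s, rounded to the nearest integer:
#   (writes * block_bytes / 2^20) / (runtime_ms / 1000)
bw_mib_per_s() {
    awk -v n="$1" -v bs="$2" -v ms="$3" \
        'BEGIN { printf "%.0f\n", (n * bs / 1048576) / (ms / 1000) }'
}

# 20-minute run: 59630438 writes of 4096 bytes in 1200478 ms
#   bw_mib_per_s 59630438 4096 1200478    # prints 194
# 120-second run on the same board: 27525121 writes in 120116 ms
#   bw_mib_per_s 27525121 4096 120116     # prints 895
```

Both results match fio's own BW lines (194MiB/s and 895MiB/s). The average over 20 minutes is much lower than over 120 seconds, a drop the thread does not investigate further.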
AaronNGray Posted December 11, 2024 Author Okay, feeling very stupid now, sorry for so much noise! I tried a different SSD, an SK Hynix HFS256GD9TNG-L2A0A BA, and it's working perfectly, passing the 20-minute fio test. The new, out-of-the-box Samsung 970 EVO Plus NVMe seems to be faulty! The Samsung is giving the following:
```
$ sudo smartctl --all /dev/nvme0n1
smartctl 7.4 2023-08-01 r5530 [aarch64-linux-6.1.75-vendor-rk35xx] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       Samsung SSD 970 EVO Plus 2TB
Serial Number:                      S6P1NF0WA20535R
Firmware Version:                   4B2QEXM7
PCI Vendor/Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Total NVM Capacity:                 2,000,398,934,016 [2.00 TB]
Unallocated NVM Capacity:           0
Controller ID:                      6
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          2,000,398,934,016 [2.00 TB]
Namespace 1 Utilization:            38,269,112,320 [38.2 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            002538 5a31b1287c
Local Time is:                      Tue Dec 10 00:54:31 2024 GMT
Firmware Updates (0x16):            3 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x0057):     Comp Wr_Unc DS_Mngmt Sav/Sel_Feat Timestmp
Log Page Attributes (0x0f):         S/H_per_NS Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size:         128 Pages
Warning  Comp. Temp. Threshold:     82 Celsius
Critical Comp. Temp. Threshold:     85 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     7.59W       -        -    0  0  0  0        0       0
 1 +     7.59W       -        -    1  1  1  1        0     200
 2 +     7.59W       -        -    2  2  2  2        0    1000
 3 -   0.0500W       -        -    3  3  3  3     2000    1200
 4 -   0.0050W       -        -    4  4  4  4      500    9500

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        29 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    8,405 [4.30 GB]
Data Units Written:                 272,389 [139 GB]
Host Read Commands:                 103,937
Host Write Commands:                621,088
Controller Busy Time:               3
Power Cycles:                       372
Power On Hours:                     25
Unsafe Shutdowns:                   339
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               29 Celsius
Temperature Sensor 2:               36 Celsius

Error Information (NVMe Log 0x01, 16 of 64 entries)
No Errors Logged

Read Self-test Log failed: Invalid Field in Command (0x002)
```
I will try updating the firmware.
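The report above passes the overall health check and logs no media errors, yet it shows 339 unsafe shutdowns across 372 power cycles in 25 power-on hours. Whether those counts come from the drive fault or simply from resets during testing is not established in the thread. To compare such counters between drives or between test runs, the relevant lines can be pulled out of the smartctl text. This is a minimal sketch: `smart_summary` is a hypothetical helper, and the choice of fields is an assumption about which counters are interesting.

```shell
# Hypothetical helper: read `smartctl --all` text on stdin and print a few
# NVMe health counters as NAME=VALUE, with thousands separators removed.
smart_summary() {
    awk -F': *' '
        /^(Power Cycles|Power On Hours|Unsafe Shutdowns|Media and Data Integrity Errors|Error Information Log Entries):/ {
            gsub(/,/, "", $2)
            print $1 "=" $2
        }'
}

# Example:
#   sudo smartctl --all /dev/nvme0n1 | smart_summary
```

Running it before and after a fio run would show which counters, if any, change when the I/O error occurs.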
AaronNGray Posted December 15, 2024 Author No new firmware updates were available, so I ordered a WD Blue SN580 2TB SSD, which is now working fine; it was soak tested overnight without problems. My next issue is finding the right kernel. I would still like to use Debian if possible as well.