Posted

@eselarm @Igor Unfortunately, on the BananaPi-M7 a self-built kernel (Linux bananapim7 6.1.84-vendor-rk35xx) still has PCIe M.2 NVMe stability issues when running `stress --verbose --hdd 4`; they show up after 5 minutes or so.

 

@amazingfate Can you please run `stress --verbose --hdd 4` for 20 minutes, just to verify the sige7 is stable? If so, I might get one myself.

  

@going I would much appreciate your input on this.

 

The USB-C monitoring device I got records only maximums, not minimums, so it is very hard to gain any useful information from it. Even when monitoring with a test meter that records minimums, valid minimums are hard to obtain. Ideally a separate RPi I2C monitoring device is required to know what is going on.

 

Armbian Debian seems very unstable, especially at startup, with Firefox tabs dying and Chromium not even starting up.

 

Building 6.1.84-vendor-rk35xx and 6.12.2 Ubuntu Noble images for further testing...

Posted

I don't use my nvme disk as root partition, so I use fio to test the io performance:

$ sudo fio --name=randwrite --ioengine=sync --rw=write --bs=4k --numjobs=1 --size=1G --runtime=1200s --time_based --directory=/mnt
randwrite: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=1
fio-3.36
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][eta 00m:00s]                          
randwrite: (groupid=0, jobs=1): err= 0: pid=4208: Tue Dec 10 00:35:12 2024
  write: IOPS=47.6k, BW=186MiB/s (195MB/s)(219GiB/1205176msec); 0 zone resets
    clat (nsec): min=875, max=3507.7M, avg=3209.46, stdev=612372.83
     lat (nsec): min=1166, max=3507.7M, avg=3291.55, stdev=612373.49
    clat percentiles (nsec):
     |  1.00th=[ 1464],  5.00th=[ 1752], 10.00th=[ 2040], 20.00th=[ 2040],
     | 30.00th=[ 2040], 40.00th=[ 2320], 50.00th=[ 2320], 60.00th=[ 2320],
     | 70.00th=[ 2320], 80.00th=[ 2320], 90.00th=[ 2640], 95.00th=[ 3504],
     | 99.00th=[ 5280], 99.50th=[ 5536], 99.90th=[ 9664], 99.95th=[53504],
     | 99.99th=[91648]
   bw (  KiB/s): min=  384, max=1467567, per=100.00%, avg=734206.83, stdev=482602.67, samples=626
   iops        : min=   96, max=366891, avg=183551.56, stdev=120650.66, samples=626
  lat (nsec)   : 1000=0.01%
  lat (usec)   : 2=9.80%, 4=86.66%, 10=3.45%, 20=0.03%, 50=0.01%
  lat (usec)   : 100=0.04%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
  lat (msec)   : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2000=0.01%, >=2000=0.01%
  cpu          : usr=2.85%, sys=14.76%, ctx=193337, majf=0, minf=21
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,57409537,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=186MiB/s (195MB/s), 186MiB/s-186MiB/s (195MB/s-195MB/s), io=219GiB (235GB), run=1205176-1205176msec

Disk stats (read/write):
  nvme0n1: ios=0/234874, sectors=0/457824096, merge=0/198, ticks=0/281679979, in_queue=281774755, util=92.39%

Posted
2 hours ago, AaronNGray said:

still has PCIe M2 NVMe stability issues


If you didn't fix the PCI driver, it must have the same issues. But it's good to know (and sad to hear) that the upgrade doesn't fix this.

 

2 hours ago, AaronNGray said:

Armbian Debian seems very unstable, especially at startup, with Firefox tabs dying and Chromium not even starting up.


Again you want to beat our recommendations :) This is (close to) embedded Linux. You have to use what we tell you to use.

 


Or learn the hard way... Debian is in worse condition (only on this platform) and there is nothing we can do about it, except removing it. Chromium has been broken for a few weeks and will remain broken until Debian fixes it, while Chromium for Armbian Ubuntu is compiled and provided by us. And it works.

 

A workaround is installing the previous version; you will still lack video acceleration, but it won't crash.

 

2 hours ago, AaronNGray said:

Building 6.1.84-vendor-rk35xx and 6.12.2 Ubuntu Noble images for further testing...


Remember to include the right extensions:
 

ENABLE_EXTENSIONS: "v4l2loopback-dkms,mesa-vpu"

 

or the experience will be bad.
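As a rough illustration of where that setting could go when building locally, the sketch below only prints a build invocation and does not run it. It assumes the Armbian build framework's `compile.sh` takes `KEY=VALUE` parameters on its command line; the `BOARD`, `BRANCH` and `RELEASE` defaults and the helper name `armbian_build_cmd` are assumptions for this example, not values confirmed in this thread.

```shell
# Sketch only: print (do not run) a build invocation that carries the
# extensions named above. BOARD, BRANCH and RELEASE defaults are assumptions.
armbian_build_cmd() {
    board="${1:-bananapim7}"   # hypothetical default board name
    branch="${2:-vendor}"      # hypothetical default kernel branch
    release="${3:-noble}"      # hypothetical default userspace release
    printf './compile.sh BOARD=%s BRANCH=%s RELEASE=%s ENABLE_EXTENSIONS=%s\n' \
        "$board" "$branch" "$release" "v4l2loopback-dkms,mesa-vpu"
}
```

Calling `armbian_build_cmd` prints the command line so it can be reviewed before being pasted into a shell inside the build checkout.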

Posted
Quote

$ sudo fio --name=randwrite --ioengine=sync --rw=write --bs=4k --numjobs=1 --size=1G --runtime=1200s --time_based --directory=/mnt
randwrite: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=1

@amazingfate This is failing repeatedly, at indeterminate times, on 6.1.84-vendor-rk35xx on my build of Armbian Debian on BananaPi-M7.
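When comparing runs like these, it can help to pull just the job error code out of a saved fio log. The sketch below assumes the summary line has the shape `name: (groupid=N, jobs=N): err= N` as in the fio 3.36 output in this thread; the helper name `fio_job_err` is made up for this example.

```shell
# Sketch: read fio output on stdin and print the err code from each
# job summary line ("<name>: (groupid=N, jobs=N): err= N ...").
fio_job_err() {
    sed -nE 's/^[^:]+: \(groupid=[0-9]+, jobs=[0-9]+\): err= *([0-9]+).*/\1/p'
}
```

For example, `fio_job_err < fio.log` prints one number per job; anything other than 0 marks a failed job.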

 

Posted (edited)

Hi @Igor, just to let you know, what I am actually after is a headless target machine with 32GB of RAM and 1/2 TB of NVMe SSD. The only reason I wanted a GUI and browser was for easy reporting of state. To fill you in, I am trying to get a green, low-power ARM solution for a number of climate-change-related projects: serving websites, a social media system, and a dedicated crawler and search engine for scientific fields. I have been using Debian since 2017 on BananaPi-M3's; before then, two pairs of dual rsync'ed Fujitsu-Siemens Primergy 470 servers, then separately dual rsync'ed HP DL 140 G3 servers, running from RedHat Fedora Core 4 in 2006 to Fedora 22 in 2016. I had 36 hours of downtime due to internet outages and a power outage, and one hour which was my fault.

 

Edited by AaronNGray
Posted (edited)

The BananaPi-M7 NVMe issue is also present in the original Ubuntu Desktop 22.04.3 release from Armbian with the v5.10.160 kernel, detectable with fio.

 

Edited by AaronNGray
Posted (edited)

Armbian 24.11.1 Ubuntu Noble GNOME - BananaPi-M7
 

$ uname -a
Linux bananapim7 6.1.75-vendor-rk35xx #1 SMP Tue Nov 12 08:48:32 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux
 

$ sudo fio --name=randwrite --ioengine=sync --rw=write --bs=4k --numjobs=1 --size=1G --runtime=120s --time_based --directory=/mnt/nvme0n1p1
randwrite: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=1
fio-3.36
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][w=932MiB/s][w=239k IOPS][eta 00m:00s]
randwrite: (groupid=0, jobs=1): err= 0: pid=5992: Mon Dec  9 23:30:24 2024
  write: IOPS=161k, BW=628MiB/s (659MB/s)(74.0GiB/120602msec); 0 zone resets
    clat (usec): min=3, max=484604, avg= 4.52, stdev=419.98
     lat (usec): min=3, max=484605, avg= 4.61, stdev=419.98
    clat percentiles (nsec):
     |  1.00th=[ 3504],  5.00th=[ 3504], 10.00th=[ 3504], 20.00th=[ 3504],
     | 30.00th=[ 3792], 40.00th=[ 3792], 50.00th=[ 3792], 60.00th=[ 3792],
     | 70.00th=[ 3792], 80.00th=[ 4080], 90.00th=[ 4640], 95.00th=[ 4640],
     | 99.00th=[ 6432], 99.50th=[ 8160], 99.90th=[11712], 99.95th=[13376],
     | 99.99th=[19584]
   bw (  KiB/s): min= 3848, max=991048, per=100.00%, avg=663199.83, stdev=303212.13, samples=234
   iops        : min=  962, max=247762, avg=165799.98, stdev=75803.01, samples=234
  lat (usec)   : 4=77.06%, 10=22.68%, 20=0.25%, 50=0.01%, 100=0.01%
  lat (usec)   : 250=0.01%, 500=0.01%, 750=0.01%
  lat (msec)   : 4=0.01%, 10=0.01%, 50=0.01%, 100=0.01%, 250=0.01%
  lat (msec)   : 500=0.01%
  cpu          : usr=11.32%, sys=79.07%, ctx=41238, majf=0, minf=9
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,19398657,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=628MiB/s (659MB/s), 628MiB/s-628MiB/s (659MB/s-659MB/s), io=74.0GiB (79.5GB), run=120602-120602msec

 

$ sudo fio --name=randwrite --ioengine=sync --rw=write --bs=4k --numjobs=1 --size=1G --runtime=600s --time_based --directory=/mnt/nvme0n1p1
randwrite: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=1
fio-3.36
Starting 1 process
fio: io_u error on file /mnt/nvme0n1p1/randwrite.0.0: Input/output error: write offset=0, buflen=4096
fio: pid=6257, err=5/file:io_u.c:1896, func=io_u error, error=Input/output error

randwrite: (groupid=0, jobs=1): err= 5 (file:io_u.c:1896, func=io_u error, error=Input/output error): pid=6257: Mon Dec  9 23:38:19 2024
  write: IOPS=37.5k, BW=146MiB/s (154MB/s)(6144MiB/41945msec); 0 zone resets
    clat (usec): min=3, max=354943, avg= 4.95, stdev=494.19
     lat (usec): min=3, max=354944, avg= 5.05, stdev=494.19
    clat percentiles (nsec):
     |  1.00th=[ 3504],  5.00th=[ 3504], 10.00th=[ 3504], 20.00th=[ 3504],
     | 30.00th=[ 3792], 40.00th=[ 3792], 50.00th=[ 3792], 60.00th=[ 3792],
     | 70.00th=[ 4384], 80.00th=[ 4640], 90.00th=[ 4960], 95.00th=[ 5856],
     | 99.00th=[ 8160], 99.50th=[11328], 99.90th=[21120], 99.95th=[24960],
     | 99.99th=[36096]
   bw (  KiB/s): min=44894, max=943952, per=100.00%, avg=599182.00, stdev=308987.51, samples=21
   iops        : min=11223, max=235988, avg=149795.48, stdev=77246.82, samples=21
  lat (usec)   : 4=62.59%, 10=36.72%, 20=0.57%, 50=0.11%, 100=0.01%
  lat (usec)   : 500=0.01%, 750=0.01%
  lat (msec)   : 500=0.01%
  cpu          : usr=2.61%, sys=19.77%, ctx=79, majf=0, minf=21
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,1572865,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=146MiB/s (154MB/s), 146MiB/s-146MiB/s (154MB/s-154MB/s), io=6144MiB (6442MB), run=41945-41945msec
 

Still getting the same issue : (
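An `Input/output error` from fio usually has a matching trace in the kernel log, so it may be worth capturing that alongside the fio output. The filter below is a minimal sketch; the pattern list is an assumption and not exhaustive, and the helper names are made up for this example.

```shell
# Sketch: keep only kernel log lines (from stdin) that mention NVMe, PCIe
# or a block-layer I/O error. The pattern list is an assumption.
nvme_log_filter() {
    grep -iE 'nvme|pcie|i/o error' || true   # no match is not a failure
}

# Count the matching lines; prints 0 when nothing matches.
nvme_log_count() {
    nvme_log_filter | wc -l | tr -d ' '
}
```

Typical use would be `sudo dmesg | nvme_log_filter` right after a failed run, before rebooting.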

 

@amazingfate Since the sige7 seems, AFAICT, to be produced by SINOVOiP BPi (there is a BPi logo on one I have seen), I am wondering whether this may be a board version issue?

 

https://www.armsom.org/sige7

 

 

Edited by AaronNGray
Posted (edited)

@amazingfate Would you look on your board to see if there is any sign of a version number or any other codes on it please ?

 

Also, would you mind either giving a link or uploading your OS image to Google Drive so I can try running it please ?

 

I will take my board out of its case and examine it properly first thing tomorrow.

 

Thank you for all your help !

 

Edited by AaronNGray
Posted
9 hours ago, AaronNGray said:

I have been using Debian since 2017 on BananaPi-M3's; before then, two pairs of dual rsync'ed Fujitsu-Siemens Primergy 470 servers, then separately dual rsync'ed HP DL 140 G3 servers, running from RedHat Fedora Core 4 in 2006 to Fedora 22 in 2016. I had 36 hours of downtime due to internet outages and a power outage, and one hour which was my fault.


Debian on a PC and Debian on an SBC are quite different. Userspace, the less relevant part of the OS, comes from there, yes. Ubuntu userspace is, on the other hand, more polished, usually with more up-to-date versions of userspace libraries. In short, it's the better choice. In the process of OS assembly we remove the Canonical stuff, which means our Ubuntu is more or less the same as Debian, just more polished. The kernel is responsible for most of the troubles people report on this forum or have with those boards, and that has little to do with Debian or Ubuntu. Their kernels mostly don't even boot. And when they do, people only have significantly more trouble interacting with hardware: low-level instability such as you are experiencing.

 

9 hours ago, AaronNGray said:

headless target machine


Then a mainline-based kernel should be what you are looking for. I have two such machines (Rock 5b) with SSD drives, running GitHub runners. We assemble images on them. They have been in production for about a year, running an old test version of the mainline kernel (Linux 6.7.0-rc1-edge-rockchip-rk3588): https://paste.armbian.com/fibacokoji

Posted

@AaronNGray As amazingfate hints, running a 6.12 mainline based kernel would likely get you further a bit faster in figuring out why the NVME/PCIe storage fails. It is a RK3588, so pretty standard.

Then it might simply be that your specific type of NVMe (and/or its internal firmware variant/version) is the cause of the problems. I have been following many efforts to get a Pi5 running with NVMe, and not every NVMe is guaranteed to work. The problem with the Pi5 is the extra flat cable, the interface board, and the powering of it; that does not apply to RK3588 SBCs with an on-board M.2 M-key slot.

 

If you only need headless/server in the end, use that. And maybe not noble but bookworm based. I spent several hours yesterday discovering that my carefully prepared noble image for Rock3A is unstable w.r.t. the SD card at boot, while the same kernel version with bookworm has no issues. That is another topic and another board, but still maybe a hint. I currently have no clue (yet) why the noble install ran fine in a VM on RK3588S (pinned to 2x Cortex-A55) and not on the real RK3568 (4x Cortex-A55, many more HW peripherals). It seems to be related to the U-Boot config/version or so.

Posted
On 06.12.2024 at 22:11, AaronNGray said:

I have installed a Samsung 970 EVO Plus 2 TB PCIe NVMe M.2 Internal Solid State Drive (SSD) (MZ-V7S2T0) device on the board.

Are we discussing this particular NVME here?

If so, then publish the diagnostics from smartctl.

This will help shed some light on some issues.

 

There are two unpleasant possibilities here.
1) An old and heavily used disk with a large number of failed memory cells.
The internal disk controller has to remap the exhausted memory cells during write operations, which greatly reduces the write speed.
Disk manufacturers warn about this.

2) The internal disk controller works well only for a few file systems such as FAT, NTFS, ISO.

I have two such devices, a SD card and an SSD drive from Samsung.

On EXT4 and BTRFS file systems, the speed of read and write operations is greatly reduced.

 

This is my personal experience and this case is for devices that were manufactured in 2008, 2010.

Posted
2 hours ago, eselarm said:

Then it might simply be that your specific type of NVMe (and/or its internal firmware variant/version) is the cause of the problems. I have been following many efforts to get a Pi5 running with NVMe, and not every NVMe is guaranteed to work. The problem with the Pi5 is the extra flat cable, the interface board, and the powering of it; that does not apply to RK3588 SBCs with an on-board M.2 M-key slot.

I will ask for a little more detail about this problem. Perhaps the translation was inaccurate.

Posted (edited)

@AaronNGray I have the 500G variant in my NanoPi-R6C:

Model: "Samsung SSD 970 EVO Plus 500GB"

FW Rev: 1B2QEXM7
 

You can use the tool 'nvme list' to see your details.

 

This 970 EVO Plus series is well known; I don't expect it to be a source of trouble. A year ago a RaspberryPi engineer mentioned fantastic speed on a Pi5 (via some adaptor board that was available at that time). I think it was PCIe v3 mode, which is not formally supported on a Pi5 but generally works. But that is just some side info. Your board has 4-lane PCIe v3, formally supported by the RK3588.

My RK3588S-based NanoPi-R6C has only 1-lane PCIe v2; I have not seen any issues in dmesg.
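To check what link a drive actually negotiated, the `LnkSta` line of `lspci -vv` can be reduced to speed and width. PCIe v3 runs at 8 GT/s per lane and v2 at 5 GT/s, so a 4-lane v3 link would be expected to show `8GT/s x4`. This is a sketch that assumes the `LnkSta: Speed ..., Width ...` layout printed by common pciutils versions; the helper name is made up for this example.

```shell
# Sketch: read `lspci -vv` text on stdin and print "<speed> <width>" for
# every LnkSta line, e.g. "8GT/s x4". The output layout is an assumption.
pcie_link_status() {
    sed -nE 's/.*LnkSta:[[:space:]]*Speed ([0-9.]+GT\/s).*Width (x[0-9]+).*/\1 \2/p'
}
```

Usage would be `sudo lspci -vv | pcie_link_status`; a link that trained at a lower speed or width than expected is worth noting in a report.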

 

Have you already booted/tested with 6.12.2 mainline or later?

You don't need to build it; it is available via apt from the Armbian beta repo. Make sure the correct files are in /boot (or wherever your U-Boot looks at power-on).

 

Edited by eselarm
Posted

I have a v1.1 board.

I have now booted a noble CLI image with the 6.1.84 vendor kernel.

120s fio test is fine:

$ sudo fio --name=randwrite --ioengine=sync --rw=write --bs=4k --numjobs=1 --size=1G --runtime=120s --time_based --directory=/mnt
randwrite: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=1
fio-3.36
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][w=1079MiB/s][w=276k IOPS][eta 00m:00s]
randwrite: (groupid=0, jobs=1): err= 0: pid=4531: Wed Dec 11 01:49:08 2024
  write: IOPS=229k, BW=895MiB/s (939MB/s)(105GiB/120116msec); 0 zone resets
    clat (nsec): min=1166, max=1465.6M, avg=3015.61, stdev=345679.23
     lat (nsec): min=1458, max=1465.6M, avg=3105.60, stdev=345687.63
    clat percentiles (nsec):
     |  1.00th=[  1752],  5.00th=[  1752], 10.00th=[  1752], 20.00th=[  2040],
     | 30.00th=[  2040], 40.00th=[  2040], 50.00th=[  2040], 60.00th=[  2320],
     | 70.00th=[  3216], 80.00th=[  3216], 90.00th=[  3504], 95.00th=[  4080],
     | 99.00th=[  6112], 99.50th=[  7584], 99.90th=[ 11072], 99.95th=[ 13376],
     | 99.99th=[749568]
   bw (  KiB/s): min=49304, max=1457152, per=100.00%, avg=929145.40, stdev=313692.78, samples=236
   iops        : min=12326, max=364288, avg=232286.35, stdev=78423.29, samples=236
  lat (usec)   : 2=15.55%, 4=79.07%, 10=5.21%, 20=0.15%, 50=0.01%
  lat (usec)   : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
  lat (msec)   : 100=0.01%, 2000=0.01%
  cpu          : usr=15.40%, sys=77.45%, ctx=52410, majf=0, minf=22
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,27525121,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=895MiB/s (939MB/s), 895MiB/s-895MiB/s (939MB/s-939MB/s), io=105GiB (113GB), run=120116-120116msec

Disk stats (read/write):
  nvme0n1: ios=1/113583, sectors=8/216264040, merge=0/248, ticks=9/45093762, in_queue=45095689, util=90.05%

 

I'm testing with a 256G NVME SSD.

Posted

After 20 minutes of fio testing, the NVMe disk is still fine:

$ sudo fio --name=randwrite --ioengine=sync --rw=write --bs=4k --numjobs=1 --size=1G --runtime=1200s --time_based --directory=/mnt
[sudo] password for jfliu: 
randwrite: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=1
fio-3.36
Starting 1 process
Jobs: 1 (f=1): [W(1)][93.4%][w=93.4MiB/s][w=23.9k IOPS][eta 01m:19s]

Jobs: 1 (f=1): [W(1)][100.0%][eta 00m:00s]                          
randwrite: (groupid=0, jobs=1): err= 0: pid=5569: Wed Dec 11 10:26:57 2024
  write: IOPS=49.7k, BW=194MiB/s (203MB/s)(227GiB/1200478msec); 0 zone resets
    clat (nsec): min=1166, max=3770.1M, avg=18646.17, stdev=2441229.11
     lat (nsec): min=1458, max=3770.1M, avg=18735.33, stdev=2441230.02
    clat percentiles (nsec):
     |  1.00th=[   1464],  5.00th=[   1752], 10.00th=[   2040],
     | 20.00th=[   2040], 30.00th=[   2320], 40.00th=[   2640],
     | 50.00th=[   2640], 60.00th=[   2928], 70.00th=[   3216],
     | 80.00th=[   3216], 90.00th=[   4080], 95.00th=[   4640],
     | 99.00th=[   7904], 99.50th=[  10560], 99.90th=[8716288],
     | 99.95th=[9109504], 99.99th=[9240576]
   bw (  KiB/s): min=   24, max=1543457, per=100.00%, avg=227861.36, stdev=262064.21, samples=2093
   iops        : min=    6, max=385864, avg=56965.30, stdev=65516.07, samples=2093
  lat (usec)   : 2=7.53%, 4=80.62%, 10=11.27%, 20=0.41%, 50=0.03%
  lat (usec)   : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.13%, 20=0.01%, 50=0.01%
  lat (msec)   : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2000=0.01%, >=2000=0.01%
  cpu          : usr=3.12%, sys=17.15%, ctx=251826, majf=0, minf=12
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,59630438,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=194MiB/s (203MB/s), 194MiB/s-194MiB/s (203MB/s-203MB/s), io=227GiB (244GB), run=1200478-1200478msec

Disk stats (read/write):
  nvme0n1: ios=0/265014, sectors=0/414735560, merge=0/734, ticks=0/1115350653, in_queue=1115646197, util=98.45%

Posted

Okay, feeling very stupid now, sorry for so much noise!

I tried a different SSD, an SK Hynix HFS256GD9TNG-L2A0A BA, and it's working perfectly, passing the 20-minute fio test.

The new, out-of-the-box Samsung 970 EVO Plus NVMe seems to be faulty!

The Samsung is giving the following:
```

$ sudo smartctl --all /dev/nvme0n1
[smartctl 7.4 2023-08-01 r5530 [aarch64-linux-6.1.75-vendor-rk35xx] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       Samsung SSD 970 EVO Plus 2TB
Serial Number:                      S6P1NF0WA20535R
Firmware Version:                   4B2QEXM7
PCI Vendor/Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Total NVM Capacity:                 2,000,398,934,016 [2.00 TB]
Unallocated NVM Capacity:           0
Controller ID:                      6
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          2,000,398,934,016 [2.00 TB]
Namespace 1 Utilization:            38,269,112,320 [38.2 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            002538 5a31b1287c
Local Time is:                      Tue Dec 10 00:54:31 2024 GMT
Firmware Updates (0x16):            3 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x0057):     Comp Wr_Unc DS_Mngmt Sav/Sel_Feat Timestmp
Log Page Attributes (0x0f):         S/H_per_NS Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size:         128 Pages
Warning  Comp. Temp. Threshold:     82 Celsius
Critical Comp. Temp. Threshold:     85 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     7.59W       -        -    0  0  0  0        0       0
 1 +     7.59W       -        -    1  1  1  1        0     200
 2 +     7.59W       -        -    2  2  2  2        0    1000
 3 -   0.0500W       -        -    3  3  3  3     2000    1200
 4 -   0.0050W       -        -    4  4  4  4      500    9500

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        29 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    8,405 [4.30 GB]
Data Units Written:                 272,389 [139 GB]
Host Read Commands:                 103,937
Host Write Commands:                621,088
Controller Busy Time:               3
Power Cycles:                       372
Power On Hours:                     25
Unsafe Shutdowns:                   339
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               29 Celsius
Temperature Sensor 2:               36 Celsius

Error Information (NVMe Log 0x01, 16 of 64 entries)
No Errors Logged

Read Self-test Log failed: Invalid Field in Command (0x002)

```
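The health counters above look clean (no media errors, 0% used), but the log does show 339 unsafe shutdowns against 372 power cycles, which may simply reflect the repeated failures and resets during testing. A small sketch for pulling such fields out of saved `smartctl --all` output follows; it assumes the `Name: value` layout shown above, and the helper names are made up for this example.

```shell
# Sketch: print the value of one "Name: value" field from smartctl text on stdin.
smart_field() {
    awk -F': *' -v key="$1" '$1 == key { print $2; exit }'
}

# Summarise unsafe shutdowns relative to power cycles from the same text.
unsafe_shutdown_summary() {
    local text cycles unsafe
    text="$(cat)"
    cycles="$(printf '%s\n' "$text" | smart_field 'Power Cycles')"
    unsafe="$(printf '%s\n' "$text" | smart_field 'Unsafe Shutdowns')"
    echo "unsafe shutdowns: ${unsafe} of ${cycles} power cycles"
}
```

Usage would be `sudo smartctl --all /dev/nvme0n1 | unsafe_shutdown_summary`, run before and after a test to see whether the counter moved.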

 

I will try updating the firmware.

 

Posted

No new firmware updates were available, so I ordered a WD Blue SN580 2TB SSD, which has been soak tested overnight and is working fine.
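For anyone wanting to repeat an overnight soak test, one possible shape is a loop that reruns the same test command and stops at the first failure. This is a sketch, not the procedure actually used here; it assumes the test command (for example fio) exits non-zero when it hits an I/O error, and the helper name `soak_loop` is made up for this example.

```shell
# Sketch: run a command repeatedly and stop at the first failure.
# Usage: soak_loop <rounds> <command> [args...]
soak_loop() {
    local rounds i
    rounds="$1"
    shift
    i=1
    while [ "$i" -le "$rounds" ]; do
        if ! "$@"; then
            echo "round ${i} of ${rounds} failed"
            return 1
        fi
        i=$((i + 1))
    done
    echo "all ${rounds} rounds passed"
}
```

With the 20-minute fio command from earlier in the thread, `soak_loop 24 sudo fio --name=randwrite --ioengine=sync --rw=write --bs=4k --numjobs=1 --size=1G --runtime=1200s --time_based --directory=/mnt` would cover roughly eight hours.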

 

My next issue is finding the right kernel. I would still like to use Debian if possible as well.

 
