
AaronNGray

Members
  • Posts

    75
Profile Information

  • Gender
    Male
  • Location
    UK
  • Interests
    Language and Operating System Research and Development, Open Source, Computer Science, Type Theory, Category Theory, Genetics, Bass, and Guitar

Contact Methods

  • Website URL
    https://aarongray.org


  1. No new firmware updates were available, so I ordered a WD Blue SN580 2TB SSD, which has been soak tested overnight and is working fine. My next issue is finding the right kernel. I would still like to use Debian if possible.
  2. Okay, feeling very stupid now, sorry for so much noise! I tried a different SSD, an SK Hynix HFS256GD9TNG-L2A0A BA, and it's working perfectly, passing the fio 20 minute test. The new, out of the box Samsung 970 EVO Plus NVMe seems to be faulty! The Samsung is giving the following:

     ```
     $ sudo smartctl --all /dev/nvme0n1
     smartctl 7.4 2023-08-01 r5530 [aarch64-linux-6.1.75-vendor-rk35xx] (local build)
     Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

     === START OF INFORMATION SECTION ===
     Model Number:                       Samsung SSD 970 EVO Plus 2TB
     Serial Number:                      S6P1NF0WA20535R
     Firmware Version:                   4B2QEXM7
     PCI Vendor/Subsystem ID:            0x144d
     IEEE OUI Identifier:                0x002538
     Total NVM Capacity:                 2,000,398,934,016 [2.00 TB]
     Unallocated NVM Capacity:           0
     Controller ID:                      6
     NVMe Version:                       1.3
     Number of Namespaces:               1
     Namespace 1 Size/Capacity:          2,000,398,934,016 [2.00 TB]
     Namespace 1 Utilization:            38,269,112,320 [38.2 GB]
     Namespace 1 Formatted LBA Size:     512
     Namespace 1 IEEE EUI-64:            002538 5a31b1287c
     Local Time is:                      Tue Dec 10 00:54:31 2024 GMT
     Firmware Updates (0x16):            3 Slots, no Reset required
     Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
     Optional NVM Commands (0x0057):     Comp Wr_Unc DS_Mngmt Sav/Sel_Feat Timestmp
     Log Page Attributes (0x0f):         S/H_per_NS Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
     Maximum Data Transfer Size:         128 Pages
     Warning  Comp. Temp. Threshold:     82 Celsius
     Critical Comp. Temp. Threshold:     85 Celsius

     Supported Power States
     St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
      0 +     7.59W       -        -    0  0  0  0        0       0
      1 +     7.59W       -        -    1  1  1  1        0     200
      2 +     7.59W       -        -    2  2  2  2        0    1000
      3 -   0.0500W       -        -    3  3  3  3     2000    1200
      4 -   0.0050W       -        -    4  4  4  4      500    9500

     Supported LBA Sizes (NSID 0x1)
     Id Fmt  Data  Metadt  Rel_Perf
      0 +     512       0         0

     === START OF SMART DATA SECTION ===
     SMART overall-health self-assessment test result: PASSED

     SMART/Health Information (NVMe Log 0x02)
     Critical Warning:                   0x00
     Temperature:                        29 Celsius
     Available Spare:                    100%
     Available Spare Threshold:          10%
     Percentage Used:                    0%
     Data Units Read:                    8,405 [4.30 GB]
     Data Units Written:                 272,389 [139 GB]
     Host Read Commands:                 103,937
     Host Write Commands:                621,088
     Controller Busy Time:               3
     Power Cycles:                       372
     Power On Hours:                     25
     Unsafe Shutdowns:                   339
     Media and Data Integrity Errors:    0
     Error Information Log Entries:      0
     Warning  Comp. Temperature Time:    0
     Critical Comp. Temperature Time:    0
     Temperature Sensor 1:               29 Celsius
     Temperature Sensor 2:               36 Celsius

     Error Information (NVMe Log 0x01, 16 of 64 entries)
     No Errors Logged

     Read Self-test Log failed: Invalid Field in Command (0x002)
     ```

     I will try updating the firmware.
  3. @amazingfate You need to run the test for at least 10 minutes initially for the fault to occur.
  4. @eselarm I have several Samsung 970 EVO Plus NVMe M.2 SSDs: https://www.samsung.com/uk/memory-storage/nvme-ssd/970-evo-plus-nvme-m-2-ssd-2tb-mz-v7s2t0bw/ I also have an old SK Hynix HFS256GD9TNG-L2A0A BA that I will test with once it's backed up. Can people please say what make of M.2 NVMe they are using with their RK3588 SBCs?
  5. @amazingfate I have a v1.1 board which seems to be the most common version AFAICT
  6. @going Not sure but the NVMe might be the common factor.
  7. @eselarm I have several Samsung 970 EVO Plus NVMe M.2 SSDs: https://www.samsung.com/uk/memory-storage/nvme-ssd/970-evo-plus-nvme-m-2-ssd-2tb-mz-v7s2t0bw/ I also have an old SK Hynix HFS256GD9TNG-L2A0A BA that I will test with.
  8. @amazingfate Would you look on your board to see if there is any sign of a version number or any other codes on it, please? Also, would you mind either giving a link or uploading your OS image to Google Drive so I can try running it? I will take my board out of its case and examine it properly first thing tomorrow. Thank you for all your help!
  9. Armbian 24.11.1 Ubuntu Noble Gnome - BananaPi-M7

     $ uname -a
     Linux bananapim7 6.1.75-vendor-rk35xx #1 SMP Tue Nov 12 08:48:32 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux

     $ sudo fio --name=randwrite --ioengine=sync --rw=write --bs=4k --numjobs=1 --size=1G --runtime=120s --time_based --directory=/mnt/nvme0n1p1
     randwrite: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=1
     fio-3.36
     Starting 1 process
     Jobs: 1 (f=1): [W(1)][100.0%][w=932MiB/s][w=239k IOPS][eta 00m:00s]
     randwrite: (groupid=0, jobs=1): err= 0: pid=5992: Mon Dec 9 23:30:24 2024
       write: IOPS=161k, BW=628MiB/s (659MB/s)(74.0GiB/120602msec); 0 zone resets
         clat (usec): min=3, max=484604, avg= 4.52, stdev=419.98
          lat (usec): min=3, max=484605, avg= 4.61, stdev=419.98
         clat percentiles (nsec):
          |  1.00th=[ 3504],  5.00th=[ 3504], 10.00th=[ 3504], 20.00th=[ 3504],
          | 30.00th=[ 3792], 40.00th=[ 3792], 50.00th=[ 3792], 60.00th=[ 3792],
          | 70.00th=[ 3792], 80.00th=[ 4080], 90.00th=[ 4640], 95.00th=[ 4640],
          | 99.00th=[ 6432], 99.50th=[ 8160], 99.90th=[11712], 99.95th=[13376],
          | 99.99th=[19584]
        bw (  KiB/s): min= 3848, max=991048, per=100.00%, avg=663199.83, stdev=303212.13, samples=234
        iops        : min=  962, max=247762, avg=165799.98, stdev=75803.01, samples=234
       lat (usec)   : 4=77.06%, 10=22.68%, 20=0.25%, 50=0.01%, 100=0.01%
       lat (usec)   : 250=0.01%, 500=0.01%, 750=0.01%
       lat (msec)   : 4=0.01%, 10=0.01%, 50=0.01%, 100=0.01%, 250=0.01%
       lat (msec)   : 500=0.01%
       cpu          : usr=11.32%, sys=79.07%, ctx=41238, majf=0, minf=9
       IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
          submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
          complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
          issued rwts: total=0,19398657,0,0 short=0,0,0,0 dropped=0,0,0,0
          latency   : target=0, window=0, percentile=100.00%, depth=1

     Run status group 0 (all jobs):
       WRITE: bw=628MiB/s (659MB/s), 628MiB/s-628MiB/s (659MB/s-659MB/s), io=74.0GiB (79.5GB), run=120602-120602msec

     $ sudo fio --name=randwrite --ioengine=sync --rw=write --bs=4k --numjobs=1 --size=1G --runtime=600s --time_based --directory=/mnt/nvme0n1p1
     randwrite: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=1
     fio-3.36
     Starting 1 process
     fio: io_u error on file /mnt/nvme0n1p1/randwrite.0.0: Input/output error: write offset=0, buflen=4096
     fio: pid=6257, err=5/file:io_u.c:1896, func=io_u error, error=Input/output error
     randwrite: (groupid=0, jobs=1): err= 5 (file:io_u.c:1896, func=io_u error, error=Input/output error): pid=6257: Mon Dec 9 23:38:19 2024
       write: IOPS=37.5k, BW=146MiB/s (154MB/s)(6144MiB/41945msec); 0 zone resets
         clat (usec): min=3, max=354943, avg= 4.95, stdev=494.19
          lat (usec): min=3, max=354944, avg= 5.05, stdev=494.19
         clat percentiles (nsec):
          |  1.00th=[ 3504],  5.00th=[ 3504], 10.00th=[ 3504], 20.00th=[ 3504],
          | 30.00th=[ 3792], 40.00th=[ 3792], 50.00th=[ 3792], 60.00th=[ 3792],
          | 70.00th=[ 4384], 80.00th=[ 4640], 90.00th=[ 4960], 95.00th=[ 5856],
          | 99.00th=[ 8160], 99.50th=[11328], 99.90th=[21120], 99.95th=[24960],
          | 99.99th=[36096]
        bw (  KiB/s): min=44894, max=943952, per=100.00%, avg=599182.00, stdev=308987.51, samples=21
        iops        : min=11223, max=235988, avg=149795.48, stdev=77246.82, samples=21
       lat (usec)   : 4=62.59%, 10=36.72%, 20=0.57%, 50=0.11%, 100=0.01%
       lat (usec)   : 500=0.01%, 750=0.01%
       lat (msec)   : 500=0.01%
       cpu          : usr=2.61%, sys=19.77%, ctx=79, majf=0, minf=21
       IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
          submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
          complete  : 0=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
          issued rwts: total=0,1572865,0,0 short=0,0,0,0 dropped=0,0,0,0
          latency   : target=0, window=0, percentile=100.00%, depth=1

     Run status group 0 (all jobs):
       WRITE: bw=146MiB/s (154MB/s), 146MiB/s-146MiB/s (154MB/s-154MB/s), io=6144MiB (6442MB), run=41945-41945msec

     Still getting the same issue :( @amazingfate I am wondering, since the sige7 seems AFAICT to be produced by SINOVOiP BPi (there's a BPi logo on one I have seen), whether this may be a board version issue? https://www.armsom.org/sige7
  10. The BananaPi-M7 NVMe issue is in the original Ubuntu Desktop 22.04.3 v5.10.160 kernel 5.release from Armbian too, detectable with fio.
  11. Hi @Igor, just to let you know, what I am actually after is a headless target machine with 32GB of RAM and 1/2 TB of NVMe SSD. The only reason I wanted a GUI and browser was for easy reporting of state. To fill you in, I am trying to get a green, low power ARM solution for a number of climate change related projects, from serving websites and a social media system to a dedicated crawler and search engine for scientific fields. I have been using Debian since 2017 on BananaPi-M3s. Before then I ran two pairs of dual rsync'ed Fujitsu-Siemens Primergy 470s, then separately dual rsync'ed HP DL 140 G3 servers, from RedHat Fedora Core 4 in 2006 to Fedora 22 in 2016. I had 36 hours of downtime due to internet outages and a power outage, and one hour which was my fault.
  12. @amazingfate This is failing repeatedly, at indeterminate times, on 6.1.84-vendor-rk35xx on my build of Armbian Debian on BananaPi-M7.
  13. @eselarm @Igor Unfortunately, on BananaPi-M7 a self built Linux bananapim7 6.1.84-vendor-rk35xx kernel still has PCIe M.2 NVMe stability issues when running `stress --verbose --hdd 4`, after a period of 5 or so minutes. @amazingfate Can you please run `stress --verbose --hdd 4` for 20 minutes, just to verify the sige7 is stable? If so, I might get one myself. @going I would much appreciate your input on this. The USB C monitoring device I got does not record minimums, only maximums, so it's very hard to gain any useful information from it. Even when monitoring with a test meter that records minimums, valid minimums are hard to obtain. Ideally a separate RPi I2C monitoring device is required to know what is going on. Armbian Debian seems very unstable, especially at startup, with Firefox tabs dying on startup and Chromium not even starting up. Building 6.1.84-vendor-rk35xx and 6.12.2 Ubuntu Noble builds for further testing...
  14. @going I am testing the kernel and the PCIe M.2 controller. The NVMe is fine and will not suffer from this number of rewrites.
  15. I turned off the errors read-only mode in `/etc/fstab` and it's running along with the `stress --verbose --hdd 4` test as suspected. I am wondering if errors can be detected and acted upon in a different manner by the filesystem? I may try btrfs as another test when I can do so.
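
For anyone repeating the timed fio runs from post 9, the pass/fail check can be automated rather than read by eye. The following is a minimal Python sketch, not something used in the thread: the function name `summarize_fio` and the two sample strings are hypothetical, and it assumes fio's human-readable output keeps the `err=` job line and the `run=...msec` summary in the form shown in that post. It flags a run as failed if any job reports a non-zero error code or if the run stopped before the requested runtime.

```python
import re

# Per-job summary line printed by fio, e.g.
#   randwrite: (groupid=0, jobs=1): err= 0: pid=5992: ...
#   randwrite: (groupid=0, jobs=1): err= 5 (file:io_u.c:1896, ...): pid=6257: ...
JOB_LINE = re.compile(
    r"^(?P<job>\S+): \(groupid=\d+, jobs=\d+\): err=\s*(?P<err>\d+)", re.M
)
# Run time in the "Run status" summary, e.g. run=120602-120602msec
# (assumption: the unit is msec, as in the output quoted in post 9).
RUN_LINE = re.compile(r"run=(?P<min>\d+)-(?P<max>\d+)msec")


def summarize_fio(text, requested_runtime_s):
    """Return per-job error codes, seconds actually run, and a failed flag."""
    jobs = {m["job"]: int(m["err"]) for m in JOB_LINE.finditer(text)}
    run = RUN_LINE.search(text)
    ran_s = int(run["max"]) / 1000 if run else None
    failed = (
        any(jobs.values())                 # some job reported err != 0
        or ran_s is None                   # no run summary found at all
        or ran_s < requested_runtime_s     # stopped early
    )
    return {"errors": jobs, "ran_seconds": ran_s, "failed": failed}


# Shortened samples copied from the two runs in post 9.
GOOD_RUN = """randwrite: (groupid=0, jobs=1): err= 0: pid=5992: Mon Dec 9 23:30:24 2024
  WRITE: bw=628MiB/s (659MB/s), 628MiB/s-628MiB/s (659MB/s-659MB/s), io=74.0GiB (79.5GB), run=120602-120602msec
"""
BAD_RUN = """randwrite: (groupid=0, jobs=1): err= 5 (file:io_u.c:1896, func=io_u error, error=Input/output error): pid=6257: Mon Dec 9 23:38:19 2024
  WRITE: bw=146MiB/s (154MB/s), 146MiB/s-146MiB/s (154MB/s-154MB/s), io=6144MiB (6442MB), run=41945-41945msec
"""

if __name__ == "__main__":
    print(summarize_fio(GOOD_RUN, 120))
    print(summarize_fio(BAD_RUN, 600))
```

The runtime comparison matters because a run that aborts on an I/O error, like the 600 second run above that stopped after about 42 seconds, still prints a normal-looking bandwidth summary.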