Krita Posted July 26, 2021 Share Posted July 26, 2021 (edited) Hi all, been trying to sort out a stability problem with what was a rock solid helios64, i moved from running armbian on the eMMC to a 256GB sandisk extreme SD card due to plex continually filling it up. currently running Armbian 21.05.6 Buster with Linux 5.10.43-rockchip64, running OMV, Plex and ZFS not much more it appears this is where things go pear shaped: Spoiler [ 6636.457621] ata1.00: exception Emask 0x2 SAct 0x3800000 SErr 0x1000400 action 0x6 [ 6636.458306] ata1.00: irq_stat 0x08000000 [ 6636.458659] ata1: SError: { Proto TrStaTrns } [ 6636.459051] ata1.00: failed command: READ FPDMA QUEUED [ 6636.459516] ata1.00: cmd 60/58:b8:e0:b3:73/05:00:01:00:00/40 tag 23 ncq dma 7 00416 in [ 6636.459516] res 40/00:bc:e0:b3:73/00:00:01:00:00/40 Emask 0x2 (HSM v iolation) [ 6636.460898] ata1.00: status: { DRDY } [ 6636.461227] ata1.00: failed command: READ FPDMA QUEUED [ 6636.461743] ata1.00: cmd 60/40:c0:80:ae:73/05:00:01:00:00/40 tag 24 ncq dma 6 88128 in [ 6636.461743] res 40/00:bc:e0:b3:73/00:00:01:00:00/40 Emask 0x2 (HSM v iolation) [ 6636.457621] ata1.00: exception Emask 0[ 6636.463130] ata1.00: status: { DRDY } x2 SAct 0x3800000 SErr 0x1000400 action 0x6 [ 6636.458[ 6636.463942] ata1.00: failed command: READ FPDMA QUEUED 306] ata1.00: irq_stat 0x08000000 [ [ 6636.464874] ata1.00: cmd 60/20:c8:c0:b3:73/00:00:01:00:00/40 tag 25 ncq dma 16384 in [ 6636.464874] res 40/00:bc:e0:b3:73/00:00:01:00:00/40 Emask 0x2 (HSM v iolation) 6636.458659] ata1: SError: { Proto TrStaTrns }[ 6636.466733] ata1.00: status: { DRDY } [ 6636.459051] ata1.00: failed command: READ FPDMA QUEUED [ 6636.459516] ata1.00: cmd 60/58:b8:e0:b3:73/05:00:01:00:00/40 tag 23 ncq dma 7 00416 in res 40/00:bc:e0:b3:73/00:00:01:00:00/40 Emask 0x2 (HSM v iolation) [ 6636.460898] ata1.00: status: { DRDY } [ 6636.461227] ata1.00: failed command: READ FPDMA QUEUED [ 6636.461743] ata1.00: cmd 60/40:c0:80:ae:73/05:00:01:00:00/40 tag 24 ncq dma 6 88128 in res 40/00:bc:e0:b3:73/00:00:01:00:00/40 Emask 0x2 (HSM v iolation) [ 6636.463130] ata1.00: status: { DRDY } [ 6636.463942] ata1.00: failed command: READ FPDMA QUEUED [ 6636.464874] ata1.00: cmd 60/20:c8:c0:b3:73/00:00:01:00:00/40 tag 25 ncq dma 1 6384 in res 40/00:bc:e0:b3:73/00:00:01:00:00/40 Emask 0x2 (HSM v iolation) [ 6636.466733] ata1.00: status: { DRDY } [ 6636.467511] ata1: hard resetting link [ 6636.943205] blk_update_request: I/O error, dev sda, sector 24359904 op 0x0:(R EAD) flags 0x700 phys_seg 12 prio class 0 [ 6636.944176] zio pool=helios vdev=/dev/disk/by-id/ata-WDC_WD40PURX-64GVNY0_WD- WCC4E2LRTU90-part1 error=5 type=1 offset=12471222272 size=700416 flags=40080ca8 [ 6636.945526] blk_update_request: I/O error, dev sda, sector 24358528 op 0x0:(R EAD) flags 0x4700 phys_seg 168 prio class 0 [ 6636.941590] ata1: SATA link up 6.0 Gbps (SS[ 6636.946685] blk_update_request: I/O error, dev sda, sector 24359872 op 0x0:(READ) flags 0x700 phys_seg 4 prio c lass 0 tatus 133 SControl 300) [ 6636.942889] ata1.00 6636.947962] zio pool=helios vdev=/dev/disk/by-id/ata-WDC _WD40PURX-64GVNY0_WD-WCC4E2LRTU90-part1 error=5 type=1 offset=12470517760 size=7 04512 flags=40080ca8 m: configured for UDMA/133 [ 6636.943169] sd 0:0:0:0: [sda] tag#23 UNKNOWN(0x2003) Result: hostbyte=0x00 dr iverbyte=0x08 cmd_age=5s [ 6636.943179] sd 0:0:0:0: [sda] tag#23 Sense Key : 0x5 [current] [ 6636.943186] sd 0:0:0:0: [sda] tag#23 ASC=0x21 ASCQ=0x4 [ 6636.943196] sd 0:0:0:0: [sda] tag#23 CDB: opcode=0x88 88 00 00 00 00 00 01 73 b3 e0 00 00 05 58 00 00 [ 6636.943205] blk_update_request: I/O error, dev sda, sector 24359904 op 0x0:(R EAD) flags 0x700 phys_seg 12 prio class 0 [ 6636.944176] zio pool=helios vdev=/dev/disk/by-id/ata-WDC_WD40PURX-64GVNY0_WD- WCC4E2LRTU90-part1 error=5 type=1 offset=12471222272 size=700416 flags=40080ca8 [ 6636.945496] sd 0:0:0:0: [sda] tag#24 UNKNOWN(0x2003) Result: hostbyte=0x00 dr iverbyte=0x08 cmd_age=5s [ 6636.945504] sd 0:0:0:0: [sda] tag#24 Sense Key : 0x5 [current] [ 6636.945511] sd 0:0:0:0: [sda] tag#24 ASC=0x21 ASCQ=0x4 [ 6636.945518] sd 0:0:0:0: [sda] tag#24 CDB: opcode=0x88 88 00 00 00 00 00 01 73 ae 80 00 00 05 40 00 00 [ 6636.945526] blk_update_request: I/O error, dev sda, sector 24358528 op 0x0:(R EAD) flags 0x4700 phys_seg 168 prio class 0 [ 6636.946645] sd 0:0:0:0: [sda] tag#25 UNKNOWN(0x2003) Result: hostbyte=0x00 dr iverbyte=0x08 cmd_age=5s [ 6636.946659] sd 0:0:0:0: [sda] tag#25 Sense Key : 0x5 [current] [ 6636.946666] sd 0:0:0:0: [sda] tag#25 ASC=0x21 ASCQ=0x4 [ 6636.946676] sd 0:0:0:0: [sda] tag#25 CDB: opcode=0x88 88 00 00 00 00 00 01 73 b3 c0 00 00 00 20 00 00 [ 6636.946685] blk_update_request: I/O error, dev sda, sector 24359872 op 0x0:(R EAD) flags 0x700 phys_seg 4 prio class 0 [ 6636.947962] zio pool=helios vdev=/dev/disk/by-id/ata-WDC_WD40PURX-64GVNY0_WD- WCC4E2LRTU90-part1 error=5 type=1 offset=12470517760 size=704512 flags=40080ca8 [ 6636.949804] ata1: EH complete DDR Version 1.24 20191016 In soft reset SRX channel 0 CS = 0 MR0=0x18 MR4=0x2 MR5=0x1 MR8=0x10 MR12=0x72 MR14=0x72 MR18=0x0 MR19=0x0 MR24=0x8 MR25=0x0 channel 1 CS = 0 MR0=0x18 MR4=0x1 MR5=0x1 MR8=0x10 MR12=0x72 MR14=0x72 MR18=0x0 MR19=0x0 MR24=0x8 MR25=0x0 channel 0 training pass! channel 1 training pass! change freq to 416MHz 0,1 Channel 0: LPDDR4,416MHz Bus Width=32 Col=10 Bank=8 Row=16 CS=1 Die Bus-Width=16 Size=2048MB Channel 1: LPDDR4,416MHz Bus Width=32 Col=10 Bank=8 Row=16 CS=1 Die Bus-Width=16 Size=2048MB 256B stride channel 0 CS = 0 MR0=0x18 MR4=0x2 MR5=0x1 MR8=0x10 MR12=0x72 MR14=0x72 MR18=0x0 MR19=0x0 MR24=0x8 MR25=0x0 channel 1 CS = 0 MR0=0x18 MR4=0x1 MR5=0x1 MR8=0x10 MR12=0x72 MR14=0x72 MR18=0x0 MR19=0x0 MR24=0x8 MR25=0x0 channel 0 training pass! channel 1 training pass! channel 0, cs 0, advanced training done channel 1, cs 0, advanced training done change freq to 856MHz 1,0 ch 0 ddrconfig = 0x101, ddrsize = 0x40 ch 1 ddrconfig = 0x101, ddrsize = 0x40 pmugrf_os_reg[2] = 0x32C1F2C1, stride = 0xD ddr_set_rate to 328MHZ ddr_set_rate to 666MHZ ddr_set_rate to 928MHZ channel 0, cs 0, advanced training done channel 1, cs 0, advanced training done ddr_set_rate to 416MHZ, ctl_index 0 ddr_set_rate to 856MHZ, ctl_index 1 support 416 856 328 666 928 MHz, current 856MHz OUT Boot1: 2019-03-14, version: 1.19 CPUId = 0x0 ChipType = 0x10, 323 mmc: ERROR: SDHCI ERR:cmd:0x102,stat:0x18000 mmc: ERROR: Card did not respond to voltage select! emmc reinit mmc: ERROR: SDHCI ERR:cmd:0x102,stat:0x18000 mmc: ERROR: Card did not respond to voltage select! emmc reinit mmc: ERROR: SDHCI ERR:cmd:0x102,stat:0x18000 mmc: ERROR: Card did not respond to voltage select! SdmmcInit=2 1 mmc0:cmd5,20 SdmmcInit=0 0 BootCapSize=0 UserCapSize=244016MB FwPartOffset=2000 , 0 StorageInit ok = 56917 SecureMode = 0 SecureInit read PBA: 0x4 SecureInit read PBA: 0x404 SecureInit read PBA: 0x804 SecureInit read PBA: 0xc04 SecureInit read PBA: 0x1004 SecureInit read PBA: 0x1404 SecureInit read PBA: 0x1804 SecureInit read PBA: 0x1c04 SecureInit ret = 0, SecureMode = 0 atags_set_bootdev: ret:(0) GPT 0x3380ec0 signature is wrong recovery gpt... GPT 0x3380ec0 signature is wrong recovery gpt fail! LoadTrust Addr:0x4000 No find bl30.bin No find bl32.bin Load uboot, ReadLba = 2000 hdr 0000000003380880 + 0x0:0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x0 0,0x00,0x00,0x00,0x00,0x00, Load OK, addr=0x200000, size=0xe5b60 RunBL31 0x40000 NOTICE: BL31: v1.3(debug):42583b6 NOTICE: BL31: Built : 07:55:13, Oct 15 2019 NOTICE: BL31: Rockchip release version: v1.1 INFO: GICv3 with legacy support detected. ARM GICV3 driver initialized in EL3 INFO: Using opteed sec cpu_context! INFO: boot cpu mask: 0 INFO: plat_rockchip_pmu_init(1190): pd status 3e INFO: BL31: Initializing runtime services WARNING: No OPTEE provided by BL2 boot loader, Booting device without OPTEE init ialization. SMC`s destined for OPTEE will return SMC_UNK ERROR: Error initializing runtime service opteed_fast INFO: BL31: Preparing for EL3 exit to normal world INFO: Entry point address = 0x200000 INFO: SPSR = 0x3c9 U-Boot 2020.10-armbian (May 06 2021 - 17:13:15 +0000) SoC: Rockchip rk3399 Reset cause: WDOG DRAM: 3.9 GiB PMIC: RK808 SF: Detected w25q128 with page size 256 Bytes, erase size 4 KiB, total 16 MiB MMC: mmc@fe320000: 1, sdhci@fe330000: 0 Loading Environment from MMC... Card did not respond to voltage select! *** Warning - No block device, using default environment In: serial Out: serial Err: serial Model: Helios64 Revision: 1.2 - 4GB non ECC Net: eth0: ethernet@fe300000 scanning bus for devices... starting USB... Bus usb@fe380000: USB EHCI 1.00 Bus dwc3: usb maximum-speed not found Register 2000140 NbrPorts 2 Starting the controller USB XHCI 1.10 scanning bus usb@fe380000 for devices... 1 USB Device(s) found scanning bus dwc3 for devices... cannot reset port 4!? 5 USB Device(s) found scanning usb for storage devices... 1 Storage Device(s) found Hit any key to stop autoboot: 0 switch to partitions #0, OK mmc1 is current device ** No partition table - mmc 1 ** Card did not respond to voltage select! Device 0: Vendor: Seagate Rev: 070B Prod: Expansion Desk Type: Hard Disk Capacity: 2861588.4 MB = 2794.5 GB (732566645 x 4096) ... is now current device ** Unrecognized filesystem type ** scanning bus for devices... Device 0: unknown device Speed: 1000, full duplex BOOTP broadcast 1 DHCP client bound to address 192.168.0.254 (3 ms) *** Warning: no boot file name; using 'C0A800FE.img' Using ethernet@fe300000 device TFTP from server 192.168.0.1; our IP address is 192.168.0.254 Filename 'C0A800FE.img'. Load address: 0x800800 Loading: T T T T T T T T T T Retry count exceeded; starting again missing environment variable: pxeuuid missing environment variable: bootfile Retrieving file: pxelinux.cfg/01-64-62-66-d0-01-58 I've aready given i a bump in voltage as recommended in the config file but i cant tell where to check if it took the settting, frequency is set in armbian-config to ondemand. does anyone have any ideas? Thanks in advance Krita Edited July 28, 2021 by Krita formatting 0 Quote Link to comment Share on other sites More sharing options...
Krita Posted July 27, 2021 Author Share Posted July 27, 2021 apparently my searching skills are rubbish, found a thread with the FPDMA QUEUED fault, will investigate that, but seems thats a seperate isue to the reboot. 0 Quote Link to comment Share on other sites More sharing options...
Krita Posted July 28, 2021 Author Share Posted July 28, 2021 So Ive had the crash happen again after allmost exactly 24hrs, this time with logging verbose 5 so i actually got a bit of data. this was under no load at all, no one logged on just a single PC on the network running USB-C serial logging. I have no idea how to interoperate the data but I'm instigating what it means to the best of my ability, any pointer would be great Spoiler [88215.370992] Unable to handle kernel paging request at virtual address 003f8000118b99f0 [88215.371700] Mem abort info: [88215.371948] ESR = 0x96000004 [88215.372220] EC = 0x25: DABT (current EL), IL = 32 bits [88215.372686] SET = 0, FnV = 0 [88215.372956] EA = 0, S1PTW = 0 [88215.373234] Data abort info: [88215.373488] ISV = 0, ISS = 0x00000004 [88215.373825] CM = 0, WnR = 0 [88215.374087] [003f8000118b99f0] address between user and kernel address ranges [88215.374713] Internal error: Oops: 96000004 [#1] PREEMPT SMP [88215.375203] Modules linked in: softdog governor_performance snd_soc_hdmi_codec r8152 rockchipdrm snd_soc_rockchip_i2s dw_mipi_dsi rockchip_vdec(C) hantro_vpu(C) dw_hdmi rockchip_rga snd_soc_core analogix_dp v4l2_h264 snd_pcm_dmaengine videobuf2_dma_contig videobuf2_dma_sg videobuf2_vmalloc v4l2_mem2mem gpio_charger drm_kms_helper videobuf2_memops fusb302 snd_pcm cec videobuf2_v4l2 pwm_fan panfrost leds_pwm snd_timer tcpm rc_core videobuf2_common gpu_sched snd typec videodev sg drm soundcore mc drm_panel_orientation_quirks gpio_beeper cpufreq_dt zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) nfsd zcommon(POE) auth_rpcgss znvpair(POE) nfs_acl zavl(POE) icp(POE) lockd spl(OE) grace ledtrig_netdev lm75 sunrpc ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear md_mod uas realtek dwmac_rk stmmac_platform stmmac pcs_xpcs adc_keys [88215.382244] CPU: 5 PID: 0 Comm: swapper/5 Tainted: P C OE 5.10.43-rockchip64 #21.05.4 [88215.383019] Hardware name: Helios64 (DT) [88215.383366] pstate: 60000085 (nZCv daIf -PAN -UAO -TCO BTYPE=--) [88215.383904] pc : scheduler_tick+0xc4/0x140 [88215.384266] lr : scheduler_tick+0xc4/0x140 [88215.384626] sp : ffff800011c13d90 [88215.384918] x29: ffff800011c13d90 x28: 0000503bdc75d080 [88215.385387] x27: ffff0000f77bb6c0 x26: 0000000000000002 [88215.385856] x25: 0000000000000080 x24: ffff80001156a000 [88215.386325] x23: ffff000000711d00 x22: ffff80001157fd00 [88215.386793] x21: 0000000000000005 x20: ffff8000118b99c8 [88215.387261] x19: ffff0000f77c7d00 x18: 0000000000000000 [88215.387729] x17: 0000000000000000 x16: 00000000000073c0 [88215.388197] x15: 000028eb9d29edfa x14: 0000000000000000 [88215.388666] x13: 0000000000000270 x12: 0000000000000000 [88215.389134] x11: 0000000000000000 x10: 0000000000000004 [88215.389603] x9 : 0000000000000270 x8 : 0000000000000000 [88215.390071] x7 : ffff0000f77c7d00 x6 : ffff0000f77c8800 [88215.390540] x5 : 0000000000001270 x4 : ffff8000e6248000 [88215.391007] x3 : 0000000000010001 x2 : ffff80001156a000 [88215.391475] x1 : ffff8000112a1f58 x0 : 0000000000000005 [88215.391944] Call trace: [88215.392163] scheduler_tick+0xc4/0x140 [88215.392497] update_process_times+0x8c/0xa0 [88215.392867] tick_sched_handle.isra.19+0x40/0x58 [88215.393273] tick_sched_timer+0x58/0xb0 [88215.393613] __hrtimer_run_queues+0x104/0x388 [88215.393997] hrtimer_interrupt+0xf4/0x250 [88215.394353] arch_timer_handler_phys+0x30/0x40 [88215.394746] handle_percpu_devid_irq+0xa0/0x298 [88215.395144] generic_handle_irq+0x30/0x48 [88215.395498] __handle_domain_irq+0x94/0x108 [88215.395869] gic_handle_irq+0xc0/0x140 [88215.396202] el1_irq+0xc8/0x180 [88215.396482] arch_cpu_idle+0x18/0x28 [88215.396799] default_idle_call+0x44/0x1bc [88215.397152] do_idle+0x204/0x278 [88215.397437] cpu_startup_entry+0x28/0x60 [88215.397789] secondary_start_kernel+0x170/0x180 [88215.398191] Code: 94000cfb aa1303e0 94369ab7 94051e08 (f8757a82) [88215.398733] ---[ end trace 57a1570b3962ce1b ]--- [88215.399139] Kernel panic - not syncing: Oops: Fatal exception in interrupt [88215.399743] SMP: stopping secondary CPUs [88215.400093] Kernel Offset: disabled [88215.400402] CPU features: 0x0240022,6100200c [88215.400777] Memory Limit: none [88215.401056] Rebooting in 90 seconds.. 1 Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.