Jump to content

Recommended Posts

Posted (edited)

Hi all,

 

been trying to sort out a stability problem with what was a rock solid helios64, i moved from running armbian on the eMMC to a 256GB sandisk extreme SD card due to plex continually filling it up.

 

currently running  Armbian 21.05.6 Buster with Linux 5.10.43-rockchip64, running OMV, Plex and ZFS not much more

 

it appears this is where things go pear shaped:

 

Spoiler
[ 6636.457621] ata1.00: exception Emask 0x2 SAct 0x3800000 SErr 0x1000400 action                         0x6
[ 6636.458306] ata1.00: irq_stat 0x08000000
[ 6636.458659] ata1: SError: { Proto TrStaTrns }
[ 6636.459051] ata1.00: failed command: READ FPDMA QUEUED
[ 6636.459516] ata1.00: cmd 60/58:b8:e0:b3:73/05:00:01:00:00/40 tag 23 ncq dma 7                        00416 in
[ 6636.459516]          res 40/00:bc:e0:b3:73/00:00:01:00:00/40 Emask 0x2 (HSM v                        iolation)
[ 6636.460898] ata1.00: status: { DRDY }
[ 6636.461227] ata1.00: failed command: READ FPDMA QUEUED
[ 6636.461743] ata1.00: cmd 60/40:c0:80:ae:73/05:00:01:00:00/40 tag 24 ncq dma 6                        88128 in
[ 6636.461743]          res 40/00:bc:e0:b3:73/00:00:01:00:00/40 Emask 0x2 (HSM v                        iolation)
[ 6636.457621] ata1.00: exception Emask 0[ 6636.463130] ata1.00: status: { DRDY                         }
x2 SAct 0x3800000 SErr 0x1000400 action 0x6
[ 6636.458[ 6636.463942] ata1.00: failed command: READ FPDMA QUEUED
306] ata1.00: irq_stat 0x08000000
[ [ 6636.464874] ata1.00: cmd 60/20:c8:c0:b3:73/00:00:01:00:00/40 tag 25 ncq dma                         16384 in
[ 6636.464874]          res 40/00:bc:e0:b3:73/00:00:01:00:00/40 Emask 0x2 (HSM v                        iolation)
6636.458659] ata1: SError: { Proto TrStaTrns }[ 6636.466733] ata1.00: status: {                         DRDY }

[ 6636.459051] ata1.00: failed command: READ FPDMA QUEUED
[ 6636.459516] ata1.00: cmd 60/58:b8:e0:b3:73/05:00:01:00:00/40 tag 23 ncq dma 7                        00416 in
                        res 40/00:bc:e0:b3:73/00:00:01:00:00/40 Emask 0x2 (HSM v                        iolation)
[ 6636.460898] ata1.00: status: { DRDY }
[ 6636.461227] ata1.00: failed command: READ FPDMA QUEUED
[ 6636.461743] ata1.00: cmd 60/40:c0:80:ae:73/05:00:01:00:00/40 tag 24 ncq dma 6                        88128 in
                        res 40/00:bc:e0:b3:73/00:00:01:00:00/40 Emask 0x2 (HSM v                        iolation)
[ 6636.463130] ata1.00: status: { DRDY }
[ 6636.463942] ata1.00: failed command: READ FPDMA QUEUED
[ 6636.464874] ata1.00: cmd 60/20:c8:c0:b3:73/00:00:01:00:00/40 tag 25 ncq dma 1                        6384 in
                        res 40/00:bc:e0:b3:73/00:00:01:00:00/40 Emask 0x2 (HSM v                        iolation)
[ 6636.466733] ata1.00: status: { DRDY }
[ 6636.467511] ata1: hard resetting link
[ 6636.943205] blk_update_request: I/O error, dev sda, sector 24359904 op 0x0:(R                        EAD) flags 0x700 phys_seg 12 prio class 0
[ 6636.944176] zio pool=helios vdev=/dev/disk/by-id/ata-WDC_WD40PURX-64GVNY0_WD-                        WCC4E2LRTU90-part1 error=5 type=1 offset=12471222272 size=700416 flags=40080ca8
[ 6636.945526] blk_update_request: I/O error, dev sda, sector 24358528 op 0x0:(R                        EAD) flags 0x4700 phys_seg 168 prio class 0
[ 6636.941590] ata1: SATA link up 6.0 Gbps (SS[ 6636.946685] blk_update_request:                         I/O error, dev sda, sector 24359872 op 0x0:(READ) flags 0x700 phys_seg 4 prio c                        lass 0
tatus 133 SControl 300)
[ 6636.942889] ata1.00 6636.947962] zio pool=helios vdev=/dev/disk/by-id/ata-WDC                        _WD40PURX-64GVNY0_WD-WCC4E2LRTU90-part1 error=5 type=1 offset=12470517760 size=7                        04512 flags=40080ca8
m: configured for UDMA/133
[ 6636.943169] sd 0:0:0:0: [sda] tag#23 UNKNOWN(0x2003) Result: hostbyte=0x00 dr                        iverbyte=0x08 cmd_age=5s
[ 6636.943179] sd 0:0:0:0: [sda] tag#23 Sense Key : 0x5 [current]
[ 6636.943186] sd 0:0:0:0: [sda] tag#23 ASC=0x21 ASCQ=0x4
[ 6636.943196] sd 0:0:0:0: [sda] tag#23 CDB: opcode=0x88 88 00 00 00 00 00 01 73                         b3 e0 00 00 05 58 00 00
[ 6636.943205] blk_update_request: I/O error, dev sda, sector 24359904 op 0x0:(R                        EAD) flags 0x700 phys_seg 12 prio class 0
[ 6636.944176] zio pool=helios vdev=/dev/disk/by-id/ata-WDC_WD40PURX-64GVNY0_WD-                        WCC4E2LRTU90-part1 error=5 type=1 offset=12471222272 size=700416 flags=40080ca8
[ 6636.945496] sd 0:0:0:0: [sda] tag#24 UNKNOWN(0x2003) Result: hostbyte=0x00 dr                        iverbyte=0x08 cmd_age=5s
[ 6636.945504] sd 0:0:0:0: [sda] tag#24 Sense Key : 0x5 [current]
[ 6636.945511] sd 0:0:0:0: [sda] tag#24 ASC=0x21 ASCQ=0x4
[ 6636.945518] sd 0:0:0:0: [sda] tag#24 CDB: opcode=0x88 88 00 00 00 00 00 01 73                         ae 80 00 00 05 40 00 00
[ 6636.945526] blk_update_request: I/O error, dev sda, sector 24358528 op 0x0:(R                        EAD) flags 0x4700 phys_seg 168 prio class 0
[ 6636.946645] sd 0:0:0:0: [sda] tag#25 UNKNOWN(0x2003) Result: hostbyte=0x00 dr                        iverbyte=0x08 cmd_age=5s
[ 6636.946659] sd 0:0:0:0: [sda] tag#25 Sense Key : 0x5 [current]
[ 6636.946666] sd 0:0:0:0: [sda] tag#25 ASC=0x21 ASCQ=0x4
[ 6636.946676] sd 0:0:0:0: [sda] tag#25 CDB: opcode=0x88 88 00 00 00 00 00 01 73                         b3 c0 00 00 00 20 00 00
[ 6636.946685] blk_update_request: I/O error, dev sda, sector 24359872 op 0x0:(R                        EAD) flags 0x700 phys_seg 4 prio class 0
[ 6636.947962] zio pool=helios vdev=/dev/disk/by-id/ata-WDC_WD40PURX-64GVNY0_WD-                        WCC4E2LRTU90-part1 error=5 type=1 offset=12470517760 size=704512 flags=40080ca8
[ 6636.949804] ata1: EH complete
DDR Version 1.24 20191016
In
soft reset
SRX
channel 0
CS = 0
MR0=0x18
MR4=0x2
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 1
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 0 training pass!
channel 1 training pass!
change freq to 416MHz 0,1
Channel 0: LPDDR4,416MHz
Bus Width=32 Col=10 Bank=8 Row=16 CS=1 Die Bus-Width=16 Size=2048MB
Channel 1: LPDDR4,416MHz
Bus Width=32 Col=10 Bank=8 Row=16 CS=1 Die Bus-Width=16 Size=2048MB
256B stride
channel 0
CS = 0
MR0=0x18
MR4=0x2
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 1
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 0 training pass!
channel 1 training pass!
channel 0, cs 0, advanced training done
channel 1, cs 0, advanced training done
change freq to 856MHz 1,0
ch 0 ddrconfig = 0x101, ddrsize = 0x40
ch 1 ddrconfig = 0x101, ddrsize = 0x40
pmugrf_os_reg[2] = 0x32C1F2C1, stride = 0xD
ddr_set_rate to 328MHZ
ddr_set_rate to 666MHZ
ddr_set_rate to 928MHZ
channel 0, cs 0, advanced training done
channel 1, cs 0, advanced training done
ddr_set_rate to 416MHZ, ctl_index 0
ddr_set_rate to 856MHZ, ctl_index 1
support 416 856 328 666 928 MHz, current 856MHz
OUT
Boot1: 2019-03-14, version: 1.19
CPUId = 0x0
ChipType = 0x10, 323
mmc: ERROR: SDHCI ERR:cmd:0x102,stat:0x18000
mmc: ERROR: Card did not respond to voltage select!
emmc reinit
mmc: ERROR: SDHCI ERR:cmd:0x102,stat:0x18000
mmc: ERROR: Card did not respond to voltage select!
emmc reinit
mmc: ERROR: SDHCI ERR:cmd:0x102,stat:0x18000
mmc: ERROR: Card did not respond to voltage select!
SdmmcInit=2 1
mmc0:cmd5,20
SdmmcInit=0 0
BootCapSize=0
UserCapSize=244016MB
FwPartOffset=2000 , 0
StorageInit ok = 56917
SecureMode = 0
SecureInit read PBA: 0x4
SecureInit read PBA: 0x404
SecureInit read PBA: 0x804
SecureInit read PBA: 0xc04
SecureInit read PBA: 0x1004
SecureInit read PBA: 0x1404
SecureInit read PBA: 0x1804
SecureInit read PBA: 0x1c04
SecureInit ret = 0, SecureMode = 0
atags_set_bootdev: ret:(0)
GPT 0x3380ec0 signature is wrong
recovery gpt...
GPT 0x3380ec0 signature is wrong
recovery gpt fail!
LoadTrust Addr:0x4000
No find bl30.bin
No find bl32.bin
Load uboot, ReadLba = 2000
hdr 0000000003380880 + 0x0:0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x0                        0,0x00,0x00,0x00,0x00,0x00,

Load OK, addr=0x200000, size=0xe5b60
RunBL31 0x40000
NOTICE:  BL31: v1.3(debug):42583b6
NOTICE:  BL31: Built : 07:55:13, Oct 15 2019
NOTICE:  BL31: Rockchip release version: v1.1
INFO:    GICv3 with legacy support detected. ARM GICV3 driver initialized in EL3
INFO:    Using opteed sec cpu_context!
INFO:    boot cpu mask: 0
INFO:    plat_rockchip_pmu_init(1190): pd status 3e
INFO:    BL31: Initializing runtime services
WARNING: No OPTEE provided by BL2 boot loader, Booting device without OPTEE init                        ialization. SMC`s destined for OPTEE will return SMC_UNK
ERROR:   Error initializing runtime service opteed_fast
INFO:    BL31: Preparing for EL3 exit to normal world
INFO:    Entry point address = 0x200000
INFO:    SPSR = 0x3c9


U-Boot 2020.10-armbian (May 06 2021 - 17:13:15 +0000)

SoC: Rockchip rk3399
Reset cause: WDOG
DRAM:  3.9 GiB
PMIC:  RK808
SF: Detected w25q128 with page size 256 Bytes, erase size 4 KiB, total 16 MiB
MMC:   mmc@fe320000: 1, sdhci@fe330000: 0
Loading Environment from MMC... Card did not respond to voltage select!
*** Warning - No block device, using default environment

In:    serial
Out:   serial
Err:   serial
Model: Helios64
Revision: 1.2 - 4GB non ECC
Net:   eth0: ethernet@fe300000
scanning bus for devices...
starting USB...
Bus usb@fe380000: USB EHCI 1.00
Bus dwc3: usb maximum-speed not found
Register 2000140 NbrPorts 2
Starting the controller
USB XHCI 1.10
scanning bus usb@fe380000 for devices... 1 USB Device(s) found
scanning bus dwc3 for devices... cannot reset port 4!?
5 USB Device(s) found
       scanning usb for storage devices... 1 Storage Device(s) found
Hit any key to stop autoboot:  0
switch to partitions #0, OK
mmc1 is current device
** No partition table - mmc 1 **
Card did not respond to voltage select!

Device 0: Vendor: Seagate  Rev: 070B Prod: Expansion Desk
            Type: Hard Disk
            Capacity: 2861588.4 MB = 2794.5 GB (732566645 x 4096)
... is now current device
** Unrecognized filesystem type **
scanning bus for devices...

Device 0: unknown device
Speed: 1000, full duplex
BOOTP broadcast 1
DHCP client bound to address 192.168.0.254 (3 ms)
*** Warning: no boot file name; using 'C0A800FE.img'
Using ethernet@fe300000 device
TFTP from server 192.168.0.1; our IP address is 192.168.0.254
Filename 'C0A800FE.img'.
Load address: 0x800800
Loading: T T T T T T T T T T
Retry count exceeded; starting again
missing environment variable: pxeuuid
missing environment variable: bootfile
Retrieving file: pxelinux.cfg/01-64-62-66-d0-01-58

 

I've aready given i a bump in voltage as recommended in the config file but i cant tell where to check if it  took the settting, frequency is set in armbian-config to ondemand.

 

does anyone have any ideas?

 

Thanks in advance

Krita

 

 

Edited by Krita
formatting
Posted

apparently my searching skills are rubbish, found a thread with the FPDMA QUEUED fault, will investigate that, but seems thats a seperate isue to the reboot.

Posted

So Ive had the crash happen again after allmost exactly 24hrs, this time with logging verbose 5 so i actually got a bit of data. this was under no load at all, no one logged on just a single PC on the network running USB-C serial logging.

 

I have no idea how to interoperate the data but I'm instigating what it means to the best of my ability, any pointer would be great

 

 

Spoiler

[88215.370992] Unable to handle kernel paging request at virtual address 003f8000118b99f0

[88215.371700] Mem abort info:

[88215.371948]   ESR = 0x96000004

[88215.372220]   EC = 0x25: DABT (current EL), IL = 32 bits

[88215.372686]   SET = 0, FnV = 0

[88215.372956]   EA = 0, S1PTW = 0

[88215.373234] Data abort info:

[88215.373488]   ISV = 0, ISS = 0x00000004

[88215.373825]   CM = 0, WnR = 0

[88215.374087] [003f8000118b99f0] address between user and kernel address ranges

[88215.374713] Internal error: Oops: 96000004 [#1] PREEMPT SMP

[88215.375203] Modules linked in: softdog governor_performance snd_soc_hdmi_codec r8152 rockchipdrm snd_soc_rockchip_i2s dw_mipi_dsi rockchip_vdec(C) hantro_vpu(C) dw_hdmi rockchip_rga snd_soc_core analogix_dp v4l2_h264 snd_pcm_dmaengine videobuf2_dma_contig videobuf2_dma_sg videobuf2_vmalloc v4l2_mem2mem gpio_charger drm_kms_helper videobuf2_memops fusb302 snd_pcm cec videobuf2_v4l2 pwm_fan panfrost leds_pwm snd_timer tcpm rc_core videobuf2_common gpu_sched snd typec videodev sg drm soundcore mc drm_panel_orientation_quirks gpio_beeper cpufreq_dt zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) nfsd zcommon(POE) auth_rpcgss znvpair(POE) nfs_acl zavl(POE) icp(POE) lockd spl(OE) grace ledtrig_netdev lm75 sunrpc ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear md_mod uas realtek dwmac_rk stmmac_platform stmmac pcs_xpcs adc_keys

[88215.382244] CPU: 5 PID: 0 Comm: swapper/5 Tainted: P         C OE     5.10.43-rockchip64 #21.05.4

[88215.383019] Hardware name: Helios64 (DT)

[88215.383366] pstate: 60000085 (nZCv daIf -PAN -UAO -TCO BTYPE=--)

[88215.383904] pc : scheduler_tick+0xc4/0x140

[88215.384266] lr : scheduler_tick+0xc4/0x140

[88215.384626] sp : ffff800011c13d90

[88215.384918] x29: ffff800011c13d90 x28: 0000503bdc75d080

[88215.385387] x27: ffff0000f77bb6c0 x26: 0000000000000002

[88215.385856] x25: 0000000000000080 x24: ffff80001156a000

[88215.386325] x23: ffff000000711d00 x22: ffff80001157fd00

[88215.386793] x21: 0000000000000005 x20: ffff8000118b99c8

[88215.387261] x19: ffff0000f77c7d00 x18: 0000000000000000

[88215.387729] x17: 0000000000000000 x16: 00000000000073c0

[88215.388197] x15: 000028eb9d29edfa x14: 0000000000000000

[88215.388666] x13: 0000000000000270 x12: 0000000000000000

[88215.389134] x11: 0000000000000000 x10: 0000000000000004

[88215.389603] x9 : 0000000000000270 x8 : 0000000000000000

[88215.390071] x7 : ffff0000f77c7d00 x6 : ffff0000f77c8800

[88215.390540] x5 : 0000000000001270 x4 : ffff8000e6248000

[88215.391007] x3 : 0000000000010001 x2 : ffff80001156a000

[88215.391475] x1 : ffff8000112a1f58 x0 : 0000000000000005

[88215.391944] Call trace:

[88215.392163]  scheduler_tick+0xc4/0x140

[88215.392497]  update_process_times+0x8c/0xa0

[88215.392867]  tick_sched_handle.isra.19+0x40/0x58

[88215.393273]  tick_sched_timer+0x58/0xb0

[88215.393613]  __hrtimer_run_queues+0x104/0x388

[88215.393997]  hrtimer_interrupt+0xf4/0x250

[88215.394353]  arch_timer_handler_phys+0x30/0x40

[88215.394746]  handle_percpu_devid_irq+0xa0/0x298

[88215.395144]  generic_handle_irq+0x30/0x48

[88215.395498]  __handle_domain_irq+0x94/0x108

[88215.395869]  gic_handle_irq+0xc0/0x140

[88215.396202]  el1_irq+0xc8/0x180

[88215.396482]  arch_cpu_idle+0x18/0x28

[88215.396799]  default_idle_call+0x44/0x1bc

[88215.397152]  do_idle+0x204/0x278

[88215.397437]  cpu_startup_entry+0x28/0x60

[88215.397789]  secondary_start_kernel+0x170/0x180

[88215.398191] Code: 94000cfb aa1303e0 94369ab7 94051e08 (f8757a82)

[88215.398733] ---[ end trace 57a1570b3962ce1b ]---

[88215.399139] Kernel panic - not syncing: Oops: Fatal exception in interrupt

[88215.399743] SMP: stopping secondary CPUs

[88215.400093] Kernel Offset: disabled

[88215.400402] CPU features: 0x0240022,6100200c

[88215.400777] Memory Limit: none

[88215.401056] Rebooting in 90 seconds..

 

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

Loading...
×
×
  • Create New...

Important Information

Terms of Use - Privacy Policy - Guidelines