Jump to content

heliso64 uncommanded reboot, then fails to find SDcard


Krita

Recommended Posts

Hi all,

 

been trying to sort out a stability problem with what was a rock solid helios64, i moved from running armbian on the eMMC to a 256GB sandisk extreme SD card due to plex continually filling it up.

 

currently running  Armbian 21.05.6 Buster with Linux 5.10.43-rockchip64, running OMV, Plex and ZFS not much more

 

it appears this is where things go pear shaped:

 

Spoiler
[ 6636.457621] ata1.00: exception Emask 0x2 SAct 0x3800000 SErr 0x1000400 action                         0x6
[ 6636.458306] ata1.00: irq_stat 0x08000000
[ 6636.458659] ata1: SError: { Proto TrStaTrns }
[ 6636.459051] ata1.00: failed command: READ FPDMA QUEUED
[ 6636.459516] ata1.00: cmd 60/58:b8:e0:b3:73/05:00:01:00:00/40 tag 23 ncq dma 7                        00416 in
[ 6636.459516]          res 40/00:bc:e0:b3:73/00:00:01:00:00/40 Emask 0x2 (HSM v                        iolation)
[ 6636.460898] ata1.00: status: { DRDY }
[ 6636.461227] ata1.00: failed command: READ FPDMA QUEUED
[ 6636.461743] ata1.00: cmd 60/40:c0:80:ae:73/05:00:01:00:00/40 tag 24 ncq dma 6                        88128 in
[ 6636.461743]          res 40/00:bc:e0:b3:73/00:00:01:00:00/40 Emask 0x2 (HSM v                        iolation)
[ 6636.457621] ata1.00: exception Emask 0[ 6636.463130] ata1.00: status: { DRDY                         }
x2 SAct 0x3800000 SErr 0x1000400 action 0x6
[ 6636.458[ 6636.463942] ata1.00: failed command: READ FPDMA QUEUED
306] ata1.00: irq_stat 0x08000000
[ [ 6636.464874] ata1.00: cmd 60/20:c8:c0:b3:73/00:00:01:00:00/40 tag 25 ncq dma                         16384 in
[ 6636.464874]          res 40/00:bc:e0:b3:73/00:00:01:00:00/40 Emask 0x2 (HSM v                        iolation)
6636.458659] ata1: SError: { Proto TrStaTrns }[ 6636.466733] ata1.00: status: {                         DRDY }

[ 6636.459051] ata1.00: failed command: READ FPDMA QUEUED
[ 6636.459516] ata1.00: cmd 60/58:b8:e0:b3:73/05:00:01:00:00/40 tag 23 ncq dma 7                        00416 in
                        res 40/00:bc:e0:b3:73/00:00:01:00:00/40 Emask 0x2 (HSM v                        iolation)
[ 6636.460898] ata1.00: status: { DRDY }
[ 6636.461227] ata1.00: failed command: READ FPDMA QUEUED
[ 6636.461743] ata1.00: cmd 60/40:c0:80:ae:73/05:00:01:00:00/40 tag 24 ncq dma 6                        88128 in
                        res 40/00:bc:e0:b3:73/00:00:01:00:00/40 Emask 0x2 (HSM v                        iolation)
[ 6636.463130] ata1.00: status: { DRDY }
[ 6636.463942] ata1.00: failed command: READ FPDMA QUEUED
[ 6636.464874] ata1.00: cmd 60/20:c8:c0:b3:73/00:00:01:00:00/40 tag 25 ncq dma 1                        6384 in
                        res 40/00:bc:e0:b3:73/00:00:01:00:00/40 Emask 0x2 (HSM v                        iolation)
[ 6636.466733] ata1.00: status: { DRDY }
[ 6636.467511] ata1: hard resetting link
[ 6636.943205] blk_update_request: I/O error, dev sda, sector 24359904 op 0x0:(R                        EAD) flags 0x700 phys_seg 12 prio class 0
[ 6636.944176] zio pool=helios vdev=/dev/disk/by-id/ata-WDC_WD40PURX-64GVNY0_WD-                        WCC4E2LRTU90-part1 error=5 type=1 offset=12471222272 size=700416 flags=40080ca8
[ 6636.945526] blk_update_request: I/O error, dev sda, sector 24358528 op 0x0:(R                        EAD) flags 0x4700 phys_seg 168 prio class 0
[ 6636.941590] ata1: SATA link up 6.0 Gbps (SS[ 6636.946685] blk_update_request:                         I/O error, dev sda, sector 24359872 op 0x0:(READ) flags 0x700 phys_seg 4 prio c                        lass 0
tatus 133 SControl 300)
[ 6636.942889] ata1.00 6636.947962] zio pool=helios vdev=/dev/disk/by-id/ata-WDC                        _WD40PURX-64GVNY0_WD-WCC4E2LRTU90-part1 error=5 type=1 offset=12470517760 size=7                        04512 flags=40080ca8
m: configured for UDMA/133
[ 6636.943169] sd 0:0:0:0: [sda] tag#23 UNKNOWN(0x2003) Result: hostbyte=0x00 dr                        iverbyte=0x08 cmd_age=5s
[ 6636.943179] sd 0:0:0:0: [sda] tag#23 Sense Key : 0x5 [current]
[ 6636.943186] sd 0:0:0:0: [sda] tag#23 ASC=0x21 ASCQ=0x4
[ 6636.943196] sd 0:0:0:0: [sda] tag#23 CDB: opcode=0x88 88 00 00 00 00 00 01 73                         b3 e0 00 00 05 58 00 00
[ 6636.943205] blk_update_request: I/O error, dev sda, sector 24359904 op 0x0:(R                        EAD) flags 0x700 phys_seg 12 prio class 0
[ 6636.944176] zio pool=helios vdev=/dev/disk/by-id/ata-WDC_WD40PURX-64GVNY0_WD-                        WCC4E2LRTU90-part1 error=5 type=1 offset=12471222272 size=700416 flags=40080ca8
[ 6636.945496] sd 0:0:0:0: [sda] tag#24 UNKNOWN(0x2003) Result: hostbyte=0x00 dr                        iverbyte=0x08 cmd_age=5s
[ 6636.945504] sd 0:0:0:0: [sda] tag#24 Sense Key : 0x5 [current]
[ 6636.945511] sd 0:0:0:0: [sda] tag#24 ASC=0x21 ASCQ=0x4
[ 6636.945518] sd 0:0:0:0: [sda] tag#24 CDB: opcode=0x88 88 00 00 00 00 00 01 73                         ae 80 00 00 05 40 00 00
[ 6636.945526] blk_update_request: I/O error, dev sda, sector 24358528 op 0x0:(R                        EAD) flags 0x4700 phys_seg 168 prio class 0
[ 6636.946645] sd 0:0:0:0: [sda] tag#25 UNKNOWN(0x2003) Result: hostbyte=0x00 dr                        iverbyte=0x08 cmd_age=5s
[ 6636.946659] sd 0:0:0:0: [sda] tag#25 Sense Key : 0x5 [current]
[ 6636.946666] sd 0:0:0:0: [sda] tag#25 ASC=0x21 ASCQ=0x4
[ 6636.946676] sd 0:0:0:0: [sda] tag#25 CDB: opcode=0x88 88 00 00 00 00 00 01 73                         b3 c0 00 00 00 20 00 00
[ 6636.946685] blk_update_request: I/O error, dev sda, sector 24359872 op 0x0:(R                        EAD) flags 0x700 phys_seg 4 prio class 0
[ 6636.947962] zio pool=helios vdev=/dev/disk/by-id/ata-WDC_WD40PURX-64GVNY0_WD-                        WCC4E2LRTU90-part1 error=5 type=1 offset=12470517760 size=704512 flags=40080ca8
[ 6636.949804] ata1: EH complete
DDR Version 1.24 20191016
In
soft reset
SRX
channel 0
CS = 0
MR0=0x18
MR4=0x2
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 1
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 0 training pass!
channel 1 training pass!
change freq to 416MHz 0,1
Channel 0: LPDDR4,416MHz
Bus Width=32 Col=10 Bank=8 Row=16 CS=1 Die Bus-Width=16 Size=2048MB
Channel 1: LPDDR4,416MHz
Bus Width=32 Col=10 Bank=8 Row=16 CS=1 Die Bus-Width=16 Size=2048MB
256B stride
channel 0
CS = 0
MR0=0x18
MR4=0x2
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 1
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 0 training pass!
channel 1 training pass!
channel 0, cs 0, advanced training done
channel 1, cs 0, advanced training done
change freq to 856MHz 1,0
ch 0 ddrconfig = 0x101, ddrsize = 0x40
ch 1 ddrconfig = 0x101, ddrsize = 0x40
pmugrf_os_reg[2] = 0x32C1F2C1, stride = 0xD
ddr_set_rate to 328MHZ
ddr_set_rate to 666MHZ
ddr_set_rate to 928MHZ
channel 0, cs 0, advanced training done
channel 1, cs 0, advanced training done
ddr_set_rate to 416MHZ, ctl_index 0
ddr_set_rate to 856MHZ, ctl_index 1
support 416 856 328 666 928 MHz, current 856MHz
OUT
Boot1: 2019-03-14, version: 1.19
CPUId = 0x0
ChipType = 0x10, 323
mmc: ERROR: SDHCI ERR:cmd:0x102,stat:0x18000
mmc: ERROR: Card did not respond to voltage select!
emmc reinit
mmc: ERROR: SDHCI ERR:cmd:0x102,stat:0x18000
mmc: ERROR: Card did not respond to voltage select!
emmc reinit
mmc: ERROR: SDHCI ERR:cmd:0x102,stat:0x18000
mmc: ERROR: Card did not respond to voltage select!
SdmmcInit=2 1
mmc0:cmd5,20
SdmmcInit=0 0
BootCapSize=0
UserCapSize=244016MB
FwPartOffset=2000 , 0
StorageInit ok = 56917
SecureMode = 0
SecureInit read PBA: 0x4
SecureInit read PBA: 0x404
SecureInit read PBA: 0x804
SecureInit read PBA: 0xc04
SecureInit read PBA: 0x1004
SecureInit read PBA: 0x1404
SecureInit read PBA: 0x1804
SecureInit read PBA: 0x1c04
SecureInit ret = 0, SecureMode = 0
atags_set_bootdev: ret:(0)
GPT 0x3380ec0 signature is wrong
recovery gpt...
GPT 0x3380ec0 signature is wrong
recovery gpt fail!
LoadTrust Addr:0x4000
No find bl30.bin
No find bl32.bin
Load uboot, ReadLba = 2000
hdr 0000000003380880 + 0x0:0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x0                        0,0x00,0x00,0x00,0x00,0x00,

Load OK, addr=0x200000, size=0xe5b60
RunBL31 0x40000
NOTICE:  BL31: v1.3(debug):42583b6
NOTICE:  BL31: Built : 07:55:13, Oct 15 2019
NOTICE:  BL31: Rockchip release version: v1.1
INFO:    GICv3 with legacy support detected. ARM GICV3 driver initialized in EL3
INFO:    Using opteed sec cpu_context!
INFO:    boot cpu mask: 0
INFO:    plat_rockchip_pmu_init(1190): pd status 3e
INFO:    BL31: Initializing runtime services
WARNING: No OPTEE provided by BL2 boot loader, Booting device without OPTEE init                        ialization. SMC`s destined for OPTEE will return SMC_UNK
ERROR:   Error initializing runtime service opteed_fast
INFO:    BL31: Preparing for EL3 exit to normal world
INFO:    Entry point address = 0x200000
INFO:    SPSR = 0x3c9


U-Boot 2020.10-armbian (May 06 2021 - 17:13:15 +0000)

SoC: Rockchip rk3399
Reset cause: WDOG
DRAM:  3.9 GiB
PMIC:  RK808
SF: Detected w25q128 with page size 256 Bytes, erase size 4 KiB, total 16 MiB
MMC:   mmc@fe320000: 1, sdhci@fe330000: 0
Loading Environment from MMC... Card did not respond to voltage select!
*** Warning - No block device, using default environment

In:    serial
Out:   serial
Err:   serial
Model: Helios64
Revision: 1.2 - 4GB non ECC
Net:   eth0: ethernet@fe300000
scanning bus for devices...
starting USB...
Bus usb@fe380000: USB EHCI 1.00
Bus dwc3: usb maximum-speed not found
Register 2000140 NbrPorts 2
Starting the controller
USB XHCI 1.10
scanning bus usb@fe380000 for devices... 1 USB Device(s) found
scanning bus dwc3 for devices... cannot reset port 4!?
5 USB Device(s) found
       scanning usb for storage devices... 1 Storage Device(s) found
Hit any key to stop autoboot:  0
switch to partitions #0, OK
mmc1 is current device
** No partition table - mmc 1 **
Card did not respond to voltage select!

Device 0: Vendor: Seagate  Rev: 070B Prod: Expansion Desk
            Type: Hard Disk
            Capacity: 2861588.4 MB = 2794.5 GB (732566645 x 4096)
... is now current device
** Unrecognized filesystem type **
scanning bus for devices...

Device 0: unknown device
Speed: 1000, full duplex
BOOTP broadcast 1
DHCP client bound to address 192.168.0.254 (3 ms)
*** Warning: no boot file name; using 'C0A800FE.img'
Using ethernet@fe300000 device
TFTP from server 192.168.0.1; our IP address is 192.168.0.254
Filename 'C0A800FE.img'.
Load address: 0x800800
Loading: T T T T T T T T T T
Retry count exceeded; starting again
missing environment variable: pxeuuid
missing environment variable: bootfile
Retrieving file: pxelinux.cfg/01-64-62-66-d0-01-58

 

I've aready given i a bump in voltage as recommended in the config file but i cant tell where to check if it  took the settting, frequency is set in armbian-config to ondemand.

 

does anyone have any ideas?

 

Thanks in advance

Krita

 

 

Edited by Krita
formatting
Link to comment
Share on other sites

So Ive had the crash happen again after allmost exactly 24hrs, this time with logging verbose 5 so i actually got a bit of data. this was under no load at all, no one logged on just a single PC on the network running USB-C serial logging.

 

I have no idea how to interoperate the data but I'm instigating what it means to the best of my ability, any pointer would be great

 

 

Spoiler

[88215.370992] Unable to handle kernel paging request at virtual address 003f8000118b99f0

[88215.371700] Mem abort info:

[88215.371948]   ESR = 0x96000004

[88215.372220]   EC = 0x25: DABT (current EL), IL = 32 bits

[88215.372686]   SET = 0, FnV = 0

[88215.372956]   EA = 0, S1PTW = 0

[88215.373234] Data abort info:

[88215.373488]   ISV = 0, ISS = 0x00000004

[88215.373825]   CM = 0, WnR = 0

[88215.374087] [003f8000118b99f0] address between user and kernel address ranges

[88215.374713] Internal error: Oops: 96000004 [#1] PREEMPT SMP

[88215.375203] Modules linked in: softdog governor_performance snd_soc_hdmi_codec r8152 rockchipdrm snd_soc_rockchip_i2s dw_mipi_dsi rockchip_vdec(C) hantro_vpu(C) dw_hdmi rockchip_rga snd_soc_core analogix_dp v4l2_h264 snd_pcm_dmaengine videobuf2_dma_contig videobuf2_dma_sg videobuf2_vmalloc v4l2_mem2mem gpio_charger drm_kms_helper videobuf2_memops fusb302 snd_pcm cec videobuf2_v4l2 pwm_fan panfrost leds_pwm snd_timer tcpm rc_core videobuf2_common gpu_sched snd typec videodev sg drm soundcore mc drm_panel_orientation_quirks gpio_beeper cpufreq_dt zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) nfsd zcommon(POE) auth_rpcgss znvpair(POE) nfs_acl zavl(POE) icp(POE) lockd spl(OE) grace ledtrig_netdev lm75 sunrpc ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear md_mod uas realtek dwmac_rk stmmac_platform stmmac pcs_xpcs adc_keys

[88215.382244] CPU: 5 PID: 0 Comm: swapper/5 Tainted: P         C OE     5.10.43-rockchip64 #21.05.4

[88215.383019] Hardware name: Helios64 (DT)

[88215.383366] pstate: 60000085 (nZCv daIf -PAN -UAO -TCO BTYPE=--)

[88215.383904] pc : scheduler_tick+0xc4/0x140

[88215.384266] lr : scheduler_tick+0xc4/0x140

[88215.384626] sp : ffff800011c13d90

[88215.384918] x29: ffff800011c13d90 x28: 0000503bdc75d080

[88215.385387] x27: ffff0000f77bb6c0 x26: 0000000000000002

[88215.385856] x25: 0000000000000080 x24: ffff80001156a000

[88215.386325] x23: ffff000000711d00 x22: ffff80001157fd00

[88215.386793] x21: 0000000000000005 x20: ffff8000118b99c8

[88215.387261] x19: ffff0000f77c7d00 x18: 0000000000000000

[88215.387729] x17: 0000000000000000 x16: 00000000000073c0

[88215.388197] x15: 000028eb9d29edfa x14: 0000000000000000

[88215.388666] x13: 0000000000000270 x12: 0000000000000000

[88215.389134] x11: 0000000000000000 x10: 0000000000000004

[88215.389603] x9 : 0000000000000270 x8 : 0000000000000000

[88215.390071] x7 : ffff0000f77c7d00 x6 : ffff0000f77c8800

[88215.390540] x5 : 0000000000001270 x4 : ffff8000e6248000

[88215.391007] x3 : 0000000000010001 x2 : ffff80001156a000

[88215.391475] x1 : ffff8000112a1f58 x0 : 0000000000000005

[88215.391944] Call trace:

[88215.392163]  scheduler_tick+0xc4/0x140

[88215.392497]  update_process_times+0x8c/0xa0

[88215.392867]  tick_sched_handle.isra.19+0x40/0x58

[88215.393273]  tick_sched_timer+0x58/0xb0

[88215.393613]  __hrtimer_run_queues+0x104/0x388

[88215.393997]  hrtimer_interrupt+0xf4/0x250

[88215.394353]  arch_timer_handler_phys+0x30/0x40

[88215.394746]  handle_percpu_devid_irq+0xa0/0x298

[88215.395144]  generic_handle_irq+0x30/0x48

[88215.395498]  __handle_domain_irq+0x94/0x108

[88215.395869]  gic_handle_irq+0xc0/0x140

[88215.396202]  el1_irq+0xc8/0x180

[88215.396482]  arch_cpu_idle+0x18/0x28

[88215.396799]  default_idle_call+0x44/0x1bc

[88215.397152]  do_idle+0x204/0x278

[88215.397437]  cpu_startup_entry+0x28/0x60

[88215.397789]  secondary_start_kernel+0x170/0x180

[88215.398191] Code: 94000cfb aa1303e0 94369ab7 94051e08 (f8757a82)

[88215.398733] ---[ end trace 57a1570b3962ce1b ]---

[88215.399139] Kernel panic - not syncing: Oops: Fatal exception in interrupt

[88215.399743] SMP: stopping secondary CPUs

[88215.400093] Kernel Offset: disabled

[88215.400402] CPU features: 0x0240022,6100200c

[88215.400777] Memory Limit: none

[88215.401056] Rebooting in 90 seconds..

 

Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

Loading...
×
×
  • Create New...

Important Information

Terms of Use - Privacy Policy - Guidelines