So I've had the crash happen again after almost exactly 24 hours, this time with logging verbosity set to 5, so I actually got a bit of data. This was under no load at all: no one was logged on, and just a single PC on the network was running USB-C serial logging.
I have no idea how to interpret the data, but I'm investigating what it means to the best of my ability. Any pointers would be great.
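For anyone else wanting to catch a crash over the serial console, here is a minimal sketch of a timestamping logger. The device name /dev/ttyUSB0, the 1500000 baud rate and the log file name are assumptions on my part, so adjust them to your setup. stamp_lines is just a helper name I made up.

```shell
# Prefix every line read from stdin with a UTC wall-clock timestamp,
# so kernel uptime stamps in the log can be matched to real time.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$line"
    done
}

# Hypothetical usage on the logging machine (device and baud rate are assumptions):
#   stty -F /dev/ttyUSB0 1500000 raw -echo
#   cat /dev/ttyUSB0 | stamp_lines >> helios64-serial.log
```

Keeping the wall-clock time next to each line makes it easier to tell how long the box had been up when it died.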
My Helios64 just crashed: the red error LED is blinking.
I had a Raspberry Pi left logging on the USB-C port and caught the crash:
[567513.689265] rk_gmac-dwmac fe300000.ethernet eth0: Link is Down
[567519.833508] rk_gmac-dwmac fe300000.ethernet eth0: Link is Up - 100Mbps/Full - flow control off
[567828.048254] rk_gmac-dwmac fe300000.ethernet eth0: Link is Down
[567834.198593] rk_gmac-dwmac fe300000.ethernet eth0: Link is Up - 100Mbps/Full - flow control off
[568870.455100] rk_gmac-dwmac fe300000.ethernet eth0: Link is Down
[568876.599602] rk_gmac-dwmac fe300000.ethernet eth0: Link is Up - 100Mbps/Full - flow control off
[647513.282616] Unable to handle kernel paging request at virtual address 00078000118b99f0
[647513.283329] Mem abort info:
[647513.283584] ESR = 0x96000004
[647513.283863] EC = 0x25: DABT (current EL), IL = 32 bits
[647513.284336] SET = 0, FnV = 0
[647513.284615] EA = 0, S1PTW = 0
[647513.284899] Data abort info:
[647513.285161] ISV = 0, ISS = 0x00000004
[647513.285506] CM = 0, WnR = 0
[647513.285776] [00078000118b99f0] address between user and kernel address ranges
[647513.286411] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[647513.286909] Modules linked in: iptable_nat iptable_filter bpfilter wireguard libchacha20poly1305 poly1305_neon ip6_udp_tunnel udp_tunnel libblake2s libcurve25519_generic libblake2s_generic veth nf_conntrack_netlink xfrm_user xfrm_algo br_netfilter bridge aufs ipt_REJECT nf_reject_ipv4 rfkill governor_performance nft_chain_nat xt_nat xt_MASQUERADE nf_nat xt_addrtype nft_counter zram xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables nfnetlink r8152 snd_soc_hdmi_codec snd_soc_rockchip_i2s hantro_vpu(C) rockchipdrm rockchip_vdec(C) snd_soc_core dw_mipi_dsi v4l2_h264 dw_hdmi snd_pcm_dmaengine rockchip_rga videobuf2_dma_contig analogix_dp snd_pcm pwm_fan videobuf2_dma_sg v4l2_mem2mem snd_timer gpio_charger videobuf2_vmalloc leds_pwm snd panfrost videobuf2_memops fusb302 drm_kms_helper videobuf2_v4l2 tcpm soundcore gpu_sched videobuf2_common cec typec rc_core videodev sg drm mc drm_panel_orientation_quirks gpio_beeper cpufreq_dt ledtrig_netdev lm75 ip_tables
[647513.287063] x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear md_mod realtek dwmac_rk stmmac_platform stmmac pcs_xpcs adc_keys
[647513.296230] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G C 5.10.35-rockchip64 #21.05.1
[647513.297012] Hardware name: Helios64 (DT)
[647513.297367] pstate: 60000085 (nZCv daIf -PAN -UAO -TCO BTYPE=--)
[647513.297911] pc : scheduler_tick+0xc4/0x140
[647513.298279] lr : scheduler_tick+0xc4/0x140
[647513.298647] sp : ffff800011c13d90
[647513.298947] x29: ffff800011c13d90 x28: 00024ced30605580
[647513.299424] x27: ffff0000f77bb6c0 x26: 0000000000000000
[647513.299900] x25: 0000000000000080 x24: ffff80001156a000
[647513.300375] x23: ffff000000711d00 x22: ffff80001157fd00
[647513.300851] x21: 0000000000000005 x20: ffff8000118b99c8
[647513.301327] x19: ffff0000f77c7d00 x18: 0000000000000610
[647513.301803] x17: 0000000000000010 x16: 0000000000000000
[647513.302280] x15: 0000000000000006 x14: 0000000000000000
[647513.302756] x13: 0000000000000095 x12: 0000000000000000
[647513.303231] x11: 0000000000000000 x10: 0000000000000004
[647513.303707] x9 : 0000000000000095 x8 : 0000000000000000
[647513.304184] x7 : ffff0000f77c7d00 x6 : ffff0000f77c8800
[647513.304659] x5 : 0000000000001095 x4 : ffff8000e6248000
[647513.305135] x3 : 0000000000010001 x2 : ffff80001156a000
[647513.305612] x1 : ffff8000112a1c88 x0 : 0000000000000005
[647513.306088] Call trace:
[647513.306314] scheduler_tick+0xc4/0x140
[647513.306658] update_process_times+0x8c/0xa0
[647513.307035] tick_sched_handle.isra.19+0x40/0x58
[647513.307449] tick_sched_timer+0x58/0xb0
[647513.307795] __hrtimer_run_queues+0x104/0x388
[647513.308187] hrtimer_interrupt+0xf4/0x250
[647513.308551] arch_timer_handler_phys+0x30/0x40
[647513.308950] handle_percpu_devid_irq+0xa0/0x298
[647513.309357] generic_handle_irq+0x30/0x48
[647513.309718] __handle_domain_irq+0x94/0x108
[647513.310097] gic_handle_irq+0xc0/0x140
[647513.310436] el1_irq+0xc0/0x180
[647513.310724] arch_cpu_idle+0x18/0x28
[647513.311047] default_idle_call+0x44/0x1bc
[647513.311409] do_idle+0x204/0x278
[647513.311701] cpu_startup_entry+0x28/0x60
[647513.312056] secondary_start_kernel+0x170/0x180
[647513.312466] Code: 94000cfb aa1303e0 94369a27 940518e0 (f8757a82)
[647513.313015] ---[ end trace 2613ef5b92c55060 ]---
[647513.313430] Kernel panic - not syncing: Oops: Fatal exception in interrupt
[647513.314040] SMP: stopping secondary CPUs
[647513.314403] Kernel Offset: disabled
[647513.314718] CPU features: 0x0240022,6100200c
[647513.315101] Memory Limit: none
[647513.315387] ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
Don't mind the Ethernet link up/down messages (the connected router likes to reboot itself).
The system is running from SD card.
$ uname -a
Linux helios64 5.10.35-rockchip64 #21.05.1 SMP PREEMPT Fri May 7 13:53:11 UTC 2021 aarch64 GNU/Linux
If you need any other information, feel free to ask.
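In case it helps with reading the "Mem abort info" part of the oops, here is a small sketch that splits the ESR value into its fields using plain shell arithmetic. The field layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0, and for data aborts WnR in bit 6 and the fault status code in bits 5:0) is taken from the Arm architecture documentation as I understand it. decode_esr is a name I made up for illustration.

```shell
# Split an AArch64 ESR value into the fields the kernel prints in an oops.
decode_esr() {
    esr=$(( $1 ))
    ec=$(( (esr >> 26) & 0x3f ))    # exception class
    il=$(( (esr >> 25) & 0x1 ))     # 1 = 32-bit instruction
    iss=$(( esr & 0x1ffffff ))      # instruction specific syndrome
    wnr=$(( (iss >> 6) & 0x1 ))     # data aborts: 1 = write, 0 = read
    dfsc=$(( iss & 0x3f ))          # data aborts: fault status code
    printf 'EC=0x%02x IL=%d ISS=0x%08x WnR=%d DFSC=0x%02x\n' \
        "$ec" "$il" "$iss" "$wnr" "$dfsc"
}

decode_esr 0x96000004
# prints: EC=0x25 IL=1 ISS=0x00000004 WnR=0 DFSC=0x04
```

For the value in this log, that matches what the kernel already printed: EC 0x25 is a data abort taken at the current exception level, WnR = 0 means the faulting access was a read, and fault status 0x04 should be a level 0 translation fault, which fits the "address between user and kernel address ranges" line.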
A number of modifications have been suggested to address certain issues.
The ones I can find are:
- In /boot/armbianEnv.txt:
extraargs=libata.force=3.0
- If doing debugging, also add:
verbosity=7
console=serial
extraargs=earlyprintk ignore_loglevel
- In /boot/boot.cmd
regulator dev vdd_log
regulator value 930000
regulator dev vdd_center
regulator value 950000
and then run:
mkimage -C none -A arm -T script -d /boot/boot.cmd /boot/boot.scr
- In /etc/default/cpufrequtils:
ENABLE=true
MIN_SPEED=408000
MAX_SPEED=1800000
GOVERNOR=ondemand
(or 1200000 instead of 1800000)
- And if using ZFS:
for disk in /sys/block/sd[a-e]/queue/scheduler; do echo none > "$disk"; done
I've gathered these from a variety of threads. Am I missing any here?
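After rebooting, it may be worth checking that the settings actually took effect. Below is a rough sketch of how that could be checked. The helper names are my own, and the sysfs paths are the usual Linux ones that I am assuming apply here. The list above has two extraargs= lines; my assumption is that only one of them survives when the file is loaded, so looking at /proc/cmdline shows which arguments really reached the kernel.

```shell
# Print the active I/O scheduler from the contents of a queue/scheduler file,
# where the active one is shown in square brackets.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Succeed if the kernel command line in $1 contains $2 as a whole argument.
has_cmdline_arg() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report what is currently in effect; run this on the Helios64 itself.
report_settings() {
    if [ -r /proc/cmdline ]; then
        cmdline=$(cat /proc/cmdline)
        for arg in libata.force=3.0 ignore_loglevel; do
            if has_cmdline_arg "$cmdline" "$arg"; then
                echo "cmdline: $arg present"
            else
                echo "cmdline: $arg MISSING"
            fi
        done
    fi
    for f in /sys/block/sd[a-e]/queue/scheduler; do
        [ -r "$f" ] && echo "$f: $(active_scheduler "$(cat "$f")")"
    done
    gov=/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
    [ -r "$gov" ] && echo "governor: $(cat "$gov")"
    return 0
}
```

Calling report_settings prints one line per check. If libata.force=3.0 shows up as MISSING while ignore_loglevel is present, putting all arguments on a single extraargs= line would be the first thing I'd try.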
Yup, fixed. Thank you. It may be the default and non-changeable, but it's a suggestion that has been put forward by the Kobol team, which is why I included it.