TheGuv

  1. OK, so I had to reboot soon after the above post due to the server not being responsive. It was still alive with occasional disk activity and the heartbeat LED was happily flashing with the correct cadence. Looks like the r8152 timed out again and could not recover. Not a lot of info to go on, but I don't think it was a busy server at the time. Are these DMA errors what is normally seen with an overloaded core?

     2024-02-05T03:08:32.752569+00:00 helios64 kernel: [40379.321499] xhci-hcd xhci-hcd.0.auto: WARN Event TRB for slot 3 ep 3 with no TDs queued?
     2024-02-05T03:08:32.752783+00:00 helios64 kernel: [40379.321554] xhci-hcd xhci-hcd.0.auto: WARN Event TRB for slot 3 ep 3 with no TDs queued?
     2024-02-05T03:08:32.765068+00:00 helios64 kernel: [40379.337314] xhci-hcd xhci-hcd.0.auto: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 3 comp_code 1
     2024-02-05T03:08:32.765234+00:00 helios64 kernel: [40379.337362] xhci-hcd xhci-hcd.0.auto: Looking for event-dma 00000000f5119f70 trb-start 00000000f5119f50 trb-end 00000000f5119f50 seg-start 00000000f5119000 seg-end 00000000f5119ff0
     2024-02-05T03:08:32.765245+00:00 helios64 kernel: [40379.337421] xhci-hcd xhci-hcd.0.auto: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 3 comp_code 1
     2024-02-05T03:08:32.765299+00:00 helios64 kernel: [40379.337445] xhci-hcd xhci-hcd.0.auto: Looking for event-dma 00000000f5119f80 trb-start 00000000f5119f50 trb-end 00000000f5119f50 seg-start 00000000f5119000 seg-end 00000000f5119ff0
     2024-02-05T03:08:32.765396+00:00 helios64 kernel: [40379.337485] xhci-hcd xhci-hcd.0.auto: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 3 comp_code 1
     2024-02-05T03:08:32.765466+00:00 helios64 kernel: [40379.337510] xhci-hcd xhci-hcd.0.auto: Looking for event-dma 00000000f5119f90 trb-start 00000000f5119f50 trb-end 00000000f5119f50 seg-start 00000000f5119000 seg-end 00000000f5119ff0
     2024-02-05T03:08:32.765473+00:00 helios64 kernel: [40379.337549] xhci-hcd xhci-hcd.0.auto: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 3 comp_code 1
     2024-02-05T03:08:32.765476+00:00 helios64 kernel: [40379.337572] xhci-hcd xhci-hcd.0.auto: Looking for event-dma 00000000f5119fa0 trb-start 00000000f5119f50 trb-end 00000000f5119f50 seg-start 00000000f5119000 seg-end 00000000f5119ff0
     2024-02-05T03:08:32.765480+00:00 helios64 kernel: [40379.337606] xhci-hcd xhci-hcd.0.auto: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 3 comp_code 1
     2024-02-05T03:08:32.765528+00:00 helios64 kernel: [40379.337629] xhci-hcd xhci-hcd.0.auto: Looking for event-dma 00000000f5119fb0 trb-start 00000000f5119f50 trb-end 00000000f5119f50 seg-start 00000000f5119000 seg-end 00000000f5119ff0
     2024-02-05T03:08:32.765536+00:00 helios64 kernel: [40379.337663] xhci-hcd xhci-hcd.0.auto: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 3 comp_code 1
     2024-02-05T03:08:32.766028+00:00 helios64 kernel: [40379.337686] xhci-hcd xhci-hcd.0.auto: Looking for event-dma 00000000f5119fc0 trb-start 00000000f5119f50 trb-end 00000000f5119f50 seg-start 00000000f5119000 seg-end 00000000f5119ff0
     2024-02-05T03:08:32.766039+00:00 helios64 kernel: [40379.337719] xhci-hcd xhci-hcd.0.auto: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 3 comp_code 1
     2024-02-05T03:08:32.766043+00:00 helios64 kernel: [40379.337742] xhci-hcd xhci-hcd.0.auto: Looking for event-dma 00000000f5119fd0 trb-start 00000000f5119f50 trb-end 00000000f5119f50 seg-start 00000000f5119000 seg-end 00000000f5119ff0
     2024-02-05T03:08:32.766048+00:00 helios64 kernel: [40379.337781] xhci-hcd xhci-hcd.0.auto: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 3 comp_code 1
     2024-02-05T03:08:32.766052+00:00 helios64 kernel: [40379.337805] xhci-hcd xhci-hcd.0.auto: Looking for event-dma 00000000f5119fe0 trb-start 00000000f5119f50 trb-end 00000000f5119f50 seg-start 00000000f5119000 seg-end 00000000f5119ff0
     2024-02-05T03:08:32.766057+00:00 helios64 kernel: [40379.337839] xhci-hcd xhci-hcd.0.auto: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 3 comp_code 1
     2024-02-05T03:08:32.766061+00:00 helios64 kernel: [40379.337862] xhci-hcd xhci-hcd.0.auto: Looking for event-dma 00000000f5119ff0 trb-start 00000000f5119f50 trb-end 00000000f5119f50 seg-start 00000000f5119000 seg-end 00000000f5119ff0
     2024-02-05T03:08:32.766065+00:00 helios64 kernel: [40379.337895] xhci-hcd xhci-hcd.0.auto: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 3 comp_code 1
     2024-02-05T03:08:32.766069+00:00 helios64 kernel: [40379.337918] xhci-hcd xhci-hcd.0.auto: Looking for event-dma 00000000f511a000 trb-start 00000000f5119f50 trb-end 00000000f5119f50 seg-start 00000000f5119000 seg-end 00000000f5119ff0
     2024-02-05T03:08:40.392810+00:00 helios64 kernel: [40386.965499] ------------[ cut here ]------------
     2024-02-05T03:08:40.392875+00:00 helios64 kernel: [40386.965584] NETDEV WATCHDOG: enx646266d00c4f (r8152): transmit queue 0 timed out 7628 ms
     2024-02-05T03:08:40.392883+00:00 helios64 kernel: [40386.965775] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:525 dev_watchdog+0x29c/0x2b4
     2024-02-05T03:08:40.395293+00:00 helios64 kernel: [40386.965831] Modules linked in: wireguard libchacha20poly1305 poly1305_neon libcurve25519_generic ip6_udp_tunnel udp_tunnel tls xt_MASQUERADE xt_nat xt_tcpudp iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter bpfilter zram binfmt_misc snd_soc_hdmi_codec snd_soc_rockchip_i2s hantro_vpu rockchip_vdec(C) snd_soc_core v4l2_vp9 snd_compress videobuf2_dma_contig snd_pcm_dmaengine rockchip_rga v4l2_h264 snd_pcm leds_pwm videobuf2_dma_sg v4l2_mem2mem pwm_fan panfrost videobuf2_memops gpio_charger gpu_sched videobuf2_v4l2 videodev drm_shmem_helper videobuf2_common snd_timer snd mc rk_crypto rng_core soundcore gpio_beeper cpufreq_dt sg lm75 ledtrig_netdev nfsd drivetemp auth_rpcgss nfs_acl lockd grace dm_mod sunrpc ip_tables x_tables autofs4 xfs efivarfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid0 multipath linear cdc_ncm cdc_ether usbnet raid1 r8152 md_mod realtek fusb302 tcpm typec dwmac_rk stmmac_platform stmmac pcs_xpcs adc_keys
     2024-02-05T03:08:40.396875+00:00 helios64 kernel: [40386.966750] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G C 6.7.2-edge-rockchip64 #3
     2024-02-05T03:08:40.396885+00:00 helios64 kernel: [40386.966786] Hardware name: Helios64 (DT)
     2024-02-05T03:08:40.396891+00:00 helios64 kernel: [40386.966797] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
     2024-02-05T03:08:40.396897+00:00 helios64 kernel: [40386.966820] pc : dev_watchdog+0x29c/0x2b4
     2024-02-05T03:08:40.396903+00:00 helios64 kernel: [40386.966852] lr : dev_watchdog+0x29c/0x2b4
     2024-02-05T03:08:40.396908+00:00 helios64 kernel: [40386.966879] sp : ffff800080003dc0
     2024-02-05T03:08:40.396914+00:00 helios64 kernel: [40386.966893] x29: ffff800080003dc0 x28: ffff800080ec96fc x27: ffff800080003ec0
     2024-02-05T03:08:40.396920+00:00 helios64 kernel: [40386.966940] x26: ffff800081849008 x25: 0000000000001dcc x24: ffff800081ba7000
     2024-02-05T03:08:40.396927+00:00 helios64 kernel: [40386.966986] x23: 0000000000000000 x22: ffff0000f570b41c x21: ffff0000f570b000
     2024-02-05T03:08:40.396933+00:00 helios64 kernel: [40386.967030] x20: ffff0000f568aa00 x19: ffff0000f570b4c8 x18: ffffffffffffffff
     2024-02-05T03:08:40.396938+00:00 helios64 kernel: [40386.967076] x17: 64656d6974203020 x16: 6575657571207469 x15: 6d736e617274203a
     2024-02-05T03:08:40.396944+00:00 helios64 kernel: [40386.967121] x14: 2932353138722820 x13: 0000000000000377 x12: 00000000ffffffea
     2024-02-05T03:08:40.397021+00:00 helios64 kernel: [40386.967166] x11: 00000000ffffefff x10: 00000000ffffefff x9 : ffff800081c26668
     2024-02-05T03:08:40.397105+00:00 helios64 kernel: [40386.967211] x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 0000000005000000
     2024-02-05T03:08:40.397115+00:00 helios64 kernel: [40386.967255] x5 : 0000000000000001 x4 : 0000000000000040 x3 : 0000000000000001
     2024-02-05T03:08:40.397121+00:00 helios64 kernel: [40386.967298] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff800081bb4500
     2024-02-05T03:08:40.397126+00:00 helios64 kernel: [40386.967344] Call trace:
     2024-02-05T03:08:40.397131+00:00 helios64 kernel: [40386.967356] dev_watchdog+0x29c/0x2b4
     2024-02-05T03:08:40.397149+00:00 helios64 kernel: [40386.967387] call_timer_fn+0x34/0x1c0
     2024-02-05T03:08:40.397157+00:00 helios64 kernel: [40386.967420] __run_timers.part.0+0x228/0x2f4
     2024-02-05T03:08:40.397163+00:00 helios64 kernel: [40386.967451] run_timer_softirq+0x48/0x84
     2024-02-05T03:08:40.397168+00:00 helios64 kernel: [40386.967481] __do_softirq+0x150/0x3e4
     2024-02-05T03:08:40.397173+00:00 helios64 kernel: [40386.967505] ____do_softirq+0x10/0x1c
     2024-02-05T03:08:40.397178+00:00 helios64 kernel: [40386.967533] call_on_irq_stack+0x24/0x4c
     2024-02-05T03:08:40.397183+00:00 helios64 kernel: [40386.967558] do_softirq_own_stack+0x1c/0x2c
     2024-02-05T03:08:40.397188+00:00 helios64 kernel: [40386.967584] irq_exit_rcu+0x9c/0xcc
     2024-02-05T03:08:40.397193+00:00 helios64 kernel: [40386.967616] el1_interrupt+0x38/0x68
     2024-02-05T03:08:40.397198+00:00 helios64 kernel: [40386.967644] el1h_64_irq_handler+0x18/0x24
     2024-02-05T03:08:40.397265+00:00 helios64 kernel: [40386.967671] el1h_64_irq+0x64/0x68
     2024-02-05T03:08:40.397277+00:00 helios64 kernel: [40386.967693] cpuidle_enter_state+0xc4/0x4bc
     2024-02-05T03:08:40.397283+00:00 helios64 kernel: [40386.967722] cpuidle_enter+0x38/0x50
     2024-02-05T03:08:40.397287+00:00 helios64 kernel: [40386.967749] do_idle+0x1fc/0x270
     2024-02-05T03:08:40.397292+00:00 helios64 kernel: [40386.967779] cpu_startup_entry+0x34/0x3c
     2024-02-05T03:08:40.397297+00:00 helios64 kernel: [40386.967807] kernel_init+0x0/0x1e0
     2024-02-05T03:08:40.397302+00:00 helios64 kernel: [40386.967835] arch_post_acpi_subsys_init+0x0/0x8
     2024-02-05T03:08:40.397360+00:00 helios64 kernel: [40386.967865] start_kernel+0x6c4/0x8fc
     2024-02-05T03:08:40.397372+00:00 helios64 kernel: [40386.967891] __primary_switched+0xb4/0xbc
     2024-02-05T03:08:40.397377+00:00 helios64 kernel: [40386.967925] ---[ end trace 0000000000000000 ]---
     2024-02-05T03:08:40.397383+00:00 helios64 kernel: [40386.967975] r8152 2-1.4:1.0 enx646266d00c4f: Tx timeout
     2024-02-05T03:08:45.516397+00:00 helios64 kernel: [40392.085520] xhci-hcd xhci-hcd.0.auto: xHCI host not responding to stop endpoint command
     2024-02-05T03:08:45.516458+00:00 helios64 kernel: [40392.085569] xhci-hcd xhci-hcd.0.auto: xHCI host controller not responding, assume dead
     2024-02-05T03:08:45.516467+00:00 helios64 kernel: [40392.085685] xhci-hcd xhci-hcd.0.auto: HC died; cleaning up
     2024-02-05T03:08:45.516472+00:00 helios64 kernel: [40392.085776] usb 1-1: USB disconnect, device number 2
     2024-02-05T03:08:45.516477+00:00 helios64 kernel: [40392.087845] usb 2-1: USB disconnect, device number 2
     2024-02-05T03:08:45.516482+00:00 helios64 kernel: [40392.087878] r8152-cfgselector 2-1.4: USB disconnect, device number 3
     2024-02-05T03:08:45.516487+00:00 helios64 kernel: [40392.088121] r8152 2-1.4:1.0 enx646266d00c4f: Tx status -108
     2024-02-05T03:08:45.516494+00:00 helios64 kernel: [40392.088156] r8152 2-1.4:1.0 enx646266d00c4f: Tx status -108
     2024-02-05T03:08:45.516501+00:00 helios64 kernel: [40392.088182] r8152 2-1.4:1.0 enx646266d00c4f: Tx status -2
     2024-02-05T03:08:45.516508+00:00 helios64 kernel: [40392.088229] r8152 2-1.4:1.0 enx646266d00c4f: Tx status -108
     2024-02-05T03:08:45.516513+00:00 helios64 kernel: [40392.088416] r8152 2-1.4:1.0 enx646266d00c4f: Get ether addr fail
  2. Currently testing 6.7.2, and this might be the most reliable kernel for me yet: currently at 4 days (!) uptime. The only issue I am experiencing is a Tx timeout on the r8152 ethernet port, which causes a pause for a couple of seconds, e.g.:

     [285266.096001] r8152 2-1.4:1.0 enx646266d00c4f: Tx timeout
     [285266.100079] r8152 2-1.4:1.0 enx646266d00c4f: Tx status -2
     [285266.100157] r8152 2-1.4:1.0 enx646266d00c4f: Tx status -2
     [285266.100230] r8152 2-1.4:1.0 enx646266d00c4f: Tx status -2
     [285266.100375] r8152 2-1.4:1.0 enx646266d00c4f: Tx status -2
     [285268.220401] r8152-cfgselector 2-1.4: reset SuperSpeed USB device number 3 using xhci-hcd

     I am using firmware from the kernel source tree:

     [ 13.639966] usbcore: registered new device driver r8152-cfgselector
     [ 13.853975] r8152-cfgselector 2-1.4: reset SuperSpeed USB device number 3 using xhci-hcd
     [ 14.076359] r8152 2-1.4:1.0: load rtl8156a-2 v2 04/27/23 successfully
     [ 14.132728] r8152 2-1.4:1.0 eth0: v1.12.13
     [ 14.132950] usbcore: registered new interface driver r8152

     The kernel device tree does not appear to contain the patches for either the eMMC speed and strobe fixes, or the CPU cache. I tried patching them in myself and things didn't end well, but I'm no expert and things are probably different for 6.7.x.
  3. ebin-dev thank you so much. I now see how to add the l2-cache. The result is much the same, with no more than 3 days of uptime before a random reboot occurs, or (rarely) a frozen box with a red system light. I did indeed flash the bootloader via the armbian-config update bootloader (option 5). I'm using the bootloader from linux-u-boot-edge-helios64_22.02.1_arm64.deb. I'm still seeing r8152 USB timeouts like this:

     2024-01-30T04:16:06.975984+00:00 helios64 kernel: [42578.782562] r8152 2-1.4:1.0 enx646266d00c4f: Tx timeout
     2024-01-30T04:16:06.983800+00:00 helios64 kernel: [42578.786383] r8152 2-1.4:1.0 enx646266d00c4f: Tx status -2
     2024-01-30T04:16:06.983874+00:00 helios64 kernel: [42578.786571] r8152 2-1.4:1.0 enx646266d00c4f: Tx status -2
     2024-01-30T04:16:06.983886+00:00 helios64 kernel: [42578.786710] r8152 2-1.4:1.0 enx646266d00c4f: Tx status -2
     2024-01-30T04:16:06.983897+00:00 helios64 kernel: [42578.786812] r8152 2-1.4:1.0 enx646266d00c4f: Tx status -2
     2024-01-30T04:16:09.103853+00:00 helios64 kernel: [42580.906680] r8152-cfgselector 2-1.4: reset SuperSpeed USB device number 4 using xhci-hcd

     and I am using the rtl_nic firmware from git.kernel.org. I guess it could be NFS causing problems, although the network would have been quiet at 4am. I'll give 5.1.71 a whirl.
  4. @ebin-dev Am I correct that the 6.6.8 debs you kindly share from your Dropbox do not contain the hs400 mmc speed, extended strobe and CPU cache patches to the DTB? I worked out how to patch the mmc things but couldn't work out how to apply the cache patches shown in the kernel list dtsi patch set. In other news, the 6.6.8 version is probably the best I've had on my system and now typically lasts 3 days between crashes. It was a lot worse until I realised that I was using the gmac ethernet port rather than the r8152 2.5Gb one, and that was causing a lot more instability. The r8152 does still time out sometimes, though, even with the mainline firmware and offloading enabled. I do use the system as an NFS server, with several systems connected.
  5. Thank you @ebin-dev for the help and clues. I tried 5.15.52, installing three .deb files downloaded from the 5.15.52 pool directory:

     linux-dtb-current-rockchip64_22.05.4_arm64.deb
     linux-headers-current-rockchip64_22.05.4_arm64.deb
     linux-image-current-rockchip64_22.05.4_arm64.deb

     Afterwards I had no networking, and both the serial and console were acting strangely, with hangs and pauses which prevented me from logging in to see what was going on. I reverted to 5.10.43. Did I need to do something else, like use different firmware in /lib/firmware/rtl_nic, which is currently from https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/rtl_nic ? Should I have updated the initrd after installing the .debs? I know I'm being daft somewhere, but after so many different tests over the last weeks I'm kind of spinning!
  6. I appreciate the work being done to get us some stability. I'm currently running 5.10.43 with r8152 v2.14.0 but finding it unstable, with several reboots each day. I think 6.1.36 for me was marginally better, but not much in it. @ebin-dev seems to be having a good time with 5.10.43, so I'm assuming I've missed something along the way. I have:

     uBoot from 22.02.1
     kernel 5.10.43-rockchip64 #21.05.4 SMP PREEMPT Wed Jun 16 08:02:12 UTC 2021
     r8152a v2.14.0 (2020/09/24)
     CPU min/max 1608000/performance
     4x HDD using mergerfs and snapraid, 1x SSD for root drive, and booting from eMMC

     Did I miss something?
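
For anyone comparing their own logs against the ones in these posts, here is a rough sketch (not a definitive tool) of a Python helper that separates the two failure modes seen above: a Tx timeout that recovers after a USB reset, and one where the xHCI host controller dies and takes the r8152 with it. The function names `summarise` and `verdict` are made up for illustration, and the regular expressions are only matched against the message formats quoted in the posts, so other kernels or driver versions may word things differently. The errno meanings are the standard Linux values (2 is ENOENT, 108 is ESHUTDOWN) as described in the kernel's USB error-code documentation.

```python
import re
from collections import Counter

# Linux errno values as they appear (negated) in the r8152 "Tx status" lines.
URB_STATUS = {
    -2: "ENOENT (URB was unlinked/cancelled)",
    -108: "ESHUTDOWN (device or host controller disabled or disconnected)",
}

# Patterns modelled on the log lines quoted in the posts (assumed formats).
TX_TIMEOUT = re.compile(r"r8152 \S+ \S+: Tx timeout")
TX_STATUS = re.compile(r"r8152 \S+ \S+: Tx status (-?\d+)")
HC_DIED = re.compile(r"xhci-hcd \S+: HC died")
USB_RESET = re.compile(r"r8152-cfgselector \S+: reset \w+ USB device")


def summarise(lines):
    """Count r8152/xHCI events in an iterable of kernel log lines."""
    counts = Counter()
    for line in lines:
        if TX_TIMEOUT.search(line):
            counts["tx_timeout"] += 1
        match = TX_STATUS.search(line)
        if match:
            counts[("tx_status", int(match.group(1)))] += 1
        if HC_DIED.search(line):
            counts["hc_died"] += 1
        if USB_RESET.search(line):
            counts["usb_reset"] += 1
    return counts


def verdict(counts):
    """Turn the counts into a one-line, best-guess description."""
    if counts["hc_died"]:
        return "host controller died: USB NIC gone until reboot"
    if counts["tx_timeout"] and counts["usb_reset"]:
        return "Tx timeout followed by USB reset: recovered"
    if counts["tx_timeout"]:
        return "Tx timeout without a logged reset"
    return "no r8152 Tx timeout found"


def describe_status(code):
    """Look up a Tx status code; unknown codes are reported as such."""
    return URB_STATUS.get(code, "unknown status %d" % code)
```

Feeding it the lines of a saved kernel log (for example an open file object) and printing `verdict(summarise(lines))` gives a quick first classification. Note that the device reset logged at boot also counts as a `usb_reset`, so it is best run on an excerpt around the incident rather than on a whole boot log.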