stefan.steffens Posted March 11, 2019
TINKERBOARD FREEZES UNDER NFS LOAD
Hi, I'm not sure where to post my issue, but the Rockchip Ethernet issue which freezes my Tinkerboard under NFS load is back with the new images.
So far I've been using the 'Armbian_5.59_Tinkerboard_Ubuntu_bionic_next_4.14.67_desktop' image, upgraded via apt to the most recent version (ARMBIAN 5.73 stable Ubuntu 18.04.2 LTS 4.19.20-rockchip). Based on prior advice I've added
sudo ethtool -K eth0 tx off rx off
to /etc/rc.local, as the rk3328 is buggy on tx offloading.
Given the new images, I've tried 'Armbian_5.75_Tinkerboard_Ubuntu_bionic_next_4.19.20_desktop' with the same entry in /etc/rc.local, but as soon as I start multiple compiles via NFS the Tinkerboard freezes, just like the old image did prior to the ethtool modifications.
Any idea what might be wrong? I otherwise like some of the new features from the updated image.
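Since the workaround depends on the rc.local entry actually taking effect, it can help to confirm the offload state after boot. The following is only a minimal sketch: `feature_state` is a hypothetical helper name (not part of ethtool), and it assumes the usual `name: on|off` line format printed by `ethtool -k`.

```shell
#!/bin/sh
# feature_state NAME: read `ethtool -k` output on stdin and print the state
# ("on" or "off") of the feature called NAME, e.g. tx-checksumming.
# Hypothetical helper for illustration; it only parses text.
feature_state() {
    awk -v f="$1" -F': *' '
        # Match the feature name exactly, allowing leading indentation.
        $1 ~ ("^[ \t]*" f "$") { split($2, a, " "); print a[1]; exit }
    '
}

# Typical use on the board (needs root and the ethtool package):
#   ethtool -K eth0 tx off rx off
#   ethtool -k eth0 | feature_state tx-checksumming
```

If the helper still reports "on" after boot, the rc.local entry probably did not run or ran before eth0 existed.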
JMCC Posted March 11, 2019 (edited)
On 3/4/2019 at 4:35 AM, TonyMac32 said: For the mainline ethernet tx/rx delays, were those just copied from the Rock64 or were they empirically determined?
Firefly added some patches to adjust tx/rx delays (ref), as well as for disabling UHS modes that were causing trouble, lowering dmc and sdmmc frequencies, and some other tweaks. I patched all of them into the mainline DTB; that was the one we added to rockchip64-Default in the early stages of our Renegade support. I later patched that same Default DTB to add the 1.5 GHz CPU frequency and to improve the LEDs' functions, and that's the one we are using right now for 4.4.y Default. So we can probably just use our Default DTB patches for mainline too.
--------------------
Besides that, I am planning to find some time to double-check the SD card thing, probably re-enabling UHS and adding the kernel configs needed for it to work. The main reason is that my Renegade hangs on shutdown (not on reboot), and on the next boot it runs an fsck on the SD card. So I suspect that disabling the SD regulators causes some trouble here, and I want to try re-enabling them. However, I am waiting for a Renegade sample from @Da Xue, just to make sure those problems are not exclusive to my particular unit, which has proven to be "special" on other occasions regarding SD cards.
Edited March 11, 2019 by JMCC. Edited: do not use the patched Default DTB for mainline, but apply the Default DTB patches to mainline
JMCC Posted March 12, 2019
Okay, I just had a look at the commit history, and the history of "/patch/kernel/rockchip64-default/Add_dts_rk3328-roc-cc.patch" was lost when it was merged into a commit for creating the rockchip64 family: now only the tiny LED patch that I applied recently appears. So it will be harder to figure out which patches I applied in the first place to make the board work. I'll try to figure it out.
[EDIT]: Never mind, I just figured out that the directory containing the patches changed its name. Now I have found the original DTS. @Igor please ignore the question I just deleted, in case you got to read it.
JMCC Posted March 12, 2019
On 3/4/2019 at 4:35 AM, TonyMac32 said: Any argument against moving Ayufan's RK3328 mainline to "next" and putting true mainline at "dev"? I think it's to the point that 5.1/5.2 are nearly to RK3288 levels of support in mainline.
Then why not push 5.2 directly into "next", if it has better support than ayufan's 4.20? In any case, if that's going to be too much hassle, +1 for adding something to "next", even if it's ayufan's 4.20. Having just default for these boards is boring.
TonyMac32 Posted March 12, 2019
Lol, we will have some work to do on the new kernel anyway, making it dev. My goal is to wean us off the special kernels so we can work on getting rk3399 integrated into a single family; what we're doing right now is insane.
JMCC Posted March 12, 2019
6 minutes ago, TonyMac32 said: getting rk3399 integrated into a single family,
Definitely. Right now, there are "first class" and "second class" RK3399 boards: those under the rk3399 family get all the fancy media stuff, while those in the rockchip64 family don't.
balbes150 Posted March 13, 2019
IMHO the "next" branch should use the primary sources (the Linux NEXT branch) with minimal patches. The "dev" branch can use any kernel sources and any patches for the kernel and DTB. That way you will know exactly what state the support for specific processors is in within the mainline kernel, and can work on patches to be sent upstream. After all, the main task is to get full support in the mainline kernel by default (without additional patches in Armbian itself, except for the patch to build DEB packages).
jpegxguy Posted March 14, 2019
I tried for the third time to get an answer from archlinux's kevin on the usb3 patch for Renegade on archlinuxARM, using another account (sketchy, I know). Apparently I've been added to the sh*tlist for "blatantly" ignoring his oh-so-important rules, to the point that he had to ban me from contributing to the project instead of acting like a normal adult. I sent 2 emails (ignored) and looked over the contributor guidelines again. I formatted my patch as seen in other merged PRs, and the only response I got was
" Skirting bans from your blatant, irreverent disregard for posted rules and requirements and then continuing that tradition has now ensured your permanent placement on the sh*tlist "
I'm so done with this guy. He thinks he's a god. Just look at the way he talks, jesus christ. Seems like he's allergic to equal support for rk3328 boards. Rock64 has exclusive rights to usb3... At least this way I finally got an answer instead of the usual silence.
jpegxguy Posted March 14, 2019
BTW: The hostility of Arch Linux ARM towards contributions is nothing new: so many PRs get closed without an answer. Sucks for the distro that the devs run on permanent Red Bull rage
TonyMac32 Posted March 14, 2019
Well, let's avoid other distros and their management as far as discussion topics go. That's a bummer, and your patches are welcome here.
jpegxguy Posted March 14, 2019
Yeah, sorry. I was just really bummed myself. I'll be looking more to actual mainline for mainline stuff, as it should be
jad3675 Posted March 16, 2019 (edited)
On 3/11/2019 at 3:31 PM, stefan.steffens said: TINKERBOARD FREEZES UNDER NFS LOAD Hi, I'm not sure where to post my issue, but the Rockchip Ethernet issue which freezes my Tinkerboard under NFS load is back with the new images. So far I've been using the 'Armbian_5.59_Tinkerboard_Ubuntu_bionic_next_4.14.67_desktop' image, upgraded via apt to the most recent version (ARMBIAN 5.73 stable Ubuntu 18.04.2 LTS 4.19.20-rockchip). Based on prior advice I've added sudo ethtool -K eth0 tx off rx off to /etc/rc.local, as the rk3328 is buggy on tx offloading. Given the new images, I've tried 'Armbian_5.75_Tinkerboard_Ubuntu_bionic_next_4.19.20_desktop' with the same entry in /etc/rc.local, but as soon as I start multiple compiles via NFS the Tinkerboard freezes, just like the old image did prior to the ethtool modifications. Any idea what might be wrong? I otherwise like some of the new features from the updated image.
Check if your 'kernel.hung_task_panic' is set to 1. My fresh install had it set, and it was causing me issues with network-based storage.
sysctl kernel.hung_task_panic
If it is, set it to zero:
sysctl -w kernel.hung_task_panic=0
sysctl -p
John
Edited March 16, 2019 by jad3675
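A small sketch of the check suggested above, for anyone who wants to script it. `hung_task_panic_enabled` is a hypothetical helper name; it reads `/proc/sys/kernel/hung_task_panic`, which is the file behind the `kernel.hung_task_panic` sysctl key, and the optional path argument exists only so the logic can be tried without touching /proc.

```shell
#!/bin/sh
# hung_task_panic_enabled [FILE]: succeed (exit 0) if the setting reads 1.
# FILE defaults to the procfs entry for kernel.hung_task_panic.
hung_task_panic_enabled() {
    f="${1:-/proc/sys/kernel/hung_task_panic}"
    [ -r "$f" ] && [ "$(cat "$f")" = "1" ]
}

# Typical use on the board:
#   if hung_task_panic_enabled; then
#       sysctl -w kernel.hung_task_panic=0     # runtime only, needs root
#   fi
```

Note that `sysctl -w` only changes the running kernel, and `sysctl -p` reloads values from /etc/sysctl.conf. For the setting to survive a reboot, a line such as `kernel.hung_task_panic = 0` would need to be present in that file (or in a file under /etc/sysctl.d/).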
stefan.steffens Posted March 18, 2019
Thanks for the hint, but funnily, when I started from scratch with a new SD card I couldn't reproduce the crashes even under heavy NFS load on GigEth :-) And this while 'kernel.hung_task_panic' was set to 1.
Though I'm sure I haven't changed anything compared to the first setup (I'm using a step-by-step cookbook to make sure I don't forget anything), I'm now a happy user again, as it's working.
Arglebargle Posted March 25, 2019
What's memory performance like on mainline vs 4.4? I've been catching myself up on the state of mainline Rockchip support this morning and stumbled across this: https://github.com/rockchip-linux/u-boot/issues/34. I don't have an unused board to boot and test myself, but if DDR clocks are indeed being set low by u-boot and RAM performance is no bueno, it might be worth a known-issue mention when/if mainline is accepted for support.
JMCC Posted March 25, 2019
20 minutes ago, Arglebargle said: What's memory performance like on mainline vs 4.4? I've been catching myself up on the state of mainline Rockchip support this morning and stumbled across this: https://github.com/rockchip-linux/u-boot/issues/34. I don't have an unused board to boot and test myself, but if DDR clocks are indeed being set low by u-boot and RAM performance is no bueno, it might be worth a known-issue mention when/if mainline is accepted for support.
RAM DVFS is broken right now in 4.4 on the Renegade: it stays at a fixed frequency of 1024 MHz, with the possibility to increase it to ~1050 through sysfs (which is close enough to the theoretical maximum of 1066). The Firefly people themselves lowered the frequency via device tree to increase stability, at the same time as they lowered the SD card speed. In the mainline kernel, the SD card frequency is not capped, but there is no info in the device tree about RAM speed. I can try to run some tests and post the results here when done.
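To see what frequency the memory controller is actually running at, one option is to read the devfreq entries in sysfs. This is a sketch under assumptions: the thread does not say which sysfs path is involved, so the code assumes the DDR controller is exposed through the kernel's devfreq class (`/sys/class/devfreq/<device>/cur_freq`); `list_devfreq` is a hypothetical helper name, and the device name on a given board may differ.

```shell
#!/bin/sh
# list_devfreq [BASE]: print the current frequency of every devfreq device.
# BASE defaults to /sys/class/devfreq; it is a parameter only so the function
# can be tried against a scratch directory.
list_devfreq() {
    base="${1:-/sys/class/devfreq}"
    for d in "$base"/*; do
        # Skip entries without a readable cur_freq (also covers an empty glob).
        [ -r "$d/cur_freq" ] || continue
        printf '%s: %s Hz\n' "$(basename "$d")" "$(cat "$d/cur_freq")"
    done
}

# Typical use on the board:
#   list_devfreq
```

If nothing is printed, the kernel in use does not expose any devfreq device, which would itself be a useful data point when comparing 4.4 with mainline.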
TonyMac32 Posted March 28, 2019
Cleaned up the default kernel patches, clock frequencies for additional video modes; also, /proc/board_info exists again (it can tell whether the board is a Tinker or an S, and might be used by some ASUS software).
Also, if you build a dev image you have a working kernel 5.1
jpegxguy Posted April 8, 2019
Status update on the ethernet TX issue (iperf tests ending abruptly with only some kilobytes transferred, and the like)
The cause has been known for a while to be some error in the hardware offloading of TX checksumming. (TX is when the board sends bits out.) One way to avoid the issue is to disable TX checksum offloading (getting the CPU itself to handle that), e.g.:
- using ethtool
- introducing a flag in the dts that, among other things, does just that
- or by disabling it dynamically for certain devices with a new flag
But disabling that offloading seems to come with a performance hit. There's some discussion going on about a different approach, which is to tweak the PBL value for TX. https://lkml.org/lkml/2019/4/1/1382
Recently I've been using U-Boot's handy fdt commands to test stuff on my Renegade, introducing a certain change in txpbl and removing force_thresh_dma_mode if it is there:
fdt rm /ethernet@ff540000 snps,force_thresh_dma_mode
fdt set /ethernet@ff540000 snps,txpbl <0x21>
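After booting with the modified device tree, the result can be checked from Linux, since the live tree is exposed under /proc/device-tree. A minimal sketch, assuming the node appears there under the same path used in the fdt commands above; `dt_cell_hex` is a hypothetical helper name.

```shell
#!/bin/sh
# dt_cell_hex FILE: print the raw bytes of a device tree property as hex.
# Device tree cells are stored as big-endian 32-bit values, so a one-cell
# property set to 0x21 prints as 00000021.
dt_cell_hex() {
    od -An -tx1 "$1" | tr -d ' \n'
}

# Typical use on the board:
#   node=/proc/device-tree/ethernet@ff540000
#   [ -e "$node/snps,force_thresh_dma_mode" ] && echo "force_thresh_dma_mode still present"
#   dt_cell_hex "$node/snps,txpbl"
```

This only confirms what the kernel was handed at boot; whether the driver honours the value is a separate question.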
ketominer Posted May 1, 2019
Hi,
We are using hundreds of Rock64 boards for a project and we are facing a lot of reliability problems due to the kernel Armbian is currently using. After long discussions with Lukasz, Kamil, and Pine64 support, we concluded that some kernel fixes are necessary for reliable operation. These fixes are available, for example, in Ayufan's rc9 bionic distribution, and all the boards that fail with Armbian (the failures range from a kernel crash at boot to subtle miscalculations during some FPU operations) work perfectly well with his kernel.
We are ready to support Armbian by providing v2 and v3 Rock64 4GB boards, as well as monetary support, to get a kernel update as quickly as possible to "un-brick" the 50-60 boards we currently can't use (and to avoid having to switch to another distribution). Please let us know if you're interested and how we could arrange that.
Igor Posted May 1, 2019
13 minutes ago, ketominer said: Hi, We are using hundreds of Rock64 boards for a project and we are facing a lot of reliability problems due to the kernel Armbian is currently using. After long discussions with Lukasz, Kamil, and Pine64 support, we concluded that some kernel fixes are necessary for reliable operation. These fixes are available, for example, in Ayufan's rc9 bionic distribution, and all the boards that fail with Armbian (the failures range from a kernel crash at boot to subtle miscalculations during some FPU operations) work perfectly well with his kernel. We are ready to support Armbian by providing v2 and v3 Rock64 4GB boards, as well as monetary support, to get a kernel update as quickly as possible to "un-brick" the 50-60 boards we currently can't use (and to avoid having to switch to another distribution). Please let us know if you're interested and how we could arrange that.
One very quick fix would be switching to the beta repository:
- apt update and upgrade
- armbian-config -> switch to nightly beta builds
The kernel in the beta repository should be close or identical to Ayufan's latest build ... but it's not tested. It's on you to test it. After you switch to beta, and if things work as expected, make a freeze (again in armbian-config -> system) and wait until things are properly tested by us and show up in the stable repository. Then switch back to stable builds.
Regarding board samples, we have to discuss who is willing to join this debug party. I only have a Rock64; IIRC it's a v1 with 4G. That will take some time to arrange due to the ongoing holidays. I am briefly back in the office tomorrow and can check whether I can do anything in this matter, in case the switch to beta doesn't help.
For monetary support, please kindly use the donate page, where it's possible to choose between an anonymous PayPal donation and an invoice issued/billed to the company.
In case you want a combination with a publicly known amount and donor (forum) name, use https://forum.armbian.com/subscriptions/
ketominer Posted May 1, 2019
39 minutes ago, Igor said: One very quick fix would be switching to the beta repository: - apt update and upgrade - armbian-config -> switch to nightly beta builds The kernel in the beta repository should be close or identical to Ayufan's latest build ... but it's not tested. It's on you to test it. After you switch to beta, and if things work as expected, make a freeze (again in armbian-config -> system) and wait until things are properly tested by us and show up in the stable repository. Then switch back to stable builds. Regarding board samples, we have to discuss who is willing to join this debug party. I only have a Rock64; IIRC it's a v1 with 4G. That will take some time to arrange due to the ongoing holidays. I am briefly back in the office tomorrow and can check whether I can do anything in this matter, in case the switch to beta doesn't help. For monetary support, please kindly use the donate page, where it's possible to choose between an anonymous PayPal donation and an invoice issued/billed to the company. In case you want a combination with a publicly known amount and donor (forum) name, use https://forum.armbian.com/subscriptions/
Thank you for your answer, I will test the beta very quickly and let you know how it goes. I will have to figure out how to safely push that to our users without breaking everything.
Update: after switching to nightly, I get a kernel panic as soon as I run my test program (a simple argon2i hashing test with stress-ng running in the background; on some boards the problem tends to be easier to trigger under load).
(Also, subscription done, hopefully moving to the biggest one when we can.) Let me know if you want the full test protocol along with some boards.
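The test program itself is not shown in the thread. As a rough illustration of the idea (repeat a deterministic computation and flag any result that differs), here is a minimal sketch. It is not ketominer's protocol: `consistency_check` is a hypothetical helper, and `sha256sum` stands in for the argon2i hash only because it is available on most systems.

```shell
#!/bin/sh
# consistency_check N CMD [ARGS...]: feed the same fixed input to CMD N times
# and count how many outputs differ from a reference run. On healthy hardware
# a deterministic hash should never differ, so any mismatch is suspicious.
consistency_check() {
    n="$1"; shift
    ref=$(printf 'armbian-test-input' | "$@")   # reference result
    bad=0
    i=0
    while [ "$i" -lt "$n" ]; do
        out=$(printf 'armbian-test-input' | "$@")
        [ "$out" = "$ref" ] || bad=$((bad + 1))
        i=$((i + 1))
    done
    echo "mismatches=$bad"
}

# Typical use on the board, with background load (assumes stress-ng is installed):
#   stress-ng --cpu 4 --timeout 60s &
#   consistency_check 1000 sha256sum
```

A kernel panic would of course end the run before any count is printed, so the count is mainly useful for the "subtle miscalculation" case described earlier.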
Igor Posted May 1, 2019
1 minute ago, ketominer said: Thank you for your answer, I will test the beta very quickly and let you know how it goes. I will have to figure out how to safely push that to our users without breaking everything.
When you confirm the beta kernel is working well for you, I can update the main repository, and all they will need to do is:
apt update && apt upgrade
JMCC Posted May 1, 2019
It's probably a matter of updating Armbian's kernel config to reflect ayufan's changes. I cannot test with Rock64 since I don't have one, just a Renegade.
ketominer Posted May 1, 2019
32 minutes ago, JMCC said: It's probably a matter of updating Armbian's kernel config to reflect ayufan's changes. I cannot test with Rock64 since I don't have one, just a Renegade.
Probably. I've even got a kernel package from him that "should" work on Armbian, but I've broken everything trying to install it...
ketominer Posted May 1, 2019
1 hour ago, Igor said: When you confirm the beta kernel is working well for you, I can update the main repository, and all they will need to do is: apt update && apt upgrade
No luck so far. I've updated my previous post (maybe not a good thing to do).
Igor Posted May 1, 2019
4 minutes ago, ketominer said: No luck so far. I've updated my previous post (maybe not a good thing to do).
OK. Please provide a link to the working kernel and as much technical information as possible: logs (armbianmonitor -u) with a bad kernel and at least dmesg from a good one.
ketominer Posted May 1, 2019
3 hours ago, Igor said: OK. Please provide a link to the working kernel and as much technical information as possible: logs (armbianmonitor -u) with a bad kernel and at least dmesg from a good one.
Info from the "bad" kernel (freshly updated Armbian bionic, but it has been the same for 6 months): https://transfer.nodl.it/fJ3wP/armbianmonitor.bad.txt (I used my own transfer site because the one used by default by armbianmonitor -u doesn't show any URL after upload)
Info from the "good" distro (the one that works on the "bad" boards): http://ix.io/1HKS (somehow the upload worked on this one)
The distribution used is bionic-minimal-rock64-0.8.0rc9-1120-arm64.img.xz from https://github.com/ayufan-rock64/linux-build/releases
The kernel is
linux-headers-4.4.167-1169-rockchip-ayufan-g3cde5c624c9c:arm64 install
linux-image-4.4.167-1169-rockchip-ayufan-g3cde5c624c9c:arm64 install
u-boot-rockchip-rock64-2017.09-rockchip-ayufan-1045-g9922d32c04 install
and the board package is
board-package-rock64-0.8-126 install
This leads to the correct dependencies: https://github.com/ayufan-rock64/linux-build/releases/download/0.8.0rc9/linux-rock64-0.8.0rc9_arm64.deb
Please note that rc10 of ayufan's distro breaks again what rc9 was fixing...
Hope that helps
Igor Posted May 1, 2019
1 hour ago, ketominer said: Hope that helps
A little. It seems quite a mess. Try the kernel below and see if that makes any difference for you .... I am seeing this weird thingy in the syslogs:
May 1 18:38:25 localhost kernel: [ 1416.059127] zram: Decompression failed! err=-22, page=40
linux-image-rockchip64_5.83_arm64.deb
linux-dtb-rockchip64_5.83_arm64.deb
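When comparing boards or kernels, it may be handy to count how often that zram message shows up in a log. A minimal sketch: `count_zram_errors` is a hypothetical helper, and the default path /var/log/syslog is an assumption about where the kernel messages end up on the system being checked.

```shell
#!/bin/sh
# count_zram_errors [LOGFILE]: print the number of lines in LOGFILE that
# contain the zram decompression failure message quoted above.
count_zram_errors() {
    grep -c 'zram: Decompression failed' "${1:-/var/log/syslog}"
}

# Typical use on the board:
#   count_zram_errors
#   dmesg > /tmp/dmesg.txt && count_zram_errors /tmp/dmesg.txt
```

A non-zero count on the "bad" boards and zero on the "good" ones would support the idea that memory corruption, rather than a specific driver, is involved.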
Igor Posted May 1, 2019
Another idea for a test: start from the stable repository or a new image and use the oldest kernel we have in the repository -> armbian-config -> system -> alternative kernels.
ketominer Posted May 1, 2019
2 hours ago, Igor said: A little. It seems quite a mess. Try the kernel below and see if that makes any difference for you .... I am seeing this weird thingy in the syslogs: May 1 18:38:25 localhost kernel: [ 1416.059127] zram: Decompression failed! err=-22, page=40 linux-image-rockchip64_5.83_arm64.deb linux-dtb-rockchip64_5.83_arm64.deb
No luck. I'm starting to wonder whether this is all about something someone mentioned earlier, namely that it's the u-boot settings slowing down the RAM that fix the problem...
https://transfer.nodl.it/4bgnD/armbianmonitor.txt