windysea

  • Content Count

    42
  • Joined

  • Last visited

About windysea

  • Rank
    Advanced Member

Recent Profile Visitors

The recent visitors block is disabled and is not being shown to other users.

  1. I haven't had much time to follow-up on this. FWIW the upstream patches should have included an improved version of the patch above already. To clarify it is not a fix but rather a loop to read the hardware timer repeatedly until there are two consecutive reads that are the same. It does add a different amount of overhead than the fix it is replacing. For instance there must always be at least two reads. I am wondering about the above solution being replaced since it is only a nine-bit mask, but it was found that this should probably be 11 bits. I forget
  2. An ntp stratum is not related to accuracy nor precision - it is simply an indication of how many "hops" a given NTP server is from a reference clock. A stratum-0 is a reference clock (IE: atomic clock, GPS receiver, etc). A stratum-1 is the NTP server directly using that reference clock for time synchronization. An SBC with a serial GPS indeed can be a stratum-1 (the GPS would be stratum-0), and there are many public postings on doing this. In fact the NTPsec team is doing "research" on this topic and has published documentation regarding this. The nature of the reference imple
  3. Yes that fix should already be part of the existing kernels, however there are actually several timer-related issues on the A64 and it is not clear that they are all addressed even upstream. Even with all of the fixes that have been published I still see the issue recur occasionally. There is a thread in the Development section where this is (or was) being discussed.
  4. Thanks! I removed the patch from my local 'userpatches' and built a new -dev kernel using the default kernel config (no changes) and no local patches or changes. The kernel booted with expected network connectivity and all looks good. I'll verify tomorrow with the nightly builds as well.
  5. @martinayotte - I was researching this one since I experienced the same when updating from 5.0.y to 5.1.y. It looks like you had found similar and posted a query/comment to the upstream repo at https://github.com/megous/linux/commit/c797ad894874de33b2be4beeb1f5a1e6339a888d#comments I had found the same as you - the drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c driver was missing a change made specifically in megous' branch for 5.0 but not carried to 5.1. Did you intend to include this as a local patch or should I submit a pull request for a new patch to
  6. Beware that 'b' will literally trigger an immediate reboot without even attempting to sync the disks. That is possible, and even likely in some cases, to cause corruption. For the OP case this would not be an issue since it looks like the root filesystem is already trashed, though perhaps there may have been others? A sync may be triggered similarly, however, just before the reboot: root@host# echo s > /proc/sysirq-trigger root@host# echo b > /proc/sysirq-trigger The 's' can also be handy when /bin/sync is also missing, to help prevent corruption on additional
  7. The fixes to date have all been community-provided based on observations and testing (IE: not based on anything from the SoC vendor). Allwinner has been silent on this specific issue, which actually isn't really surprising I did find a different fix from Allwinner for a different timer issue, which occurs when trying to write (rather than read) one of the affected timer registers. That fix is not part of the mainline (or armbian) kernels but does not immediately appear to be directly related, but that is part of what I hope to be able to look into.
  8. The mainline patch is almost identical to the previous patch used in alternate forks, including those used by Armbian. The mainline patch has only one change (other than the Kconfig symbol and matching macro): one less low-order bit is masked during the test for the condition when the problem has been seen to occur, which actually makes the mainline patch provide more "coverage" than the previous. The previous patch that had been applied to 4.19.y and 5.0.y kernels has similar behavior but was indeed working with 4.14.y and earlier, so there may be something else that has changed
  9. You really shouldn't need to set a cron job to sync the system time back to the hardware clock since this is already a builtin kernel feature. It just appears that feature was disabled (broken) by something. When ntpd or systemd-timesyncd start they will set a flag in the kernel to enable this feature: ntpd will do so after it obtains sync but systemd-timesyncd does so almost immediately. Running 'hwclock -s' after this will in turn disable this feature again. After some time ntpd may discover this (if the kernel tells it) and try to reset the flag, but systemd-timesyncd will
  10. If your hardware clock is not being synced from your system clock then something is broken. Runing 'hwclock -s' after systemd-timesyncd or ntpd has started is one of those things that will cause such breakage but there can be other reasons. Having an RTC isn't much value if it won't be kept decently accurate. Most RTCs are intended to be periodically updated by the host OS as they utilize low-cost and not very-high-accuracy components and design. As for using a Pine64 as an NTP server - it actually works very well. I've had this pine64 doing exactly that with KPPS with 4.14.y a
  11. I should probably put in an update. . . . After running stable for over a week I tried to disable a tickless kernel (standard modern default) to reduce jitter: user@pine64:~$ egrep nohz /boot/armbianEnv.txt extraargs=nohz=off user@pine64:~$ cat /proc/cmdline root=UUID=c80325d7-1d25-4151-85c3-47343b473fae rootwait rootfstype=ext4 console=ttyS0,115200 console=tty1 panic=10 consoleblank=0 loglevel=7 ubootpart=0f940383-01 usb-storage.quirks=0x2537:0x1066:u,0x2537:0x1068:u nohz=off cgroup_enable=memory swapaccount=1 Sadly this once again brought about instability. Wit
  12. Just for completeness and closure PR #1329 implemented this for 'sunxi' and 'sunxi64' in both NEXT and DEV. Other platforms already have this feature configured. There should be no longer be any need for additional automated or otherwise scripted enhancements to do a separate 'hwclock -s' after boot as the kernel will do this itself very early, where it should be done when possible.
  13. Again - the RTC and the timer in question are two separate entities and are unrelated. The issue here is not with the RTC but with a completely different counter in the SoC. The mainline fix for the timer issue here was formally added to the mainline in the 5.1.y branch for the A64. It has been pulled back to the 5.0.y and 4.19.y branches in various parts between the upstream mainline fork by megous used by Armbian as well as Armbian patches applied on top of that branch. The nightly builds and the packaged kernel updates have had these for a short while, and to see if the worka
  14. Yes I do have an RTC backup battery attached, but that really only applies when power is physically removed from the board. With just a 'reboot' the RTC remains powered. This particular timer issue also is not related to the RTC - it is a problem with a hardware timer (really just a counter) that is part of the SoC itself. When the symptoms manifest, the kernel (aka "system") time will jump by a multiple of 95 years and after that the kernel is no longer able to sync back to the hardware RTC which is independent. I had also submitted PR #1329 following Kernel config:
  15. Interestingly I've never had a successful run with Andre's test_timer: This board too is a 2GB PineA64+ from the initial Kickstarter campaign. Using the latest DEV pre-built kernel via 'apt upgrade', with the UNKNOWN1 erratum active: I haven't seen any of the known symptoms from this issue (IE: date jump by multiple of 95 years) as long as the erratum workaround is active, but the test_timer has never passed fully. For now I've chosen to leave this aside since I'm not seeing any other symptoms.