Petee Posted April 21, 2019

Thank you WindySea. Yes, I just put the hardware sync in once on boot. Will disable it. I noticed no time jumps, but after ~48 hours the hardware clock is drifting out of sync.

I started this little mini automation computer stuff here on RPis with RTCs (used the PiFace RTC shim), then on micro OpenWRT routers (made for tinkering, with easy-to-get-to GPIO pins), then got a Pine64 (pro bono to tinker with). I waited for over a year for the Rock64 to implement an RTC (it was promised), and I was told it would be in the next production batch... it never came to be. I switched to maybe getting a RockPi4 (still thinking about this one), and meanwhile found out about Armbian and drifted over to getting a TVBox and running Armbian on it. I got into the NTP stuff in the 1990s, tinkering in my home sandbox.

So then the nightly upgrade has created a timing issue that you are looking to resolve. Understood.

Not sure if you are tinkering with any of the Armbian-based TV boxes. I am looking to add a battery-backed RTC to a new automation server based on the S912 octa-core ARM TV box. I have yet to take one apart. Are there schematics out there in internetlandia for any of these devices?
Petee Posted April 23, 2019

Been running now around 72 hours with no time or network issues from what I can tell here. Updated this morning to current build.

|  _ \(_)_ __   ___ / /_ | || |
| |_) | | '_ \ / _ \ '_ \| || |_
|  __/| | | | |  __/ (_) |__   _|
|_|   |_|_| |_|\___|\___/   |_|

Welcome to ARMBIAN 5.79.190420 nightly Ubuntu 18.04.2 LTS 4.19.36-sunxi64
System load: 0.20 0.30 0.27    Up time: 10:52 hours
Memory usage: 32 % of 2001MB   IP: 192.168.244.149
CPU temp: 45°C
Usage of /: 17% of 29G
[ General system configuration (beta): armbian-config ]
Last login: Tue Apr 23 04:19:50 2019 from 192.168.244.231

Removed hwclock sync on reboot.

root@ICS-Pine64:~# date
Tue Apr 23 15:11:36 CDT 2019
root@ICS-Pine64:~# hwclock
2019-04-23 15:11:39.210334-0500
root@ICS-Pine64:~# ./test_timer
TAP version 13
# number of cores: 4
ok 1 same timer frequency on all cores
# timer frequency is 24000000 Hz (24 MHz)
# time1: dc584c0bf7, time2: dc584c0b00, diff: -247
# time1: dc5f5f93f7, time2: dc5f5f9300, diff: -247
not ok 2 native counter reads are monotonic
# 2 errors
# min: -247, avg: 8, max: 16137
# diffs: -10084
not ok 3 Linux counter reads are monotonic
# 1 errors
# min: -10084, avg: 552, max: 116709
# core 0: counter value: 946811128468 => 39450 sec
# core 0: offsets: back-to-back: 12, b-t-b synced: 8, b-t-b w/ delay: 11
# core 1: counter value: 946811131161 => 39450 sec
# core 1: offsets: back-to-back: 15, b-t-b synced: 8, b-t-b w/ delay: 10
# core 2: counter value: 946811134613 => 39450 sec
# core 2: offsets: back-to-back: 14, b-t-b synced: 8, b-t-b w/ delay: 10
# core 3: counter value: 946811136586 => 39450 sec
# core 3: offsets: back-to-back: 10, b-t-b synced: 8, b-t-b w/ delay: 10
1..3
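For anyone reading the test_timer output above: the `time1`/`time2` values appear to be two consecutive raw counter reads in hexadecimal, and the `=> sec` figures are the counter value divided by the reported 24 MHz frequency. Here is a minimal Python sketch of that arithmetic using values copied from the output. The helper names `read_diff` and `counter_seconds` are hypothetical and are not part of test_timer.

```python
TIMER_HZ = 24_000_000  # "timer frequency is 24000000 Hz" in the output above


def read_diff(time1_hex: str, time2_hex: str) -> int:
    """Difference between two consecutive counter reads given as hex strings."""
    return int(time2_hex, 16) - int(time1_hex, 16)


def counter_seconds(counter_value: int, hz: int = TIMER_HZ) -> int:
    """Whole seconds represented by a raw counter value."""
    return counter_value // hz


# A pair reported under test 2: the second read is smaller than the first,
# so the counter appeared to run backwards.
print(read_diff("dc584c0bf7", "dc584c0b00"))

# Core 0 counter value from the same run, converted to seconds of uptime.
print(counter_seconds(946811128468))
```

A negative result from `read_diff` is exactly what the "not ok 2 native counter reads are monotonic" line is complaining about.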
Petee Posted April 24, 2019

Just having a look at another Pine64 this morning which I configured for an automation peer in ATL a couple of months ago. I see the time issue here and am updating.

|  _ \(_)_ __   ___ / /_ | || |
| |_) | | '_ \ / _ \ '_ \| || |_
|  __/| | | | |  __/ (_) |__   _|
|_|   |_|_| |_|\___|\___/   |_|

Welcome to ARMBIAN 5.75 stable Ubuntu 18.04.2 LTS 4.19.20-sunxi64
System load: 0.46 0.27 0.26    Up time: 24855 days
Memory usage: 42 % of 2001MB   IP: 192.168.1.85
CPU temp: 46°C
Usage of /: 28% of 29G
[ 0 security updates available, 187 updates total: apt upgrade ]
Last check: 2114-06-16 06:29
[ General system configuration (beta): armbian-config ]

root@ATL-Pine64:~# date
Sat Jun 16 07:16:14 EST 2114
root@ATL-Pine64:~# hwclock
2019-04-24 07:04:02.984370-0400
root@ATL-Pine64:~#

I couldn't change the date, so I am doing the nightly upgrade and hopefully a reboot will fix the date??

Nightly upgrade fixed the date stuff. (Note: doing this remotely.)

root@ATL-Pine64:~# date
Wed Apr 24 07:30:34 EDT 2019
root@ATL-Pine64:~# hwclock
2019-04-24 07:31:09.806147-0400
root@ATL-Pine64:~# uname -a
Linux ATL-Pine64 4.19.20-sunxi64 #5.75 SMP Fri Feb 8 10:29:25 CET 2019 aarch64 aarch64 aarch64 GNU/Linux
root@ATL-Pine64:~# ./test_timer
TAP version 13
# number of cores: 4
ok 1 same timer frequency on all cores
# timer frequency is 24000000 Hz (24 MHz)
# time1: 8948417f7, time2: 894841400, diff: -1015
# time1: 89484d7f7, time2: 89484d400, diff: -1015
# time1: 8948f87f7, time2: 8948f8400, diff: -1015
# time1: 89491f7f7, time2: 89491f400, diff: -1015
# time1: 894a457f7, time2: 894a45400, diff: -1015
# time1: 894bb07f7, time2: 894bb0400, diff: -1015
# time1: 894cbe7f7, time2: 894cbe400, diff: -1015
# time1: 894ce27f7, time2: 894ce2400, diff: -1015
# time1: 894d397f7, time2: 894d39400, diff: -1015
# time1: 894d8a7f7, time2: 894d8a400, diff: -1015
# time1: 894ead7f7, time2: 894ead400, diff: -1015
# time1: 894f917f7, time2: 894f91400, diff: -1015
# time1: 89511a7f7, time2: 89511a400, diff: -1015
# time1: 8951297f7, time2: 895129400, diff: -1015
# time1: 8951717f7, time2: 895171400, diff: -1015
# time1: 8952be7f7, time2: 8952be400, diff: -1015
# too many errors, stopping reports
not ok 2 native counter reads are monotonic
# 293 errors
# min: -14695416, avg: 6, max: 15369
# diffs: -42000, -41958, -42000, -42000, -42000, -42042, -42042, -42042, -42042, -42041, -42041, -42000, -42000, -42041, -42041, -42041
# too many errors, stopping reports
not ok 3 Linux counter reads are monotonic
# 141 errors
# min: -42042, avg: 612, max: 78125
# core 0: counter value: 37320126012 => 1555 sec
# core 0: offsets: back-to-back: 10, b-t-b synced: 9, b-t-b w/ delay: 12
# core 1: counter value: 37320127893 => 1555 sec
# core 1: offsets: back-to-back: 15, b-t-b synced: 8, b-t-b w/ delay: 11
# core 2: counter value: 37320129416 => 1555 sec
# core 2: offsets: back-to-back: 14, b-t-b synced: 8, b-t-b w/ delay: 11
# core 3: counter value: 37320131492 => 1555 sec
# core 3: offsets: back-to-back: 9, b-t-b synced: 9, b-t-b w/ delay: 10
1..3
root@ATL-Pine64:~#
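The thread does not say why the date lands in 2114 specifically. One plausible reading, offered here as an assumption rather than anything confirmed in these posts, is that a backwards counter read gets treated as a wrap of a 56-bit counter, which at 24 MHz is a jump of roughly 95 years. The "24855 days" uptime in the banner likewise looks like 2^31 seconds. A short Python sketch of the arithmetic (the 56-bit width and the function names are assumptions for illustration):

```python
TIMER_HZ = 24_000_000      # frequency reported by test_timer
COUNTER_BITS = 56          # assumption: effective counter width, not stated in the thread
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365.25


def wrap_jump_years(bits: int = COUNTER_BITS, hz: int = TIMER_HZ) -> float:
    """Years added if the clock is advanced by one full counter period."""
    return (2 ** bits) / hz / SECONDS_PER_DAY / DAYS_PER_YEAR


def seconds_to_days(seconds: int) -> int:
    """Whole days in a number of seconds."""
    return seconds // SECONDS_PER_DAY


# Roughly 95 years, and 2019 + 95 = 2114.
print(round(wrap_jump_years(), 1))

# 2**31 seconds expressed in days, for comparison with the "Up time" line.
print(seconds_to_days(2 ** 31))
```

The observed offsets in the thread are within about a day of this figure, so the match is approximate, not exact.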
sundbry Posted April 30, 2019

Unfortunately, the current mainline fix seems to be unreliable. I am running a fleet of these things (with a kernel built from source). The UNKNOWN1 patch is enabled and loaded, but this is the second time I have seen this happen on this kernel. It happens on different devices too - the second time was on a different device than the first time I saw it. I don't know if the patch was just a statistical cross-your-fingers patch. I'm going to review it now, but that's what it seems like.

root@host-131:~# uname -a
Linux host-131 5.1.0-rc2-sunxi64 #5.77.66 SMP Sun Apr 28 16:02:54 UTC 2019 aarch64 aarch64 aarch64 GNU/Linux
root@host-131:~# date
Fri Jun 22 19:45:31 UTC 2114
root@host-131:~# zcat /proc/config.gz | grep ERRAT
CONFIG_ARM64_ERRATUM_826319=y
CONFIG_ARM64_ERRATUM_827319=y
CONFIG_ARM64_ERRATUM_824069=y
CONFIG_ARM64_ERRATUM_819472=y
# CONFIG_ARM64_ERRATUM_832075 is not set
CONFIG_ARM64_ERRATUM_845719=y
CONFIG_ARM64_ERRATUM_843419=y
CONFIG_ARM64_ERRATUM_1024718=y
CONFIG_ARM64_ERRATUM_1188873=y
CONFIG_ARM64_ERRATUM_1165522=y
CONFIG_ARM64_ERRATUM_1286807=y
# CONFIG_CAVIUM_ERRATUM_22375 is not set
CONFIG_CAVIUM_ERRATUM_23144=y
# CONFIG_CAVIUM_ERRATUM_23154 is not set
# CONFIG_CAVIUM_ERRATUM_27456 is not set
# CONFIG_CAVIUM_ERRATUM_30115 is not set
# CONFIG_QCOM_FALKOR_ERRATUM_1003 is not set
# CONFIG_QCOM_FALKOR_ERRATUM_1009 is not set
# CONFIG_QCOM_QDF2400_ERRATUM_0065 is not set
CONFIG_HISILICON_ERRATUM_161600802=y
CONFIG_QCOM_FALKOR_ERRATUM_E1041=y
CONFIG_FUJITSU_ERRATUM_010001=y
CONFIG_FSL_ERRATUM_A008585=y
# CONFIG_HISILICON_ERRATUM_161010101 is not set
# CONFIG_ARM64_ERRATUM_858921 is not set
CONFIG_SUN50I_ERRATUM_UNKNOWN1=y
root@host-131:~# cat /proc/device
device-tree/ devices
root@host-131:~# cat /proc/device
device-tree/ devices
root@host-131:~# cat /proc/device-tree/timer/
allwinner,erratum-unknown1 compatible interrupts name
root@host-131:~# cat /proc/device-tree/timer/allwinner,erratum-unknown1
root@host-131:~# dmesg | grep errat
[ 0.000000] CPU features: detected: ARM erratum 845719
[ 0.000000] CPU features: detected: ARM erratum 843419
[ 0.000000] arch_timer: Enabling global workaround for Allwinner erratum UNKNOWN1
sundbry Posted April 30, 2019

@Petee you can get all the board schematic PDFs (and the CPU manual) at http://wiki.pine64.org/index.php/PINE_A64_Main_Page
sundbry Posted April 30, 2019

test_timer results

Host with clock already drifted to 2114:

root@host-131:~# ./test_timer
TAP version 13
# number of cores: 4
ok 1 same timer frequency on all cores
# timer frequency is 24000000 Hz (24 MHz)
ok 2 native counter reads are monotonic
# 0 errors
# min: 6, avg: 6, max: 3795
ok 3 Linux counter reads are monotonic
# 0 errors
# min: 458, avg: 504, max: 39000
# core 0: counter value: 2610635813617 => 108776 sec
# core 0: offsets: back-to-back: 7, b-t-b synced: 10, b-t-b w/ delay: 9
# core 1: counter value: 2610635816527 => 108776 sec
# core 1: offsets: back-to-back: 9, b-t-b synced: 7, b-t-b w/ delay: 9
# core 2: counter value: 2610635818363 => 108776 sec
# core 2: offsets: back-to-back: 9, b-t-b synced: 7, b-t-b w/ delay: 8
# core 3: counter value: 2610635821240 => 108776 sec
# core 3: offsets: back-to-back: 8, b-t-b synced: 7, b-t-b w/ delay: 8
1..3
root@host-131:~# date
Fri Jun 22 19:59:10 UTC 2114

Host with clock still at the present time:

root@host-125:~# ./test_timer
TAP version 13
# number of cores: 4
ok 1 same timer frequency on all cores
# timer frequency is 24000000 Hz (24 MHz)
ok 2 native counter reads are monotonic
# 0 errors
# min: 6, avg: 6, max: 174096
ok 3 Linux counter reads are monotonic
# 0 errors
# min: 458, avg: 500, max: 25958
# core 0: counter value: 2612467490982 => 108852 sec
# core 0: offsets: back-to-back: 8, b-t-b synced: 11, b-t-b w/ delay: 9
# core 1: counter value: 2612467492975 => 108852 sec
# core 1: offsets: back-to-back: 8, b-t-b synced: 6, b-t-b w/ delay: 8
# core 2: counter value: 2612467494726 => 108852 sec
# core 2: offsets: back-to-back: 8, b-t-b synced: 6, b-t-b w/ delay: 8
# core 3: counter value: 2612467497033 => 108852 sec
# core 3: offsets: back-to-back: 8, b-t-b synced: 6, b-t-b w/ delay: 8
1..3
root@host-125:~# date
Tue Apr 30 23:27:54 UTC 2019
Petee Posted May 1, 2019

Thank you sundbry.

The ATL (Atlanta, Georgia) Pine64 crashed and burned up (literally) after the installed heat sinks slid off yesterday. I ordered a new TVBox / SD card today to replace the dead Pine64; it will be there tomorrow and configured by end of business day tomorrow. I shut the local Pine64 off (near Chicago here) and switched over to using a TVBox for the mini automation server. Currently moving all of the automation software from the ATL Pine64 to another TVBox there in Atlanta.

That said, and relating to the Pine64: I had asked about a promised update to the Rock64 that never materialized for a year, so I gave up on them and decided to maybe try the RockPi4, with more features and the same size as the Rock64.

Posted on the Rock64 forum - yeah, I was nagging a bit here...

ROCK64 RTC

December 3, 2017
Question: Will the RTC be available in the December 12, 2017 Rock64 release?
Answer: RTC is working, just lacking battery backup. The battery backup circuit not available on December production, we plan to merge in on January production but not sure will happen. This due to there is super long lead time on PCB manufacturing.

January 30, 2018
Question: Wondering if the RTC / battery stuff is available on current Rock64 hardware? If not can I have instructions for soldering in battery posts and utilize the Pine64 battery holder?
Answer: Attached is the RTC mod circuit for ROCK64. However, ROCK64 don't support battery charging and no such circuit available.

February 9, 2018
Question: Thank you tlimm. So the hardware RTC with battery changes have been abandoned eh? Basically then all I need to do is solder on a battery to the Rock64 board. I have done similar here with my tabletop touchscreens.
Answer: The ROCK64 board with RTC battery connector is not abandon. Check out the FOSDEM PINE64 table stand photo and you may able to spot this board. This is just a minor change and we will revise board revision during production run sometimes on Q2 2018 timeframe.

May 2, 2018
Question: Curious if the new Rock64 hardware release has the built in RTC battery connections?
Answer: This release already on the plan and originally plan release to production on this month. However, due to focus on ROCKPro64 activity, and has push to June/July.

June 11, 2018
Posting another couple of requests today.
1 - when will the updated Rock64 with RTC and Battery will be available?
2 - will the Rock64Pro ever have a battery RTC option? (it would be a great little firewall)

August 6, 2018
tllim wrote: already factor into new ROCK64 revision including PoE option and micro SD UHS design. Needs to test all new function first before release and this takes couple months.

02-08-2019, 07:17 AM
I do not see any update to using a battery for RTC on newest Rock64 which was going to be utilized as an automation server combo firewall.

I have not used the Internet for time in over 15 years now and always used an NTP server with GPS / PPS. Over the years I transitioned to using the PFSense box with a serial / PPS connection to a GPS, which provides great time for me. I am very particular about automation and time.

We are at the two year mark here. Time to give up and move on.

The best thing that happened for the Pine64 was Armbian. That said, I have moved on now from the Pine64 / Rock64 stuff. I would like to add an RTC / battery to the TV Box. (icing on the cake)
windysea Posted May 1, 2019

2 hours ago, sundbry said: Unfortunately, the current mainline fix seems to be unreliable. I am running a fleet of these things (with a kernel built from source). The UNKNOWN1 patch is enabled and loaded, but this is the second time I have seen this happen on this kernel. It happens on different devices too - the second time was on a different device than the first time I saw it. I don't know if the patch was just a statistical cross-your-fingers patch. I'm going to review it now, but that's what it seems like.

The mainline patch is almost identical to the previous patch used in alternate forks, including those used by Armbian. The mainline patch has only one change (other than the Kconfig symbol and matching macro): one less low-order bit is masked during the test for the condition under which the problem has been seen to occur, which actually makes the mainline patch provide more "coverage" than the previous one.

The previous patch that had been applied to the 4.19.y and 5.0.y kernels shows similar behavior but was indeed working with 4.14.y and earlier, so there may be something else that has changed. I have not had the opportunity to continue looking into this just yet.

It is also possible that the original fix and the subsequent mainline fix were not complete, which is actually noted in the comments for the various commits. Disabling a tickless kernel, for instance, seems to exacerbate the issue significantly, though on the surface this would not be expected.
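To make the "masked low-order bits" test concrete, here is a small Python simulation of the idea: treat a read as suspect when its low bits are all zeros or all ones, and re-read. This is a sketch for illustration, not the kernel code. The 10-bit default and the retry bound of 150 are assumed values, and `low_bits_suspect` and `read_counter` are hypothetical names. The sample values are taken from test_timer outputs posted earlier in this thread.

```python
def low_bits_suspect(value: int, nbits: int) -> bool:
    """True if the low `nbits` bits of `value` are all zeros or all ones.

    Adding 1 turns all-ones into all-zeros and all-zeros into 1, so both
    cases collapse into "masked result <= 1".
    """
    mask = (1 << nbits) - 1
    return ((value + 1) & mask) <= 1


def read_counter(read_raw, nbits: int = 10, max_retries: int = 150) -> int:
    """Re-read while the value looks like a mid-rollover glitch (bounded)."""
    value = read_raw()
    for _ in range(max_retries):
        if not low_bits_suspect(value, nbits):
            break
        value = read_raw()
    return value


# 0x894841400 was a bad second read in one of the outputs: low 10 bits are zero.
print(low_bits_suspect(0x894841400, 10))

# Masking fewer bits flags more values: this read is only caught by a narrower mask.
print(low_bits_suspect(0xDC584C0B00, 10), low_bits_suspect(0xDC584C0B00, 8))

# Simulated register: one glitchy value followed by a plausible one.
reads = iter([0x894841400, 0x894841805])
print(hex(read_counter(lambda: next(reads))))
```

The second example is why masking one less bit gives more "coverage": a narrower mask matches a larger share of all possible values, at the cost of more re-reads.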
sundbry Posted May 1, 2019

The heat sink adhesive melted off? Yikes @Petee

Thanks for the info @windysea. I wonder what the hell the ALLWINNER CORPORATION has to say about this. You would think that, knowing the internals of the CPU, they could provide some guidance. Why would I ever want to rely on another Allwinner CPU if they can't communicate? Maybe the Pine64 guys are the only ones with enough leverage to get them to speak up.
windysea Posted May 1, 2019

The fixes to date have all been community-provided, based on observations and testing (i.e., not based on anything from the SoC vendor). Allwinner has been silent on this specific issue, which actually isn't really surprising.

I did find a different fix from Allwinner for a different timer issue, which occurs when trying to write (rather than read) one of the affected timer registers. That fix is not part of the mainline (or Armbian) kernels and does not immediately appear to be directly related, but it is part of what I hope to be able to look into.
Petee Posted May 1, 2019

"The heat sink adhesive melted off?"

No. This was the second heatsink pack I purchased on Amazon. The first one didn't really have any adhesive on it. I thought the second pack of heat sinks had more adhesive. Best to purchase a sticky adhesive compound when installing these little heatsinks.
sundbry Posted May 1, 2019

@windysea as I was pondering this further yesterday, I realized I had never observed this issue in over a year of running on the 3.xx ayufan kernels. I think that what you said may have something to do with it. I wonder what other differences are hiding and implemented in that kernel?
gchain Posted May 14, 2019

On 3/13/2019 at 1:30 PM, martinayotte said: It has been fixed a while ago but having CONFIG_FSL_ERRATUM_A008585 into kernel def_config ... Which image/kernel are you using ?

Sorry for the delay. I'm using Armbian 5.75 Stable (Stretch)

$ uname -a
Linux pine64 4.20.7-sunxi64 #5.75 SMP Fri Feb 8 10:37:18 CET 2019 aarch64 GNU/Linux

I'll try with the nightly build on 5.x and check if it works.

Also, here's Andre's test_timer output:

# ./timer
TAP version 13
# number of cores: 4
ok 1 same timer frequency on all cores
# timer frequency is 24000000 Hz (24 MHz)
# time1: a33e3cbff, time2: a33e3c80b, diff: -1012
# time1: a33e533ff, time2: a33e5300b, diff: -1012
# time1: a33ea87f5, time2: a33ea8400, diff: -1013
# time1: a33eb4bff, time2: a33eb480b, diff: -1012
# time1: a33f6b7f4, time2: a33f6b400, diff: -1012
# time1: a341037f4, time2: a34103400, diff: -1012
# time1: a34233ff5, time2: a34230c00, diff: -13301
# time1: a342ccff4, time2: a342ccc00, diff: -1012
# time1: a342edff5, time2: a342ed3ff, diff: -3062
# time1: a3430d7f5, time2: a3430d400, diff: -1013
# time1: a3436d7f4, time2: a3436d400, diff: -1012
# time1: a343a83ff, time2: a343a8014, diff: -1003
# time1: a343c17f5, time2: a343c1400, diff: -1013
# time1: a34568bff, time2: a3456880b, diff: -1012
# time1: a345e2c00, time2: a345e200b, diff: -3061
# time1: a3465b7f5, time2: a3465b400, diff: -1013
# too many errors, stopping reports
not ok 2 native counter reads are monotonic
# 370 errors
# min: -15350, avg: 11, max: 15372
# diffs: -127209, -41834, -127167, -41834, -41792, -41834, -41834, -41792, -41833, -41834, -41792, -41834, -41792, -41833, -41834, -127250
# too many errors, stopping reports
not ok 3 Linux counter reads are monotonic
# 373 errors
# min: -174633417, avg: 807, max: 640875
# core 0: counter value: 44453336039 => 1852 sec
# core 0: offsets: back-to-back: 15, b-t-b synced: 12, b-t-b w/ delay: 20
# core 1: counter value: 44453431859 => 1852 sec
# core 1: offsets: back-to-back: 12, b-t-b synced: 11, b-t-b w/ delay: 15
# core 2: counter value: 44453624827 => 1852 sec
# core 2: offsets: back-to-back: 19, b-t-b synced: 12, b-t-b w/ delay: 20
# core 3: counter value: 44453630152 => 1852 sec
# core 3: offsets: back-to-back: 13, b-t-b synced: 12, b-t-b w/ delay: 19
1..3
sfx2000 Posted May 29, 2019

On 4/20/2019 at 12:13 PM, windysea said: Then came systemd and that caused breakage so the extra checks were put in. There are many discussions on this elsewhere so there's no need to discuss further here: The summary is if you are using systemd do not run 'hwclock -s'. Ever.

Conversely - if one is running ntpd, systemd-timesyncd can cause a lot of trouble here...

Above is a great post - and some things to add maybe...

GPS without PPS is an OK time source for NTP if nothing else is available via the network - but its precision only goes so far, and if the GPS is over USB, this adds more jitter...

NOHZ is perhaps one approach, esp. with a CPU that does DVFS, along with setting the scheduler appropriately to get some timing stability overall - notwithstanding the long jumps that the A64 seems to do, which is another issue, as NTP assumes that the system timing is somewhat stable.

Hint - if one is looking for a stable Stratum 1 NTP, check out Google Public NTP, as this is what's keeping their global cloud in sync, and there it pays to also use Google DNS for the resolver. This is likely good enough for most folks, and if one uses GPS via gpsd on the shm, that should be the preferred source, followed by time1/time2/time3/time4.google.com, and maybe consider the local clock; that should get to a stable NTP.

To get really stable local time - one is not going to get stratum 0 on an SBC, period, even with GPS and PPS... most of these cheap SBCs use an XO with 30 to 50 ppm accuracy (this is a 50 cent chip) - this is good enough for Ethernet and WiFi - I'm looking at a part that is 50 ppb, which would normally be an OCXO (spendy), but this part is a MEMS device, and so far the data I've seen looks pretty good over time.
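To put the ppm and ppb figures above in perspective, here is a quick Python sketch that converts a fractional frequency error into worst-case drift per day for a free-running clock (no NTP correction). The function name is just for illustration.

```python
SECONDS_PER_DAY = 86_400


def drift_seconds_per_day(error_fraction: float) -> float:
    """Worst-case free-running drift per day for a fractional frequency error."""
    return error_fraction * SECONDS_PER_DAY


# ppm = parts per million (1e-6), ppb = parts per billion (1e-9)
for label, fraction in [("30 ppm", 30e-6), ("50 ppm", 50e-6), ("50 ppb", 50e-9)]:
    print(f"{label}: {drift_seconds_per_day(fraction):.5f} s/day")
```

So a 50 ppm oscillator can wander by a few seconds per day on its own, while a 50 ppb part stays within a few milliseconds per day, a factor of 1000 between them.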
windysea Posted June 11, 2019

On 5/29/2019 at 12:37 AM, sfx2000 said: To get really stable local time - one is not going to get stratum 0 on an SBC, period, even with GPS and PPS... most of these cheap SBCs use an XO with 30 to 50 ppm accuracy (this is a 50 cent chip) - this is good enough for Ethernet and WiFi - I'm looking at a part that is 50 ppb, which would normally be an OCXO (spendy), but this part is a MEMS device, and so far the data I've seen looks pretty good over time.

An NTP stratum is not related to accuracy or precision - it is simply an indication of how many "hops" a given NTP server is from a reference clock. A stratum-0 is a reference clock (i.e., atomic clock, GPS receiver, etc.). A stratum-1 is the NTP server directly using that reference clock for time synchronization. An SBC with a serial GPS can indeed be a stratum-1 (the GPS would be stratum-0), and there are many public postings on doing this. In fact the NTPsec team is doing "research" on this topic and has published documentation regarding this.

The nature of the reference implementation of NTPd is specifically to maintain accurate time regardless of any hardware timers. Today's "50-cent" parts are still more stable than the orders-of-magnitude more expensive parts of decades ago, when NTPd was first developed.

Google's NTP project may use their own "atomic clocks", but their public NTP servers tend to be on the poor end with respect to jitter. They're intended to be "close enough", stable, and highly available. They are not intended to be highly accurate. Their public NTP servers, for instance, implement leap-smearing rather than advertise a leap second (when appropriate). For this reason Google strongly recommends not mixing their public NTP servers in a configuration with other NTP sources (bad things can happen, and in fact have happened in the past). Google's NTP servers are also behind anycast load-balancers.
While this improves availability and end-device configuration simplicity, it actually degrades performance. In my own testing, Google's NTP servers typically have higher jitter than most of the larger NTP pool project pools, the latter of which are already commonly used as defaults in many OS distributions.

Configuring and building a non-tickless kernel is required in order to enable kernel PPS (aka "hard PPS"), which typically has far less jitter than "soft PPS". However, doing so even with the latest 5.1.y (DEV) kernels results in an unstable platform where the issue noted in this thread will manifest fairly frequently.

It may just be that A64-based SBCs are not suitable to host NTP reference clocks and stratum-1 NTP servers, but earlier kernels did not seem to have this issue, so it may just be that a previous mitigation got lost along the way.
Petee Posted June 12, 2019

I just built a second PFSense box here and am testing out a new mini GPS (Arduino style) that has built-in RS-232 and a PPS signal. It is around $7 USD with free shipping on eBay. My primary GPS / PPS is a modded Sure GPS. It is a tiny box, and I am DIN mounting it.

I know this doesn't have much to do with the TVBox / Armbian; rather, it is just an NTP time source for my home network.
markolonius Posted July 30, 2019

I have this problem very, very often. I am running Ubuntu Bionic with Armbian Linux 4.19.57-sunxi64, which is pretty stable otherwise when the time is sync'd. I disabled systemd-timesyncd and the time/date seems to not go to year 2114 as much, but it still happens 3-4 times per week. I usually have to manually resync the time using sudo ntpdate 0.north-america.pool.ntp.org. Sometimes this gives "invalid argument" and requires me to reboot before I can run the command.

I'd love to help in any way I can to debug this issue. Not sure how! I should add that I didn't have this issue running Arch Linux with kernel 4.19, which is odd. I like Armbian much better, however, and want to stay with it. I also never had this issue on the ayufan kernels.
sundbry Posted July 30, 2019

@markolonius I haven't seen this issue in production for over a month on the 7 units I have deployed, after reverting to the method used in the "legacy" kernel branch. Here is the patch. The original patch which is in upstream Linux was a little "too clever", and although I believe it must have worked well for the person who submitted it, the errata in the hardware make it an unreliable fix. The previous technique of just reading the register in a loop until it gets a stable reading works much better.

https://github.com/arctype-co/linux/commit/5fcb4e57eeaa4d670ef4acf5818c6fe16aa0d3d0
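The "read in a loop until stable" idea can be sketched in a few lines of Python. This is a simulation of the general technique only, not the contents of the linked commit. The acceptance threshold `max_delta`, the retry bound, and the name `read_stable` are assumptions for illustration; the first two sample reads are a bad pair from a test_timer output earlier in the thread, and the last two are made up.

```python
def read_stable(read_raw, max_delta: int = 32, max_retries: int = 10) -> int:
    """Accept a read only when it follows the previous read by a small forward step."""
    previous = read_raw()
    for _ in range(max_retries):
        current = read_raw()
        # Two back-to-back reads should differ by a handful of ticks at most.
        if 0 <= current - previous <= max_delta:
            return current
        previous = current
    return previous  # gave up after max_retries; return the most recent read


# Simulated register: a glitch (7f7 -> 400), then the counter settles.
reads = iter([0x8948417F7, 0x894841400, 0x894841802, 0x894841808])
print(hex(read_stable(lambda: next(reads))))
```

Unlike a bit-pattern test, this does not depend on knowing which bit patterns are unsafe, at the cost of always needing at least two reads.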
markolonius Posted July 30, 2019

5 hours ago, sundbry said: @markolonius I haven't seen this issue in production for over a month on the 7 units I have deployed, after reverting to the method used in the "legacy" kernel branch. Here is the patch. The original patch which is in upstream Linux was a little "too clever", and although I believe it must have worked well for the person who submitted it, the errata in the hardware make it an unreliable fix. The previous technique of just reading the register in a loop until it gets a stable reading works much better. https://github.com/arctype-co/linux/commit/5fcb4e57eeaa4d670ef4acf5818c6fe16aa0d3d0

This sounds awesome. I haven't built a kernel for an ARM board yet, so this should be interesting. I'll give it a try, but it might take me a while due to time. How would I download this patch? Any plans/hopes of this making it into a nightly or stable release?
markolonius Posted August 10, 2019

@sundbry I attempted to apply the patch but could never get it to work. So I actually tried out the 5.2.x kernel and kept systemd-timesyncd disabled, which seems to work and has been stable for me for the past week or so. As soon as I enable systemd-timesyncd and start it, it switches the time to year 2114. Since I don't have much time at the moment to keep trying to build a patched kernel, I'll keep it this way. But if anyone has the same problem on the 5.2.x kernel with systemd-timesyncd and is working on trying to solve it, I'll be more than happy to supply whatever logs are requested to help.