
A64 date/time clock issue



Thank you WindySea.

 

Yes, I just run the hardware clock sync once on boot.  I will disable it.  I noticed no time jumps, but after ~48 hours the hardware clock drifts out of sync.

 

I started this little mini automation computer stuff here on RPis with RTCs (used the PiFace RTC shim), then on micro OpenWRT routers (made for tinkering, with easy-to-reach GPIO pins), and then got a Pine64 (pro bono, to tinker with).  I waited over a year for the Rock64 to implement an RTC (it was promised, and I was told it would be in the next production batch); it never came to be.  I switched to maybe getting a RockPi4 (still thinking about this one), and meanwhile found out about Armbian and drifted over to getting a TVBox and running Armbian on it.

 

I got into the NTP stuff in the 1990s, tinkering in my home sandbox.

 

So then the nightly upgrade has created a timing issue that you are looking to resolve.  Understood.

 

Not sure if you are tinkering with any of the Armbian based TV boxes. 

 

I am looking to add a battery-backed RTC to a new automation server based on the S912 octa-core ARM TV box.

 

I have yet to take one apart.  Are there schematics out there in internetlandia for any of these devices?

 

 

 


Been running now around 72 hours with no time or network issues from what I can tell here.  Updated this morning to current build.

|  _ \(_)_ __   ___ / /_ | || |  
| |_) | | '_ \ / _ \ '_ \| || |_
|  __/| | | | |  __/ (_) |__   _|
|_|   |_|_| |_|\___|\___/   |_|  
                                 

Welcome to ARMBIAN 5.79.190420 nightly Ubuntu 18.04.2 LTS 4.19.36-sunxi64   
System load:   0.20 0.30 0.27      Up time:       10:52 hours        
Memory usage:  32 % of 2001MB     IP:            192.168.244.149
CPU temp:      45°C               
Usage of /:    17% of 29G        

[ General system configuration (beta): armbian-config ]

Last login: Tue Apr 23 04:19:50 2019 from 192.168.244.231

Removed hwclock sync on reboot

 

root@ICS-Pine64:~# date
Tue Apr 23 15:11:36 CDT 2019
root@ICS-Pine64:~# hwclock
2019-04-23 15:11:39.210334-0500


 

root@ICS-Pine64:~# ./test_timer
TAP version 13
# number of cores: 4
ok 1 same timer frequency on all cores
# timer frequency is 24000000 Hz (24 MHz)
# time1: dc584c0bf7, time2: dc584c0b00, diff: -247
# time1: dc5f5f93f7, time2: dc5f5f9300, diff: -247
not ok 2 native counter reads are monotonic # 2 errors
# min: -247, avg: 8, max: 16137
# diffs: -10084
not ok 3 Linux counter reads are monotonic # 1 errors
# min: -10084, avg: 552, max: 116709
# core 0: counter value: 946811128468 => 39450 sec
# core 0: offsets: back-to-back: 12, b-t-b synced: 8, b-t-b w/ delay: 11
# core 1: counter value: 946811131161 => 39450 sec
# core 1: offsets: back-to-back: 15, b-t-b synced: 8, b-t-b w/ delay: 10
# core 2: counter value: 946811134613 => 39450 sec
# core 2: offsets: back-to-back: 14, b-t-b synced: 8, b-t-b w/ delay: 10
# core 3: counter value: 946811136586 => 39450 sec
# core 3: offsets: back-to-back: 10, b-t-b synced: 8, b-t-b w/ delay: 10
1..3

 

 


Just having a look at another Pine64 this morning, which I configured for an automation peer in ATL a couple of months ago.  I see the time issue here and am updating.

|  _ \(_)_ __   ___ / /_ | || |  
| |_) | | '_ \ / _ \ '_ \| || |_
|  __/| | | | |  __/ (_) |__   _|
|_|   |_|_| |_|\___|\___/   |_|  
                                 
Welcome to ARMBIAN 5.75 stable Ubuntu 18.04.2 LTS 4.19.20-sunxi64   
System load:   0.46 0.27 0.26      Up time:       24855 days        
Memory usage:  42 % of 2001MB     IP:            192.168.1.85
CPU temp:      46°C               
Usage of /:    28% of 29G        

[ 0 security updates available, 187 updates total: apt upgrade ]
Last check: 2114-06-16 06:29

[ General system configuration (beta): armbian-config ]


root@ATL-Pine64:~# date
Sat Jun 16 07:16:14 EST 2114
root@ATL-Pine64:~# hwclock
2019-04-24 07:04:02.984370-0400
root@ATL-Pine64:~#
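
For context on the year 2114: the test_timer runs in this thread report a 24 MHz timer. Assuming the architectural counter is effectively 56 bits wide (an assumption, not something shown in this thread), a counter read that appears to go slightly backward could be interpreted as a full wrap, which would move the clock forward by 2^56 ticks. The "24855 days" uptime also happens to match 2^31 seconds. A minimal Python sketch of that arithmetic:

```python
from datetime import datetime, timedelta

TIMER_HZ = 24_000_000   # from test_timer: "timer frequency is 24000000 Hz (24 MHz)"
COUNTER_BITS = 56       # assumption: effective width of the architectural counter

# Seconds covered by one full wrap of the counter.
wrap_seconds = 2**COUNTER_BITS / TIMER_HZ
wrap_years = wrap_seconds / (365.2425 * 86400)

# Where a clock showing the hwclock value above would land after one spurious wrap.
real_time = datetime(2019, 4, 24, 7, 4, 2)
jumped_time = real_time + timedelta(seconds=wrap_seconds)

# Whole days contained in 2**31 seconds, for comparison with the reported uptime.
uptime_days = 2**31 // 86400

print(f"one wrap = {wrap_years:.2f} years")
print(f"{real_time:%Y-%m-%d} + one wrap = {jumped_time:%Y-%m-%d}")
print(f"2**31 seconds = {uptime_days} days")
```

The result lands in mid-2114, close to (though not exactly on) the dates reported in this thread; the remaining difference is not explained by this sketch.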

 

I couldn't change the date, so I am doing the nightly upgrade; hopefully that and a reboot will fix the date.

 

Nightly upgrade fixed the date stuff.  (note doing this remotely)

 

root@ATL-Pine64:~# date
Wed Apr 24 07:30:34 EDT 2019
root@ATL-Pine64:~# hwclock
2019-04-24 07:31:09.806147-0400

 

root@ATL-Pine64:~# uname -a
Linux ATL-Pine64 4.19.20-sunxi64 #5.75 SMP Fri Feb 8 10:29:25 CET 2019 aarch64 aarch64 aarch64 GNU/Linux

 

root@ATL-Pine64:~# ./test_timer
TAP version 13
# number of cores: 4
ok 1 same timer frequency on all cores
# timer frequency is 24000000 Hz (24 MHz)
# time1: 8948417f7, time2: 894841400, diff: -1015
# time1: 89484d7f7, time2: 89484d400, diff: -1015
# time1: 8948f87f7, time2: 8948f8400, diff: -1015
# time1: 89491f7f7, time2: 89491f400, diff: -1015
# time1: 894a457f7, time2: 894a45400, diff: -1015
# time1: 894bb07f7, time2: 894bb0400, diff: -1015
# time1: 894cbe7f7, time2: 894cbe400, diff: -1015
# time1: 894ce27f7, time2: 894ce2400, diff: -1015
# time1: 894d397f7, time2: 894d39400, diff: -1015
# time1: 894d8a7f7, time2: 894d8a400, diff: -1015
# time1: 894ead7f7, time2: 894ead400, diff: -1015
# time1: 894f917f7, time2: 894f91400, diff: -1015
# time1: 89511a7f7, time2: 89511a400, diff: -1015
# time1: 8951297f7, time2: 895129400, diff: -1015
# time1: 8951717f7, time2: 895171400, diff: -1015
# time1: 8952be7f7, time2: 8952be400, diff: -1015
# too many errors, stopping reports
not ok 2 native counter reads are monotonic # 293 errors
# min: -14695416, avg: 6, max: 15369
# diffs: -42000, -41958, -42000, -42000, -42000, -42042, -42042, -42042, -42042, -42041, -42041, -42000, -42000, -42041, -42041, -42041
# too many errors, stopping reports
not ok 3 Linux counter reads are monotonic # 141 errors
# min: -42042, avg: 612, max: 78125
# core 0: counter value: 37320126012 => 1555 sec
# core 0: offsets: back-to-back: 10, b-t-b synced: 9, b-t-b w/ delay: 12
# core 1: counter value: 37320127893 => 1555 sec
# core 1: offsets: back-to-back: 15, b-t-b synced: 8, b-t-b w/ delay: 11
# core 2: counter value: 37320129416 => 1555 sec
# core 2: offsets: back-to-back: 14, b-t-b synced: 8, b-t-b w/ delay: 11
# core 3: counter value: 37320131492 => 1555 sec
# core 3: offsets: back-to-back: 9, b-t-b synced: 9, b-t-b w/ delay: 10
1..3
root@ATL-Pine64:~#
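
To make the "time1 / time2 / diff" lines easier to read: the reported diff appears to be simply time2 minus time1 for two back-to-back reads, so a negative value means the second read came out earlier than the first. The sketch below re-derives the diffs from two lines copied from the outputs above and prints the low 11 bits of each read; the choice of 11 bits is only for illustration.

```python
import re

# Two "time1/time2" lines copied from the test_timer runs above.
SAMPLES = [
    "# time1: dc584c0bf7, time2: dc584c0b00, diff: -247",
    "# time1: 8948417f7, time2: 894841400, diff: -1015",
]

LINE_RE = re.compile(r"time1: ([0-9a-f]+), time2: ([0-9a-f]+), diff: (-?\d+)")

def parse_line(line):
    """Return (time1, time2, reported_diff) as integers, or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return int(m.group(1), 16), int(m.group(2), 16), int(m.group(3))

for line in SAMPLES:
    t1, t2, reported = parse_line(line)
    computed = t2 - t1  # negative means the second read was earlier than the first
    low1, low2 = t1 & 0x7FF, t2 & 0x7FF  # low 11 bits of each read
    print(f"diff={computed} (reported {reported}), low bits: {low1:#05x} -> {low2:#05x}")
```

In both samples the first read ends in ...f7 and the second has its lowest bits cleared, which is at least consistent with a glitch around a low-order bit rollover.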

 


Unfortunately, the current mainline fix seems to be unreliable. I am running a fleet of these things (with a kernel built from source). The UNKNOWN1 patch is enabled and loaded, but this is the second time I have seen this happen on this kernel. It happens on different devices too: the second time was on a different device than the first. I don't know if the patch was just a statistical cross-your-fingers patch; I'm going to review it now, but that's what it seems like.

 

root@host-131:~# uname -a
Linux host-131 5.1.0-rc2-sunxi64 #5.77.66 SMP Sun Apr 28 16:02:54 UTC 2019 aarch64 aarch64 aarch64 GNU/Linux
root@host-131:~# date
Fri Jun 22 19:45:31 UTC 2114
root@host-131:~# zcat /proc/config.gz | grep ERRAT
CONFIG_ARM64_ERRATUM_826319=y
CONFIG_ARM64_ERRATUM_827319=y
CONFIG_ARM64_ERRATUM_824069=y
CONFIG_ARM64_ERRATUM_819472=y
# CONFIG_ARM64_ERRATUM_832075 is not set
CONFIG_ARM64_ERRATUM_845719=y
CONFIG_ARM64_ERRATUM_843419=y
CONFIG_ARM64_ERRATUM_1024718=y
CONFIG_ARM64_ERRATUM_1188873=y
CONFIG_ARM64_ERRATUM_1165522=y
CONFIG_ARM64_ERRATUM_1286807=y
# CONFIG_CAVIUM_ERRATUM_22375 is not set
CONFIG_CAVIUM_ERRATUM_23144=y
# CONFIG_CAVIUM_ERRATUM_23154 is not set
# CONFIG_CAVIUM_ERRATUM_27456 is not set
# CONFIG_CAVIUM_ERRATUM_30115 is not set
# CONFIG_QCOM_FALKOR_ERRATUM_1003 is not set
# CONFIG_QCOM_FALKOR_ERRATUM_1009 is not set
# CONFIG_QCOM_QDF2400_ERRATUM_0065 is not set
CONFIG_HISILICON_ERRATUM_161600802=y
CONFIG_QCOM_FALKOR_ERRATUM_E1041=y
CONFIG_FUJITSU_ERRATUM_010001=y
CONFIG_FSL_ERRATUM_A008585=y
# CONFIG_HISILICON_ERRATUM_161010101 is not set
# CONFIG_ARM64_ERRATUM_858921 is not set
CONFIG_SUN50I_ERRATUM_UNKNOWN1=y
root@host-131:~# cat /proc/device
device-tree/ devices
root@host-131:~# cat /proc/device
device-tree/ devices
root@host-131:~# cat /proc/device-tree/timer/
allwinner,erratum-unknown1  compatible                  interrupts                  name
root@host-131:~# cat /proc/device-tree/timer/allwinner,erratum-unknown1
root@host-131:~# dmesg | grep errat
[    0.000000] CPU features: detected: ARM erratum 845719
[    0.000000] CPU features: detected: ARM erratum 843419
[    0.000000] arch_timer: Enabling global workaround for Allwinner erratum UNKNOWN1
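
For anyone checking a fleet of boards, the grep above can be scripted. This is a small sketch, assuming the kernel exposes /proc/config.gz (which requires CONFIG_IKCONFIG_PROC); the function names are made up for illustration. It only checks the build configuration, so the dmesg line is still the evidence that the workaround was actually enabled at boot.

```python
import gzip

def option_enabled(config_text, option):
    """Return True if `option` is set to y (built in) or m (module) in a kernel config."""
    for line in config_text.splitlines():
        if line.strip() in (f"{option}=y", f"{option}=m"):
            return True
    return False

def read_running_config(path="/proc/config.gz"):
    """Read the running kernel's config text; needs CONFIG_IKCONFIG_PROC."""
    with gzip.open(path, "rt") as fh:
        return fh.read()

# Two lines taken from the config listing above, used as sample input.
SAMPLE = (
    "# CONFIG_ARM64_ERRATUM_858921 is not set\n"
    "CONFIG_SUN50I_ERRATUM_UNKNOWN1=y\n"
)

print(option_enabled(SAMPLE, "CONFIG_SUN50I_ERRATUM_UNKNOWN1"))
print(option_enabled(SAMPLE, "CONFIG_ARM64_ERRATUM_858921"))
```

On a real board, `option_enabled(read_running_config(), "CONFIG_SUN50I_ERRATUM_UNKNOWN1")` would do the same check against the running kernel.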

 


test timer results

 

Host with clock already drifted to 2114:

root@host-131:~# ./test_timer
TAP version 13
# number of cores: 4
ok 1 same timer frequency on all cores
# timer frequency is 24000000 Hz (24 MHz)
ok 2 native counter reads are monotonic # 0 errors
# min: 6, avg: 6, max: 3795
ok 3 Linux counter reads are monotonic # 0 errors
# min: 458, avg: 504, max: 39000
# core 0: counter value: 2610635813617 => 108776 sec
# core 0: offsets: back-to-back: 7, b-t-b synced: 10, b-t-b w/ delay: 9
# core 1: counter value: 2610635816527 => 108776 sec
# core 1: offsets: back-to-back: 9, b-t-b synced: 7, b-t-b w/ delay: 9
# core 2: counter value: 2610635818363 => 108776 sec
# core 2: offsets: back-to-back: 9, b-t-b synced: 7, b-t-b w/ delay: 8
# core 3: counter value: 2610635821240 => 108776 sec
# core 3: offsets: back-to-back: 8, b-t-b synced: 7, b-t-b w/ delay: 8
1..3
root@host-131:~# date
Fri Jun 22 19:59:10 UTC 2114

 

Host with clock still at the present time:

root@host-125:~# ./test_timer
TAP version 13
# number of cores: 4
ok 1 same timer frequency on all cores
# timer frequency is 24000000 Hz (24 MHz)
ok 2 native counter reads are monotonic # 0 errors
# min: 6, avg: 6, max: 174096
ok 3 Linux counter reads are monotonic # 0 errors
# min: 458, avg: 500, max: 25958
# core 0: counter value: 2612467490982 => 108852 sec
# core 0: offsets: back-to-back: 8, b-t-b synced: 11, b-t-b w/ delay: 9
# core 1: counter value: 2612467492975 => 108852 sec
# core 1: offsets: back-to-back: 8, b-t-b synced: 6, b-t-b w/ delay: 8
# core 2: counter value: 2612467494726 => 108852 sec
# core 2: offsets: back-to-back: 8, b-t-b synced: 6, b-t-b w/ delay: 8
# core 3: counter value: 2612467497033 => 108852 sec
# core 3: offsets: back-to-back: 8, b-t-b synced: 6, b-t-b w/ delay: 8
1..3
root@host-125:~# date
Tue Apr 30 23:27:54 UTC 2019


 


Thank you Sundry.

 

The ATL (Atlanta, Georgia) Pine64 crashed and burned up (literally) after the installed heat sinks slid off yesterday.

 

I ordered a new TVBox / SD card to replace the dead Pine64 today and it will be there tomorrow and configured by end of business day tomorrow.

 

I shut local Pine64 off (near Chicago here) and switched over to using a TVBox for the mini automation server.

 

Currently moving all of the automation software from the ATL Pine64 to another TVBox there in Atlanta.

 

That said, and relating to the Pine64: I had asked about a promised update to the Rock64 that never materialized for a year, so I gave up on them and decided to maybe try the RockPi4, which has more features and is the same size as the Rock64.

 

Posted on the Rock64 forum- yeah I was nagging a bit here...

 

ROCK64 RTC

 

December 3, 2017
Question:
Will the RTC be available in the December 12, 2017 Rock64 release?

Answer:
RTC is working, just lacking battery backup. The battery backup circuit not available on December production, we plan to merge in on January production but not sure will happen. This due to there is super long lead time on PCB manufacturing.

January 30, 2018
Question:
Wondering if the RTC / battery stuff is available on current Rock64 hardware?

If not can I have instructions for soldering in battery posts and utilize the Pine64 battery holder?
Answer:
Attached is the RTC mod circuit for ROCK64. However, ROCK64 don't support battery charging and no such circuit available.

February 9, 2018
Question:
Thank you tlimm.

So the hardware RTC with battery changes have been abandoned eh?
Basically then all I need to do is solder on a battery to the Rock64 board.
I have done similar here with my tabletop touchscreens.
Answer:
The ROCK64 board with RTC battery connector is not abandon. Check out the FOSDEM PINE64 table stand photo and you may able to spot this board.

This is just a minor change and we will revise board revision during production run sometimes on Q2 2018 timeframe.

May 2, 2018
Question:
Curious if the new Rock64 hardware release has the built in RTC battery connections?
Answer:
This release already on the plan and originally plan release to production on this month. However, due to focus on ROCKPro64 activity, and has push to June/July.
 

June 11, 2018

Posting another couple of requests today.

1 - when will the updated Rock64 with RTC and battery be available?
2 - will the Rock64Pro ever have a battery RTC option? (it would be a great little firewall)

August 6, 2018

tllim Wrote: already factor into new ROCK64 revision including PoE option and micro SD UHS design. Needs to test all new function first before release and this takes couple months.

 

02-08-2019, 07:17 AM

 

I do not see any update to using a battery for RTC on newest Rock64 which was going to be utilized as an automation server combo firewall.

I have not used the Internet for time in over 15 years now and always used an NTP server with GPS / PPS. Over the years transitioned to using the PFSense box with a serial / PPS connection to a GPS which provides great time for me.

I am very particular about automation and time.

We are at the two year mark here.

Time to give up and move on.

 

The best thing that happened for the Pine64 was Armbian.  That said I have moved on now from the Pine64 / Rock64 stuff.

 

I would like to add an RTC / Battery to the TV Box. (icing on the cake)

 

 

 


2 hours ago, sundbry said:

Unfortunately, the current mainline fix seems to be unreliable. I am running a fleet of these things (with a kernel built from source). The UNKNOWN1 patch is enabled and loaded, but this is the second time I have seen this happen on this kernel. It happens on different devices too: the second time was on a different device than the first. I don't know if the patch was just a statistical cross-your-fingers patch; I'm going to review it now, but that's what it seems like.

 

The mainline patch is almost identical to the previous patch used in alternate forks, including those used by Armbian.  The mainline patch has only one change (other than the Kconfig symbol and matching macro):  one less low-order bit is masked during the test for the condition when the problem has been seen to occur, which actually makes the mainline patch provide more "coverage" than the previous.
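
To illustrate why masking one fewer low-order bit gives more "coverage": assume, purely for illustration, that the workaround re-reads the counter whenever the low N bits of the value are all ones or all zeros. The actual kernel macro and its bit widths are not quoted in this thread, so the check below and the widths 10 and 11 are hypothetical. Under that assumption a narrower mask triggers the re-read twice as often:

```python
def needs_reread(value, mask_bits):
    """Hypothetical check: True when the low `mask_bits` bits are all ones or all zeros."""
    mask = (1 << mask_bits) - 1
    return ((value + 1) & mask) <= 1

def flagged_count(mask_bits, span=4096):
    """Count how many counter values in range(span) would trigger a re-read."""
    return sum(needs_reread(v, mask_bits) for v in range(span))

print("10-bit mask:", flagged_count(10), "of 4096 values re-read")
print("11-bit mask:", flagged_count(11), "of 4096 values re-read")
```

The trade-off in such a scheme is that a wider net costs a few more re-reads but leaves fewer suspicious values unchecked.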

 

The previous patch that had been applied to 4.19.y and 5.0.y kernels has similar behavior but was indeed working with 4.14.y and earlier, so there may be something else that has changed.  I have not had the opportunity to continue looking into this just yet.

 

It is also possible that the original fix and the subsequent mainline fix were not complete, which is actually noted in the comments for the various commits.   Disabling a tickless kernel, for instance, seems to exacerbate the issue significantly though on the surface this would not be expected.

 

 


The heat sink adhesive melted off? Yikes @Petee  
Thanks for the info @windysea. I wonder what the hell the ALLWINNER CORPORATION has to say about this. You would think that maybe knowing the internals of the CPU they could provide some guidance. Why would I ever want to rely on another Allwinner CPU if they can't communicate. Maybe the Pine64 guys are the only ones with enough leverage to get them to speak up.
 


The fixes to date have all been community-provided based on observations and testing (IE: not based on anything from the SoC vendor).  Allwinner has been silent on this specific issue, which actually isn't really surprising :(

 

I did find a different fix from Allwinner for a different timer issue, which occurs when trying to write (rather than read) one of the affected timer registers.  That fix is not part of the mainline (or Armbian) kernels and does not immediately appear to be directly related, but that is part of what I hope to look into.


 

The heat sink adhesive melted off?

 

No.  This was the second heatsink pack I purchased at Amazon.  The first one didn't really have any adhesive on it.  I thought the second pack of heat sinks had more adhesive.  Best to purchase a sticky adhesive compound when installing these little heatsinks.

 

 

 

 


@windysea as I was pondering this further yesterday, I realized I had never observed this issue for over a year when running on the 3.xx ayufan kernels. I think that what you said may have something to do with it. I wonder what other differences are hiding and implemented in that kernel?


On 3/13/2019 at 1:30 PM, martinayotte said:

It has been fixed a while ago by having CONFIG_FSL_ERRATUM_A008585 in kernel def_config ...

Which image/kernel are you using ?

 


Sorry for the delay.

I'm using Armbian 5.75 Stable (Stretch)

$ uname -a
Linux pine64 4.20.7-sunxi64 #5.75 SMP Fri Feb 8 10:37:18 CET 2019 aarch64 GNU/Linux

I'll try with the nightly build on 5.x and check if it works.

Also, here's Andre's test_timer output:

 

# ./timer 
TAP version 13
# number of cores: 4
ok 1 same timer frequency on all cores
# timer frequency is 24000000 Hz (24 MHz)
# time1: a33e3cbff, time2: a33e3c80b, diff: -1012
# time1: a33e533ff, time2: a33e5300b, diff: -1012
# time1: a33ea87f5, time2: a33ea8400, diff: -1013
# time1: a33eb4bff, time2: a33eb480b, diff: -1012
# time1: a33f6b7f4, time2: a33f6b400, diff: -1012
# time1: a341037f4, time2: a34103400, diff: -1012
# time1: a34233ff5, time2: a34230c00, diff: -13301
# time1: a342ccff4, time2: a342ccc00, diff: -1012
# time1: a342edff5, time2: a342ed3ff, diff: -3062
# time1: a3430d7f5, time2: a3430d400, diff: -1013
# time1: a3436d7f4, time2: a3436d400, diff: -1012
# time1: a343a83ff, time2: a343a8014, diff: -1003
# time1: a343c17f5, time2: a343c1400, diff: -1013
# time1: a34568bff, time2: a3456880b, diff: -1012
# time1: a345e2c00, time2: a345e200b, diff: -3061
# time1: a3465b7f5, time2: a3465b400, diff: -1013
# too many errors, stopping reports
not ok 2 native counter reads are monotonic # 370 errors
# min: -15350, avg: 11, max: 15372
# diffs: -127209, -41834, -127167, -41834, -41792, -41834, -41834, -41792, -41833, -41834, -41792, -41834, -41792, -41833, -41834, -127250
# too many errors, stopping reports
not ok 3 Linux counter reads are monotonic # 373 errors
# min: -174633417, avg: 807, max: 640875
# core 0: counter value: 44453336039 => 1852 sec
# core 0: offsets: back-to-back: 15, b-t-b synced: 12, b-t-b w/ delay: 20
# core 1: counter value: 44453431859 => 1852 sec
# core 1: offsets: back-to-back: 12, b-t-b synced: 11, b-t-b w/ delay: 15
# core 2: counter value: 44453624827 => 1852 sec
# core 2: offsets: back-to-back: 19, b-t-b synced: 12, b-t-b w/ delay: 20
# core 3: counter value: 44453630152 => 1852 sec
# core 3: offsets: back-to-back: 13, b-t-b synced: 12, b-t-b w/ delay: 19
1..3

On 4/20/2019 at 12:13 PM, windysea said:

Then came systemd and that caused breakage so the extra checks were put in.  There are many discussions on this elsewhere so there's no need to discuss further here:  The summary is if you are using systemd do not run 'hwclock -s'.  Ever.

 

Conversely - if one is running ntpd, systemd-timesyncd can cause a lot of trouble here...

 

Above is a great post - and some things to add maybe...

 

GPS without PPS is an OK time source for NTP if nothing else is available via the network - but its precision only goes so far, and if the GPS is over USB, this adds more jitter...

 

NOHZ is perhaps one approach, esp. with a CPU that does DVFS, and setting the scheduler appropriately to get some timing stability overall - notwithstanding the long jumps that the A64 seems to do, which is another issue, as NTP assumes that the system timing is somewhat stable.

 

Hint: if one is looking for a stable Stratum 1 NTP, check out Google Public NTP, as this is what's keeping their global cloud in sync, and there it pays to also use Google DNS for the resolver.

 

This is likely good enough for most folks. If one uses GPS via gpsd over the SHM driver, that should be the preferred source, followed by time1/time2/time3/time4.google.com, and maybe the local clock as a fallback; that should get to a stable NTP setup.

 

To get really stable local time - one is not going to get stratum 0 on an SBC, period, even with GPS and PPS... most of these cheap SBCs use an XO with 30 to 50 ppm accuracy (this is a 50 cent chip) - this is good enough for Ethernet and WiFi - I'm looking at a part that is 50 ppb, which would normally be an OCXO (spendy), but this part is a MEMS device, and so far the data I've seen looks pretty good over time.
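
To put those ppm and ppb figures into perspective, here is a quick worked conversion to worst-case drift per day for a free-running oscillator (that is, with nothing like NTP disciplining the clock). The function name is just for illustration.

```python
SECONDS_PER_DAY = 86_400

def drift_per_day(fraction):
    """Worst-case drift in seconds per day for a fractional frequency error."""
    return fraction * SECONDS_PER_DAY

for label, fraction in [("30 ppm", 30e-6), ("50 ppm", 50e-6), ("50 ppb", 50e-9)]:
    print(f"{label}: up to {drift_per_day(fraction):.6f} s/day")
```

So a 50 ppm part can wander by a few seconds per day on its own, while a 50 ppb part stays within a few milliseconds per day.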


On 5/29/2019 at 12:37 AM, sfx2000 said:

 

To get really stable local time - one is not going to get stratum 0 on an SBC, period, even with GPS and PPS... most of these cheap SBCs use an XO with 30 to 50 ppm accuracy (this is a 50 cent chip) - this is good enough for Ethernet and WiFi - I'm looking at a part that is 50 ppb, which would normally be an OCXO (spendy), but this part is a MEMS device, and so far the data I've seen looks pretty good over time.

 

An ntp stratum is not related to accuracy nor precision - it is simply an indication of how many "hops" a given NTP server is from a reference clock.  A stratum-0 is a reference clock (IE: atomic clock, GPS receiver, etc).  A stratum-1 is the NTP server directly using that reference clock for time synchronization.  An SBC with a serial GPS indeed can be a stratum-1 (the GPS would be stratum-0), and there are many public postings on doing this.  In fact the NTPsec team is doing "research" on this topic and has published documentation regarding this.

 

The nature of the reference implementation of NTPd is specifically to maintain accurate time regardless of any hardware timers.  Today's "50-cent" parts are still more stable than those orders-of-magnitude more expensive decades-ago when NTPd was first developed.

 

Google's NTP project may use their own "atomic clocks", but their public NTP servers tend to be on the poor end with respect to jitter.  They're intended to be "close-enough", stable, and highly-available.  They are not intended to be highly accurate.  Their public NTP servers, for instance, implement leap-smearing rather than advertise a leap-second (when appropriate).  For this reason Google strongly recommends not mixing their public NTP servers in a configuration with other NTP sources (bad things can happen, and in fact have happened in the past).  Google's NTP servers also are behind anycast load-balancers.  While this improves availability and end-device configuration simplicity it actually degrades performance.

 

In my own testing google's ntp servers typically have higher jitter than most of the larger NTP pool project pools, the latter of which are already commonly used as defaults in many OS distributions.

 

Configuring and building a non-tickless kernel is required in order to enable kernel-pps (aka "hard pps"), which typically has far less jitter than "soft pps".  However, doing so even with the latest 5.1.y (DEV) kernels results in an unstable platform where the issue noted in this thread will manifest fairly frequently.  It may just be that A64-based SBCs are not suitable to host NTP reference clocks and stratum-1 NTP servers but earlier kernels did not seem to have this issue so it may just be that a previous mitigation got lost along the way.

 


Here just built a second PFSense box and testing out a new mini GPS (Arduino style) that has built in RS-232 and PPS signal.  It is around $7 USD with free shipping on Ebay.  My primary GPS / PPS is a modded Sure GPS.

 

I am DIN-mounting this tiny box.

 

[Image: PFSense box]

 

I know this doesn't have much to do with the TVBox / Armbian rather just a NTP time source for my home network.

 

[Image: GPS module]

 

 

 

 


I have this problem very very often.  Running Ubuntu Bionic with Armbian Linux 4.19.57-sunxi64 pretty stable otherwise when time is sync'd.  I disabled systemd-timesyncd and seems to not have the time/date go to year 2114 as much, but still happens 3-4 times per week.  I usually have to manually resync the time using sudo ntpdate 0.north-america.pool.ntp.org.  Sometimes this gives invalid argument and requires me to reboot to run this command.  I'd love to help in any way I can to debug this issue.  Not sure how!

 

I should add I didn't have this issue running arch linux with kernel 4.19 which is odd.  I like armbian much better however and want to stay with it.  Also never had this issue on the ayufan kernels either.


@markolonius I haven't seen this issue in production for over a month on my 7x units I have deployed, after reverting to the method used in the "legacy" kernel branch. Here is the patch. The original patch which is in upstream Linux was a little "too clever", and although I believe it must have worked well for the person who submitted it, the errata in the hardware make it an unreliable fix. The previous technique of just reading the register in a loop until it gets a stable reading works much better.

 

https://github.com/arctype-co/linux/commit/5fcb4e57eeaa4d670ef4acf5818c6fe16aa0d3d0
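
For readers who want the general idea of "read in a loop until it gets a stable reading" without opening the commit: the sketch below is not the code from the linked patch, just a minimal Python illustration of the technique. The thresholds and the fake counter are made up; the real fix works on the hardware register in kernel C.

```python
def read_stable(read_counter, max_step=1000, max_retries=10):
    """Re-read until two consecutive reads are close together and not going backward.

    `max_step` and `max_retries` are illustrative values, not taken from any kernel.
    """
    previous = read_counter()
    for _ in range(max_retries):
        current = read_counter()
        if 0 <= current - previous <= max_step:
            return current
        previous = current
    return previous  # give up and return the last value read

def make_fake_counter(values):
    """Return a callable that yields canned values, standing in for the hardware register."""
    it = iter(values)
    return lambda: next(it)

# Simulated reads: the second value jumps backward, like the negative
# diffs in the test_timer output earlier in this thread.
reads = make_fake_counter([0x8948417F7, 0x894841400, 0x894841801, 0x894841805])
print(hex(read_stable(reads)))
```

The design choice is to trust a value only once a neighbouring read agrees with it, rather than trying to predict which bit patterns are unsafe.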

 

 


5 hours ago, sundbry said:

@markolonius I haven't seen this issue in production for over a month on my 7x units I have deployed, after reverting to the method used in the "legacy" kernel branch. Here is the patch. The original patch which is in upstream Linux was a little "too clever", and although I believe it must have worked well for the person who submitted it, the errata in the hardware make it an unreliable fix. The previous technique of just reading the register in a loop until it gets a stable reading works much better.

 

https://github.com/arctype-co/linux/commit/5fcb4e57eeaa4d670ef4acf5818c6fe16aa0d3d0

  

 

This sounds awesome.  Haven't built a kernel for an arm board yet so this should be interesting.  I'll give it a try, but might take me a while due to time.  How would I download this patch? Any plans/hopes of this making it into a nightly or stable release?


@sundbry I attempted to apply the patch but could never get it to work.  So I actually tried out the 5.2.x kernel and kept systemd-timesyncd disabled, which seems to work and has been stable for me for the past week or so.  As soon as I enable systemd-timesyncd and start it, it switches the time to year 2114.  Since I don't have much time at the moment to keep trying to build a patched kernel, I'll keep it this way.  But if anyone has the same problem on the 5.2.x kernel with systemd-timesyncd and is working on solving it, I'll be more than happy to supply whatever logs are requested to help.


This topic is now closed to further replies.