pine64: massive date/time clock problem


1 hour ago, Igor said:

 

Use the updated build engine - do a git pull before starting. I just built this kernel successfully.

That worked, the patch is applied, let's see if it works :)

Thanks!

On 12/16/2019 at 6:30 PM, eminguez said:

That worked, the patch is applied, let's see if it works :)

Thanks!

I can confirm it's been >24h without the issue \o/ Also, I've seen that linux-image-dev-sunxi64/buster 19.11.4.351 contains the fix, so I'm upgrading to that package version instead of my custom-compiled one.

 

Thanks!


Dec 23 13:35:02 localhost nm-dispatcher: req:2 'connectivity-change': new request (2 scripts)
Dec 23 13:35:02 localhost nm-dispatcher: req:2 'connectivity-change': start running ordered scripts...
Dec 23 13:35:03 localhost systemd[1]: Started Samba NMB Daemon.
Dec 23 13:35:03 localhost systemd[1]: Starting Samba SMB Daemon...
Dec 23 13:35:03 localhost systemd[1]: Started Samba SMB Daemon.
Dec 23 13:35:05 localhost chronyd[1037]: Selected source x.x.x.x
Dec 23 13:35:05 localhost chronyd[1037]: System clock wrong by -1.542673 seconds, adjustment started
Dec 23 13:35:05 localhost chronyd[1037]: System clock was stepped by -1.542673 seconds
Dec 23 13:35:10 localhost systemd[1]: NetworkManager-dispatcher.service: Succeeded.
Dec 23 13:35:16 localhost systemd[1]: systemd-hostnamed.service: Succeeded.
Dec 23 13:35:18 localhost systemd[1]: dev-ttyFIQ0.device: Job dev-ttyFIQ0.device/start timed out.
Dec 23 13:35:18 localhost systemd[1]: Timed out waiting for device /dev/ttyFIQ0.
Dec 23 13:35:18 localhost systemd[1]: Dependency failed for Serial Getty on ttyFIQ0.
Dec 23 13:35:18 localhost systemd[1]: serial-getty@ttyFIQ0.service: Job serial-getty@ttyFIQ0.service/start failed with result 'dependency'.
Dec 23 13:35:18 localhost systemd[1]: dev-ttyFIQ0.device: Job dev-ttyFIQ0.device/start failed with result 'timeout'.
Dec 23 13:35:18 localhost systemd[1]: Reached target Login Prompts.
Dec 23 13:35:18 localhost systemd[1]: Reached target Multi-User System.
Dec 23 13:35:18 localhost systemd[1]: Reached target Graphical Interface.
Dec 23 13:35:18 localhost systemd[1]: Starting Update UTMP about System Runlevel Changes...
Dec 23 13:35:18 localhost systemd[1]: systemd-update-utmp-runlevel.service: Succeeded.
Dec 23 13:35:18 localhost systemd[1]: Started Update UTMP about System Runlevel Changes.
Dec 23 13:35:18 localhost systemd[1]: Startup finished in 52.814s (kernel) + 1min 30.570s (userspace) = 2min 23.384s.

 

This is the output of tail -f /var/log/syslog on Linux rockpro64 5.3.11-rockchip64 #19.11.3 SMP PREEMPT Mon Nov 18 21:03:09 CET 2019 aarch64 GNU/Linux. My network dongle is working better on this kernel after today's armbian-config update.
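
To tell ordinary NTP corrections like the one in this log apart from a real clock jump, the chronyd offsets can be pulled out of syslog with a small script. This is only a sketch: the function name and the 60-second threshold are illustrative assumptions, and the pattern is based solely on the "System clock wrong by" message shown in the log above.

```python
import re

# Matches the chronyd message format seen in the log above, e.g.
# "chronyd[1037]: System clock wrong by -1.542673 seconds, adjustment started"
OFFSET_RE = re.compile(
    r"chronyd\[\d+\]: System clock wrong by (-?\d+(?:\.\d+)?) seconds"
)


def large_clock_offsets(lines, threshold_s=60.0):
    """Return (line, offset) pairs whose absolute offset exceeds threshold_s.

    threshold_s is an arbitrary illustrative cutoff, not a chronyd setting.
    """
    hits = []
    for line in lines:
        match = OFFSET_RE.search(line)
        if match:
            offset = float(match.group(1))
            if abs(offset) > threshold_s:
                hits.append((line.rstrip("\n"), offset))
    return hits


sample = [
    "Dec 23 13:35:05 localhost chronyd[1037]: Selected source x.x.x.x",
    "Dec 23 13:35:05 localhost chronyd[1037]: System clock wrong by "
    "-1.542673 seconds, adjustment started",
]

# With the default threshold the 1.5 s correction is ignored.
small = large_clock_offsets(sample)
# With a lower threshold the same line is reported.
flagged = large_clock_offsets(sample, threshold_s=1.0)
```

To scan a real log, pass an open file object for /var/log/syslog as the lines argument.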


I tried to build the new kernel, and while looking through all the messages I found this:

[ warn ] * [l][c] fix-a64-timejump.patch [ failed ]

 

maybe the patch is there but it is not getting applied properly
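
A failed patch is easy to miss in a long build log. As a rough sketch, the captured build output could be scanned for warnings of this kind. The pattern below is derived only from the single warning line quoted above, so the format of other status lines is an assumption, and the function name is hypothetical.

```python
import re

# Based on the one line seen in this thread:
# "[ warn ] * [l][c] fix-a64-timejump.patch [ failed ]"
FAILED_PATCH_RE = re.compile(
    r"\[\s*warn\s*\].*?(\S+\.patch)\s+\[\s*failed\s*\]"
)


def failed_patches(build_output):
    """Return the names of patches reported as failed in the build output."""
    return [m.group(1) for m in FAILED_PATCH_RE.finditer(build_output)]


log_text = "[ warn ] * [l][c] fix-a64-timejump.patch [ failed ]\n"
result = failed_patches(log_text)
print(result)
```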

32 minutes ago, langerma said:

i tried to build the new kernel and by looking through all the messages i found this

 

Have you updated your build engine sources before? Which kernel?

29 minutes ago, langerma said:

i did a git pull beforehand


I checked my sources and I see no problems with patching (current/dev).

 

Is your head (git log) really this:

commit 246b6726bab48ea67e560cc41c4122ee5fa35d63 (HEAD -> master, origin/master, origin/HEAD)
Merge: b3a68c36 26cf7f2f
Author: count-doku <52237708+count-doku@users.noreply.github.com>
Date:   Sat Dec 28 14:51:02 2019 +0100

I am asking this because I was fixing this patch a week ago.


indeed it is:

 

commit 246b6726bab48ea67e560cc41c4122ee5fa35d63
Merge: b3a68c3 26cf7f2
Author: count-doku <52237708+count-doku@users.noreply.github.com>
Date:   Sat Dec 28 14:51:02 2019 +0100

    Merge pull request #1697 from mmriech/fix_dkms

    Fix DKMS in Armbian


I just want to confirm that I have been running Armbian 19.11.6 on my pine64 (Linux pine64 5.4.7-sunxi64 #19.11.6 SMP Sat Jan 4 19:40:10 CET 2020 aarch64 GNU/Linux) for a couple of weeks with some workloads (a single k3s cluster with some pods running), with 0 clock-related issues so far.

 

Thanks!


I upgraded my four Pine64s (one LTS) to 5.3.9 today. So far, so good, re: clock problems. However, the Pine that I had the most clock problems with kept hanging while trying to edit files. I tried to go back to my patched kernel, but lost the CPU temperature and frequency. I also had problems with kworker processes using lots of CPU time. I speculated that it probably had something to do with the Ethernet port, as I've had problems with that before. I decided to try 5.5-rc6 and see what happened. No hanging, CPU temp is back, but the CPU frequencies show as just 0 (N/A in armbianmonitor).  Still, I'm OK with that, as long as the clock problem stays gone.

 

Thanks!


Ouch. Two hours after I posted that, one of my Pine64s running 5.3.9 went back in time to 1979, apparently. A reboot cleared that. I've only seen date problems on this system three times. Once was early this morning, which prompted me to do the upgrade, as it was still running an unpatched 4.x, and then again this afternoon.

15 hours ago, goathunter said:

went back in time to 1979

The original "massive date/time clock problem" issue always jumps into the future by about 95 years.

So, going "back in time to 1979" must be a different issue ...
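
The 95-year figure is close to what a step of 2^56 ticks would produce on a 24 MHz counter. The calculation below is only a sketch of that arithmetic: both the 24 MHz timer frequency and the 2^56 step size are assumptions brought in for illustration, not something stated in this post.

```python
TIMER_HZ = 24_000_000                  # assumed counter frequency (24 MHz)
JUMP_TICKS = 2 ** 56                   # assumed size of the erroneous step, in ticks
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year

jump_seconds = JUMP_TICKS / TIMER_HZ
jump_years = jump_seconds / SECONDS_PER_YEAR

print(f"{jump_seconds:.0f} s is about {jump_years:.1f} years")
```

Under those assumptions the step comes out at roughly 95 years forward, so a clock landing in 1979 or at the 1970 epoch does not fit the same arithmetic.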


IIRC, a careful reading of the patch notes indicated that it only actually covered like 99% of cases or something like that, so it is possible (though rare) that it could be the same problem?

On 1/31/2020 at 11:06 PM, TRS-80 said:

IIRC, a careful reading of the patch notes indicated that it only actually covered like 99% of cases or something like that, so it is possible (though rare) that it could be the same problem?


It looks like it. After posting that, the system jumped ahead 95 years on Saturday morning. That was on 5.3. I decided to take all of my systems to 5.5-rc6. I haven't seen the issue on any of them since (though the CPU frequency still shows as 0 in htop on all four).


A week or so after I posted that, 5.5-rc6 was apparently pulled? Abandoned? It was no longer an option in the "switch kernels" section of armbian-config. I decided it was bad to run something that was apparently pulled, so I took my Pine A64+ systems to 5.3.9, and a day or two later, apt upgrade bumped them to 5.4.20.

 

But that was bad. My Ethernet speeds dropped to about 1/5 of what they normally are, and my Pine running Plex was constantly showing Plex processes using as much as 20% of the CPUs all the time. Last night, I finally rolled back to the 5.3.9 kernel (build 19.x.x), and all of my problems with speed and CPU usage went away. Things are as speedy as they were before 5.4.20, and Plex processes are no longer consuming CPU non-stop. I also noticed that my CPU frequencies are back, and that the CPU temperature readings seem to be more accurate now. Something appears to have broken between 5.3.9 and 5.4.20 for Pine A64+.

I've now held the kernel updates to prevent their being upgraded again. I'll just hang out at 5.3.9 until I see some reason to think that things will get better.

$ sudo apt-mark hold linux-headers-current-sunxi64
$ sudo apt-mark hold linux-dtb-current-sunxi64
$ sudo apt-mark hold linux-image-current-sunxi64

I haven't seen time jumps on any of the three since rolling back to 5.3.9, but it's only been about 14 hours, so we'll see.

 

I did decommission a fourth Pine that had the most problems with time jumping. I finally decided it wasn't worth the hassle, and I just got rid of it. That particular system also had a finicky Ethernet port that didn't work with some versions of the 4.x.x kernel.

 


After about 10 days, two of my Pines are stable. The third one keeps time jumping. This morning, it took me several "set time/reboot" sequences before I was finally able to stop it from time jumping. There's clearly still a problem in 5.3.9 with some Allwinner clocks.

The patch posted earlier in this thread still seems to be the best workaround to this problem, even if some think it isn't the correct workaround. I have not tried to patch 5.3.9 with that patch, though I may give it a try if I continue to have problems.


After my last post, I made a discovery that seems to have helped with the time jump problem. So much so that I rescued my decommissioned fourth Pine from the recycle bin to give it one more go.

 

I learned about the hwclock command to display and set the hardware clock. Long story short, I used this sequence to set my clocks on Pine #3, the one that I said on the 10th took me several "set time/reboot" sequences:

$ sudo hwclock
$ sudo ntpdate pool.ntp.org
$ sudo hwclock -w
$ sudo hwclock && date

The first "hwclock" just displays the hardware clock setting. The two clocks differed by a few seconds when I first checked them. I used ntpdate to set the time, then "hwclock -w" to set the hardware clock based on the system clock.

 

It could be that I've just been lucky, but it's been nine days now since doing that on all of my systems, and none of them have time-jumped. My theory is that the hardware clock was previously off from the system clock enough to confuse things and cause the time jump.

 

I started fresh with the current Armbian Bionic distribution for Pine #4. As with the others, the 5.4.20 kernel displayed an incorrect CPU temperature, and even while idle, the CPU was in use more than it should have been, with CPU frequencies jumping all over the place. I dropped down to 5.3.9, and all of those issues went away. It's only been 14 hours, but the system has not time-jumped since issuing the sequence above.

 

We'll see....


And the system described above jumped again today. When I rebooted it, this is what I saw.

root@zako:~# hwclock && date
1969-12-31 18:00:38.270548-0600
Tue Mar 24 14:41:14 CDT 2020
root@zako:~#

I have now gone back to the patched 4.19.83 on the two systems that keep jumping, and there they shall stay.

 

So far, my other two have not jumped since going to 5.3.9. If either does, I'll move it back to the patched 4.19.83.
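
Converted to UTC, the hwclock reading above is only about 38 seconds past the Unix epoch (1970-01-01 00:00:00 UTC). That may suggest the hardware clock restarted from zero rather than jumping 95 years ahead, though this is an inference and not something confirmed in the thread. A minimal sketch of the conversion follows; the function name is hypothetical, and the system time is entered by hand with CDT taken as UTC-5.

```python
from datetime import datetime, timedelta, timezone


def parse_hwclock(text):
    """Parse a timestamp in the format printed by hwclock above."""
    return datetime.strptime(text.strip(), "%Y-%m-%d %H:%M:%S.%f%z")


# Hardware clock reading copied from the output above.
rtc = parse_hwclock("1969-12-31 18:00:38.270548-0600")
rtc_utc = rtc.astimezone(timezone.utc)
seconds_after_epoch = rtc.timestamp()

# System time from the same output; CDT is assumed to be UTC-5.
cdt = timezone(timedelta(hours=-5))
system = datetime(2020, 3, 24, 14, 41, 14, tzinfo=cdt)
gap = system - rtc

print("RTC in UTC:         ", rtc_utc.isoformat())
print("Seconds after epoch:", round(seconds_after_epoch, 2))
print("System minus RTC:   ", gap.days, "days")
```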

4 hours ago, langerma said:

@goathunter do you have the patched debs somewhere?

i'd love to have 'em

 

so if you would possibly be so kind as to send me a link

 

Link sent. If anyone else wants them, let me know.

 

Hunter


Interesting. I finally took the time to pull down the current Armbian kernel code and build it so I could see what's in there. The post-build arm_arch_timer.c files are identical for my patched 4.19.83 and the sources I got with "git clone" this morning, which means that the patch is included in the current sources.
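
One way to repeat this kind of check is to compare checksums of the two post-build files. The sketch below is not the method used in this post, which does not say how the files were compared. The helper names are hypothetical, and the demo uses throwaway files with made-up contents in place of the two arm_arch_timer.c copies.

```python
import hashlib
import os
import tempfile


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """True if both files have byte-identical contents."""
    return sha256_of(path_a) == sha256_of(path_b)


# Demo with temporary stand-in files (placeholder contents, not kernel code).
with tempfile.TemporaryDirectory() as tmp:
    patched = os.path.join(tmp, "patched.c")
    current = os.path.join(tmp, "current.c")
    other = os.path.join(tmp, "other.c")
    for path, text in ((patched, "A\n"), (current, "A\n"), (other, "B\n")):
        with open(path, "w") as handle:
            handle.write(text)
    identical = same_contents(patched, current)
    different = same_contents(patched, other)

print(identical, different)
```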

 

But I have other issues with 5.4.20, as I documented above, so maybe 5.4.20 does solve the time-jump problem, and I just don't know it because I haven't run it long enough (because of the other problems: kworker processes chewing up CPU time when idle, incorrect CPU temperature readings). 5.3.9 did not solve it, but perhaps the patch wasn't included in that. I haven't tried to figure out how to check that.

 

Update: after a week and a half, all four Pines are fine (two on patched 4.19.83, two running 5.3.9). No time jumps on any of them.


Hello,

 

I hope someone can help. I have 2 A64s, and one of them keeps hitting this problem. I'm not running Armbian, but I hope your solutions can help me.

 

I have no experience upgrading the kernel. Currently, I have a file

/boot/config-4.19.63-sunxi64 with the line

CONFIG_FSL_ERRATUM_A008585=y

 

Could anyone please explain to a noob how to upgrade my kernel to 4.19.83 and apply any necessary patches?  I'd be so grateful!

 

Brian
