martinayotte

A64 date/time clock issue


Let's first recap from the other thread :

 

I found a strange bug and have seen it twice in a week:

 

After a few days of running, my PineA64 had the wrong date, something like "Tue Mar 13 20:50:11 EDT 2153".

Restarting NTP didn't fix it. Trying to set the date manually gives me an error:

root@pine64:~# date -s "2017-02-03 00:00:01"
date: cannot set date: Invalid argument

Running it under "strace" reveals that it cannot write the system clock:

settimeofday({1486098000, 0}, NULL) = -1 EINVAL (Invalid argument)
openat(AT_FDCWD, "/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 3
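
To read the strace line, it can help to decode the first argument of settimeofday, which is seconds since the Unix epoch. A minimal Python sketch follows; the helper name `decode_epoch` is made up for illustration, and the UTC-5 offset is an assumption based on the Eastern time zone shown in the bad date above.

```python
from datetime import datetime, timezone, timedelta

def decode_epoch(seconds, utc_offset_hours=0):
    """Return an ISO 8601 string for a Unix timestamp at a fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(seconds, tz=tz).isoformat()

# Value taken from the strace output above.
print(decode_epoch(1486098000))      # interpreted as UTC
print(decode_epoch(1486098000, -5))  # assumed offset: US Eastern standard time
```

The decoded value is an ordinary date in February 2017, so the argument passed to the kernel looks sane and the EINVAL is unlikely to come from a malformed request.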

Searching the net, I found http://nerdbynature.de/s9y/2009/07/22/cannot-set-date-Invalid-argument and http://www.mail-archive.com/bug-coreutils@gnu.org/msg14103.html, but no real answers other than that it really is the kernel that is f*k*ing up the system clock. And the issue is gone if I simply reboot the board. Really strange ... especially since it has happened only on the PineA64 ...

 

Has anyone else seen a similar issue?

 

And here is Zador's first reply :

 

Is there support for the hardware RTC in mainline (assuming you are testing the mainline)? If yes, then it may be related to wrong RTC settings like external or internal oscillator. If not - then maybe it's related to the arch timer bug? https://github.com/longsleep/linux-pine64/issues/44#issuecomment-263060276
 


I think the kernel just prevents unsafe date/time changes; there are several -EINVAL returns in the date/time changing code.
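
One such check, going by a reading of kernel/time/timekeeping.c (an interpretation, not something confirmed in this thread), refuses a new wall-clock time that is smaller than the current monotonic clock, because the implied boot time would then fall before 1970. If the monotonic clock has jumped forward by more than a century, every realistic date would be rejected. A rough Python model of that idea; `would_kernel_accept` is a hypothetical helper, not a kernel or libc API:

```python
import time

def would_kernel_accept(target_epoch, monotonic_now):
    """Model of one settimeofday() sanity check (an assumption, see text).

    The requested wall time must be non-negative and must not be smaller
    than the seconds already elapsed on the monotonic clock.
    """
    return target_epoch >= 0 and target_epoch >= monotonic_now

# On Linux, time.monotonic() reads CLOCK_MONOTONIC.
mono = time.monotonic()
target = 1486098000  # value from the strace output in this thread
print("monotonic seconds:", mono)
print("would be accepted:", would_kernel_accept(target, mono))
```

On an affected board, a monotonic value far larger than the uptime reported by /proc/uptime would support this explanation; a normal value would point elsewhere.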

 

And my first reply :

 

I was just looking at the DT for this because, if I remember correctly, I didn't have the issue until recently.

I see that the old longsleep DT was using 'sun50i-rtc', while the current one is using 'sun6i-a31-rtc'.

Since I also kept several previous A64 Armbian images, I will look into them too.

(I will also check how often it occurs, because maybe it happens every 24 hrs when a sync service call reads a wrong value. Let's see tomorrow ...)

(Another thing I found: the H3 has /dev/rtc and a clear trace in dmesg, but the H5 and A64 have neither, no trace at all. But why don't I have the same issue with the H5?)

 

Now, here are my latest investigations:

 

Neither the H5 nor the A64 probes the RTC driver successfully, which explains why dmesg doesn't mention it (I hate drivers that are not verbose on failures!)

But why doesn't the H5 react the same way as the A64?

I will have to wait until the issue reappears to dig further in /var/log/syslog, maybe tomorrow!
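
A quick way to confirm whether any RTC driver registered, without relying on dmesg, is to look at sysfs. The sketch below assumes the usual /sys/class/rtc layout with a `name` attribute per device; `list_rtc_devices` is an illustrative helper, not an existing tool.

```python
from pathlib import Path

def list_rtc_devices(sysfs_root="/sys/class/rtc"):
    """Return {device: name} for each RTC registered under sysfs_root."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return {}
    found = {}
    for entry in sorted(root.iterdir()):
        try:
            found[entry.name] = (entry / "name").read_text().strip()
        except OSError:
            # Attribute missing or unreadable: still report the device.
            found[entry.name] = "unknown"
    return found

print(list_rtc_devices() or "no RTC registered")
```

An empty result on the H5 and A64 would match the failed probe described above, while a board like the H3 should list at least rtc0.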

 

BTW, Zador, the -EINVAL from the "date" command only shows up while the symptom is present, not after a reboot... (let's wait until tomorrow to see)

As I said, I suspect a "systemd" compatibility issue here.

 

BTW, each time the issue occurred, the current SSH session was kicked out, probably because the elapsed time was too large, 120 years ... :P

 


BTW, Zador, the -EINVAL from the "date" command only shows up while the symptom is present, not after a reboot... (let's wait until tomorrow to see)

As I said, I suspect a "systemd" compatibility issue here.

Because after the reboot the time goes back to normal from NTP and/or fake-hwclock? And systemd is not the problem here. First, it's the kernel that rejects the date/time change, and second - the systemd issue affects only 32-bit systems.


Maybe I didn't explain it well: ntpdate and the NTP server also hit the same -EINVAL seen with date when the symptoms kick in, but not after a reboot. Something happens, probably at the 24th hour (I will see later today), which makes the symptoms resurface somehow, probably after a systemd or cron job. I will check dmesg/syslog next time ...

But still, why is the PineA64 having this trouble and not the OPiPC2, when they both use the same kernel branch?


Unfortunately, the issue does not recur every 24 hrs ... :(

So, I will have to wait more, maybe 48hrs, or maybe it is even random ... :angry:

 

EDIT: after 48 hrs, the issue hasn't appeared yet, so I guess it is random ... :(
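
Since the failure seems random, one option is to leave a small watcher running that records the moment the clock jumps, so the timestamp can be matched against syslog afterwards. This is only a sketch: `find_clock_jumps` and `watch` are hypothetical helpers, and the 60-second interval and 5-second tolerance are arbitrary choices.

```python
import time

def find_clock_jumps(samples, interval, tolerance=5.0):
    """Return (index, drift) pairs where consecutive wall-clock samples
    differ from the expected polling interval by more than `tolerance`."""
    jumps = []
    for i in range(1, len(samples)):
        drift = (samples[i] - samples[i - 1]) - interval
        if abs(drift) > tolerance:
            jumps.append((i, drift))
    return jumps

def watch(interval=60.0, log=print):
    """Poll the wall clock forever and log any jump (runs until killed)."""
    previous = time.time()
    while True:
        time.sleep(interval)
        current = time.time()
        for _, drift in find_clock_jumps([previous, current], interval):
            log(f"clock jumped by {drift:.0f} s, now {time.ctime(current)}")
        previous = current

# Offline example with made-up samples: the last reading is far in the future.
print(find_clock_jumps([0, 60, 120, 4300000180], 60))
```

Calling `watch()` from a screen or nohup session and redirecting its output to a file would give a last-known-good time to compare with /var/log/syslog.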


OK! I have probably found out why: the defconfigs for both next and dev don't have CONFIG_FSL_ERRATUM_A008585=y, but we had it when we were on 4.11.x ...

But, of course, only testing for 24/48/72 hrs will prove that it is really fixed now!
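
To verify which kernels actually carry the option, the running kernel's configuration can be inspected directly. The sketch below assumes the config is exposed through /proc/config.gz (which needs CONFIG_IKCONFIG_PROC) or a /boot/config-<release> file; both helper names are illustrative.

```python
import gzip
import os
from pathlib import Path

def config_option_state(config_text, option):
    """Return the option's value ('y', 'm', ...) or None if unset or absent."""
    prefix = option + "="
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            return line[len(prefix):]
    return None

def read_running_kernel_config():
    """Best effort: /proc/config.gz first, then /boot/config-<release>."""
    try:
        proc = Path("/proc/config.gz")
        if proc.is_file():
            return gzip.decompress(proc.read_bytes()).decode()
        boot = Path("/boot") / f"config-{os.uname().release}"
        if boot.is_file():
            return boot.read_text()
    except OSError:
        pass  # unreadable config: fall through to "not found"
    return ""

print(config_option_state(read_running_kernel_config(),
                          "CONFIG_FSL_ERRATUM_A008585"))
```

A result of `y` means the workaround is built in; `None` means the option is unset or the config could not be found, which is the state described for the next and dev defconfigs.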

 
