martinayotte

A64 date/time clock issue


Let's first recap from the other thread :

 

I found a strange bug and have seen it twice in a week :

 

After a few days of running, my PineA64 had the wrong date, something like "Tue Mar 13 20:50:11 EDT 2153".

Restarting NTP didn't fix it. Trying to set the date manually gives me an error :

root@pine64:~# date -s "2017-02-03 00:00:01"
date: cannot set date: Invalid argument

Running "strace" on it reveals that it cannot write the system clock :

settimeofday({1486098000, 0}, NULL) = -1 EINVAL (Invalid argument)
openat(AT_FDCWD, "/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 3
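
The same failure can be observed without going through `date`. Below is a minimal Python sketch, assuming Python 3 on Linux. The helper name `try_set_clock` and its `setter` parameter are made up for illustration; `time.clock_settime` is the standard-library call that sets CLOCK_REALTIME, the same clock that settimeofday() changes.

```python
import errno
import time


def try_set_clock(epoch_seconds, setter=None):
    """Try to set CLOCK_REALTIME.

    Returns None on success, or the symbolic errno name (for example
    "EINVAL" or "EPERM") if the kernel refuses the change.
    `setter` is a hypothetical hook so the logic can be tried without
    touching the real clock.
    """
    if setter is None:
        def setter(value):
            # Real call: needs root (CAP_SYS_TIME), otherwise it fails with EPERM.
            time.clock_settime(time.CLOCK_REALTIME, value)
    try:
        setter(float(epoch_seconds))
    except OSError as exc:
        # Map the numeric errno to its name; fall back to the raw number.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None
```

Calling `try_set_clock(1486098000)` as root really changes the system time (1486098000 is the value from the strace output above), so it should only be run on a board where that is acceptable. On a board showing the symptom, the assumption is that it would report "EINVAL" just like `date -s` does.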

Searching the net, I found http://nerdbynature.de/s9y/2009/07/22/cannot-set-date-Invalid-argument and http://www.mail-archive.com/bug-coreutils@gnu.org/msg14103.html, but no real answers other than that it is really the kernel that is f*k*ing up the system clock. And the issue is gone if I simply reboot the board. Really strange ... Especially since it has happened only on PineA64 ...

 

Has anyone else got a similar issue ?

 

And here is Zador's first reply :

 

Is there support for the hardware RTC in mainline (assuming you are testing the mainline)? If yes, then it may be related to wrong RTC settings like external or internal oscillator. If not - then maybe it's related to the arch timer bug? https://github.com/longsleep/linux-pine64/issues/44#issuecomment-263060276
 


I think kernel just prevents unsafe date/time changes, there are several -EINVAL returns in the date/time changing code.

 

And my first reply :

 

I was just looking at the DT about this because, if I remember correctly, I didn't have the issue until recently.

I see that the old longsleep DT was using 'sun50i-rtc' and the current one is using 'sun6i-a31-rtc'.

Since I also kept several previous A64 armbian images, I will look into them too.

(I will also check how frequently it occurs, because maybe it happens every 24hrs when a wrong value is read by a sync service call. Let's see tomorrow ...)

(Another thing I found : the H3 has /dev/rtc and a clear trace in dmesg, but H5 and A64 have no trace at all. But why don't I have the same issue with H5 ?)

 

Now, here are my latest investigations :

 

On both H5 and A64 the RTC driver doesn't probe successfully, which explains why dmesg doesn't mention it (I hate drivers which are not verbose on failures !)

But why doesn't H5 react the same way as A64 ?

I will have to wait until the issue re-appears to dig further in /var/log/syslog, maybe tomorrow !

 

BTW, Zador, the -EINVAL seen with the "date" command is only shown when the symptom occurs, not after reboot... (let's wait until tomorrow to see)

As I said, I suspect a "systemd" compatibility issue here.

 

BTW, each time the issue occurred, the current SSH session got kicked out, probably because the elapsed time was too large, 120 years ... :P

 


BTW, Zador, the -EINVAL seen with the "date" command is only shown when the symptom occurs, not after reboot... (let's wait until tomorrow to see)

As I said, I suspect a "systemd" compatibility issue here.

Because after the reboot the time goes back to normal from NTP and/or fake-hwclock? And systemd is not the problem here. First, it's the kernel that rejects the date/time change, and second, the systemd issue affects only 32-bit systems.


Maybe I didn't explain it well : ntpdate and the ntp server also hit the same -EINVAL seen with date when the symptoms kick in, but not after reboot. Something happens, probably at the 24th hour (I will see later today), which makes the symptoms reappear somehow, probably after a systemd or cron job. I will check dmesg/syslog next time ...

But still, why is PineA64 having that trouble and not OPiPC2, when they are both using the same kernel branch ?


Unfortunately, the occurrence is not every 24hrs ... :(

So, I will have to wait longer, maybe 48hrs, or maybe it is even random ... :angry:

 

EDIT : after 48hrs, the issue hasn't appeared yet, so I guess it is random ... :(
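
Since the occurrence looks random, one way to catch the exact moment is to sample the wall clock at a fixed interval and flag any step that is far from that interval. The following is only a sketch in Python 3; the function names, the 60-second interval and the 5-second tolerance are assumptions chosen for illustration, and since `time.sleep` itself depends on kernel timers the measured steps are approximate. A legitimate NTP step would be flagged as well.

```python
import time


def find_clock_jumps(wall_samples, interval, tolerance=5.0):
    """Return (index, step) pairs where consecutive wall-clock samples
    differ from the nominal sampling interval by more than `tolerance` seconds."""
    jumps = []
    for i in range(1, len(wall_samples)):
        step = wall_samples[i] - wall_samples[i - 1]
        if abs(step - interval) > tolerance:
            jumps.append((i, step))
    return jumps


def sample_wall_clock(count, interval):
    """Collect `count` wall-clock readings, sleeping `interval` seconds between them."""
    samples = []
    for n in range(count):
        samples.append(time.time())
        if n < count - 1:
            time.sleep(interval)
    return samples
```

For example, `find_clock_jumps(sample_wall_clock(1440, 60), 60)` would cover roughly one day; writing each flagged sample to a file with a timestamp gives something to correlate with /var/log/syslog afterwards.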


Ok ! I probably found out why : the defconfigs for both next and dev don't have CONFIG_FSL_ERRATUM_A008585=y, but we had it when we were on 4.11.x ...

But, of course, only testing for 24/48/72 hrs will prove that it is now really fixed !

 

6 hours ago, gchain said:

did you fix the problem?

It was fixed a while ago by having CONFIG_FSL_ERRATUM_A008585 in the kernel defconfig ...

6 hours ago, gchain said:

I have a similar issue

Which image/kernel are you using ?

 


Sure, here is the result

 

grep ERRATUM /boot/config*
CONFIG_ARM64_ERRATUM_826319=y
CONFIG_ARM64_ERRATUM_827319=y
CONFIG_ARM64_ERRATUM_824069=y
CONFIG_ARM64_ERRATUM_819472=y
# CONFIG_ARM64_ERRATUM_832075 is not set
CONFIG_ARM64_ERRATUM_845719=y
CONFIG_ARM64_ERRATUM_843419=y
CONFIG_ARM64_ERRATUM_1024718=y
# CONFIG_CAVIUM_ERRATUM_22375 is not set
CONFIG_CAVIUM_ERRATUM_23144=y
# CONFIG_CAVIUM_ERRATUM_23154 is not set
# CONFIG_CAVIUM_ERRATUM_27456 is not set
# CONFIG_CAVIUM_ERRATUM_30115 is not set
# CONFIG_QCOM_FALKOR_ERRATUM_1003 is not set
# CONFIG_QCOM_FALKOR_ERRATUM_1009 is not set
# CONFIG_QCOM_QDF2400_ERRATUM_0065 is not set
CONFIG_HISILICON_ERRATUM_161600802=y
CONFIG_QCOM_FALKOR_ERRATUM_E1041=y
CONFIG_FSL_ERRATUM_A008585=y
# CONFIG_HISILICON_ERRATUM_161010101 is not set
# CONFIG_ARM64_ERRATUM_858921 is not set
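
The same check can be scripted for a single option instead of reading the grep output by eye. This is a small Python 3 sketch; `kconfig_state` is a made-up helper name, and it only understands the two line forms visible in the output above (`CONFIG_X=value` and `# CONFIG_X is not set`).

```python
import re


def kconfig_state(config_text, option):
    """Return the value of `option` ('y', 'm', ...), 'not set' if it is
    explicitly disabled, or None if the option does not appear at all."""
    set_re = re.compile(r"^" + re.escape(option) + r"=(.*)$")
    unset_line = "# " + option + " is not set"
    for line in config_text.splitlines():
        line = line.strip()
        match = set_re.match(line)
        if match:
            return match.group(1)
        if line == unset_line:
            return "not set"
    return None
```

Feeding it the contents of the running kernel's config file (on Debian-style images this is usually /boot/config- followed by the kernel release, which is an assumption about the image layout) and asking for CONFIG_FSL_ERRATUM_A008585 should return 'y' for the output shown above.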

 

1 hour ago, Kalin said:

Sure, here is the result

Oh ! The CONFIG_FSL_ERRATUM_A008585 is present ...

Then, I don't know ... For me, I've never faced the issue again since that config was added more than 2 years ago ...


I am seeing the same issue on a Pine64+, with 4.19.y, 4.20.y, and 5.0.y kernels.  After some time of running OK, setting the date by any means fails as above.  I also noticed that the actual date on the system suddenly goes way off (as in year 2114, for example).

 

I've added this to my list in trying to get my pine64 back as an authoritative NTP server.


Thanks.  This looks to be a busy weekend for me but I would like to start working on these a little deeper.

 

There do seem to be timer issues post-4.14.y and I haven't yet determined if that could be due to missing kernel configs or due to underlying kernel code changes.  The former shouldn't be too hard to find... just tedious, but the latter might be more involved.

14 minutes ago, windysea said:

There do seem to be timer issues post-4.14.y

The timer issue has been present since the A64 SoC was introduced, it is a silicon bug !

If you look at the first post of this thread, I faced it for the first time 2 years ago, in Feb 2017, so it was 4.10.y at that time ...

If people are still facing the issue even after a few workarounds, maybe those don't cover all cases ...

 


I'm seeing other issues, such as not being able to configure a non-tickless kernel.  If anything, a constant-rate (non-tickless) kernel should be more stable, but post-4.14.y it is wildly unstable.

