Posted
On 8/20/2019 at 2:32 PM, martinayotte said:

Do you have "CONFIG_SUN50I_ERRATUM_UNKNOWN1=y" in /boot/config-4.19.63-sunxi64 ?

 

 

I followed all the threads and checked all the config settings; I have this one and all the others you usually ask about enabled.

 

And for the record, it happened again this morning.

 

Posted
3 hours ago, mrstone78 said:

I have this one and all the others you usually ask about enabled.

Strange ...

I have a Pine64, an OPiWin and an OlinuxinoA64, and none of them has shown the issue for months ...

Maybe @smaeul can give more details, since he is the author of the latest patch?

 

Posted

I got tired of trying to clean up after the Pine A64/Armbian clock skips and shut off the board for a few months.

 

Then I came across this Linux 5.3 patch for the Allwinner clock:

https://github.com/torvalds/linux/commit/c950ca8c35eeb32224a63adc47e12f9e226da241

 

The stock 5.3-rc5 kernel built on the Pine64 and installed without much trouble. It's been a day and a half under fairly heavy load (a heavy load would increase the frequency of clock skips) with no skips yet, so I am cautiously optimistic.

 

The kernel config flag for the new clock fix (already set in the tarball below) is:

CONFIG_SUN50I_ERRATUM_UNKNOWN1=y

 

If someone wants to try this on their own board without having to compile the kernel, I tarred up the kernel, modules, and /boot bits.

 

https://donp.org/linux-5.3-rc5-pinea64.tar.xz

 

By moving the /boot symlinks, the board should boot into this kernel:

 

lrwxrwxrwx  1 root root       11 Aug 21  2116 dtb -> dtb-5.3-rc5
lrwxrwxrwx  1 root root       17 Oct 13  2114 Image -> vmlinuz-5.3.0-rc5
lrwxrwxrwx  1 root root       17 Oct 13  2114 uInitrd -> uInitrd-5.3.0-rc5

 

Haha, you can see from the timestamps that the symlinks were created after a clock skip.
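
For anyone who would rather script the switch than move the links by hand, here is a minimal sketch. The function name switch_boot_links is made up for illustration; the link and target names are the ones in the listing above. It refuses to change anything unless all three targets exist, which should catch a typo before the links are touched. Run it as root and keep a note of the old link targets so you can switch back.

```shell
# Repoint the three /boot symlinks at another kernel's files.
# usage: switch_boot_links BOOT_DIR DTB_TARGET IMAGE_TARGET UINITRD_TARGET
# (hypothetical helper, not part of Armbian)
switch_boot_links() {
    boot_dir=$1
    # Check every target first so a missing file leaves the old links alone.
    for target in "$2" "$3" "$4"; do
        [ -e "$boot_dir/$target" ] || {
            echo "missing: $boot_dir/$target" >&2
            return 1
        }
    done
    # -s: symbolic link, -f: replace an existing link,
    # -n: replace a link that points at a directory instead of descending into it
    ln -sfn "$2" "$boot_dir/dtb"
    ln -sfn "$3" "$boot_dir/Image"
    ln -sfn "$4" "$boot_dir/uInitrd"
}

# With the tarball unpacked, the call would look like this:
#   switch_boot_links /boot dtb-5.3-rc5 vmlinuz-5.3.0-rc5 uInitrd-5.3.0-rc5
```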

 

Don

 

Posted
21 hours ago, Don Park said:

Then I came across this Linux 5.3 patch for the Allwinner clock:

As you can see in this commit, it was originally introduced during v5.1-rc1 in January.

It is exactly the same patch that Armbian has been using for months ...

Posted
On 8/22/2019 at 2:08 PM, martinayotte said:

I have a Pine64, an OPiWin and an OlinuxinoA64, and none of them has shown the issue for months ...

 

What can I say, lucky you :)

 

 

FTR, it just happened again. 

In the meantime I had updated the kernel to 5.3.0-rc3-sunxi64.

 

Posted

Hi guys!
Facing the same issue :)
 

$ date  
Thu Oct 25 18:17:22 UTC 2114

$ uname -a
Linux pine64so-node6 4.19.57-sunxi64 #5.90 SMP Fri Jul 5 18:38:49 CEST 2019 aarch64 aarch64 aarch64 GNU/Linux

$ grep CONFIG_SUN50I_ERRATUM_UNKNOWN1 /boot/config-4.19.57-sunxi64 
CONFIG_SUN50I_ERRATUM_UNKNOWN1=y
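
The size of the jump is a clue in itself. The comment in the patch posted further down in this thread mentions a 24 MHz counter. Assuming the counter is 56 bits wide (a recollection of the upstream commit message, not something stated in this thread), one full wrap comes out at roughly 95 years, which would turn 2019 into 2114. A quick sanity check of that arithmetic, with wrap_years being a name made up for this sketch:

```shell
# Whole years covered by 2^BITS ticks of a counter running at HZ ticks per second.
# Integer arithmetic only; a year is taken as 365 days.
# usage: wrap_years BITS HZ
wrap_years() {
    bits=$1
    hz=$2
    seconds=$(( (1 << bits) / hz ))
    echo $(( seconds / 86400 / 365 ))
}

# 56-bit width is an assumption; 24 MHz comes from the patch comment.
wrap_years 56 24000000   # prints 95
```

Two such wraps from 2019 would land in 2209, a year that also shows up in this thread.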

 

Posted

I switched to 5.3.0-rc3, but it did not help. The clock skipped again after about 8 days.

 

Welcome to Debian Stretch with Armbian Linux 5.3.0-rc3-sunxi64
System load:   0.82 0.73 0.70  	Up time:       24855 days	

# date
Sat Dec 16 19:19:42 -03 2209

 

Posted
12 hours ago, ams.br said:

I switched to 5.3.0-rc3, but it did not help. The clock skipped again after about 8 days.

 



 

It was the same for me.

Posted

I compiled the kernel (branch next, 4.19.y) with a patch based on https://github.com/arctype-co/linux/commit/5fcb4e57eeaa4d670ef4acf5818c6fe16aa0d3d0

 

Strikes me as a more stable workaround, but I'm still evaluating.

 

 

Patch:

--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -328,16 +328,17 @@
  * number of CPU cycles in 3 consecutive 24 MHz counter periods.
  */
 #define __sun50i_a64_read_reg(reg) ({                                  \
-       u64 _val;                                                       \
-       int _retries = 150;                                             \
+       u64 _old, _new;                                                 \
+       int _retries = 200;                                             \
                                                                        \
        do {                                                            \
-               _val = read_sysreg(reg);                                \
+               _old = read_sysreg(reg);                                \
+                _new = read_sysreg(reg);                               \
                _retries--;                                             \
-       } while (((_val + 1) & GENMASK(9, 0)) <= 1 && _retries);        \
+       } while (unlikely(_old != _new) && _retries);                   \
                                                                        \
        WARN_ON_ONCE(!_retries);                                        \
-       _val;                                                           \
+       _new;                                                           \
 })
 
 static u64 notrace sun50i_a64_read_cntpct_el0(void)

 

Posted
On 9/11/2019 at 9:15 PM, ams.br said:

I compiled the kernel (branch next, 4.19.y) with a patch based on https://github.com/arctype-co/linux/commit/5fcb4e57eeaa4d670ef4acf5818c6fe16aa0d3d0

 

Strikes me as a more stable workaround, but I'm still evaluating.

 

 


After 14 days, the date remains stable. The patch apparently worked.

Posted
2 hours ago, langerma said:

How can I compile this into my kernel? Or are there any packages?

 

I would also be very happy if I could integrate the patch into my current image!

Posted
16 hours ago, langerma said:

How can I compile this into my kernel? Or are there any packages?

 

On a Linux machine (Armbian recommends Ubuntu), download the latest Armbian build sources:

git clone https://github.com/armbian/build.git

Copy the patch file arm_arch_timer.patch to the folder userpatches/kernel/sunxi-next (more details in the documentation), then run the compile script:

./compile.sh

Choose "U-boot and kernel packages" and apply.

Choose "Do not change the kernel configuration" for a standard kernel, or "Show a kernel configuration menu before compilation" to customize your kernel (recommended for advanced users), and apply.

Choose your board.

Choose "next".

Apply, then have a coffee and wait. After compilation, the deb packages are generated in the folder output/debs. Copy them to your Armbian board and run dpkg -i *.deb

More details at https://docs.armbian.com/Developer-Guide_Build-Preparation/
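
The manual steps above can also be scripted. A minimal sketch follows. stage_user_patch is a hypothetical helper, not an Armbian tool. The non-interactive compile.sh parameters in the comment are borrowed from a later post in this thread, with BRANCH changed to next and BOARD=pine64 assumed for a Pine A64, so check them against the Armbian documentation for your board.

```shell
# Copy a kernel patch into an armbian/build checkout so compile.sh picks it up.
# usage: stage_user_patch PATCH_FILE BUILD_DIR [FAMILY_DIR]
stage_user_patch() {
    patch_file=$1
    build_dir=$2
    family=${3:-sunxi-next}   # folder name taken from the steps above
    [ -f "$patch_file" ] || {
        echo "no such patch: $patch_file" >&2
        return 1
    }
    dest="$build_dir/userpatches/kernel/$family"
    mkdir -p "$dest" && cp "$patch_file" "$dest/"
}

# Typical sequence (not run here; needs network access and a build host):
#   git clone https://github.com/armbian/build.git
#   stage_user_patch arm_arch_timer.patch build
#   cd build
#   ./compile.sh BOARD=pine64 BRANCH=next KERNEL_ONLY=yes KERNEL_CONFIGURE=no
#   # copy output/debs/*.deb to the board, then run: dpkg -i *.deb
```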

arm_arch_timer.patch

Posted
On 10/4/2019 at 4:25 AM, langerma said:

@ams.br seems like your patch works!

I was able to successfully build with the patch this time as well. Been running solid since! Can we get this patch submitted into official builds?

Posted
On 10/4/2019 at 11:25 AM, langerma said:

@ams.br seems like your patch works!

 

1 hour ago, markolonius said:

I was able to successfully build with the patch this time as well. Been running solid since! Can we get this patch submitted into official builds?

Same here, running stable for 8 days. Before the patch, it usually lasted 1 or 2 days.

Posted

Thank you ams.br for the solution.

I run 7 pine64so modules on the clusterboard, and two of the SoCs had this problem. I would get the error almost immediately under heavy CPU usage.

Now my test SoC has been running for 20 hours straight at 99% CPU usage (SETI@home) and the date is still in 2019.

Cool stuff. Thank you once again.

Posted

@ams.br Allow me to join the chorus. Thank you for posting your easy-to-follow instructions. I was able to build the current Armbian kernel with your patch and install it without any issue at all. The system has been up 1 day and 11 hours. I finally took the step of installing your patch after the system time-jumped twice over the weekend.

 

I can't say for sure yet that it fixes the problem for me, but I can say that your directions made it a piece of cake to build and install, and so far, so good.


Thanks!

 

Updated 2019-11-18: My patched Pine has been running for 7 days and 12 hours now without issue, which is the longest it has ever run without experiencing the time jump.

Posted

I haven't had much time to follow up on this.

 

FWIW, the upstream patches should already include an improved version of the patch above. To clarify, it is not a fix but rather a loop that reads the hardware timer repeatedly until two consecutive reads return the same value. Its overhead differs from that of the fix it replaces; for instance, there must always be at least two reads.

 

I am wondering about the solution above being replaced, since it uses only a ten-bit mask (GENMASK(9, 0) covers bits 0 to 9), while it was found that this should probably be 11 bits.

 

I forget why that solution was not used as-is upstream, but I do remember a concern that two reads close enough together could return the same value before the hardware timer actually changed (i.e. the same bad value could be read twice in a row).
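
To make the difference between the two checks concrete, here is a rough sketch. It is plain shell arithmetic on made-up sample values, not kernel code. needs_reread mirrors the condition of the loop being replaced (GENMASK(9, 0) is the low ten bits, i.e. 1023), and first_stable_pair mirrors the patch posted in this thread, with its arguments standing in for successive read_sysreg() results. Both function names are hypothetical.

```shell
# Mask-based check: succeed (meaning "read again") when the low ten bits
# of the value are all ones or all zeros.
# (val + 1) & 1023 is 0 for all ones and 1 for all zeros.
needs_reread() {
    [ $(( ($1 + 1) & 1023 )) -le 1 ]
}

# Two-read check: take samples in pairs, as the patched loop does,
# and print the first value that was read twice in a row.
# Fails if no pair agrees.
first_stable_pair() {
    while [ $# -ge 2 ]; do
        if [ "$1" = "$2" ]; then
            echo "$2"
            return 0
        fi
        shift 2
    done
    return 1
}

needs_reread 1023 && echo "1023: read again"    # low ten bits all ones
needs_reread 1024 && echo "1024: read again"    # low ten bits all zeros
needs_reread 1025 || echo "1025: accepted"
first_stable_pair 5 6 7 7                       # prints 7
```

Note that first_stable_pair 999 999 returns 999 whether or not 999 is a sane counter value, which is exactly the concern above: two matching reads cannot tell a repeated bad value from a good one.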

 

Now that we're a few versions newer than the last time I checked, it probably warrants another review. FWIW, I haven't seen the issue with the more recent kernels, while previously I had one A64 that couldn't get through half a day.

 

Posted

I've just installed the latest Armbian Buster today, updated, switched to the latest 5.4.2 kernel via armbian-config and installed k3s (a minimal k8s distro) on it, and after a few minutes: 2113!

I will try the attached patch, but for the record, for me it is still unresolved in 5.4.2.

Posted
11 minutes ago, eminguez said:

Is there a nightly builds kernel repo?


There are. See armbian-config and remember that those builds are not tested.

Posted
4 hours ago, Igor said:


There are. See armbian-config and remember that those builds are not tested.

I've updated to the latest one (19.11.3.346) and I can confirm the fix is not there yet (judging by the commit time and the build time, it probably didn't make it into the build). To check, I downloaded the sources (https://apt.armbian.com/pool/main/l/linux-source-5.4.2-dev-sunxi64/linux-source-dev-sunxi64_19.11.3.346_all.deb) and the patch doesn't seem to be there. It would be nice if the debs had a changelog, or if the changes were shown somewhere.

I'll wait for the next build :)

Thanks!!!

Posted

I've tried to build the kernel with the current sources but it seems to fail to apply the patch. From the log file:

Processing file /root/build/patch/kernel/sunxi-dev/fix-a64-timejump.patch
1 out of 1 hunk FAILED -- saving rejects to file drivers/clocksource/arm_arch_timer.c.rej

It is just a clean git clone from the armbian/build repository and the following build flags:

./compile.sh BOARD=pine64 BRANCH=dev KERNEL_ONLY=yes KERNEL_CONFIGURE=no

The rej file:

# find ~/build/cache/sources -type f -iname "*arm_arch_timer.c.rej*"
/root/build/cache/sources/linux-mainline/orange-pi-5.4/drivers/clocksource/arm_arch_timer.c.rej

# cat /root/build/cache/sources/linux-mainline/orange-pi-5.4/drivers/clocksource/arm_arch_timer.c.rej
--- drivers/clocksource/arm_arch_timer.c
+++ drivers/clocksource/arm_arch_timer.c
@@ -342,16 +342,17 @@ static u64 notrace arm64_858921_read_cntvct_el0(void)
  * number of CPU cycles in 3 consecutive 24 MHz counter periods.
  */
 #define __sun50i_a64_read_reg(reg) ({					\
-	u64 _val;							\
-	int _retries = 150;						\
+	u64 _old, _new;                                                 \
+	int _retries = 200;						\
 									\ 									\
 	do {								\
-		_val = read_sysreg(reg);				\
+		_old = read_sysreg(reg);				\
+		_new = read_sysreg(reg);				\
 		_retries--;						\
-	} while (((_val + 1) & GENMASK(9, 0)) <= 1 && _retries);	\
+	} while (unlikely(_old != _new) && _retries);			\
 									\
 	WARN_ON_ONCE(!_retries);					\
-	_val;								\
+	_new;								\
 })
 
 static u64 notrace sun50i_a64_read_cntpct_el0(void)

IDK why this file is hosted in the `orange-pi-5.4` folder.

 

Just in case, I also extracted the source from the deb file, and the code there is not patched.
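
A quick way to tell which variant of the read loop a source tree contains is to grep for the loop condition. This is only fixed-string matching against the two conditions shown in the diff above, so treat the answer as a hint rather than proof; timer_loop_variant is a made-up name.

```shell
# Report which variant of the A64 read loop a source file appears to contain.
# usage: timer_loop_variant path/to/drivers/clocksource/arm_arch_timer.c
timer_loop_variant() {
    if grep -qF '_old != _new' "$1"; then
        echo two-read    # loop that waits for two identical consecutive reads
    elif grep -qF 'GENMASK(9, 0)' "$1"; then
        echo mask        # loop that re-reads based on the low ten bits
    else
        echo unknown
    fi
}

# Example, using the path from the log above:
#   timer_loop_variant /root/build/cache/sources/linux-mainline/orange-pi-5.4/drivers/clocksource/arm_arch_timer.c
```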

 

Any help? Thanks!!!
