jock Posted April 14, 2020 Posted April 14, 2020 I'm experience a really weird problem on a very experimental setup: the system is an Armbian build for an unsupported board (Rockchip RK322x) and Ubuntu Focal 20.04 with mainline kernel 5.6.4. The system has issues to boot when enters into systemd: it is slow but all of sudden it goes farther. Many different systemd services are complaining, expecially the journal and the chrony NTP client. Finally I found that the root of the problems is the system time which is abruptly moving back and forth even by months. Subsequent calls to date may report 20 June, then 4 March, then 18 August and so on... The kernel log is fine and, most of all, the very same kernel works perfectly with Debian Buster 10 and Ubuntu Eoan 19.10. Also Ubuntu Focal 20.04 works fine using the legacy rockchip 4.4.194 kernel. I also booted from a clean Ubuntu Eoan 19.10 and then upgraded to Focal via apt; the problem manifested the same way, but I got some logging errors telling me that an assertion failed on clock_gettime() system call, which does not returned 0 as expected. All the installations were Armbian Minimal, so no GUI or advanced services whatsoever. Any ideas?
Igor Posted April 14, 2020 Posted April 14, 2020 3 hours ago, jock said: Any ideas? https://armbian.atlassian.net/browse/AR-211 Its related but I don't know why this is happening ...
Werner Posted April 14, 2020 Posted April 14, 2020 Unable to reproduce on OPi1+ with 5.6.3 and Focal Desktop running. RK related?
jock Posted April 14, 2020 Author Posted April 14, 2020 Possibly, I just build for rk3288 and kernel 5.5.17: I don't have the system time jumping around, but still many systemd services have issues, don't start or keep failing. e2scrub_all and fstrim seem to be particularly affected. Running systemctl alone to show the services often instantly fails with "Failed to read server status: Connection timed out". edit: I correct myself, system time is jumping around also on rk3288
Werner Posted April 14, 2020 Posted April 14, 2020 Both state for me "loaded active waiting" so I guess I am unaffected. Other service looking good too, nothing fancy.
jock Posted April 14, 2020 Author Posted April 14, 2020 I made a strace of date utility and got this: Spoiler paolo@xt-q8l-v10:~$ strace date execve("/usr/bin/date", ["date"], 0xbef816a0 /* 24 vars */) = 0 brk(NULL) = 0x1b8f000 mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6f50000 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3 fstat64(3, {st_mode=S_IFREG|0644, st_size=58027, ...}) = 0 mmap2(NULL, 58027, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb6f1a000 close(3) = 0 openat(AT_FDCWD, "/lib/arm-linux-gnueabihf/libc.so.6", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3 read(3, "\177ELF\1\1\1\3\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\331\252\1\0004\0\0\0"..., 512) = 512 fstat64(3, {st_mode=S_IFREG|0755, st_size=972500, ...}) = 0 mmap2(NULL, 1038540, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb6e1c000 mprotect(0xb6f04000, 65536, PROT_NONE) = 0 mmap2(0xb6f14000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xe8000) = 0xb6f14000 mmap2(0xb6f18000, 6348, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb6f18000 close(3) = 0 set_tls(0xb6f50e10) = 0 mprotect(0xb6f14000, 8192, PROT_READ) = 0 mprotect(0x4db000, 4096, PROT_READ) = 0 mprotect(0xb6f52000, 4096, PROT_READ) = 0 munmap(0xb6f1a000, 58027) = 0 brk(NULL) = 0x1b8f000 brk(0x1bb0000) = 0x1bb0000 openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3 fstat64(3, {st_mode=S_IFREG|0644, st_size=3035952, ...}) = 0 mmap2(NULL, 2097152, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb6c1c000 mmap2(NULL, 2596864, PROT_READ, MAP_PRIVATE, 3, 0x6c000) = 0xb69a2000 close(3) = 0 syscall_0x193(0, 0xbeac5438, 0xbeac5438, 0xfe354, 0, 0xbeac5438) = 0 openat(AT_FDCWD, "/etc/localtime", O_RDONLY|O_CLOEXEC) = 3 fstat64(3, {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 fstat64(3, {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\1\0\0\0\0"..., 4096) = 118 _llseek(3, -62, [56], SEEK_CUR) = 0 read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\1\0\0\0\0"..., 4096) = 62 close(3) = 0 fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0), ...}) = 0 write(1, "Tue 14 Apr 2020 08:08:29 PM UTC\n", 32Tue 14 Apr 2020 08:08:29 PM UTC ) = 32 close(1) = 0 close(2) = 0 exit_group(0) = ? +++ exited with 0 +++ Now I don't see any clock_gettime, but there is a syscall_0x193, of which I found some references here. I'm compiling a kernel with vDSO disabled just to see what happens. It could be it is affecting 32 bit arm (rk322x and rk3288), but not 64 bit arm (@Werner Opi1+)
Werner Posted April 15, 2020 Posted April 15, 2020 I can test this later this day on an OPi One which features the Allwinner H3 32bit SoC.
jock Posted April 15, 2020 Author Posted April 15, 2020 Both rk3288 and rk322x now works fine with kernels (5.5 for rk3288 and 5.6 for rk322x) compiled without VDSO. Chrony still is not able to start, so it is not related to this one: uninstalling it via apt, systemd automatically resorted to systemd-timesyncd which immediately synced with remote NTP server.
Werner Posted April 15, 2020 Posted April 15, 2020 sunxi seems to be fine as well. No weird stuff happening on my OrangePi One with Focal on Linux 5.6.4
jock Posted May 2, 2020 Author Posted May 2, 2020 Cannot find any traces on the internet of public discussion, but my source says that this is an issue that people from glibc are trying to resolve, and it seems to affect all 32 bit arm systems. The culprit seems to be searched in the architected system timer initialization: if done by kernel causes the problem, done by ATF works ok.
Recommended Posts