kcn

systemd 100% CPU on OrangePi One

Hi!

Periodically, about once every two or three days, CPU utilization rises sharply and the system becomes dysfunctional.

pione_cpu-day4.png
The SD card and power supply may not be the best, but they work fine in other systems, and I have tried others.
SSH hangs on login while the problem is occurring, but I have a serial connection, so I can investigate in real time.
Please advise what to look at when this happens again.

top - 03:02:15 up 24855 days,  3:14,  1 user,  load average: 1.75, 1.77, 1.93
Tasks: 125 total,   3 running,  70 sleeping,   0 stopped,   8 zombie
%Cpu(s): 17.4 us, 26.6 sy,  0.0 ni, 56.1 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :   505152 total,     9724 free,   144028 used,   351400 buff/cache
KiB Swap:  1301132 total,  1299340 free,     1792 used.   343236 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
    1 root      20   0   26012   3040   1304 R 100.0  0.6  46:55.28 systemd
  851 root      20   0    6128   1368    948 R  75.2  0.3  34:13.74 systemd-lo+
 8668 root      20   0    7044   2608   2136 R   1.3  0.5   0:00.15 top
    7 root      20   0       0      0      0 S   0.3  0.0  14:28.83 ksoftirqd/0
    8 root      20   0       0      0      0 I   0.3  0.0   1:21.03 rcu_sched
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.05 kthreadd

root@pione:/media/data/log# systemctl reboot
Failed to set wall message, ignoring: Connection timed out
Failed to reboot system via logind: Connection timed out
Failed to start reboot.target: Connection timed out
See system logs and 'systemctl status reboot.target' for details.
root@pione:/media/data/log#
root@pione:/media/data/log# systemctl status reboot.target
Failed to get properties: Connection timed out
root@pione:/media/data/log#
root@pione:/media/data/log# reboot
root@pione:/media/data/log#
root@pione:/media/data/log# reboot -f
Failed to read reboot parameter file: No such file or directory
Rebooting.
[141846.733884] reboot: Restarting system
▒▒▒▒▒▒
U-Boot SPL 2018.05-armbian (Aug 19 2018 - 17:07:52 +0200)
DRAM: 512 MiB
Trying to boot from MMC1

System log:

Dec 06 13:00:01 pione CRON[8463]: pam_unix(cron:session): session opened for use
Dec 06 13:00:01 pione CRON[8464]: (root) CMD (/usr/lib/armbian/armbian-truncate-
Dec 06 13:05:01 pione CRON[8470]: pam_unix(cron:session): session opened for use
Dec 06 13:05:01 pione CRON[8471]: (root) CMD (command -v debian-sa1 > /dev/null
Dec 06 13:05:01 pione CRON[8470]: pam_unix(cron:session): session closed for use
Dec 20 02:17:05 pione kernel: INFO: rcu_sched detected stalls on CPUs/tasks:
Dec 20 02:17:05 pione kernel:         0-...: (1 GPs behind) idle=92e/1/0 softirq
Dec 20 02:17:05 pione kernel:         (detected by 2, t=116210967 jiffies, g=159
Dec 20 02:17:05 pione kernel: Sending NMI from CPU 2 to CPUs 0:
Dec 20 02:17:05 pione kernel: NMI backtrace for cpu 0
Dec 20 02:17:05 pione kernel: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.65-
Dec 20 02:17:05 pione kernel: Hardware name: Allwinner sun8i Family
Dec 20 02:17:05 pione kernel: task: c0d07780 task.stack: c0d00000
Dec 20 02:17:05 pione kernel: PC is at __slab_free+0x11e/0x224
Dec 20 02:17:05 pione kernel: LR is at __slab_free+0x11b/0x224
Dec 20 02:17:05 pione kernel: pc : [<c02185ee>]    lr : [<c02185eb>]    psr: 600
Dec 20 02:17:05 pione kernel: sp : c0d01cc0  ip : 00000000  fp : d92e6fc0
Dec 20 02:17:05 pione kernel: r10: 00000001  r9 : 600f0113  r8 : df587b00
Dec 20 02:17:05 pione kernel: r7 : 00000001  r6 : 00000000  r5 : dfee3858  r4 :
Dec 20 02:17:05 pione kernel: r3 : 00000001  r2 : 00008100  r1 : dfee3858  r0 :
Dec 20 02:17:05 pione kernel: Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA Th
Dec 20 02:17:05 pione kernel: Control: 50c5387d  Table: 5eb5406a  DAC: 00000051
Dec 20 02:17:05 pione kernel: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.65-
Dec 20 02:17:05 pione kernel: Hardware name: Allwinner sun8i Family
Dec 20 02:17:05 pione kernel: [<c010dacd>] (unwind_backtrace) from [<c010a0b5>]
Dec 20 02:17:05 pione kernel: [<c010a0b5>] (show_stack) from [<c086b7fd>] (dump_
Dec 20 02:17:05 pione kernel: [<c086b7fd>] (dump_stack) from [<c086f5a7>] (nmi_c
Dec 20 02:17:05 pione kernel: [<c086f5a7>] (nmi_cpu_backtrace) from [<c010c9c1>]
Dec 20 02:17:05 pione kernel: [<c010c9c1>] (handle_IPI) from [<c01013e3>] (gic_h
Dec 20 02:17:05 pione kernel: [<c01013e3>] (gic_handle_irq) from [<c010a9e5>] (_
Dec 20 02:17:05 pione kernel: Exception stack(0xc0d01c70 to 0xc0d01cb8)
Dec 20 02:17:05 pione kernel: 1c60:                                     00000000
Dec 20 02:17:05 pione kernel: 1c80: 00000000 dfee3858 00000000 00000001 df587b00
Dec 20 02:17:05 pione kernel: 1ca0: 00000000 c0d01cc0 c02185eb c02185ee 600f0133
Dec 20 02:17:05 pione kernel: [<c010a9e5>] (__irq_svc) from [<c02185ee>] (__slab
Dec 20 02:17:05 pione kernel: [<c02185ee>] (__slab_free) from [<c021888d>] (kmem
Dec 20 02:17:05 pione kernel: [<c021888d>] (kmem_cache_free) from [<c06a6cc7>] (
Dec 20 02:17:05 pione kernel: [<c06a6cc7>] (stmmac_tx_clean) from [<c06a6ee1>] (
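To see how often the stall in the log above recurs, the kernel messages can be counted per day. A minimal sketch, assuming the short syslog-style timestamps shown above; the function name count_rcu_stalls is made up for illustration:

```shell
#!/bin/sh
# Count "rcu_sched detected stalls" kernel messages per day in a journal
# export read from stdin. Assumes lines of the form
# "Dec 20 02:17:05 host kernel: INFO: rcu_sched detected stalls ...".
count_rcu_stalls() {
    grep 'kernel: INFO: rcu_sched detected stalls' \
        | awk '{ count[$1 " " $2]++ } END { for (d in count) print d, count[d] }' \
        | sort
}
```

For example, `journalctl -k --no-pager | count_rcu_stalls` covers the current boot; a saved journal file can be fed in on stdin instead.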

 


armbianmonitor -u does not work:

pione:~$ sudo armbianmonitor -u
System diagnosis information will now be uploaded to /usr/bin/armbianmonitor: line 831: [: -gt: unary operator expected
Please post the URL in the forum where you've been asked for.

I manually uploaded some armbianmonitor logs: https://pastebin.com/VGrMddE5


rufik, where can I find this config?

$ cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq
240000
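
The neighbouring files in that sysfs directory show the active governor and the frequency limits. A small sketch that prints them; show_cpufreq is a hypothetical helper name, and files that a given kernel does not expose are simply skipped:

```shell
#!/bin/sh
# Print the cpufreq settings for one CPU. The directory argument defaults to
# the standard sysfs location; the kernel reports frequencies in kHz.
show_cpufreq() {
    dir="${1:-/sys/devices/system/cpu/cpu0/cpufreq}"
    for f in scaling_governor scaling_min_freq scaling_max_freq \
             cpuinfo_min_freq cpuinfo_max_freq; do
        # Skip entries this kernel or driver does not provide.
        if [ -r "$dir/$f" ]; then
            printf '%s=%s\n' "$f" "$(cat "$dir/$f")"
        fi
    done
}
```

Calling `show_cpufreq` without arguments reads cpu0; another CPU's cpufreq directory can be passed as the first argument.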

 

Tido, there are no wireless adapters connected. Only an external USB hard drive is attached.

$ lsmod
Module                  Size  Used by
evdev                  20480  0
snd_soc_hdmi_codec     16384  1
rc_cec                 16384  0
dw_hdmi_i2s_audio      16384  0
dw_hdmi_cec            16384  0
ip6table_filter        16384  0
ip6_tables             20480  1 ip6table_filter
lz4                    16384  20
sun8i_dw_hdmi          16384  0
lz4_compress           53248  1 lz4
dw_hdmi                28672  2 dw_hdmi_i2s_audio,sun8i_dw_hdmi
cec                    40960  2 dw_hdmi_cec,dw_hdmi
sun4i_i2s              16384  2
sun8i_codec_analog     24576  0
snd_soc_simple_card    16384  0
snd_soc_simple_card_utils    16384  1 snd_soc_simple_card
sun4i_gpadc_iio        16384  0
snd_soc_core          118784  5 sun4i_i2s,sun8i_codec_analog,snd_soc_hdmi_codec,snd_soc_simple_card_utils,snd_soc_simple_card
snd_pcm_dmaengine      16384  1 snd_soc_core
snd_pcm                65536  4 sun4i_i2s,snd_pcm_dmaengine,snd_soc_hdmi_codec,snd_soc_core
sun8i_mixer            16384  0
snd_timer              24576  1 snd_pcm
sun4i_tcon             20480  1 sun8i_dw_hdmi
snd                    45056  4 snd_soc_hdmi_codec,snd_timer,snd_soc_core,snd_pcm
xt_nat                 16384  1
soundcore              16384  1 snd
zram                   24576  5
uio_pdrv_genirq        16384  0
xt_tcpudp              16384  1
sun4i_drm              16384  0
uio                    16384  1 uio_pdrv_genirq
iptable_nat            16384  1
nf_conntrack_ipv4      16384  2
nf_defrag_ipv4         16384  1 nf_conntrack_ipv4
nf_nat_ipv4            16384  1 iptable_nat
nf_nat                 24576  2 xt_nat,nf_nat_ipv4
nf_conntrack           81920  4 xt_nat,nf_conntrack_ipv4,nf_nat_ipv4,nf_nat
iptable_filter         16384  0
ip_tables              20480  2 iptable_filter,iptable_nat
x_tables               20480  6 xt_nat,ip_tables,iptable_filter,xt_tcpudp,ip6table_filter,ip6_tables
uas                    20480  0

 


What are the settings in your /etc/default/cpufrequtils?

 

Change GOVERNOR=ondemand to interactive or conservative

Then reboot.
I believe you can also run (as root):

# /etc/init.d/cpufrequtils restart

or

# service cpufrequtils restart

I have mine on conservative; this way it ramps up and down gradually rather than jumping straight to the target frequency.
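
The edit described above could be scripted roughly like this. It is only a sketch: set_default is a hypothetical helper, it assumes the file uses plain KEY=VALUE lines, and it relies on GNU sed's -i option:

```shell
#!/bin/sh
# Set KEY=VALUE in a shell-style defaults file such as
# /etc/default/cpufrequtils: replace an existing KEY= line, or append one
# if the key is missing. A .bak copy of the previous file is kept.
# Values containing '|' are not handled by this sketch.
set_default() {
    file="$1" key="$2" value="$3"
    cp "$file" "$file.bak"
    if grep -q "^$key=" "$file"; then
        sed -i "s|^$key=.*|$key=$value|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}
```

Run as root, for example `set_default /etc/default/cpufrequtils GOVERNOR conservative`, then restart cpufrequtils as shown above.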

 

Ah found it...

https://docs.fedoraproject.org/en-US/Fedora/15/html/Power_Management_Guide/cpufreq_governors.html

It is not Armbian-specific per se, but the information is accurate.

Quote

3.2.1. CPUfreq Governor Types

This section lists and describes the different types of CPUfreq governors available in Fedora 15.

 

cpufreq_performance

The Performance governor forces the CPU to use the highest possible clock frequency. This frequency will be statically set, and will not change. As such, this particular governor offers no power saving benefit. It is only suitable for hours of heavy workload, and even then only during times wherein the CPU is rarely (or never) idle.

 

cpufreq_powersave

By contrast, the Powersave governor forces the CPU to use the lowest possible clock frequency. This frequency will be statically set, and will not change. As such, this particular governor offers maximum power savings, but at the cost of the lowest CPU performance.

The term "powersave" can sometimes be deceiving, though, since (in principle) a slow CPU on full load consumes more power than a fast CPU that is not loaded. As such, while it may be advisable to set the CPU to use the Powersave governor during times of expected low activity, any unexpected high loads during that time can cause the system to actually consume more power.

The Powersave governor is, in simple terms, more of a "speed limiter" for the CPU than a "power saver". It is most useful in systems and environments where overheating can be a problem.

 

cpufreq_ondemand

The Ondemand governor is a dynamic governor that allows the CPU to achieve maximum clock frequency when system load is high, and also minimum clock frequency when the system is idle. While this allows the system to adjust power consumption accordingly with respect to system load, it does so at the expense of latency between frequency switching. As such, latency can offset any performance/power saving benefits offered by the Ondemand governor if the system switches between idle and heavy workloads too often.

For most systems, the Ondemand governor can provide the best compromise between heat emission, power consumption, performance, and manageability. When the system is only busy at specific times of the day, the Ondemand governor will automatically switch between maximum and minimum frequency depending on the load without any further intervention.

 

cpufreq_userspace

The Userspace governor allows userspace programs (or any process running as root) to set the frequency. This governor is normally used in conjunction with the cpuspeed daemon. Of all the governors, Userspace is the most customizable; and depending on how it is configured, it can offer the best balance between performance and consumption for your system.

 

cpufreq_conservative

Like the Ondemand governor, the Conservative governor also adjusts the clock frequency according to usage (like the Ondemand governor). However, while the Ondemand governor does so in a more aggressive manner (that is from maximum to minimum and back), the Conservative governor switches between frequencies more gradually.

This means that the Conservative governor will adjust to a clock frequency that it deems fitting for the load, rather than simply choosing between maximum and minimum. While this can possibly provide significant savings in power consumption, it does so at an ever greater latency than the Ondemand governor.
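
The difference between the two dynamic governors can be illustrated with a toy model. This is not kernel code: the 80 % and 20 % load thresholds, the fixed step size, and both function names are assumptions chosen for illustration only.

```shell
#!/bin/sh
# Toy model of governor behaviour. Arguments are integers:
# current frequency, load in percent, minimum and maximum frequency (kHz).

# Ondemand style: jump straight to the maximum under high load,
# otherwise pick a frequency proportional to the load.
next_freq_ondemand() {
    cur=$1 load=$2 min=$3 max=$4
    if [ "$load" -gt 80 ]; then
        echo "$max"
    else
        echo $(( min + (max - min) * load / 100 ))
    fi
}

# Conservative style: move one fixed step at a time (fifth argument),
# clamped to the allowed range; stay put under moderate load.
next_freq_conservative() {
    cur=$1 load=$2 min=$3 max=$4 step=$5
    if [ "$load" -gt 80 ]; then
        f=$(( cur + step ))
        if [ "$f" -gt "$max" ]; then f=$max; fi
    elif [ "$load" -lt 20 ]; then
        f=$(( cur - step ))
        if [ "$f" -lt "$min" ]; then f=$min; fi
    else
        f=$cur
    fi
    echo "$f"
}
```

With the same high load, the first function returns the maximum immediately, while the second needs several calls to get there, which is the extra latency the quoted text mentions.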

 

Better yet: the Armbian docs and further really good forum entries.

 

 

On 12/12/2018 at 2:04 PM, kcn said:

I manually uploaded some armbianmonitor logs: https://pastebin.com/VGrMddE5

### boot environment:
 
#   $OpenBSD: sshd_config,v 1.100 2016/08/15 12:32:04 naddy Exp $
# This is the sshd server system-wide configuration file.  See
# sshd_config(5) for more information.
# This sshd was compiled with PAT
usbstoragequirks=0x2537:0x1066:u,0x2537:0x1068:u,0x0bc2:0x2323:u

Smells like filesystem corruption. If the contents of /boot/armbianEnv.txt really look like this garbage you should immediately check installation integrity using armbianmonitor -v.


Indeed, armbianmonitor -v confirmed that the file system is damaged:

pione:~$ sudo armbianmonitor -v
Starting package integrity check. This might take some time. Be patient please...
It appears you may have corrupt packages.
This is usually a symptom of filesystem corruption caused by SD cards or eMMC
dying or burning the OS image to the installation media went wrong.
The following changes from packaged state files were detected:
/var/lib/rpimonitor/updatestatus.txt

Thanks


I reinstalled with a new image and modified cpufrequtils (MIN_SPEED=480000, GOVERNOR=conservative). This did not help.

pione_cpu-day.png

The filesystem is OK:

pione:~$ sudo armbianmonitor -v
Starting package integrity check. This might take some time. Be patient please...
It appears you don't have any corrupt files or packages!

Now trying a minimal setup with only minidlna installed...


There are two log files attached. current_journal.log contains the logs since last night's spontaneous reboot; so far there has been no abnormal CPU utilization (the system was installed yesterday evening). The second file, old_journal.log, is from the previous setup, where the issue occurred many times.

The manual for systemd management is very good, but when the issue happens, systemd becomes uncontrollable. Any command ends like this:

# systemctl reboot
Failed to set wall message, ignoring: Connection timed out
Failed to reboot system via logind: Connection timed out
Failed to start reboot.target: Connection timed out

I am thinking of trying Armbian with the legacy kernel. I have an OrangePi Lite that runs stably on that version.

old_journal.log

current_journal.log


Tido, this was just an example. Any other systemctl command, such as status or restart, also outputs "Connection timed out".

"reboot -f" does restart the system. But sometimes even the serial console hangs, and then only a power cycle helps.


After gradually installing the packages on a clean image, the system runs stably. The problem is solved.
