kcn Posted December 12, 2018

Hi! Periodically, once every two or three days, CPU utilization rises sharply and the system becomes dysfunctional. The SD card and power supply may not be the best, but they work fine in other systems, and I have tried others. SSH hangs on login while the problem is occurring, but I have a serial connection, so I can investigate in real time. Please advise: what should I look at when this happens again?

top - 03:02:15 up 24855 days,  3:14,  1 user,  load average: 1.75, 1.77, 1.93
Tasks: 125 total,   3 running,  70 sleeping,   0 stopped,   8 zombie
%Cpu(s): 17.4 us, 26.6 sy,  0.0 ni, 56.1 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :   505152 total,     9724 free,   144028 used,   351400 buff/cache
KiB Swap:  1301132 total,  1299340 free,     1792 used.   343236 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
    1 root      20   0   26012   3040   1304 R 100.0  0.6  46:55.28 systemd
  851 root      20   0    6128   1368    948 R  75.2  0.3  34:13.74 systemd-lo+
 8668 root      20   0    7044   2608   2136 R   1.3  0.5   0:00.15 top
    7 root      20   0       0      0      0 S   0.3  0.0  14:28.83 ksoftirqd/0
    8 root      20   0       0      0      0 I   0.3  0.0   1:21.03 rcu_sched
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.05 kthreadd

root@pione:/media/data/log# systemctl reboot
Failed to set wall message, ignoring: Connection timed out
Failed to reboot system via logind: Connection timed out
Failed to start reboot.target: Connection timed out
See system logs and 'systemctl status reboot.target' for details.
root@pione:/media/data/log#
root@pione:/media/data/log# systemctl status reboot.target
Failed to get properties: Connection timed out
root@pione:/media/data/log#
root@pione:/media/data/log# reboot
root@pione:/media/data/log#
root@pione:/media/data/log# reboot -f
Failed to read reboot parameter file: No such file or directory
Rebooting.
[141846.733884] reboot: Restarting system
▒▒▒▒▒▒
U-Boot SPL 2018.05-armbian (Aug 19 2018 - 17:07:52 +0200)
DRAM: 512 MiB
Trying to boot from MMC1

System log:

Dec 06 13:00:01 pione CRON[8463]: pam_unix(cron:session): session opened for use
Dec 06 13:00:01 pione CRON[8464]: (root) CMD (/usr/lib/armbian/armbian-truncate-
Dec 06 13:05:01 pione CRON[8470]: pam_unix(cron:session): session opened for use
Dec 06 13:05:01 pione CRON[8471]: (root) CMD (command -v debian-sa1 > /dev/null
Dec 06 13:05:01 pione CRON[8470]: pam_unix(cron:session): session closed for use
Dec 20 02:17:05 pione kernel: INFO: rcu_sched detected stalls on CPUs/tasks:
Dec 20 02:17:05 pione kernel: 0-...: (1 GPs behind) idle=92e/1/0 softirq
Dec 20 02:17:05 pione kernel: (detected by 2, t=116210967 jiffies, g=159
Dec 20 02:17:05 pione kernel: Sending NMI from CPU 2 to CPUs 0:
Dec 20 02:17:05 pione kernel: NMI backtrace for cpu 0
Dec 20 02:17:05 pione kernel: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.65-
Dec 20 02:17:05 pione kernel: Hardware name: Allwinner sun8i Family
Dec 20 02:17:05 pione kernel: task: c0d07780 task.stack: c0d00000
Dec 20 02:17:05 pione kernel: PC is at __slab_free+0x11e/0x224
Dec 20 02:17:05 pione kernel: LR is at __slab_free+0x11b/0x224
Dec 20 02:17:05 pione kernel: pc : [<c02185ee>] lr : [<c02185eb>] psr: 600
Dec 20 02:17:05 pione kernel: sp : c0d01cc0 ip : 00000000 fp : d92e6fc0
Dec 20 02:17:05 pione kernel: r10: 00000001 r9 : 600f0113 r8 : df587b00
Dec 20 02:17:05 pione kernel: r7 : 00000001 r6 : 00000000 r5 : dfee3858 r4 :
Dec 20 02:17:05 pione kernel: r3 : 00000001 r2 : 00008100 r1 : dfee3858 r0 :
Dec 20 02:17:05 pione kernel: Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA Th
Dec 20 02:17:05 pione kernel: Control: 50c5387d Table: 5eb5406a DAC: 00000051
Dec 20 02:17:05 pione kernel: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.65-
Dec 20 02:17:05 pione kernel: Hardware name: Allwinner sun8i Family
Dec 20 02:17:05 pione kernel: [<c010dacd>] (unwind_backtrace) from [<c010a0b5>]
Dec 20 02:17:05 pione kernel: [<c010a0b5>] (show_stack) from [<c086b7fd>] (dump_
Dec 20 02:17:05 pione kernel: [<c086b7fd>] (dump_stack) from [<c086f5a7>] (nmi_c
Dec 20 02:17:05 pione kernel: [<c086f5a7>] (nmi_cpu_backtrace) from [<c010c9c1>]
Dec 20 02:17:05 pione kernel: [<c010c9c1>] (handle_IPI) from [<c01013e3>] (gic_h
Dec 20 02:17:05 pione kernel: [<c01013e3>] (gic_handle_irq) from [<c010a9e5>] (_
Dec 20 02:17:05 pione kernel: Exception stack(0xc0d01c70 to 0xc0d01cb8)
Dec 20 02:17:05 pione kernel: 1c60: 00000000
Dec 20 02:17:05 pione kernel: 1c80: 00000000 dfee3858 00000000 00000001 df587b00
Dec 20 02:17:05 pione kernel: 1ca0: 00000000 c0d01cc0 c02185eb c02185ee 600f0133
Dec 20 02:17:05 pione kernel: [<c010a9e5>] (__irq_svc) from [<c02185ee>] (__slab
Dec 20 02:17:05 pione kernel: [<c02185ee>] (__slab_free) from [<c021888d>] (kmem
Dec 20 02:17:05 pione kernel: [<c021888d>] (kmem_cache_free) from [<c06a6cc7>] (
Dec 20 02:17:05 pione kernel: [<c06a6cc7>] (stmmac_tx_clean) from [<c06a6ee1>] (
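On the opening question of what to look at when the hang recurs: one option is to capture a snapshot of system state from the serial console before forcing a reboot, so it can be compared across incidents. The following is only a sketch. The function name `snapshot_state` and the choice of files are assumptions, not something used in this thread, and every command is allowed to fail so that a stuck system still yields partial output.

```shell
# Hypothetical helper: save a snapshot of system state into a directory
# so it can be inspected after the next reboot.
snapshot_state() {
    local out_dir="$1"
    mkdir -p "$out_dir" || return 1

    # One batch-mode iteration of top, listing individual threads.
    top -b -H -n 1 > "$out_dir/top.txt" 2>&1 || true

    # Load averages and interrupt counters straight from procfs.
    cat /proc/loadavg    > "$out_dir/loadavg.txt"    2>&1 || true
    cat /proc/interrupts > "$out_dir/interrupts.txt" 2>&1 || true

    # Kernel ring buffer; rcu_sched stall reports show up here.
    dmesg > "$out_dir/dmesg.txt" 2>&1 || true

    # Flush to storage in case only a power cycle brings the board back.
    sync
}
```

It could be called as, for example, `snapshot_state /media/data/log/hang-1` (the numbering scheme is arbitrary). Taking two snapshots of `/proc/interrupts` a few seconds apart may show whether one interrupt source is firing unusually often.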
Tido Posted December 12, 2018

armbianmonitor -u tells the reader all the details about your installation, such as the kernel version and much more.
kcn Posted December 12, 2018 Author

armbianmonitor -u won't work:

pione:~$ sudo armbianmonitor -u
System diagnosis information will now be uploaded to /usr/bin/armbianmonitor: line 831: [: -gt: unary operator expected
Please post the URL in the forum where you've been asked for.

I manually uploaded some armbianmonitor logs: https://pastebin.com/VGrMddE5
rufik Posted December 12, 2018

Check your cpufreq config for a bad entry: 408 MHz instead of 480 MHz.
Tido Posted December 12, 2018

Do you have Bluetooth or WiFi connected or loaded? Can you run lsmod?
kcn Posted December 13, 2018 Author

rufik, where can I find this config?

$ cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq
240000

Tido, there are no wireless adapters connected. Only an external USB hard drive is attached.

$ lsmod
Module                     Size    Used by
evdev                      20480   0
snd_soc_hdmi_codec         16384   1
rc_cec                     16384   0
dw_hdmi_i2s_audio          16384   0
dw_hdmi_cec                16384   0
ip6table_filter            16384   0
ip6_tables                 20480   1 ip6table_filter
lz4                        16384   20
sun8i_dw_hdmi              16384   0
lz4_compress               53248   1 lz4
dw_hdmi                    28672   2 dw_hdmi_i2s_audio,sun8i_dw_hdmi
cec                        40960   2 dw_hdmi_cec,dw_hdmi
sun4i_i2s                  16384   2
sun8i_codec_analog         24576   0
snd_soc_simple_card        16384   0
snd_soc_simple_card_utils  16384   1 snd_soc_simple_card
sun4i_gpadc_iio            16384   0
snd_soc_core               118784  5 sun4i_i2s,sun8i_codec_analog,snd_soc_hdmi_codec,snd_soc_simple_card_utils,snd_soc_simple_card
snd_pcm_dmaengine          16384   1 snd_soc_core
snd_pcm                    65536   4 sun4i_i2s,snd_pcm_dmaengine,snd_soc_hdmi_codec,snd_soc_core
sun8i_mixer                16384   0
snd_timer                  24576   1 snd_pcm
sun4i_tcon                 20480   1 sun8i_dw_hdmi
snd                        45056   4 snd_soc_hdmi_codec,snd_timer,snd_soc_core,snd_pcm
xt_nat                     16384   1
soundcore                  16384   1 snd
zram                       24576   5
uio_pdrv_genirq            16384   0
xt_tcpudp                  16384   1
sun4i_drm                  16384   0
uio                        16384   1 uio_pdrv_genirq
iptable_nat                16384   1
nf_conntrack_ipv4          16384   2
nf_defrag_ipv4             16384   1 nf_conntrack_ipv4
nf_nat_ipv4                16384   1 iptable_nat
nf_nat                     24576   2 xt_nat,nf_nat_ipv4
nf_conntrack               81920   4 xt_nat,nf_conntrack_ipv4,nf_nat_ipv4,nf_nat
iptable_filter             16384   0
ip_tables                  20480   2 iptable_filter,iptable_nat
x_tables                   20480   6 xt_nat,ip_tables,iptable_filter,xt_tcpudp,ip6table_filter,ip6_tables
uas                        20480   0
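For the "where to find this config?" question, the running cpufreq state can be read from the same sysfs directory that `cpuinfo_min_freq` lives in. The sketch below simply prints every readable attribute in that directory; the function name `show_cpufreq` is made up for illustration, and which attributes exist depends on the kernel and cpufreq driver.

```shell
# Print every readable cpufreq attribute for one CPU as "name: value".
# The directory argument defaults to cpu0's cpufreq directory.
show_cpufreq() {
    local dir="${1:-/sys/devices/system/cpu/cpu0/cpufreq}"
    local f
    for f in "$dir"/*; do
        # Skip subdirectories (such as stats) and unreadable entries.
        [ -f "$f" ] && [ -r "$f" ] || continue
        printf '%s: %s\n' "$(basename "$f")" "$(cat "$f" 2>/dev/null)"
    done
}
```

Running `show_cpufreq` with no argument inspects cpu0. Attributes such as `scaling_governor` and `scaling_min_freq` reflect what is active right now, which may differ from the persistent settings file.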
WarHawk_AVG Posted December 14, 2018

What are the settings in your /etc/default/cpufrequtils?

Change GOVERNOR=ondemand to interactive or conservative, then reboot.

I also believe you can run (as root)

# /etc/init.d/cpufrequtils restart

or

# service cpufrequtils restart

I have mine on conservative... this way it ramps up and down gradually rather than jumping straight to the target frequency.

Ah, found it... https://docs.fedoraproject.org/en-US/Fedora/15/html/Power_Management_Guide/cpufreq_governors.html

Not "Armbian" per se, but the information is accurate.

Quote

3.2.1. CPUfreq Governor Types

This section lists and describes the different types of CPUfreq governors available in Fedora 15.

cpufreq_performance
The Performance governor forces the CPU to use the highest possible clock frequency. This frequency will be statically set, and will not change. As such, this particular governor offers no power saving benefit. It is only suitable for hours of heavy workload, and even then only during times wherein the CPU is rarely (or never) idle.

cpufreq_powersave
By contrast, the Powersave governor forces the CPU to use the lowest possible clock frequency. This frequency will be statically set, and will not change. As such, this particular governor offers maximum power savings, but at the cost of the lowest CPU performance. The term "powersave" can sometimes be deceiving, though, since (in principle) a slow CPU on full load consumes more power than a fast CPU that is not loaded. As such, while it may be advisable to set the CPU to use the Powersave governor during times of expected low activity, any unexpected high loads during that time can cause the system to actually consume more power. The Powersave governor is, in simple terms, more of a "speed limiter" for the CPU than a "power saver". It is most useful in systems and environments where overheating can be a problem.

cpufreq_ondemand
The Ondemand governor is a dynamic governor that allows the CPU to achieve maximum clock frequency when system load is high, and also minimum clock frequency when the system is idle. While this allows the system to adjust power consumption accordingly with respect to system load, it does so at the expense of latency between frequency switching. As such, latency can offset any performance/power saving benefits offered by the Ondemand governor if the system switches between idle and heavy workloads too often. For most systems, the Ondemand governor can provide the best compromise between heat emission, power consumption, performance, and manageability. When the system is only busy at specific times of the day, the Ondemand governor will automatically switch between maximum and minimum frequency depending on the load without any further intervention.

cpufreq_userspace
The Userspace governor allows userspace programs (or any process running as root) to set the frequency. This governor is normally used in conjunction with the cpuspeed daemon. Of all the governors, Userspace is the most customizable; and depending on how it is configured, it can offer the best balance between performance and consumption for your system.

cpufreq_conservative
Like the Ondemand governor, the Conservative governor also adjusts the clock frequency according to usage (like the Ondemand governor). However, while the Ondemand governor does so in a more aggressive manner (that is from maximum to minimum and back), the Conservative governor switches between frequencies more gradually. This means that the Conservative governor will adjust to a clock frequency that it deems fitting for the load, rather than simply choosing between maximum and minimum. While this can possibly provide significant savings in power consumption, it does so at an ever greater latency than the Ondemand governor.

Better yet... Armbian docs and further really good forum entries
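The governor change suggested in this post can also be scripted. This is a minimal sketch, assuming the settings file uses a plain `GOVERNOR=` line as quoted in the post and that GNU `sed` is available; the function name `set_governor` is hypothetical. Since not every kernel offers every governor, it is probably worth checking `/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors` before picking one.

```shell
# Rewrite the GOVERNOR= line of a cpufrequtils defaults file in place.
# Both arguments are required; the file named in this thread is
# /etc/default/cpufrequtils.
set_governor() {
    local governor="$1"
    local conf="$2"
    [ -n "$governor" ] && [ -f "$conf" ] || return 1
    if grep -q '^GOVERNOR=' "$conf"; then
        # Replace the existing setting.
        sed -i "s/^GOVERNOR=.*/GOVERNOR=${governor}/" "$conf"
    else
        # No setting present yet: append one.
        printf 'GOVERNOR=%s\n' "$governor" >> "$conf"
    fi
}
```

For example, `set_governor conservative /etc/default/cpufrequtils` (as root), followed by one of the restart commands from the post or a reboot.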
tkaiser Posted December 14, 2018

On 12/12/2018 at 2:04 PM, kcn said:
I manually uploaded some armbianmonitor logs: https://pastebin.com/VGrMddE5

### boot environment:
# $OpenBSD: sshd_config,v 1.100 2016/08/15 12:32:04 naddy Exp $
# This is the sshd server system-wide configuration file. See
# sshd_config(5) for more information.
# This sshd was compiled with PAT
usbstoragequirks=0x2537:0x1066:u,0x2537:0x1068:u,0x0bc2:0x2323:u

Smells like filesystem corruption. If the contents of /boot/armbianEnv.txt really look like this garbage, you should immediately check installation integrity using armbianmonitor -v.
kcn Posted December 17, 2018 Author

Indeed, armbianmonitor -v confirmed that the file system is damaged:

pione:~$ sudo armbianmonitor -v
Starting package integrity check. This might take some time. Be patient please...
It appears you may have corrupt packages.
This is usually a symptom of filesystem corruption caused by SD cards or eMMC dying or burning the OS image to the installation media went wrong.
The following changes from packaged state files were detected:
/var/lib/rpimonitor/updatestatus.txt

Thanks
kcn Posted December 21, 2018 Author

I reinstalled with a new image and modified cpufrequtils (MIN_SPEED=480000, GOVERNOR=conservative). This did not help. The filesystem is OK:

pione:~$ sudo armbianmonitor -v
Starting package integrity check. This might take some time. Be patient please...
It appears you don't have any corrupt files or packages!

Now trying a minimal setup with only minidlna installed...
WarHawk_AVG Posted December 21, 2018

What does journalctl look like?

$ sudo journalctl

https://www.digitalocean.com/community/tutorials/how-to-use-systemctl-to-manage-systemd-services-and-units

The snippet of top also shows systemd-lo+ running high. You might need to see what in systemd is running so hard and shut it off.

Have you done an apt-get update and apt-get upgrade yet?
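A full `journalctl` dump is long, so it may help to export the kernel messages to a file and look for the stall reports seen in the opening post. The sketch below assumes an export made with something like `journalctl -k --no-pager > kernel.log` (add `-b -1` for the previous boot, which only works if the journal from that boot was kept). The function name `count_rcu_stalls` and the file name are illustrative assumptions.

```shell
# Count rcu_sched stall reports in a saved journal or dmesg export.
# Prints the number of matching lines; prints 0 when there are none.
count_rcu_stalls() {
    local log_file="$1"
    grep -c 'rcu_sched detected stalls' "$log_file" || true
}
```

Usage would be `count_rcu_stalls kernel.log`. A non-zero count on a freshly installed system would suggest the stalls are not tied to the earlier filesystem corruption, though that is only one possible reading.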
kcn Posted December 21, 2018 Author

Two log files are attached. current_journal.log contains the logs since the last spontaneous reboot, which happened last night. So far there has been no abnormal CPU utilization on this setup (the system was installed yesterday evening). The second file, old_journal.log, is from the previous setup, where the issue occurred many times.

The systemd management tutorial is very good, but when the issue happens, systemd becomes uncontrollable. Any command looks like this:

# systemctl reboot
Failed to set wall message, ignoring: Connection timed out
Failed to reboot system via logind: Connection timed out
Failed to start reboot.target: Connection timed out

I am thinking of trying Armbian with the legacy kernel. I have an OrangePi Lite running stably with that version.

old_journal.log current_journal.log
Tido Posted December 21, 2018

20 minutes ago, kcn said:
systemctl reboot

I know many commands, but I never use this one. However, if nothing works... there is ONE left: reisub

http://blog.kember.net/articles/reisub-the-gentle-linux-restart/
kcn Posted December 21, 2018 Author

Tido, that was just an example. Any other systemctl command, such as status or restart, outputs "Connection timed out". "reboot -f" does restart the system. But sometimes even the serial console hangs, and then only a power cycle helps.
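When a shell on the serial console still responds, the REISUB sequence from the linked article can also be issued without a keyboard by writing the letters to `/proc/sysrq-trigger` as root. This is a hedged sketch: it assumes the kernel was built with magic SysRq support, and the function name `sysrq_sequence` and the two-second default delay are arbitrary choices, not something taken from this thread.

```shell
# Write a sequence of SysRq command letters to a trigger file, pausing
# between them. Arguments: [letters] [trigger file] [delay in seconds]
sysrq_sequence() {
    local keys="${1:-reisub}"
    local trigger="${2:-/proc/sysrq-trigger}"
    local delay="${3:-2}"
    local rest key
    while [ -n "$keys" ]; do
        # Split off the first letter of the remaining sequence.
        rest="${keys#?}"
        key="${keys%"$rest"}"
        printf '%s' "$key" > "$trigger" || return 1
        sleep "$delay"
        keys="$rest"
    done
}
```

Calling `sysrq_sequence` with no arguments ends in an immediate reboot, so it is a last resort; `sysrq_sequence sub` would only sync, remount read-only, and reboot.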
Nuha Arina Rafiuddin Posted December 25, 2018

Just chiming in because armbianEnv corruption happened to me too.
kcn Posted February 1, 2019 Author

After gradually installing packages on a clean image, the system runs stably. The problem is solved.