Superkoning Posted August 22, 2017

Hi,

When I put a heavy load on my NanoPi NEO2, it locks up. The Ethernet light keeps flashing, so something is still alive? After a power reset, the NEO2 works again.

I've already put a fan on the CPU. The CPU temperature is 25 degrees Celsius at rest and goes up to 37 degrees Celsius under the heavy load. The heavy load is a "make -j4" of a C source code project.

Any tips on how to solve this?

sander@nanopineo2:~$ while true; do date ; uptime ; cat /etc/armbianmonitor/datasources/soctemp; sleep 2; done | tee mijn-log.txt
Tue Aug 22 11:46:37 UTC 2017 11:46:37 up 5:24, 2 users, load average: 0.00, 0.00, 0.00 28342
Tue Aug 22 11:46:39 UTC 2017 11:46:39 up 5:24, 2 users, load average: 0.00, 0.00, 0.00 28584
Tue Aug 22 11:46:41 UTC 2017 11:46:41 up 5:24, 2 users, load average: 0.00, 0.00, 0.00 28463
Tue Aug 22 11:46:43 UTC 2017 11:46:43 up 5:24, 2 users, load average: 0.00, 0.00, 0.00 28705
Tue Aug 22 11:46:45 UTC 2017 11:46:45 up 5:24, 2 users, load average: 0.00, 0.00, 0.00 28827
Tue Aug 22 11:46:47 UTC 2017 11:46:47 up 5:24, 2 users, load average: 0.00, 0.00, 0.00 28463
Tue Aug 22 11:46:49 UTC 2017 11:46:49 up 5:24, 2 users, load average: 0.00, 0.00, 0.00 30281
Tue Aug 22 11:46:51 UTC 2017 11:46:51 up 5:24, 2 users, load average: 0.00, 0.00, 0.00 28584
Tue Aug 22 11:46:53 UTC 2017 11:46:53 up 5:24, 2 users, load average: 0.00, 0.00, 0.00 30402
Tue Aug 22 11:46:55 UTC 2017 11:46:55 up 5:24, 2 users, load average: 0.00, 0.00, 0.00 28827
Tue Aug 22 11:46:57 UTC 2017 11:46:57 up 5:24, 2 users, load average: 0.00, 0.00, 0.00 28705
Tue Aug 22 11:46:59 UTC 2017 11:46:59 up 5:24, 2 users, load average: 0.00, 0.00, 0.00 28705
Tue Aug 22 11:47:01 UTC 2017 11:47:01 up 5:24, 2 users, load average: 0.00, 0.00, 0.00 28463
Tue Aug 22 11:47:03 UTC 2017 11:47:03 up 5:24, 2 users, load average: 0.00, 0.00, 0.00 28342
Tue Aug 22 11:47:05 UTC 2017 11:47:05 up 5:25, 2 users, load average: 0.00, 0.00, 0.00 29917
Tue Aug 22 11:47:07 UTC 2017 11:47:07 up 5:25, 2 users, load average: 0.08, 0.02, 0.01 29069
Tue Aug 22 11:47:09 UTC 2017 11:47:09 up 5:25, 2 users, load average: 0.08, 0.02, 0.01 33431
Tue Aug 22 11:47:11 UTC 2017 11:47:11 up 5:25, 2 users, load average: 0.47, 0.10, 0.03 36945
Tue Aug 22 11:47:13 UTC 2017 11:47:14 up 5:25, 2 users, load average: 0.47, 0.10, 0.03 37187
Tue Aug 22 11:47:25 UTC 2017

... and then nothing more
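The logging loop in the post above could also be written as a small Python script. This is only a sketch, not the poster's method. It assumes the soctemp file holds the temperature in thousandths of a degree Celsius (which matches the 25 to 37 degrees mentioned in the post), and the function names are made up for illustration.

```python
import time
from pathlib import Path

# Path taken from the command in the post; it is present on Armbian images only.
SOCTEMP = Path("/etc/armbianmonitor/datasources/soctemp")


def millidegrees_to_celsius(raw: str) -> float:
    """Convert a reading such as '28342' to degrees Celsius (assumed unit)."""
    return int(raw.strip()) / 1000.0


def format_sample(timestamp: str, loadavg: str, raw_temp: str) -> str:
    """Build one log line: timestamp, 1-minute load, temperature in Celsius."""
    one_minute = loadavg.split()[0]
    celsius = millidegrees_to_celsius(raw_temp)
    return f"{timestamp} load={one_minute} temp={celsius:.1f}C"


def monitor(logfile: str = "mijn-log.txt", interval: float = 2.0) -> None:
    """Append one sample per interval until interrupted with Ctrl-C."""
    with open(logfile, "a") as log:
        while True:
            line = format_sample(
                time.strftime("%Y-%m-%d %H:%M:%S"),
                Path("/proc/loadavg").read_text(),
                SOCTEMP.read_text(),
            )
            print(line)
            log.write(line + "\n")
            # Flush so the line is not left sitting in Python's own buffer.
            log.flush()
            time.sleep(interval)
```

Calling monitor() starts the loop. One line per sample makes it easier to spot the moment the samples stop arriving.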
Naguissa Posted August 22, 2017

Maybe a power issue. Can you check with another PSU?
Superkoning Posted August 22, 2017 Author

4 hours ago, Naguissa said: Maybe power issue. Can you check with other PSU?

I tried that and re-tested. It looked a bit better, but the problem seems to be different from what I described: the NEO2 is still alive, it seems. Instead of one print each second, it now prints once every few minutes. 'uptime' shows a load, aka waiting queue, of 32 processes. The NEO2 is still ping-able (1-2 ms), but a new ssh connection times out. So it is extremely overloaded, but still alive.

So not a PSU problem, I would say?

di 22 aug 2017 17:43:42 UTC 17:43:47 up 24 min, 2 users, load average: 6,10, 5,01, 2,51 30523
di 22 aug 2017 17:43:56 UTC 17:44:01 up 25 min, 2 users, load average: 6,14, 5,07, 2,58 30281
di 22 aug 2017 17:44:07 UTC 17:44:12 up 25 min, 2 users, load average: 6,12, 5,10, 2,61 30402
di 22 aug 2017 17:44:22 UTC 17:53:42 up 35 min, 2 users, load average: 15,02, 12,76, 7,83 33310
di 22 aug 2017 18:03:55 UTC 18:08:01 up 49 min, 2 users, load average: 20,89, 19,22, 14,30 31129
di 22 aug 2017 18:08:09 UTC 18:08:20 up 49 min, 2 users, load average: 20,04, 19,20, 14,51 31250
di 22 aug 2017 18:10:22 UTC 18:20:03 up 1:01, 2 users, load average: 24,44, 23,30, 19,40 31129
di 22 aug 2017 18:27:27 UTC 18:31:13 up 1:12, 2 users, load average: 26,97, 26,43, 22,89 33067
di 22 aug 2017 18:40:02 UTC
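In each sample of the log above, the time printed by date and the time printed by uptime drift apart, which shows how long the system stalled between two consecutive commands. A minimal Python sketch of that arithmetic follows. The regular expression and the function name are illustrative assumptions, and the pattern only fits logs that use this "HH:MM:SS UTC HH:MM:SS up" layout.

```python
import re
from typing import List

# The time printed by `date`, followed by the time `uptime` prints.
PAIR = re.compile(r"(\d{1,2}):(\d{2}):(\d{2}) UTC (\d{1,2}):(\d{2}):(\d{2}) up")


def stall_seconds(log_text: str) -> List[int]:
    """Seconds that passed between `date` and `uptime` within each sample."""
    delays = []
    for match in PAIR.finditer(log_text):
        h1, m1, s1, h2, m2, s2 = (int(group) for group in match.groups())
        start = h1 * 3600 + m1 * 60 + s1
        end = h2 * 3600 + m2 * 60 + s2
        # Modulo one day covers a sample that straddles midnight.
        delays.append((end - start) % 86400)
    return delays


# Two consecutive samples copied from the log in the post.
sample = (
    "di 22 aug 2017 17:44:07 UTC 17:44:12 up 25 min, 2 users, "
    "load average: 6,12, 5,10, 2,61 30402 "
    "di 22 aug 2017 17:44:22 UTC 17:53:42 up 35 min, 2 users, "
    "load average: 15,02, 12,76, 7,83 33310"
)
print(stall_seconds(sample))  # prints [5, 560]
```

For these two samples the gap grows from 5 seconds to 560 seconds, so more than nine minutes passed between two commands that normally run back to back.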
Superkoning Posted August 23, 2017 Author

Update from my side: With the "stress" command starting a lot of CPU processes, I can NOT get a lockup; the CPU load is very high, but the system keeps working. So not a CPU problem after all?

So my next guess: a disk problem. As a first test, I disabled my swap space. A new "make -j4" did NOT result in a lock-up, but in "virtual memory exhausted: Cannot allocate memory". Maybe that's better than a lockup ... :-)

To be continued.
Naguissa Posted August 23, 2017

Superkoning said: So my next guess: a disk problem. As a first test, I disabled my swap space. A new "make -j4" did NOT result in a lock-up, but in "virtual memory exhausted: Cannot allocate memory".

Maybe. So many tasks using the SD card at once could lock it up. You could try using a USB drive as swap.
Menion Posted August 25, 2017

I have the identical problem with my OrangePi PC2 (AllWinner H5). The swap is just too slow, so the kswapd0 kernel process takes all the CPU power, stalling everything in user space. I don't know if it is just because the swap medium (SD or USB in my case) is too slow, or if there is a bug (or a wrong IRQ affinity).
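One way to check whether swapping is really what keeps the board busy is to compare the kernel's swap counters over a short interval. The sketch below is a hedged illustration, not something used in this thread. It assumes a Linux kernel that exposes the pswpin and pswpout counters in /proc/vmstat, and the function names are hypothetical.

```python
import time
from pathlib import Path
from typing import Dict, Tuple


def parse_vmstat(text: str) -> Dict[str, int]:
    """Parse the 'name value' lines of /proc/vmstat into a dictionary."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_pages_per_second(
    before: Dict[str, int], after: Dict[str, int], seconds: float
) -> Tuple[float, float]:
    """Pages swapped in and out per second between two snapshots."""
    swapped_in = after["pswpin"] - before["pswpin"]
    swapped_out = after["pswpout"] - before["pswpout"]
    return swapped_in / seconds, swapped_out / seconds


def sample_swap_rate(interval: float = 5.0) -> Tuple[float, float]:
    """Take two snapshots of /proc/vmstat, `interval` seconds apart."""
    vmstat = Path("/proc/vmstat")
    before = parse_vmstat(vmstat.read_text())
    time.sleep(interval)
    after = parse_vmstat(vmstat.read_text())
    return swap_pages_per_second(before, after, interval)
```

Running sample_swap_rate() while "make -j4" is active gives a rough picture: sustained non-zero rates point at swap traffic, while rates near zero suggest the stall has another cause.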
Superkoning Posted August 26, 2017 Author

I've put both my swap file and my git project on the USB drive, and I deactivated the swapfile on the SD card. My system still stalls with a "make -j4".

I can't see anything special in the monitoring output:
- only 14 MB of swap is in use
- the CPU is reported as 98% idle (!), whereas the cc1plus processes are each using 70-90% of a CPU core
- kswapd0 is at 26% CPU usage ... that might be high

za 26 aug 2017 7:21:12 UTC 07:21:12 up 1 day, 18:45, 2 users, load average: 2,09, 1,50, 1,02 37672

              total        used        free      shared  buff/cache   available
Mem:            483         426           6           1          50          33
Swap:          1023          14        1009

top - 07:21:12 up 1 day, 18:45, 2 users, load average: 2,09, 1,50, 1,02
Tasks: 125 total, 5 running, 120 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0,1 us, 0,2 sy, 0,0 ni, 98,9 id, 0,7 wa, 0,0 hi, 0,0 si, 0,0 st
KiB Mem : 495440 total, 7208 free, 450112 used, 38120 buff/cache
KiB Swap: 1048572 total, 1033972 free, 14600 used. 20772 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
30720 sander    20   0  135260 116048  10588 R  90,0 23,4   0:06.45 cc1plus
30721 sander    20   0  133224 115304  10764 R  90,0 23,3   0:06.44 cc1plus
30722 sander    20   0  131196 112388  10748 R  80,0 22,7   0:06.19 cc1plus
30719 sander    20   0  133228 115736  10720 R  75,0 23,4   0:06.41 cc1plus
   94 root      20   0       0      0      0 S  25,0  0,0  14:22.82 kswapd0
30763 sander    20   0    7352   3108   2648 R  10,0  0,6   0:00.04 top
    8 root      20   0       0      0      0 S   5,0  0,0   0:02.97 rcu_sched

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
   94 root      20   0       0      0      0 R  26,3  0,0  14:22.87 kswapd0
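The top output above already hints at memory pressure: the four cc1plus processes together hold most of the RAM. The arithmetic can be sketched in Python as follows. The 100 MiB reserve for the rest of the system is purely an assumption chosen for illustration, and max_parallel_jobs is a hypothetical helper, not a make feature. Summing RES also counts shared pages more than once, so the total is an upper estimate.

```python
def max_parallel_jobs(total_kib: int, per_job_kib: int, reserve_kib: int) -> int:
    """How many compiler jobs fit in RAM after setting aside a reserve."""
    usable_kib = total_kib - reserve_kib
    # Never suggest fewer than one job.
    return max(1, usable_kib // per_job_kib)


# RES column of the four cc1plus processes and "KiB Mem total" from the post.
cc1plus_res_kib = [116048, 115304, 112388, 115736]
total_kib = 495440

used_kib = sum(cc1plus_res_kib)
share = used_kib / total_kib
print(f"cc1plus resident: {used_kib} KiB ({share:.0%} of RAM)")

# Assumed reserve of 100 MiB for the kernel and other processes.
jobs = max_parallel_jobs(total_kib, max(cc1plus_res_kib), reserve_kib=100 * 1024)
print("suggested -j:", jobs)
```

With these numbers the compilers alone account for 459476 KiB, about 93% of RAM, and the sketch suggests "-j3" under the assumed reserve. A lower job count may be worth testing before changing the swap setup again.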