Posted
1 hour ago, ag123 said:

ok testing out desktop x-windows typing from armbian ...   nightly upgraded to 5.60 kernel

testing  testing  testing.. :) 

You need:

  • a github account
  • a bit of time
  • a SBC armbian supports
git clone https://github.com/armbian/testings
cd testings
./createreport.sh

you don't even need to fork the repo in the browser, we'll do it for you from command line.. :P Same for the PR, everything is done from console.

 

4 hours ago, botfap said:

DNS servers have been changed today, probably just waiting for propagation although it should have happened by now.

is back now even for google users... 

Posted
17 minutes ago, chwe said:

testing  testing  testing.. :) 

You need:

  • a github account
  • a bit of time
  • a SBC armbian supports

git clone https://github.com/armbian/testings
cd testings
./createreport.sh

you don't even need to fork the repo in the browser, we'll do it for you from command line.. :P Same for the PR, everything is done from console.

 

is back now even for google users... 

 

ok my entry is here

https://github.com/ag88/testings/blob/20180921-orangepipc-next/orangepipc-next.report

armbianmonitor -u apparently returns a blank

Posted
4 minutes ago, ag123 said:

ok my entry is here

https://github.com/ag88/testings/blob/20180921-orangepipc-next/orangepipc-next.report

armbianmonitor -u apparently returns a blank

 

I guess that's because your DNS resolver still has issues with ix.io... :( Can we assume that

Quote

armbianmonitor -u

https://pastebin.com/3NLDHaFs

(truncated due to size limits, the latest entry is at bottom)

is still correct?  (I edited your PR filling in armbianmonitor)

Posted

Still stability issues on stable v5.60 (v5.59 upgraded to 5.60). I can not have an uptime over 3-12 hours (it is not reachable via ssh or webgui).

SD and psu are OK. The system run stable on old v5.38 (january 2018) with same setup for weeks.

 

Orange Pi Zero 512/v1.4 system log:

http://ix.io/1ngD

Posted

@Moklev i'd suggest monitoring and tracking down the root cause of the ssh woes, one of those ways is to connect via a serial console when ssh fails

https://forum.armbian.com/topic/8237-serial-console-access-via-usb-uartgadget-mode-on-linuxwindowsosx/

 

try to connect a cable via the micro usb OTG port to see if you get a linux console presented. if that works when ssh fails you could then 'get into' the board and do some checks like an armbianmonitor -u

(if network fails you may need to run armbianmonitor -U and save the logs to a file)

 

among the things check /sbin/sysctl -a |grep vm.swappiness

vm.swappiness = 100

 

check in /etc/sysctl.conf if it is defined there you may like to reduce that value to 50 (or even for tests 0) to see if it made a difference
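A minimal sketch of that check. The helper name `configured_swappiness` is made up for illustration; it only reads the file it is given, and the value may also be set elsewhere on the system, so treat an empty result as "not set in this file" rather than "not set at all".

```shell
#!/bin/sh
# Print the value from the last active (uncommented) vm.swappiness line
# of a sysctl-style config file; prints nothing if the key is not set there.
configured_swappiness() {
    sed -n 's/^[[:space:]]*vm\.swappiness[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$1" | tail -n 1
}

# Typical use on the board (the write needs root):
#   sysctl -n vm.swappiness                 # value currently in effect
#   configured_swappiness /etc/sysctl.conf  # value set in the file, if any
#   sysctl -w vm.swappiness=50              # change at runtime, no reboot
```

A value written with `sysctl -w` lasts only until the next boot; to keep it, the line in the config file has to be changed as well.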

 

there could also be other ways like hacking up a script to say ping the router every 5 minutes and if ping fails to capture a log like armbianmonitor -U > logfile
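Such a script could look roughly like the sketch below. The gateway address, the interval, the log directory and all function names are assumptions for illustration, not anything Armbian ships; only `armbianmonitor -U` is taken from the suggestion above.

```shell
#!/bin/sh
# Hypothetical defaults, override via environment: router IP, seconds between
# checks, and where to store the captured logs.
GATEWAY="${GATEWAY:-192.168.1.1}"
INTERVAL="${INTERVAL:-300}"
LOGDIR="${LOGDIR:-/root}"

# Succeeds if the host answers a single ping within 5 seconds.
host_alive() { ping -c 1 -W 5 "$1" >/dev/null 2>&1; }

# Timestamped file name, e.g. /root/netfail-20180922-143000.log
log_name() { echo "${LOGDIR}/netfail-$(date +%Y%m%d-%H%M%S).log"; }

# Save the diagnostics locally and flush them to storage right away.
capture() {
    armbianmonitor -U >"$(log_name)" 2>&1
    sync
}

watch_loop() {
    while true; do
        host_alive "$GATEWAY" || capture
        sleep "$INTERVAL"
    done
}

# The loop only starts when asked for explicitly: ./netwatch.sh --run
if [ "${1:-}" = "--run" ]; then
    watch_loop
fi
```

The `sync` matters if the board hangs shortly after the network goes away, otherwise the freshly written log may never reach the SD card.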

 

------

on a side note, i'd think 5.60 is still a useful 'milestone' to have

 

 

Posted

on kernel sunxi-next 4.14.70 (armbian 5.59 upgraded to 5.60 in nightly builds)  orange pi pc

i'm trying to setup the power tactile button to send a 'key' as input. i.e. using acpid to shutdown

 

i did an

$ ls /dev/input/by-path
platform-1ee0000.hdmi-event
platform-1f02000.ir-event

 

the gpio key is absent, hence regardless of the dt overlays i apply, i don't receive any keys

the things i checked are /boot/dtb/sun8i-h3-orangepi-pc.dtb decompiled (dtc -I dtb -O dts /boot/dtb/sun8i-h3-orangepi-pc.dtb)

Quote

    r_gpio_keys {
        compatible = "gpio-keys";
        pinctrl-names = "default";
        pinctrl-0 = <0x3a>;

        sw4 {
            label = "sw4";
            linux,code = <0x100>;
            gpios = <0x36 0x0 0x3 0x1>;
        };
    };

the above looks ok

however in /boot/config-4.14.70-sunxi

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
# CONFIG_KEYBOARD_ADC is not set
...
# CONFIG_KEYBOARD_GPIO is not set
# CONFIG_KEYBOARD_GPIO_POLLED is not set

i think CONFIG_KEYBOARD_GPIO needs to be configured as at least 'm' (module) or 'y' yes

without it gpio keys inputs won't appear in /dev/input/by-path as the input driver is missing
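The config check can be scripted. This is only a sketch, and `kconfig_state` is a made-up helper name; it reports whether an option is built in (y), a module (m), disabled (n) or not mentioned in the file at all (unknown).

```shell
#!/bin/sh
# Report the state of one kernel config option.
# $1 = config file, $2 = option name (e.g. CONFIG_KEYBOARD_GPIO)
kconfig_state() {
    # Match "OPTION=..." or "# OPTION is not set", but not OPTION_SOMETHING.
    _line="$(grep -E "^(# )?$2[= ]" "$1" | head -n 1 || true)"
    case "$_line" in
        "$2=y") echo y ;;
        "$2=m") echo m ;;
        "# $2 is not set") echo n ;;
        *) echo unknown ;;
    esac
}

# On the board:
#   kconfig_state "/boot/config-$(uname -r)" CONFIG_KEYBOARD_GPIO
```

With the config quoted above this prints n. If a later kernel reports m, the driver still has to be loaded (the module built from that option is gpio_keys) before the key shows up under /dev/input/by-path.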

 

one main use of that button is by (orange pi *) users who wanted to use the 'power' button to shutdown the os via acpid

but it could be useful in other situations as well

 

 

Posted
2 hours ago, ag123 said:

vm.swappiness = 100

 

check in /etc/sysctl.conf if it is defined there you may like to reduce that value to 50 (or even for tests 0) to see if it made a difference

 

Interesting... I'll change the swap threshold to a more conservative value of 30 and I'll test it for 24/48h.

Posted

i think @Igor is preparing the release of 5.60 images, known issues found here could be resolved after that e.g. for users to switch to 'nightly' to get the incremental updates

Posted
1 hour ago, ag123 said:

i think @Igor is preparing the release of 5.60 images, known issues found here could be resolved after that e.g. for users to switch to 'nightly' to get the incremental updates


I had weeks of long shifts due to the last major upgrade and I will do my best to do nothing for some time :P Then I have a pile of non-technical tasks to cope with.

Posted

@ag123

The good thing: reducing the value to 30 (vm.swappiness = 30) seems to solve the problem. Now I got ~22 h uptime without any hang...

The bad one: heavy zram usage on my AMD/Debian 9.5 microserver freezes the system (both gnome shell and ssh). :-|

I think zram has some problems with Debian...

Posted
On 9/22/2018 at 2:34 PM, Moklev said:

Orange Pi Zero 512/v1.4 system log:

http://ix.io/1ngD

 

Do you use this board to boil water? 75°C SoC temperature reported at boot? I don't know how the thresholds are defined for 'emergency shutdown' but in this situation some more CPU activity alone can result in Armbian shutting down. If you want to diagnose a problem you need to diagnose it. Some problems arise over time. 'Worked fine for months and now unstable' happens for various reasons. Human beings usually then blame immediately the last change they remember (like upgrading the OS) instead of looking for the culprit.

 

Hopefully it still works but in your situation I would immediately install RPi-Monitor using

armbianmonitor -r

You get then a monitor instance running on your board and can look what happens and happened: https://www.cnx-software.com/2016/03/17/rpi-monitor-is-a-web-based-remote-monitor-for-arm-development-boards-such-as-raspberry-pi-and-orange-pi/

 

On 9/23/2018 at 4:55 PM, Moklev said:

The good thing: reducing the value to 30 (vm.swappiness = 30) seems to solve the problem. Now I got ~22 h uptime without any hang...

 

Hahaha! I run a bunch of Ubuntu and Debian servers with lowered DRAM, huge memory overcommitment (300% and not just the laughable 50% as with Armbian defaults) and of course vm.swappiness set to 100. No problem whatsoever. If you think swapping is the culprit you need to monitor swap usage! At least your http://ix.io/1ngD output shows memory and swap usage that is not critical at all.

 

If you want to get the culprit whether this is related to swap you need to run something like 'iostat 600 >/root/iostat.log' and this in the background:

while true ; do
    echo -e "\n$(date)" >>/root/free.log
    free -m >>/root/free.log
    sleep 600
done

Then check these logs whether swapping happened. An alternative would be adjusting RPi-Monitor templates but while this is quite easy nobody will do this of course.
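Checking the log can be as simple as pulling the highest "used" value from the Swap lines. A small sketch, assuming the log was written by the loop above with `free -m` (so values are in MiB); `max_swap_used` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the largest "used" value found on any "Swap:" line of a free(1) log.
# $1 = log file, e.g. /root/free.log
max_swap_used() {
    awk '$1 == "Swap:" { if ($3 + 0 > max) max = $3 + 0 } END { print max + 0 }' "$1"
}

# On the board:
#   max_swap_used /root/free.log
```

A result of 0 means nothing was ever swapped out during the logged period, which rules swapping out as the cause for that run.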

 

On 9/23/2018 at 4:55 PM, Moklev said:

The bad one: heavy zram usage on my AMD/Debian 9.5 microserver freezes the system (both gnome shell and ssh). :-|

I think zram has some problems with Debian...

 

Zram is a kernel thing and not related to any userland stuff at all. In other words: you have the same set of problems on a MicroServer and an ARM SBC running different software stacks? Are SBC and MicroServer connected to the same power outlet?

 

Unfortunately logging in Armbian is currently broken (shutdown logging and ramlog, reported by @dmeey and @eejeel) but nobody cares (though @lanefu self-assigned the Github issue). Great timing to adjust some memory related behavior while accepting that the relevant logging portions at shutdown, which allow us to see what's happening 'in the field', do not work any more.

Posted
1 hour ago, tkaiser said:

 

Do you use this board to boil water? 75°C SoC temperature reported at boot? [...]

 

After a warm reboot... ~70-72°C is a badly reported temperature; the correct value, measured with a Fluke thermometer, is ~65°C on the SoC heatsink. That's normal: the SBC has been working as a visual motion analyzer (1 h264 HD stream) 24/7 since mid 2017.

 

Quote

Hopefully it still works but in your situation I would immediately install RPi-Monitor using


armbianmonitor -r

 

Yes, I've started monitoring...

 

Quote

Zram is a kernel thing and not related to any userland stuff at all. In other words: you have the same set of problems on a MicroServer and an ARM SBC running different software stacks? Are SBC and MicroServer connected to the same power outlet?

 

They are not... totally different purpose or software stacks... and power outlet.

Anyway now my OPZ works stable again with Armbian 5.60 (with vm.swappiness set to 30-60).

Posted
23 hours ago, Moklev said:

Yes, I've started monitoring...

 

So can you please provide output from both 'free -m' and 'armbianmonitor -u' now? Once you switch back to vm.swappiness=100 the VM monitoring script should be adjusted to

while true ; do
    echo -e "\n$(date)" >>/root/free.log
    free -m >>/root/free.log
    sync
    sleep 60
done

(same with iostat -- also switching from 600 seconds to 60. And then the 'sync' call above is very important since with default Armbian settings we have a 600 second commit interval so last monitoring entries won't be written to disk without this sync call happening)
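To see which commit interval the root filesystem is actually mounted with, the mount options can be inspected. A sketch under the assumption that the interval appears as a `commit=` mount option; `commit_interval` is a hypothetical helper name.

```shell
#!/bin/sh
# Extract the commit= value (seconds) from a comma-separated mount option
# string; prints nothing if the option is absent.
commit_interval() {
    printf '%s\n' "$1" | tr ',' '\n' | sed -n 's/^commit=\([0-9][0-9]*\)$/\1/p' | head -n 1
}

# On the board:
#   commit_interval "$(findmnt -no OPTIONS /)"
```

An empty result means the filesystem default applies (5 seconds for ext4), in which case the explicit sync is less critical but still harmless.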

Posted

Debian 9.5 AMD/Microserver (vm.swappiness=60)

 

iostat.log

root@tubserver:~# cat iostat.log
Linux 4.17.0-0.bpo.3-amd64 (tubserver)  27/09/2018      _x86_64_        (2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0,09    0,01    0,15    0,29    0,00   99,45

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               1,55        24,94        27,24   21613817   23605307
sdb               1,23         4,76        27,24    4122642   23605307
md3               0,22         0,77         0,13     669789     109496
md1               0,00         0,00         0,00       2304          0
md2               0,36        26,05        19,72   22569513   17083692
md0               1,55         2,89         7,29    2501781    6313004
sdc               0,18         1,12         1,80     973561    1560460
zram0             0,01         0,00         0,03       2660      29588
zram1             0,01         0,00         0,04       2904      38420

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0,10    0,00    0,15    0,73    0,00   99,02

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               2,81         0,05        68,93         28      41359
sdb               2,77         0,00        68,93          0      41359
md3               0,00         0,00         0,00          0          0
md1               0,00         0,00         0,00          0          0
md2               0,00         0,00         0,00          0          0
md0               4,50         0,05        68,74         28      41244
sdc               0,00         0,00         0,00          0          0
zram0             0,00         0,01         0,00          4          0
zram1             0,00         0,00         0,00          0          0

free.log

root@tubserver:~# cat free.log

gio 27 set 2018, 11.22.53, CEST
              total        used        free      shared  buff/cache   available
Mem:           1743         257         123          26        1362        1278
Swap:          2776          55        2721

gio 27 set 2018, 11.24.12, CEST
              total        used        free      shared  buff/cache   available
Mem:           1743         258         121          26        1363        1276
Swap:          2776          55        2721

gio 27 set 2018, 11.27.01, CEST
              total        used        free      shared  buff/cache   available
Mem:           1743         260         155          26        1327        1274
Swap:          2776          54        2722

gio 27 set 2018, 11.37.01, CEST
              total        used        free      shared  buff/cache   available
Mem:           1743         260         154          26        1328        1274
Swap:          2776          54        2722

Tomorrow for OPZ/Armbian data... both for vm.swappiness=30 and 100 (and armbianmonitor -u)

Posted
14 minutes ago, Moklev said:

zram0             0,00         0,01         0,00          4          0
zram1             0,00         0,00         0,00          0          0

 

Sorry, this thing is doing nothing and especially not swapping. And I'm really not interested in what a MicroServer somewhere on this earth is doing. It's only about Armbian and the claim vm.swappiness=100 would crash your board.

 

So as already said, it would be great if you can provide output from the two commands I asked for now

1 hour ago, tkaiser said:

So can you please provide output from both 'free -m' and 'armbianmonitor -u' now?

 

Since after switching back to vm.swappiness=100 your board is supposed to crash then afterwards it gets interesting to have a look at the two logs (submitted via pastebin.com or some similar online pasteboard service).

Posted

i think the somewhat elevated temperatures of say 70-75 deg C on H3 socs can be tolerated. the main trouble is that many of these boards ship without a heatsink

the main thing about running at these temperatures is that the board needs to run in a well ventilated environment e.g. that it should not be enclosed in a box that limits simple convection heat transfer (i.e. 70-75 deg should be an upper load limit and temperatures must stabilise at this range during high loads and must not increase)

 

I've an orange pi one that i found rather commonly runs at those temperatures without a heatsink, i'd imagine it may even be possible for it to goto boiling point 100 deg C and that the soc still runs but that to prevent damage, it would be necessary to throttle the frequency (e.g. reduce to 400mhz) and lower the voltages (reduce to as low as is possible e.g. below 1.1v if possible) at those temperatures. but a shutdown should be unnecessary

but that 70-75 deg C should be considered 'normal' operating temperatures for H3 socs running at 1.2ghz when under high loads

 

the situation can be improved quite a bit even with a rather small heat sink especially at the higher temperatures

https://forum.armbian.com/topic/8203-orangepione-h3-an-experiment-with-small-heatsink/

in particular at the higher temperatures

70-75 deg could be rather easily reached in hot summer temperatures as heat transfer is a function of temperature differences vs ambient temperatures.

a heat sink needs to be combined with good ventilation to remove all that heat

and as in my experiment, heat sink + *good ventilation* prevents the soc from reaching the higher temperatures and that at steady state it lets the soc run at a marginally lower temperature than without a heat sink

 

the problem i have with the orange pi h3 boards is that *after* the os is shut down, the h3 soc and the board can overheat to such an extreme that temperatures are well above 100 deg C; i almost had a board go up in flames. the only safe / precautionary way for now is to disconnect power once the status led indicates that the os has shut down, and not to issue a shutdown remotely if you can't unplug power from the board

Posted
25 minutes ago, ag123 said:

i think the somewhat elevated temperatures of say 70-75 deg C on H3 socs can be tolerated

 

Huh? Can we please stay focused here on an Armbian release upgrade and potential show-stoppers? You brought up the claim that vm.swappiness settings cause crashes without any evidence whatsoever. Let's focus now on collecting some data related to this and keep temperature babbling outside.

 

We all know that certain boards report wrong temperatures anyway (OPi Zero rev 1.4 for example) and the only area where this gets interesting is whether those wrongly reported temperatures cause an emergency shutdown (the kernel's cpufreq framework defines that at a specific critical temperature a shutdown is initiated). So obviously with wrong thermal readouts as we experience on some sunxi boards/platforms we get a 'denial of service' behavior under load.
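What the kernel currently reads, and at which temperatures it would act, can be looked up in sysfs. A sketch assuming the usual /sys/class/thermal layout with readings in millidegrees Celsius (some older vendor kernels report whole degrees instead); the function names are made up for illustration.

```shell
#!/bin/sh
# Convert a sysfs thermal reading in millidegrees Celsius to whole degrees.
millideg_to_c() { echo $(( $1 / 1000 )); }

# Print the current temperature and all trip points of one thermal zone.
# $1 = zone directory (defaults to thermal_zone0)
show_thermal() {
    zone="${1:-/sys/class/thermal/thermal_zone0}"
    [ -r "$zone/temp" ] || { echo "no thermal zone at $zone" >&2; return 1; }
    echo "current: $(millideg_to_c "$(cat "$zone/temp")") C"
    for t in "$zone"/trip_point_*_temp; do
        [ -r "$t" ] || continue
        type_file="${t%_temp}_type"
        echo "$(cat "$type_file" 2>/dev/null): $(millideg_to_c "$(cat "$t")") C"
    done
}

# On the board:
#   show_thermal
```

If the "current" value already sits close to a trip point of type critical while the board is idle, a wrong readout alone could be enough to trigger a shutdown under load.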

 

Using the wrong hardware for specific purposes is not the focus of this thread (e.g. using a board with just 512 MB for desktop linux -- that's not a 'use case', that's just plain weird)

Posted

i think the issue with vm.swappiness is that some processes, e.g. sshd, once swapped out (i.e. no longer resident in memory), may not respond, or respond too slowly, to connections over the network. i.e. for some reason an external connect does not cause the swapped out process image to be brought back into memory in the normal manner

if that's the case any attempt to connect over ssh fails

reducing vm.swappiness in this case makes it more likely that sshd remains resident in memory rather than being swapped out, hence it alleviates the problem, but this likely won't resolve it entirely

Posted
17 minutes ago, ag123 said:

i think the issue with vm.swappiness is that some processes, e.g. sshd, once swapped out (i.e. no longer resident in memory), may not respond, or respond too slowly, to connections over the network

 

WTF? Can we please stop developing weird theories but focus on what's happening (which requires a diagnostic attempt collecting some data)? Without swap it could be possible that the sshd gets quit by the oom-killer (very very very unlikely though because the memory footprint of sshd is pretty low) but if it would be possible that swapping in Linux results in daemons not responding any more or not fast enough we all would've stopped using Linux a long time ago.

 

And can you please try to understand that we're not talking about 'swap on crappy storage' (HDD) any more but about zram: that's just decompressing swapped out pages back into another memory area which is lightning fast compared to the old attempts with swap on HDD.

Posted
1 hour ago, tkaiser said:

 

WTF? Can we please stop developing weird theories but focus on what's happening (which requires a diagnostic attempt collecting some data)? Without swap it could be possible that the sshd gets quit by the oom-killer (very very very unlikely though because the memory footprint of sshd is pretty low) but if it would be possible that swapping in Linux results in daemons not responding any more or not fast enough we all would've stopped using Linux a long time ago.

 

And can you please try to understand that we're not talking about 'swap on crappy storage' (HDD) any more but about zram: that's just decompressing swapped out pages back into another memory area which is lightning fast compared to the old attempts with swap on HDD.

 

i won't be able to prove this easily, but if setting vm.swappiness to values less than 100 alleviates the issue, could it point to some possible issues related to or with zram?

some google searches draw a blank as various similar issues are found but not identical to this, i.e. that zram isn't specifically mentioned.

the other thing would be that it may be necessary to have the oom killer log the processes killed into /var/log/messages (i.e. by syslog or journald) for post mortem analysis of what actually happened, e.g. is sshd literally killed? i think oom killer currently only logs to dmesg, which means that the traces are lost once the board/soc is reset / restarted. i'm not too sure where such logs could be configured to be captured (kswapd?) such that if a paged out / swapped memory could not be restored the event be logged in syslog etc
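For the post mortem part, a sketch of how a saved kernel log could be filtered for oom-killer victims. It assumes the kernel prints a line containing "Killed process PID (name)", and that some copy of the kernel log survived the reset (for example a `dmesg` dump written to storage periodically); `oom_victims` is a hypothetical helper name.

```shell
#!/bin/sh
# Read kernel log text on stdin and print "PID name" for every process
# the oom-killer reports as killed.
oom_victims() {
    sed -n 's/.*Killed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}

# On the board, while it is still up:
#   dmesg | oom_victims
# Or against a previously saved copy (file name is an assumption):
#   oom_victims < /root/dmesg-saved.log
```

No output means no process was killed for lack of memory in the log that was examined, which would speak against the "sshd got killed" idea.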

 

Posted

vm.swappiness=100

 

https://ghostbin.com/paste/vrt5t

 

With vm.swappiness=30: system works fine

With vm.swappiness=100: system works fine for a random time (3-12 h), then hangs with ssh unreachable, yellow ethernet led fixed on, pihole/motioneye/rpi monitor web pages unreachable.

 

Hardware: OrangePI Zero v1.4 - Sandisk uSD 16GB U1 A1 (checked, good health) EXT4 - Toshiba USB Stick 32GB (checked, good health) F2FS, USB PSU FriendlyARM 5V/3A (checked, good health).

 

Posted

Let's stop here. I won't further waste my time with this 'report' (which is just a bunch of theories, assumptions and a weird methodology to back an idea).

 

@Moklev: I was asking for armbianmonitor -u output but got only a redacted/censored variant (the line numbers are there for a reason). I was asking for what's the output of 'free -m' NOW. As in 'with your vm.swappiness=30 or 60 setting'. Instead you rebooted (why?! You can adjust vm.swappiness all the time, no reboot needed).

 

Providing 'free -m' output directly after a reboot is pointless as you see ZERO swapping happened. And providing a log containing information from 14.36.28 until 14.53.31 is pointless too.

 

14 minutes ago, Moklev said:

With vm.swappiness=100: system works fine for a random time (3-12 h), then hangs with ... rpi monitor web pages unreachable

 

Ok

On 9/26/2018 at 10:19 AM, Moklev said:

Yes, I've started monitoring...

 

Yesterday you started monitoring. Now you report having switched from vm.swappiness=30 to 100. But you're able to report that with the 100 setting your board froze and even the RPi Monitor web page was not accessible. So within the last 29 hours you already tested twice with the 100 setting and were able to report your board crashing (since you talk about '3-12 h'; if these 3-12 h are some anecdotal story from days ago, I'm not interested. It's only relevant what happens now with some monitoring installed that is able to provide insights).

 

I'm interested in resolving real problems if there are any. What happens here is developing weird theories and trying to back them. Unless I get data I won't look any further into this.

 

If you had RPi-Monitor running, the graphs are available even after a reboot, so you can share them focusing on the last hour prior to the latest crash (cpufreq, temperature, load). Also if 'ssh unreachable, yellow ethernet led fixed on, pihole/motioneye/rpi monitor web pages unreachable' is the symptom, then you are reporting that your board froze or crashed (and not that sshd stopped being responsive).

Posted

after some searches i've only found a related but different issue

https://github.com/armbian/build/issues/219

that relates mainly to kswapd taking 100% cpu - found quite a few instances in google searches not related to armbian. that particular issue seemed to be alleviated by vm.swappiness=0

but given the rather large lapse of time that issue is raised in 2016, and that it is for a different kernel 3.4.x it may not be directly relevant

i think for the time being selecting a vm.swappiness in a mid range between 0 to 100 seem to alleviate related issues surfaced currently

Posted
5 minutes ago, ag123 said:

i think for the time being selecting a vm.swappiness in a mid range between 0 to 100 seem to alleviate related issues surfaced currently

 

Are you trolling? Which 'related issues'? I only know of one overheating Orange Pi Zero that for whatever reasons is/was unstable. Zero useful efforts taken to diagnose this single problem other than a huge waste of time right now.

Posted

nope, i'm referring to the non-responding sshd, rpi-monitor freezes etc. that seemed to be alleviated selecting a vm.swappiness that is moderate say 50 etc

Posted

Ok, this is pure madness here. Won't look into this thread for some time.

 

@ag123 it's really nice that you're creative and able to develop theories about Linux being broken in general (breaking the system by swapping out important daemons -- I guess you also think kernel developers behave that moronic that they allow to swap out the kernel itself?). It also is nice that you ignore reality (@Moklev having reported a system freeze and not sshd being unresponsive). It also doesn't matter that much that you create out of one single report of 'freezes since some time' for reasons yet unknown a proof that there's something wrong with the new vm.swappiness default in Armbian and talk about 'related issues surfaced currently' (issues --> plural).

 

I simply won't read these funny stories any more since they're preventing time better spent on something useful.

 

Here are the download stats: https://dl.armbian.com/_download-stats/ Check when 5.60 has been released, count the number of downloads, count the reports of boards freezing/crashing in the forum and you know about 'issues surfaced currently'. 

 

 

Posted
30 minutes ago, tkaiser said:

I was asking for armbianmonitor -u output but got only a redacted/censored variant (the line numbers are there for a reason). I was asking for what's the output of 'free -m' NOW. As in 'with your vm.swappiness=30 or 60 setting'. Instead you rebooted (why?! You can adjust vm.swappiness all the time, no reboot needed).

 

It's a problem: "armbianmonitor -u" is broken and all free services (pastebin, ghostbin, etc.) are limited to 512 kB-1.5 MB. Right now it is not possible to upload all the necessary data.
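One possible workaround for the size limits, sketched under the assumption that `armbianmonitor -U` prints the full log to stdout (as suggested earlier in this thread): save it to a file and cut it into pieces small enough for the paste service. `split_log` and the file names are hypothetical.

```shell
#!/bin/sh
# Split FILE into pieces of at most SIZE bytes, named FILE.part-aa,
# FILE.part-ab, ... in order.
# $1 = file to split, $2 = maximum piece size (e.g. 400k)
split_log() {
    split -b "$2" "$1" "$1.part-"
}

# On the board:
#   armbianmonitor -U > /root/armbianmonitor.log
#   split_log /root/armbianmonitor.log 400k
```

Splitting by bytes can cut a line in half at a piece boundary, but pasting the pieces in order (or `cat`-ing them back together) restores the original log unchanged.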

 

30 minutes ago, tkaiser said:

Providing 'free -m' output directly after a reboot is pointless as you see ZERO swapping happened. And providing a log containing information from 14.36.28 until 14.53.31 is pointless too.

 

 

Ok

 

Yesterday you started monitoring. Now you report having switched from vm.swappiness=30 to 100. But you're able to report that with the 100 setting your board froze and even the RPi Monitor web page was not accessible. So within the last 29 hours you already tested twice with the 100 setting and were able to report your board crashing (since you talk about '3-12 h'; if these 3-12 h are some anecdotal story from days ago, I'm not interested. It's only relevant what happens now with some monitoring installed that is able to provide insights).

 

The SBC crashed on 20.09 (boot at 13:05, crashed at 15:35*), on 21.09 (boot at 14:00, crashed at 17:37*) and on 22.09 (boot at 13:35, crashed at 23:12*). (*): timestamp of the last picture shot and processed. On Sunday (23.09) I changed vm.swappiness to 30 and the problem was solved.

 

Now I need the following directions to help you:

- when to run the "free -m" command

- how to send the log if armbianmonitor does not work

- how long to run the indicated scripts

 

I need at least 1 or 2 weeks to do everything...

 

 

  • Igor unpinned this topic
This topic is now closed to further replies.