laurentppol Posted May 23, 2020 Hi there, this is a continuation of my previous thread about problems with my OPiPC+. I have two boards: 1) An "OOB" (out-of-the-box) Armbian Buster, which looks to be working OK; I have not yet ported all software / configs to it from the other one. 2) An Armbian Jessie upgraded up to Buster, mainly for mosquitto, OpenHAB2 and PostgreSQL 11 (for logical replication). I have now copied everything (kernel, initrd, etc.; only the UUIDs differ) from the working /boot directory to the one with problems. What I do not understand is that on the "good" one htop shows 4 cores, while on the "bad" one it shows only 1 core and performance is horrible (OpenHAB startup takes minutes). OpenHAB (I suspect) can cause the whole system to hang: ping works, but not even ssh / www. What am I missing? This is mainly to understand "why" rather than to solve it (I can copy the uSD from the working one as soon as I migrate the services to the "new" one).
usual user Posted May 23, 2020 46 minutes ago, laurentppol said: "What am I missing?" You don't say anything about the U-Boot in use and its configuration. U-Boot performs the basic system initialization, so a different U-Boot setup can lead to such behavior. The kernel initialization builds on the initial U-Boot setup.
laurentppol (Author) Posted May 24, 2020 U-Boot is (probably) the "original" one from the "OOB" Armbian Jessie. Should I update it now using nand-sata-install? I tried this using the program from the working OPi, copied over with scp, as there was no "nand-sata-install" on the upgraded device. It boots, but still shows 1 core :(. Is the U-Boot prompt accessible only from the hardware serial console?
usual user Posted May 24, 2020 U-Boot and its components usually reside in the unallocated space between the MBR and the first partition. Which tool you use to put it in the proper location does not really matter. I prefer to learn what the specific parameters of the target device are and use dd. It always works the same way, and I don't have to rely on tools when I don't know what they are really doing. But perhaps someone who knows Armbian better can tell you what the preferred way is in your environment.
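The dd approach described above can be sketched as follows. This assumes an Allwinner (sunxi) board, where the combined SPL + U-Boot image is conventionally written at an 8 KiB offset from the start of the SD card; the function name and its arguments are illustrative, and the offset should be verified for the specific board before pointing dd at a real device.

```shell
#!/bin/sh
# write_sunxi_bootloader IMAGE TARGET   (hypothetical helper name)
# Writes IMAGE at an 8 KiB offset into TARGET, which may be a block device
# (e.g. /dev/mmcblk0) or a disk image file.
# Assumption: 8 KiB is the usual sunxi SD-card offset; check it for your board.
write_sunxi_bootloader() {
    image=$1
    target=$2
    [ -r "$image" ] || { echo "cannot read $image" >&2; return 1; }
    # bs=1024 seek=8 skips the first 8 KiB (MBR and partition table stay untouched).
    # conv=notrunc keeps a regular-file target from being truncated.
    dd if="$image" of="$target" bs=1024 seek=8 conv=notrunc 2>/dev/null || return 1
    sync    # make sure the data has reached the card before it is removed
}
```

Because only the area starting at 8 KiB is written, the partition table at the very start of the card is left alone, which is the main difference from copying the whole beginning of one card onto another.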
Igor Posted May 24, 2020 19 hours ago, laurentppol said: "Armbian Jessie upgraded up to Buster" This is not a recommended way, and you are lucky that the board boots at all ... however, if you don't update the boot loader manually, you will not have all four cores. This is a known problem, but explaining why it happens ... I would need to waste hours, since it has slipped my mind. Use a search engine to find that info if you are curious.
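To check the core count without relying on htop, the kernel's own CPU lists in sysfs can be compared. A minimal sketch, assuming the standard /sys/devices/system/cpu/online and /sys/devices/system/cpu/present files; the helper name is made up for illustration:

```shell
#!/bin/sh
# count_cpu_list LIST   (hypothetical helper name)
# Counts the CPUs in a kernel CPU list string such as "0-3" or "0,2-3".
count_cpu_list() {
    list=$1
    total=0
    old_ifs=$IFS
    IFS=,
    for range in $list; do
        case $range in
            *-*) first=${range%-*}
                 last=${range#*-}
                 total=$((total + last - first + 1)) ;;
            '')  ;;
            *)   total=$((total + 1)) ;;
        esac
    done
    IFS=$old_ifs
    echo "$total"
}

# Usage on a live board:
#   count_cpu_list "$(cat /sys/devices/system/cpu/online)"
#   count_cpu_list "$(cat /sys/devices/system/cpu/present)"
```

If "present" reports four CPUs but "online" reports one, the kernel knows about the cores but could not bring them up, which would point at boot firmware rather than at the root filesystem.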
laurentppol (Author) Posted July 11, 2020 Today I "upgraded" the bootloader. As both devices have identical SD cards and partition layouts, I simply copied (using dd + scp) the beginning of /dev/mmcblk1 from the working to the "partially working" OPiPC+. Rebooted, but NO change: htop still shows only 1 core on the "old" one :( What could I check / change next? (1 core is probably far from enough for a system hosting OpenHAB2, which is quite a big Java app.)
usual user Posted July 11, 2020 3 hours ago, laurentppol said: "What could I check / change next?" Check the full verbose boot log of both devices. It is best obtained via the serial console, as that will most probably also preserve the U-Boot messages.
laurentppol (Author) Posted July 11, 2020 From the "non-working" one I'll probably get this during the weekend, as I have physical access to it. The other (100% working) one is at my neighbor's, so I will do it in a few days.
usual user Posted July 11, 2020 On 5/23/2020 at 7:27 PM, laurentppol said: "Now I have copied everything (only different UUIDS) from working /boot directory to the one with problems. (kernel, initrd, etc)." Just a shot in the dark. When you copied the kernel binary, did you also copy its companion modules from /usr/lib/modules/<kernel-version> to your rootfs? Perhaps the initrd lacks an essential module that is later loaded from the rootfs. You must use the same <kernel-version> directory name in your rootfs.
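A quick way to confirm that the rootfs carries modules for the kernel that is actually running is sketched below. The helper name is illustrative; the check relies on modules.dep, which depmod normally generates inside each /lib/modules/<kernel-version> directory.

```shell
#!/bin/sh
# has_modules_for ROOT KVER   (hypothetical helper name)
# Succeeds if ROOT contains a module tree for kernel release KVER.
# ROOT is the mount point of the root filesystem; use "" for the live system.
has_modules_for() {
    root=$1
    kver=$2
    [ -f "$root/lib/modules/$kver/modules.dep" ]
}

# Usage on a live board:
#   if has_modules_for "" "$(uname -r)"; then
#       echo "modules present for $(uname -r)"
#   else
#       echo "modules MISSING for $(uname -r)"
#   fi
```

The release string must match exactly, including any suffix that `uname -r` prints after the version number.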
laurentppol (Author) Posted July 11, 2020 OK, I forgot. I have just copied them now; on the "non-working" board there are now /lib/modules/ directories for 5.4.31 (from the NEW board, stock Armbian Buster) AND 5.4.35 (which, surprise, is what shows on the OLD board). Rebooted, no change in htop, but there are differences in lsmod.

lsmod on the OLD one:

Module                        Size  Used by
uinput                       20480  2
snd_soc_hdmi_codec           16384  1
zram                         24576  2
rc_cec                       16384  0
snd_soc_simple_card          16384  1
snd_soc_simple_card_utils    16384  1 snd_soc_simple_card
sun8i_codec_analog           24576  1
lima                         36864  0
sun4i_i2s                    24576  2
sunxi_cir                    16384  0
dw_hdmi_i2s_audio            16384  0
dw_hdmi_cec                  16384  0
sun8i_adda_pr_regmap         16384  1 sun8i_codec_analog
sun4i_codec                  40960  3
sun4i_gpadc_iio              16384  0
gpu_sched                    24576  1 lima
snd_soc_core                118784  6 sun4i_codec,sun4i_i2s,sun8i_codec_analog,snd_soc_hdmi_codec,snd_soc_simple_card_utils,snd_soc_simple_card
snd_pcm_dmaengine            16384  1 snd_soc_core
snd_pcm                      65536  5 sun4i_codec,sun4i_i2s,snd_pcm_dmaengine,snd_soc_hdmi_codec,snd_soc_core
industrialio                 49152  1 sun4i_gpadc_iio
snd_timer                    24576  1 snd_pcm
sun8i_thermal                16384  0
snd                          45056  8 snd_soc_hdmi_codec,snd_timer,snd_soc_core,snd_pcm
soundcore                    16384  1 snd
sunxi_cedrus                 32768  0
v4l2_mem2mem                 20480  1 sunxi_cedrus
evdev                        20480  4
uio_pdrv_genirq              16384  0
uio                          16384  1 uio_pdrv_genirq
cpufreq_dt                   16384  0
binfmt_misc                  16384  1
8189fs                     1081344  0
cfg80211                    425984  1 8189fs
rfkill                       20480  3 cfg80211
dm_crypt                     32768  0
dm_mod                       94208  1 dm_crypt
dax                          24576  1 dm_mod
ip_tables                    24576  0
x_tables                     20480  1 ip_tables
pwrseq_simple                16384  1
sy8106a_regulator            16384  1
gpio_keys                    20480  0

and on the NEW one:

Module                        Size  Used by
overlay                      77824  2
fuse                         81920  3
snd_soc_hdmi_codec           16384  1
8189fs                     1081344  0
rc_cec                       16384  0
sunxi_cir                    16384  0
sun8i_codec_analog           24576  1
sun8i_adda_pr_regmap         16384  1 sun8i_codec_analog
sun4i_gpadc_iio              16384  0
lima                         36864  0
industrialio                 49152  1 sun4i_gpadc_iio
uas                          20480  0
gpu_sched                    24576  1 lima
dw_hdmi_cec                  16384  0
dw_hdmi_i2s_audio            16384  0
sun8i_thermal                16384  0
sun4i_codec                  40960  3
sun4i_i2s                    24576  2
cfg80211                    425984  1 8189fs
rfkill                       20480  3 cfg80211
sunxi_cedrus                 32768  0
v4l2_mem2mem                 20480  1 sunxi_cedrus
zram                         24576  2
snd_soc_simple_card          16384  1
snd_soc_simple_card_utils    16384  1 snd_soc_simple_card
snd_soc_core                118784  6 sun4i_codec,sun4i_i2s,sun8i_codec_analog,snd_soc_hdmi_codec,snd_soc_simple_card_utils,snd_soc_simple_card
snd_pcm_dmaengine            16384  1 snd_soc_core
snd_pcm                      65536  5 sun4i_codec,sun4i_i2s,snd_pcm_dmaengine,snd_soc_hdmi_codec,snd_soc_core
snd_timer                    24576  1 snd_pcm
snd                          45056  8 snd_soc_hdmi_codec,snd_timer,snd_soc_core,snd_pcm
soundcore                    16384  1 snd
evdev                        20480  5
uio_pdrv_genirq              16384  0
uio                          16384  1 uio_pdrv_genirq
cpufreq_dt                   16384  0
ip_tables                    24576  0
x_tables                     20480  1 ip_tables
pwrseq_simple                16384  1
sy8106a_regulator            16384  1
gpio_keys                    20480  0

Does it say something? The absence of encryption/LVM modules on the new one may be because it was not connected to a USB HDD with LUKS/LVM partitions (the old one was).
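Comparing two lsmod listings by eye is error-prone, so a small sketch for diffing saved listings may help. It assumes each board's output was saved with `lsmod > file`; the function names are illustrative.

```shell
#!/bin/sh
# module_names FILE: print the sorted module names from saved `lsmod` output.
module_names() {
    # Skip the "Module  Size  Used by" header line and keep column 1.
    awk 'NR > 1 { print $1 }' "$1" | LC_ALL=C sort
}

# only_in_first A B: print modules that appear in listing A but not in B.
only_in_first() {
    first_sorted=$(mktemp)
    second_sorted=$(mktemp)
    module_names "$1" > "$first_sorted"
    module_names "$2" > "$second_sorted"
    # comm -23 keeps only the lines unique to the first (sorted) input.
    LC_ALL=C comm -23 "$first_sorted" "$second_sorted"
    rm -f "$first_sorted" "$second_sorted"
}

# Usage:
#   only_in_first lsmod-old.txt lsmod-new.txt
#   only_in_first lsmod-new.txt lsmod-old.txt
```

For the two listings in this post, the modules unique to the old board are uinput, binfmt_misc, dm_crypt, dm_mod and dax, and those unique to the new board are overlay, fuse and uas. None of these looks related to bringing up CPU cores.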
usual user Posted July 11, 2020 1 hour ago, laurentppol said: "Does it say something?" Perhaps checking the output of "dmesg" recorded on both devices after a fresh reboot can give more insight. As you said you had copied everything in /boot, you also copied the kernel's companion dtb files, didn't you?
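When comparing the two dmesg logs, the lines about secondary CPU bring-up are the interesting ones. A rough sketch for pulling them out of a saved log follows; the search pattern is an assumption, since the exact messages vary between kernel versions (a healthy quad-core boot typically includes a line such as "smp: Brought up 1 node, 4 CPUs").

```shell
#!/bin/sh
# smp_lines FILE   (hypothetical helper name)
# Prints the lines of a saved dmesg log that mention SMP bring-up,
# individual CPUs, or PSCI (the firmware interface often used to start cores).
smp_lines() {
    grep -iE 'smp:|CPU[0-9]+:|psci' "$1"
}

# Usage:
#   dmesg > /tmp/dmesg-old.txt        # run on each board after a fresh reboot
#   smp_lines /tmp/dmesg-old.txt
```

Running it on both boards and comparing the output side by side should show whether the kernel tried and failed to start cores 1 to 3, or never attempted it.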
xwiggen Posted July 12, 2020 have you tried swapping SD+PSU+cable to rule out any hardware issues?
laurentppol (Author) Posted July 14, 2020 Yes, I did. My power circuit is quite complicated: a battery-backed SMPS PSU @ 12V, then an SMPS converter (rated 3A) to 5V. I switched to another 12V/5V converter and cable, same result. I will also try replacing the SD card, but both are Samsung class 10 32GB.
Vant195 Posted July 14, 2020 On 7/12/2020 at 10:42 PM, xwiggen said: "have you tried swapping SD+PSU+cable to rule out any hardware issues?" I'm having similar issues to the OP and I did try this method. Nothing worked. 8 hours ago, laurentppol said: "Will also try to replace SD card" Can you update?
laurentppol (Author) Posted July 16, 2020 OK, I cloned the "working" uSD (dd + gzip + scp + gunzip + dd over the network). At least the problem with only 1 core is "gone"; I will do a "burn-in" test for a few days. This will also give me the possibility to attach to serial and record the boot sequence (swapping cards). But I think it is a system problem. It would be interesting to find out WHY it went bad (after the upgrade from Jessie). Anyway, I am migrating OpenHAB2 and PostgreSQL to an OPi3: more MHz, 2x more RAM. It is too heavy for the OPiPC(+). The final target will be an OPi4. Edited July 16, 2020 by laurentppol
laurentppol (Author) Posted July 17, 2020 So far (24+ h) it is running OK, still 4 cores in htop.
laurentppol (Author) Posted August 3, 2020 It stopped after 5 days :(. This MAY be a temperature issue:
- I found the system in a "dead" state (the graphical login screen was displayed, but mouse & keyboard were dead), with not even ping working.
- After a power cycle the system restarted (with OpenHAB2 "starting"), and htop showed a core temperature of 80+ °C.
- The case was REALLY HOT.
In a few days I expect to get a fan for it (originally for the RPi3, but I think I can adapt it; it is 3.3V/5V). I will drill ventilation holes and then return to testing. I have a spare heatsink and will attach it to the processor too. I also have a shell one-liner running that writes the date to a file every minute, so I'll be able to tell when it started and when it stopped. Does anyone know how to get the core temperature (like on the ssh login screen & the htop screen) from the shell command line? I also stopped OpenHAB2, as it was eating 280MB of RAM.
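The poster's one-liner is not shown in the thread. A minimal sketch of such a heartbeat logger, with an illustrative function name and an assumed log path in the usage comment, could look like this:

```shell
#!/bin/sh
# heartbeat LOGFILE [INTERVAL_SECONDS] [COUNT]   (hypothetical helper name)
# Appends a timestamp to LOGFILE every INTERVAL seconds (default 60).
# COUNT limits the number of entries; 0 (the default) means run until killed.
heartbeat() {
    logfile=$1
    interval=${2:-60}
    count=${3:-0}
    written=0
    while :; do
        date '+%Y-%m-%d %H:%M:%S' >> "$logfile"
        sync    # flush to the card so the last entry survives a hard hang
        written=$((written + 1))
        [ "$count" -gt 0 ] && [ "$written" -ge "$count" ] && break
        sleep "$interval"
    done
}

# Usage (log path is an example):
#   heartbeat /var/log/heartbeat.log 60 &
```

After a hang, the last line in the log gives the time of the last successful write, accurate to within one interval.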
usual user Posted August 3, 2020 4 hours ago, laurentppol said: "Someone knows how to get core temperature (like on ssh login screen & htop screen) from shell command line?" Use "tmon -dl".
laurentppol (Author) Posted August 5, 2020 I don't have "tmon" (as root). What should I install (package name)?
usual user Posted August 5, 2020 The tool is part of the kernel source tree. I don't know how it's packaged in your environment, but following the README it can be built very easily in the appropriate directory. If it is not packaged for Armbian, this is perhaps a good opportunity to contribute it to Armbian by PR.
laurentppol (Author) Posted August 6, 2020 Is this a "simple Makefile" project? Can it be found via Google?
usual user Posted August 7, 2020 11 hours ago, laurentppol said: "Is this "simple Makefile" project? Can be found by google?" Did you follow the link in the first post about tmon and look around the directory where the README is located?
laurentppol (Author) Posted August 8, 2020 Got to the README finally. Installed linux-source, then libncurses-dev, then compiled. Works. A bit too complicated (not suitable for script use, IMO). Do you know anything better suited for scripts? E.g. 1 line per temp? Or should I just grep the tmon output?
xwiggen Posted August 8, 2020 1 hour ago, laurentppol said: "Do You know anything better suited for scripts? Eg 1 line per temp? Or just do a grep on tmon output?" you could use sensors from lm-sensors
laurentppol (Author) Posted August 8, 2020 Ooops, installed, but sensors-detect finds nothing. It suggests verifying that all needed kernel modules are loaded. ?
xwiggen Posted August 8, 2020 25 minutes ago, laurentppol said: "Ooops, installed but sensors-detect finds nothing. It suggests to verify if all needed kernel modules are loaded. ?" just use sensors, cpu temp is measured
laurentppol (Author) Posted August 8, 2020
root@orangepi3:~# sensors
No sensors found!
Make sure you loaded all the kernel drivers you need.
Try sensors-detect to find out which these are.
:(
usual user Posted August 9, 2020 8 hours ago, laurentppol said: "A bit too complicated (not suitable for script use, IMO). Do You know anything better suited for scripts?" I find the log that tmon produces quite usable. Here is one visualized by piping it through gnuplot: (image not preserved)
xwiggen Posted August 9, 2020 13 hours ago, laurentppol said: "root@orangepi3:~# sensors No sensors found! Make sure you loaded all the kernel drivers you need. Try sensors-detect to find out which these are." odd. 5.4.x does not display the temp, the coming release does :-)
# modprobe sun8i-thermal
# cat /sys/devices/virtual/thermal/thermal_zone0/temp
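For script use, the sysfs file above can be wrapped in a small function that prints one value per call. This is a sketch with an illustrative function name; it assumes the kernel reports the value in millidegrees Celsius, which is the convention for thermal zones on mainline kernels.

```shell
#!/bin/sh
# read_temp_c [FILE]   (hypothetical helper name)
# Prints the temperature in degrees Celsius with one decimal place.
# Assumption: FILE holds millidegrees Celsius (e.g. 45000 means 45.0).
read_temp_c() {
    file=${1:-/sys/devices/virtual/thermal/thermal_zone0/temp}
    [ -r "$file" ] || return 1
    LC_ALL=C awk '{ printf "%.1f\n", $1 / 1000 }' "$file"
}

# Usage: one line per minute with timestamp and temperature (log path is an example):
#   while :; do
#       echo "$(date '+%Y-%m-%d %H:%M:%S') $(read_temp_c)"
#       sleep 60
#   done >> /var/log/cpu-temp.log
```

Logging the temperature next to a timestamp makes it possible to see afterwards whether the board was running hot shortly before it hung.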