chwe

research The dogafu experiment (DS18b20, unreliable SD-Card & power source)


I'm starting to work on what I'll call the "dogafu" experiment (don't give a fuck about recommendations). It combines my threads about powering & bad SD-Cards. Since I want to do something useful and not just crash armbian on a system where I know it will happen, I decided to give my OPi zero more than a stupid task. I could just hammer the SD-Card with a webcam and motion until it crashes and write in this thread 'opi 0 with terrible setup crashes after x days'. But nobody would read that, because it's not interesting for anyone: we know it would happen, and for users who don't know, there'd be nothing of interest in this thread.

Since thermal throttling on the OPi zero seems to be a real issue and there's little evidence that the temperature readouts from the SoC are correct, I decided to connect a DS18b20 to the OPi0 and let it measure the temperature of the SoC. Everything was installed on ARMBIAN 5.31 stable Ubuntu 16.04.2 LTS 3.4.113-sun8i.

 

Hooking up the DS18b20:

First we edit the module configuration and uncomment the w1 modules with sudo nano /etc/modules-load.d/modules.conf:

w1-sunxi
w1-gpio
w1-therm
#sunxi-cir
xradio_wlan
g_serial
xradio_wlan
gpio-sunxi
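
A quick way to double-check that edit is to look for the three w1 lines in the file. This is only a sketch: the function name missing_w1_modules is made up for illustration, and it only checks the file, not whether the modules were actually loaded.

```shell
# Print every w1 module that is NOT listed as an active (uncommented) line.
# $1 = modules file (defaults to the one edited above)
missing_w1_modules() (
    file=${1:-/etc/modules-load.d/modules.conf}
    for mod in w1-sunxi w1-gpio w1-therm; do
        # -x matches whole lines only, so "#w1-gpio" counts as missing
        grep -qx "[[:space:]]*$mod[[:space:]]*" "$file" || printf '%s\n' "$mod"
    done
)
```

Calling missing_w1_modules without arguments should print nothing if all three lines are active. After a reboot, lsmod | grep w1 shows whether the modules really got loaded.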

Since onewire does not work properly @240 MHz, we also have to raise MIN_SPEED to 480000 with:

sudo nano /etc/default/cpufrequtils

ENABLE=true
MIN_SPEED=480000
MAX_SPEED=1200000
GOVERNOR=interactive
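
To see whether the new minimum is really in effect after a reboot, the value can be read back from sysfs. A minimal sketch, assuming the usual cpufreq file scaling_min_freq is present on this kernel; the helper name min_freq_ok is my own invention.

```shell
# Succeed if the frequency stored in file $1 (in kHz) is at least $2 kHz.
# $2 defaults to the 480000 needed for onewire.
min_freq_ok() (
    file=$1
    wanted=${2:-480000}
    IFS= read -r current < "$file" || exit 2
    # anything that is not a plain number is treated as an error
    case $current in
        ''|*[!0-9]*) exit 2 ;;
    esac
    [ "$current" -ge "$wanted" ]
)
```

Usage would be something like min_freq_ok /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq && echo "min speed ok".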

 

After a shutdown we can connect the DS18b20: Data to GPIO10 (physical pin 26), VCC to one of the 3.3V pins (physical pin 1 or 17) and ground to one of the GND pins (physical pin 6, 9, 14, 20 or 25). Don't forget to put a 4.7kOhm resistor between VCC and Data! If you don't want your data pin on GPIO10, you have to modify script.bin: convert it with bin2fex /boot/script.bin /tmp/orange.fex, open it with nano /tmp/orange.fex and change the GPIO in the [w1_para] section (example for using GPIO6 for the DS18b20):

[w1_para]
w1_used = 1
gpio = 6
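
If you'd rather script that edit than do it in nano, something like the following could work. It's a sketch under assumptions: set_w1_gpio is a made-up helper, and it expects the fex file to contain a [w1_para] section with a gpio = line, as in the example above. Lines in other sections are left alone.

```shell
# Set "gpio = <n>" inside the [w1_para] section of a fex file.
# $1 = fex file (e.g. /tmp/orange.fex), $2 = GPIO number
set_w1_gpio() (
    fex=$1
    gpio=$2
    tmp=$(mktemp) || exit 1
    awk -v gpio="$gpio" '
        # every section header switches the flag on or off
        /^\[/ { in_w1 = ($0 == "[w1_para]") }
        # only rewrite the gpio line while inside [w1_para]
        in_w1 && /^gpio[ \t]*=/ { print "gpio = " gpio; next }
        { print }
    ' "$fex" > "$tmp" && cat "$tmp" > "$fex"
    status=$?
    rm -f "$tmp"
    exit "$status"
)
```

For the example above this would be set_w1_gpio /tmp/orange.fex 6.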

 


Convert it back with sudo fex2bin /tmp/orange.fex /boot/script.bin; a reboot is needed to activate these settings. If everything works correctly, sudo armbianmonitor -m should show that the cpu frequency does not go below 480MHz (otherwise the DS18b20 would not run smoothly). After cd /sys/bus/w1/devices/ we can see our sensor with ls.

opi@orangepizero:/sys/bus/w1/devices$ ls
28-0517010cbeff  w1_bus_master1
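
Since every sensor has its own ID, a script doesn't have to hard-code it. A small sketch that lists all DS18b20-style entries (the ones starting with 28-) below the w1 devices directory; list_ds18b20 is just a name I picked.

```shell
# Print the IDs of all 28-* sensors that expose a w1_slave file.
# $1 = base directory (defaults to the sysfs path used above)
list_ds18b20() (
    base=${1:-/sys/bus/w1/devices}
    for dir in "$base"/28-*; do
        # skips the unexpanded pattern when no sensor is present
        [ -e "$dir/w1_slave" ] || continue
        printf '%s\n' "${dir##*/}"
    done
)
```

With one sensor attached, list_ds18b20 should print a single ID; w1_bus_master1 is ignored.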

 

 

The sensor's name should look like 28-XXXXXXXXXXXX. cat 28-0517010cbeff/w1_slave should show the current temperature.

opi@orangepizero:/sys/bus/w1/devices$ cat 28-0517010cbeff/w1_slave
79 01 4b 46 7f ff 0c 10 29 : crc=29 YES
79 01 4b 46 7f ff 0c 10 29 t=23562

 

So the current temperature in my room is 23.562°C (IMO forget about everything after the decimal point; precise temperature measurement isn't trivial and needs calibration, which is not possible without professional equipment).
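
The two output lines can also be parsed by a script: the first line ends in YES when the CRC check passed, and the number after t= is the temperature in thousandths of a degree. A minimal sketch of that; parse_w1_slave is a hypothetical helper name, and it reads the w1_slave content from stdin.

```shell
# Read w1_slave output from stdin and print the temperature in °C.
# Fails (non-zero exit) if the CRC line does not end in YES.
parse_w1_slave() (
    IFS= read -r crc_line || exit 1
    IFS= read -r temp_line || exit 1
    case $crc_line in
        *YES) ;;
        *) exit 1 ;;
    esac
    case $temp_line in
        *t=*) raw=${temp_line##*t=} ;;
        *) exit 1 ;;
    esac
    # split off a leading minus so that -0.5 °C keeps its sign
    sign=
    case $raw in
        -*) sign=-; raw=${raw#-} ;;
    esac
    case $raw in
        ''|*[!0-9]*) exit 1 ;;
    esac
    printf '%s%d.%03d\n' "$sign" "$((raw / 1000))" "$((raw % 1000))"
)
```

Usage: parse_w1_slave < /sys/bus/w1/devices/28-0517010cbeff/w1_slave, which for the reading above gives 23.562.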

 

Sending data to another device:

Since this setup with bad powering & a corrupt SD card will eventually break and I do not want to lose the collected data, I decided to send all the data via mqtt to a second, properly running OPi 0. This is done with some bash scripts and crontab (it would be possible with crontab alone, but since I may use these scripts on other devices for other purposes, it's easier to have them isolated). For this, I installed an mqtt client with sudo apt-get install mosquitto-clients. After installation, we test whether the client can publish to another device with: mosquitto_pub -h 192.168.x.xx -t test -m "everything works ;-)" (-h IP of the mqtt broker, -t topic to publish to, -m msg.payload). On my second OPi 0 with node-red and mosquitto we should see that the message arrived (installation of node-red and mosquitto).

[Image: node-red_test.jpg]

(with #, we subscribe to every topic on the mosquitto broker)
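
The same check works without node-red: mosquitto_sub -v prints every message as "topic payload", one per line. As a rough sketch (not what I actually run), those lines could be turned into CSV with a timestamp so the data also ends up in a plain file. The function name to_csv and the log file name are made up.

```shell
# Turn "topic payload" lines (mosquitto_sub -v format) from stdin
# into "unix_time,topic,payload" lines.
to_csv() {
    while IFS=' ' read -r topic payload; do
        [ -n "$topic" ] || continue
        printf '%s,%s,%s\n' "$(date +%s)" "$topic" "$payload"
    done
}

# Example call on the broker itself (hypothetical log file):
#   mosquitto_sub -h localhost -t '#' -v | to_csv >> mqtt_log.csv
```

The timestamp is the arrival time on the logging device, not the time of measurement.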


Bash script & crontab:

IMO the easiest way to send data periodically is to write a small bash script which includes all the tasks and then set up a crontab to start it. The script was created with nano scriptname and saved in /home/opi/scripts/ (it might be possible to do these tasks without a bash script, but since I'll reuse parts of it I decided this is the lazy way to do it. :rolleyes:):

#!/bin/bash
# get thermal data from thermal_zone0 & thermal_zone1
mosquitto_pub -h 192.168.x.xx -t bad_opi/temp/thermal_zone0/ -s </sys/devices/virtual/thermal/thermal_zone0/temp
mosquitto_pub -h 192.168.x.xx -t bad_opi/temp/thermal_zone1/ -s </sys/devices/virtual/thermal/thermal_zone1/temp
# get cpu speed from cpu0 to cpu3
mosquitto_pub -h 192.168.x.xx -t bad_opi/cpu/cpu0/ -s </sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
mosquitto_pub -h 192.168.x.xx -t bad_opi/cpu/cpu1/ -s </sys/devices/system/cpu/cpu1/cpufreq/cpuinfo_cur_freq
mosquitto_pub -h 192.168.x.xx -t bad_opi/cpu/cpu2/ -s </sys/devices/system/cpu/cpu2/cpufreq/cpuinfo_cur_freq
mosquitto_pub -h 192.168.x.xx -t bad_opi/cpu/cpu3/ -s </sys/devices/system/cpu/cpu3/cpufreq/cpuinfo_cur_freq
# get temp from DS18b20 (second line of w1_slave, the digits after "t=")
output=$(head -n 2 /sys/bus/w1/devices/28-0517010cbeff/w1_slave | tail -n 1 | cut -c 30-35)
# integer division: thousandths of a degree to whole degrees Celsius
output2=$((output / 1000))
mosquitto_pub -h 192.168.x.xx -t bad_opi/temp/DS18b20/ -m "$output2"
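
The four cpu lines are identical except for the number, so they could be folded into a loop. This is a sketch of a possible variant, not the script I use: BROKER, MQTT_PUB, publish_file and publish_cpu_freqs are names invented here. MQTT_PUB only exists so the publishing command can be swapped for a dummy when trying things out without a broker.

```shell
#!/bin/bash
BROKER=${BROKER:-192.168.x.xx}          # IP of the mqtt broker, as above
MQTT_PUB=${MQTT_PUB:-mosquitto_pub}     # command used for publishing

# Publish the content of file $2 on topic $1 (-s reads the payload from stdin).
publish_file() {
    [ -r "$2" ] || return 1
    "$MQTT_PUB" -h "$BROKER" -t "$1" -s < "$2"
}

# Publish the current frequency of cpu0 to cpu3.
# $1 = base directory (defaults to the sysfs cpu directory)
publish_cpu_freqs() (
    base=${1:-/sys/devices/system/cpu}
    for n in 0 1 2 3; do
        publish_file "bad_opi/cpu/cpu$n/" "$base/cpu$n/cpufreq/cpuinfo_cur_freq"
    done
)
```

Calling publish_cpu_freqs then replaces the four cpu lines, and publish_file can be reused for the thermal zones with the topics from the script above.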

 

 

Since the cpu info is only readable as root, the crontab must run with root privileges. Add a new crontab with sudo crontab -e (1 to edit with nano). This crontab will start our script every minute:

 

# Edit this file to introduce tasks to be run by cron.
#
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
#
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').#
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
#
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
#
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
#
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h  dom mon dow   command

* * * * * bash /home/opi/scripts/scriptname

Since this script now runs with root privileges, this might be a security risk, so make sure that no one has access to your OPi! Now it's time to set up everything in node-red to make our results visible. I added dashboard to node-red to have some nice UI templates (how to use and set up node-red can easily be learned via google :P ).

[Image: node-red_setup.jpg]

FYI: This is an ongoing project. At the moment, everything runs on a reliable SD-Card and the DS18b20 is not properly mounted on the SoC (waiting for thermal paste). As soon as I have everything set up, I'll put it on the bad SD-Card with a cheap phone charger to see how stable it runs.


Had some 'soldering fun' today. My DS18b20 is now fixed on the SoC.  :D

[Image: DS18b20.thumb.jpg]

Unfortunately, I don't have a rev. 1.1 board to compare this data with. During a stress test the temperature on the DS18b20 is around 56-57°C (internal temp 69-70°C), and at idle 45°C (internal temp 52°C). I noticed that the idle temp is slightly lower than without the DS18b20 attached (~54-55°C). Armbianmonitor mostly displays that the cpu runs @480MHz, but the recording of each CPU suggests that they mostly run @1008MHz. Sometimes CPU0 goes down to 480MHz (once or twice I noticed that CPU1 also ran @480MHz). Anything between 480 and 1008 MHz only happens if there's thermal throttling.

 

Edit:

Because of this thread I decided to connect the system to my router via wlan instead of a lan cable.

 

Also 'Wi-Fi' issues should be fixed by the hardware change on PCB rev 1.4 -- no idea since... no one is testing this for reasons unknown to me).

Since there's some interest in the stability of the wlan on OPi rev 1.4, I decided to run the experiment for some days with the reliable SD-Card & powering to see how it performs. Added a ping test every 20 seconds for you. :P
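
For anyone who wants to rebuild such a ping test with cron: cron only fires once per minute, so a 20 second interval needs three entries with different delays. The following is only one possible way to do it; the helper extract_rtt and the script name pingtest are hypothetical.

```shell
# Read the output of "ping -c 1 <host>" from stdin and print the
# round-trip time in ms. Prints nothing if no reply line was found.
extract_rtt() {
    sed -n 's/.*time[=<]\([0-9.]*\) *ms.*/\1/p' | head -n 1
}

# Possible crontab entries for a 20 s interval (pingtest is a made-up
# script that pings the OPi and publishes the result via mqtt):
#   * * * * * /home/opi/scripts/pingtest
#   * * * * * sleep 20; /home/opi/scripts/pingtest
#   * * * * * sleep 40; /home/opi/scripts/pingtest
```

An empty result from extract_rtt then means a lost ping.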


Lost ping to my opi 40 minutes ago. Unfortunately I forgot to solder some pins for serial debug (not sure if I should do this on a running system), so it's not clear whether the whole system crashed or only wireless. I'll solder serial debug pins to my 'expansion board' and start up the system again. The DS18b20 had two erroneous readings during this session.

[Image: lost_wlan.jpg]


After a longer break, the dogafu experiment is online again. Since rev. 1.4 of the OPi0 is prone to overheating, I tried some ways to avoid this. Since thermal throttling normally leads to a CPU frequency of 768MHz, no throttling should occur as long as your max cpu freq. is set to 768MHz. stress -c 4 ran smoothly without any issues for more than one hour. Wifi still seems to be unstable. Ping is horribly high (iwconfig showed that power management is off). Normally I lose the connection after 5-12 hours (I'll test it now with power management on). Sometimes the wifi crash takes the whole system down with it, so no connection via serial console is possible.
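
The max frequency lives in the same /etc/default/cpufrequtils file as MIN_SPEED from the first post. A minimal sketch for changing it from a script; set_max_speed is a made-up helper, and it assumes the file already contains a MAX_SPEED= line.

```shell
# Rewrite the MAX_SPEED line of a cpufrequtils config file.
# $1 = config file, $2 = new max frequency in kHz (digits only)
set_max_speed() (
    file=$1
    khz=$2
    case $khz in
        ''|*[!0-9]*) exit 1 ;;
    esac
    tmp=$(mktemp) || exit 1
    sed "s/^MAX_SPEED=.*/MAX_SPEED=$khz/" "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    exit "$status"
)
```

For 768MHz this would be set_max_speed /etc/default/cpufrequtils 768000 (as root), followed by a reboot.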

Unlike Igor, I'm not 100% convinced that the thermal readings on the SoC are false. The temperature difference between the DS18b20 and the internal temperature could also come from the 'point of measurement'. Measurements outside of a liquid are always hard. There are many reasons why the measured temperatures could be too low (e.g. contact surface, insulation/heat spreading between cpu and SoC surface). I'd need a rev. 1.1 board to log the SoC temperature with the same setup. Since I don't own such a board, I have no way to do this. ;) Maybe I'll test the temperature on my OPi pc plus to get a 'feeling' for the DS18b20 readings compared to the internal readings.

I also tested whether they broke something in the CPU's power supply. Measurements on the SY8113B showed that the CPU is mostly fed with 1.14V at idle and 1.37V when I run stress -c 4, so this doesn't seem to be the case. I also noticed that the SoC temperature is 2-3°C higher when wifi is running. It might be that the thermal design of the rev. 1.4 board is worse compared to the rev. 1.1 board.

All the data is logged with node-red on a second OPi 0 connected to the network via ETH. This one has now been running for 18d without any issues (the last reboot wasn't because it crashed :P ). But node-red isn't such a high load for this board... :P


The OPi 0 has now been 'alive' for ~28h. It seems that 'power management on' leads to a more stable wifi connection than 'power management off'. Please note that SSH is really annoying with such a high ping. I think you should see the onboard wifi as a 'possibility to send some data from an IoT-node'. ;)

 

Edit: Also, node-red-dashboard takes horribly long to load (~30s) if you have so many data points plotted. I don't know how they store the data, but obviously it's not made to have ~14'000 data points visible simultaneously. :D

 

Edit2: After ~48h wifi crashed with power management on. The whole board seemed really warm after the crash. I'll test wifi with active cooling to see if this has an impact.
