ChAoSWK Posted November 28, 2016
Okay, this evening I will be able to run these tests. I will completely disable my watchdog script (and all other scripts) so they cannot interfere with my tests. I will run a sysbench test at both of these frequencies to test stability... If 912 MHz is unstable, I can set 912 MHz to 1.3 V to test whether that is stable (I already know how to work with bin2fex etc.). I am also able to measure the voltage at TP1 again. BUT -> what interests me is what is happening at 6:51. How can I find out which processes occupy the CPU for three minutes??? As mentioned, I am using Armbian Jessie server 5.20 upgraded to 5.23 on the OPi One. THX in advance. @tkaiser, which sysbench options do you recommend for my setup?
tkaiser Posted November 28, 2016
"BUT -> interesting for me, what is happening at 6:51? How can I find out which processes occupy the CPU for three minutes???"
I would believe systemctl list-timers --all will tell. And /etc/cron.daily/ might also be worth a look.
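The timers and cron directories show what is scheduled, but not which process actually burns the CPU. A minimal sketch of one way to find that out, assuming a Linux /proc filesystem: take two snapshots of the per-process CPU tick counters and rank processes by the difference. The function names (parse_stat_line, snapshot, busiest, report) are made up for this illustration and are not part of any Armbian tool.

```python
import os
import time


def parse_stat_line(line):
    """Return (pid, command, cpu_ticks) from the content of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split around the first "(" and the last ")".
    left = line.index("(")
    right = line.rindex(")")
    pid = int(line[:left])
    command = line[left + 1:right]
    fields = line[right + 2:].split()
    # fields[0] is the process state (field 3 of the file), so utime
    # (field 14) is fields[11] and stime (field 15) is fields[12].
    return pid, command, int(fields[11]) + int(fields[12])


def snapshot():
    """Map pid -> (command, cpu_ticks) for every process visible in /proc."""
    ticks = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open("/proc/%s/stat" % entry) as handle:
                pid, command, total = parse_stat_line(handle.read())
        except (OSError, ValueError):
            continue  # the process exited while we were reading
        ticks[pid] = (command, total)
    return ticks


def busiest(before, after, count=5):
    """Return the `count` largest (tick_delta, pid, command) tuples."""
    deltas = []
    for pid, (command, total) in after.items():
        # A process that started after the first snapshot counts in full.
        earlier = before[pid][1] if pid in before else 0
        deltas.append((total - earlier, pid, command))
    deltas.sort(reverse=True)
    return deltas[:count]


def report(interval=10):
    """Print the processes that used the most CPU during `interval` seconds."""
    first = snapshot()
    time.sleep(interval)
    for delta, pid, command in busiest(first, snapshot()):
        print(delta, pid, command)
```

Calling report() from a cron job or a loop shortly before the suspicious time and redirecting the output to a file would leave a record of what was running, even if the board hangs afterwards.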
ChAoSWK Posted November 28, 2016
root@orangepione:/etc# ls cron.*
cron.d:
rpimonitor  sysstat

cron.daily:
apt  aptitude  bsdmainutils  dpkg  logrotate  man-db  ntp  passwd  sysstat

cron.hourly:
fake-hwclock

cron.monthly:

cron.weekly:
man-db
root@orangepione:/etc# systemctl list-timers --all
NEXT                         LEFT          LAST                         PASSED   UNIT                          ACTIVATES
Mon 2016-11-28 12:55:28 CET  3h 8min left  Sun 2016-11-27 12:55:29 CET  20h ago  systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service
n/a                          n/a           n/a                          n/a      systemd-readahead-done.timer  systemd-readahead-done.service

2 timers listed.
root@orangepione:/etc#

Okay, interesting. Which sysbench options do you recommend for my setup?
Edit: tested with sysbench, using both 1 and 2 cores (2 are configured in my setup). Both tests ran fine at 912 MHz for at least 30 minutes.
ChAoSWK Posted November 29, 2016
This time I am answering myself so that it shows up here: yesterday I ran some tests and found this strange thing. As you can see in the three screenshots, the frequencies and voltages behave really weirdly. What is clear is that the OPi One operates only at 1.1 and 1.3 Volt. But my assumption was that all frequencies OVER 912 MHz would run at 1.3 V and all frequencies under 912 MHz at 1.1 V. In the three screenshots you can see the OPi running at 814 MHz and 1.3 V, and sometimes at 912 MHz and 1.1 V...
In Voltages1.png you can see the Pi operating at 753 MHz and 1.3 Volt.
In Voltages2.png you can see the Pi operating at 814 MHz and 1.3 Volt.
In Voltages3.png you can see the Pi operating at 912 MHz and 1.1 Volt.
Really weird?!? I did not modify the fex / bin file. It is the factory default in this Armbian image!!! The fex / bin contains the following:
[dvfs_table]
pmuic_type = 1
pmu_gpio0 = port:PL06<1><1><2><1>
pmu_level0 = 11300
pmu_level1 = 1100
max_freq = 1200000000
min_freq = 480000000
LV_count = 5
LV1_freq = 1200000000
LV1_volt = 1300
LV2_freq = 1008000000
LV2_volt = 1300
LV3_freq = 912000000
LV3_volt = 1100
LV4_freq = 648000000
LV4_volt = 1100
LV5_freq = 480000000
LV5_volt = 1100
-> pmu_level0 = 11300, pmu_level1 = 1100: is 11300 really 11300 or is it a typo?
tkaiser Posted November 29, 2016
"is 11300 really 11300 or is it a typo?"
It's really 11300. Please don't put too much trust in the voltage graph, since it is just the result of parsing script.bin when the board starts, deriving from it a formula to calculate voltage based on clockspeed, and providing a file below /tmp which contains the results of this guesswork (there is nothing we could query on the small H3 boards, so that's the only variant that should work -- maybe it does not in your installation).
@zador: 'recommended' DVFS operating points as follows. On the larger Oranges we use:
cooler0 = "1296000 4 4294967295 0"
cooler1 = "1200000 4 4294967295 0"
cooler2 = "1008000 4 4294967295 0"
cooler3 = "816000 4 4294967295 0"
cooler4 = "648000 4 4294967295 0"
cooler5 = "480000 4 4294967295 0"
cooler6 = "480000 3 4294967295 0"
cooler7 = "480000 2 4294967295 0"
cooler8 = "480000 1 4294967295 0"
LV1_freq = 1296000000
LV1_volt = 1320
LV2_freq = 1200000000
LV2_volt = 1240
LV3_freq = 1104000000
LV3_volt = 1180
LV4_freq = 1008000000
LV4_volt = 1140
LV5_freq = 960000000
LV5_volt = 1080
LV6_freq = 816000000
LV6_volt = 1020
LV7_freq = 480000000
LV7_volt = 980
And on the smaller ones:
cooler0 = "1200000 4 4294967295 0"
cooler1 = "912000 4 4294967295 0"
cooler2 = "768000 4 4294967295 0"
cooler3 = "648000 4 4294967295 0"
cooler4 = "480000 4 4294967295 0"
cooler5 = "480000 3 4294967295 0"
cooler6 = "480000 2 4294967295 0"
cooler7 = "480000 1 4294967295 0"
LV1_freq = 1200000000
LV1_volt = 1300
LV2_freq = 1008000000
LV2_volt = 1300
LV3_freq = 912000000
LV3_volt = 1100
LV4_freq = 648000000
LV4_volt = 1100
LV5_freq = 480000000
LV5_volt = 1100
In the meantime this really looks wrong to me, since we switch to 1.3V with 1008 MHz and everything below remains at 1.1V.
Edit: Strange, I thought I had corrected a bit of the above before. Anyway: I wanted to say that we chose differing settings for a reason, since we did extensive testing back then.
I'm only thinking about potential delays with the GPIO based voltage switching, so cpufreq might already be increased while the voltage increase has a short delay? But then we would have a bigger problem, since with high loads cpufreq immediately enters scaling_max_freq. Hmm... don't know.
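To make the frequency-to-voltage mapping of the small-board table easier to reason about, here is a rough sketch that parses the [dvfs_table] section posted above and looks up the voltage for a given clockspeed. The selection rule is an assumption, not something confirmed in this thread: the lowest operating point whose frequency is at least the requested one is taken. The names parse_dvfs_table and voltage_for are invented for the example.

```python
SMALL_BOARD_DVFS = """
[dvfs_table]
pmuic_type = 1
pmu_gpio0 = port:PL06<1><1><2><1>
pmu_level0 = 11300
pmu_level1 = 1100
max_freq = 1200000000
min_freq = 480000000
LV_count = 5
LV1_freq = 1200000000
LV1_volt = 1300
LV2_freq = 1008000000
LV2_volt = 1300
LV3_freq = 912000000
LV3_volt = 1100
LV4_freq = 648000000
LV4_volt = 1100
LV5_freq = 480000000
LV5_volt = 1100
"""


def parse_dvfs_table(text):
    """Return [(freq_hz, millivolts), ...] sorted from highest to lowest frequency."""
    values = {}
    for raw in text.splitlines():
        key, separator, value = raw.partition("=")
        if separator:  # skips the section header and blank lines
            values[key.strip()] = value.strip()
    points = []
    for level in range(1, int(values["LV_count"]) + 1):
        freq = int(values["LV%d_freq" % level])
        volt = int(values["LV%d_volt" % level])
        points.append((freq, volt))
    return sorted(points, reverse=True)


def voltage_for(points, freq_hz):
    """Look up the voltage in mV for freq_hz.

    ASSUMPTION: the lowest operating point whose frequency is still
    >= freq_hz is used; above the highest point, the highest one applies.
    """
    chosen = points[0]
    for point in points:
        if point[0] >= freq_hz:
            chosen = point
    return chosen[1]


points = parse_dvfs_table(SMALL_BOARD_DVFS)
for mhz in (480, 648, 816, 912, 960, 1008, 1200):
    print(mhz, "MHz ->", voltage_for(points, mhz * 1000000), "mV")
```

Under that assumption everything up to and including 912 MHz maps to 1100 mV and anything above it to 1300 mV, which fits the remark that only 1008 MHz and higher switch to 1.3V. It would not explain 1.3 V readings at 753 or 814 MHz, which is consistent with the graph being guesswork rather than a measurement.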
ChAoSWK Posted November 29, 2016
Hi, because we use LV5_freq = 960000000 / LV5_volt = 1080 on the large boards, we shouldn't run into UV issues while using 912 MHz and 1.125 Volts (measured), even when the regulator increases the voltage with a delay, because the regulator should not switch to 1.3 Volt at all when I limit the CPU to max 912 MHz. I think there must be a completely different issue. Do you know how many people are affected? What is the cooler section for? THX
tkaiser Posted November 29, 2016
"we shouldn't run into UV issues while using 912 MHz and 1.125 (measured) Volts even when the regulator increases with delay"
To be honest, I'm an electrics NOOB and have simply no idea how this works (the switching part; in case there's some oscillating involved, it might be different from the larger Oranges where another and less primitive voltage regulator is used: SY8106A). Regarding your problem: after the sysbench tests I'm somewhat confident that it's not related to undervoltage. The remark was more a general one addressed at @zador (since we both already spent a lot of time on tweaking voltage, thermal and performance settings). To be honest: I don't know what's happening on your board, but if you still run into deadlocks every now and then at the same time in the morning (when the scheduled jobs do whatever they do), then this might be worth a look. I know it sounds stupid, but bad SD cards are the root cause of all evil. Really: I've identified 3 bad cards here and kept them instead of throwing them away, so I can test with them from time to time. The symptoms don't look like a failing HDD; it just gets weird and you run into all sorts of problems. We already provide a (not so) quick test for this: armbianmonitor -c $HOME (without sudo). It can take ages on slow cards since all the empty space will be checked.
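The idea behind such a card test is simple: write known data, read it back, compare. The following is only a sketch of that principle under my own assumptions; it is not what armbianmonitor -c does internally, and the names pattern_block and write_and_verify are hypothetical. A real test fills all the free space, which is what exposes fake-capacity cards and defeats the page cache; this small version may be served from RAM on the read-back.

```python
import hashlib
import os


def pattern_block(index, size):
    """Deterministic pseudo-random block that depends only on its index."""
    return hashlib.shake_256(str(index).encode()).digest(size)


def write_and_verify(directory, blocks=8, block_size=1024 * 1024):
    """Write test blocks below `directory`, read them back, return bad block indexes."""
    path = os.path.join(directory, "cardtest.bin")
    try:
        with open(path, "wb") as handle:
            for index in range(blocks):
                handle.write(pattern_block(index, block_size))
            handle.flush()
            os.fsync(handle.fileno())  # push the data towards the card
        bad = []
        with open(path, "rb") as handle:
            for index in range(blocks):
                # NOTE: small files may be read back from the page cache,
                # so an empty result here proves little about the card.
                if handle.read(block_size) != pattern_block(index, block_size):
                    bad.append(index)
        return bad
    finally:
        if os.path.exists(path):
            os.remove(path)
```

Something like write_and_verify(os.path.expanduser("~"), blocks=512) would exercise about half a gigabyte; an empty list means no mismatches were seen in that run, nothing more.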
ChAoSWK Posted November 29, 2016
The time in the morning seems to vary... Today it was 6:41 when the CPU load rose for some minutes. But the board didn't crash today. Yesterday I moved the Pi to a different place but did not change the electrical connections. I will test with 816 MHz, as mentioned at first, and if it crashes again, I'll copy the contents of this SD card to another one (I've got around 10-15 of these 4GB cards lying around here). What is the cooler section for? A fan connected to a GPIO?
zador.blood.stained Posted November 29, 2016
"What is the cooler section for? A fan connected to a GPIO?"
It's for throttling and/or killing cores in case of overheating. The first number is the max allowed frequency, the second one is the number of cores. Numbers 3 and 4 can be ignored.
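Read that way, the cooler entries are just a list of throttling steps. A small sketch that turns the small-board cooler lines quoted earlier into (frequency, cores) pairs, following the field description above; the frequency is apparently given in kHz (1200000 matching the 1200000000 Hz max_freq), which is my reading and not stated in the thread. The name throttle_steps is made up for the example.

```python
import re

SMALL_BOARD_COOLERS = '''
cooler0 = "1200000 4 4294967295 0"
cooler1 = "912000 4 4294967295 0"
cooler2 = "768000 4 4294967295 0"
cooler3 = "648000 4 4294967295 0"
cooler4 = "480000 4 4294967295 0"
cooler5 = "480000 3 4294967295 0"
cooler6 = "480000 2 4294967295 0"
cooler7 = "480000 1 4294967295 0"
'''

# Matches lines of the form: coolerN = "a b c d"
COOLER_LINE = re.compile(r'^\s*cooler(\d+)\s*=\s*"([^"]*)"\s*$')


def throttle_steps(text):
    """Return [(max_khz, active_cores), ...] ordered by cooler index."""
    steps = {}
    for line in text.splitlines():
        match = COOLER_LINE.match(line)
        if not match:
            continue
        # Only the first two numbers matter; numbers 3 and 4 are ignored.
        khz, cores = match.group(2).split()[:2]
        steps[int(match.group(1))] = (int(khz), int(cores))
    return [steps[index] for index in sorted(steps)]


for step, (khz, cores) in enumerate(throttle_steps(SMALL_BOARD_COOLERS)):
    print("step %d: at most %d MHz on %d core(s)" % (step, khz // 1000, cores))
```

The output makes the order visible: the clockspeed is lowered first, from 1200 MHz down to 480 MHz with all 4 cores, and only then are cores taken offline one by one.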
tkaiser Posted November 29, 2016
"if it crashes again, I'll copy the contents of this SD card to another (I've got around 10-15 of these 4GB Cards laying here around)."
Ok, then let's stop here. I recommended to TEST the card and not to copy a potentially damaged system around. Also, "4GB" cards are usually low-end (and in my opinion broken anyway; at least 16 GB from SanDisk, Samsung, Toshiba or Transcend should be used). Anyway: cooler_table determines the cpufreq steps that are used when throttling occurs. The thermal trip points are defined in ths_para above.
ChAoSWK Posted November 29, 2016
Hello everyone, to exclude all software and SD hardware failures I took another one of these cards and first tested it intensively with a Windows tool.
I re-extracted the image from the downloaded zip and checked the SHA-256.
I wrote the image to the tested SD card.
I booted the image and waited until the Pi had come up fully (including reboots).
I made my h3disp config, ran apt-get update && apt-get upgrade, installed the dependencies of my scripts and copied my scripts over, and also changed rc.local and the crontab to fit my needs.
Finally I set my h3consumption settings to 816 / 132 / display off / USB on.
Now I will wait to see whether the device runs stable or not, and report here either way. If it does not, I can say for sure that the SD card and a random software error are excluded! THX in advance
dottgonzo (Author) Posted November 30, 2016
As written before, my boards work with all filesystems in read-only mode, so I can exclude SD corruption. In my tests the OPi Lite freezes on boot (and only on boot, while systemd loads the system), but maybe that is because my applications are not intensive and the system is lightly loaded. So I'm waiting to find the time to conduct tests for this issue.
ChAoSWK Posted December 1, 2016
Hi guys, last night my freshly and cleanly installed OPi One crashed again. This time at 3:05 am. Uptime was about 29.5 h. Attached is a screenshot of armbianmonitor. The settings were now 816 MHz, 2 cores, 1.1 V, DRAM 132 MHz. The green power LED was off again, so I think, as dottgonzo already reported, the OPi gets stuck during a reboot or something like that. With my last test I can exclude a defective SD card and a defective installation. Now my ideas for further testing:
Set the CPU to 1.3 Volt all the time
Reduce the clock speed to 480 MHz completely
Set 1 or 4 cores
Change (increase) the DRAM clock speed
I think the Pi has a problem with the suddenly increasing load when scheduled tasks are started. The CPU has a passive cooler and was at 22°C at that moment. The last test I can do is to use the same SD card in my Orange Pi Plus. The only things that have to be changed are the voltage regulator driver and the DVFS table; WiFi and eMMC are not needed. If none of this works, I'll request a refund or a replacement Pi from Orangepi, because as it is, it is unusable for me. Do you have any other ideas for finding a solution?
EDIT: For testing I changed the whole policy. I hard-coded the CPU frequency to 648 MHz (min AND max set to the same value), activated 4 cores and corekeeper, and set DRAM to 408 MHz and the voltage to 1.3 Volt. The temperature increased to 26 / 27 degrees, and this evening I will measure the voltage at TP1 (to see if it is really set to 1.3 Volt).
EDIT2: Yesterday evening I measured TP1 and it read 1.325 Volts, so this seems very good... I decided to wait one week, and if it runs stable I will go back step by step to my unstable config, again waiting one week between steps. That way I should be able to find out which option causes the instability. Weird thing, but as tkaiser told me: even in this config the voltage is recorded wrongly in armbianmonitor. The Pi runs at a fixed frequency and a fixed voltage, but as you can see in "VoltPeak1.png" there are downward spikes... THX
ChAoSWK Posted December 3, 2016
Hello again, bad news: even with this policy (fixed frequency, overvoltage, etc.) I got a crash last night. I do not think it is a voltage, frequency or load issue. As tkaiser suggested, I have now changed the SD card to a completely different brand, a 32 GB Samsung class 10... Testing again.... If it crashes now, I am completely out of ideas. I'll try to get a refund or a replacement from Orangepi... Greetings
sovking Posted December 5, 2016
I join the crew of people with stability issues with Orange Pi. Usually I need to place SBCs in rural areas to monitor solar production: the typical equipment is a board connected via Ethernet or WiFi, with one or two USB-RS485 converters attached. They run some simple sampling software, a webserver and PHP. Typically very low load, no performance required. In one case I used a Meanwell 15-5 as power supply, at other times a cheaper alternative, a 5V 2A PSU. In any case, a power bank or lithium battery in the middle is used as a UPS (for the power bank my choice is the Easyacc PB10000CF, which is usually capable of continuously powering the board when there are power outages -- that is not common among power banks). For the SD card I use a 16 GB Sandisk Extreme or Extreme Plus, nothing cheaper. Until now I've used Raspberry and Olimex A10 and A20 boards. A single-core A10 is enough for me, and they have the good AXP209 for multiple power management and their own lithium battery... but sometimes they can be a little bit expensive when configured with all options. So I'm trying H3 chips, and I started using the Orange Pi PC with Armbian 5.23 Jessie Server. And while it seems to work in the laboratory, when I place it in the field its maximum uptime is around 3 days. Probably the voltage supplied to the PSU is very noisy and dirty, and power outages can happen quite often, even if only for a few minutes, but it seems unreliable: it seems that it reboots, but sometimes it hangs during boot. So I have to recall the unit back to the lab, and I will probably put another A10 into the field until the H3 board can deal with different power conditions. I've tried to provide it with good components around it, but it has to work as flawlessly as the A10-A20 do. I'm open to any suggestion to improve the stability, and if something is wrong with my hardware setup, please tell me.
ChAoSWK Posted December 5, 2016
Hello sovking, thank you for your detailed post. It strengthens my suspicion about some components. For me there are 3 big candidate sources for the crashes at the moment:
1) The SD card: I am now using a Samsung instead of a Sandisk. I tested 2 old but good Sandisk cards; both were tested with h2testw and both are old (4 GB class 4) but good in quality. I read that many people are having problems with H3 and Sandisk cards, even with newer and high-class cards. Somewhere I read that Sandisk cards have a special power saving feature and that the crashes always occur when write cycles are done. So I am now on day 2.5 with a Samsung 32 GB class 10; after one week I will be sure whether it is stable.
2) I am also using a self-built UPS with a step-down to 4.8 V, in the middle a Li-Ion charging circuit connected to 2 18650 cells in parallel, and because the battery voltage can go down to 3.3 Volts I use a step-up converter to a fixed voltage of 5.2 Volts. I use 2 of these self-built UPSes on 2 Raspberry Pi 3s, and this has been running absolutely stable since the RPi 3 appeared (I ordered them directly after they came onto the market). Perhaps the step-up converter creates noise that the H3 board doesn't like. If the board crashes again with the Samsung card, I will test with a standard mobile phone power supply, 5V 2A.
3) Last but not least: the 3 of us here are using RS485 converters (all USB?)? Perhaps a software / driver / firmware error? On the Pi this dongle ran for months without a crash...
In my case the crashes appear in the night at 3:05 am and around 6:45 am. At those times the CPU load rises because of some tasks (standard tasks in the Armbian image). I ruled out some stability issues by fixing the CPU frequency to ONE fixed value (648 MHz) and overvolting it to 1.3 Volt. The crashes appeared as before.
But my favorite explanation at the moment is the SD card (if it does not crash tonight, I will have beaten my own uptime record of 3 days with the OPi Lite). Thanks for any suggestions and info
dottgonzo (Author) Posted December 6, 2016
"I join to the crew of people with stability issue with Orange Pi. [...]"
I can say that is not true. I use 2 OPi PCs connected via Ethernet at 2 sites in the countryside, in places where the power is cut at least every 4-5 days and there are many issues, and my OPi PCs have been working, one for 5 months and one for 2 months, with no issues. Your problem, in my opinion, is SD corruption and problems with Ethernet connectivity. If you use the SD card in read-only mode and send the data over the internet or to other systems AND use a watchdog to recover the connectivity, you can relax with the OPi PC (Armbian 5.16 I think, I don't remember, with the original PSU from the same manufacturer). And I would say that the OPi PC works better than the RPi 2 / RPi 3 (with Ethernet), because one of them was placed inside a box where the temperature is 50-60 degrees (the RPi burns). For some days I've been using the OPi Lite at home with tkaiser's suggestion while I try to simulate several problems (connectivity and power issues), and so far they are working perfectly with a common 3A PSU (sorry that I have not tested the voltage yet). I plan to put the OPi Lite in one of the worst places where RPis are working (this week) to see if the suggested voltage adjustment is enough for my case.
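What "a watchdog to recover the connectivity" could look like is not spelled out in the thread, so the following is only a sketch under my own assumptions: probe a known host at regular intervals and trigger a recovery action after several consecutive failures. The class name Watchdog, the helper link_is_up and the threshold of 3 are all hypothetical choices for illustration; the recovery action itself (restarting the interface, rebooting) is left to the caller.

```python
import socket


def link_is_up(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class Watchdog:
    """Call `recover` once `check` has failed `max_failures` times in a row."""

    def __init__(self, check, recover, max_failures=3):
        self.check = check
        self.recover = recover
        self.max_failures = max_failures
        self.failures = 0

    def tick(self):
        """Run one probe; return True if the recovery action was triggered."""
        if self.check():
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            self.recover()
            self.failures = 0  # start counting again after recovery
            return True
        return False
```

A caller would build something like Watchdog(lambda: link_is_up("192.0.2.1", 80), restart_network), where the address is a placeholder and restart_network is whatever recovery function fits the setup, and then call tick() once a minute. Requiring several failures in a row avoids reacting to a single dropped probe.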
ChAoSWK Posted December 6, 2016
"... If you use the sd in read only mode and send the data over internet or to other systems..."
How do I run the system completely in read-only mode???
dottgonzo (Author) Posted December 6, 2016
"how to completely run the system in readonly mode???"
With unionfs-fuse (look here: https://forum.armbian.com/index.php/topic/1526-armbian-in-read-only-mode/#entry20043)
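After switching to a read-only setup it is worth confirming that the card really is mounted read-only. A minimal sketch, assuming the usual /proc/mounts format (device, mount point, filesystem type, options); which mount point has to be checked depends on how the union mount is set up, so the default of "/" is only an assumption. The function names are invented for the example.

```python
def mount_options(mounts_text, mount_point="/"):
    """Return the option list of the last entry for mount_point, or None."""
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # Later entries are mounted on top of earlier ones.
            options = fields[3].split(",")
    return options


def is_read_only(mounts_text, mount_point="/"):
    """True if mount_point is present and mounted with the 'ro' option."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "ro" in options


def root_is_read_only():
    """Check the running system (Linux only)."""
    with open("/proc/mounts") as handle:
        return is_read_only(handle.read(), "/")
```

Calling root_is_read_only() from a boot-time script and logging the result would show quickly whether a later change or update has silently made the card writable again.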
ChAoSWK Posted December 7, 2016
So, as far as I can tell, I have beaten my personal uptime record with the Samsung card. I never got more than 4 days before. The interesting part is that I dd'ed the image from the "defective" Sandisk card to the Samsung card. I tested the Sandisk 2 more times with h2testw and it was marked as good. For me it is clear that there is an incompatibility between the Allwinner H3 and Sandisk cards (as I said, the Raspberry had NO problem with exactly these cards). So, for experimenting, I ordered 2 cheap SD cards; when they pass h2testw I will test them with this image. And @dottgonzo -> thanks for your advice about unionfs-fuse. I tested it yesterday with my OPi PC+ and it runs fine. Perhaps I'll give the Sandisk cards another try in read-only mode. I will wait till the weekend, and with 7 days of uptime I will be 100% sure the card caused the crashes.
ChAoSWK Posted December 12, 2016
Now it is 99.9% clear that it is a write issue with the Sandisk SD cards. I made the image read-only with unionfs-fuse and copied it back to the SD card. It has been running stable for 3.5 days in read-only mode. THX
dottgonzo (Author) Posted December 13, 2016
"Nope, just do some testing and please report back. We're particularly interested whether exchanging 912 MHz with 816 MHz in fex file makes a difference for you (since I wasted many many hours of my live but found none on all those H3 devices that use the more primitive voltage regulator)"
It seems that my 3 OPi Lites are now working perfectly (for 2 weeks, but they haven't been tested in places with power issues). Do you plan to change these settings in a future release of Armbian for the opilite? Or can I send a pull request somewhere? Now I'm waiting to receive the rpizero to test it.
dottgonzo (Author) Posted March 1, 2017
After some weeks of uninterrupted operation I can say that 3 opizero are working perfectly! For now it is my favorite board. I've just started using it instead of the orangepiPC and Raspberry 2.