Posted

Update: BPi M2+ is done but further results would still be interesting.

 

Now we need results for NanoPi M1. The simplest way is to use the Armbian image as outlined in post #4 below. The only change needed is the following; after the reboot you can run the lima-memtester binary as described there:

ln -sf /boot/bin/nanopim1.bin /boot/script.bin
echo nanopim1 > /etc/hostname
reboot

Dear BPi M2+ users: I just tested DRAM reliability on my BPi M2+, only to find that this board doesn't run stable even at a DRAM clockspeed of just 624 MHz (currently testing 600 MHz for an additional hour or so).

 

In case you have a BPi M2+ it would really help if you could do the same. Everything is outlined here: http://linux-sunxi.org/Xunlong_Orange_Pi_Plus_2E#DRAM_clock_speed_limit

 

Just grab the referenced fel-boot-lima-memtester-on-orange-pi-h3-v3.tar.bz2 archive, which now also contains the files for BPi M2+, and then use the included fel-boot-lima-memtester-on-banana-pi-m2plus script (I would also start with 624 MHz and, if that succeeds, increase DRAM clockspeed in 24 MHz steps). Please be aware that since SinoVoip omitted the second LED on BPi M2+, only the red LED will blink and there is no solidly lit second LED to signal success, so you should let the test run for at least 1 hour.
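The 24 MHz stepping can be written down as a small helper. This is only a sketch that prints the candidate clockspeeds; it does not call any script from the archive, and the function name `dram_steps` as well as the upper bound in the usage example are assumptions made for illustration:

```shell
# Print candidate DRAM clockspeeds in MHz, one per line.
# Usage: dram_steps START END [STEP]   (STEP defaults to 24 MHz)
dram_steps() {
    start=$1
    end=$2
    step=${3:-24}
    freq=$start
    while [ "$freq" -le "$end" ]; do
        echo "$freq"
        freq=$((freq + step))
    done
}
```

For example, `dram_steps 624 744` lists the values to try one after another, starting at 624 MHz. Each value still has to survive at least an hour of testing before moving on to the next one.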

 

It's important to connect a HDMI display and ensure that a spinning cube can be seen with gray background (if the background is glowing red then something is wrong). Some more information can be found here: https://linux-sunxi.org/Hardware_Reliability_Tests#Reliability

 

Please get back ASAP with results since Chen-Yu is currently preparing upstream u-boot support and DRAM timing is important!

Posted

It would be great, if someone would make a suite, that would test everything on these boards.

Things like DRAM, CPU, GPU, SDIO...

 

If it were easier to set up (or even built into Armbian), more users would do it.

Posted

Here we go. You find a freshly built  Armbian 5.14 Xenial (16.04 LTS) desktop image here: Armbian_5.14_Bananapim2plus_Ubuntu_xenial_3.4.112_desktop.7z  (438M download size)

 

This can be burned on any SD card larger than 2 GB and starts with a DRAM clockspeed of 648 MHz (and we do not allow switching between different DRAM clockspeeds: "# CONFIG_DEVFREQ_DRAM_FREQ is not set"). Also a statically linked lima-memtester binary is included. To start with this please let RPi-Monitor install and then start the test in the following way (as root -- do a 'sudo su -' before if you're not already super user):

armbianmonitor -r
/usr/local/bin/lima-memtester 100M >/dev/null 2>&1

Since we disabled CONFIG_DEVFREQ_DRAM_FREQ, RPi-Monitor can no longer show the actual DRAM frequency, so we have to trust the settings.
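If you start the test over SSH or a serial console, a time limit can help to keep track of the one hour the test needs to run. A minimal sketch, assuming GNU coreutils `timeout` is installed (it exits with status 124 when it had to stop the command); the function name `run_timed` is made up for illustration:

```shell
# Run a command for at most SECS seconds and report whether it was
# still alive when the time limit was reached.
# Usage: run_timed SECS COMMAND [ARGS...]
run_timed() {
    secs=$1
    shift
    rc=0
    # timeout exits with 124 when the time limit stopped the command
    timeout "$secs" "$@" >/dev/null 2>&1 || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "still running after ${secs}s (stopped by timeout)"
    else
        echo "exited early with status ${rc}"
        return 1
    fi
}
```

A one hour run would then be `run_timed 3600 /usr/local/bin/lima-memtester 100M`. This only tells you that the process did not die on its own: you still have to look at the HDMI display, and a board that freezes completely will of course print nothing at all.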

 

IMPORTANT: The test is only useful when a connected HDMI display is on and shows a spinning cube on a gray background and this runs at least 1 hour. In case you see a glowing red background then something's already wrong and you have to switch DRAM frequency. So if it looks like this then the test FAILED:

 

 

To change DRAM clockspeed you need this archive here: u-boot-bananapim2plus_5.14_memtester.tar.bz2

 

The contents are as follows:

linux-u-boot-bananapim2plus_5.14_armhf_600MHz.deb
linux-u-boot-bananapim2plus_5.14_armhf_624MHz.deb
linux-u-boot-bananapim2plus_5.14_armhf_648MHz.deb
linux-u-boot-bananapim2plus_5.14_armhf_672MHz.deb
linux-u-boot-bananapim2plus_5.14_armhf_696MHz.deb
linux-u-boot-bananapim2plus_5.14_armhf_720MHz.deb
linux-u-boot-bananapim2plus_5.14_armhf_744MHz.deb

So to switch to e.g. 624 MHz you would grab the archive, untar it using 'tar xf /path/to/u-boot-bananapim2plus_5.14_memtester.tar.bz2' and then do a 'dpkg -i linux-u-boot-bananapim2plus_5.14_armhf_624MHz.deb && sync && reboot'. After the reboot, start the test again using

/usr/local/bin/lima-memtester 100M >/dev/null 2>&1
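To avoid typos when switching between the packages, the file name can be derived from the clockspeed. A small sketch based on the archive contents listed above; the function name `uboot_deb_for` is hypothetical:

```shell
# Map a DRAM clockspeed in MHz to the matching u-boot package name.
# Only the clockspeeds contained in the archive are accepted.
# Usage: uboot_deb_for MHZ
uboot_deb_for() {
    case "$1" in
        600|624|648|672|696|720|744)
            echo "linux-u-boot-bananapim2plus_5.14_armhf_${1}MHz.deb"
            ;;
        *)
            echo "no package for ${1} MHz in the archive" >&2
            return 1
            ;;
    esac
}
```

Run from the directory where the archive was unpacked, this could be used as `dpkg -i "$(uboot_deb_for 672)" && sync && reboot`.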
Posted

Any volunteers? We urgently need further testers. The procedure outlined above should be simple enough, shouldn't it? Grab a 4 GB card, burn the image, start it, create the usual normal user, install RPi-Monitor (please see below) and then let the test run and get back here with feedback. :)

 

BTW: Installation of RPi-Monitor would really help getting an idea whether H3 on my BPi M2+ is broken or whether heat dissipation of this board is broken in general. When I run this image with just a heatsink on H3 and without a fan then H3 will get clocked down to 312 MHz and also one CPU core will be killed. The same image running on an OPi PC Plus (after relinking script.bin) with the same heatsink in the same location only clocks down to 1008/1200 MHz.

 

Bildschirmfoto%202016-06-04%20um%2017.55

 

So it would really help if others can show their thermal measurements while executing the test as outlined above.

Posted

My cube started to spin with 648, screen @720p output is normal, using small heat sink.

 

m2-1.png

 

Is this normal?

 

 

 

Posted
  On 6/4/2016 at 5:23 PM, Igor said:

Is this normal?

 

Huh, it really seems BPi M2+ has a horrible 'thermal design': you have already seen 2 CPU cores being killed. When I lowered the cooler_table entries to such low values after the first real tests with BPi M2+, I would never have thought anyone could reach them without using a real 'enclosure from hell' with no airflow at all. But it seems we both manage to get CPU cores killed at 240 MHz when running outside an enclosure and with a heatsink applied (while H3 Oranges happily run the same workload above 1000 MHz with 4 cores).

 

Anyway: I tested on the basis of boot0 using a SinoVoip OS image and was able to check DRAM with 720 MHz clockspeed successfully. Then I did the same with our Armbian test image (using u-boot 2016.05) and could confirm: 720 MHz works for at least an hour while 744 MHz already gave a glowing red background. Now I have replaced u-boot+spl on the Armbian image with the versions ssvb used when he created his FEL boot based lima-memtester archive (full bootlog) and am currently testing 720 MHz. Spinning cube after 15 minutes -- I will let this run for an hour and then start the FEL boot test (using 'our' u-boot 2016.05 then).

 

@Igor: Did you try out higher DRAM clockspeeds already or just the default 648 MHz I used when creating the image?

 

Maybe the different power scheme on the BPi M2+ is responsible for the worse results I got. BPi M2+ powers up when a USB cable is connected to the Micro USB port. I will have a look later when testing FEL mode again. Maybe it's just unstable DC-IN when both a PSU and another host on the OTG port 'provide' power?

Posted
  Quote

Did you try out higher DRAM clockspeeds already or just the default 648 MHz I used when creating the image?

 

No. It seems pointless.

Posted

Now testing again FEL boot (the 'usual' lima-memtester approach) but using the most recent u-boot version the Armbian test image also uses. Since it failed the last time already at 624 MHz I tried it now with 672 MHz:

 

 

  Reveal hidden contents

 

 

Now testing with 624 MHz again which also fails pretty early:

 

  Reveal hidden contents

 

 

So the different DRAM reliability results aren't related to boot0 vs. u-boot, and the u-boot version has no effect at all. With the Armbian image I created or SinoVoip's crappy Ubuntu Mate image (boot0) I succeed at 720 MHz and fail at 744 MHz; with FEL boot, anything above 600 MHz fails.

 

So time to stop wasting time with this crappy board. As usual: Stay away from any device that can be powered through Micro USB. I suspect the problem we're experiencing right now is that the board gets power both through the USB OTG port (where a Pine64 is connected as the FEL host) and through DC-IN, and then $something happens that affects the stability of the board.

 

We use 624 MHz now as DRAM clockspeed in Armbian and already insanely low THS/cooler_table settings that seem to be necessary due to the board design. So it's not only the slowest H3 board ever due to throttling way earlier to insanely low clockspeeds (see Igor's result above: only 2 CPU cores running at 240MHz!) but guarantees also stability problems when powered through Micro USB (as usual).

Posted

BTW, not much related but anyway. I saw this MALI turbo speed patch failed out of our default branch ...

Does this make troubles?

diff --git a/drivers/gpu/mali/mali/platform/mali400-pmu/mali_platform.c b/drivers/gpu/mali/mali/platform/mali400-pmu/mali_platform.c
index 54e50d5..1dc4f79 100644
--- a/drivers/gpu/mali/mali/platform/mali400-pmu/mali_platform.c
+++ b/drivers/gpu/mali/mali/platform/mali400-pmu/mali_platform.c
@@ -37,7 +37,7 @@ static struct clk *gpu_pll  = NULL;
 
 _mali_osk_errcode_t mali_platform_init(void)
 {
-	int freq = 252; /* 252 MHz */
+	int freq = 600; /* 600 MHz */
 
 	gpu_pll = clk_get(NULL, PLL_GPU_CLK);
Posted
  On 6/5/2016 at 11:28 AM, Igor said:

BTW, not much related but anyway. I saw this MALI turbo speed patch failed out of our default branch

 

This patch didn't apply at all after switching to the new BSP kernel from FriendlyARM a few weeks ago so I deleted it. Based on thermal readouts when running lima-memtester it's obvious that the new kernel clocks Mali higher. We should ask @Melanrz whether he can provide fps numbers for Quake (IIRC he reported 37 fps when we increased Mali clockspeed from ssvb's 252 MHz to 504 MHz before we activated this specific patch later finally increasing clockspeed to 600 MHz).

Posted

So I tried your image on my NanoPi (as asked on GitHub). I was optimistic and tried 696 MHz first: I got a crash after 10 minutes or so. I tried 672, also a crash. I tried 648, still a crash within the first 10 minutes.

 

I made a sprunge : http://sprunge.us/hMHS

 

I still have strange ARISC errors, so I don't think the DRAM is that bad; there could be a more general problem with the NanoPi M1, or maybe it's the fact that it only has 512 MB of RAM .....

 

I haven't really checked the log, I'll do it tonight.

 

EDIT :

 

I've attached the rpi monitor graph. The real tests were made after 13:00

post-915-0-82208800-1465300816_thumb.png

Posted
  On 6/7/2016 at 11:55 AM, vlad59 said:

I still have strange ARISC errors

 

These appear because you have to adjust the minimum cpufreq in /etc/default/cpufrequtils to 480 MHz (sorry, I forgot to mention that before). After a reboot the errors should be gone. The graphs look good (and confirm voltage switching, so the ARISC errors are related to trying to clock down to a frequency not allowed in the dvfs settings).
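The cpufrequtils change could be scripted roughly like this. A minimal sketch, assuming the file uses a `MIN_SPEED=` line with the value in kHz (so 480 MHz becomes 480000) and that GNU sed is available; the function name `set_min_cpufreq` is made up for illustration:

```shell
# Set MIN_SPEED (in kHz) in a cpufrequtils defaults file.
# Replaces an existing MIN_SPEED= line or appends one if it is missing.
# Usage: set_min_cpufreq KHZ [FILE]
set_min_cpufreq() {
    khz=$1
    file=${2:-/etc/default/cpufrequtils}
    if grep -q '^MIN_SPEED=' "$file"; then
        sed -i "s/^MIN_SPEED=.*/MIN_SPEED=${khz}/" "$file"
    else
        echo "MIN_SPEED=${khz}" >> "$file"
    fi
}
```

Run as root, `set_min_cpufreq 480000` followed by a reboot would apply the 480 MHz minimum. Please check the file afterwards, since I have not verified the variable name on every image.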

 

Did you see the spinning cube at all when running lima-memtester? It would also still be interesting to know which type of DRAM is on the board (in the meantime @Tido spotted that the BPi M2+, which shows horrible overheating problems, does not use low power DDR3L as all the Oranges do but just normal DDR3).

Posted

About the cpufreq, I made the change 10 minutes ago, I should have thought of that before .... Sorry :(.

 

The 2 chips are samsung k4b2g1646q-bck0. If you need something else : a picture / to run a command, I'll do it.

 

Of course I forgot to state the obvious: the cube spins over a light gray background, so I think it was good ... After 10 minutes, the full screen went light gray and keyboard / mouse weren't working anymore -> crash ! I may have moved the mouse during the test but I don't think it could be that.

Posted

Great so I guess buying two Nanopi M1 before any review was not a good idea ... Thanks for the information

 

Still the tests were made with an USB power supply, I'll remake the test with ATX power supply / GPIO pin to make sure it was not a power problem.

Posted
  On 6/7/2016 at 1:59 PM, vlad59 said:

Still the tests were made with an USB power supply, I'll remake the test with ATX power supply / GPIO pin to make sure it was not a power problem.

 

That's a good idea, especially if you have USB peripherals plugged in (I only have Apple keyboards and mice, and they're horribly power hungry -- with the special 'Micro USB crap' cable I kept to demonstrate how shitty powering through Micro USB is, I'm able to power off every Banana Pi/Pro when I try to connect them, since the voltage drops at that moment are too much for the boards) :)

 

BTW: I did the testing all the time through a serial console or SSH. You can execute lima-memtester as root without any problems even if X11 is running. And I found it also somewhat convenient having potential error messages available even if the board crashed (your freezes sound more like a powering problem but it's good to confirm what's really going on later)

 

I'm also very curious about the thermal values you get :)

Posted

Hi Zador,

 

A good question. I have attached it to the ground from the Power supply.

Now I am wondering if there is some better point.

 

What Do you think?

Posted
  On 6/7/2016 at 8:15 PM, Tido said:

A good question. I have attached it to the ground from the Power supply.

I left mine on chassis of ATX power supply (that I'm using to power boards I'm testing) and measured more than 1.5V instead of 1.3 on OPi One.  :D

You can try connecting it to one of GPIO GND pins (i.e. pin 39). Also you can measure voltage on (between its leads) tantalum capacitor (big yellow thing in the middle of your photo).

 

Ideally you need to connect your positive probe to VDD_CPUFB signal, but I don't see any testpoint for it, and without resistor numbers on PCB it is almost impossible to find it.

Posted

Well, I got my probe (negative) on the power-jack (just to be clear).

 

Is there a difference in voltage between power-jack negative and the GPIO GND pin?

Posted
  On 6/7/2016 at 8:38 PM, Tido said:

Is there a difference in voltage between power-jack negative and the GPIO GND pin?

No, it's connected to GND directly. But just to be sure, please measure voltage on the capacitor. If it's 1.3V, then most probably first result is correct too.

Posted

Good morning fellows

 

red LED on

Power Supply 5,16 Volt (measured on the PCB, Pin39 to power-barrel)

GND attached to Pin 39

 

as reference

Pin 1 = 3,23 V

Pin 2 = 5,13 V

 

Capacitor yellow side 0,0 V

Capacitor orange side 1,30 V

 

I also measured again the points from the picture = same result as in the picture.

 

  Quote

I left mine on chassis of ATX power supply

An ATX power supply delivers several volts, so I think it is very important where you connect ground.

Can you also make the reference check like I did above, what are your results?

Posted
  On 6/8/2016 at 4:23 AM, Tido said:

Capacitor yellow side 0,0 V

Capacitor orange side 1,30 V

 

I also measured again the points from the picture = same result as in the picture.

So 1.3V it is

 

  On 6/8/2016 at 4:23 AM, Tido said:

An ATX power supply delivers several volts, so I think it is very important where you connect ground.

Can you also make the reference check like I did above, what are your results?

It was an ATX power supply, now it has banana plug connectors to provide 5V and 12V for different purposes.

 

I tested my multimeter on REF01CPZ voltage reference (10V), it displayed 10.01V.

Posted
  On 6/8/2016 at 9:42 AM, zador.blood.stained said:

So 1.3V it is

 

Tried to reflect that in linux-sunxi wiki and linked to the thread where @sinovoip might (not) explain what happened: http://linux-sunxi.org/index.php?title=Sinovoip_Banana_Pi_M2%2B&curid=2677&diff=17540&oldid=17501

 

Anyway: Always being fed with 1.3V only partially explains the horrible overheating we experience with this board. Maybe it's really the DRAM; let's see how NanoPi M1, which also uses DDR3 instead of DDR3L DRAM, behaves.

Posted

So I retried today.

 

All tests were made with a modified ATX board so the 5V should be clean (I didn't have my multimeter at hand to be totally sure but it really should) without any usb devices . The ambient temperature is between 24 and 26°C (so quite hot)

 

I first made two tests with 648 MHz and then two tests with 624 MHz. All of them failed within 15 minutes.

 

What's interesting is that the SoC temperature goes above 100°C (103 or 104°C). Does the kernel have a failsafe depending on temperature? Could that be the cause?

 

lima-memtester is way more stressful than the cpuburn test I made following tkaiser's instructions some weeks ago.

 

Another interesting thing that bugs me is that there is never more than 1 core killed. I remember lurking on a discussion about that a few days ago; maybe we need to be more aggressive (at least on this board).

 

So far my plan is :

 * Try to tweak the cooling table to kill 3 cores when cooling state = 5 (other adjustments could be made after)

 * Add a fan

 

I'm really not a fan of my last proposal as that would mean putting both my nanopi in the trash :(.

 

 

post-915-0-90906200-1465387633_thumb.png

Posted
  On 6/8/2016 at 12:22 PM, vlad59 said:

What's interesting is that the SoC temperature goes above 100°C (103 or 104°C). Does the kernel have a failsafe depending on temperature?

 

Yes, currently that's configured to initiate an emergency shutdown at 105°C: https://github.com/igorpecovnik/lib/blob/master/config/fex/nanopim1.fex#L251-L288
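To judge how close a run gets to that limit, logged readings could be classified with something like the following. A sketch only: the 105°C limit is the shutdown value mentioned above, while the 5°C warning margin, the labels and the function name `thermal_status` are my own assumptions. It expects whole degrees Celsius and leaves out where the reading comes from, since that differs between kernels:

```shell
# Classify a SoC temperature (whole degrees Celsius) against the
# emergency shutdown limit.
# Usage: thermal_status TEMP_C [LIMIT_C] [MARGIN_C]
thermal_status() {
    temp=$1
    limit=${2:-105}
    margin=${3:-5}
    if [ "$temp" -ge "$limit" ]; then
        echo "shutdown"
    elif [ "$temp" -ge $((limit - margin)) ]; then
        echo "critical"
    else
        echo "ok"
    fi
}
```

With the defaults, the 103 or 104°C reported above would be classified as `critical`, i.e. only one or two degrees away from the emergency shutdown.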

 

But I totally agree: The settings I came up with are insufficient (Zador already pointed that out to me but we looked solely at Oranges and Bananas where this eventually worked and weren't aware that NanoPi M1 obviously is also hot stuff). As a first test it would be great if you could adopt the BPi M2+ settings (just replace  the highlighted lines with the stuff from BPi M2+ fex file and allow downclocking to 240 MHz again in /etc/default/cpufrequtils).

 

Please run the test again (starting with 624 MHz) and report back.
