R1 overheated?


petrmaje

Hi all,

I have an R1 issue again. After a lot of experiments and tuning, I decided to try using the R1 for real routing :-) That means dual WAN, lots of VLANs, thousands of connections, traffic shaping etc. Everything worked well until yesterday. As the whole of Europe is having a real summer these days, the working temperature of my R1 is around 30 °C.

I experience strange issues when traffic is very high. This happens approx. twice a day. The system does not restart, but PPPoE goes down, the kernel reinitializes some hardware and everything continues normally. The outage lasts only several seconds.

I have to say that I'm using OpenWrt for this application, but the system is well patched and has worked without problems up to now. So I suspect that some hardware is overheating.

Which part could it be? The CPU or the switch?

Has anybody had a similar experience?

I'm trying another board with an aluminium CPU cooler; I will post the results here.

Any ideas are welcome.

Petr

Hi petrmaje,

With my farmer's logic, it's probably your Broadcom BCM53125 chip, which sits right under the HDD.

You probably don't have an HDD attached; here it functions as a very luxurious cooling plate with additional data parking space. ;)

My solution:

  1. If you're using the standard acrylic housing, lose the sides and put it vertically like tkaiser said.
  2. There is enough space next to the acrylic housing backplate to fit in a small cooling element/plate.
  3. Always cool the A20, but that's not the problem IMHO.

Well, I don't have it built like tkaiser's, but it's really stable too, although I might paint it black too :P

See pictures:

Lose the side panels

Cool your BCM53125 chip

Sinc Patcher

Well, in my case I chose the wrong color, since the customer now wants to put the device somewhere where it's exposed to sunlight from noon till five. I will include 2 DS18B20 temperature sensors to get a clue how hot it gets inside, regardless of internal chip temperatures.

The next time I build such a device (the main parts are the 24V PSU on the left and an 8-port Power over Ethernet injector to provide a couple of Raspberry Pis with network and power) I'll replace the 5V/12V PSU for board and display with 2 step-down converters and use a Banana Pi together with a small 8-port switch.
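For DS18B20 sensors like the ones mentioned above, a minimal Python sketch of how they could be read, assuming they are attached through the Linux 1-Wire `w1_therm` driver (the post does not say how they will be wired up). That driver exposes each sensor as a `w1_slave` file: the first line ends in `YES` when the CRC check passed, and the second line carries `t=` followed by the temperature in millidegrees Celsius. The function names and the default directory argument are illustrative only.

```python
from pathlib import Path
from typing import Optional


def parse_w1_slave(text: str) -> Optional[float]:
    """Return the temperature in °C from w1_slave contents, or None if unusable."""
    lines = text.strip().splitlines()
    if len(lines) < 2 or not lines[0].rstrip().endswith("YES"):
        return None  # CRC check failed or unexpected format
    _, sep, raw = lines[1].rpartition("t=")
    if not sep:
        return None  # no "t=" field on the second line
    try:
        return int(raw) / 1000.0  # driver reports millidegrees Celsius
    except ValueError:
        return None


def read_all_sensors(base: str = "/sys/bus/w1/devices") -> dict:
    """Read every DS18B20 (1-Wire family code 28) found below base."""
    readings = {}
    for dev in sorted(Path(base).glob("28-*")):
        try:
            readings[dev.name] = parse_w1_slave((dev / "w1_slave").read_text())
        except OSError:
            readings[dev.name] = None  # sensor vanished or file unreadable
    return readings
```

Calling `read_all_sensors()` from a cron job and logging the result next to the SoC temperature would show how much of the heat comes from the sun-exposed enclosure rather than from the chips themselves.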

Hi Patcher,

I agree with you that it's the BCM chip. I put the box vertically:

http://petasek.ddns.net/R1-001.jpg

and stability is better. I will buy a cooler for the BCM chip too and we will see :)

PM

Always keep in mind that a heatsink won't help much if no airflow is possible. If you neither have a fan nor let convection do the work, it will only help in situations where ICs overheat temporarily due to load peaks. But if the chip heats up all the time, a heatsink won't help, since the surrounding temperature will rise over time until no further heat dissipation or cooling is possible.
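The point about airflow can be put into numbers with the usual steady-state thermal-resistance rule of thumb (chip temperature = surrounding air temperature + power × thermal resistance). The sketch below is only an illustration: every value in it is hypothetical, not measured on an R1, and the function name is made up for this example. It shows that a heatsink only shrinks the chip-to-air term, while the heat trapped in a closed case is added on top regardless.

```python
def steady_state_chip_temp(room_c, enclosure_rise_c, power_w, r_chip_to_air_k_per_w):
    """Steady-state chip temperature in °C.

    The air around the chip sits at room temperature plus whatever the
    enclosure traps; the chip then sits above that air by P * R_theta.
    """
    local_air_c = room_c + enclosure_rise_c
    return local_air_c + power_w * r_chip_to_air_k_per_w


# All numbers below are hypothetical, chosen only to show the trend.
ROOM_C = 30.0
POWER_W = 1.0
R_BARE, R_HEATSINK = 20.0, 12.0        # K/W, chip to surrounding air
RISE_CLOSED, RISE_VENTED = 10.0, 2.0   # °C of trapped heat inside the case

closed_bare = steady_state_chip_temp(ROOM_C, RISE_CLOSED, POWER_W, R_BARE)
closed_sink = steady_state_chip_temp(ROOM_C, RISE_CLOSED, POWER_W, R_HEATSINK)
vented_sink = steady_state_chip_temp(ROOM_C, RISE_VENTED, POWER_W, R_HEATSINK)

print(closed_bare, closed_sink, vented_sink)
```

In this toy model the difference between `closed_sink` and `vented_sink` is exactly the enclosure term, which no heatsink can remove; only airflow (a fan, or convection through a vertical, opened case) does.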

OK, I will drill several holes in the upper cover here http://petasek.ddns.net/R1-holes.jpg to improve airflow?

I have already ordered the heatsinks, so I have to wait several days for delivery.

 

Given the positions of the other openings, I really doubt that a few holes on top are sufficient. If you're using a 3.4.x kernel you can at least read out the temperature values of the SoC and PMU.

Store this code here e.g. as

/usr/local/bin/gettemp.sh

and do a

source /usr/local/bin/gettemp.sh
pmutemp
soctemp

afterwards (it might be necessary to become root, and soctemp sometimes only works after you have called it a couple of times). I found that reading out temperatures in mainline is way more difficult/wrong: with kernel 4.x you can query the SoC's temperature sensor using /sys/devices/virtual/thermal/thermal_zone0/temp, but the values provided are slightly off in 4.0 and completely messed up in 4.1.x.
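For the mainline case, a minimal Python sketch of reading that sysfs file. It assumes the file holds a plain integer in millidegrees Celsius, which is the generic convention of the Linux thermal sysfs interface; whether the number is actually trustworthy on a given 4.x kernel is a separate question, as noted above. The function name is illustrative only.

```python
from pathlib import Path
from typing import Optional

# Path as given in the post.
THERMAL_ZONE = "/sys/devices/virtual/thermal/thermal_zone0/temp"


def read_soc_temp(path: str = THERMAL_ZONE) -> Optional[float]:
    """Return the SoC temperature in °C, or None if the file is missing or malformed."""
    try:
        raw = Path(path).read_text().strip()
        return int(raw) / 1000.0  # assumption: value is in millidegrees Celsius
    except (OSError, ValueError):
        return None
```

`read_soc_temp()` returns `None` instead of raising when the thermal zone does not exist, so the same script can run unchanged on kernels without the sensor driver.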

 

Sysfs/I2C integration for the PMU is gone in mainline, and while patches exist (see the end of this page http://sunxi.montjoie.ovh), it's obvious that the measured values are wrong (it was the same with the SoC temperature patches from the same author for kernel 3.4: also wrong values without further adjustment): a SoC temperature of +40.6°C and a PMU temperature of just 23.5°C --> nearly impossible. When I made some extensive tests half a year ago I realized that the IC's temperature is always at least 10°C above ambient (10°/7°C when using an efficient SMD heatsink) and that it's nearly impossible to create a workload on the CPU cores that leads to the SoC's temp exceeding the PMU's by 17°C.
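The two rules of thumb above can be turned into a quick plausibility check. This is only a sketch: the thresholds are the figures from the post, the function name is made up, and applying the "10°C above ambient" rule to both the SoC and the PMU is an assumption of this example.

```python
MIN_RISE_OVER_AMBIENT_C = 10.0  # from the post: ICs run at least 10 °C above ambient
MAX_SOC_OVER_PMU_C = 17.0       # from the post: such a SoC-over-PMU gap is "nearly impossible"


def implausible_readings(soc_c, pmu_c, ambient_c=None):
    """Return a list of reasons why a SoC/PMU reading pair looks wrong (empty if none)."""
    reasons = []
    if soc_c - pmu_c >= MAX_SOC_OVER_PMU_C:
        reasons.append("SoC exceeds PMU by %.1f °C" % (soc_c - pmu_c))
    if ambient_c is not None:
        # Assumption of this sketch: the ambient rule applies to both chips.
        for name, value in (("SoC", soc_c), ("PMU", pmu_c)):
            if value - ambient_c < MIN_RISE_OVER_AMBIENT_C:
                reasons.append("%s is less than 10 °C above ambient" % name)
    return reasons


# The pair quoted in the post: 40.6 - 23.5 = 17.1 °C difference.
print(implausible_readings(40.6, 23.5))
```

With the pair from the post the check flags the 17.1°C gap; passing a known room temperature as `ambient_c` additionally catches readings that sit too close to ambient.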

This board runs OpenWrt with a 3.18 kernel, so the only way to read the temperature is an infrared thermometer. When I experienced the problems, the A20 temperature was ~45°C and the Broadcom chip temperature exceeded 50°C.

Then I turned the box to a vertical position; in the meantime the weather returned to normal and the room temperature dropped to 25°C. Now I have 17 hours of uptime without an outage. I will wait for the heatsink. When I measured the A20's temperature with the 3.4 kernel, the heatsink lowered it by 3°C, so I expect similar behaviour on the Broadcom chip. These 3°C could help the chip survive extreme conditions like those of the last few days.

Again: use a good heatsink, use thermal paste and, most importantly, let convection do the work.

I have a cheap port multiplier here that automatically starts to corrupt data under load due to overheating:

IMG_4942.JPG

After I attached a heatsink and operated the thing vertically, the problems were gone:

IMG_4945.JPG

And the good news is that, the way convection works, the hotter a part gets relative to the others, the better it is cooled.

A little to add to this:

Don't forget that tkaiser and I use an HDD, and that does some cooling too.

Maybe add a spare laptop HDD you have lying around; the HDD housing sits on top of the Broadcom chip and far away from the A20.

It could only get better this way.

If you're going to use the HDD, read the instructions on tkaiser's wiki, because you have to power it another way.

P.

Originally I used an HDD too, but when things went wrong due to temperature, I removed it to reduce heating inside the box. You are right that the HDD can absorb a significant part of the heat from the BCM chip, so I will try to put it back. I have the cable for secondary power via the battery connector.

As a temporary solution I reduced network traffic and disabled torrents and similar heavy loads, and the system is stable now. I am waiting for the Broadcom heatsink. When I have the BCM cooler, I will put the HDD back, enable all traffic and ... :)

Thanks for the help, gentlemen!

Where do you buy pretty heatsinks like this? :) I live in the Czech Republic and they are not available in the local e-shops :(

Hi,

As a quick comment, I'm also doing my best to keep my Banana Pi M1 cool. My tricks:

- Buy a kit of heatsinks; I found a kit with the exact sizes for the processor, memory and PMU

- Cut out the full bottom of the acrylic case. Not just holes or slots, remove all of it (no problems with heatsinks not fitting in between)

- Instead of laying it on the bottom, turn it 90º and build "legs" (with foam). This allows more air to be moved by convection. And of course I had no other option: SATA and HDMI are on that side

- Grooves on the heatsinks are aligned top-to-bottom to ease airflow

I can't really tell the effect of the heatsinks (cheap stuff bought in China), but removing the bottom and turning it on its side gave 10ºC less on hot days

Sorry if these pictures are misleading, let me explain:

- The first rework was removing the cover and placing it at 90º

- The second was adding aluminium heatsinks

I'm missing a picture of 90º + heatsinks; that's why you see the A20 without a heatsink. But it really has one (today)
