ag123

orange pi one (H3): an experiment with a small heat sink


the orange pi one (H3) idles somewhat warm, partly because its power regulator switches only between 1.1 V and 1.3 V. the orange pi pc has a better power regulator and can idle at fairly low temperatures (closer to room temperature) even without a heat sink. under load the H3 runs rather hot, say > 70 deg C

the trouble is that the SoC package is 14x14 mm; a large heat sink won't fit directly on it, and you would need the copper shim method to provide enough clearance for one

 

that is troublesome, hence i made do and tried out *small* heat sinks, e.g.

https://www.aliexpress.com/item/10pcs-Computer-Cooler-Radiator-Aluminum-Heatsink-Heat-sink-for-Electronic-Chip-Heat-dissipation-Cooling-Pads-14/32890197245.html

the dealer provided some thermal tape, which is convenient, but i did not use it; instead i used ordinary heat sink compound like this

https://www.ebay.com/sch/i.html?_from=R40&_nkw=hy510&_sacat=0&_sop=15

the result looks like this; i used a little too much thermal compound and it oozes from the edges

pioneheatsink1.thumb.jpg.691d1d5e3fdd4e5bba6c8f258ae9b579.jpg

 

tests

platform: debian stretch mainline kernel (switched to development / nightly builds) Linux orangepione 4.14.68-sunxi #161 SMP

 

to evaluate how well such a small heat sink works i made the H3 do some math (square matrix multiplication) using code adapted from

https://github.com/mtrebi/matrix-multiplication-threading.git

this probably won't give the best Mflops / Gflops, but it does run all 4 threads concurrently doing the matrix multiplication
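
for readers who don't want to open the repository, here is a minimal sketch of the general idea (split the rows of the result across worker threads and join them). it is not the adapted code itself: the flat row-major `Matrix` layout and the names `multiply_rows` and `multiply_threaded` are my own assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Square matrix stored row-major in a flat vector (assumed layout,
// not necessarily what the adapted benchmark uses).
using Matrix = std::vector<float>;

// Compute rows [row_begin, row_end) of c = a * b for n x n matrices.
static void multiply_rows(const Matrix& a, const Matrix& b, Matrix& c,
                          std::size_t n, std::size_t row_begin,
                          std::size_t row_end) {
    for (std::size_t i = row_begin; i < row_end; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j];  // 1 multiply + 1 add
            }
            c[i * n + j] = sum;
        }
    }
}

// Split the rows into contiguous chunks, one chunk per worker thread,
// then wait for every worker to finish.
Matrix multiply_threaded(const Matrix& a, const Matrix& b,
                         std::size_t n, unsigned num_threads) {
    assert(num_threads > 0);
    assert(a.size() == n * n && b.size() == n * n);
    Matrix c(n * n, 0.0f);
    std::vector<std::thread> workers;
    const std::size_t chunk = (n + num_threads - 1) / num_threads;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;  // more threads than rows
        workers.emplace_back([&a, &b, &c, n, begin, end] {
            multiply_rows(a, b, c, n, begin, end);
        });
    }
    for (auto& w : workers) w.join();
    return c;
}
```

each thread writes a disjoint set of rows of `c`, so no locking is needed. on the H3 one would call it with `num_threads` = 4 to keep all four cores busy.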

the test does a multi-threaded 1000x1000 matrix multiplication; this is 2N^3 flops (2 billion floating point ops for N = 1000)
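
the Mflops figures in the logs follow directly from that operation count and the average time per multiplication. a small sketch of the arithmetic (the function name `matmul_mflops` is just an illustration, not taken from the benchmark source):

```cpp
#include <cassert>

// Mflops for one n x n matrix multiplication that took elapsed_ms
// milliseconds, counting 2 * n^3 floating point operations
// (one multiply and one add per inner loop step).
double matmul_mflops(double n, double elapsed_ms) {
    assert(n > 0 && elapsed_ms > 0);
    const double flops = 2.0 * n * n * n;
    const double seconds = elapsed_ms / 1000.0;
    return flops / seconds / 1e6;
}
```

for example, `matmul_mflops(1000, 11076.8)` gives about 180.56, which matches the 180.558 Mflops reported in the first run.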

 

here are the results

orange pi one

 

no heat sink idle


armbianmonitor -m

Time        CPU    load %cpu %sys %usr %nice %io %irq   CPU  C.St.

00:16:03: 1200MHz  0.04   3%   1%   1%   0%   0%   0% 52.8°C  0/8
00:16:08:  480MHz  0.03   0%   0%   0%   0%   0%   0% 55.9°C  0/8
00:16:14: 1200MHz  0.03   2%   1%   0%   0%   0%   0% 55.8°C  0/8
00:16:19:  480MHz  0.03   1%   0%   0%   0%   0%   0% 56.7°C  0/8
00:16:24: 1200MHz  0.02   1%   1%   0%   0%   0%   0% 55.4°C  0/8
00:16:29:  480MHz  0.02   0%   0%   0%   0%   0%   0% 54.7°C  0/8

heat sink idle


Time        CPU    load %cpu %sys %usr %nice %io %irq   CPU  C.St.
00:14:09:  480MHz  0.14   1%   0%   0%   0%   0%   0% 45.0°C  0/8
00:14:14:  480MHz  0.13   1%   1%   0%   0%   0%   0% 44.2°C  0/8
00:14:19:  480MHz  0.12   3%   1%   1%   0%   0%   0% 42.8°C  0/8
00:14:24:  480MHz  0.11   2%   1%   0%   0%   0%   0% 45.7°C  0/8
00:14:29:  480MHz  0.10   3%   1%   1%   0%   0%   0% 43.9°C  0/8
00:14:35: 1200MHz  0.09  18%   2%  15%   0%   1%   0% 47.3°C  0/8
00:14:40: 1200MHz  0.16  13%   1%  11%   0%   0%   0% 48.2°C  0/8
00:14:45:  480MHz  0.15   1%   0%   0%   0%   0%   0% 44.8°C  0/8
00:14:50:  480MHz  0.14   1%   1%   0%   0%   0%   0% 46.6°C  0/8

 

i noted that the idle temperatures are about 10 deg C lower on average with the heat sink (roughly 45 deg C vs 55 deg C in the two logs above)

 

no heat sink load 5 cycles of 1000x1000 matrix multiplication (each cycle takes about 10-11 secs hence about 1 min run)


Multi thread execution
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
    Average execution took    11076.8 ms
    flops:    180.558Mflops   <<< 180 Mflops
End of program

---------------------------

    cpu 0    cpu 1    cpu 2    cpu 3    
240000    0    0    0    0
480000    1    1    1    1
648000    0    0    0    0
816000    0    0    0    0
912000    0    0    0    0
960000    0    0    0    0
1008000    0    0    0    0
1104000    0    0    0    0
1200000    0    0    0    0
Stop monitoring using [ctrl]-[c]
Time        CPU    load %cpu %sys %usr %nice %io %irq   CPU  C.St.

00:18:19: 1200MHz  0.04   3%   0%   3%   0%   0%   0% 52.5°C  0/8
00:18:24: 1200MHz  0.04   3%   1%   0%   0%   0%   0% 54.8°C  0/8
00:18:34: 1200MHz  0.03  66%   1%  64%   0%   0%   0% 72.8°C  3/8
00:18:47:  960MHz  1.20  94%   0%  94%   0%   0%   0% 74.5°C  2/8
00:18:59:  960MHz  1.71  97%   0%  96%   0%   0%   0% 75.0°C  4/8
00:19:11:  912MHz  2.13  97%   0%  96%   0%   0%   0% 69.1°C  2/8
00:19:22:  960MHz  2.84  97%   0%  96%   0%   0%   0% 77.7°C  3/8  <<< 77 deg C
00:19:28:  480MHz  2.77  48%   0%  47%   0%   0%   0% 60.7°C  1/8
00:19:33:  480MHz  2.55   1%   1%   0%   0%   0%   0% 58.1°C  0/8
00:19:39:  480MHz  2.34   1%   0%   0%   0%   0%   0% 56.4°C  0/8
00:19:44:  480MHz  2.15   1%   1%   0%   0%   0%   0% 56.9°C  0/8
    cpu 0    cpu 1    cpu 2    cpu 3    
240000    0    0    0    0
480000    1873    1873    1873    1873
648000    0    0    0    0
816000    0    0    0    0
912000    640    640    640    640
960000    2389    2389    2389    2389  <<< - cpu spends most of the time here during load
1008000    2354    2354    2354    2354 <<< - cpu spends most of the time here during load
1104000    0    0    0    0
1200000    1722    1722    1722    1722

 

heat sink load 5 cycles of 1000x1000 matrix multiplication (each cycle takes about 9-10 secs hence a bit under 1 min run)


Multi thread execution
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
    Average execution took    9327.4 ms
    flops:    214.422Mflops <<< 214 mflops !
End of program

----

    cpu 0    cpu 1    cpu 2    cpu 3    
240000    0    0    0    0
480000    1    1    1    1
648000    0    0    0    0
816000    0    0    0    0
912000    0    0    0    0
960000    0    0    0    0
1008000    0    0    0    0
1104000    0    0    0    0
1200000    0    0    0    0
Stop monitoring using [ctrl]-[c]
Time        CPU    load %cpu %sys %usr %nice %io %irq   CPU  C.St.

00:38:21: 1200MHz  0.08   9%   0%   8%   0%   0%   0% 46.7°C  0/8
00:38:26:  480MHz  0.08   0%   0%   0%   0%   0%   0% 46.8°C  0/8
00:38:31:  480MHz  0.07   2%   1%   0%   0%   0%   0% 45.4°C  0/8
00:38:36: 1200MHz  0.07   1%   0%   0%   0%   0%   0% 47.1°C  0/8
00:38:41:  480MHz  0.06   0%   0%   0%   0%   0%   0% 45.5°C  0/8
00:38:47: 1200MHz  0.06   3%   1%   1%   0%   0%   0% 45.7°C  0/8
00:38:56: 1200MHz  0.37  58%   0%  57%   0%   0%   0% 58.2°C  0/8
00:39:08: 1200MHz  1.00  97%   0%  97%   0%   0%   0% 61.7°C  0/8
00:39:17: 1200MHz  1.62  97%   0%  97%   0%   0%   0% 62.2°C  0/8
00:39:27: 1200MHz  2.06  97%   0%  96%   0%   0%   0% 63.2°C  0/8
00:39:37: 1008MHz  2.50  97%   0%  97%   0%   0%   0% 64.6°C  1/8 << 64 deg C
00:39:42:  480MHz  2.56  38%   0%  37%   0%   0%   0% 54.8°C  0/8
00:39:47:  480MHz  2.36   1%   1%   0%   0%   0%   0% 53.0°C  0/8
00:39:53:  480MHz  2.17   0%   0%   0%   0%   0%   0% 55.5°C  0/8
Time        CPU    load %cpu %sys %usr %nice %io %irq   CPU  C.St.
00:39:58: 1200MHz  1.99   1%   1%   0%   0%   0%   0% 54.8°C  0/8
00:40:03:  480MHz  1.83   0%   0%   0%   0%   0%   0% 54.1°C  0/8
00:40:08: 1200MHz  1.69   3%   1%   0%   0%   0%   0% 51.8°C  0/8
00:40:13:  480MHz  1.55   1%   0%   0%   0%   0%   0% 50.5°C  0/8
    cpu 0    cpu 1    cpu 2    cpu 3    
240000    0    0    0    0
480000    4243    4243    4243    4243
648000    0    0    0    0
816000    0    0    0    0
912000    0    0    0    0
960000    0    0    0    0
1008000    435    435    435    435
1104000    358    358    358    358
1200000    6717    6717    6717    6717 <<< cpu spends most of the time here

 

ok so that is 180 Mflops (no heat sink) vs 214 Mflops (heat sink), about 19% faster. note that in this particular case i started off at lower temperatures (i.e. just after the board had started), so there was more thermal headroom, which results in a bigger difference. this is fairly representative of a transient load.

 

no heat sink load 10 cycles of 1000x1000 matrix multiplication (each cycle takes about 10-11 secs hence about 2 minutes run)


Multi thread execution
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
    Average execution took    11185.5 ms
    flops:    178.803Mflops  <<< 178 Mflops
End of program

----

    cpu 0    cpu 1    cpu 2    cpu 3    
240000    0    0    0    0
480000    1    1    1    1
648000    0    0    0    0
816000    0    0    0    0
912000    0    0    0    0
960000    0    0    0    0
1008000    0    0    0    0
1104000    0    0    0    0
1200000    0    0    0    0
Stop monitoring using [ctrl]-[c]
Time        CPU    load %cpu %sys %usr %nice %io %irq   CPU  C.St.

00:22:49: 1200MHz  0.22   4%   0%   3%   0%   0%   0% 58.0°C  0/8
00:22:54:  480MHz  0.20   0%   0%   0%   0%   0%   0% 53.8°C  0/8
00:22:59:  480MHz  0.19   1%   0%   0%   0%   0%   0% 54.9°C  0/8
00:23:09: 1008MHz  0.17  60%   1%  58%   0%   0%   0% 65.7°C  3/8
00:23:23: 1008MHz  1.30  97%   0%  96%   0%   0%   0% 69.0°C  2/8
00:23:34:  960MHz  1.79  97%   0%  96%   0%   0%   0% 76.2°C  4/8
00:23:46:  912MHz  2.21  97%   0%  96%   0%   0%   0% 69.6°C  3/8
00:23:58:  960MHz  2.63  97%   0%  96%   0%   0%   0% 71.4°C  2/8
00:24:09: 1008MHz  2.92  97%   0%  97%   0%   0%   0% 73.8°C  3/8 <<< 73.8 deg C
00:24:20:  912MHz  3.16  97%   0%  96%   0%   0%   0% 70.2°C  3/8
00:24:32: 1008MHz  3.56  96%   0%  96%   0%   0%   0% 72.2°C  3/8
00:24:44: 1008MHz  3.70  97%   0%  96%   0%   0%   0% 70.7°C  3/8
00:24:55:  960MHz  3.82  94%   0%  93%   0%   0%   0% 73.9°C  3/8 <<< 73.9 deg C
00:25:02:  480MHz  3.76  49%   0%  48%   0%   0%   0% 63.5°C  1/8
Time        CPU    load %cpu %sys %usr %nice %io %irq   CPU  C.St.
00:25:07:  480MHz  3.45   1%   0%   0%   0%   0%   0% 61.8°C  1/8
00:25:12:  480MHz  3.18   1%   0%   0%   0%   0%   0% 61.8°C  0/8
00:25:17: 1200MHz  2.92   1%   0%   0%   0%   0%   0% 59.8°C  0/8
00:25:23:  480MHz  2.69   0%   0%   0%   0%   0%   0% 61.8°C  0/8
00:25:28: 1200MHz  2.47   1%   0%   0%   0%   0%   0% 63.0°C  0/8
    cpu 0    cpu 1    cpu 2    cpu 3    
240000    0    0    0    0
480000    2752    2752    2752    2752
648000    0    0    0    0
816000    0    0    0    0
912000    2126    2126    2126    2126
960000    6162    6162    6162    6162 <<< cpu spends most of the time here
1008000    3307    3307    3307    3307 <<< cpu spends most of the time here
1104000    424    424    424    424
1200000    1661    1661    1661    1661

 

heat sink load 10 cycles of 1000x1000 matrix multiplication (each cycle takes about 10-11 secs hence about 2 minutes run)


Multi thread execution
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
    Average execution took    10277.3 ms
    flops:    194.604Mflops  <<< 194 Mflops
End of program

----------------------

    cpu 0    cpu 1    cpu 2    cpu 3    
240000    0    0    0    0
480000    0    0    0    0
648000    0    0    0    0
816000    0    0    0    0
912000    0    0    0    0
960000    0    0    0    0
1008000    0    0    0    0
1104000    0    0    0    0
1200000    0    0    0    0
Stop monitoring using [ctrl]-[c]
Time        CPU    load %cpu %sys %usr %nice %io %irq   CPU  C.St.

00:02:40:  480MHz  0.10  18%   1%  17%   0%   0%   0% 49.0°C  0/8
00:02:45:  480MHz  0.09   1%   0%   0%   0%   0%   0% 49.4°C  0/8
00:02:51:  480MHz  0.08   1%   1%   0%   0%   0%   0% 49.7°C  0/8
00:02:56:  480MHz  0.08   1%   0%   0%   0%   0%   0% 51.4°C  0/8
00:03:01:  480MHz  0.07   1%   1%   0%   0%   0%   0% 49.1°C  0/8
00:03:06:  480MHz  0.06   0%   0%   0%   0%   0%   0% 49.2°C  0/8
00:03:15: 1200MHz  0.46  89%   0%  88%   0%   0%   0% 60.9°C  0/8
00:03:25: 1200MHz  1.08  95%   0%  95%   0%   0%   0% 63.3°C  0/8
00:03:35: 1104MHz  1.68  96%   0%  96%   0%   0%   0% 67.8°C  1/8
00:03:46: 1008MHz  2.11  97%   0%  97%   0%   0%   0% 67.8°C  2/8
00:03:56: 1008MHz  2.47  96%   0%  96%   0%   0%   0% 67.0°C  2/8
00:04:07: 1008MHz  2.86  96%   0%  96%   0%   0%   0% 67.9°C  2/8
00:04:18: 1008MHz  3.11  96%   0%  96%   0%   0%   0% 69.2°C  2/8
00:04:29: 1008MHz  3.45  97%   0%  96%   0%   0%   0% 68.4°C  2/8
Time        CPU    load %cpu %sys %usr %nice %io %irq   CPU  C.St.
00:04:40: 1008MHz  3.68  97%   0%  97%   0%   0%   0% 70.5°C  2/8
00:04:51:  960MHz  3.81  95%   0%  94%   0%   0%   0% 71.6°C  3/8  <<< 71.6 deg C
00:04:57:  480MHz  3.59  45%   0%  44%   0%   0%   0% 59.0°C  1/8
00:05:03: 1200MHz  3.30   9%   2%   6%   0%   0%   0% 59.9°C  0/8
00:05:08:  480MHz  3.12  20%   4%  15%   0%   0%   0% 62.0°C  0/8
00:05:13: 1200MHz  2.95  27%   1%  24%   0%   0%   0% 62.6°C  0/8
00:05:18: 1200MHz  2.79  26%   0%  25%   0%   0%   0% 64.1°C  0/8
00:05:23: 1200MHz  2.65  26%   2%  23%   0%   0%   0% 62.7°C  0/8
00:05:28:  480MHz  2.52   6%   2%   3%   0%   0%   0% 60.4°C  0/8
00:05:33: 1200MHz  2.39   2%   1%   0%   0%   0%   0% 57.1°C  0/8
00:05:38: 1200MHz  2.20   2%   0%   0%   1%   0%   0% 56.9°C  0/8
00:05:43: 1200MHz  2.03   2%   0%   0%   0%   0%   0% 55.5°C  0/8
00:05:49:  480MHz  1.86   0%   0%   0%   0%   0%   0% 54.9°C  0/8
00:05:54:  480MHz  1.71   1%   1%   0%   0%   0%   0% 54.8°C  0/8
00:05:59:  480MHz  1.58   1%   0%   0%   0%   0%   0% 55.9°C  0/8
Time        CPU    load %cpu %sys %usr %nice %io %irq   CPU  C.St.
00:06:04:  480MHz  1.45   1%   1%   0%   0%   0%   0% 56.4°C  0/8
    cpu 0    cpu 1    cpu 2    cpu 3    
240000    0    0    0    0
480000    4816    4816    4816    4816
648000    0    0    0    0
816000    0    0    0    0
912000    0    0    0    0
960000    899    899    899    899
1008000    6528    6528    6528    6528 <<< cpu spends most of the time here
1104000    1620    1620    1620    1620
1200000    7056    7056    7056    7056  <<< cpu spends most of the time here

 

ok so that is 178 Mflops without heat sink vs 194 Mflops with heat sink, about 9% faster

no heat sink

960000    6162    6162    6162    6162 <<< cpu spends most of the time here
1008000    3307    3307    3307    3307 <<< cpu spends most of the time here

vs heat sink

1008000    6528    6528    6528    6528 <<< cpu spends most of the time here
1200000    7056    7056    7056    7056 <<< cpu spends most of the time here

this case more likely reflects sustained loads
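
one way to put a single number on "where the cpu spends its time" is a time-weighted mean frequency over the non-idle states of the tables above. a minimal sketch, assuming each table row is a (frequency in kHz, time counter) pair; the names `FreqTime` and `mean_busy_mhz` and the cut-off parameter `min_khz` are made up for illustration:

```cpp
#include <cassert>
#include <utility>
#include <vector>

// One row of a time_in_state style table: {frequency in kHz, time counter}.
using FreqTime = std::pair<long, long>;

// Time-weighted mean frequency in MHz over all states at or above min_khz.
// States below min_khz (e.g. the 480 MHz idle state) are ignored.
double mean_busy_mhz(const std::vector<FreqTime>& rows, long min_khz) {
    double weighted = 0.0;
    double total = 0.0;
    for (const auto& row : rows) {
        if (row.first < min_khz) continue;  // skip idle states
        weighted += static_cast<double>(row.first) *
                    static_cast<double>(row.second);
        total += static_cast<double>(row.second);
    }
    assert(total > 0);
    return weighted / total / 1000.0;  // kHz -> MHz
}
```

fed with the 10-cycle tables above and `min_khz` = 912000, this works out to roughly 998 MHz without the heat sink and roughly 1099 MHz with it. treat it as a rough indicator only: the counters also include time outside the load window (for example short bursts at 1200 MHz while idling), so it is not a clean measurement of the loaded period.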

 

conclusion

the small heat sink is probably inadequate for sustained full load (the cpu still throttles), but it nevertheless made a modest improvement

the cpu is able to spend more time at the higher frequencies under load than without the heat sink, hence the Mflops improvement

for transient conditions (i.e. short high loads) this small heat sink made a visible difference, as seen in the Mflops improvement (180 vs 214)

idle temperatures are somewhat lower (for orange pi one); orange pi pc could do without a heat sink at idle


out of curiosity i fine-tuned the source code to make use of the quad execute for floating point on the cortex a7 (simply by unrolling the loop)

https://community.arm.com/processors/f/discussions/5277/cortex-a7-pipeline-is-non-symmetric-what-does-this-attribute-mean


Multi thread execution
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
    Average execution took    837 ms
    flops:    2389.49Mflops

2389.49Mflops !

 

now that looks like the power of a modern arm cortex a7 processor, roughly 11-13x the earlier 180-214 Mflops. my guess is that the gains come from pipelining efficiencies and the fp quad execute
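
to illustrate what "unroll the loop" can look like, here is a sketch of an inner product unrolled by 4 with four independent accumulators. this is not the attached main.cpp (which is not reproduced here); the function name `dot_unrolled4` is hypothetical, and it assumes both inputs are contiguous arrays (for a matrix multiply that would mean, for instance, transposing the second matrix beforehand).

```cpp
#include <cassert>
#include <cstddef>

// Dot product of two contiguous float arrays, unrolled by 4.
// The four accumulators are independent of each other, which gives the
// compiler and the pipeline more room to overlap multiply-add operations.
float dot_unrolled4(const float* x, const float* y, std::size_t n) {
    assert(x != nullptr && y != nullptr);
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    std::size_t k = 0;
    for (; k + 4 <= n; k += 4) {
        s0 += x[k] * y[k];
        s1 += x[k + 1] * y[k + 1];
        s2 += x[k + 2] * y[k + 2];
        s3 += x[k + 3] * y[k + 3];
    }
    float sum = (s0 + s1) + (s2 + s3);
    for (; k < n; ++k) {  // leftover elements when n is not a multiple of 4
        sum += x[k] * y[k];
    }
    return sum;
}
```

note that the summation order differs from a plain loop, so results can differ in the last bits for floating point data.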

 

the command to build is c++ -O2 -std=gnu++11 -pthread -o mat main.cpp

then run ./mat

sources attached

main.cpp

 

yes, that is on the orange pi one / orange pi pc (H3) with the mainline kernel

and for all that speed the temperatures (with heat sink) looked uneventful (no high temps etc)

soctemp1.png.3e933bf4be1575f5a7b0e130286298cb.png

 
