ag123 Posted September 15, 2018

The Orange Pi One (H3) idles somewhat warm, in part because its power regulator switches only between 1.1 V and 1.3 V. The Orange Pi PC can idle at fairly low temperatures (closer to room temperature, even without a heat sink) thanks to a better power regulator. Under load the H3 runs rather hot, say above 70 °C.

The trouble is that the SoC is only 14x14 mm, so a large heat sink will not fit directly; you would need the copper shim method to gain enough clearance. That is troublesome, so I made do and tried out *small* heat sinks instead, e.g.
https://www.aliexpress.com/item/10pcs-Computer-Cooler-Radiator-Aluminum-Heatsink-Heat-sink-for-Electronic-Chip-Heat-dissipation-Cooling-Pads-14/32890197245.html

The dealer provided some thermal tape, which is convenient, but I did not use it. Instead I used ordinary heat sink compound like this:
https://www.ebay.com/sch/i.html?_from=R40&_nkw=hy510&_sacat=0&_sop=15

The result looks like this; I used a little too much thermal compound and it oozed out at the edges.

Test platform: Debian stretch, mainline kernel (switched to development / nightly builds)
Linux orangepione 4.14.68-sunxi #161 SMP

To evaluate how well such a small heat sink works, I made the H3 do some math (square matrix multiplication) using code adapted from
https://github.com/mtrebi/matrix-multiplication-threading.git
This probably won't give the best Mflops or Gflops, but it does run all 4 threads concurrently on the matrix multiplication. The test does a multi-threaded 1000x1000 matrix multiplication, which is 2N^3 flops (2 billion floating point ops).

Here are the results.

Orange Pi One

no heat sink, idle

armbianmonitor -m
Time      CPU      load  %cpu  %sys  %usr  %nice  %io  %irq  CPU     C.St.
00:16:03: 1200MHz  0.04    3%    1%    1%    0%    0%    0%  52.8°C  0/8
00:16:08:  480MHz  0.03    0%    0%    0%    0%    0%    0%  55.9°C  0/8
00:16:14: 1200MHz  0.03    2%    1%    0%    0%    0%    0%  55.8°C  0/8
00:16:19:  480MHz  0.03    1%    0%    0%    0%    0%    0%  56.7°C  0/8
00:16:24: 1200MHz  0.02    1%    1%    0%    0%    0%    0%  55.4°C  0/8
00:16:29:  480MHz  0.02    0%    0%    0%    0%    0%    0%  54.7°C  0/8

heat sink, idle

Time      CPU      load  %cpu  %sys  %usr  %nice  %io  %irq  CPU     C.St.
00:14:09:  480MHz  0.14    1%    0%    0%    0%    0%    0%  45.0°C  0/8
00:14:14:  480MHz  0.13    1%    1%    0%    0%    0%    0%  44.2°C  0/8
00:14:19:  480MHz  0.12    3%    1%    1%    0%    0%    0%  42.8°C  0/8
00:14:24:  480MHz  0.11    2%    1%    0%    0%    0%    0%  45.7°C  0/8
00:14:29:  480MHz  0.10    3%    1%    1%    0%    0%    0%  43.9°C  0/8
00:14:35: 1200MHz  0.09   18%    2%   15%    0%    1%    0%  47.3°C  0/8
00:14:40: 1200MHz  0.16   13%    1%   11%    0%    0%    0%  48.2°C  0/8
00:14:45:  480MHz  0.15    1%    0%    0%    0%    0%    0%  44.8°C  0/8
00:14:50:  480MHz  0.14    1%    1%    0%    0%    0%    0%  46.6°C  0/8

I noted that the idle temperatures are about 5 degrees lower on average than without the heat sink (in the two samples above the gap is closer to 10 °C).

no heat sink, load
5 cycles of 1000x1000 matrix multiplication (each cycle takes about 10-11 secs, hence about a 1 min run)

Multi thread execution
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Average execution took 11076.8 ms
flops: 180.558Mflops   <<< 180 Mflops
End of program
---------------------------
         cpu 0  cpu 1  cpu 2  cpu 3
 240000      0      0      0      0
 480000      1      1      1      1
 648000      0      0      0      0
 816000      0      0      0      0
 912000      0      0      0      0
 960000      0      0      0      0
1008000      0      0      0      0
1104000      0      0      0      0
1200000      0      0      0      0

Stop monitoring using [ctrl]-[c]
Time      CPU      load  %cpu  %sys  %usr  %nice  %io  %irq  CPU     C.St.
00:18:19: 1200MHz  0.04    3%    0%    3%    0%    0%    0%  52.5°C  0/8
00:18:24: 1200MHz  0.04    3%    1%    0%    0%    0%    0%  54.8°C  0/8
00:18:34: 1200MHz  0.03   66%    1%   64%    0%    0%    0%  72.8°C  3/8
00:18:47:  960MHz  1.20   94%    0%   94%    0%    0%    0%  74.5°C  2/8
00:18:59:  960MHz  1.71   97%    0%   96%    0%    0%    0%  75.0°C  4/8
00:19:11:  912MHz  2.13   97%    0%   96%    0%    0%    0%  69.1°C  2/8
00:19:22:  960MHz  2.84   97%    0%   96%    0%    0%    0%  77.7°C  3/8   <<< 77 deg C
00:19:28:  480MHz  2.77   48%    0%   47%    0%    0%    0%  60.7°C  1/8
00:19:33:  480MHz  2.55    1%    1%    0%    0%    0%    0%  58.1°C  0/8
00:19:39:  480MHz  2.34    1%    0%    0%    0%    0%    0%  56.4°C  0/8
00:19:44:  480MHz  2.15    1%    1%    0%    0%    0%    0%  56.9°C  0/8

         cpu 0  cpu 1  cpu 2  cpu 3
 240000      0      0      0      0
 480000   1873   1873   1873   1873
 648000      0      0      0      0
 816000      0      0      0      0
 912000    640    640    640    640
 960000   2389   2389   2389   2389   <<< cpu spends most of the time here during load
1008000   2354   2354   2354   2354   <<< cpu spends most of the time here during load
1104000      0      0      0      0
1200000   1722   1722   1722   1722

heat sink, load
5 cycles of 1000x1000 matrix multiplication (each cycle takes about 10-11 secs, hence about a 1 min run)

Multi thread execution
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Average execution took 9327.4 ms
flops: 214.422Mflops   <<< 214 Mflops !
End of program
----
         cpu 0  cpu 1  cpu 2  cpu 3
 240000      0      0      0      0
 480000      1      1      1      1
 648000      0      0      0      0
 816000      0      0      0      0
 912000      0      0      0      0
 960000      0      0      0      0
1008000      0      0      0      0
1104000      0      0      0      0
1200000      0      0      0      0

Stop monitoring using [ctrl]-[c]
Time      CPU      load  %cpu  %sys  %usr  %nice  %io  %irq  CPU     C.St.
00:38:21: 1200MHz  0.08    9%    0%    8%    0%    0%    0%  46.7°C  0/8
00:38:26:  480MHz  0.08    0%    0%    0%    0%    0%    0%  46.8°C  0/8
00:38:31:  480MHz  0.07    2%    1%    0%    0%    0%    0%  45.4°C  0/8
00:38:36: 1200MHz  0.07    1%    0%    0%    0%    0%    0%  47.1°C  0/8
00:38:41:  480MHz  0.06    0%    0%    0%    0%    0%    0%  45.5°C  0/8
00:38:47: 1200MHz  0.06    3%    1%    1%    0%    0%    0%  45.7°C  0/8
00:38:56: 1200MHz  0.37   58%    0%   57%    0%    0%    0%  58.2°C  0/8
00:39:08: 1200MHz  1.00   97%    0%   97%    0%    0%    0%  61.7°C  0/8
00:39:17: 1200MHz  1.62   97%    0%   97%    0%    0%    0%  62.2°C  0/8
00:39:27: 1200MHz  2.06   97%    0%   96%    0%    0%    0%  63.2°C  0/8
00:39:37: 1008MHz  2.50   97%    0%   97%    0%    0%    0%  64.6°C  1/8   <<< 64 deg C
00:39:42:  480MHz  2.56   38%    0%   37%    0%    0%    0%  54.8°C  0/8
00:39:47:  480MHz  2.36    1%    1%    0%    0%    0%    0%  53.0°C  0/8
00:39:53:  480MHz  2.17    0%    0%    0%    0%    0%    0%  55.5°C  0/8
Time      CPU      load  %cpu  %sys  %usr  %nice  %io  %irq  CPU     C.St.
00:39:58: 1200MHz  1.99    1%    1%    0%    0%    0%    0%  54.8°C  0/8
00:40:03:  480MHz  1.83    0%    0%    0%    0%    0%    0%  54.1°C  0/8
00:40:08: 1200MHz  1.69    3%    1%    0%    0%    0%    0%  51.8°C  0/8
00:40:13:  480MHz  1.55    1%    0%    0%    0%    0%    0%  50.5°C  0/8

         cpu 0  cpu 1  cpu 2  cpu 3
 240000      0      0      0      0
 480000   4243   4243   4243   4243
 648000      0      0      0      0
 816000      0      0      0      0
 912000      0      0      0      0
 960000      0      0      0      0
1008000    435    435    435    435
1104000    358    358    358    358
1200000   6717   6717   6717   6717   <<< cpu spends most of the time here

OK, so that is 180 Mflops (no heat sink) vs 214 Mflops (heat sink). Note that in this particular case I started off at lower temperatures (i.e. just after the board had started), so there was more thermal headroom, which results in a larger difference. This is quite reflective of a transient load.

no heat sink, load
10 cycles of 1000x1000 matrix multiplication (each cycle takes about 10-11 secs, hence about a 2 minute run)

Multi thread execution
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Average execution took 11185.5 ms
flops: 178.803Mflops   <<< 178 Mflops
End of program
----
         cpu 0  cpu 1  cpu 2  cpu 3
 240000      0      0      0      0
 480000      1      1      1      1
 648000      0      0      0      0
 816000      0      0      0      0
 912000      0      0      0      0
 960000      0      0      0      0
1008000      0      0      0      0
1104000      0      0      0      0
1200000      0      0      0      0

Stop monitoring using [ctrl]-[c]
Time      CPU      load  %cpu  %sys  %usr  %nice  %io  %irq  CPU     C.St.
00:22:49: 1200MHz  0.22    4%    0%    3%    0%    0%    0%  58.0°C  0/8
00:22:54:  480MHz  0.20    0%    0%    0%    0%    0%    0%  53.8°C  0/8
00:22:59:  480MHz  0.19    1%    0%    0%    0%    0%    0%  54.9°C  0/8
00:23:09: 1008MHz  0.17   60%    1%   58%    0%    0%    0%  65.7°C  3/8
00:23:23: 1008MHz  1.30   97%    0%   96%    0%    0%    0%  69.0°C  2/8
00:23:34:  960MHz  1.79   97%    0%   96%    0%    0%    0%  76.2°C  4/8
00:23:46:  912MHz  2.21   97%    0%   96%    0%    0%    0%  69.6°C  3/8
00:23:58:  960MHz  2.63   97%    0%   96%    0%    0%    0%  71.4°C  2/8
00:24:09: 1008MHz  2.92   97%    0%   97%    0%    0%    0%  73.8°C  3/8   <<< 73.8 deg C
00:24:20:  912MHz  3.16   97%    0%   96%    0%    0%    0%  70.2°C  3/8
00:24:32: 1008MHz  3.56   96%    0%   96%    0%    0%    0%  72.2°C  3/8
00:24:44: 1008MHz  3.70   97%    0%   96%    0%    0%    0%  70.7°C  3/8
00:24:55:  960MHz  3.82   94%    0%   93%    0%    0%    0%  73.9°C  3/8   <<< 73.9 deg C
00:25:02:  480MHz  3.76   49%    0%   48%    0%    0%    0%  63.5°C  1/8
Time      CPU      load  %cpu  %sys  %usr  %nice  %io  %irq  CPU     C.St.
00:25:07:  480MHz  3.45    1%    0%    0%    0%    0%    0%  61.8°C  1/8
00:25:12:  480MHz  3.18    1%    0%    0%    0%    0%    0%  61.8°C  0/8
00:25:17: 1200MHz  2.92    1%    0%    0%    0%    0%    0%  59.8°C  0/8
00:25:23:  480MHz  2.69    0%    0%    0%    0%    0%    0%  61.8°C  0/8
00:25:28: 1200MHz  2.47    1%    0%    0%    0%    0%    0%  63.0°C  0/8

         cpu 0  cpu 1  cpu 2  cpu 3
 240000      0      0      0      0
 480000   2752   2752   2752   2752
 648000      0      0      0      0
 816000      0      0      0      0
 912000   2126   2126   2126   2126
 960000   6162   6162   6162   6162   <<< cpu spends most of the time here
1008000   3307   3307   3307   3307   <<< cpu spends most of the time here
1104000    424    424    424    424
1200000   1661   1661   1661   1661

heat sink, load
10 cycles of 1000x1000 matrix multiplication (each cycle takes about 10-11 secs, hence about a 2 minute run)

Multi thread execution
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Average execution took 10277.3 ms
flops: 194.604Mflops   <<< 194 Mflops
End of program
----------------------
         cpu 0  cpu 1  cpu 2  cpu 3
 240000      0      0      0      0
 480000      0      0      0      0
 648000      0      0      0      0
 816000      0      0      0      0
 912000      0      0      0      0
 960000      0      0      0      0
1008000      0      0      0      0
1104000      0      0      0      0
1200000      0      0      0      0

Stop monitoring using [ctrl]-[c]
Time      CPU      load  %cpu  %sys  %usr  %nice  %io  %irq  CPU     C.St.
00:02:40:  480MHz  0.10   18%    1%   17%    0%    0%    0%  49.0°C  0/8
00:02:45:  480MHz  0.09    1%    0%    0%    0%    0%    0%  49.4°C  0/8
00:02:51:  480MHz  0.08    1%    1%    0%    0%    0%    0%  49.7°C  0/8
00:02:56:  480MHz  0.08    1%    0%    0%    0%    0%    0%  51.4°C  0/8
00:03:01:  480MHz  0.07    1%    1%    0%    0%    0%    0%  49.1°C  0/8
00:03:06:  480MHz  0.06    0%    0%    0%    0%    0%    0%  49.2°C  0/8
00:03:15: 1200MHz  0.46   89%    0%   88%    0%    0%    0%  60.9°C  0/8
00:03:25: 1200MHz  1.08   95%    0%   95%    0%    0%    0%  63.3°C  0/8
00:03:35: 1104MHz  1.68   96%    0%   96%    0%    0%    0%  67.8°C  1/8
00:03:46: 1008MHz  2.11   97%    0%   97%    0%    0%    0%  67.8°C  2/8
00:03:56: 1008MHz  2.47   96%    0%   96%    0%    0%    0%  67.0°C  2/8
00:04:07: 1008MHz  2.86   96%    0%   96%    0%    0%    0%  67.9°C  2/8
00:04:18: 1008MHz  3.11   96%    0%   96%    0%    0%    0%  69.2°C  2/8
00:04:29: 1008MHz  3.45   97%    0%   96%    0%    0%    0%  68.4°C  2/8
Time      CPU      load  %cpu  %sys  %usr  %nice  %io  %irq  CPU     C.St.
00:04:40: 1008MHz  3.68   97%    0%   97%    0%    0%    0%  70.5°C  2/8
00:04:51:  960MHz  3.81   95%    0%   94%    0%    0%    0%  71.6°C  3/8   <<< 71.6 deg C
00:04:57:  480MHz  3.59   45%    0%   44%    0%    0%    0%  59.0°C  1/8
00:05:03: 1200MHz  3.30    9%    2%    6%    0%    0%    0%  59.9°C  0/8
00:05:08:  480MHz  3.12   20%    4%   15%    0%    0%    0%  62.0°C  0/8
00:05:13: 1200MHz  2.95   27%    1%   24%    0%    0%    0%  62.6°C  0/8
00:05:18: 1200MHz  2.79   26%    0%   25%    0%    0%    0%  64.1°C  0/8
00:05:23: 1200MHz  2.65   26%    2%   23%    0%    0%    0%  62.7°C  0/8
00:05:28:  480MHz  2.52    6%    2%    3%    0%    0%    0%  60.4°C  0/8
00:05:33: 1200MHz  2.39    2%    1%    0%    0%    0%    0%  57.1°C  0/8
00:05:38: 1200MHz  2.20    2%    0%    0%    1%    0%    0%  56.9°C  0/8
00:05:43: 1200MHz  2.03    2%    0%    0%    0%    0%    0%  55.5°C  0/8
00:05:49:  480MHz  1.86    0%    0%    0%    0%    0%    0%  54.9°C  0/8
00:05:54:  480MHz  1.71    1%    1%    0%    0%    0%    0%  54.8°C  0/8
00:05:59:  480MHz  1.58    1%    0%    0%    0%    0%    0%  55.9°C  0/8
Time      CPU      load  %cpu  %sys  %usr  %nice  %io  %irq  CPU     C.St.
00:06:04:  480MHz  1.45    1%    1%    0%    0%    0%    0%  56.4°C  0/8

         cpu 0  cpu 1  cpu 2  cpu 3
 240000      0      0      0      0
 480000   4816   4816   4816   4816
 648000      0      0      0      0
 816000      0      0      0      0
 912000      0      0      0      0
 960000    899    899    899    899
1008000   6528   6528   6528   6528   <<< cpu spends most of the time here
1104000   1620   1620   1620   1620
1200000   7056   7056   7056   7056   <<< cpu spends most of the time here

OK, so that is 178 Mflops with no heat sink vs 194 Mflops with the heat sink.

no heat sink
 960000   6162   6162   6162   6162   <<< cpu spends most of the time here
1008000   3307   3307   3307   3307   <<< cpu spends most of the time here

vs heat sink
1008000   6528   6528   6528   6528   <<< cpu spends most of the time here
1200000   7056   7056   7056   7056   <<< cpu spends most of the time here

This case more likely reflects sustained loads.

Conclusion

- The small heat sink is probably inadequate, but it nevertheless made a modest improvement.
- With the heat sink the CPU is able to spend more of its time at the higher frequencies under load, hence the Mflops improvement.
- For transient conditions (i.e. short high loads) this small heat sink apparently made a visible difference, as seen in the Mflops improvement.
- Idle temperatures are somewhat lower (for the Orange Pi One); the Orange Pi PC could do without a heat sink at idle.
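For anyone who wants to check the figures above, here is a minimal sketch of the arithmetic. It assumes the benchmark reports 2N^3 flops divided by the average time per cycle, which matches the logged numbers. The function names `mflops` and `mean_mhz` are made up for this illustration and do not come from the benchmark source. `mean_mhz` takes (frequency in kHz, time count) pairs copied by hand from one of the frequency tables and returns a time-weighted mean frequency.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: throughput in Mflops for one n x n matrix
// multiplication that took `ms` milliseconds, counting 2*n^3 floating
// point operations as described in the post.
double mflops(double n, double ms) {
    assert(ms > 0.0);
    const double flops = 2.0 * n * n * n;
    return flops / (ms / 1000.0) / 1e6;
}

// Hypothetical helper: time-weighted mean frequency in MHz.
// Each pair is (frequency in kHz, time spent at that frequency), taken from
// one column of a frequency table. The unit of the time count cancels out.
double mean_mhz(const std::vector<std::pair<double, double>>& states) {
    double weighted = 0.0;
    double total = 0.0;
    for (const auto& s : states) {
        weighted += s.first * s.second;
        total += s.second;
    }
    assert(total > 0.0);
    return weighted / total / 1000.0;  // kHz -> MHz
}
```

`mflops(1000, 11076.8)` comes out at about 180.558, the same as the logged value. Feeding `mean_mhz` the 10-cycle tables, and leaving out the 480000 row as mostly idle time (an assumption, since the 1200000 row also contains some idle samples), gives roughly 998 MHz without the heat sink and roughly 1099 MHz with it. That is about a 10% difference, in the same range as the roughly 9% Mflops gain (178.8 vs 194.6).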
ag123 (Author) Posted September 16, 2018

Out of curiosity, I fine-tuned the source code to make use of the quad execute for floating point on the Cortex-A7 (simply by unrolling the loop):
https://community.arm.com/processors/f/discussions/5277/cortex-a7-pipeline-is-non-symmetric-what-does-this-attribute-mean

Multi thread execution
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Calculating....
Finishing multithreading execution...
Average execution took 837 ms
flops: 2389.49Mflops

2389.49 Mflops! Now that looks like the power of a modern ARM Cortex-A7 processor. My guess is that the gains come from pipelining efficiencies and the FP quad execute.

The command to build is
c++ -O2 -std=gnu++11 -pthread -o mat main.cpp
then run
./mat

Sources attached: main.cpp

Yes, that is the Orange Pi One or Orange Pi PC (H3) on the mainline kernel. And at those speeds the temperatures (with heat sink) looked uneventful (no high temps etc).
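Since the attached main.cpp is not reproduced in the thread, here is only a rough sketch of what "unroll the loop" could look like for the inner product of a matrix multiplication. Everything in it is an assumption for illustration: the names `dot_unrolled4` and `matmul_unrolled`, the use of `float`, the unroll factor of four, and the choice to pass the second matrix already transposed. The idea is that four independent accumulators give the FP pipeline work that does not wait on the previous addition.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not taken from the attached main.cpp): dot product
// of two length-n arrays, unrolled by four with independent accumulators.
float dot_unrolled4(const float* a, const float* b, std::size_t n) {
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    std::size_t k = 0;
    for (; k + 4 <= n; k += 4) {
        s0 += a[k]     * b[k];
        s1 += a[k + 1] * b[k + 1];
        s2 += a[k + 2] * b[k + 2];
        s3 += a[k + 3] * b[k + 3];
    }
    float sum = (s0 + s1) + (s2 + s3);
    for (; k < n; ++k) {  // leftover elements when n is not a multiple of 4
        sum += a[k] * b[k];
    }
    return sum;
}

// Hypothetical helper: c = a * b for n x n row-major matrices, where `bt`
// holds b transposed so that both operands are read along rows.
void matmul_unrolled(const std::vector<float>& a,
                     const std::vector<float>& bt,
                     std::vector<float>& c,
                     std::size_t n) {
    assert(a.size() == n * n && bt.size() == n * n && c.size() == n * n);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            c[i * n + j] = dot_unrolled4(&a[i * n], &bt[j * n], n);
        }
    }
}
```

It should build with the same flags as above (`-O2 -std=gnu++11`). Whether this particular form accounts for the speedup reported in the post is not something the thread confirms; reading the transposed matrix along rows may also help with caching, which would be a separate effect from the unrolling.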