Trying to compile Pine H64

kissste · May 2, 2018

Trying to get Armbian compiled for the Pine H64.

Unfortunately, I'm hitting this roadblock:

drivers/net/wireless/xradio/sdio.c:17:10: fatal error: asm/mach-types.h: No such file or directory
 #include <asm/mach-types.h>
          ^~~~~~~~~~~~~~~~~~
compilation terminated.
scripts/Makefile.build:312: recipe for target 'drivers/net/wireless/xradio/sdio.o' failed
make[4]: *** [drivers/net/wireless/xradio/sdio.o] Error 1
scripts/Makefile.build:559: recipe for target 'drivers/net/wireless/xradio' failed
make[3]: *** [drivers/net/wireless/xradio] Error 2
scripts/Makefile.build:559: recipe for target 'drivers/net/wireless' failed
make[2]: *** [drivers/net/wireless] Error 2
scripts/Makefile.build:559: recipe for target 'drivers/net' failed
make[1]: *** [drivers/net] Error 2
Makefile:1060: recipe for target 'drivers' failed
make: *** [drivers] Error 2
[ error ] ERROR in function compile_kernel [ compilation.sh:337 ]
[ error ] Kernel was not built [ @host ]
[ o.k. ] Process terminated

Any suggestions?

Is the xradio driver even needed? I will try to exclude it from the build.

Thank you
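
A note on the error above: as far as I know, asm/mach-types.h is only provided for 32-bit ARM (arch/arm) and has no arm64 counterpart, which would explain why a driver written for 32-bit boards fails in an arm64 build. Below is a minimal Python sketch for listing every file in a driver directory that pulls in such a header. The function name find_includes and the demo tree are made up for illustration; only the header name and the file name sdio.c come from the log.

```python
import tempfile
from pathlib import Path


def find_includes(root, header):
    """Return (relative path, line number) pairs for every .c/.h file
    under `root` that includes `header`."""
    needles = ("#include <%s>" % header, '#include "%s"' % header)
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in (".c", ".h"):
            continue
        text = path.read_text(errors="replace")
        for lineno, line in enumerate(text.splitlines(), 1):
            if line.strip().startswith(needles):
                hits.append((path.relative_to(root).as_posix(), lineno))
    return hits


# Demo on a throwaway tree (hypothetical contents, not the real driver).
with tempfile.TemporaryDirectory() as tmp:
    driver = Path(tmp) / "xradio"
    driver.mkdir()
    (driver / "sdio.c").write_text(
        "#include <linux/module.h>\n#include <asm/mach-types.h>\n")
    (driver / "main.c").write_text("#include <linux/module.h>\n")
    for name, lineno in find_includes(tmp, "asm/mach-types.h"):
        print("%s:%d" % (name, lineno))
```

Pointing find_includes at the real drivers/net/wireless/xradio directory of the kernel tree would show whether sdio.c is the only file affected before deciding between patching the include and dropping the driver.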

kissste · May 2, 2018

This is what I get after excluding the xradio driver from the build.

 CC      drivers/of/configfs.o
drivers/of/configfs.c: In function ‘create_overlay’:
drivers/of/configfs.c:67:8: error: implicit declaration of function ‘of_overlay_create’; did you mean ‘of_overlay_remove’? [-Werror=implicit-function-declaration]
  err = of_overlay_create(overlay->overlay);
        ^~~~~~~~~~~~~~~~~
        of_overlay_remove
drivers/of/configfs.c: In function ‘cfs_overlay_release’:
drivers/of/configfs.c:218:3: error: implicit declaration of function ‘of_overlay_destroy’; did you mean ‘of_overlay_remove’? [-Werror=implicit-function-declaration]
   of_overlay_destroy(overlay->ov_id);
   ^~~~~~~~~~~~~~~~~~
   of_overlay_remove
cc1: some warnings being treated as errors
scripts/Makefile.build:312: recipe for target 'drivers/of/configfs.o' failed
make[2]: *** [drivers/of/configfs.o] Error 1
scripts/Makefile.build:559: recipe for target 'drivers/of' failed
make[1]: *** [drivers/of] Error 2
Makefile:1060: recipe for target 'drivers' failed
make: *** [drivers] Error 2
[ error ] ERROR in function compile_kernel [ compilation.sh:337 ]
[ error ] Kernel was not built [ @host ]
[ o.k. ] Process terminated

constantius · May 12, 2018

I downloaded the Debian Stretch 5.43 dev image. Is there no video output? Only SSH?

Igor · May 13, 2018

12 hours ago, constantius said:

Is there no video output? only SSH?

Yes, video out is not yet supported.

constantius · May 17, 2018

In the Orange Pi download section there are Debian Jessie images with an Xfce 4 desktop. I mean for the Orange Pi One Plus and Lite2, both with the H6 CPU.

A GPU driver exists, because they must have used it. Only audio is not supported; I have to use an external USB audio card. These boards are suited only for server use because they have 1 GB of RAM.

chwe · May 17, 2018

3 minutes ago, constantius said:

In the Orange Pi download section there are Debian Jessie images with an Xfce 4 desktop. I mean for the Orange Pi One Plus and Lite2, both with the H6 CPU.

A GPU driver exists, because they must have used it.

uname -r will tell you that they use an old BSP kernel (3.x, I forgot which one). Armbian doesn't deal with BSP kernels for this board. Mainline only, which means that whatever is not mainlined is for sure not supported yet (and probably nobody has an interest in fixing the BSP kernel). Keep in mind, this board is at an early (early early early) stage. If you expect something that can be used in production, that's (at the moment) the wrong SoC if you want to use Armbian. Probably every other SoC is better supported at the moment. I think Igor uploaded those images so that people can test and fix things, not use them in production. @Igor maybe we should write a big red reminder in the download section?

Quote

Don't use these images in production; they're only provided for testing!

serial console only

Bootlog

I know it has a bunch of red flags there, but this seems not to be enough.

Igor · May 17, 2018

23 minutes ago, chwe said:

maybe we should write a big red reminder in the download section?

I don't think so. It's already impossible to miss those signs. Sometimes people just overlook them or choose to ignore them, or make wrong assumptions based on weak or wrong technical knowledge or some belief. This entire project helps people understand. Remember your school days: even when the teacher explained everything and wrote things on the board, one or more people, usually always the same ones, needed extra clarification.

Just add remarks to the "WWW update wish list" and something can be done. Perhaps a JavaScript popup on nightly/preview images when clicking the download button.

chwe · May 17, 2018

18 minutes ago, Igor said:

I don't think so. It's already impossible to miss those signs.

:rolleyes:

Providing them only via dl.armbian.com might help, but then they would see no red flags and come up with other questions. Let's hope that @Icenowy keeps an interest in mainlining the H6.

Igor · May 17, 2018

6 minutes ago, chwe said:

Providing them only via dl.armbian.com might help, but then they would see no red flags and come up with other questions.

Exactly. It's good as is.

Now, with a new website, things are done better and this should not be a problem. And I don't think we have one. All those signs, flags and rules exist, in essence, to limit (repeated) questions, but it's impossible to suppress them entirely.

7 minutes ago, chwe said:

Let's hope

Development progress is slow: much, much slower, and far more time-consuming, than the average person thinks. Only education can change this perception.

tkaiser · July 31, 2018

If anyone manages to build an H6 kernel with ths/dvfs/cpufreq support (using @Icenowy's h6-ths-ugly branch), I would be happy to receive sbc-bench numbers for the PineH64... After applying this patch, of course.

Icenowy · July 31, 2018

1 hour ago, tkaiser said:

If anyone manages to build an H6 kernel with ths/dvfs/cpufreq support (using @Icenowy's h6-ths-ugly branch), I would be happy to receive sbc-bench numbers for the PineH64... After applying this patch, of course.

1.16V is beyond the recommended operation range, so I didn't add it now.

BTW, the H6 comes with two official DVFS tables, selected by "speed bin"; maybe this needs to be implemented. (The "speed bin" info is in the SID.)

tkaiser · July 31, 2018

14 minutes ago, Icenowy said:

The H6 comes with two official DVFS tables, selected by "speed bin"; maybe this needs to be implemented. (The "speed bin" info is in the SID.)

Interesting

My 1.16V was just copy&paste from this entry; I just realized that there is a second OPP table below with lower voltage values for the higher OPPs. Maybe Allwinner does chip selection after factory testing and then sells those that can cope with lower voltages at higher clock speeds at slightly higher prices?

Icenowy · July 31, 2018

7 hours ago, tkaiser said:

Interesting

My 1.16V was just copy&paste from this entry; I just realized that there is a second OPP table below with lower voltage values for the higher OPPs. Maybe Allwinner does chip selection after factory testing and then sells those that can cope with lower voltages at higher clock speeds at slightly higher prices?

Check drivers/soc/allwinner/sunxi-sid.c; it mentions the "speed bin".
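
To make the "two DVFS tables selected by speed bin" idea concrete, here is a rough Python sketch of the selection logic, plus a voltage cap in the spirit of the "beyond the recommended operation range" remark. It is only an illustration: the function names are invented, and every frequency/voltage pair in EXAMPLE_TABLES is a placeholder, not a value taken from the real H6 tables or from the SID.

```python
def select_opp_table(speed_bin, tables):
    """Pick the OPP table for a chip's speed bin, sorted by frequency."""
    if speed_bin not in tables:
        raise KeyError("no OPP table for speed bin %r" % speed_bin)
    return sorted(tables[speed_bin])


def cap_voltage(table, max_mv):
    """Drop every operating point that needs more than `max_mv` millivolts."""
    return [(mhz, mv) for mhz, mv in table if mv <= max_mv]


# PLACEHOLDER numbers for illustration only, NOT the real H6 tables.
# Each entry is (frequency in MHz, voltage in mV); the key is the speed bin.
EXAMPLE_TABLES = {
    0: [(1488, 1000), (1800, 1160)],
    1: [(1488, 960), (1800, 1100)],
}

table = select_opp_table(0, EXAMPLE_TABLES)
# With a 1100 mV ceiling the placeholder 1800 MHz point of bin 0 is dropped.
print(cap_voltage(table, max_mv=1100))
```

In a real kernel this choice would be made from the device tree and the value read from the SID rather than from a Python dictionary; the sketch only shows why a single hard-coded table cannot serve both bins.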

@lex · July 31, 2018

If you don't mind, I'll throw in some numbers grabbed from the Opi One Plus so we can all compare them to the PineH64. I am interested in seeing how the Opi stacks up against the PineH64 and how it could be improved.

This is @Icenowy's kernel. No patch applied, to save my one and only Opi H6. I forgot or missed how to enable crypto; I'm still learning how to turn this little beast ON... Maybe @Igor or @zador.blood.stained has a better kernel config.

There are still some limitations on the Opi; that's what I could get for the moment.

7z b

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,4 CPUs LE)

CPU Freq: 1485 1485 1485 1485 1485 1485 1485 1485 1485

RAM size: 991 MB, # CPU hardware threads: 4
RAM usage: 882 MB, # Benchmark threads: 4

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:       2419   312    754   2353  |      64613   395   1395   5513
23:       2415   317    776   2461  |      63592   396   1389   5502
24:       2389   317    810   2569  |      62442   396   1384   5482
25:       2354   317    848   2688  |      61257   395   1381   5452
----------------------------------  | ------------------------------
Avr:             316    797   2518  |              396   1387   5487
Tot:             356   1092   4002

sysbench:

Initializing random number generator from current time

Prime numbers limit: 2000

Initializing worker threads...

Threads started!

CPU speed:
events per second: 24389.85

General statistics:
total time: 10.0002s
total number of events: 244025

Latency (ms):
min: 0.16
avg: 0.16
max: 0.29
95th percentile: 0.16
sum: 39890.75

Threads fairness:
events (avg/stddev): 61006.2500/15.11
execution time (avg/stddev): 9.9727/0.00

cpu-freq:

cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
Report errors and bugs to cpufreq@vger.kernel.org, please.
analyzing CPU 1:
driver: cpufreq-dt
CPUs which run at the same hardware frequency: 0 1 2 3
CPUs which need to have their frequency coordinated by software: 0 1 2 3
maximum transition latency: 244 us.
hardware limits: 888 MHz - 1.49 GHz
available frequency steps: 888 MHz, 1.01 GHz, 1.32 GHz, 1.49 GHz
available cpufreq governors: conservative, powersave, ondemand, userspace, performance, schedutil
current policy: frequency should be within 888 MHz and 1.49 GHz.
The governor "performance" may decide which speed to use
within this range.
current CPU frequency is 1.49 GHz.
cpufreq stats: 888 MHz:66.49%, 1.01 GHz:0.37%, 1.32 GHz:0.12%, 1.49 GHz:33.02% (499)

tinymembench

tinymembench v0.4.9 (simple benchmark for memory throughput and latency)

==========================================================================
== Memory bandwidth tests ==
== ==
== Note 1: 1MB = 1000000 bytes ==
== Note 2: Results for 'copy' tests show how many bytes can be ==
== copied per second (adding together read and writen ==
== bytes would have provided twice higher numbers) ==
== Note 3: 2-pass copy means that we are using a small temporary buffer ==
== to first fetch data into it, and only then write it to the ==
== destination (source -> L1 cache, L1 cache -> destination) ==
== Note 4: If sample standard deviation exceeds 0.1%, it is shown in ==
== brackets ==
==========================================================================

C copy backwards : 1632.8 MB/s (1.4%)
C copy backwards (32 byte blocks) : 1651.2 MB/s (2.1%)
C copy backwards (64 byte blocks) : 1710.7 MB/s (2.0%)
C copy : 1706.8 MB/s (1.6%)
C copy prefetched (32 bytes step) : 1297.3 MB/s (0.3%)
C copy prefetched (64 bytes step) : 1156.2 MB/s
C 2-pass copy : 1472.8 MB/s
C 2-pass copy prefetched (32 bytes step) : 983.5 MB/s
C 2-pass copy prefetched (64 bytes step) : 1005.9 MB/s
C fill : 5573.3 MB/s
C fill (shuffle within 16 byte blocks) : 5572.5 MB/s
C fill (shuffle within 32 byte blocks) : 5573.2 MB/s (0.2%)
C fill (shuffle within 64 byte blocks) : 5572.4 MB/s
---
standard memcpy : 1750.2 MB/s (0.3%)
standard memset : 5577.3 MB/s
---
NEON LDP/STP copy : 1753.6 MB/s (0.2%)
NEON LDP/STP copy pldl2strm (32 bytes step) : 1122.7 MB/s (0.6%)
NEON LDP/STP copy pldl2strm (64 bytes step) : 1426.0 MB/s
NEON LDP/STP copy pldl1keep (32 bytes step) : 1871.2 MB/s
NEON LDP/STP copy pldl1keep (64 bytes step) : 1872.8 MB/s
NEON LD1/ST1 copy : 1728.2 MB/s (0.9%)
NEON STP fill : 5577.6 MB/s
NEON STNP fill : 5062.6 MB/s (2.2%)
ARM LDP/STP copy : 1755.4 MB/s (0.2%)
ARM STP fill : 5577.5 MB/s
ARM STNP fill : 5002.3 MB/s

==========================================================================
== Memory latency test ==
== ==
== Average time is measured for random memory accesses in the buffers ==
== of different sizes. The larger is the buffer, the more significant ==
== are relative contributions of TLB, L1/L2 cache misses and SDRAM ==
== accesses. For extremely large buffer sizes we are expecting to see ==
== page table walk with several requests to SDRAM for almost every ==
== memory access (though 64MiB is not nearly large enough to experience ==
== this effect to its fullest). ==
== ==
== Note 1: All the numbers are representing extra time, which needs to ==
== be added to L1 cache latency. The cycle timings for L1 cache ==
== latency can be usually found in the processor documentation. ==
== Note 2: Dual random read means that we are simultaneously performing ==
== two independent memory accesses at a time. In the case if ==
== the memory subsystem can't handle multiple outstanding ==
== requests, dual random read has the same timings as two ==
== single reads performed one after another. ==
==========================================================================

block size : single random read / dual random read, [MADV_NOHUGEPAGE]
1024 : 0.0 ns / 0.0 ns
2048 : 0.0 ns / 0.0 ns
4096 : 0.0 ns / 0.0 ns
8192 : 0.0 ns / 0.0 ns
16384 : 0.0 ns / 0.0 ns
32768 : 0.0 ns / 0.0 ns
65536 : 4.6 ns / 7.7 ns
131072 : 7.0 ns / 10.8 ns
262144 : 8.3 ns / 12.0 ns
524288 : 9.6 ns / 13.4 ns
1048576 : 77.1 ns / 117.7 ns
2097152 : 113.6 ns / 151.4 ns
4194304 : 137.5 ns / 169.7 ns
8388608 : 149.5 ns / 178.0 ns
16777216 : 157.4 ns / 184.1 ns
33554432 : 162.3 ns / 188.6 ns
67108864 : 164.8 ns / 191.4 ns

block size : single random read / dual random read, [MADV_HUGEPAGE]
1024 : 0.0 ns / 0.0 ns
2048 : 0.0 ns / 0.0 ns
4096 : 0.0 ns / 0.0 ns
8192 : 0.0 ns / 0.0 ns
16384 : 0.0 ns / 0.0 ns
32768 : 0.0 ns / 0.0 ns
65536 : 4.6 ns / 7.7 ns
131072 : 7.0 ns / 10.8 ns
262144 : 8.3 ns / 12.0 ns
524288 : 9.7 ns / 13.4 ns
1048576 : 77.0 ns / 117.7 ns
2097152 : 113.0 ns / 150.8 ns
4194304 : 130.8 ns / 161.6 ns
8388608 : 140.5 ns / 165.6 ns
16777216 : 145.5 ns / 167.2 ns
33554432 : 148.0 ns / 168.0 ns
67108864 : 149.2 ns / 168.3 ns

Edited July 31, 2018 by @lex
tinymembench added

Icenowy · August 1, 2018

2 hours ago, @lex said:

If you don't mind, I'll throw in some numbers grabbed from the Opi One Plus so we can all compare them to the PineH64. I am interested in seeing how the Opi stacks up against the PineH64 and how it could be improved.

This is @Icenowy's kernel. No patch applied, to save my one and only Opi H6. I forgot or missed how to enable crypto; I'm still learning how to turn this little beast ON... Maybe @Igor or @zador.blood.stained has a better kernel config.

There are still some limitations on the Opi; that's what I could get for the moment.

7z b

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,4 CPUs LE)

CPU Freq: 1485 1485 1485 1485 1485 1485 1485 1485 1485

RAM size: 991 MB, # CPU hardware threads: 4
RAM usage: 882 MB, # Benchmark threads: 4

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:       2419   312    754   2353  |      64613   395   1395   5513
23:       2415   317    776   2461  |      63592   396   1389   5502
24:       2389   317    810   2569  |      62442   396   1384   5482
25:       2354   317    848   2688  |      61257   395   1381   5452
----------------------------------  | ------------------------------
Avr:             316    797   2518  |              396   1387   5487
Tot:             356   1092   4002

sysbench:

Initializing random number generator from current time

Prime numbers limit: 2000

Initializing worker threads...

Threads started!

CPU speed:
events per second: 24389.85

General statistics:
total time: 10.0002s
total number of events: 244025

Latency (ms):
min: 0.16
avg: 0.16
max: 0.29
95th percentile: 0.16
sum: 39890.75

Threads fairness:
events (avg/stddev): 61006.2500/15.11
execution time (avg/stddev): 9.9727/0.00

cpu-freq:

cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
Report errors and bugs to cpufreq@vger.kernel.org, please.
analyzing CPU 1:
driver: cpufreq-dt
CPUs which run at the same hardware frequency: 0 1 2 3
CPUs which need to have their frequency coordinated by software: 0 1 2 3
maximum transition latency: 244 us.
hardware limits: 888 MHz - 1.49 GHz
available frequency steps: 888 MHz, 1.01 GHz, 1.32 GHz, 1.49 GHz
available cpufreq governors: conservative, powersave, ondemand, userspace, performance, schedutil
current policy: frequency should be within 888 MHz and 1.49 GHz.
The governor "performance" may decide which speed to use
within this range.
current CPU frequency is 1.49 GHz.
cpufreq stats: 888 MHz:66.49%, 1.01 GHz:0.37%, 1.32 GHz:0.12%, 1.49 GHz:33.02% (499)

tinymembench

tinymembench v0.4.9 (simple benchmark for memory throughput and latency)

==========================================================================
== Memory bandwidth tests ==
== ==
== Note 1: 1MB = 1000000 bytes ==
== Note 2: Results for 'copy' tests show how many bytes can be ==
== copied per second (adding together read and writen ==
== bytes would have provided twice higher numbers) ==
== Note 3: 2-pass copy means that we are using a small temporary buffer ==
== to first fetch data into it, and only then write it to the ==
== destination (source -> L1 cache, L1 cache -> destination) ==
== Note 4: If sample standard deviation exceeds 0.1%, it is shown in ==
== brackets ==
==========================================================================

C copy backwards : 1632.8 MB/s (1.4%)
C copy backwards (32 byte blocks) : 1651.2 MB/s (2.1%)
C copy backwards (64 byte blocks) : 1710.7 MB/s (2.0%)
C copy : 1706.8 MB/s (1.6%)
C copy prefetched (32 bytes step) : 1297.3 MB/s (0.3%)
C copy prefetched (64 bytes step) : 1156.2 MB/s
C 2-pass copy : 1472.8 MB/s
C 2-pass copy prefetched (32 bytes step) : 983.5 MB/s
C 2-pass copy prefetched (64 bytes step) : 1005.9 MB/s
C fill : 5573.3 MB/s
C fill (shuffle within 16 byte blocks) : 5572.5 MB/s
C fill (shuffle within 32 byte blocks) : 5573.2 MB/s (0.2%)
C fill (shuffle within 64 byte blocks) : 5572.4 MB/s
---
standard memcpy : 1750.2 MB/s (0.3%)
standard memset : 5577.3 MB/s
---
NEON LDP/STP copy : 1753.6 MB/s (0.2%)
NEON LDP/STP copy pldl2strm (32 bytes step) : 1122.7 MB/s (0.6%)
NEON LDP/STP copy pldl2strm (64 bytes step) : 1426.0 MB/s
NEON LDP/STP copy pldl1keep (32 bytes step) : 1871.2 MB/s
NEON LDP/STP copy pldl1keep (64 bytes step) : 1872.8 MB/s
NEON LD1/ST1 copy : 1728.2 MB/s (0.9%)
NEON STP fill : 5577.6 MB/s
NEON STNP fill : 5062.6 MB/s (2.2%)
ARM LDP/STP copy : 1755.4 MB/s (0.2%)
ARM STP fill : 5577.5 MB/s
ARM STNP fill : 5002.3 MB/s

==========================================================================
== Memory latency test ==
== ==
== Average time is measured for random memory accesses in the buffers ==
== of different sizes. The larger is the buffer, the more significant ==
== are relative contributions of TLB, L1/L2 cache misses and SDRAM ==
== accesses. For extremely large buffer sizes we are expecting to see ==
== page table walk with several requests to SDRAM for almost every ==
== memory access (though 64MiB is not nearly large enough to experience ==
== this effect to its fullest). ==
== ==
== Note 1: All the numbers are representing extra time, which needs to ==
== be added to L1 cache latency. The cycle timings for L1 cache ==
== latency can be usually found in the processor documentation. ==
== Note 2: Dual random read means that we are simultaneously performing ==
== two independent memory accesses at a time. In the case if ==
== the memory subsystem can't handle multiple outstanding ==
== requests, dual random read has the same timings as two ==
== single reads performed one after another. ==
==========================================================================

block size : single random read / dual random read, [MADV_NOHUGEPAGE]
1024 : 0.0 ns / 0.0 ns
2048 : 0.0 ns / 0.0 ns
4096 : 0.0 ns / 0.0 ns
8192 : 0.0 ns / 0.0 ns
16384 : 0.0 ns / 0.0 ns
32768 : 0.0 ns / 0.0 ns
65536 : 4.6 ns / 7.7 ns
131072 : 7.0 ns / 10.8 ns
262144 : 8.3 ns / 12.0 ns
524288 : 9.6 ns / 13.4 ns
1048576 : 77.1 ns / 117.7 ns
2097152 : 113.6 ns / 151.4 ns
4194304 : 137.5 ns / 169.7 ns
8388608 : 149.5 ns / 178.0 ns
16777216 : 157.4 ns / 184.1 ns
33554432 : 162.3 ns / 188.6 ns
67108864 : 164.8 ns / 191.4 ns

block size : single random read / dual random read, [MADV_HUGEPAGE]
1024 : 0.0 ns / 0.0 ns
2048 : 0.0 ns / 0.0 ns
4096 : 0.0 ns / 0.0 ns
8192 : 0.0 ns / 0.0 ns
16384 : 0.0 ns / 0.0 ns
32768 : 0.0 ns / 0.0 ns
65536 : 4.6 ns / 7.7 ns
131072 : 7.0 ns / 10.8 ns
262144 : 8.3 ns / 12.0 ns
524288 : 9.7 ns / 13.4 ns
1048576 : 77.0 ns / 117.7 ns
2097152 : 113.0 ns / 150.8 ns
4194304 : 130.8 ns / 161.6 ns
8388608 : 140.5 ns / 165.6 ns
16777216 : 145.5 ns / 167.2 ns
33554432 : 148.0 ns / 168.0 ns
67108864 : 149.2 ns / 168.3 ns

sun8i-ce is no beast; its performance is not so good. For crypto acceleration, you should instead look to the AES instructions of the A53, which should already be usable.

BTW, applying the patch doesn't in fact hurt once THS is ready: the main reason to restrict the voltage is heat, and once THS is ready it can properly prevent overheating.

Icenowy · August 1, 2018

My philosophy is to keep things conservative when in doubt, and Armbian shouldn't be affected by this ;-)

tkaiser · August 1, 2018

6 hours ago, @lex said:

grabbed from Opi One Plus so we all can compare to PineH64. I am interested in seeing how Opi stacks up against PineH64 and how it could be improved

So why don't you run 'sbc-bench neon' then? The problem with all those benchmark numbers is that the most important information is always missing: what has really happened (other activity in the background, cpufreq behaviour, throttling, killing CPU cores and so on). You showed zero information about this (especially thermal readouts).

Why not run sbc-bench to get this information, and then apply the 1.8 GHz patch, use a heatsink and fan, and run the test again?
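
For anyone who wants to see what "what has really happened" means in practice, here is a small Python sketch that samples the CPU clock and temperature from sysfs and flags samples where the clock dropped. It is not how sbc-bench works internally; it only illustrates the kind of data that should accompany benchmark numbers. The two sysfs paths are standard Linux locations, but whether thermal_zone0 is the CPU sensor on an H6 board is an assumption, and the demo values are made up.

```python
import tempfile
from pathlib import Path

# Standard Linux sysfs files: scaling_cur_freq is in kHz, temp in millidegrees C.
# Which thermal zone belongs to the CPU is board-specific (assumption here).
FREQ_PATH = "/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq"
TEMP_PATH = "/sys/class/thermal/thermal_zone0/temp"


def read_int(path):
    """Read a single integer from a sysfs-style text file."""
    return int(Path(path).read_text().strip())


def sample(freq_path=FREQ_PATH, temp_path=TEMP_PATH):
    """Return (clock in MHz, temperature in degrees C)."""
    return read_int(freq_path) // 1000, read_int(temp_path) / 1000.0


def throttled(samples, expected_mhz):
    """Return the samples whose clock was below the expected maximum."""
    return [s for s in samples if s[0] < expected_mhz]


# Demo with fake sysfs files and made-up readings, so it runs without the board.
with tempfile.TemporaryDirectory() as tmp:
    freq, temp = Path(tmp) / "freq", Path(tmp) / "temp"
    log = []
    for khz, millideg in [(1488000, 61000), (1320000, 84500)]:
        freq.write_text("%d\n" % khz)
        temp.write_text("%d\n" % millideg)
        log.append(sample(freq, temp))
    print(throttled(log, expected_mhz=1488))
```

On real hardware one would call sample() with its defaults in a loop while the benchmark runs and post the log together with the scores.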

Icenowy · August 1, 2018

BTW, I sorted out some code to integrate the functions currently available.

U-Boot with HDMI SimpleFB (it sometimes fails to initialize and shows a magenta display; in that case just reset): https://github.com/Icenowy/u-boot/tree/h6-hdmi-rebased-1

Linux with MMC/Ethernet/USB2/USB3/SimpleFB/THS: https://github.com/Icenowy/linux/tree/h6-integrate-2-ugly

@Igor @tkaiser

Icenowy · August 4, 2018

On 8/1/2018 at 1:29 PM, Icenowy said:

BTW, I sorted out some code to integrate the functions currently available.

U-Boot with HDMI SimpleFB (it sometimes fails to initialize and shows a magenta display; in that case just reset): https://github.com/Icenowy/u-boot/tree/h6-hdmi-rebased-1

Linux with MMC/Ethernet/USB2/USB3/SimpleFB/THS: https://github.com/Icenowy/linux/tree/h6-integrate-2-ugly

@Igor @tkaiser

Thanks to help from Jernej, the magenta display issue is solved in U-Boot commit 79405999d7ee43f830825751b200d739b53f20f5 ("video: sunxi: de2/3: clear all BLD address space").

Igor · August 8, 2018

On 8/1/2018 at 7:29 AM, Icenowy said:

Linux with MMC/Ethernet/USB2/USB3/SimpleFB/THS: https://github.com/Icenowy/linux/tree/h6-integrate-2-ugly


|  _ \(_)_ __   ___  | | | |/ /_ | || |  
| |_) | | '_ \ / _ \ | |_| | '_ \| || |_ 
|  __/| | | | |  __/ |  _  | (_) |__   _|
|_|   |_|_| |_|\___| |_| |_|\___/   |_|  
                                         

Welcome to ARMBIAN 5.55 user-built Debian GNU/Linux 9 (stretch) 4.18.0-rc7-sunxi64   
System load:   0.27 0.28 0.14   Up time:       5 min
Memory usage:  3 % of 1970MB    IP:            172.16.100.189
CPU temp:      42°C           
Usage of /:    4% of 30G    

[ 0 security updates available, 1 updates total: apt upgrade ]
Last check: 2018-08-08 10:37

Last login: Wed Aug  8 10:37:46 2018

root@pineh64:~#

ATFSOURCE='https://github.com/Icenowy/arm-trusted-firmware'
ATFBRANCH='branch:sun50i_h6_pmic'
+
BOOTSOURCE='https://github.com/Icenowy/u-boot'
BOOTBRANCH='branch:h6-hdmi-rebased-1'

+

KERNELSOURCE='https://github.com/Icenowy/linux'
KERNELBRANCH='branch:h6-integrate-2-ugly'

no patch except the .deb packaging, no

Works: HDMI, Network, USB2, LEDs

Fail: USB3, SD card, didn't try eMMC, DVFS, ...

http://ix.io/1jD0

What am I missing?

Icenowy · August 8, 2018

21 minutes ago, Igor said:
ATFSOURCE='https://github.com/Icenowy/arm-trusted-firmware'
ATFBRANCH='branch:sun50i_h6_pmic'
+
BOOTSOURCE='https://github.com/Icenowy/u-boot'
BOOTBRANCH='branch:h6-hdmi-rebased-1'

+

KERNELSOURCE='https://github.com/Icenowy/linux'
KERNELBRANCH='branch:h6-integrate-2-ugly'

no patch except the .deb packaging, no

Works: HDMI, Network, USB2, LEDs

Fail: USB3, SD card, didn't try eMMC, DVFS, ...

http://ix.io/1jD0

What am I missing?

For USB3, did you enable CONFIG_PHY_SUN50I_USB3?

For SD card, did you enable AXP PMIC support?

Check the following:

```
CONFIG_MFD_AXP20X=y
CONFIG_MFD_AXP20X_I2C=y
CONFIG_REGULATOR_AXP20X=y
CONFIG_PHY_SUN50I_USB3=y
```
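
A minimal sketch for checking these symbols against a kernel config file. The helper name `check_kconfig` is made up for illustration, and the path you pass in (for example the `.config` in the kernel build tree) is an assumption about your setup:

```shell
# check_kconfig FILE SYMBOL...
# Hypothetical helper: prints the state of each symbol found in FILE.
# Returns non-zero if any symbol is not built in (=y).
check_kconfig() {
    config_file=$1
    shift
    rc=0
    for sym in "$@"; do
        # Take the value after "SYMBOL=" on the first matching line, if any.
        value=$(sed -n "s/^${sym}=//p" "$config_file" | head -n 1)
        case "$value" in
            y) echo "${sym}=y" ;;
            m) echo "${sym}=m (module, not built in)"; rc=1 ;;
            *) echo "${sym} not set"; rc=1 ;;
        esac
    done
    return "$rc"
}
```

Something like `check_kconfig .config CONFIG_MFD_AXP20X CONFIG_MFD_AXP20X_I2C CONFIG_REGULATOR_AXP20X CONFIG_PHY_SUN50I_USB3` would then flag anything that is missing or only built as a module.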

Igor · August 8, 2018

2 minutes ago, Icenowy said:

For USB3, did you enable CONFIG_PHY_SUN50I_USB3?

That one was built as a module, and CONFIG_MFD_AXP20X_I2C was missing ... let's try again.

Igor · August 8, 2018

50 minutes ago, Icenowy said:

For USB3, did you enable CONFIG_PHY_SUN50I_USB3?

For SD card, did you enable AXP PMIC support?

Check the following:

```
CONFIG_MFD_AXP20X=y
CONFIG_MFD_AXP20X_I2C=y
CONFIG_REGULATOR_AXP20X=y
CONFIG_PHY_SUN50I_USB3=y
```

Fixed now. All is working.

tkaiser · August 8, 2018

43 minutes ago, Igor said:

Fixed now. All is working

Great! Just for fun: shall we throw in this patch already? The situation will only change for people who know what they are doing, since https://github.com/armbian/build/blob/master/config/sources/sun50iw6.conf#L8 will still prevent cpufreq from exceeding 900 MHz by default.
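
To see whether such a cap is in effect on a running board, one can compare the hardware maximum with the policy maximum. A minimal sketch, assuming the standard cpufreq sysfs layout; the function name `show_cpufreq_limits` is made up for illustration:

```shell
# show_cpufreq_limits [POLICY_DIR]
# Hypothetical helper: prints the hardware maximum, the currently allowed
# maximum and the current frequency, converted from kHz to MHz.
show_cpufreq_limits() {
    policy_dir=${1:-/sys/devices/system/cpu/cpu0/cpufreq}
    for name in cpuinfo_max_freq scaling_max_freq scaling_cur_freq; do
        khz=$(cat "$policy_dir/$name") || return 1
        echo "$name: $((khz / 1000)) MHz"
    done
}
```

If `scaling_max_freq` is lower than `cpuinfo_max_freq`, the limit comes from userspace policy rather than from the available OPPs.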

Igor · August 8, 2018

2 hours ago, tkaiser said:

Great! Just for fun: shall we throw in this patch already? The situation will only change for people who know what they are doing, since https://github.com/armbian/build/blob/master/config/sources/sun50iw6.conf#L8 will still prevent cpufreq from exceeding 900 MHz by default.

Patch applied, but we need to look elsewhere: 1.8 GHz doesn't show up. I need to sort out a lot of things before pushing the sources. Until then, the image is here: https://dl.armbian.com/pineh64/Debian_stretch_dev_nightly.7z

Logs: http://ix.io/1jE5

Icenowy · August 8, 2018

5 minutes ago, Igor said:

Patch applied, but we need to look elsewhere: 1.8 GHz doesn't show up. I need to sort out a lot of things before pushing the sources. Until then, the image is here: https://dl.armbian.com/pineh64/Debian_stretch_dev_nightly.7z

Logs: http://ix.io/1jE5

It's because the constraints of the vdd-cpux regulator are set according to the recommended operating range in the datasheet, and this OPP is slightly above the maximum recommended operating voltage, so it's omitted (the Linux regulator framework doesn't allow voltages beyond the constraints).
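
As a toy model of that behaviour (not the kernel's actual code): any OPP whose voltage exceeds the regulator's maximum constraint is dropped. The helper name `filter_opps` and any numbers fed to it are purely illustrative assumptions, not values taken from the datasheet or the device tree:

```shell
# filter_opps MAX_MICROVOLT
# Hypothetical helper: reads "freq_mhz microvolt" pairs on stdin and marks
# each OPP as kept or dropped relative to the regulator's maximum voltage.
filter_opps() {
    awk -v max_uv="$1" '
        NF < 2 { next }   # skip blank or incomplete lines
        $2 + 0 <= max_uv + 0 { print $1 " MHz @ " $2 " uV: kept"; next }
        { print $1 " MHz @ " $2 " uV: dropped (above regulator max)" }
    '
}
```

For example, `printf '1488 940000\n1800 1080000\n' | filter_opps 1060000` models a regulator capped at a made-up 1060 mV. Raising the constraint, as opposed to adding the OPP alone, is what makes the higher frequency appear.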

Icenowy · August 8, 2018

12 minutes ago, Igor said:

Patch applied, but we need to look elsewhere: 1.8 GHz doesn't show up. I need to sort out a lot of things before pushing the sources. Until then, the image is here: https://dl.armbian.com/pineh64/Debian_stretch_dev_nightly.7z

Logs: http://ix.io/1jE5

BTW (off topic for this thread), the NS1068 quirk on the kernel cmdline can be dropped now, since I got it mainlined.

The NS1066 one should still be kept, as I don't have a quirky NS1066 (mine just doesn't announce UAS).
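
A small sketch of trimming one entry from a `usb-storage.quirks`-style list (comma-separated `VID:PID:flags` entries, where the `u` flag tells the kernel to ignore UAS). The helper name `drop_quirk` is made up, and the IDs in the usage note are an assumption about how these bridges enumerate, so check `lsusb` on your own hardware:

```shell
# drop_quirk LIST VID:PID
# Hypothetical helper: removes the entry for VID:PID from a comma-separated
# quirks list and prints what is left.
drop_quirk() {
    printf '%s\n' "$1" | tr ',' '\n' | grep -v -i "^$2:" | paste -sd, -
}
```

For instance, `drop_quirk "0x2537:0x1066:u,0x2537:0x1068:u" "0x2537:0x1068"` would leave only the first entry in place.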

Icenowy · August 8, 2018

15 minutes ago, Igor said:

Patch applied, but we need to look elsewhere: 1.8 GHz doesn't show up. I need to sort out a lot of things before pushing the sources. Until then, the image is here: https://dl.armbian.com/pineh64/Debian_stretch_dev_nightly.7z

Logs: http://ix.io/1jE5

It seems that you attached a USB3 Ethernet on XHCI. Have you tested it?

tkaiser · August 8, 2018

12 minutes ago, Icenowy said:

It seems that you attached a USB3 Ethernet on XHCI. Have you tested it?

The AX88179 isn't the best choice (driver hassles and poor performance in many situations). RTL8153 is clearly the better choice. Even the Raspberry Pi 'experts' found that out! https://lb.raspberrypi.org/forums/viewtopic.php?t=208512&start=100#p1297857

When I'm done benchmarking PineH64 I might provide some numbers made with RTL8153...

15 minutes ago, Icenowy said:

the NS1068 quirk in kernel cmdline can be dropped now, as it's mainlined by me

Unfortunately not for the reality Armbian faces (with older kernels like RK's 4.4 that do not get such stuff backported).

tkaiser · August 8, 2018

45 minutes ago, Icenowy said:

It's because the constraints of the vdd-cpux regulator are set according to the recommended operating range in the datasheet, and this OPP is slightly above the maximum recommended operating voltage, so it's omitted

Already patched and benchmarking. My board seems to be fine with 1800 MHz @ 1080mV
