
Trying to compile Pine H64



Trying to get Armbian compiled for the Pine H64.

 

Unfortunately, I'm hitting this roadblock.

 

drivers/net/wireless/xradio/sdio.c:17:10: fatal error: asm/mach-types.h: No such file or directory
 #include <asm/mach-types.h>
          ^~~~~~~~~~~~~~~~~~
compilation terminated.
scripts/Makefile.build:312: recipe for target 'drivers/net/wireless/xradio/sdio.o' failed
make[4]: *** [drivers/net/wireless/xradio/sdio.o] Error 1
scripts/Makefile.build:559: recipe for target 'drivers/net/wireless/xradio' failed
make[3]: *** [drivers/net/wireless/xradio] Error 2
scripts/Makefile.build:559: recipe for target 'drivers/net/wireless' failed
make[2]: *** [drivers/net/wireless] Error 2
scripts/Makefile.build:559: recipe for target 'drivers/net' failed
make[1]: *** [drivers/net] Error 2
Makefile:1060: recipe for target 'drivers' failed
make: *** [drivers] Error 2
[ error ] ERROR in function compile_kernel [ compilation.sh:337 ]
[ error ] Kernel was not built [ @host ]
[ o.k. ] Process terminated

Any suggestions? 

 

Is xradio even needed? I will try excluding it from the build.
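
If it can be dropped, one way to do it (a sketch; assuming the Armbian build script's usual knobs and that the driver sits behind an XRADIO config symbol):

```
# Rebuild only the kernel and open menuconfig to deselect the driver
# (BOARD/BRANCH values and the exact menu path are assumptions):
./compile.sh BOARD=pineh64 BRANCH=dev KERNEL_ONLY=yes KERNEL_CONFIGURE=yes
# In menuconfig: Device Drivers -> Network device support ->
#   Wireless LAN -> deselect the XRADIO WLAN entry
```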

 

Thank you


This is what I get after excluding the xradio driver from the build.

 

 CC      drivers/of/configfs.o
drivers/of/configfs.c: In function ‘create_overlay’:
drivers/of/configfs.c:67:8: error: implicit declaration of function ‘of_overlay_create’; did you mean ‘of_overlay_remove’? [-Werror=implicit-function-declaration]
  err = of_overlay_create(overlay->overlay);
        ^~~~~~~~~~~~~~~~~
        of_overlay_remove
drivers/of/configfs.c: In function ‘cfs_overlay_release’:
drivers/of/configfs.c:218:3: error: implicit declaration of function ‘of_overlay_destroy’; did you mean ‘of_overlay_remove’? [-Werror=implicit-function-declaration]
   of_overlay_destroy(overlay->ov_id);
   ^~~~~~~~~~~~~~~~~~
   of_overlay_remove
cc1: some warnings being treated as errors
scripts/Makefile.build:312: recipe for target 'drivers/of/configfs.o' failed
make[2]: *** [drivers/of/configfs.o] Error 1
scripts/Makefile.build:559: recipe for target 'drivers/of' failed
make[1]: *** [drivers/of] Error 2
Makefile:1060: recipe for target 'drivers' failed
make: *** [drivers] Error 2
[ error ] ERROR in function compile_kernel [ compilation.sh:337 ]
[ error ] Kernel was not built [ @host ]
[ o.k. ] Process terminated
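
As an aside (an assumption from mainline history, not stated in this thread): of_overlay_create() and of_overlay_destroy() were removed from mainline around 4.15-4.17 in favour of of_overlay_fdt_apply() and of_overlay_remove(), which is why the out-of-tree configfs overlay patch no longer compiles. A minimal workaround sketch, assuming the patch gates its code behind an OF_CONFIGFS symbol:

```
# Switch the stale configfs overlay support off until the patch is ported
# to the new overlay API (the OF_CONFIGFS symbol name is an assumption):
scripts/config --file .config --disable OF_CONFIGFS
make olddefconfig
```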

 


In the Orange Pi download section there are Debian Jessie images with an Xfce 4 desktop, for the Orange Pi One Plus and Lite 2 - both with the H6 CPU.

A GPU driver must exist, because they must have used one. Only audio is not supported; I had to use an external USB audio card. These boards are suited only for server use because of the 1 GB of RAM.


3 minutes ago, constantius said:

In the Orange Pi download section there are Debian Jessie images with an Xfce 4 desktop, for the Orange Pi One Plus and Lite 2 - both with the H6 CPU.

A GPU driver must exist, because they must have used one.

uname -r will tell you that they use an old BSP kernel (3.x, I forgot which one). Armbian doesn't deal with BSP kernels for this board; mainline only, which means whatever isn't mainlined is for sure not supported yet (and probably nobody has an interest in fixing the BSP kernel). Keep in mind, this board is in an early (early early early) stage. If you expect something which can be used productively, this is (at the moment) the wrong SoC if you want to use Armbian. Probably every other SoC is better supported at the moment. I think Igor uploaded those images so that people can test and fix things, not use them productively. @Igor maybe we should write a big red reminder in the download section? 

Quote
  • Don't use these images for production - they're only provided for testing!
  • serial console only
  • Bootlog

 

I know it has a bunch of red flags there already... but this seems not to be enough. 


23 minutes ago, chwe said:

maybe we should write a big red reminder in the download section? 


I don't think so. It's already impossible to miss those signs. Sometimes people just ignore them, or choose to ignore them, or make wrong assumptions based on weak/wrong technical knowledge or some belief. This entire project helps people understand. Remember your school days: even when the teacher explained everything and wrote it on the board, somebody - usually always the same people - needed extra clarification.   :P

 

Just add remarks to the "WWW update wish list" and something can be done. Perhaps a JavaScript popup on nightly/preview images when clicking the download button.


18 minutes ago, Igor said:

I don't think so. It's already impossible to miss those signs.

Challenge accepted! :rolleyes: :P

 

Providing them only via dl.armbian.com might help, but then they see no red flags and come up with other questions... Let's hope that @Icenowy keeps up the interest in mainlining the H6. :)

 


6 minutes ago, chwe said:

provide them only via dl.armbian.com might help. but then they see no red flags and come up with other questions..


Exactly. It's good as is.

Now, with the new website, things are done better and this should not be a problem. And I don't think we have one. All those signs, flags and rules are in essence there to limit (repeating) questions, but it's impossible to suppress them entirely.

 

7 minutes ago, chwe said:

Let's hope

 

Development progress is slow - much, much slower, and way more time-consuming, than the average person thinks. Only education can change this perception.


1 hour ago, tkaiser said:

If anyone manages to build an H6 kernel with THS/DVFS/cpufreq support (using @Icenowy's h6-ths-ugly branch) I would be happy to receive sbc-bench numbers for PineH64... After applying this patch, of course :)

 

 

1.16V is beyond the recommended operating range, so I didn't add it for now.

 

BTW, the H6 comes with two official DVFS tables selected by "speed bin"; maybe this needs to be implemented. (The "speed bin" info is in the SID.)
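
For anyone who wants to look at that field: once the sunxi-sid driver is enabled, the raw eFuse contents are exposed through the nvmem framework. A hedged sketch (the sunxi-sid0 device name is an assumption based on the mainline driver):

```
# Dump the raw SID eFuse, which contains the speed-bin field, from userspace:
xxd /sys/bus/nvmem/devices/sunxi-sid0/nvmem | head
```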


14 minutes ago, Icenowy said:

H6 comes with two official DVFS tables selected by "speed bin"; maybe this needs to be implemented. (The "speed bin" info is in the SID.)

 

Interesting :)

 

My 1.16V was just copied and pasted from this entry; I just realized that there is a second OPP table below with lower voltage values for the higher OPPs. Maybe Allwinner bins chips after factory testing and then sells those that can cope with lower voltages at higher clock speeds at slightly higher prices?


7 hours ago, tkaiser said:

 

My 1.16V was just copied and pasted from this entry; I just realized that there is a second OPP table below with lower voltage values for the higher OPPs. Maybe Allwinner bins chips after factory testing and then sells those that can cope with lower voltages at higher clock speeds at slightly higher prices?

Check drivers/soc/allwinner/sunxi-sid.c; it mentions the "speed bin".


If you don't mind, I'll throw in some numbers here, grabbed from an Orange Pi One Plus, so we can all compare it to the Pine H64. I am interested in seeing how the Orange Pi stacks up against the Pine H64 and how it could be improved.

This is @Icenowy's kernel. No patch applied, to save my one and only Orange Pi H6. I forgot or missed how to enable crypto; still learning how to turn this little beast on... Maybe @Igor or @zador.blood.stained has a better kernel config.

There are still some limitations on the Orange Pi; that's what I could get for the moment.

 

7z b


 

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,4 CPUs LE)

LE
CPU Freq:  1485  1485  1485  1485  1485  1485  1485  1485  1485

RAM size:     991 MB,  # CPU hardware threads:   4
RAM usage:    882 MB,  # Benchmark threads:      4

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:       2419   312    754   2353  |      64613   395   1395   5513
23:       2415   317    776   2461  |      63592   396   1389   5502
24:       2389   317    810   2569  |      62442   396   1384   5482
25:       2354   317    848   2688  |      61257   395   1381   5452
----------------------------------  | ------------------------------
Avr:             316    797   2518  |              396   1387   5487
Tot:             356   1092   4002

 

 

 

sysbench:


 


Initializing random number generator from current time


Prime numbers limit: 2000

Initializing worker threads...

Threads started!

CPU speed:
    events per second: 24389.85

General statistics:
    total time:                          10.0002s
    total number of events:              244025

Latency (ms):
         min:                                  0.16
         avg:                                  0.16
         max:                                  0.29
         95th percentile:                      0.16
         sum:                              39890.75

Threads fairness:
    events (avg/stddev):           61006.2500/15.11
    execution time (avg/stddev):   9.9727/0.00

 

 

cpu-freq:

 


cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
Report errors and bugs to cpufreq@vger.kernel.org, please.
analyzing CPU 1:
  driver: cpufreq-dt
  CPUs which run at the same hardware frequency: 0 1 2 3
  CPUs which need to have their frequency coordinated by software: 0 1 2 3
  maximum transition latency: 244 us.
  hardware limits: 888 MHz - 1.49 GHz
  available frequency steps: 888 MHz, 1.01 GHz, 1.32 GHz, 1.49 GHz
  available cpufreq governors: conservative, powersave, ondemand, userspace, performance, schedutil
  current policy: frequency should be within 888 MHz and 1.49 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency is 1.49 GHz.
  cpufreq stats: 888 MHz:66.49%, 1.01 GHz:0.37%, 1.32 GHz:0.12%, 1.49 GHz:33.02%  (499)


 

 

tinymembench


tinymembench v0.4.9 (simple benchmark for memory throughput and latency)

==========================================================================
== Memory bandwidth tests                                               ==
==                                                                      ==
== Note 1: 1MB = 1000000 bytes                                          ==
== Note 2: Results for 'copy' tests show how many bytes can be          ==
==         copied per second (adding together read and writen           ==
==         bytes would have provided twice higher numbers)              ==
== Note 3: 2-pass copy means that we are using a small temporary buffer ==
==         to first fetch data into it, and only then write it to the   ==
==         destination (source -> L1 cache, L1 cache -> destination)    ==
== Note 4: If sample standard deviation exceeds 0.1%, it is shown in    ==
==         brackets                                                     ==
==========================================================================

 C copy backwards                                     :   1632.8 MB/s (1.4%)
 C copy backwards (32 byte blocks)                    :   1651.2 MB/s (2.1%)
 C copy backwards (64 byte blocks)                    :   1710.7 MB/s (2.0%)
 C copy                                               :   1706.8 MB/s (1.6%)
 C copy prefetched (32 bytes step)                    :   1297.3 MB/s (0.3%)
 C copy prefetched (64 bytes step)                    :   1156.2 MB/s
 C 2-pass copy                                        :   1472.8 MB/s
 C 2-pass copy prefetched (32 bytes step)             :    983.5 MB/s
 C 2-pass copy prefetched (64 bytes step)             :   1005.9 MB/s
 C fill                                               :   5573.3 MB/s
 C fill (shuffle within 16 byte blocks)               :   5572.5 MB/s
 C fill (shuffle within 32 byte blocks)               :   5573.2 MB/s (0.2%)
 C fill (shuffle within 64 byte blocks)               :   5572.4 MB/s
 ---
 standard memcpy                                      :   1750.2 MB/s (0.3%)
 standard memset                                      :   5577.3 MB/s
 ---
 NEON LDP/STP copy                                    :   1753.6 MB/s (0.2%)
 NEON LDP/STP copy pldl2strm (32 bytes step)          :   1122.7 MB/s (0.6%)
 NEON LDP/STP copy pldl2strm (64 bytes step)          :   1426.0 MB/s
 NEON LDP/STP copy pldl1keep (32 bytes step)          :   1871.2 MB/s
 NEON LDP/STP copy pldl1keep (64 bytes step)          :   1872.8 MB/s
 NEON LD1/ST1 copy                                    :   1728.2 MB/s (0.9%)
 NEON STP fill                                        :   5577.6 MB/s
 NEON STNP fill                                       :   5062.6 MB/s (2.2%)
 ARM LDP/STP copy                                     :   1755.4 MB/s (0.2%)
 ARM STP fill                                         :   5577.5 MB/s
 ARM STNP fill                                        :   5002.3 MB/s

==========================================================================
== Memory latency test                                                  ==
==                                                                      ==
== Average time is measured for random memory accesses in the buffers   ==
== of different sizes. The larger is the buffer, the more significant   ==
== are relative contributions of TLB, L1/L2 cache misses and SDRAM      ==
== accesses. For extremely large buffer sizes we are expecting to see   ==
== page table walk with several requests to SDRAM for almost every      ==
== memory access (though 64MiB is not nearly large enough to experience ==
== this effect to its fullest).                                         ==
==                                                                      ==
== Note 1: All the numbers are representing extra time, which needs to  ==
==         be added to L1 cache latency. The cycle timings for L1 cache ==
==         latency can be usually found in the processor documentation. ==
== Note 2: Dual random read means that we are simultaneously performing ==
==         two independent memory accesses at a time. In the case if    ==
==         the memory subsystem can't handle multiple outstanding       ==
==         requests, dual random read has the same timings as two       ==
==         single reads performed one after another.                    ==
==========================================================================

block size : single random read / dual random read, [MADV_NOHUGEPAGE]
      1024 :    0.0 ns          /     0.0 ns 
      2048 :    0.0 ns          /     0.0 ns 
      4096 :    0.0 ns          /     0.0 ns 
      8192 :    0.0 ns          /     0.0 ns 
     16384 :    0.0 ns          /     0.0 ns 
     32768 :    0.0 ns          /     0.0 ns 
     65536 :    4.6 ns          /     7.7 ns 
    131072 :    7.0 ns          /    10.8 ns 
    262144 :    8.3 ns          /    12.0 ns 
    524288 :    9.6 ns          /    13.4 ns 
   1048576 :   77.1 ns          /   117.7 ns 
   2097152 :  113.6 ns          /   151.4 ns 
   4194304 :  137.5 ns          /   169.7 ns 
   8388608 :  149.5 ns          /   178.0 ns 
  16777216 :  157.4 ns          /   184.1 ns 
  33554432 :  162.3 ns          /   188.6 ns 
  67108864 :  164.8 ns          /   191.4 ns 

block size : single random read / dual random read, [MADV_HUGEPAGE]
      1024 :    0.0 ns          /     0.0 ns 
      2048 :    0.0 ns          /     0.0 ns 
      4096 :    0.0 ns          /     0.0 ns 
      8192 :    0.0 ns          /     0.0 ns 
     16384 :    0.0 ns          /     0.0 ns 
     32768 :    0.0 ns          /     0.0 ns 
     65536 :    4.6 ns          /     7.7 ns 
    131072 :    7.0 ns          /    10.8 ns 
    262144 :    8.3 ns          /    12.0 ns 
    524288 :    9.7 ns          /    13.4 ns 
   1048576 :   77.0 ns          /   117.7 ns 
   2097152 :  113.0 ns          /   150.8 ns 
   4194304 :  130.8 ns          /   161.6 ns 
   8388608 :  140.5 ns          /   165.6 ns 
  16777216 :  145.5 ns          /   167.2 ns 
  33554432 :  148.0 ns          /   168.0 ns 
  67108864 :  149.2 ns          /   168.3 ns 
 


Edited by @lex
tinymembench added

2 hours ago, @lex said:

This is @Icenowy's kernel. No patch applied, to save my one and only Orange Pi H6. I forgot or missed how to enable crypto; still learning how to turn this little beast on... Maybe @Igor or @zador.blood.stained has a better kernel config.

sun8i-ce is no beast; its performance is not so good. For crypto acceleration, you should expect to use the AES instruction set of the A53 cores, which should already be usable.
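
A quick way to confirm those CPU crypto extensions are exposed (a sketch using standard aarch64 paths, not something from the posts above):

```
# ARMv8 crypto extensions appear in the Features line on arm64:
grep -m1 Features /proc/cpuinfo     # expect aes, pmull, sha1, sha2
# OpenSSL picks them up through the EVP interface:
openssl speed -evp aes-128-cbc
```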

 

BTW, applying the patch in fact doesn't hurt once THS is ready: the main reason to restrict voltage is heat, and when THS is ready it can properly prevent overheating.


6 hours ago, @lex said:

grabbed from an Orange Pi One Plus, so we can all compare it to the Pine H64. I am interested in seeing how the Orange Pi stacks up against the Pine H64 and how it could be improved

 

So why don't you run 'sbc-bench neon' then? The problem with all those benchmark numbers is that the most important information is always missing: what really happened (other activity in the background, cpufreq behaviour, throttling, killed CPU cores and so on). You showed zero information about this (especially thermal readouts).

 

Why not run sbc-bench to get this information, then apply the 1.8 GHz patch, add a heatsink and fan, and run the test again?
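
For reference, a minimal sketch of fetching and running it (URL taken from tkaiser's GitHub repo; the 'neon' mode follows his suggestion above):

```
# sbc-bench records cpufreq and thermal behaviour alongside the numbers:
wget https://raw.githubusercontent.com/ThomasKaiser/sbc-bench/master/sbc-bench.sh
chmod +x sbc-bench.sh
sudo ./sbc-bench.sh neon
```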


BTW, I sorted out some code to integrate the functions currently available.

 

U-Boot with HDMI SimpleFB (sometimes it fails to initialize and you get a magenta display; in this situation just reset): https://github.com/Icenowy/u-boot/tree/h6-hdmi-rebased-1

 

Linux with MMC/Ethernet/USB2/USB3/SimpleFB/THS: https://github.com/Icenowy/linux/tree/h6-integrate-2-ugly

 

@Igor @tkaiser
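
A minimal sketch of building that U-Boot branch for the Pine H64 (assumptions: the mainline pine_h64_defconfig name applies to this tree, and BL31 comes from a build of the ATF branch mentioned later in this thread):

```
# Cross-build Icenowy's U-Boot branch; the BL31 path is an assumption:
git clone -b h6-hdmi-rebased-1 https://github.com/Icenowy/u-boot
cd u-boot
make CROSS_COMPILE=aarch64-linux-gnu- pine_h64_defconfig
make CROSS_COMPILE=aarch64-linux-gnu- \
     BL31=../arm-trusted-firmware/build/sun50i_h6/release/bl31.bin -j"$(nproc)"
```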


On 8/1/2018 at 1:29 PM, Icenowy said:

U-Boot with HDMI SimpleFB (sometimes it fails to initialize and you get a magenta display; in this situation just reset): https://github.com/Icenowy/u-boot/tree/h6-hdmi-rebased-1

Thanks to help from Jernej, the magenta display issue is solved in U-Boot commit 79405999d7ee43f830825751b200d739b53f20f5 ("video: sunxi: de2/3: clear all BLD address space").


On 8/1/2018 at 7:29 AM, Icenowy said:

Linux with MMC/Ethernet/USB2/USB3/SimpleFB/THS: https://github.com/Icenowy/linux/tree/h6-integrate-2-ugly

 


|  _ \(_)_ __   ___  | | | |/ /_ | || |  
| |_) | | '_ \ / _ \ | |_| | '_ \| || |_ 
|  __/| | | | |  __/ |  _  | (_) |__   _|
|_|   |_|_| |_|\___| |_| |_|\___/   |_|  
                                         

Welcome to ARMBIAN 5.55 user-built Debian GNU/Linux 9 (stretch) 4.18.0-rc7-sunxi64   
System load:   0.27 0.28 0.14   Up time:       5 min
Memory usage:  3 % of 1970MB    IP:            172.16.100.189
CPU temp:      42°C           
Usage of /:    4% of 30G    

[ 0 security updates available, 1 updates total: apt upgrade ]
Last check: 2018-08-08 10:37

Last login: Wed Aug  8 10:37:46 2018

root@pineh64:~# 

 


ATFSOURCE='https://github.com/Icenowy/arm-trusted-firmware'
ATFBRANCH='branch:sun50i_h6_pmic'
+
BOOTSOURCE='https://github.com/Icenowy/u-boot'
BOOTBRANCH='branch:h6-hdmi-rebased-1'

+

KERNELSOURCE='https://github.com/Icenowy/linux'
KERNELBRANCH='branch:h6-integrate-2-ugly'

 

No patches except the .deb packaging.

 

Works: HDMI, Network, USB2, LEDs 

Fail: USB3, SD card; didn't try eMMC, DVFS, ...

http://ix.io/1jD0

 

What am I missing?


21 minutes ago, Igor said:
Works: HDMI, Network, USB2, LEDs

Fail: USB3, SD card; didn't try eMMC, DVFS, ...

What am I missing?

 

For USB3, did you enable CONFIG_SUN50I_USB3_PHY?

 

For SD card, did you enable AXP PMIC support?

 

Check the following:

```

CONFIG_MFD_AXP20X=y
CONFIG_MFD_AXP20X_I2C=y

CONFIG_REGULATOR_AXP20X=y

CONFIG_PHY_SUN50I_USB3=y

```
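
One way to flip these on in an existing .config (a sketch using the kernel's own scripts/config helper, then refreshing dependent defaults):

```
# Enable the PMIC and USB3 PHY options listed above:
scripts/config --file .config \
    --enable MFD_AXP20X --enable MFD_AXP20X_I2C \
    --enable REGULATOR_AXP20X --enable PHY_SUN50I_USB3
make olddefconfig
```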


2 minutes ago, Icenowy said:

For USB3, did you enable CONFIG_SUN50I_USB3_PHY?


This was set as a module, and CONFIG_MFD_AXP20X_I2C was missing... let's try again.


50 minutes ago, Icenowy said:

For USB3, did you enable CONFIG_SUN50I_USB3_PHY?

 

For SD card, did you enable AXP PMIC support?

 

Fixed now. All is working.


 

2 hours ago, tkaiser said:

Great! Just for fun: shall we throw in this patch already? The situation will change only for people who know what they're doing, since https://github.com/armbian/build/blob/master/config/sources/sun50iw6.conf#L8 will still prevent cpufreq from exceeding 900 MHz by default.


Patch applied, but we need to look elsewhere: 1.8 GHz doesn't show up. I need to sort out a lot of things before sending up sources. Until then, the image is here: https://dl.armbian.com/pineh64/Debian_stretch_dev_nightly.7z

 

Logs: http://ix.io/1jE5
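
A quick way to check which OPPs the kernel actually registered (standard cpufreq sysfs paths; a hedged sketch, not from the logs above):

```
# If the 1.8 GHz OPP made it in, it shows up here:
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
```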


5 minutes ago, Igor said:

 


Patch applied, but we need to look elsewhere: 1.8 GHz doesn't show up.

It's because the constraints of the vdd-cpux regulator are set according to the recommended operating range in the datasheet, and this OPP is slightly beyond the maximum recommended operating voltage, so it's omitted (the Linux regulator framework doesn't allow voltages beyond the constraints).
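
Those constraints can be inspected at runtime (a sketch; requires debugfs to be mounted and regulator debugfs support enabled):

```
# Shows every regulator with its constraints and current voltage:
cat /sys/kernel/debug/regulator/regulator_summary
```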


12 minutes ago, Igor said:

 


Patch applied, but we need to look elsewhere: 1.8 GHz doesn't show up.

 

BTW (off topic for this post), the NS1068 quirk in the kernel cmdline can be dropped now, as it's been mainlined by me.

 

The NS1066 one should still be kept, as I didn't get a quirky NS1066 (mine just doesn't announce UAS).
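
For reference, the kind of cmdline entry being discussed (a sketch; the Norelsys VID:PID values are assumptions based on what Armbian shipped at the time):

```
# usb-storage.quirks disables UAS ("u" flag) per VID:PID:
usb-storage.quirks=0x2537:0x1066:u,0x2537:0x1068:u
```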


15 minutes ago, Igor said:

 


Patch applied, but we need to look elsewhere: 1.8 GHz doesn't show up.

It seems that you attached a USB3 Ethernet adapter to the XHCI controller. Have you tested it?


12 minutes ago, Icenowy said:

It seems that you attached a USB3 Ethernet adapter to the XHCI controller. Have you tested it?

 

The AX88179 isn't the best choice (driver hassles and poor performance in many situations); the RTL8153 is clearly the better choice. Even the Raspberry Pi 'experts' found that out! ;) https://lb.raspberrypi.org/forums/viewtopic.php?t=208512&start=100#p1297857

 

When I'm done benchmarking the PineH64 I might provide some numbers made with the RTL8153...

 

15 minutes ago, Icenowy said:

the NS1068 quirk in the kernel cmdline can be dropped now, as it's been mainlined by me

 

Unfortunately not for the reality Armbian faces (with older kernels like RK's 4.4 that do not get such stuff backported).


45 minutes ago, Icenowy said:

It's because the constraints of the vdd-cpux regulator are set according to the recommended operating range in the datasheet, and this OPP is slightly beyond the maximum recommended operating voltage, so it's omitted

 

Already patched and benchmarking. My board seems to be fine at 1800 MHz @ 1080 mV :)

