
Amlogic still cheating with clockspeeds


tkaiser


4 hours ago, chwe said:

I might use your post as first one, so that you could decide how the thread is named

 

Please do so. Getting tired of reading too much and repeating the same BS over and over again.

 

Look how cheating works. You do not need to write numbers on your web page, you simply ship an Android that reports bogus values so people relying on shitty tools (again Geekbench, CPU-Z, AnTuTu) get fake numbers and these are then used in reviews, advertisements and so on...


10 minutes ago, tkaiser said:

Please do so. Getting tired of reading too much and repeating the same BS over and over again.

done, so you should be able to rename the thread.

 

10 minutes ago, tkaiser said:

Look how cheating works. You do not need to write numbers on your web page, you simply ship an Android that reports bogus values so people relying on shitty tools (again Geekbench, CPU-Z, AnTuTu) get fake numbers and these are then used in reviews, advertisements and so on...

As said, in case the blob reports a faulty CPU speed, it's IMO your right to blame them in whatever wording you feel comfortable with. I would call it faulty because I prefer a less harsh tone. Different people, different wording: the open source/Linux world is known to be harsher, whereas my field is often a bit more 'polite'.

 

22 minutes ago, tkaiser said:

2018 situation: BS numbers finally confirmed: http://forum.khadas.com/t/cpu-frequency-up-to-2ghz/2010/5?u=tkaiser (but I have no idea what those Vim2 users are doing? All brainwashed? It's so easy to check for real performance. Why is no one doing this but is instead blindly trusting in irrelevant BS numbers?)

Maybe because the majority of them simply don't care? Or aren't as experienced as you in this field? I've no clue how a butcher should work so that my steak doesn't taste like shit, the gut doesn't get damaged and nothing ends up germ-loaded. I simply (have to) trust my butcher that he knows what he's doing. That said, if you want to sell premium TV boxes, better stop cheating your customers, otherwise they might be pissed as soon as they find evidence that you lie to them.

 

38 minutes ago, tkaiser said:

Look how cheating works.

Reminds me of 'dieselgate'... :lol: Worked well for VW in Germany, because it seems that the law allows such behavior, but not that well for them in the US... But that's a different topic, we'd better avoid starting a discussion about it... :D


Hi all.

I've taken my Khadas Vim2 Max from under the growing dust layer, thanks to @tkaiser's reactions to an old post of mine on the Khadas forum.
I've read this thread and all the other ones on the Khadas forum about this (I'm more cross-eyed than ever after all that reading).

http://forum.khadas.com/t/underwhelming-performance-khadas-vim2-max-in-video-rendering-kdenlive/1466

https://www.cnx-software.com/2017/11/08/khadas-vim2-board-review-part-1-unboxing-and-dual-tuner-board/#comment-548918

http://forum.khadas.com/t/s912-limited-to-1200-mhz-with-multithreaded-loads/2311/7

 

@tkaiser I'm one of those "absolutely clueless people" who bought it because I thought it would be a great step up from my Odroid C2 for video rendering. I use them when traveling with my bicycle to edit and render the filmed route. I couldn't find any information about video rendering performance with SBCs, so I started filming my results when testing. That explains my YouTube channel. https://www.youtube.com/channel/UCpv7NFr0-9AB5xoklh3Snhg
That also answers your question on the Khadas forum why the hell I would do Kdenlive benchmarks. I don't care about any other number except that. ;)

On paper the Khadas Vim2 Max should be great. Low energy consumption (most important because I do everything with solar panels and powerbanks), more cores, more ram, ...
I believed it would get better with updates of the Ubuntu distro. But still no evolution.
So thanks to your posts on this subject I finally got some clues. Which makes me a bit less clueless.

 

I ain't no expert in nothing. I only started using Linux 2 years ago. That's when I bought my first Rasp2B. I'm a Windows programmer (C# and C, used to be C++/MFC), but that doesn't get me far in Linux (except for programs that run in a terminal).

 

I still hope to make it perform better than it does now. It did cost me a lot.

Next week I'm going to do a video of the Orange Pi +2 with Armbian. Let me know if there's anything I need to know. You'll hear from me when I need "precious" information.
Cheers all. Sorry for intruding on this thread ;)
NicoD


20 hours ago, NicoD said:

@tkaiser I'm one of those "absolutely clueless people" who bought it

 

Nope, I was talking about 'clueless people' just to outline which challenges those TV box manufacturers face. For premium STB SoC makers like Broadcom or HiSilicon, the SoCs can be reasonably designed to do the job since no end customer has to 'choose' them based on specs they don't understand (those boxes are given away 'for free' by the broadband provider). The 'el cheapo TV box' SoC makers, in contrast, have to design their devices so that they look appealing to... clueless people (many CPU cores, high CPU clockspeeds). On the other hand those TV boxes can't be designed well since... clueless customers prefer devices with poor heat dissipation.

 

In the 'TV box world' those Amlogic things work pretty well since CPU performance is irrelevant anyway (only exception: exotic codecs the video engine -- VPU -- can not handle). But on SBCs it's something different since there the users expect from a SoC advertised as 'octa-core at 2 GHz' a bit more than the laughable 'octa-core at an average 1.2 GHz' performance they get with S912 in reality.


On 11.2.2018 at 7:56 PM, lanefu said:

I was shocked with how pleased i was with my le potato

 

Are you or @TonyMac32 running Le Potato with Xenial? If so, may I ask for the full output from

echo performance >/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
sysbench --test=cpu --cpu-max-prime=20000 run --num-threads=4

Takes less than 10 seconds...
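
For anyone who wants to repeat this more than once, here is a rough sketch that wraps the two commands above. It writes the governor to every core's cpufreq node instead of only cpu0 (which should matter on parts with more than one cpufreq policy, an assumption on my side) and repeats the run three times, keeping only the timing lines. The names CPU_SYSFS, set_governor and run_sysbench are made up for this example, and the sysbench syntax is the 0.4.x one used in this thread.

```shell
#!/usr/bin/env bash
# Sketch: pin the cpufreq governor on every core, then repeat the sysbench run.
# CPU_SYSFS, set_governor and run_sysbench are illustrative names, not part of any tool.

CPU_SYSFS="${CPU_SYSFS:-/sys/devices/system/cpu}"

# Write the given governor name to each core's scaling_governor node (needs root).
set_governor() {
    local governor="$1" node
    for node in "$CPU_SYSFS"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -w "$node" ] && echo "$governor" > "$node"
    done
    return 0
}

# Run the CPU test from this thread N times (default 3) and keep only the timing lines.
run_sysbench() {
    local runs="${1:-3}" i
    for i in $(seq 1 "$runs"); do
        echo "== run $i =="
        sysbench --test=cpu --cpu-max-prime=20000 run --num-threads=4 |
            grep -E 'total time:|min:|avg:|percentile|execution time'
    done
}
```

Sourced as root, `set_governor performance && run_sysbench 3` should give three comparable result blocks. The script only defines the two functions, so nothing happens until they are called.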


Maximum prime number checked in CPU test: 20000


Test execution summary:
    total time:                          8.8837s
    total number of events:              10000
    total time taken by event execution: 35.5098
    per-request statistics:
         min:                                  2.58ms
         avg:                                  3.55ms
         max:                                 26.71ms
         approx.  95 percentile:               6.63ms

Threads fairness:
    events (avg/stddev):           2500.0000/53.08
    execution time (avg/stddev):   8.8774/0.00

Running it a second time yielded:

Maximum prime number checked in CPU test: 20000


Test execution summary:
    total time:                          6.9951s
    total number of events:              10000
    total time taken by event execution: 27.9649
    per-request statistics:
         min:                                  2.58ms
         avg:                                  2.80ms
         max:                                 38.21ms
         approx.  95 percentile:               2.67ms

Threads fairness:
    events (avg/stddev):           2500.0000/97.83
    execution time (avg/stddev):   6.9912/0.00

 


18 minutes ago, TonyMac32 said:

execution time (avg/stddev): 8.8774/0.00

 

That's running with 1040 MHz.

 

18 minutes ago, TonyMac32 said:

execution time (avg/stddev): 6.9912/0.00

 

That's 1320 MHz. Numbers taken from this interesting thread where @Da Xue tested with special BLOBs he got from Amlogic but is not free to distribute...

 

@TonyMac32: Another try would be to prefix the sysbench call with 'taskset -c 0-3 '. Shouldn't make any difference but at least on Vim2 with 4.9 kernel S912 magically started to work with reasonable cpufreq behaviour.

 

@chwe: can you please move the last 3 posts from this thread into the 'Amlogic still cheating with clockspeeds' thread?

 


With taskset -c 0-4:

Maximum prime number checked in CPU test: 20000


Test execution summary:
    total time:                          6.5926s
    total number of events:              10000
    total time taken by event execution: 26.3611
    per-request statistics:
         min:                                  2.58ms
         avg:                                  2.64ms
         max:                                 13.11ms
         approx.  95 percentile:               2.63ms

Threads fairness:
    events (avg/stddev):           2500.0000/62.95
    execution time (avg/stddev):   6.5903/0.00

14 minutes ago, TonyMac32 said:

execution time (avg/stddev): 6.5903/0.00

 

That's 1400 MHz. So for whatever reasons we are not able to have any reproducible influence on how cpufreq scaling and DVFS works on Amlogic platforms. It's some proprietary / closed source crap contained in BLOBs.
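
The three time/clockspeed pairs quoted so far (8.8774 s at 1040 MHz, 6.9912 s at 1320 MHz, 6.5903 s at 1400 MHz) all give roughly the same product of time and frequency, which is what you would expect from a purely CPU-bound test. A minimal sketch of that arithmetic follows. The constant 9229 is simply the mean of those three products, so it is an assumption that only holds for this exact sysbench call on Cortex-A53, and estimate_mhz is a made-up name.

```shell
#!/usr/bin/env bash
# Sketch: turn a sysbench execution time into an estimated clockspeed.
# CALIBRATION (MHz * seconds) is the mean of the three pairs quoted in this thread:
#   8.8774 s ~ 1040 MHz, 6.9912 s ~ 1320 MHz, 6.5903 s ~ 1400 MHz
# It only applies to this exact test (4 threads, --cpu-max-prime=20000) on Cortex-A53.
CALIBRATION=9229

# estimate_mhz SECONDS -> prints the estimated clockspeed in MHz, rounded.
estimate_mhz() {
    awk -v t="$1" -v k="$CALIBRATION" 'BEGIN { printf "%.0f\n", k / t }'
}

for seconds in 8.8774 6.9912 6.5903; do
    echo "$seconds s -> ~$(estimate_mhz "$seconds") MHz"
done
```

A different sysbench version or compiler will shift the constant, so treat the result as a relative indicator, not a measurement.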

 

I wonder how @Da Xue deals with this since the benchmarks he published here need some explanations... (when we all have to use a shitty and limited bl30.bin while benchmarkers can use better ones)


Quote

@chwe: can you please move the last 3 posts from this thread into the 'Amlogic still cheating with clockspeeds' thread?

done.. 

 

1 hour ago, tkaiser said:

On the other hand those TV boxes can't be designed well since... clueless customers prefer devices with poor heat dissipation.

 

Did anyone ever 'benchmark' those boxes for the use case they are made for? Meaning streaming capabilities... As said, if you buy a TV box, don't expect an SBC. And SBCs based on those SoCs should be marked under 'additional infos' something like: max cpu freq. ~1.2GHz when all 8 cores are used.

 

14 minutes ago, tkaiser said:

I wonder how @Da Xue deals with this since the benchmarks he published here need some explanations... (when we all have to use a shitty and limited bl30.bin while benchmarkers can use better ones)

Hmm, does this mean that this binary (with the better performance) isn't part of the OS they deliver with the board? In case LibreComputer can't publish those binaries to projects like Armbian (for legal reasons), we might inform our users under 'additional infos' too. Otherwise people will complain that 'performance under armbian is worse compared to stock OS for no reason' (this will save us some questions about this issue).


2 hours ago, chwe said:

And SBCs based on those SoCs should be marked under 'additional infos' something like: max cpu freq. ~1.2GHz when all 8 cores are used. 

 

Why is nobody looking into the DATA provided?!

 

@TonyMac32 provided the results from 3 sysbench runs. The execution time differed every time. But sysbench, while not being a good hardware benchmark, provides great info to understand why the numbers suck: the minimum execution time for a single event (one pass of the prime number test) was the same in all 3 cases: 2.58ms. What differed were the average values, and as can be clearly seen DVFS and cpufreq scaling on Amlogic platforms is totally screwed. The M3 running with a proprietary firmware does funny things not under control of the kernel:

execution time (avg/stddev):   8.8774/0.00
avg:                           3.55ms
approx.  95 percentile:        6.63ms

execution time (avg/stddev):   6.9912/0.00
avg:                           2.80ms
approx.  95 percentile:        2.67ms

execution time (avg/stddev):   6.5903/0.00
avg:                           2.64ms
approx.  95 percentile:        2.63ms
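
One way to put a number on this is the ratio between average and minimum event time: if the cores ran at one fixed clockspeed for the whole run, the ratio should stay close to 1. The sketch below reads sysbench 0.4.x output on stdin and prints that ratio. latency_ratio is a made-up name, and the parsing assumes the output looks like the blocks pasted in this thread.

```shell
#!/usr/bin/env bash
# Sketch: read sysbench 0.4.x output on stdin and print min, avg and avg/min.
# latency_ratio is an illustrative name; the parsing assumes "min:" / "avg:"
# lines formatted like the ones pasted in this thread (values suffixed with "ms").
latency_ratio() {
    awk '
        $1 == "min:" { min = $2 + 0 }   # "2.58ms" + 0 yields 2.58 in awk
        $1 == "avg:" { avg = $2 + 0 }
        END {
            if (min > 0) printf "min=%.2fms avg=%.2fms avg/min=%.2f\n", min, avg, avg / min
        }
    '
}

# Example with the per-request statistics from the first run in this thread.
latency_ratio <<'EOF'
    per-request statistics:
         min:                                  2.58ms
         avg:                                  3.55ms
         max:                                 26.71ms
         approx.  95 percentile:               6.63ms
EOF
```

Piping a live run through it works the same way, for example `sysbench --test=cpu --cpu-max-prime=20000 run --num-threads=4 | latency_ratio`.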

Situation is even crappier than on the Raspberry Pis, where ThreadX only does either frequency capping (disabling Turbo mode when undervolted) or throttling. What the proprietary firmware on S905X and S912 is doing is beyond my imagination. That's just weird.

 

An interesting question is now whether Amlogic provides board makers with special bl30.bin blobs to allow them to generate nice benchmark numbers and why real world performance especially with S912 is so shitty. Has to be answered by people who are part of Amlogic's closed source / proprietary world (most probably not able to tell anything since having signed NDAs).

 

For me Amlogic is from now on a no go. As bad as Raspberry Pi wrt 'openness'.


1 hour ago, tkaiser said:

Why is nobody looking into the DATA provided?!

The question is, how do we deal with this data? We currently provide 'support' for 3 Amlogic devices (LePotato, Odroid C1 & C2). And at least for the LePotato, we have an issue due to binaries we can't control and whose behaviour we don't know.

Maybe someone repeats the benchmarking which was done by LibreComputer with armbian? In case we have a significant difference in performance we might look together with @Da Xue to find a solution. In case this is not possible due to whatever contracts LibreComputer has with Amlogic, we should IMO have a small annotation in the download section that we can't deliver maximum performance due to this issue.

As a description from our current website:

Quote

We are the only distribution specialized for ARM development boards. Our primary objectives are optimizing low-level settings, kernel settings and its security and security in general. They lead into lowering consumption, provide top performance and high security at the same time. Not a single of those aspects is covered by board maker or any other distribution.

IMO we should be honest and clear about why we're not able to fulfill our promises. We often complain that some boardmakers (I don't want to throw dirt at a specific one) hide things from their customers, which then ends with our users complaining that we can't provide what was promised by the boardmaker. For me, being honest and clear to our users is more important than 'having better benchmarks than the stock OS'. A further improvement could be that we clarify on which boards/images we deal with binaries and why. There are people who care about 'opensourceness' and they can avoid those boards, and there are people who don't care...

I don't think that this would change much on the boardmaker/SoC-maker side but at least we wouldn't play the same game and hide those infos somewhere in our GitHub repo.


7 hours ago, chwe said:

The question is, how do we deal with this data?

 

Once the new website is up and running or even now, create a single line under 'known issues', name it 'bogus clockspeed behaviour not under our control' and link to this thread where the first post can contain a summarized TL;DR version so people get the idea what's going on and in case they feel that's of interest can read on.

 

In the meantime I tried to summarize the situation with S912 and Khadas' 4.9 kernel as follows: http://forum.khadas.com/t/s912-limited-to-1200-mhz-with-multithreaded-loads/2311/54?u=tkaiser

 

Situation with S905X is somehow related. Even though S905X does not have this 'big.LITTLE' emulation issue, according to @TonyMac32's tests it makes a difference in performance behaviour whether he lets the kernel do the scheduling it wants (using cpu 0-3) or sets a 'fixed CPU affinity' with taskset, which still only makes use of cpu 0-3 (there are no more CPU cores) but shows better performance. Especially this weird part needs some explanation and a fix!


3 minutes ago, tkaiser said:

create a single line under 'known issues', name it 'bogus clockspeed behaviour not under our control' and link to this thread where the first post can contain a summarized TL;DR version so people get the idea what's going on

Agreed. Indeterminate behavior counts as an issue, since it could, in as yet unknown circumstances, result in undefined, or at least unexpected, behavior.

 


10 hours ago, chwe said:

Maybe someone repeats the benchmarking which was done by LibreComputer with armbian?

 

I just looked through those numbers on https://libre.computer/2018/03/21/raspberry-pi-3-model-b-review-and-comparison/ again just to realize that there's also something seriously wrong. The OpenSSL tests are also a great benchmark to check for real CPU clockspeeds since not affected by memory bandwidth at all (the AES scores with ARMv8 Crypto Extensions available scale linearly with clockspeed when comparing different A53).

 

When comparing Le Potato (S905X claiming to run at 1.5 GHz) with Renegade (RK3328 at 1.3 GHz) then it's pretty obvious that the S905X was running with 1320 MHz maximum. Which is a bit too low even if we already take into account that S905X can't reach the 1.5 GHz anyway due to bl30.bin situation.
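
The reasoning behind such a comparison is plain proportionality: if AES throughput scales linearly with clockspeed across A53 cores, then a reference board with a known clock and score gives you the clock of the other board. A minimal sketch, with clock_from_score as a made-up name and placeholder numbers that are NOT measurements from any board. The real scores would come from something like `openssl speed -elapsed -evp aes-128-cbc` on both boards.

```shell
#!/usr/bin/env bash
# Sketch: infer a clockspeed from an AES score, assuming the score scales
# linearly with clockspeed between two Cortex-A53 SoCs with Crypto Extensions.
# clock_from_score is an illustrative name.

# clock_from_score REF_MHZ REF_SCORE SCORE -> estimated MHz of the board under test
clock_from_score() {
    awk -v ref_mhz="$1" -v ref_score="$2" -v score="$3" \
        'BEGIN { printf "%.0f\n", ref_mhz * score / ref_score }'
}

# Placeholder numbers only, NOT measurements from any board:
# a reference running at 1000 MHz scores 500, the board under test scores 600.
clock_from_score 1000 500 600
```

The estimate is only as good as the reference: the reference board has to really run at the clockspeed it claims.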

 

There's something seriously wrong with CPU clockspeeds on Amlogic platforms. And maybe there's also a relationship with broken/weird scheduling though I really don't understand how this can happen (there should be no difference on a quad-core CPU whether the kernel decides to run on cpu 0-3 or the user 'forces' exactly the same using 'taskset -c 0-3' -- but reality draws a different picture)


@Da Xue first, I appreciate that you follow the content which is related to your boards. It's good to see that a boardmaker feels responsible for open issues related to their boards. Did you mention the concerns directly to Amlogic, that their bl30 blob reports false clockspeeds to the kernel? It's clear that we as Linux users aren't in their focus, but I think it's important that our concerns are raised and that Amlogic knows that part of their users aren't happy with the situation. That way they can't say they weren't made aware of it.

 

On 26.4.2018 at 7:48 AM, tkaiser said:

Once the new website is up and running or even now, create a single line under 'known issues', name it 'bogus clockspeed behaviour not under our control' and link to this thread where the first post can contain a summarized TL;DR version so people get the idea what's going on and in case they feel that's of interest can read on.

IMO a 'proper' conclusion about how we figured out the false clockspeeds (comparison of different A53 cores and why those tests are directly related to clockspeed) would make sense.

 


@chwe Any company like Amlogic (Rockchip, Allwinner, Actions, etc) only cares about issues that affect customers putting in orders on the order of millions of chips. If Le Potato sells in the hundred thousand to millions of units, they will naturally care about the issues we bring up. As of now, we are nowhere near there so the discussions in this thread are moot.


1 hour ago, Da Xue said:

As of now, we are nowhere near there so the discussions in this thread are moot.

I agree in the context of any single board, or even any single vendor. I think something like this would require all of the vendors using these SoCs to approach this together. Probably not the most feasible approach, but given the increasing activity on the open source front on their part, it may at least be possible to get correct readouts on future devices. (Assuming they want to save face and not admit fault by correcting the existing codebase) I'd be curious to see what the AXG is reporting vs actually doing...

 

Given the work BayLibre is putting into mainlining has a lot to do with clocks at the moment, is there any means of indirectly determining the clock frequency so the kernel can at least report properly, if not command?

9 hours ago, Da Xue said:

Make sure that the performance governor is on and that there is a heatsink.

 

I'm currently running it on a production board and red heat sink.  I can try it as well on the older board with my heat sink which is easily 2x the mass.  I did as prescribed by @tkaiser, I am currently running the performance governor.  I have to admit I find it hard to believe it would be throttling that quickly, but testing is better than conjecture any day....  Hold please.

 

*a few minutes later* Well then:

Maximum prime number checked in CPU test: 20000


Test execution summary:
    total time:                          6.4887s
    total number of events:              10000
    total time taken by event execution: 25.9452
    per-request statistics:
         min:                                  2.58ms
         avg:                                  2.59ms
         max:                                 10.43ms
         approx.  95 percentile:               2.60ms

Threads fairness:
    events (avg/stddev):           2500.0000/2.12
    execution time (avg/stddev):   6.4863/0.00

Hmmm, mere seconds later and I'm getting

On 4/25/2018 at 11:44 AM, TonyMac32 said:

Running it a second time yielded:


Maximum prime number checked in CPU test: 20000


Test execution summary:
    total time:                          6.9951s
    total number of events:              10000
    total time taken by event execution: 27.9649
    per-request statistics:
         min:                                  2.58ms
         avg:                                  2.80ms
         max:                                 38.21ms
         approx.  95 percentile:               2.67ms

Threads fairness:
    events (avg/stddev):           2500.0000/97.83
    execution time (avg/stddev):   6.9912/0.00

 

 

again.  Massively over-protective thermal throttling?
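
To check the throttling theory, it would help to log temperature and the frequency the kernel reports next to a benchmark run. A rough sketch follows, assuming the usual Linux sysfs nodes exist on the kernel in use (which I have not verified for Amlogic kernels). THERMAL_NODE, FREQ_NODE, sample and monitor are made-up names. Keep in mind that scaling_cur_freq is only what the kernel believes, which is exactly what this thread puts in doubt.

```shell
#!/usr/bin/env bash
# Sketch: print SoC temperature and reported cpufreq so they can be logged
# next to a benchmark run. Paths are the usual Linux sysfs locations, but
# whether a given Amlogic kernel exposes them is an assumption.
THERMAL_NODE="${THERMAL_NODE:-/sys/class/thermal/thermal_zone0/temp}"
FREQ_NODE="${FREQ_NODE:-/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq}"

# sample -> one line "temp=<degrees C> freq=<MHz>"; sysfs reports millidegrees and kHz.
sample() {
    local milli_c khz
    milli_c="$(cat "$THERMAL_NODE")"
    khz="$(cat "$FREQ_NODE")"
    awk -v t="$milli_c" -v f="$khz" \
        'BEGIN { printf "temp=%.1fC freq=%dMHz\n", t / 1000, f / 1000 }'
}

# monitor SECONDS -> one sample per second for the given duration (default 10).
monitor() {
    local i
    for i in $(seq 1 "${1:-10}"); do
        sample
        sleep 1
    done
}
```

Running `monitor 10 &` right before the sysbench call and `wait` afterwards gives one line per second for the duration of the test. If the temperature stays low while the run times still vary, throttling becomes a less likely explanation.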


I'm using a chassis supply set to 5.25 V and monitored with a Fluke, measured 19 C ambient. Which Amlogic release did that BL30 come from? Since we're using blobs, mine hasn't been updated in some time.


On 29/4/2018 at 3:43 PM, Da Xue said:

@chwe Any company like Amlogic (Rockchip, Allwinner, Actions, etc) only cares about issues that affect customers putting in orders on the order of millions of chips. If Le Potato sells in the hundred thousand to millions of units, they will naturally care about the issues we bring up. As of now, we are nowhere near there so the discussions in this thread are moot.

Well, Hardkernel got custom BLOBs for the C2. I'm not sure whether you are related to Libre Computer, but if you are, I think it is worth a try. It also seems like Khadas is considering asking for binaries with unlocked DVFS. If both companies (Khadas and Libre Computer) join in the petition, I think there's a good chance that Amlogic listens.

 

As I said before, if Amlogic people are intelligent (and I assume they are), they will care about the developers' opinion, even though they are a much smaller number than the mass of TV box consumers.


4 hours ago, JMCC said:

Well, Hardkernel got custom BLOBs for the C2. I'm not sure whether you are related to Libre Computer, but if you are, I think it is worth a try. It also seems like Khadas is considering asking for binaries with unlocked DVFS. If both companies (Khadas and Libre Computer) join in the petition, I think there's a good chance that Amlogic listens.

 

As I said before, if Amlogic people are intelligent (and I assume they are), they will care about the developers' opinion, even though they are a much smaller number than the mass of TV box consumers.

Hardkernel got it because Amlogic put them between a rock and a hard place, since customers can claim false advertising. The only thing that has the slightest chance of happening is a bl30 that ignores thermal and other factors and stays at 1512MHz. Unlocked DVFS for overclockers is never going to happen so don't bother thinking about it.


2 hours ago, Da Xue said:

The only thing that has the slightest chance of happening is a bl30 that ignores thermal and other factors and stays at 1512MHz.

Well, a BLOB that allows 8x1512 and reports the real clockspeed in S912 would already be a great improvement. I'd sign for it right away.


3 hours ago, Da Xue said:

Unlocked DVFS for overclockers is never going to happen so don't bother thinking about it.

 

To be fair I think a capped blob is acceptable to everyone, as long as it is answering to the kernel and is accurately reporting its behavior.


10 hours ago, Da Xue said:

Unlocked DVFS for overclockers is never going to happen so don't bother thinking about it.

 

Nobody is talking about this. It's only about stopping this proprietary closed source blob game (a blob that controls cpufreq behaviour in reality and all we can do is to compare which versions of blobs are in the wild). This whole discussion is also not 'moot' but useful to avoid certain products if issues can't be fixed and a project dedicated to open source gets into a position where 'searching for the right blob' is all it could do (hooray, closed source RPi level reached)

 

This and other threads and even your own measurements over at https://libre.computer/2018/03/21/raspberry-pi-3-model-b-review-and-comparison/ confirm that there's a massive problem with this bl30.bin thing (or how do you explain your poor OpenSSL numbers that indicate not even 1350 MHz? How do you explain standard deviation with lightweight stuff like sysbench ruining performance)?

 

Can you please provide full output from 3 sysbench runs and MD5 or SHA1 hash of your bl30.bin?


@Da Xue friendly reminder: can you please answer my questions? Hash to identify which bl30.bin blob is used where and FULL sysbench output to get an idea why your numbers are better.

 

I mean it's pretty obvious that there's something seriously wrong and not related to throttling/protection or how do you explain result variation with @TonyMac32's tests and your own low OpenSSL scores?


This topic is now closed to further replies.