JMCC Posted April 23, 2018

1 hour ago, chwe said: And for the geeks, it should be possible to produce a TV box based on Linux with decent quality.

IMO, that's what the Khadas Vim is about, and that's why I like the product. I also read that they are developing an RK3399-based box, code-named "Khadas Cross" (another link). Looking forward to it.
tkaiser (Author) Posted April 23, 2018

4 hours ago, chwe said: I might use your post as first one, so that you could decide how the thread is named

Please do so. Getting tired of reading too much and repeating the same BS over and over again.

Look how cheating works. You do not need to write numbers on your web page, you simply ship an Android that reports bogus values, so people relying on shitty tools (again Geekbench, CPU-Z, AnTuTu) get fake numbers, and these are then used in reviews, advertisements and so on...

mid 2016 situation: Amlogic 'talks' about 2 GHz: https://browser.geekbench.com/geekbench3/8052062 (of course back then the SoC was also only allowed to clock up to 1.4 GHz with single-threaded loads)
end of 2016 situation: Amlogic adjusts the faked top cpufreq number from 2.0 GHz to 1.5 GHz: https://browser.geekbench.com/v4/cpu/search?page=2&q=s912&utf8=✓ (it's still 1.4 GHz in reality and with multi-threaded loads it's even less)
2018 situation: BS numbers finally confirmed: http://forum.khadas.com/t/cpu-frequency-up-to-2ghz/2010/5?u=tkaiser (but I have no idea what those Vim2 users are doing. All brainwashed? It's so easy to check for real performance. Why is no one doing this, instead of blindly trusting irrelevant BS numbers?)
chwe Posted April 23, 2018

10 minutes ago, tkaiser said: Please do so. Getting tired of reading too much and repeating the same BS over and over again.

Done, so you should be able to rename the thread.

10 minutes ago, tkaiser said: Look how cheating works. You do not need to write numbers on your web page, you simply ship an Android that reports bogus values so people relying on shitty tools (again Geekbench, CPU-Z, AnTuTu) get fake numbers and these are then used in reviews, advertisements and so on...

As said, in case the blob reports a faulty CPU speed, it's IMO your right to blame them in whatever wording you feel comfortable with. I would call it faulty because I prefer a less harsh tone. Different people, different wording: the open source/Linux world is known to be harsher, whereas my field is often a bit more 'polite'.

22 minutes ago, tkaiser said: 2018 situation: BS numbers finally confirmed: http://forum.khadas.com/t/cpu-frequency-up-to-2ghz/2010/5?u=tkaiser (but I have no idea what those Vim2 users are doing? All brainwashed? It's so easy to check for real performance. Why is no one doing this but is instead blindly trusting in irrelevant BS numbers?)

Maybe because the majority of them simply don't care? Or aren't as experienced as you in this field? I've no clue how a butcher should work so that my steak doesn't taste like shit, and so that he doesn't damage the gut and leave everything germ-loaded. I simply (have to) trust that my butcher knows what he's doing. That said, if you want to sell premium TV boxes, better stop cheating your customers, otherwise they might be pissed as soon as they find evidence that you lied to them.

38 minutes ago, tkaiser said: Look how cheating works.

Reminds me of 'dieselgate'... Worked well for VW in Germany, because it seems the law allows such behavior, but not that well for them in the US... But that's a different topic, and we'd better avoid starting a discussion about it...
NicoD Posted April 24, 2018

Hi all. I've taken my Khadas Vim2 Max out from under the growing dust layer, thanks to @tkaiser's reactions to an old post of mine on the Khadas forum. I've read this thread and all the other ones on the Khadas forum about this (I'm more cross-eyed than ever after all that reading).

http://forum.khadas.com/t/underwhelming-performance-khadas-vim2-max-in-video-rendering-kdenlive/1466
https://www.cnx-software.com/2017/11/08/khadas-vim2-board-review-part-1-unboxing-and-dual-tuner-board/#comment-548918
http://forum.khadas.com/t/s912-limited-to-1200-mhz-with-multithreaded-loads/2311/7

@tkaiser I'm one of those "absolutely clueless people" who bought it because I thought it would be a great step up from my Odroid C2 for video rendering. I use them when traveling by bicycle to edit and render the filmed route. I couldn't find any information about video rendering performance on SBCs, so I started filming my results when testing. That explains my YouTube channel: https://www.youtube.com/channel/UCpv7NFr0-9AB5xoklh3Snhg That also answers your question on the Khadas forum about why the hell I would do Kdenlive benchmarks. I don't care about any other number except that.

On paper the Khadas Vim2 Max should be great: low energy consumption (most important, because I do everything with solar panels and power banks), more cores, more RAM, ... I believed it would get better with updates of the Ubuntu distro, but there's still no improvement. So thanks to your posts on this subject I finally got some clues, which makes me a bit less clueless.

I ain't no expert in nothing. I only started using Linux 2 years ago, when I bought my first Rasp2B. I'm a Windows programmer (C# and C, used to be C++/MFC), but that doesn't get me far in Linux (except for programs that run in a terminal). I still hope to make it perform better than it does now. It cost me a lot.

Next week I'm going to do a video of the Orange Pi +2 with Armbian. Let me know if there's anything I need to know. You'll hear from me when I need "precious" information.

Cheers all. Sorry for intruding on this thread.
NicoD
tkaiser (Author) Posted April 25, 2018

20 hours ago, NicoD said: @tkaiser I'm one of those "absolutely clueless people" who bought it

Nope, I was talking about 'clueless people' just to outline which challenges those TV box manufacturers face. Unlike premium STB device makers like Broadcom or HiSilicon, where the SoCs can be reasonably designed to do the job since no end customer has to 'choose' them based on specs they don't understand (those boxes are given away 'for free' by the broadband provider), those 'el cheapo TV box' SoC makers have to design their devices in a way that looks appealing to... clueless people (many CPU cores, high CPU clockspeeds). On the other hand those TV boxes can't be designed well since... clueless customers preferring devices with poor heat dissipation.

In the 'TV box world' those Amlogic things work pretty well since CPU performance is irrelevant anyway (only exception: exotic codecs the video engine -- VPU -- can not handle). But on SBCs it's different, since there users expect a bit more from a SoC advertised as 'octa-core at 2 GHz' than the laughable 'octa-core at an average 1.2 GHz' performance they get with the S912 in reality.
tkaiser (Author) Posted April 25, 2018

On 11.2.2018 at 7:56 PM, lanefu said: I was shocked with how pleased i was with my le potato

Are you or @TonyMac32 running Le Potato with Xenial? If so, may I ask for the full output from

    echo performance >/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
    sysbench --test=cpu --cpu-max-prime=20000 run --num-threads=4

Takes less than 10 seconds...
TonyMac32 Posted April 25, 2018

    Maximum prime number checked in CPU test: 20000

    Test execution summary:
        total time:                          8.8837s
        total number of events:              10000
        total time taken by event execution: 35.5098
        per-request statistics:
            min:                    2.58ms
            avg:                    3.55ms
            max:                    26.71ms
            approx. 95 percentile:  6.63ms

    Threads fairness:
        events (avg/stddev):          2500.0000/53.08
        execution time (avg/stddev):  8.8774/0.00

Running it a second time yielded:

    Maximum prime number checked in CPU test: 20000

    Test execution summary:
        total time:                          6.9951s
        total number of events:              10000
        total time taken by event execution: 27.9649
        per-request statistics:
            min:                    2.58ms
            avg:                    2.80ms
            max:                    38.21ms
            approx. 95 percentile:  2.67ms

    Threads fairness:
        events (avg/stddev):          2500.0000/97.83
        execution time (avg/stddev):  6.9912/0.00
tkaiser (Author) Posted April 25, 2018

18 minutes ago, TonyMac32 said: execution time (avg/stddev): 8.8774/0.00

That's running at 1040 MHz.

18 minutes ago, TonyMac32 said: execution time (avg/stddev): 6.9912/0.00

That's 1320 MHz. Numbers taken from this interesting thread where @Da Xue tested with special BLOBs he got from Amlogic but is not free to distribute...

@TonyMac32: Another try would be to prefix the sysbench call with 'taskset -c 0-3 '. It shouldn't make any difference, but at least on the Vim2 with the 4.9 kernel the S912 magically started to work with reasonable cpufreq behaviour.

@chwe: can you please move the last 3 posts from this thread into the 'Amlogic still cheating with clockspeeds' thread?
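The two invocations discussed so far (the plain sysbench call and the same call pinned with taskset) can be generated by a small helper, so that repeated runs use identical arguments. This is only a sketch: the function name `sysbench_cmd` is made up for illustration, and the flags are copied from the posts above, which use the older sysbench 0.4-style syntax. It only builds the command line and does not run anything or touch the governor.

```python
import shlex


def sysbench_cmd(threads=4, max_prime=20000, cpu_list=None):
    """Build the sysbench CPU test argv quoted in the thread.

    cpu_list, if given, is passed to `taskset -c` (e.g. "0-3") to pin
    the benchmark to a fixed set of CPUs.
    """
    cmd = [
        "sysbench",
        "--test=cpu",
        f"--cpu-max-prime={max_prime}",
        "run",
        f"--num-threads={threads}",
    ]
    if cpu_list is not None:
        # Prefix with taskset, as suggested for the second round of tests.
        cmd = ["taskset", "-c", cpu_list] + cmd
    return cmd


# Print both variants as copy-pasteable shell lines.
print(shlex.join(sysbench_cmd()))
print(shlex.join(sysbench_cmd(cpu_list="0-3")))
```

The returned list can be handed to `subprocess.run` on the board itself; setting the performance governor beforehand still has to be done as root, as in the original post.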
TonyMac32 Posted April 25, 2018

With taskset -c 0-4:

    Maximum prime number checked in CPU test: 20000

    Test execution summary:
        total time:                          6.5926s
        total number of events:              10000
        total time taken by event execution: 26.3611
        per-request statistics:
            min:                    2.58ms
            avg:                    2.64ms
            max:                    13.11ms
            approx. 95 percentile:  2.63ms

    Threads fairness:
        events (avg/stddev):          2500.0000/62.95
        execution time (avg/stddev):  6.5903/0.00
tkaiser (Author) Posted April 25, 2018

14 minutes ago, TonyMac32 said: execution time (avg/stddev): 6.5903/0.00

That's 1400 MHz. So for whatever reason we are not able to have any reproducible influence on how cpufreq scaling and DVFS work on Amlogic platforms. It's some proprietary / closed source crap contained in BLOBs.

I wonder how @Da Xue deals with this, since the benchmarks he published here need some explanation... (when we all have to use a shitty and limited bl30.bin while benchmarkers can use better ones)
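The mapping from execution time to clockspeed used above can be checked with a few lines of arithmetic. The sketch below assumes that this sysbench run is purely CPU-bound, so that execution time is inversely proportional to the effective clock (time × MHz is constant). The three (time, MHz) pairs are the ones stated in the thread; the function names are made up for illustration.

```python
# (execution time in seconds, clockspeed in MHz) pairs as stated in the thread.
RUNS = [(8.8774, 1040), (6.9912, 1320), (6.5903, 1400)]


def calibrate(runs):
    """Average of time * MHz over the known runs.

    If time scales as 1/clock for a fixed workload, this product is
    (nearly) the same for every run.
    """
    return sum(t * mhz for t, mhz in runs) / len(runs)


def estimate_mhz(exec_time_s, k):
    """Effective clock implied by an execution time, given the constant k."""
    return k / exec_time_s


k = calibrate(RUNS)
for t, mhz in RUNS:
    print(f"{t:.4f} s -> about {estimate_mhz(t, k):.0f} MHz (thread says {mhz} MHz)")
```

Under that assumption the same `estimate_mhz` call can be applied to any other execution time measured with the same sysbench version and settings. The result is an average over the run and says nothing about how the clock moved during it.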
chwe Posted April 25, 2018

Quote
@chwe: can you please move the last 3 posts from this thread into the 'Amlogic still cheating with clockspeeds' thread please?

Done.

1 hour ago, tkaiser said: On the other hand those TV boxes can't be designed well since... clueless customers preferring devices with poor heat dissipation.

Has anyone ever 'benchmarked' those boxes for the use case they are made for, meaning streaming capabilities? As said, if you buy a TV box, don't expect an SBC. And SBCs based on those SoCs should carry a note under 'additional infos', something like: max CPU freq. ~1.2 GHz when all 8 cores are used.

14 minutes ago, tkaiser said: I wonder how @Da Xue deals with this since the benchmarks he published here need some explanations... (when we all have to use a shitty and limited bl30.bin while benchmarkers can use better ones)

Hmm, does this mean that this binary (with the better performance) isn't part of the OS they deliver with the board? In case Libre Computer can't publish those binaries to projects like Armbian (for legal reasons), we might inform our users under 'additional infos' too. Otherwise people will complain that 'performance under Armbian is worse compared to stock OS for no reason' (this will save us some questions about this issue).
tkaiser (Author) Posted April 25, 2018

2 hours ago, chwe said: And SBCs based on those SoCs should be marked under 'additional infos' something like: max cpu freq. ~1.2GHz when all 8 cores are used.

Why is nobody looking into the DATA provided?! @TonyMac32 provided the results from 3 sysbench runs. The execution time differed every time. But sysbench, while not being a good hardware benchmark, provides great info to understand why the numbers suck: the minimum execution time for generating a prime number was the same in all 3 cases: 2.58ms. What differed were the average values, and as can clearly be seen, DVFS and cpufreq scaling on Amlogic platforms are totally screwed. The M3 running a proprietary firmware does funny things not under control of the kernel:

    execution time (avg/stddev): 8.8774/0.00    avg: 3.55ms    approx. 95 percentile: 6.63ms
    execution time (avg/stddev): 6.9912/0.00    avg: 2.80ms    approx. 95 percentile: 2.67ms
    execution time (avg/stddev): 6.5903/0.00    avg: 2.64ms    approx. 95 percentile: 2.63ms

The situation is even crappier than on the Raspberry Pis, where ThreadX only does either frequency capping (disabling Turbo mode when undervolted) or throttling. What the proprietary firmware on S905X and S912 is doing is beyond my imagination. That's just weird.

An interesting question now is whether Amlogic provides board makers with special bl30.bin blobs to allow them to generate nice benchmark numbers, and why real-world performance, especially with the S912, is so shitty. That has to be answered by people who are part of Amlogic's closed source / proprietary world (most probably not able to tell anything, having signed NDAs).

For me Amlogic is from now on a no-go. As bad as Raspberry Pi wrt 'openness'.
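The comparison of minimum and average request times can be reproduced from the pasted sysbench output. The sketch below is an illustration only: the regular expression assumes the per-request statistics look the way they do in this thread (sysbench 0.4-style labels), and the avg/min ratio is just one simple way to express how far a run drifted from its best case. The names `STATS`, `RUNS` and `parse_stats` are made up for this example.

```python
import re

# Matches the per-request statistics block, whether it is on one line or several.
STATS = re.compile(
    r"min:\s*([\d.]+)ms\s+"
    r"avg:\s*([\d.]+)ms\s+"
    r"max:\s*([\d.]+)ms\s+"
    r"approx\.\s+95\s+percentile:\s*([\d.]+)ms"
)

# Per-request statistics of the three runs posted in the thread.
RUNS = {
    "run 1": "min: 2.58ms avg: 3.55ms max: 26.71ms approx. 95 percentile: 6.63ms",
    "run 2": "min: 2.58ms avg: 2.80ms max: 38.21ms approx. 95 percentile: 2.67ms",
    "run 3 (taskset)": "min: 2.58ms avg: 2.64ms max: 13.11ms approx. 95 percentile: 2.63ms",
}


def parse_stats(text):
    """Extract min/avg/max/95th percentile (in ms) from sysbench output."""
    match = STATS.search(text)
    if match is None:
        raise ValueError("no per-request statistics found")
    keys = ("min", "avg", "max", "p95")
    return dict(zip(keys, map(float, match.groups())))


for name, text in RUNS.items():
    s = parse_stats(text)
    # A ratio close to 1.0 means most requests ran as fast as the fastest one.
    print(f"{name}: min={s['min']} ms, avg={s['avg']} ms, avg/min={s['avg'] / s['min']:.2f}")
```

An identical minimum across runs with a varying average is consistent with the reading given above: the cores reach the same top speed in each run but spend a different share of the run below it. The numbers alone cannot say whether that is caused by frequency changes or by something else.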
chwe Posted April 25, 2018

1 hour ago, tkaiser said: Why is nobody looking into the DATA provided?!

The question is, how do we deal with this data? We currently provide 'support' for 3 Amlogic devices (Le Potato, Odroid C1 & C2). And at least for Le Potato, we have an issue due to binaries we can't control, and we don't know what they're doing. Maybe someone could repeat, with Armbian, the benchmarking that was done by Libre Computer? In case we see a significant difference in performance, we might look for a solution together with @Da Xue. In case this is not possible due to whatever contracts Libre Computer has with Amlogic, we should IMO have a small annotation in the download section saying that we can't deliver maximum performance due to this issue.

As a description from our current website:

Quote
We are the only distribution specialized for ARM development boards. Our primary objectives are optimizing low-level settings, kernel settings and its security and security in general. They lead into lowering consumption, provide top performance and high security at the same time. Not a single of those aspects is covered by board maker or any other distribution.

IMO we should be honest and clear about why we're not able to fulfill our promises. We often complain that some board makers (I don't want to throw dirt at a specific one) hide things from their customers, which then ends with our users complaining that we can't provide what was promised by the board maker. For me, being honest and clear with our users is more important than 'having better benchmarks than the stock OS'.

A further improvement could be to clarify on which boards/images we deal with binaries, and why. There are people who care about 'opensourceness' and can avoid those boards, and there are people who don't care... I don't think that this would change much on the board maker/SoC maker side, but at least we wouldn't play the same game and hide this information somewhere in our GitHub repo.
tkaiser (Author) Posted April 26, 2018

7 hours ago, chwe said: The question is, how we deal with this data?

Once the new website is up and running, or even now, create a single line under 'known issues', name it 'bogus clockspeed behaviour not under our control' and link to this thread, where the first post can contain a summarized TL;DR version so people get the idea of what's going on and can read on if they feel it's of interest.

In the meantime I tried to summarize the situation with the S912 and Khadas' 4.9 kernel as follows: http://forum.khadas.com/t/s912-limited-to-1200-mhz-with-multithreaded-loads/2311/54?u=tkaiser

The situation with the S905X is somewhat related. Even though the S905X does not have this 'big.LITTLE' emulation issue, according to @TonyMac32's tests it makes a difference in performance whether he lets the kernel do the scheduling it wants (using cpu 0-3) or sets a 'fixed CPU affinity' with taskset, which still only makes use of cpu 0-3 (there are no more CPU cores) but shows better performance. Especially this weird part needs some explanation and a fix!
TonyMac32 Posted April 26, 2018

3 minutes ago, tkaiser said: create a single line under 'known issues', name it 'bogus clockspeed behaviour not under our control' and link to this thread where the first post can contain a summarized TL;DR version so people get the idea what's going on

Agreed. Indeterminate behavior counts as an issue, since it could, in as yet unknown circumstances, result in undefined behavior, or at least unexpected.
tkaiser (Author) Posted April 26, 2018

10 hours ago, chwe said: Maybe someone repeats the benchmarking which was done by LibreComputer with armbian?

I just looked through those numbers on https://libre.computer/2018/03/21/raspberry-pi-3-model-b-review-and-comparison/ again, only to realize that there's also something seriously wrong. The OpenSSL tests are also a great benchmark to check for real CPU clockspeeds, since they are not affected by memory bandwidth at all (the AES scores with ARMv8 Crypto Extensions available scale linearly with clockspeed when comparing different A53).

When comparing Le Potato (S905X claiming to run at 1.5 GHz) with Renegade (RK3328 at 1.3 GHz), it's pretty obvious that the S905X was running at 1320 MHz maximum. Which is a bit too low, even if we already take into account that the S905X can't reach the 1.5 GHz anyway due to the bl30.bin situation.

There's something seriously wrong with CPU clockspeeds on Amlogic platforms. And maybe there's also a relationship with broken/weird scheduling, though I really don't understand how this can happen (there should be no difference on a quad-core CPU whether the kernel decides to run on cpu 0-3 or the user 'forces' exactly the same using 'taskset -c 0-3' -- but reality paints a different picture).
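The cross-board comparison described above boils down to one proportion. The sketch below assumes, as the post does, that the AES score scales linearly with clockspeed between two Cortex-A53 SoCs that both have the ARMv8 Crypto Extensions. The numbers in the example call are placeholders chosen only to show the arithmetic; they are NOT taken from the libre.computer review, and the function name is made up.

```python
def implied_mhz(score, ref_score, ref_mhz):
    """Clock implied by a benchmark score under linear scaling.

    ref_score was measured on a board known to run at ref_mhz; score
    comes from the board whose real clock is in question. Both scores
    must come from the same test (e.g. the same AES block size).
    """
    return ref_mhz * score / ref_score


# Placeholder values in arbitrary units, for illustration only.
reference_score = 1300.0   # hypothetical score of the board running at 1300 MHz
reference_mhz = 1300
questioned_score = 1320.0  # hypothetical score of the board under test

print(implied_mhz(questioned_score, reference_score, reference_mhz))
```

With real scores from the same test on both boards, an implied clock well below the advertised one points in the direction argued above. The estimate is only as good as the linear-scaling assumption and the reference board's actual clock.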
Da Xue Posted April 29, 2018

All of the benchmarks on libre.computer were run with the standard bl30 from Amlogic's openlinux site. The blob I have is for my testing only, since using it and advertising the results would be cheating.
Da Xue Posted April 29, 2018

@TonyMac32 Make sure that the performance governor is on and that there is a heatsink.
chwe Posted April 29, 2018

@Da Xue first, I appreciate that you follow the content related to your boards. It's good to see that a board maker feels responsible for open issues related to their boards. Did you raise the concern directly with Amlogic that their bl30 blob reports false clockspeeds to the kernel? It's clear that we as Linux users aren't their focus, but I think it's important that our concerns are on record and that Amlogic knows that some of their users aren't happy with the situation, so that they can't say they weren't aware of it.

On 26.4.2018 at 7:48 AM, tkaiser said: Once the new website is up and running or even now, create a single line under 'known issues', name it 'bogus clockspeed behaviour not under our control' and link to this thread where the first post can contain a summarized TL;DR version so people get the idea what's going on and in case they feel that's of interest can read on.

IMO a 'proper' conclusion on how we figured out the false clockspeeds (comparison of different A53 cores, and why those tests are directly related to clockspeed) would make sense.
Da Xue Posted April 29, 2018

@chwe Any company like Amlogic (Rockchip, Allwinner, Actions, etc.) only cares about issues that affect customers placing orders on the order of millions of chips. If Le Potato sells in the hundreds of thousands to millions of units, they will naturally care about the issues we bring up. As of now, we are nowhere near there, so the discussions in this thread are moot.
TonyMac32 Posted April 29, 2018

1 hour ago, Da Xue said: As of now, we are nowhere near there so this discussions in this thread are moot.

I agree in the context of any single board, or even any single vendor. I think something like this would require all of the vendors using these SoCs to approach this together. Probably not the most feasible approach, but given the increasing activity on the open source front on their part, it may at least be possible to get correct readouts on future devices. (Assuming they want to save face and not admit fault by correcting the existing codebase.) I'd be curious to see what the AXG is reporting vs actually doing...

Given that the work BayLibre is putting into mainlining has a lot to do with clocks at the moment, is there any means of indirectly determining the clock frequency, so the kernel can at least report properly, if not command?

9 hours ago, Da Xue said: Make sure that the performance governor is on and that there is a heatsink.

I'm currently running it on a production board with the red heat sink. I can also try it on the older board with my heat sink, which is easily 2x the mass. I did as prescribed by @tkaiser; I am currently running the performance governor. I have to admit I find it hard to believe it would be throttling that quickly, but testing is better than conjecture any day.... Hold please.

*a few minutes later*

Well then:

    Maximum prime number checked in CPU test: 20000

    Test execution summary:
        total time:                          6.4887s
        total number of events:              10000
        total time taken by event execution: 25.9452
        per-request statistics:
            min:                    2.58ms
            avg:                    2.59ms
            max:                    10.43ms
            approx. 95 percentile:  2.60ms

    Threads fairness:
        events (avg/stddev):          2500.0000/2.12
        execution time (avg/stddev):  6.4863/0.00

Hmmm, mere seconds later and I'm getting

On 4/25/2018 at 11:44 AM, TonyMac32 said: Running it a second time yielded:

    Maximum prime number checked in CPU test: 20000

    Test execution summary:
        total time:                          6.9951s
        total number of events:              10000
        total time taken by event execution: 27.9649
        per-request statistics:
            min:                    2.58ms
            avg:                    2.80ms
            max:                    38.21ms
            approx. 95 percentile:  2.67ms

    Threads fairness:
        events (avg/stddev):          2500.0000/97.83
        execution time (avg/stddev):  6.9912/0.00

again. Massively over-protective thermal throttling?
Da Xue Posted April 29, 2018

My setup is a programmable DC power supply set to 5.1 V, with the heatsink and a fan on it. Ambient temperature is about 20 °C. I get around 6.25ish for sysbench.
TonyMac32 Posted April 29, 2018

I'm using a chassis supply set to 5.25 V, monitored with a Fluke; measured ambient is 19 °C. Which Amlogic release did that BL30 come from? Since we're using blobs, mine hasn't been updated in some time.
JMCC Posted May 1, 2018

On 29/4/2018 at 3:43 PM, Da Xue said: @chwe Any company like Amlogic (Rockchip, Allwinner, Actions, etc) only care about issues that affect customers putting in orders on the order of millions of chips. If Le Potato sells in the hundred thousand to millions of units, they will naturally care about the issues we bring up. As of now, we are nowhere near there so this discussions in this thread are moot.

Well, Hardkernel got custom BLOBs for the C2. I'm not sure whether you are related to Libre Computer, but if you are, I think it is worth a try. It also seems that Khadas is considering asking for binaries with unlocked DVFS. If both companies (Khadas and Libre Computer) join in the petition, I think there is a good chance that Amlogic listens. As I said before, if the Amlogic people are intelligent (and I assume they are), they will care about the developers' opinion, even though developers are far fewer in number than the mass of TV box consumers.
Da Xue Posted May 1, 2018

4 hours ago, JMCC said: Well, Hardkernel got custom BLOB's for the C2. I'm not sure whether you are related to Libre Computer, but if you are, I think it is worth giving a try. It also seems like Khadas is also considering to ask for binaries with unlocked DVFS. If both companies (Khadas and Libre Computer) join in the petition, I think there are big chances that Amlogic listens. As I said before, if Amlogic people are intelligent (and I assume they are), they will care about the developers opinion, even though they are a much smaller number than the mass of TV box consumers.

Hardkernel got it because Amlogic put them between a rock and a hard place, since customers can claim false advertising. The only thing that has the slightest chance of happening is a bl30 that ignores thermal and other factors and stays at 1512 MHz. Unlocked DVFS for overclockers is never going to happen, so don't bother thinking about it.
JMCC Posted May 1, 2018

2 hours ago, Da Xue said: The only thing that has the slightest chance of happening is a bl30 that ignores thermal and other factors and stay at 1512MHz.

Well, a BLOB that allows 8x1512 and reports the real clockspeed on the S912 would already be a great improvement. I'd sign up for it right away.
TonyMac32 Posted May 1, 2018

3 hours ago, Da Xue said: Unlocked DVFS for overclockers is never going to happen so don't bother thinking about it.

To be fair, I think a capped blob is acceptable to everyone, as long as it answers to the kernel and accurately reports its behavior.
tkaiser (Author) Posted May 2, 2018

10 hours ago, Da Xue said: Unlocked DVFS for overclockers is never going to happen so don't bother thinking about it.

Nobody is talking about this. It's only about stopping this proprietary closed source blob game (a blob that controls cpufreq behaviour in reality, and all we can do is compare which versions of blobs are in the wild). This whole discussion is also not 'moot' but useful for avoiding certain products if issues can't be fixed and a project dedicated to open source gets into a position where 'searching for the right blob' is all it can do (hooray, closed source RPi level reached).

This and other threads, and even your own measurements over at https://libre.computer/2018/03/21/raspberry-pi-3-model-b-review-and-comparison/, confirm that there's a massive problem with this bl30.bin thing. Or how do you explain your poor OpenSSL numbers that indicate not even 1350 MHz? How do you explain the standard deviation with lightweight stuff like sysbench ruining performance?

Can you please provide the full output from 3 sysbench runs and the MD5 or SHA1 hash of your bl30.bin?
tkaiser (Author) Posted May 3, 2018

@Da Xue friendly reminder: can you please answer my questions? A hash to identify which bl30.bin blob is used where, and FULL sysbench output to get an idea of why your numbers are better. I mean, it's pretty obvious that there's something seriously wrong and not related to throttling/protection. Or how do you explain the result variation in @TonyMac32's tests and your own low OpenSSL scores?
JMCC Posted May 5, 2018

Some interesting posts about the possibility of reverse engineering the bl30.bin, in the Khadas forums: http://forum.khadas.com/t/s912-limited-to-1200-mhz-with-multithreaded-loads/2311/58