
wtarreau

Members, 41 posts
Everything posted by wtarreau

  1. The need for cluster is quite common in fact, especially in this price range!
  2. I tested a miner on it ("cpuminer" I think) to give numbers to a friend interested in the subject (he was impressed, by the way). I didn't let it run for hours like this, but after several minutes it started to throttle down to 1 GHz then stabilized, but didn't stop (and keep in mind it's tightly enclosed in cardboard). It's certain that the modified DTB I'm using helps here with the higher temperature thresholds, but I suspect you might have too weak a power supply or micro USB cable if it stopped. That's always the risk with DVFS: it consumes very little in idle but a lot under load. I discovered one bad cable in my stock with which the board would reboot in loops. @tkaiser could tell you hundreds of horror stories about micro-usb based power inputs :-)
I find it really awesome and have been asking for it since I got my nanopi-fire2 about two years ago! I'm mostly interested in CPU and network, and this is the only board which comes with a CPU, some RAM, a gigabit connector and nothing else! I'm sure there's plenty of unexploited power in it and am willing to try to push it further!
I'm attaching my modified DTB; it adds the 1.6 GHz frequency point and the 113, 115 and 120 degree critical points, which work fine for me and considerably limit the throttling. Save yours before replacing it (variant "rev05"). I have no idea if my values will work on your board or will even kill it, use at your own risk! And please double-check the thermal contact between your heatsink and your CPU.
s5p6818-nanopi3-rev05.1g6-1v25-113deg.dtb
  3. That's quite different from what I'm seeing given that I had made my own enclosure out of cardboard with no air around it! Of course it heats but not that fast, despite the fact that I overclocked it. If the fan helps, it's probable that you're not having a good contact between the heatsink and the CPU. Verify that the heatsink is very parallel to the board, it's possible that it touches only by one angle.
  4. I must confess I absolutely don't remember what I used given that I always have everything I need for this. It's possible that I naturally placed a thermal pad in between. I don't remember having opened a thermal paste tube. Or maybe there was a pad with the heatsink. But definitely I didn't put raw aluminum on top of the CPU die without anything to make good contact in between.
  5. The 3 critical points (in degrees Celsius) for the thermal throttling and shutdown. I didn't understand the difference between the first two, as the CPU starts to throttle when the first value is reached. The second *seems* to do nothing, and the third is for the forced shutdown. I seem to remember that it starts throttling at 80. IIRC the original values were something like 80, 85 and 105.
It definitely is very informative to do so. Do not forget that such boards will heat much more in summer than in winter (you can more or less shift the high temperature by the difference in ambient temperature). The most important thing is that your board remains 100% reliable even when it starts to throttle (the temperature can continue to rise a little bit at this point). A CPU's sensitivity to temperature may evolve over time, so keep a bit of margin. Also, if you intend to use the GPU, it's not throttled and will definitely add to the thermal dissipation, so this will require an extra margin.
No, the form factor is much smaller, and really well thought out, but absolutely not compatible with RPi. Unfortunately there is no enclosure for these boards, it's really the only missing thing. You can stack many of them side by side vertically with just a rear cable for the power supply and a front cable for the network. The overall design is really nice for those who want high power densities.
  6. I'd say around 2 weeks. The default heatsink is enough if you're not running at 100% CPU full-time. For my use cases, it's mostly a network endpoint and I can run it at 1 Gbps without problems even with the board confined in a cardboard made enclosure. But if you run with all CPUs saturated, you'll reach around 5W that need to be dissipated one way or another. The default heatsink and the PCB are not large enough to dissipate 5W at a low temperature. I significantly raised the temperature thresholds (113, 115, 120) to prevent it from throttling too early. Note that these thresholds are higher than the datasheet's (85°C in commercial ranges). But the thermal sensor supports up to 125°C so probably there are some industrial/military grade variants with higher ranges. For a personal project I'd say you have some headroom. For a commercial product, you probably don't want to play with this and you may have to use a small fan, or to place a thermal pad behind the board against a metal enclosure.
  7. @shaun27, you're pretty close to my measurements. I hadn't noticed your point about the shutdown, but I seem to remember noticing it didn't cut off. Maybe there's no way to completely shut the DC-DC and the CPUs are not stopped but looping in place. At the very least if that's the case, I think we could improve the situation by using a WFI instruction...
  8. I think you simply didn't read anything in this thread. The whole thread precisely is about Amlogic distributing to SBC vendors firmwares that *pretend* to run at these frequencies but do not. When it reports any frequency between 1.5 and 2 GHz, in fact it's still at 1.536 GHz. And the funny thing is that @tkaiser was precisely saying that it works since a lot of people don't verify and are absolutely convinced that the board runs at 2.02 GHz when it says so. So now thanks to you we have a perfect illustration of what he was saying ;-)
  9. I agree about FriendlyElec's USB cables, I've always found they were of pretty good quality. Same for the cables coming with the MiQi by the way. I've had good experiences with a few PSUs providing 2.5A at 5.2V and featuring a captive micro-usb cable. Since there is no other way to connect them, they have to provide a good enough cable and connector. But in my opinion these good ones are almost an exception to the general trend. Regarding the increase in consumption at 90 vs 50°C, I noticed this as well with the RK3288 boards from my first farm. It's in fact due to the lower efficiency of the onboard DC-DC converter. That's another reason for focusing first on optimal heat spreading!
  10. @tkaiser , in fact there is a small category of users like me who do care about frequency because they do know their workload depends on frequency. My libslz compressor does (almost scales linearly). Crypto almost does when using native ARMv8 crypto extensions. My progressive locks code does as well. Compilation partially does. The problem I'm facing these days is that when I hit a performance wall with a board, I start to get a clear idea of what I'd like to try, and the minimum number I want for each metric to expect any improvement. For example I'm not interested in 32-bit memory for my build farms, nor am I interested in A53 below 1.8 GHz. Thus when I want to experiment with something new, I have to watch sites such as cnx-soft to hope for something new. Then when this new stuff happens, I have to wonder whether the advertised numbers are reasonably trustable or not, and whether I'm willing to spend a bit of money for something which will possibly not work at all. Recently I've seen this H6 at 1.8 GHz. I'd be interested in testing it but I'm not expecting true 1.8 yet, thus I'm waiting for someone else to get trapped before I try. My experience so far is that I'm never going to buy anything Amlogic-based anymore unless I see proofs of the numbers in my use case. It's pointless, they lie all the time, and deploy any possible effort to cheat in benchmarks. I wouldn't be surprised to find heavy metal plates in some TV STBs made with these SoCs to delay thermal throttling so that benchmark tools have the time to complete at full speed before the CPU gets too hot... However I do trust certain companies like Hardkernel or FriendlyElec who are extremely open and transparent about their limitations, and who provide everything I need to pursue my experiments, thus I randomly buy something there "just to see" if it's not too expensive. I'm not expecting them to be failsafe but at least to have run some tests before me! 
I don't trust Allwinner devices at all but in this case it seems to be mostly the board vendors who don't play well. BananaPi / OrangePi's numbers really cannot be trusted at all, but they are very cheap, so sometimes I buy a board and try it. I remember my OpiPC2 not even having a bootable image available at all for a few weeks. I didn't care much, it was around $15 or $25, I don't remember. I used not to be much interested in Rockchip who used to limit their RK3288 to 1.6 GHz, but overall they've made progress, some of their CPUs are very good, and I think some of them are expensive enough for the board vendors to be careful. But all this to say that sometimes a specific set of characteristics is very hard to find together and you start to build hope in a specific device. When you pay a bit more than usual for such a device and wait for a long time, you really want it to match your expectations, even though you know pretty well you shouldn't. And when the device arrives it's a cold shower. Overall I agree with your point that real world performance matters more than numbers. The dual-A9 based Armada38x in my clearfog outperforms any of the much higher spec'd devices I own when it comes to I/O. The only problem is that in order to know what performance to expect, you need to have some metrics. And since people mostly post useless crap like Antutu, CPU-Z or Geekbench, the only way to know is either to wait for someone else to report real numbers (as a few of us do) or to buy one and run your own test. That's why we can't completely blame the users here. And that's sad because it keeps this dishonest system in place. I initially wanted to develop a suite of anti-cheat tools like the mhz.c and ramspeed.c programs I wrote. But I already did that 25 years ago and am not seeing myself do that again. Plus these days they need to be eye-candy to gain adoption, which I really am unable to achieve. 
Last point, there will always be users who prefer to publish the results from the tool showing the highest numbers because it makes them feel good, so the anti-cheat ones will not necessarily be published that often :-/
  11. Thomas, I think it's problematic that you only have a watt-meter including the PSU because the PSU's efficiency depends on the consumption (usually it's optimal around 50% load). Using a USB power meter would tell you the volts, amps and watts, and would even allow you to detect under-power when it happens. I managed to get my board to shut down only once, when it was powered by my laptop's USB3 port (that's what I do all the time but I'm probably close to the limit). It never happened on a 5V/2A PSU however. Since it was not yet very hot, I suspect that it was the power controller which shut it down rather than the temperature. I also had to play with the critical points to avoid needless throttling. I seem to remember having set them to 105, 110 and 112 degrees though I may be wrong since I ran many tests. Now I packed the board inside a cardboard-made "enclosure" from which the heat hardly dissipates, and it can still throttle when reaching the first critical point, but that doesn't last long. When it happens, usually it's at 832/1024 of 1600 MHz = 1300 MHz, and more rarely it's 640/1024*1600 = 1000 MHz. I haven't run cpuburn yet though, I can if you're interested. Regarding the use cases, I think they are limited but the board is awesome when they can be met. For me, it's fantastic as a developer to test threaded code scalability on up to 8 cores, given that the L2 cache is shared between all cores. Usually you need a huge power hungry CPU to get the same, here I have this in my laptop's bag. I also want to see what performance level I can reach on HTTP compression using libslz. I'm pretty sure that making some content recompression farms using such boards could be very affordable. Also the CPU supports native CRC32 instructions which are missing on x86 and affect gzip's performance, so I'll have to improve my lib to benefit from this. 
Miners may like to exploit some algorithms which perform well on ARMv8 and exploit the native AES and SHA2 implementations (I'm a bit clueless in this area). Last, for computer assisted vision, you have 8 cores with NEON ;-)
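For reference, the throttling arithmetic quoted in this post can be reproduced with a small sketch. The helper name `throttled_mhz` is made up for illustration, and the N/1024 ratio format is only assumed from the two figures given above, not taken from any driver documentation.

```python
# Hypothetical helper: scale a base clock by an N/1024 throttling ratio.
# The 1024 denominator is an assumption inferred from the two figures
# quoted in the post (832/1024 and 640/1024 of 1600 MHz).
def throttled_mhz(base_mhz: int, step: int, denominator: int = 1024) -> float:
    return base_mhz * step / denominator


for step in (832, 640):
    mhz = throttled_mhz(1600, step)
    print(f"{step}/1024 of 1600 MHz = {mhz:.0f} MHz")
```

The two steps give 1300 MHz and 1000 MHz, matching the values observed on the board.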
  12. I totally agree, I've been pissed off by many boards on which it was not possible to have correct heat dissipation due to too large a distance between the CPU and the top of the enclosure. With the NEO/NEO2 now you can simply press the PCB (hence the CPU) against a thermal pad touching the enclosure and you're done. It's in fact one of the very rare boards capable of spreading the heat outside of the enclosure instead of slowly heating its own environment. I wish other vendors would realize this as well. And yes, I know it's difficult and expensive to make dual-side BGA designs so that will probably limit the possible candidates here. I would have had a much easier design for my MiQis if the RK3288 had been on the other side!
  13. I usually run "openssl speed rsa2048 -multi <#cores>" for this, the RSA code is carefully optimized to achieve a very high IPC on most CPUs and I always managed to achieve the highest power consumption with this. The only difficulty is that it doesn't last long (10s) so you have to measure quickly. Another benefit is that it often comes pre-installed on most systems.
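A minimal sketch of how the load command from this post could be built for whatever number of cores the machine reports. The function name `openssl_rsa_load_cmd` is hypothetical; the `-multi` option is placed before the algorithm name, as in the synopsis of the `openssl speed` manual page, which should be equivalent to the form given in the post.

```python
import os


# Hypothetical helper: build the "openssl speed" command line that loads
# every core with RSA-2048 operations. Defaults to the core count reported
# by the OS, falling back to 1 if it cannot be determined.
def openssl_rsa_load_cmd(cores=None):
    n = cores if cores is not None else (os.cpu_count() or 1)
    return ["openssl", "speed", "-multi", str(n), "rsa2048"]


# Show the command for an 8-core board such as the one discussed here.
print(" ".join(openssl_rsa_load_cmd(8)))
```

The returned list can be passed as-is to `subprocess.run()`; since each measurement only lasts about 10 seconds, the power reading has to be taken while it runs.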
  14. It's an over-simplification. The die technology and etching resolution matter a lot as well. And my NanoPi-M3 with 8xA53 at 1.4 GHz disagrees with you, just like my Odroid-C2 at 1.536 GHz which stays really cool to the touch at full speed. Some A53 run pretty fine at 2 GHz. You'd better say that A53 is mostly used in cheap devices and that cheap devices don't scale well above 1 GHz due to the cheap technologies involved, that would be more accurate.
  15. I've just tested on my MiQi and got about 2% higher performance at 1.8 GHz:
type          16 bytes     64 bytes     256 bytes    1024 bytes   8192 bytes
aes-128 cbc   80197.72k    87590.12k    91353.34k    92255.91k    92520.45k
bf-cbc        52186.86k    59762.60k    62007.89k    62678.36k    62876.33k
At 2.0 GHz I get this:
type          16 bytes     64 bytes     256 bytes    1024 bytes   8192 bytes
aes-128 cbc   88145.11k    97268.91k    101094.74k   102091.43k   102386.35k
bf-cbc        57751.12k    66213.18k    68619.26k    69361.66k    69580.12k
My openssl is built in thumb mode with the following options under linaro gcc-4.7.4:
armv7thf-gcc47l_glibc218-linux-gnueabi-gcc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -mcpu=cortex-a9 -mlittle-endian -O3 -fomit-frame-pointer -march=armv7-a -D__ARM_MAX_ARCH__=7 -O3 -Wall -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM
Given that most AES code is written in asm, this 2% performance gain likely comes from the rest of the code and could be caused by the mcpu=cortex-a9 build option. I'm forcing it everywhere as I found it to perform better on the RK3288 than mcpu=cortex-a17 on most programs. I suspect that the A17 core in the RK3288 is more of an A12 (as reported in cpuinfo), hence closer to A9 than what the A17 was supposed to be before ARM decided that both were the same.
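As a quick sanity check, the two openssl runs in the post above can be compared against the clock ratio. This is only a sketch: the figures are the 8192-byte columns copied from the post, and the names `results` and `scaling` are made up for illustration.

```python
# 8192-byte throughput (in thousands of bytes per second) copied from the
# post above, as (value at 1.8 GHz, value at 2.0 GHz).
results = {
    "aes-128 cbc": (92520.45, 102386.35),
    "bf-cbc": (62876.33, 69580.12),
}

FREQ_LOW_GHZ = 1.8
FREQ_HIGH_GHZ = 2.0


# Hypothetical helper: return (measured speedup, fraction of the ideal
# speedup that a purely frequency-bound workload would show).
def scaling(low: float, high: float) -> tuple[float, float]:
    speedup = high / low
    ideal = FREQ_HIGH_GHZ / FREQ_LOW_GHZ
    return speedup, speedup / ideal


for name, (low, high) in results.items():
    speedup, efficiency = scaling(low, high)
    print(f"{name}: x{speedup:.4f} measured, {efficiency:.1%} of ideal")
```

Both ciphers gain a little under 11% for an 11.1% clock increase, which suggests these tests scale almost linearly with frequency on this board.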