Jump to content

wtarreau

Members
  • Posts

    41
  • Joined

  • Last visited

Everything posted by wtarreau

  1. FWIW, I've ordered two StationPC-M2 (one for me and one for a friend). The devices look really awesome and perfectly match what I expect from a small fanless machine (i.e. sturdy metal enclosure and good I/O). HOWEVER, as often with many rockchip SoC vendors, the images "provided" (i.e that you have to figure by yourself) by t-firefly fail to flash, and their crappy flashing tools can even segfault. There's a tool to turn these to SD images but the tool is only provided for windows instead of just explaining the layout and how to convert the image (I guess there are just some headers to be added or removed). And of course they do not provide bootable SD images for now, it wouldn't be fun. Thus I guess that it will take quite some time before these devices get supported since for now they are sold as bricks that cannot even be flashed, and that once someone figures how to reach that point, there will be some effort to get away from this mess. For now I'm giving up, spending the whole afternoon on it drove me sufficiently crazy that I want to crush the device under a hammer!
  2. It would have been nice at least to keep the mention that the device Balbes was talking about was the X96 Max, that's not specific to any store and it's just a board with a supported SoC (S905X3), otherwise his post becomes confusing or misleading since it's a different hardware from the one posted above.
  3. Please note, my build farm is used for cross-compiling only. I used to do native builds 20 years ago, and after having been hit several times by accidental dependencies on the build host, I stopped and am always cross-compiling nowadays, even when doing x86 on x86. That's why I can use whatever host is available for the build farm. My build farm at home is heterogeneous, it's made of the armv8 boards above, one armv7 board (odroid xu4) and sometimes some x86 hosts when the devices are up. So yes, I'm a huge proponent of cross-compiling.
  4. I understood as well that HK got their hands on the blob part and were able to figure the highest stable frequency they could use. I don't know how this blob is used but definitely without it you won't go above 1.5 GHz.
  5. Use sales@, I've used it several times and it works. Oh and put a link to the discussion in this forum so that they know you're not just doing random stuff but are actually using their boards fine. I wouldn't be surprised if they're used to receive an occasional complaint from people who just accidently erase their micro-SD and cry for help.
  6. OK, good luck! Note, you may want to contact FriendlyElec before putting the board into the oven to let them know that one of your boards seems to be dead and let them know what you've tried and that you're planning on trying the oven. Maybe they'll simply want to send you a replacement one and will ask for this one to be sent back there for inspection. They've already sent me some replacement hardware for early defects, they're really nice.
  7. You should try to connect a serial adapter to its console port to see if it emits anything at boot. If it doesn't, it's probably dead. If it emits random errors which differ upon every boot, it could be the RAM which became defective. This happened to me on a few boards in the past, and on one of the MIQIs on my build farm. It is also possible that a solder joint has gone bad under a BGA chip. That's the most common failure cause in modern hardware (especially smartphones). It's even worse with RoHS and lead-free because lead-free tin is less elastic and breaks more easily. I've repaired a few such dead boards using a hot air gun and an infrared thermometer, but each time you know that you're probably killing it even more, so you have to be prepared to this. Note that it can also work in an oven if you want to try. Most of the time it doesn't work and can even make things worse but if the board is dead, there's nothing to lose to try. Just pre-heat your oven at 180°C without the board, make sure the temperature is stable (otherwise pre-heat it at 200 and stop it). Then place the board inside for 3-5 minutes. Don't touch it to extract it, open the oven to let it cool down enough, and pick the board once the temperature has dropped enough for the solders to be solid. Some plastic may get slightly damaged if the oven is too hot (e.g. reset button). The datasheet recommends -25 to 85°C operating temperature (both ambient and die), and -40 to 150 storage. In short it means that it must work flawlessly at 85, that beyond this it only will if you're lucky, and that at 150 you're definitely certain to fry it. Also running a chip past its maximum operating temperature risks to hang it and a hung chip may quickly get damaged. But below 85 you have zero risk for the device itself. If you're overclocking, the margin may be lower. My personal appreciation of "running pretty hard" is when the chip's temperature never has the opportunity to fall back into the recommended range ;-) Yep I agree. I wanted to do it as well but my copper plates are too thick. However since my board is inside a very tight cardboard "enclosure", I've added an aluminum plate on the other side above two efficient thermal pads. It allows to collect some of the heat from the other side and spread it through the cardboard. It was enough to make the temperature drop by 5-10°C here. But in general since aluminum has a moderate thermal resistance, you indeed want to improve its contact surface with the hot source. A copper plate (which is a much better thermal conductor) does this by spreading the heat all over the base of the heat sink and avoiding hot spots below it. Ideally you need to have a soft thermal pad covering the whole surface of the copper plate, except the small silicon part, so that the heat is extracted from anything below. I've long been wondering if it would be possible to achieve this using a good chunk of silicon glue to stick the copper plate on top of everything. But that would definitely destroy the board.
  8. I never had such an issue, but I never plugged HDMI into it either, so it's hard to say if anything is related. If it fails to boot, I'd suggest it's not related to the software/distro so you'd better ask FriendlyElec directly in case they'd be aware of any stability issue or a workaround for what you're seeing. Note, be careful about the power supply, especially if it's shared between all boards. It could be possible that when they're all rebooting, the PSU doesn't cope nicely with the sudden current rush and provides too low a voltage to the boards.
  9. Hi, So I finally found some time to pick the file form the board, sorry for the delay, was busy on other stuff. I verified the temperature thresholds. First alert is at 113, second at 115 and critical is at 120. The file includes one extra DVFS entry for 1600 MHz at 1.25V. I didn't find any other difference compared to the factory dtb. For those landing here from a search engine without preliminary context, this works for my board and is very likely to fry yours, so don't randomly try this file at all, or don't complain! s5p6818-nanopi3-rev05.1g6-1v25-113deg.dtb
  10. Hehe, that doesn't change my opinion of this standard which is only useful to build development boards and nothing looking even remotely like a usable prototype. The small form factor only exposes useless stuff and the large one requires access to both sides, thus imposing an enclosure size. But for development I'm totally fine with the EE form factor as it provides a rich set of connectors and standards in a reasonable size.
  11. Confirmed, from memory I just added a line with "1600000 1250000". Mine is ultra-stable even with 8 cores at full speed. It happens to throttle a little bit from time to time because the heat cannot escape easily from the cardboard enclosure, but that's all. From what I remember from the datasheet, the chip is designed to run at 1.6 GHz, that's why I picked this frequency. I also modified the thermal triple points because I don't want the machine to throttle quickly, I seeked the limits on mine and set the points slightly below. I can upload the file later if some want it. It's just that it's possible that my temp limits might cause issues to others so we need to be cautious and I don't want people to randomly download the file without reading the context, then complain about this board's stability when using armbian...
  12. wtarreau

    NanoPi NEO4

    I thought they were two different names for the same upcoming board, with various intermediary designs. But you're right, the NEO4 is even smaller than the M4! So yes that makes sense, it's a single channel RAM. Then I think I'm more interested in the M4. However if they made a complete aluminum enclosure like they recently did with the NEO/NEO2 with the buttons and OLED, it could be very tempting to get one as well for about everything you can do with a machine lying in your computer bag!
  13. wtarreau

    NanoPi NEO4

    I just checked the Wiki and it's clearly written dual-channel for the RAM regardless of the size, so we have 64 bits.
  14. wtarreau

    NanoPi NEO4

    @tkaiser I totally agree with you. I'm checking every morning on their site while drinking my first coffee if it's available or not! While I'm not *that* much impressed by RK3399, it's still a pretty good SoC, and this combined with FE's documentation and thermal design should bring something really nice. I'm just wary of the 32-bit memory, we'll have to see once it's available. I can understand their choice given the small size of the board though.
  15. What you can do is increase the 2nd argument, it's the number of loops you want to run. At 1000 you can miss some precision. I tend to use 100000 on medium-power boards like nanopis. On the clearfog at 2 GHz, "mhz 3 100000" takes 150ms. This can be much for your use case. It reports 1999 MHz. With 1000 it has a slightly larger variation (1996 to 2000). Well, it's probably OK at 10000. I took bad habits on x86 with intel_pstate taking a while to start. Maybe you should always take a small and a large count in your tests. This would more easily show if there's some automatic frequency adjustment : the larger count would report a significantly higher frequency in this case because part of the loop would run at a higher frequency. Just an idea. Or probably that you should have two distinct tools : "sbc-bench" and "sbc-diag". The former would report measured values over short periods, an the latter would be used with deeper tests to try to figure whats wrong when the first values look suspicious.
  16. Please note that the operating points is usually fed via the DT while the operating frequency is defined by the jumpers on the board. It's very possible that the DT doesn't reference the correct frequencies here. From what I've apparently seen till now, the Armada 38x has limited ability to do frequency scaling, something like full speed or half speed possibly. When I was running mine at 1.6 GHz, I remember seeing only 1600 or 800 being effectively used. I didn't check since I upgraded to 2 GHz (well 1.992 to be precise) but I suspect I'm now doing either 2000 or 1000 and nothing else. Thus if you have a smaller number of operating points it would be possible that they are incorrectly mapped. Just my two cents :-)
  17. wtarreau

    NanoPI M4

    Oh I definitely agree and that's what I was thinking as well in the case an RPi enclosure was used : cover all the bottom with a 1mm thick aluminum plate that will radiate the heat through the plastic over all this surface. After all my cardboard-made npi-fire3 enclosure is not far from this :-) BTW I wasn't aware of the FLIRC case at all.
  18. Absolutely, and this factor is impacted by current (derived from voltage) and temperature. That's why what matters for stability is to find the proper operation conditions. Overclocking in a place where the ambien temperature can vary by 10 degrees can cause big problems. Same for those who undervolt to reduce heat because signals raise softly and degrade as well, or who use too small a heatsink. That said with nowadays software quality you often face a software bug many more times than hardware bit flips :-)
  19. wtarreau

    NanoPI M4

    It definitely is one, the size and shape leave no doubt about it. I'm also seeing some symmetric lines routed to the GPIO2 connector, maybe it's USB2 that's brought there. Maybe even PCIe (though I doubt PCIe works fine on such large connectors since it requires very low capacitance). I do also appreciate a lot the CPU on the correct side. Those who complain about the inability to use a heatsink in an RPi enclosure are also the ones not planning on using one anyway if it were on the other side Also, very likely the other side will feature the DDR4 chips, and it will still be possible to use a heatsink there to spread most of the heat into the enclosure. But having an aluminum enclosure for such a design would be really great.
  20. This is exactly why there are people like us who dissect products and push them to their limits so that end users don't have to rely on marketing nor salesmen but on real numbers reliably measured by third party who don't have any interest in cheating. Regarding your point about Tj and lifetime, you're totally right, and in general it's not a problem for people who overclock because if they want to get a bit more speed they won't keep the device for too long once it's obsolete. Look at my build farm made out of MiQi boards (RK3288). The Rockchip kernel by default limits the frequency to 1.6 GHz (thus the Tinker board is likely limited to this as well). But the CPU is designed for 1.8. In the end, with a large enough heat sink and with throttling limits pushed further, it runs perfectly well at 2.0 GHz. For a build farm, this 25% frequency increase directly results in 25% lower build time. Do I care about the shortened lifetime ? Not at all. Even if a board dies, the remaining ones are faster together than all of them at stock frequency. And these boards will be rendered obsolete long before they die. I remember a friend, when I was a kid, telling me about an Intel article explaining that their CPU's lifetime are halved for every 10 degrees Celsius above 80 or so, and based on this I shouldn't overclock my Cyrix 133 MHz CPU to 150. I replied "because you imagine that I expect to keep using that power-hungry Cyrix for 50 years? For me it's more important that my audio processing experimentation can run in real time than protecting my CPU and having to perform them offline". However it's important that we are very clear about the fact that this has to be a choice. You definitely don't want companies to build products that will end up in sensitive domains and expected to run for over a decade using these unsafe limits. On the other hand, the experiments run by a few of us help figuring available margins and optimal heat dissipation, both of which play an important role in long term stability designs.
  21. By the way if we start to be numerous to buy the board, it may finally become incentive for someone to design a 3D printed enclosure. I'd prefer a metal one with a thermal pad serving as a heat sink at the same time, but I'd be happy with anything better than cardboard+duct tape...
  22. Any "correct" USB power supply delivering more than 1.5A under 5V will work, though you'll have to make you own cable or to solder the wires. But with good quality USB cables, it will also work via the micro-USB port, because the current drawn by this board is not *that* high. I even power mine from a USB3 connector of my laptop which delivers about 1.6A (it's over spec and that's great for this use case). You really need to test. Some reported 1.2A under 5V. It's only 33% higher than the regular USB3 limit (900mA) and may actually work fine with most PCs or chargers due to large enough margins in the design.
  23. I'm pretty sure it depends on a number of parameters. Mine starts to throttle at 113 degrees C because I found that it works fine till 120 and I don't want it to throttle for no reason. In your case for a cluster it will be difficult to test all boards and check that they're running fine over time. But it can also be valuable. I seem to remember reading 90 degrees max in the datasheet so that could be a good start but it's very close to the existing limits. I don't know if the stability of your workloads is critical or if you can take the risk to see one board hang once in a while to find the limits. One other important factor to keep in mind is whether you're using the GPU or not. I am not, which is why I can trust the ability to throttle to cool it down. If you are not using it either, you could possibly decide to start with a limit at 105. I'm only concerned by temperatures getting close to the ones causing instability. For most of my hardware, when I focus on performance I don't care if it shortens its life since it will be obsolete before it dies. That's why I searched the limits for my board. You need to keep a bit of margin because it takes some time for the temperature to be reported, then when the board starts to throttle it continues to heat a bit. However at very high temperatures it cools down very quickly. Mine throttles at 113 and it rarely reaches 115. I'm using the stock heat sink, and worse, the whole thing is packed into a cardboard "enclosure" so that it can safely lay in my computer bag. Basically there is no air flow around it, it only adds latency to the temperature raise, and spreads it all around in the cardboard. It's totally horrible, and when I leave it for too long on my desk, the desk gets hot under it :-) For my use cases (mostly network endpoint for development) it doesn't throttle at all. I've run some build tests, and I have enough time to compile for a few minutes before it starts to throttle, but even when it does, it doesn't for too long (it oscillates between 113 and 115 degrees). Oh I know what you're talking about, I also happen to hate fans for the same reason. I've installed a 12cm fan behind my MiQi build farm at work, which is powered by the central board's GPIO when the temperature gets too high. It's a 12V fan running on 5V so it probably rotates at less than 1000 RPM and I almost can't hear it. The one at home has much larger heat sinks and no fan. Small fans are noisy and inefficient, you should really pick a large and slow one for your whole cluster. That's what I'd do if I built one (I'd love to just for fun, it's just that I figured that I have no use case for a NanoPi cluster at the moment!).
  24. Well, all these multi-port chargers never deliver up to the amount they claim. You can safely expect 50 to 66% though, which is not bad overall. I removed the current limit detection in mine to stabilize the output for the MiQi farm. That said, I never managed to pull more than 1.6A in peak from my Fire3 at 1.6 GHz under 1.25V, so you have some headroom I guess. You need to consider that when the board is hot, its DC-DC regulators' efficiency starts to drop and to turn the current into more heat. Thus it's more important to measure the current when the board is already hot if you want to be pessimistic (or realistic). That was the case for me at when I measured 1.6A. Quite frankly, you're worrying too much : if when loading all the boards it still works, that's fine. If you want to buy more boards, then buy them and plug them to your charger until you find the limit. The charger will either cut one port or completely shut down. Then you'll know how many more chargers you need to buy depending on the number of boards :-)
  25. No, in my experience, the board will either hang, switch off, or reset when undervolted. Usually you need a voltmeter to check the board voltage under load. If you don't have one, you'll need to verify that they're all working fine (ie: ping them). The best you can do is to run cpuburn-a53 on all of them at the same time. If nothing fails, you should be fine.
×
×
  • Create New...

Important Information

Terms of Use - Privacy Policy - Guidelines