Overview
(Disclaimer: The following is for techies only that like to dig a bit deeper. And if you're not interested in energy-efficient servers then probably this is just a waste of time )
EDIT: Half a year after this poorly designed SBC has been released just one of the many design flaws has been fixed: Micro USB for DC-IN has been replaced by the barrel jack that was present on the pre-production batches. If you were unfortunate to get a Micro USB equipped M3 please have a look here how to fix this. Apart from that check the Banana forums what to expect regarding software/support first since this is your only source)
SinoVoip sent me a review sample of the recently shipped so called "Banana Pi M3" yesterday. It's a SBC sharing name and form factor of older "Banana Pi" models but is of course completely incompatible to them due to a different SoC, an A83T (octa-core Cortex-A7 combined with a PowerVR SGX544 GPU). For detailed and up-to-date informations please always refer to the linux-sunxi wiki.
This new model distinguishes itself from the Banana Pi M2 with twice as much CPU cores and DRAM (LPDDR3), 8 GB eMMC onboard and BT4.0. And compared to the "M1" (the original Banana Pi) it features also 802.11 b/g/n Wi-Fi. Unlike the M2 the M3 is advertised as being SATA capable. But that's not true, it's just an onboard GL830 USB-to-SATA bridge responsible for horribly slow disk access. Unfortunately the GL830 and both externally available USB ports are behind an internal USB hub therefore all ports have to share bandwidth this way and use just one single USB connection to the SoC.
Since my use cases for ARM boards are rather limited you won't find a single word about GPIO stuff (should work if pin mappings are defined correctly), GPU performance, BT, Wi-Fi or Android. Simply because I don't care
Getting Started:
The board arrived without additional peripherals (no PSU) therefore you need an USB cable using a Micro-USB connector to power the board. Both DC-IN and USB-OTG feature an Micro-USB connector which is bad news since pre-production samples had a real DC-In connector (4.0mm/1.7mm barrel plug, centre positive like the M2). I suffered from several sudden shutdowns under slight load until I realized that I used a crappy cable. Many (most?) USB cables lead to voltage drops and when the board demands more power it gets in an undervoltage situation and the PMU shuts off.
Same will happen to you unless you can verify that you've a good cable. I did not succeed querying the M3's powermanagement unit (PMU) regarding available voltage (/sys/devices/platform/axp81x_board/axp81x-supplyer.47/power_supply/ac/voltage_now shows always 0). This was a lot easier with the older Banana Pi M1: Here you can watch my cable being responsible for voltage drops under high load (I accidentally used this again with the M3).
To avoid the crappy Micro-USB connector (limited to 1.8A maximum by specs and tiny contacts) you can desolder it and solder a cable or a barrel plug -- the PCB is already prepared for the latter. Or ask SinoVoip if they can fix this mistake with the next batch of PCBs. On the bottom side of the PCB there are also solder pads for a Li-Ion battery. It has to be confirmed whether the AXP813 PMU can also be fed with 5V through the Li-Ion connector since this is the preferred way to fix the faulty power design other SinoVoip products show.
One final word regarding power: It seems currently something's wrong with power initialisation in the early boot stages (u-boot). With a connected bus-powered USB disk the board won't start or immediately shut down when the disk is connected within the first 10 seconds. I didn't verify when exactly because if you've a look at SinoVoip's commit log it seems they began to fix many obvious bugs just right now after they already started shipping the board (we've seen that with the M2 also).
First Showstoppers:
Since the board came with an unpopulated eMMC (why the heck?) I had to try out the available OS images from the banana-pi.org download site. Unlike everyone else on this planet they don't provide MD5/SHA1 checksums to be able to check integrity of downloads and even if you tell them that they've uploaded corrupted images they don't care. From 4 OS images 3 are corrupted (according to unzip) and all failed soon after boot with kernel panics. I tried the Android image to verify FEL mode works.
But since Android is of zero use for me, I decided to build an own OS image from an Ubuntu distro running on the Orange Pi where I had the SD-card inserted. Since details are boring just as a reference. From then on I used this Ubuntu image and exchanged only the freshly built stuff from SinoVoip's BSP Github repo (3.4.39 kernel, modules, bootloader and also simple things like hardware initialisation since kernel/u-boot they prefer does NOT support script.bin)
First Impressions:
Heat (dissipation):
The A83T needs a heatsink otherwise you won't be able to benefit from its performance. Allwinner's 3.4.39 kernel provides 'budget cooling' using 2 techniques: thermal throttling and shutting down CPU cores. You can define this 'thermal configuration' in sysconfig.fex and have to take care that you understand what you're doing since if throttling doesn't help your CPU cores will be deactivated and you have to can't bring them back online manually the usual way since Allwinner's kernel doesn't allow so:
echo 1 >/sys/devices/system/cpu/cpuX/online
Therefore it's better to stay with the thermal defaults to allow throttling and improve heat dissipation instead. I used a $0.5 heatsink that performs ok. Without heatsink when running CPU intensive jobs throttling limited clockspeed to 1.2 GHz but with the heatsink I was able to run most of the times at ~1.6Ghz under full load. With heatsink and an annoying fan I managed to let the SoC run constantly at 1.8GHz and achieved a 7-zip score close to 6000 and finished "sysbench --test=cpu --cpu-max-prime=20000 run --num-threads=8" in less than 53 seconds.
This is an example for wrong throttling values (too high) so that the kernel driver does not limit clockspeeds but starts to drop CPU cores instead:
CPU performance:
Since the H3 (used on the more recent Orange Pis) and the A83T seem to use much of the same kernel sources (especially the 'thermal stuff') I did a few short tests. When running with identical clockspeed and the same amount of cores they perform identical (that means they're slower than older Cortex-A7 SoCs like eg. the A20 when running at identical clockspeed -- a bit strange). Obviously the difference between H3 and A83T is the process. Both already made in 28nm but the A83T as 'tablet SoC' in the more energy efficient HPC process allowing less voltage and higher clockspeeds. According to sysconfig.fex the SoC should be able to clock above 2.1 GHz but since exceeding 1.6 Ghz already needs a fan this is pretty useless on a SBC (might be different inside a tablet where the back cover could be used as a large heatsink).
Network throughput:
I used my usual set of iperf testings and tried GBit Ethernet performance (with and without network tunables it remains the same -- reason below):
BPi-M3 --> Client:
[ 4] 0.0-10.0 sec 671 MBytes 563 Mbits/sec
[ 4] 0.0-10.0 sec 673 MBytes 564 Mbits/sec
[ 4] 0.0-10.0 sec 870 MBytes 729 Mbits/sec
[ 4] 0.0-10.0 sec 672 MBytes 564 Mbits/sec
[ 4] 0.0-10.0 sec 675 MBytes 566 Mbits/sec
Client --> BPi-M3:
[ 4] 0.0-10.0 sec 714 MBytes 599 Mbits/sec
[ 5] 0.0-10.0 sec 876 MBytes 734 Mbits/sec
[ 4] 0.0-10.0 sec 598 MBytes 501 Mbits/sec
[ 5] 0.0-10.0 sec 690 MBytes 578 Mbits/sec
[ 4] 0.0-10.0 sec 604 MBytes 506 Mbits/sec
When I used longer test periods (-t 120) then the "Client --> BPi-M3" performance increased up to the theoretical limit: 940 Mbits/s. Then a second iperf thread jumped in, both utilising a single CPU core fully. And that's the problem: Networking is CPU bound, a single client-server connection will not exceed 500-600 Mbits/sec as it was the case when I started with A20 based boards 2 years ago. Since all we have now with the A83T is an outdated 3.4.39 kernel and since I/O bandwidth on the M3 is so low, I stopped here since it's way too boring to try to improve network throughput and also useless (disk access is so slow that it simply doesn't matter when Ethernet is limited to half of the theoretical GBit Ethernet speed... at least for me )
Accessing disks:
Since there's a SATA connector on the board I gave it a try.
Important: the SATA-power connector uses the same polarity as older Banana Pis and Orange Pis (keep that in mind since combined SATA data/power cables from LinkSprite and Cubietech that share exactly the same connector use inverted polarity!).
I started with the Samsung EVO I always use for tests (but due to the old 3.4 kernel using ext4 instead of btrfs) and was shocked: 13.5/23 MB/s is the worst result I ever measured. I then realised that I limited maximum cpufreq to 480 MHz and tried with 1800 MHz again. A bit better but far away from acceptable:
GL830 USB-to-SATA performance:
480 MHz: kB reclen write rewrite read reread
4096000 4 13529 13466 22393 22516
4096000 1024 13588 13411 22717 26115
1.8 GHz: 4096000 4 15090 15082 30968 30316
4096000 1024 15174 15131 30858 29441
I disconnected the SSD from the 'SATA port' and put it in an enclosure with a JMicron JMS567 USB-to-SATA bridge and measured again: Now sequential transfer speeds @ 1800 MHz exceeded 35/34 MB/s. The GL830 is responsible for low throughput -- especially writes are slow as hell.
I made then a RAID-1 through mdadm consisting of an external 3TB HDD (good news: the GL830 can deal with partitions larger than 2 TB) and the SSD. First test with the HDD connected to the M3's GL830 bridge (GL) and the SSD connected to the JMS567 (JM). Then I disconnected the HDD from the GL830 and put it in another external enclosure with an ASMedia 1053 (ASM).
Obviously SinoVoip's decision to use an internal USB hub and only one host port of the SoC leads in both situations to limited (shared) bandwidth. But in case the internal USB-to-SATA bridge is involved performance is even worse:
GL/JM: kB reclen write rewrite read reread
4096000 4 17800 17140 14382 16807
4096000 1024 17741 17258 14493 14368
JM/ASM: 4096000 4 19307 18458 22855 26241
4096000 1024 19231 18518 21995 22362
If SinoVoip would've saved the GL830 USB-to-SATA bridge and wired both SoC's host ports to the 2 type-A USB ports directly without the internal hub in between overall performance would be twice as good. And obviously the M3's 'SATA port' is the worst choice to connect a disk to. Any dirt-cheap external USB enclosure will perform better.
SD-card and eMMC:
Just a quick check with the usual iozone settings running @ 1.8 GHz:
kB reclen write rewrite read reread
eMMC: 4096000 4 26572 27014 59187 59239
4096000 1024 25875 26614 56587 56667
SD-card: 4096000 4 20483 20855 22473 22892
4096000 1024 20526 19948 22285 22660
LOL, eMMC twice as fast as 'SATA'. The performance numbers of the SD-card (SanDisk "Extreme Pro") are irrelevant since I can not provide performance numbers from a known fast reference implementation. But since I might be able to provide this the next few days, I decided to give it a try. On older Allwinner SoCs there's a hard limitation regarding SDIO/SD-card speed. Maybe this applies here too.
EDIT: Yes, it's a board/SoC limitation. When reading/writing the SD-card on a MacBook Pro I achieve ~80 MB/s. It seems SDIO on A83T is limited to ~20MB/s
Other issues:
If you want to try out the M3 you'll have to stay on the bleeding edge. Don't expect that any of the available OS images are close to useable. They just recently started to fix a lot of essential bugs in code and hardware initialisation. If you want to test the M3 be prepared to compile the BSP daily and exchange the bootloader/kernel/initialisation stuff on your SD-card/eMMC
Currently average load is always 1 or above. When we started over 2 years ago with Cubieboards (and an outdated kernel 3.4.x) there was a similar issue. Maybe it's related. I just opened a Github issue
Mainline kernel support in very early stage. Don't count on this that soon (situation with Banana Pi M2 was a bit different. All the OS images from SinoVoip based on kernel 3.3 weren't useable but the community provided working distros backed by the work of the linux-sunxi community and existing mainline kernel support for the M2's A31s)
Always keep in mind that hardware without appropriate software is somewhat useless. SinoVoip has a long history of providing essential parts of software way too late or not at all (still applies to the M2 -- before you buy any SinoVoip product better have a look into their forums to get the idea which level of support you can expect: zero). Even worse: For the M2 and its A31s SoC there exists mainline kernel support (everything developed by the community while the vendor held back necessary informations). This does not apply to the A83T used on the M3. At the moment you're somewhat lost since you've to rely on the manufacturer's OS images (all of them currently being broken)
Conclusion:
Still no idea what to do with such a device.
Integer performance is great when you use a heatsink and even greater with an annoying fan. But where's the use case? If I would use the M3 with Android then everything that's relevant for performance does not depend on CPU (but instead CedarX for HW accelerated video decoding and GPU for 2D/3D acceleration -- BTW: the A83T is said to contain only a single core SGX544MP1 but the fex file's contents let me believe it's a faster MP2 instead).
Due to limited I/O and network bandwidth the integer performance is also irrelevant for nearly all kinds of server tasks. If it's just about 'SBC stuff' why wasting so much money? Triggering GPIO pins works also with cheap H3 based boards like Orange Pi PC or Orange Pi One that also have 4 times more I/O bandwidth compared to the M3 (due to 4 available USB ports instead of one).
And if I would really need a performant ARM SoC then I would buy such a thing and not an outdated Cortex-A7 design. I still have no idea what the M3 is made for. Except of selling something under the "Banana" brand to clueless people. Don't know. For my use cases the Banana Pi M1 outperforms the M3 easily -- both regarding price and performance (sufficient CPU power, 3 x USB and real SATA not 'worst USB-to-SATA implementation ever'). As usual: YMMV
Maybe the worst design decision (next to choosing the crappy Micro-USB connector for DC-IN) on the M3 is the 'SATA port'. If they would've saved both internal USB hub and GL830 and instead use the two available USB host ports then achievable I/O bandwidth would be way higher. Now both USB ports and the 'SATA port' have to share the bandwidth of a single USB 2.0 connection. Almost as bad as with the Raspberry Pis.
But most importantly: Check software und support situation first and don't rely on 'hardware features'. Remember: SinoVoip shipped the M2 with OS images where not a single GPIO pin was defined and Ethernet worked only with 100Mb/s since they 'forgot' to define GMAC pins. They fixed that months later but still not for every OS image (the Android image they provide is corrupted since months but they don't care even if users complain several times). Visit their forums first to get an idea what to expect. It's important!
Armbian support:
Not to be expected soon. It's worthless when having to rely on Allwinner's old 3.4.39 kernel. I combined loboris' H3 Debian image with kernel/bootloader stuff for the A83T and it worked as expected (even my RPi-Monitor setup matched almost perfectly).
Unless the linux-sunxi community improves mainline support for the A83T this situation won't change. But maybe someone interested in M3 (definitely not me) teaches SinoVoip how to escape from u-boot/kernel without support for script.bin in the meantime. Would be a first step.