This is something hopefully suitable to become a 'Board Bring up' thread.
The NanoPC T4 was the smallest RK3399 board around featuring a full set of interfaces (Rock960 was smaller, but there you can't use GbE without a proprietary expander), but in the meantime it got two smaller siblings: NanoPi M4 and the cute NEO4.
Pros:
Another RK3399 board so software support is already pretty mature
Rich set of interfaces (2 x USB2 without shared bandwidth, 2 x USB3, triple display output and so on)
No powering hassles due to 12V (2A) PSU requirement
16GB superfast eMMC 5.1
Usable and performant Wi-Fi (dual band and dual antenna so MIMO can be used, for numbers see here)
All 4 PCIe lanes exposed (M.2 M key connector on the bottom, suitable for NVMe SSDs, or to attach a 4 port SATA controller or a PCIe riser card)
Cons:
A bit pricey (but if you compare with RockPro64 for example and order all Add-Ons you end up with a similar price)
High idle consumption (4W at idle, PSU included); maybe this is just down to bad settings we can improve over time
Heatsink too small for continuous loads
I started out relying on @hjc's work since he's currently using different kernels than we use on RockPro64 or ODROID-N1 (though all the 4.4 kernels are more or less just RK's 4.4 LTS branch with some modifications; with mainline I haven't yet looked at what differs between Heiko's tree and 'true' mainline).
Tinymembench numbers with RK 4.4 vs. mainline kernel (the latter showing both lower latency and higher bandwidth).
Internal 16 GB eMMC performance:
eMMC / ext4 / iozone                                   random   random
      kB  reclen    write  rewrite     read   reread     read    write
  102400       4    23400    28554    26356    26143    27061    29546
  102400      16    48364    48810    85421    85847    84017    47607
  102400     512    48789    49075   273380   275699   258495    47858
  102400    1024    48939    49053   290198   291462   270699    48099
  102400   16384    48673    49050   295690   295705   294706    48966
 1024000   16384    49243    49238   298010   298443   299018    49255
That's what's to be expected with 16 GB, and exactly the same numbers as I generated on ODROID-N1 with 16 GB size. When checking SD card performance it maxed out at 23.5 MB/s, which is an indication that no higher speed modes are enabled (and according to the schematics they are not possible, since the board is not able to switch to 1.8V here). I didn't try to adjust the DT as with ODROID-N1, where SDR104 mode is possible and led to some nice speed improvements when using a fast card -- see here and there.
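To make the reasoning behind the 23.5 MB/s figure explicit, here is a minimal sketch that derives the nominal bus limit of a few SD speed modes from clock rate and a 4-bit data bus, then picks the slowest mode that could still explain a measured throughput. The mode table only lists nominal clocks; the helper names are made up for illustration, and real cards stay somewhat below these limits because of protocol overhead.

```python
# Nominal 4-bit SD bus modes: (name, clock in MHz, double data rate?).
# High Speed works at 3.3V signalling; the UHS-I modes below it need 1.8V.
SD_MODES = [
    ("Default Speed", 25, False),
    ("High Speed", 50, False),
    ("SDR50", 100, False),
    ("DDR50", 50, True),
    ("SDR104", 208, False),
]


def bus_limit_mb_s(clock_mhz, ddr, width_bits=4):
    """Theoretical bus throughput in MB/s, ignoring protocol overhead."""
    transfers_per_clock = 2 if ddr else 1
    return clock_mhz * transfers_per_clock * width_bits / 8


def likely_mode(measured_mb_s):
    """Slowest listed mode whose bus limit still covers the measurement."""
    candidates = [
        (bus_limit_mb_s(clock, ddr), name)
        for name, clock, ddr in SD_MODES
        if bus_limit_mb_s(clock, ddr) >= measured_mb_s
    ]
    return min(candidates)[1] if candidates else None


if __name__ == "__main__":
    for name, clock, ddr in SD_MODES:
        print(f"{name:>13}: {bus_limit_mb_s(clock, ddr):6.1f} MB/s")
    print("23.5 MB/s fits:", likely_mode(23.5))
```

A measurement just below 25 MB/s sits right at the ceiling of the 3.3V High Speed mode, which is why it hints that none of the 1.8V modes are active.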
Quick USB3 performance test via the USB-3A port:
Rockchip 4.4.132                                       random   random
      kB  reclen    write  rewrite     read   reread     read    write
  102400       4    24818    29815    33896    34016    24308    28656
  102400      16    79104    90640   107607   108892    80643    89896
  102400     512   286583   288045   285021   293431   285016   298604
  102400    1024   315033   322207   320545   330923   320888   327650
  102400   16384   358314   353818   371869   384292   383404   354743
 1024000   16384   378748   381709   383865   384704   384113   381574
mmind 4.17.0-rc6-firefly                               random   random
      kB  reclen    write  rewrite     read   reread     read    write
  102400       4    37532    42871    22224    21533    21483    39841
  102400      16    86016   104508    87895    87253    84424   102194
  102400     512   274257   294262   287394   296589   287757   304003
  102400    1024   294051   312527   317703   323938   323353   325371
  102400   16384   296354   340272   336480   352221   339591   340985
 1024000   16384   367949   189404   328094   330342   328136   139675
This was with an ASM1153 enclosure, which shows slightly lower numbers than my usual JMS567 ones (all currently busy with other stuff). Performance with the RK 4.4 kernel is as expected; with mainline it is lower for whatever reason. I also tried to test with my VIA VL716 enclosure directly attached to the USB-C port but ran into similar issues as with RockPro64. Since my enclosure and the cable also show problems when used with a MacBook Pro, I suspect I should blame the hardware here and not USB-C PHY problems with RK3399.
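To put a number on "lower with mainline", the following small sketch divides the mainline results by the RK 4.4 results for the largest test (1024000 kB file, 16384 kB records). The values are copied from the two tables above; the variable and function names are just for illustration.

```python
COLUMNS = ["write", "rewrite", "read", "reread", "random read", "random write"]

# Last row of each iozone table above, in kB/s.
RK_44 = [378748, 381709, 383865, 384704, 384113, 381574]
MAINLINE = [367949, 189404, 328094, 330342, 328136, 139675]


def ratios(new, old, columns=COLUMNS):
    """Per-column throughput of `new` relative to `old` (1.0 means equal)."""
    return {name: n / o for name, n, o in zip(columns, new, old)}


if __name__ == "__main__":
    for name, ratio in ratios(MAINLINE, RK_44).items():
        print(f"{name:>12}: {ratio:.0%} of the RK 4.4 result")
```

A single run like this is not a benchmark series, so the rewrite and random write outliers in particular may be noise rather than a real regression.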
This is NanoPC T4 with vendor's heatsink, lying flat on a surface that allows for some airflow below, running cpuburn-a53 on all 6 cores after half an hour:
13:57:31: 1008/1416MHz 8.44 100% 0% 99% 0% 0% 0% 91.1°C 0/5
13:57:40: 1008/1416MHz 8.52 100% 0% 99% 0% 0% 0% 91.1°C 0/5
13:57:48: 1008/1416MHz 8.51 100% 0% 99% 0% 0% 0% 91.1°C 0/5
13:57:57: 1200/1416MHz 8.47 100% 0% 99% 0% 0% 0% 90.6°C 0/5
13:58:05: 1200/1416MHz 8.47 100% 0% 99% 0% 0% 0% 91.1°C 0/5
So with heavy workloads you most probably need a fan to prevent throttling.
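For longer logs it is easier to let a script pull out the clock range and peak temperature. Below is a minimal sketch that parses lines in the format shown above. The column interpretation (two cluster clocks, load, six percentage columns, temperature, cooling state) is an assumption based on this sample only, so the two clocks are simply called "first" and "second" rather than guessing which cluster is which.

```python
import re

# Assumed line layout, taken from the sample in this post:
# time, first/second cluster clock, load, six % columns, temperature, cooling state
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):\s+"
    r"(?P<first>\d+)/(?P<second>\d+)MHz\s+"
    r"(?P<load>[\d.]+)\s+"
    r".*?(?P<temp>[\d.]+)°C"
)

LOG = """\
13:57:31: 1008/1416MHz 8.44 100% 0% 99% 0% 0% 0% 91.1°C 0/5
13:57:40: 1008/1416MHz 8.52 100% 0% 99% 0% 0% 0% 91.1°C 0/5
13:57:48: 1008/1416MHz 8.51 100% 0% 99% 0% 0% 0% 91.1°C 0/5
13:57:57: 1200/1416MHz 8.47 100% 0% 99% 0% 0% 0% 90.6°C 0/5
13:58:05: 1200/1416MHz 8.47 100% 0% 99% 0% 0% 0% 91.1°C 0/5
"""


def parse_line(line):
    """Return one sample as a dict, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "time": m.group("time"),
        "first_mhz": int(m.group("first")),
        "second_mhz": int(m.group("second")),
        "load": float(m.group("load")),
        "temp_c": float(m.group("temp")),
    }


def summarize(lines):
    """Clock range per cluster and peak temperature over all parsed lines."""
    samples = [s for s in map(parse_line, lines) if s is not None]
    if not samples:
        return None
    first = [s["first_mhz"] for s in samples]
    second = [s["second_mhz"] for s in samples]
    return {
        "first_mhz_range": (min(first), max(first)),
        "second_mhz_range": (min(second), max(second)),
        "max_temp_c": max(s["temp_c"] for s in samples),
        "clock_changed": len(set(first)) > 1 or len(set(second)) > 1,
    }


if __name__ == "__main__":
    print(summarize(LOG.splitlines()))
```

A clock that keeps moving while the load stays pinned at 100% and the temperature sits around 91°C is the pattern to look for when checking whether a different heatsink or a fan helps.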
Development related questions: IMO we should try to rely on a single source for all the various RK3399 boards that are available now or will be soon. And I would prefer ayufan's, since he's somewhat in contact with the RK guys and there's a lot of great information/feedback provided by TL Lim. What do others think?
Also an issue is IRQ affinity, since on boards where PCIe is in use those interrupts should clearly end up on one of the big cores, while on other boards USB3 and network IRQs are better candidates. I already talked about this with @Xalius ages ago, and most probably the best idea is to switch from static IRQ affinity set at boot by armbian-hardware-optimization to a daemon that analyzes the IRQ situation every minute and then dynamically adopts the best strategy.
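Such a daemon could look roughly like the sketch below. This is only an illustration of the idea and not existing Armbian code: it samples /proc/interrupts twice, ranks the numbered IRQs by how many interrupts they fired in between, and writes a CPU mask for the big cores to /proc/irq/N/smp_affinity. It assumes the two A72 cores are CPUs 4 and 5, as is usual on RK3399, and the function names, the interval and the "top 2" choice are all made up for illustration. Writing the affinity needs root.

```python
import time

BIG_CORES = [4, 5]  # assumption: the A72 cores are cpu4 and cpu5 on RK3399


def parse_interrupts(text):
    """Map IRQ number -> (total count over all CPUs, description)."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        fields = rest.split()
        # Skip IPI/Err style rows: their affinity cannot be set via /proc/irq.
        if not label.isdigit() or len(fields) < ncpu:
            continue
        counts = fields[:ncpu]
        if not all(c.isdigit() for c in counts):
            continue
        totals[int(label)] = (sum(int(c) for c in counts), " ".join(fields[ncpu:]))
    return totals


def busiest_irqs(before, after, top=2):
    """IRQ numbers with the largest count increase between two samples."""
    deltas = {
        irq: total - before.get(irq, (0, ""))[0]
        for irq, (total, _desc) in after.items()
    }
    active = [irq for irq, delta in deltas.items() if delta > 0]
    return sorted(active, key=lambda irq: deltas[irq], reverse=True)[:top]


def cpu_mask(cpus):
    """Hex bitmask in the form /proc/irq/N/smp_affinity expects."""
    return format(sum(1 << cpu for cpu in cpus), "x")


def set_affinity(irq, cpus):
    """Pin one IRQ to the given CPUs (needs root, may fail for some IRQs)."""
    with open(f"/proc/irq/{irq}/smp_affinity", "w") as handle:
        handle.write(cpu_mask(cpus))


def read_interrupts():
    with open("/proc/interrupts") as handle:
        return parse_interrupts(handle.read())


def run(interval=60):
    """Every `interval` seconds, move the busiest IRQs to the big cores."""
    before = read_interrupts()
    while True:
        time.sleep(interval)
        after = read_interrupts()
        for irq in busiest_irqs(before, after):
            try:
                set_affinity(irq, BIG_CORES)
            except OSError:
                pass  # some IRQs refuse affinity changes
        before = after
```

Calling run() as root starts the loop. A real implementation would also want to move IRQs back when the load pattern changes and to spread several busy IRQs across the two big cores instead of sharing one mask.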
Wrt information for end users: all RK3399 boards basically behave the same since the relevant stuff is inside the SoC. There's only different DRAM (matters with regard to consumption and performance), different interfaces exposed and different power circuitry (and obviously different settings like e.g. cpufreq behaviour, but I think we should consolidate those for all RK3399 boards). So you already find a lot of information in my ODROID-N1 'review', my SBC storage performance overview and most probably also a lot around RockPro64. No idea where to point people for RK3399 GPU/VPU stuff since I'm not interested in these areas at all (I hope others add references or direct information).