AxelFoley Posted March 3, 2019

Anybody else seen this? I just upgraded, and after a reboot Armbian won't boot into the desktop. Xorg looks good with no errors, and I can see the kernel boot screen and log in via ssh. If I disconnect and reconnect the HDMI cable I get the following, but nothing appears on the monitor:

[ 1733.805014] rockchip-vop ff900000.vop: [drm:vop_crtc_enable] Update mode to 1920x1080p60, type: 11

The screen stays on, since it can detect the graphics display; it's just blank. What display manager does Armbian use?
AxelFoley Posted March 3, 2019 Author

I am going backwards through earlier kernels. I was on 4.4.167 and have just booted back to that version, and it's still not booting into the desktop despite reinstalling the desktop via armbian-config. That suggests it's not the kernel, and the problem must have been caused by one of the packages released since February, or by a firmware upgrade.
AxelFoley Posted March 3, 2019 Author

Right, found the culprit: it's the display manager, lightdm. It was upgraded to lightdm 1.26.0 and then it broke the desktop on Pine's RockPro64. I will have to learn how to run LightDM in debug mode.

syslog:Mar 3 13:08:28 rockpro64_0 systemd[1]: lightdm.service: Service hold-off time over, scheduling restart.
syslog:Mar 3 13:08:28 rockpro64_0 systemd[1]: lightdm.service: Scheduled restart job, restart counter is at 5.
syslog:Mar 3 13:08:28 rockpro64_0 systemd[1]: lightdm.service: Start request repeated too quickly.
syslog:Mar 3 13:08:28 rockpro64_0 systemd[1]: lightdm.service: Failed with result 'exit-code'.
AxelFoley Posted March 3, 2019 Author

LightDM in debug is about as helpful as a punch in the face :-(

root@rockpro64_0:/var/log# /usr/sbin/lightdm -d
[+0.00s] DEBUG: Logging to /var/log/lightdm/lightdm.log
[+0.00s] DEBUG: Starting Light Display Manager 1.26.0, UID=0 PID=2908
[+0.00s] DEBUG: Loading configuration dirs from /usr/share/lightdm/lightdm.conf.d
[+0.00s] DEBUG: Loading configuration from /usr/share/lightdm/lightdm.conf.d/50-disable-guest.conf
[+0.00s] DEBUG: Loading configuration from /usr/share/lightdm/lightdm.conf.d/50-disable-log-backup.conf
[+0.00s] DEBUG: Loading configuration from /usr/share/lightdm/lightdm.conf.d/50-greeter-wrapper.conf
[+0.00s] DEBUG: Loading configuration from /usr/share/lightdm/lightdm.conf.d/50-guest-wrapper.conf
[+0.00s] DEBUG: Loading configuration from /usr/share/lightdm/lightdm.conf.d/50-xserver-command.conf
[+0.00s] DEBUG: Loading configuration from /usr/share/lightdm/lightdm.conf.d/60-lightdm-gtk-greeter.conf
[+0.00s] DEBUG: Loading configuration dirs from /usr/local/share/lightdm/lightdm.conf.d
[+0.00s] DEBUG: Loading configuration dirs from /etc/xdg/lightdm/lightdm.conf.d
[+0.00s] DEBUG: Loading configuration from /etc/lightdm/lightdm.conf.d/11-armbian.conf
[+0.00s] DEBUG: [SeatDefaults] is now called [Seat:*], please update this configuration
[+0.00s] DEBUG: Loading configuration from /etc/lightdm/lightdm.conf.d/22-armbian-autologin.conf
[+0.00s] DEBUG: Loading configuration from /etc/lightdm/lightdm.conf
[+0.00s] DEBUG: Registered seat module local
[+0.00s] DEBUG: Registered seat module xremote
[+0.00s] DEBUG: Registered seat module unity
[+0.00s] DEBUG: Using D-Bus name org.freedesktop.DisplayManager
[+0.01s] DEBUG: Monitoring logind for seats
[+0.01s] DEBUG: New seat added from logind: seat0
[+0.01s] DEBUG: Seat seat0: Loading properties from config section Seat:*
[+0.01s] DEBUG: Seat seat0: Starting
[+0.01s] DEBUG: Seat seat0: Creating user session
[+0.01s] DEBUG: Loading users from org.freedesktop.Accounts
[+0.01s] DEBUG: User /org/freedesktop/Accounts/User1001 added
[+0.02s] DEBUG: User /org/freedesktop/Accounts/User1000 added
[+0.03s] DEBUG: User /org/freedesktop/Accounts/User1002 added
[+0.05s] DEBUG: User /org/freedesktop/Accounts/User1003 added
[+0.08s] DEBUG: Seat seat0: Creating display server of type x
[+0.08s] DEBUG: Using VT 7
[+0.08s] DEBUG: Seat seat0: Starting local X display on VT 7
[+0.08s] DEBUG: XServer 0: Logging to /var/log/lightdm/x-0.log
[+0.08s] DEBUG: XServer 0: Writing X server authority to /var/run/lightdm/root/:0
[+0.08s] DEBUG: XServer 0: Launching X Server
[+0.08s] DEBUG: Launching process 2914: /usr/bin/X -core :0 -seat seat0 -auth /var/run/lightdm/root/:0 -nolisten tcp vt7 -novtswitch
[+0.08s] DEBUG: XServer 0: Waiting for ready signal from X server :0
[+0.08s] DEBUG: Acquired bus name org.freedesktop.DisplayManager
[+0.08s] DEBUG: Registering seat with bus path /org/freedesktop/DisplayManager/Seat0
[+0.28s] DEBUG: Process 2914 exited with return value 1
[+0.28s] DEBUG: XServer 0: X server stopped
[+0.28s] DEBUG: Releasing VT 7
[+0.28s] DEBUG: XServer 0: Removing X server authority /var/run/lightdm/root/:0
[+0.28s] DEBUG: Seat seat0: Display server stopped
[+0.28s] DEBUG: Seat seat0: Stopping session
[+0.28s] DEBUG: Seat seat0: Session stopped
[+0.28s] DEBUG: Seat seat0: Stopping display server, no sessions require it
[+0.28s] DEBUG: Seat seat0: Active display server stopped, starting greeter
[+0.28s] DEBUG: Seat seat0: Creating greeter session
[+0.29s] DEBUG: Seat seat0: Creating display server of type x
[+0.29s] DEBUG: Using VT 7
[+0.29s] DEBUG: Seat seat0: Starting local X display on VT 7
[+0.29s] DEBUG: XServer 0: Logging to /var/log/lightdm/x-0.log
[+0.29s] DEBUG: XServer 0: Writing X server authority to /var/run/lightdm/root/:0
[+0.29s] DEBUG: XServer 0: Launching X Server
[+0.29s] DEBUG: Launching process 2921: /usr/bin/X -core :0 -seat seat0 -auth /var/run/lightdm/root/:0 -nolisten tcp vt7 -novtswitch
[+0.29s] DEBUG: XServer 0: Waiting for ready signal from X server :0
[+0.42s] DEBUG: Process 2921 exited with return value 1
[+0.42s] DEBUG: XServer 0: X server stopped
[+0.42s] DEBUG: Releasing VT 7
[+0.42s] DEBUG: XServer 0: Removing X server authority /var/run/lightdm/root/:0
[+0.42s] DEBUG: Seat seat0: Display server stopped
[+0.42s] DEBUG: Seat seat0: Stopping session
[+0.42s] DEBUG: Seat seat0: Session stopped
[+0.42s] DEBUG: Seat seat0: Stopping display server, no sessions require it
[+0.42s] DEBUG: Seat seat0: Stopping; greeter display server failed to start
[+0.42s] DEBUG: Seat seat0: Stopping
[+0.42s] DEBUG: Seat seat0: Stopped
[+0.42s] DEBUG: Required seat has stopped
[+0.42s] DEBUG: Stopping display manager
[+0.42s] DEBUG: Display manager stopped
[+0.42s] DEBUG: Stopping daemon
[+0.42s] DEBUG: Exiting with return value 1
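For what it's worth, the debug output does narrow things down a little: both X server launches exit with return value 1, and LightDM says the X server's own log is at /var/log/lightdm/x-0.log. Below is a minimal Python sketch for pulling the failed launches out of a LightDM debug log. The function name and the regular expressions are made up for illustration, and the sample lines are copied from the output in this post; treat it as a rough helper, not a tested tool.

```python
import re

# Patterns for the two kinds of lines we care about in LightDM's debug output.
LAUNCH_RE = re.compile(r"Launching process (\d+): (\S+)")
EXIT_RE = re.compile(r"Process (\d+) exited with return value (\d+)")


def failed_processes(log_text):
    """Return (pid, command, exit_code) for every process that exited non-zero."""
    commands = {}   # pid (as text) -> executable path seen on the launch line
    failures = []
    for line in log_text.splitlines():
        launch = LAUNCH_RE.search(line)
        if launch:
            commands[launch.group(1)] = launch.group(2)
            continue
        exited = EXIT_RE.search(line)
        if exited and int(exited.group(2)) != 0:
            pid = exited.group(1)
            failures.append((int(pid), commands.get(pid, "?"), int(exited.group(2))))
    return failures


# Sample lines taken from the debug output posted in this thread.
SAMPLE = """\
[+0.08s] DEBUG: Launching process 2914: /usr/bin/X -core :0 -seat seat0 -auth /var/run/lightdm/root/:0 -nolisten tcp vt7 -novtswitch
[+0.28s] DEBUG: Process 2914 exited with return value 1
[+0.29s] DEBUG: Launching process 2921: /usr/bin/X -core :0 -seat seat0 -auth /var/run/lightdm/root/:0 -nolisten tcp vt7 -novtswitch
[+0.42s] DEBUG: Process 2921 exited with return value 1
"""

if __name__ == "__main__":
    for pid, command, code in failed_processes(SAMPLE):
        print(f"pid {pid}: {command} exited with {code}")
```

Since it is /usr/bin/X that dies both times, the next place to look would be the X server log rather than LightDM itself.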
AxelFoley Posted March 3, 2019 Author

Looks like somebody did not test the LightDM release properly ...

gbm: failed to open any driver (search paths /usr/lib/aarch64-linux-gnu/dri:${ORIGIN}/dri:/usr/lib/dri)
gbm: Last dlopen error: /usr/lib/dri/rockchip_dri.so: cannot open shared object file: No such file or directory
failed to load driver: rockchip
Couldn't open libEGL.so.1: libEGL.so.1: cannot open shared object file: No such file or directory
AxelFoley Posted March 3, 2019 Author

I just removed the eMMC module and booted from an SD card with a freshly downloaded 4.4.174 Armbian (5.75), and it booted into the desktop OK! There is no mention of rockchip_dri.so in the Xorg log, and the display manager is /usr/sbin/nodm. So now I just need to figure out what caused the original problem after apt-get update & apt-get upgrade took the system from 4.4.167 to 4.4.174. When I reinstalled the desktop from armbian-config, it may have made things worse by installing lightdm. I'd find it surprising if apt-get upgrade had caused the display manager to flip from nodm to lightdm.
AxelFoley Posted March 3, 2019 Author

Right ... I just did the apt-get update and apt-get upgrade and rebooted, and it broke again: it would not boot into the desktop.

[ 36.169577] cdn-dp fec00000.dp: Direct firmware load for rockchip/dptx.bin failed with error -2
[ 36.356615] nr_pdflush_threads exported in /proc is scheduled for removal
[ 68.172054] cdn-dp fec00000.dp: [drm:cdn_dp_request_firmware] *ERROR* Timed out trying to load firmware
[ 78.176684] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 80.176519] rk_gmac-dwmac fe300000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
[ 84.331416] NET: Registered protocol family 17
[ 140.912912] [drm:dw_hdmi_rockchip_set_property] *ERROR* failed to set rockchip hdmi connector property
[ 140.912941] [drm:dw_hdmi_rockchip_set_property] *ERROR* failed to set rockchip hdmi connector property
[ 140.912956] [drm:dw_hdmi_rockchip_set_property] *ERROR* failed to set rockchip hdmi connector property
[ 140.912970] [drm:dw_hdmi_rockchip_set_property] *ERROR* failed to set rockchip hdmi connector property
[ 140.912982] [drm:dw_hdmi_rockchip_set_property] *ERROR* failed to set rockchip hdmi connector property
[ 140.912994] [drm:dw_hdmi_rockchip_set_property] *ERROR* failed to set rockchip hdmi connector property

I had to disconnect and reconnect the HDMI cable in order to get the screen up. One of these packages has broken the RockPro64 display driver:

2019-03-03 14:32:18 status installed libgs9-common:all 9.26~dfsg+0-0ubuntu0.18.04.7
2019-03-03 14:32:18 status installed libisc169:arm64 1:9.11.3+dfsg-1ubuntu1.5
2019-03-03 14:32:18 status installed libnss-systemd:arm64 237-3ubuntu10.13
2019-03-03 14:32:18 status installed libisccc160:arm64 1:9.11.3+dfsg-1ubuntu1.5
2019-03-03 14:32:18 status installed gconf2:arm64 3.2.6-4ubuntu1
2019-03-03 14:32:18 status installed armbian-config:all 5.76
2019-03-03 14:32:19 status installed libssl1.0.0:arm64 1.0.2n-1ubuntu5.3
2019-03-03 14:32:19 status installed gvfs-common:all 1.36.1-0ubuntu1.3
2019-03-03 14:32:19 status installed libnss-myhostname:arm64 237-3ubuntu10.13
2019-03-03 14:32:19 status installed libisc-export169:arm64 1:9.11.3+dfsg-1ubuntu1.5
2019-03-03 14:32:19 status installed systemd-sysv:arm64 237-3ubuntu10.13
2019-03-03 14:32:19 status installed libldb1:arm64 2:1.2.3-1ubuntu0.1
2019-03-03 14:32:19 status installed libgtk-3-common:all 3.22.30-1ubuntu2
2019-03-03 14:32:19 status installed libglib2.0-0:arm64 2.56.3-0ubuntu0.18.04.1
2019-03-03 14:32:22 status installed libgbm1:arm64 18.2.2-0ubuntu1~18.04.2
2019-03-03 14:32:22 status installed libgd3:arm64 2.2.5-4ubuntu0.3
2019-03-03 14:32:22 status installed libnss3:arm64 2:3.35-2ubuntu2.2
2019-03-03 14:32:22 status installed libglapi-mesa:arm64 18.2.2-0ubuntu1~18.04.2
2019-03-03 14:32:22 status installed libpoppler73:arm64 0.62.0-2ubuntu2.7
2019-03-03 14:32:23 status installed libx11-xcb1:arm64 2:1.6.4-3ubuntu0.2
2019-03-03 14:32:23 status installed gtk-update-icon-cache:arm64 3.22.30-1ubuntu2
2019-03-03 14:32:23 status installed libdbusmenu-glib4:arm64 16.04.1+18.04.20171206-0ubuntu2
2019-03-03 14:32:24 status installed libc-bin:arm64 2.27-3ubuntu1
2019-03-03 14:32:31 status installed udev:arm64 237-3ubuntu10.13
2019-03-03 14:32:34 status installed gvfs-libs:arm64 1.36.1-0ubuntu1.3
2019-03-03 14:32:34 status installed libwayland-egl1-mesa:arm64 18.2.2-0ubuntu1~18.04.2
2019-03-03 14:32:35 status installed systemd:arm64 237-3ubuntu10.13
2019-03-03 14:32:35 status installed libdns-export1100:arm64 1:9.11.3+dfsg-1ubuntu1.5
2019-03-03 14:32:40 status installed unattended-upgrades:all 1.1ubuntu1.18.04.9
2019-03-03 14:33:39 status installed man-db:arm64 2.8.3-2ubuntu0.1
2019-03-03 14:33:39 status installed libjavascriptcoregtk-4.0-18:arm64 2.22.6-0ubuntu0.18.04.1
2019-03-03 14:33:41 status installed libegl-mesa0:arm64 18.2.2-0ubuntu1~18.04.2
2019-03-03 14:33:41 status installed dbus:arm64 1.12.2-1ubuntu1
2019-03-03 14:33:41 status installed libpci3:arm64 1:3.5.2-1ubuntu1.1
2019-03-03 14:33:41 status installed libdns1100:arm64 1:9.11.3+dfsg-1ubuntu1.5
2019-03-03 14:33:41 status installed openssh-client:arm64 1:7.6p1-4ubuntu0.2
2019-03-03 14:33:41 status installed libgs9:arm64 9.26~dfsg+0-0ubuntu0.18.04.7
2019-03-03 14:33:41 status installed liblwres160:arm64 1:9.11.3+dfsg-1ubuntu1.5
2019-03-03 14:33:41 status installed libx11-data:all 2:1.6.4-3ubuntu0.2
2019-03-03 14:33:41 status installed libpoppler-glib8:arm64 0.62.0-2ubuntu2.7
2019-03-03 14:33:41 status installed gvfs-daemons:arm64 1.36.1-0ubuntu1.3
2019-03-03 14:33:46 status installed libpam-systemd:arm64 237-3ubuntu10.13
2019-03-03 14:33:46 status installed poppler-utils:arm64 0.62.0-2ubuntu2.7
2019-03-03 14:33:46 status installed libdbusmenu-gtk3-4:arm64 16.04.1+18.04.20171206-0ubuntu2
2019-03-03 14:33:46 status installed libx11-6:arm64 2:1.6.4-3ubuntu0.2
2019-03-03 14:33:46 status installed libgl1-mesa-dri:arm64 18.2.2-0ubuntu1~18.04.2
2019-03-03 14:33:46 status installed ghostscript:arm64 9.26~dfsg+0-0ubuntu0.18.04.7
2019-03-03 14:33:49 status installed libgtk-3-0:arm64 3.22.30-1ubuntu2
2019-03-03 14:33:50 status installed gvfs:arm64 1.36.1-0ubuntu1.3
2019-03-03 14:33:50 status installed libisccfg160:arm64 1:9.11.3+dfsg-1ubuntu1.5
2019-03-03 14:33:50 status installed openssh-sftp-server:arm64 1:7.6p1-4ubuntu0.2
2019-03-03 14:33:50 status installed pciutils:arm64 1:3.5.2-1ubuntu1.1
2019-03-03 14:33:50 status installed libglx-mesa0:arm64 18.2.2-0ubuntu1~18.04.2
2019-03-03 14:33:50 status installed gvfs-backends:arm64 1.36.1-0ubuntu1.3
2019-03-03 14:33:50 status installed gir1.2-gtk-3.0:arm64 3.22.30-1ubuntu2
2019-03-03 14:33:50 status installed libwebkit2gtk-4.0-37:arm64 2.22.6-0ubuntu0.18.04.1
2019-03-03 14:33:54 status installed libirs160:arm64 1:9.11.3+dfsg-1ubuntu1.5
2019-03-03 14:33:57 status installed gvfs-fuse:arm64 1.36.1-0ubuntu1.3
2019-03-03 14:33:57 status installed libbind9-160:arm64 1:9.11.3+dfsg-1ubuntu1.5
2019-03-03 14:34:01 status installed openssh-server:arm64 1:7.6p1-4ubuntu0.2
2019-03-03 14:34:01 status installed libgl1-mesa-glx:arm64 18.2.2-0ubuntu1~18.04.2
2019-03-03 14:34:01 status installed bind9-host:arm64 1:9.11.3+dfsg-1ubuntu1.5
2019-03-03 14:34:01 status installed dnsutils:arm64 1:9.11.3+dfsg-1ubuntu1.5
2019-03-03 14:34:18 status installed initramfs-tools:all 0.130ubuntu3.6
2019-03-03 14:34:22 status installed libc-bin:arm64 2.27-3ubuntu1
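With a package list this long, it may help to filter it down to the entries that touch the graphics stack before guessing at a culprit. The following is a rough Python sketch that parses "status installed" lines in the format shown above and keeps only packages whose names contain a graphics-related keyword. The keyword list and function names are my own assumptions for illustration, not an official classification, and the sample lines are a subset of the log above.

```python
# Substrings that suggest a package belongs to the graphics stack.
# This list is a guess made for illustration; extend or trim it as needed.
GRAPHICS_HINTS = (
    "mesa", "libgbm", "libegl", "libgl1", "libglx", "libglapi",
    "libwayland", "libdrm", "libx11", "xserver", "lightdm", "nodm",
)


def installed_packages(dpkg_log_text):
    """Yield (timestamp, package, version) for each 'status installed' line."""
    for line in dpkg_log_text.splitlines():
        fields = line.split()
        # Expected layout: DATE TIME status installed PACKAGE:ARCH VERSION
        if len(fields) == 6 and fields[2:4] == ["status", "installed"]:
            yield f"{fields[0]} {fields[1]}", fields[4], fields[5]


def graphics_candidates(dpkg_log_text):
    """Return only the installed packages whose name matches a graphics hint."""
    return [
        (stamp, package, version)
        for stamp, package, version in installed_packages(dpkg_log_text)
        if any(hint in package for hint in GRAPHICS_HINTS)
    ]


# A few lines copied from the dpkg log in this post.
SAMPLE = """\
2019-03-03 14:32:18 status installed armbian-config:all 5.76
2019-03-03 14:32:19 status installed libglib2.0-0:arm64 2.56.3-0ubuntu0.18.04.1
2019-03-03 14:32:22 status installed libgbm1:arm64 18.2.2-0ubuntu1~18.04.2
2019-03-03 14:33:41 status installed libegl-mesa0:arm64 18.2.2-0ubuntu1~18.04.2
2019-03-03 14:33:41 status installed openssh-client:arm64 1:7.6p1-4ubuntu0.2
2019-03-03 14:33:46 status installed libgl1-mesa-dri:arm64 18.2.2-0ubuntu1~18.04.2
"""

if __name__ == "__main__":
    for stamp, package, version in graphics_candidates(SAMPLE):
        print(stamp, package, version)
```

On the full log this would leave the Mesa 18.2.2 packages, libgbm1 and the libx11 packages as the obvious candidates to look at first.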
AxelFoley Posted March 3, 2019 Author

This is me done with Armbian ... my cluster has been down for months due to stability issues. I am reformatting all my nodes with a reliable distro.
Igor Posted March 3, 2019

2 hours ago, AxelFoley said: This is me done with Armbian ,,, my cluster has been down for months due to stability issues.

Good luck.

"RK3399 board support is WIP, "for testing only" and problems are expected! If you want to use those boards for some real cases, don't."

P.S. Next time provide logs.
Igor Posted March 3, 2019

I tried to recreate this problem, but failed. I started with the oldest image from the download page, Armbian_5.67_Rockpro64_Ubuntu_bionic_default_4.4.166.img, followed by apt update and upgrade, then a reboot. Then I installed the desktop via armbian-config, rebooted ... and finally updated to the latest beta (changing the repository to beta.armbian.com). Logs with armbianmonitor -u: http://ix.io/1CuM
AxelFoley Posted March 4, 2019 Author

OK, I'll post detailed steps to recreate when I get home tonight. It knocked out the desktops on all 10 of my RockPro64 cluster nodes when I did a salt apt update & upgrade. However, I only use the desktop on the master. I recreated the issue on the master from a rebuild on a fresh SSD, not the original eMMC (I have a new one on order so I can test 100% like for like). I'll post the precise images I was using. It could just be a compatibility issue with the HP monitor; I don't have a spare to test.

Also note: there was a red herring about lightdm. The display manager was nodm; lightdm was installed after the issue appeared, by me, using armbian-config to try to fix the issue by reinstalling the desktop. In my fresh install the issue can be fixed by disconnecting the monitor after the upgrade and reconnecting it again, with nodm in place like I originally had.

FYI, I tried some other distros and they were a lot slower than Armbian on the desktop. My main reason for moving away from Armbian would be my need to get the PCIe NVMe drive working on the master. Out of all the distros I prefer Armbian.
AxelFoley Posted March 4, 2019 Author

Original image: Armbian_5.67_Rockpro64_Ubuntu_bionic_default_4.4.166_desktop on a 64GB eMMC module. The OS was "apt updated" on 28th Jan. When it was "apt updated" on 2nd March, the OS would not boot to the desktop, as the screen went blank.

How I recreated the problem:

Same image: Armbian_5.67 on a 64GB Sandisk Ultra SD XC 1, using the Pine Installer to flash.
Boot RockPro64 on SSD: screen and monitor working as expected ... see boot messages and Armbian file system resize (6 min).
Create new user and password.
Boots to Armbian desktop OK.
Console as root: apt-get update, apt-get upgrade.
Reboot: the server comes up and is available to ssh into. However, while the display manager tries to boot into the desktop, it hangs on a black screen with "_" in the top left corner.

Now I am trying to recreate the problem again ... but this time I notice that the following packages are now "held back" when they were previously installed. This means that I probably did an "apt update" & "apt upgrade" the first time around.

libgl1-mesa-dri
libgtk-3-0
libwayland-egl1-mesa
netplan.io

I am waiting for the upgrade to complete to see if I can recreate the problem a second time. I will post when the upgrade is completed and attach the relevant log files.
AxelFoley Posted March 4, 2019 Author

Interesting ... after init 6 the device stopped pinging but did not reboot. I had to do a physical reset. Then it booted successfully into the desktop. I did the "apt update" & "apt upgrade" to install the remaining 4 packages; again the device stopped pinging but did not reboot, and I had to do a physical reset to get the device to boot. But the device now boots into the desktop OK ... I now cannot recreate the problem I could at the weekend!

So this time I repeat the process above again from scratch, but this time I do an "apt update" & "apt upgrade" on the successful first boot of a newly flashed image instead of an "apt-get update" and "apt-get upgrade". However, the device will not boot the newly flashed SD card until I physically pull the power cable out and put it in again, i.e. the power-off and reset buttons do not cause the device to restart and boot from the newly flashed SD card.

This is not a bodged cluster project ... it's a Pico Cluster 10! Having said that, I have been suspicious of their power supply quality; it is abysmal. I had to re-crimp and re-cable their shoddy work. One theory was that the power draw when the GPU kicks in was causing instability. However, I thought I had mitigated the risk that this is all a hardware power issue by testing the master RockPro64 with a dedicated external power supply, and I was still able to recreate the problem. But this could have been "after" I broke the installation with armbian-config, which replaced nodm with lightdm and totally screwed the boot image.

Anyhow, I'm now upgrading with "apt update" & "apt upgrade" in one go, including those 4 held-back packages ... let's see how this goes. If it still works after this, maybe this has been a power issue all along. I only replaced 50% of Pico Cluster's shoddy cabling work (they are using automotive crimp fork connectors and trying to shove up to 3-4 AWG 22 cables into an AWG 14-16 fork connector, the morons). I have noticed one of the devices constantly rebooting, and the inconsistency of the master's behaviour ...
AxelFoley Posted March 4, 2019 Author

Well, I am stumped ... I cannot recreate the problem even after an "apt update" & "apt upgrade", but I did this twice on Saturday and it knocked out all 10 of my devices! I am not a moron ... I do my due diligence ... I was able to recreate it (admittedly on an SD card, not an eMMC, and just on my master). This means that either the upgrade packages have been changed since Saturday or it's a spurious temperature/power-related issue. I have not been happy with these RockPro64 devices' inconsistent behaviour on power cycle, and I bought them all in one batch. I have had issues before with vendors not testing kit in hotter/colder climates (Mellanox being one), so I'd think it was power before device defect, although I still have one unit constantly power cycling. I will swap out the power supply next: if the Pico Cluster guys cannot even get their cabling correct, how can I trust their power supply choice! Thanks for your help, and I apologise if I have wasted your valuable time.

Xorg.0.log.before dpkg.log.before
AxelFoley Posted March 5, 2019 Author

Right, new update ... I cannot even get the RockPro64 cluster master to boot to console with the original eMMC module, where previously I could at least boot to console. It could just be a dodgy eMMC chip ... but my replacement eMMC has not turned up, and it still does not explain why I could recreate the issue on SSD on Saturday or why it barfed a lot of my 10 nodes! This is looking more and more like a hardware power issue down to Pico being incompetent. The next best explanation is a fucked-up board firmware issue from Pine. It has to be related to power draw on GPU (Mali) initialization or a barfed eMMC unit ... but then why did all my 10 units choke??? It has to be power ... Pine board designers cannot be complete morons!
AxelFoley Posted March 9, 2019 Author

Hi all, my sincere apologies ... after several late, stressful nights, the more I looked into this the more suspicious I became that the issue was power. I bought the cluster from PICO and their cabling loom was appalling. Since I last posted I have completely re-cabled the wiring loom (see attached; I will redo it again when my order of heat shrink crimp connectors turns up, as the normal heat shrink will not be that reliable). PICO had tried to force up to 4 x AWG 22 cables into an AWG 14-16 fork crimp connector. They then tried to force 2 x forks onto a power supply screw output terminal ... not great when you are already forcing 4 wires into a crimp terminal designed for 1! It's at this point I noticed the power supply they sent with a "High Powered" RockPro64 10H cluster unit ... 5V 18A! The specs of the RockPro64 state 12V 3A minimum. I think they sent me a power supply for a Raspberry Pi 10 x cluster, not a RockPro64! I have logged a support ticket with PICO, but until I have reliable power, all these issues can be explained by PICO Cluster not doing their jobs. This may also explain why I cannot get the NVMe drive working. I will update accordingly.
pfry Posted March 10, 2019

Wow, that worked at all? Impressive. That the fixed buck converters would run at 5V and 240% current. Not to mention your poor 90W PS -- I'd expect it to squeal like a pig. Hm... looking at the PicoCluster site, they list "Power: Internal (100W)", which sounds a bit low -- with default clocks, I've seen measurements (in board reviews) of 1.1A or so under load (>13W), and lacking any other data, I'd call that (say, 140W for 10) a starting point. If you want 2A available at each of 10 boards, you're looking at 250W. Decent modern power supplies should be stable at low loads, and you should be able to get >90% efficiency at 25-100% load (assuming you want to pay for it -- your MeanWell LRS is good for about 86%).
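The ballpark figures in this post follow from simple P = V x I arithmetic. Here is a small Python sketch of that arithmetic. It assumes the 1.1 A review measurement was taken at the nominal 12 V input (consistent with the ">13W" quoted), and it treats the on-board converters as ideal, ignoring losses, so the results are lower bounds rather than a sizing recommendation. The constant and function names are made up for illustration.

```python
BOARD_INPUT_V = 12.0    # nominal RockPro64 input voltage
SUPPLIED_PSU_V = 5.0    # voltage of the supply that shipped with the cluster
SUPPLIED_PSU_A = 18.0   # its rated current
NUM_BOARDS = 10


def power_w(volts, amps):
    """Electrical power in watts for a given voltage and current."""
    return volts * amps


def current_scale(nominal_v, actual_v):
    """How much the input current grows when the same power is drawn at a lower voltage."""
    return nominal_v / actual_v


# Capacity of the 5 V 18 A supply: the "poor 90W PS".
psu_capacity_w = power_w(SUPPLIED_PSU_V, SUPPLIED_PSU_A)

# Same power at 5 V instead of 12 V means 2.4x the current, i.e. 240%.
current_factor = current_scale(BOARD_INPUT_V, SUPPLIED_PSU_V)

# 1.1 A per board under load (assumed measured at 12 V), times 10 boards.
measured_total_w = NUM_BOARDS * power_w(BOARD_INPUT_V, 1.1)

# 2 A of headroom per board at 12 V, times 10 boards.
headroom_total_w = NUM_BOARDS * power_w(BOARD_INPUT_V, 2.0)

if __name__ == "__main__":
    print(f"supplied PSU capacity: {psu_capacity_w:.0f} W")
    print(f"input current factor at 5 V: {current_factor:.0%}")
    print(f"10 boards at 1.1 A: {measured_total_w:.0f} W")
    print(f"10 boards at 2 A:   {headroom_total_w:.0f} W")
```

The 132 W and 240 W results line up with the rounded 140 W and 250 W quoted above once a little margin is added for conversion losses, switches and fans.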
AxelFoley Posted March 11, 2019 Author

Thanks for your help ... Good news: I sorted out the issues with PicoCluster; they are sending me a 100W 12V unit with a step-down converter (to feed the fans) free of charge. At least then I can get the PCIe NVMe working ... I was thinking it was the image all the time, when in fact it was the power. 100W is still on the low side, as it also feeds the 2 switches, so you are looking at roughly 0.8A per device at peak (100W / 12V shared across 10 boards) :-0 But it will at least get me developing/testing, and I can keep an eye out when I start to load the platform. I will also get my scope & multimeters out and monitor the current draw on each 12V rail on the old and new PSU.
AxelFoley Posted March 16, 2019 Author

Right, PicoCluster kindly sent me a free replacement power supply rated at 12V. I checked the power circuitry doc [pfry] posted, and it does say 12V input, but all that happens is that it has 2 x buck converters to step the voltage down to 2 x 5.1V and 1 x 5V rails. The buck converters are SY8113-B and can operate from 4.5-18V delivering 3A of current, which explains why the 5V PSU was working. It also turned out that the one cable from PicoCluster I did not re-crimp had a loose connection that was causing power loss if the wiring loom was moved. So I do not know if it was the wiring loom failing to deliver enough current, or if it was the PSU not being able to drive the buck converters to deliver enough current to the USB 5.1V rail. At 5V those buck converters are 96% efficient, so it's hard to think that the difference between 5.1V and 5.0V would be significant enough to stop the unit booting reliably with a USB keyboard and mouse in place. So my hunch was that the root cause was a dodgy wiring loom causing current drop-off under boot load. I rewired the entire loom again, this time with heat shrink crimp connectors to hold the cables firmly in place, and soldered the wires before crimping (I know it's bad practice). Now everything boots perfectly! The 10 x RockPro64 cluster case + 2 x switches + 2 fans draws a peak of 60 watts on boot and 30 watts running unloaded.

It also helped me find the real cause of why the master node did not boot into the desktop ... it looks like the library libEGL.so.1 was missing, preventing nodm from starting! I had to force reinstall the libegl1 package to get the libs back, and it now boots into the desktop fine ... so somehow an apt-get upgrade must have crashed! I still have two nodes power cycling, but at least I can check them out now knowing that it's not the power supply/cabling, and I can also focus on getting the NVMe drive working :-) Happy days!
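A quick way to check for this kind of breakage on the other nodes, without waiting for the desktop to fail, is to ask the dynamic loader whether the library can actually be opened. The snippet below is a minimal Python sketch using the standard ctypes module; the function name is hypothetical, and libEGL.so.1 is simply the file named in the earlier error message.

```python
import ctypes
import ctypes.util


def library_status(soname, short_name):
    """Report whether a shared library can be located and actually loaded."""
    # find_library returns a file name such as 'libEGL.so.1', or None if not found.
    located = ctypes.util.find_library(short_name)
    try:
        # CDLL raises OSError when the loader cannot open the library.
        ctypes.CDLL(soname)
        loadable = True
    except OSError:
        loadable = False
    return {"soname": soname, "located_as": located, "loadable": loadable}


if __name__ == "__main__":
    # The library the X server complained about in this thread.
    print(library_status("libEGL.so.1", "EGL"))
```

If "loadable" comes back False on a node, reinstalling the package that owns the library (libegl1 in the post above) would be the first thing to try.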
pfry Posted March 17, 2019

Not to nitpick, but to nitpick... e.g. the NanoPC-T4 uses a 12V input, with an NB679 regulator for 5V (USB and audio) and an NB680 for 3.3V (basically everything else). You're right, though -- most of the regulators are rated down near or below 5V for input. The RK3399 reference design uses the RK808 plus a bunch of additional regulators (for the big array, GPU, etc.), all on that 3.3V supply. The NanoPC-T4 can consume 30W fairly comfortably (2 USB 3 + 2 USB 2 = 15W alone, plus SOC, plus M.2, etc.); the RockPro64 has a PCI-e slot, which should provide 12V power over and above the 3.3V (the NanoPC-T4 only has an M.2 slot -- no 12V). You're not likely to plug your 10 boards full of USB HDDs/DVDs and 25W PCI-e cards, but I still think 100W is low. Heh -- I have fans that draw more power (not that I run them at that level -- they frighten me, in addition to deafening me). I'm curious what kind of draw you'll see under a good compute load.

Soldering insulated crimp connectors without making a burned, blobby mess is kinda nifty. I always cut the insulators off and go heat-shrink happy. And while there are edge cases where solder can contribute to failures, I prefer to risk those to solve others. (At least when dealing with itty-bitty computer stuff. I can swing like an ape all day on a #2 welding cable with a crimped lug.)