Does anyone actually have a stable system?


tommitytom

Just a heads up that I reinstalled my Helios64 with the latest Armbian Buster (currently running from SD) and it has been running solid as a rock for 7 days. I'm no longer using OMV and I don't really miss it.

On 3/25/2021 at 4:57 AM, gprovost said:

@SIGSEGV During boot up, the first messages output on the serial console will show whether it's U-Boot TPL/SPL or the Rockchip blob.

 

This is the output with U-boot TPL/SPL

 



U-Boot TPL 2020.10-armbian (Mar 14 2021 - 07:07:37)
Channel 0: LPDDR4, 50MHz
BW=32 Col=10 Bk=8 CS0 Row=16/15 CS=1 Die BW=16 Size=2048MB
Channel 1: LPDDR4, 50MHz
BW=32 Col=10 Bk=8 CS0 Row=16/15 CS=1 Die BW=16 Size=2048MB
256B stride
lpddr4_set_rate: change freq to 400000000 mhz 0, 1
lpddr4_set_rate: change freq to 800000000 mhz 1, 0
Trying to boot from BOOTROM
Returning to boot ROM...

U-Boot SPL 2020.10-armbian (Mar 14 2021 - 07:07:37 +0700)
Trying to boot from MMC2
NOTICE:  BL31: v2.2(release):a04808c-dirty
NOTICE:  BL31: Built : 07:07:20, Mar 14 2021

 

 

This is the output with Rockchip blob

 

 

 

@gprovost I took a quick look at your previous reply in the thread. It looks like I have the Rockchip blob (line "DDR Version 1.24 20191016"), since I haven't updated since 2020.07 (Linux kernel 5.9.x) and my initial upgrades to kernel 5.10.x had issues. Is there a way to update U-Boot without losing the rest of the system? Perhaps that was my issue, and because I hadn't rebooted in a while I avoided that scenario?
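For anyone who wants to check a saved boot log the same way, here is a minimal sketch. It assumes the serial output was captured to a file; the function name `detect_loader` is made up for illustration, and it only looks for the two banner strings that appear in this thread.

```shell
#!/bin/sh
# detect_loader is a hypothetical helper, not part of Armbian.
# $1: path to a saved serial console log.
detect_loader() {
    # Take the first loader banner that appears anywhere in the log.
    banner=$(grep -oE 'U-Boot TPL|DDR Version' "$1" | head -n 1)
    case "$banner" in
        "U-Boot TPL")  echo "u-boot-tpl" ;;
        "DDR Version") echo "rockchip-blob" ;;
        *)             echo "unknown"; return 1 ;;
    esac
}
```

Calling `detect_loader serial.log` (file name assumed) prints which of the two banners came first in the capture.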

 

Modified the armbianEnv.txt on a spare Linux machine. Output below:


picocom v2.2

 

port is        : /dev/ttyUSB0

flowcontrol    : none

baudrate is    : 1500000

parity is      : none

databits are   : 8

stopbits are   : 1

escape is      : C-a

local echo is  : no

noinit is      : no

noreset is     : no

nolock is      : no

send_cmd is    : sz -vv

receive_cmd is : rz -vv -E

imap is        : 

omap is        : 

emap is        : crcrlf,delbs,

 

Type [C-a] [C-h] to see available commands

 

Terminal ready

DDR Version 1.24 20191016

In

channel 0

CS = 0

MR0=0x18

MR4=0x1

MR5=0x1

MR8=0x10

MR12=0x72

MR14=0x72

MR18=0x0

MR19=0x0

MR24=0x8

MR25=0x0

channel 1

CS = 0

MR0=0x18

MR4=0x1

MR5=0x1

MR8=0x10

MR12=0x72

MR14=0x72

MR18=0x0

MR19=0x0

MR24=0x8

MR25=0x0

channel 0 training pass!

channel 1 training pass!

change freq to 416MHz 0,1

Channel 0: LPDDR4,416MHz

Bus Width=32 Col=10 Bank=8 Row=16 CS=1 Die Bus-Width=16 Size=2048MB

Channel 1: LPDDR4,416MHz

Bus Width=32 Col=10 Bank=8 Row=16 CS=1 Die Bus-Width=16 Size=2048MB

256B stride

channel 0

CS = 0

MR0=0x18

MR4=0x1

MR5=0x1

MR8=0x10

MR12=0x72

MR14=0x72

MR18=0x0

MR19=0x0

MR24=0x8

MR25=0x0

channel 1

CS = 0

MR0=0x18

MR4=0x1

MR5=0x1

MR8=0x10

MR12=0x72

MR14=0x72

MR18=0x0

MR19=0x0

MR24=0x8

MR25=0x0

channel 0 training pass!

channel 1 training pass!

channel 0, cs 0, advanced training done

channel 1, cs 0, advanced training done

change freq to 856MHz 1,0

ch 0 ddrconfig = 0x101, ddrsize = 0x40

ch 1 ddrconfig = 0x101, ddrsize = 0x40

pmugrf_os_reg[2] = 0x32C1F2C1, stride = 0xD

ddr_set_rate to 328MHZ

ddr_set_rate to 666MHZ

ddr_set_rate to 928MHZ

channel 0, cs 0, advanced training done

channel 1, cs 0, advanced training done

ddr_set_rate to 416MHZ, ctl_index 0

ddr_set_rate to 856MHZ, ctl_index 1

support 416 856 328 666 928 MHz, current 856MHz

OUT

Boot1: 2019-03-14, version: 1.19

CPUId = 0x0

ChipType = 0x10, 254

SdmmcInit=2 0

BootCapSize=100000

UserCapSize=14910MB

FwPartOffset=2000 , 100000

mmc0:cmd5,20

SdmmcInit=0 0

BootCapSize=0

UserCapSize=121942MB

FwPartOffset=2000 , 0

StorageInit ok = 77912

SecureMode = 0

SecureInit read PBA: 0x4

SecureInit read PBA: 0x404

SecureInit read PBA: 0x804

SecureInit read PBA: 0xc04

SecureInit read PBA: 0x1004

SecureInit read PBA: 0x1404

SecureInit read PBA: 0x1804

SecureInit read PBA: 0x1c04

SecureInit ret = 0, SecureMode = 0

atags_set_bootdev: ret:(0)

GPT 0x3380ec0 signature is wrong

recovery gpt...

GPT 0x3380ec0 signature is wrong

recovery gpt fail!

LoadTrust Addr:0x4000

No find bl30.bin

No find bl32.bin

Load uboot, ReadLba = 2000

Load OK, addr=0x200000, size=0xdd6b0

RunBL31 0x40000

NOTICE:  BL31: v1.3(debug):42583b6

NOTICE:  BL31: Built : 07:55:13, Oct 15 2019

NOTICE:  BL31: Rockchip release version: v1.1

INFO:    GICv3 with legacy support detected. ARM GICV3 driver initialized in EL3

INFO:    Using opteed sec cpu_context!

INFO:    boot cpu mask: 0

INFO:    plat_rockchip_pmu_init(1190): pd status 3e

INFO:    BL31: Initializing runtime services

WARNING: No OPTEE provided by BL2 boot loader, Booting device without OPTEE initialization. SMC`s destined for OPTEE will return SMC_UNK

ERROR:   Error initializing runtime service opteed_fast

INFO:    BL31: Preparing for EL3 exit to normal world

INFO:    Entry point address = 0x200000

INFO:    SPSR = 0x3c9

 

 

U-Boot 2020.07-armbian (Dec 11 2020 - 22:44:41 +0100)

 

SoC: Rockchip rk3399

Reset cause: POR

DRAM:  3.9 GiB

PMIC:  RK808 

SF: Detected w25q128 with page size 256 Bytes, erase size 4 KiB, total 16 MiB

MMC:   mmc@fe320000: 1, sdhci@fe330000: 0

Loading Environment from MMC... *** Warning - bad CRC, using default environment

 

In:    serial

Out:   serial

Err:   serial

Model: Helios64

Revision: 1.2 - 4GB non ECC

Net:   eth0: ethernet@fe300000

scanning bus for devices...

Hit any key to stop autoboot:  0 

switch to partitions #0, OK

mmc1 is current device

Scanning mmc 1:1...

Found U-Boot script /boot/boot.scr

3185 bytes read in 6 ms (517.6 KiB/s)

## Executing script at 00500000

Boot script loaded from mmc 1

235 bytes read in 5 ms (45.9 KiB/s)

9809293 bytes read in 434 ms (21.6 MiB/s)

22460424 bytes read in 954 ms (22.5 MiB/s)

81696 bytes read in 14 ms (5.6 MiB/s)

2698 bytes read in 8 ms (329.1 KiB/s)

Applying kernel provided DT fixup script (rockchip-fixup.scr)

## Executing script at 09000000

## Loading init Ramdisk from Legacy Image at 06000000 ...

   Image Name:   uInitrd

   Image Type:   AArch64 Linux RAMDisk Image (gzip compressed)

   Data Size:    9809229 Bytes = 9.4 MiB

   Load Address: 00000000

   Entry Point:  00000000

   Verifying Checksum ... OK

## Flattened Device Tree blob at 01f00000

   Booting using the fdt blob at 0x1f00000

   Loading Ramdisk to f558b000, end f5ee5d4d ... OK

   Loading Device Tree to 00000000f550e000, end 00000000f558afff ... OK

 

Starting kernel ...

 

Edited by hartraft
Added more detail

@hartraft Yes, you could try updating U-Boot on the microSD card using your spare Linux computer. (Note: this can corrupt your SD card if done incorrectly.)

You will need to mount the microSD card again.

 

cd <sdcard-mount>/usr/lib/linux-u-boot-current-helios64*

# seek= counts 512-byte blocks (dd's default block size)
dd if=idbloader.bin of=<sd-card device> seek=64 conv=notrunc
dd if=uboot.img of=<sd-card device> seek=16384 conv=notrunc
dd if=trust.bin of=<sd-card device> seek=24576 conv=notrunc

 

where <sd-card device> is something like /dev/mmcblk1
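
Since a wrong device name here can overwrite the wrong disk, a dry run may help. The following is only a sketch: `print_flash_cmds` is a hypothetical helper that prints the three dd commands from the post above for review and never writes to any device. The directory and device passed in are assumptions you would need to confirm yourself (for example with lsblk).

```shell
#!/bin/sh
# print_flash_cmds is a hypothetical dry-run helper: it only prints the dd
# commands and does not touch any device.
# $1: directory holding idbloader.bin, uboot.img and trust.bin
# $2: target device, e.g. /dev/mmcblk1 (assumption: verify it first)
print_flash_cmds() {
    dir=$1
    dev=$2
    # image:sector pairs. With dd's default 512-byte blocks these sectors
    # correspond to byte offsets of 32 KiB, 8 MiB and 12 MiB.
    for pair in idbloader.bin:64 uboot.img:16384 trust.bin:24576; do
        img=${pair%%:*}
        sector=${pair##*:}
        echo "dd if=$dir/$img of=$dev seek=$sector conv=notrunc"
    done
}
```

Review the printed lines, and only then run them by hand as root.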


On 5/9/2021 at 5:54 AM, FloBaoti said:

For people running fine, are you using the 2.5G NIC?

I do, and this interface crashes every few days. I set up a cron job to check it every minute and bring it down and back up if necessary.

 

I'm using both the 1G and 2.5G NICs. Before my recent reinstall, using OMV, the 2.5G NIC would constantly drop, and after a while it stopped working completely. After reinstalling without OMV it has been solid. I'm not sure whether OMV was causing the issue, but it's the largest difference between my two installs.
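
For reference, the cron workaround quoted above could look roughly like the sketch below. Everything specific in it is an assumption: the interface name eth1, the script path, and using the kernel's operstate as the health check (a hung NIC may still report "up", in which case a ping test would be a better check).

```shell
#!/bin/sh
# Hypothetical NIC watchdog sketch. The interface name and the use of
# operstate as the health check are assumptions, not details from the thread.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Print the kernel's view of the link ("up", "down", ...), or "missing".
nic_state() {
    cat "$SYSFS_NET/$1/operstate" 2>/dev/null || echo "missing"
}

# Succeed when the interface is not reported as up.
needs_reset() {
    [ "$(nic_state "$1")" != "up" ]
}

# Bring the interface down and back up again (needs root).
reset_nic() {
    ip link set dev "$1" down
    sleep 2
    ip link set dev "$1" up
}

# Example crontab entry (every minute), assuming this file is saved as
# /usr/local/bin/nic-watchdog.sh and ends with the line:
#   needs_reset eth1 && reset_nic eth1
# * * * * * /usr/local/bin/nic-watchdog.sh
```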


On 5/8/2021 at 9:54 PM, FloBaoti said:

For people running fine, are you using the 2.5G NIC?

I do, and this interface crashes every few days. I set up a cron job to check it every minute and bring it down and back up if necessary.

I'm using the 2.5Gbps NIC exclusively with a 2.5Gbps switch for a month with no issues so far.


This is what my uptime looks like. OMV5, plex and r/rutorrent with 5 12TB WDC disks (LVM).
None of these reboots were triggered by me...
I wish I could find where the problem lies. I can't read the logs for debugging because the reboot is so abrupt that nothing gets written to disk.

armbian.PNG


3 hours ago, barnumbirr said:

I wish I could find where the problem lies. I can't read the logs for debugging because the reboot is so abrupt that nothing gets written to disk.

Can't you connect a logger to serial out?
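
One rough way to do that from a second Linux machine is sketched below. `stamp_lines` is a hypothetical helper that prefixes each received line with the current UTC time, so the last line before a crash is dated. The device and baud rate are the ones from the picocom output earlier in the thread; the log file name is an assumption.

```shell
#!/bin/sh
# stamp_lines is a hypothetical helper: it prefixes every line read from
# stdin with the current UTC time.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$line"
    done
}

# Usage on the logging machine (run by hand, not executed here):
#   stty -F /dev/ttyUSB0 1500000 raw -echo
#   stamp_lines < /dev/ttyUSB0 >> helios64-serial.log
```

Because the log is written on the second machine, it survives even when the Helios64 resets before flushing anything to its own disks.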


I've had a secondary device connected to my Helios64 over serial for the last couple of months. Unfortunately, even at verbosity 7 it dies so suddenly that the output doesn't actually provide any valuable information:

 

Starting kernel ...

[    2.721938] cacheinfo: Unable to detect cache hierarchy for CPU 0
[    2.881028] vcc3v3_sys_s0: failed to get the current voltage: -EPROBE_DEFER
[    2.900490] dw_wdt ff848000.watchdog: No valid TOPs array specified
[    3.012992] dwmmc_rockchip fe320000.mmc: All phases bad!
[    3.013479] mmc1: tuning execution failed: -5
[    3.013881] mmc1: error -5 whilst initialising SD card
[    3.135010] dwmmc_rockchip fe320000.mmc: All phases bad!
[    3.135502] mmc1: tuning execution failed: -5
[   12.521811] rk_gmac-dwmac fe300000.ethernet: cannot get clock clk_mac_speed
[   15.212309] dw-apb-uart ff1a0000.serial: forbid DMA for kernel console
[   17.035708] lm75 2-004c: supply vs not found, using dummy regulator
[   18.572536] rk_gmac-dwmac fe300000.ethernet eth0: PTP not supported by HW
[   18.817621] OF: graph: no port node found in /i2c@ff3d0000/typec-portc@22
[   18.833386] OF: graph: no port node found in /syscon@ff770000/usb2-phy@e450/otg-port
[   19.262970] [drm] unsupported AFBC format[3231564e]
[   19.366283] rockchip_vdec: module is from the staging directory, the quality is unknown, you have been warned.
[   19.410603] r8152 4-1.4:1.0 (unnamed net_device) (uninitialized): netif_napi_add() called with weight 256
[   19.421732] hantro_vpu: module is from the staging directory, the quality is unknown, you have been warned.
[   24.903650] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2

helios64 login: DDR Version 1.24 20191016
In
soft reset
SRX
channel 0
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72

This is what my uptime looks like over the last 90 days. Again, none of the reboots were triggered by me.

[Image: Grafana hosts screenshot, 2021-08-23]

 

Reboots wouldn't be such a pain in the backside on their own if 90% of them didn't retrigger an mdadm RAID resync...


Finally managed to catch something. Had to reset the device after this:
 

[150372.308197] Unable to handle kernel paging request at virtual address ffff0001f77bd7bf
[150372.308900] Mem abort info:
[150372.309153]   ESR = 0x96000005
[150372.309431]   EC = 0x25: DABT (current EL), IL = 32 bits
[150372.309903]   SET = 0, FnV = 0
[150372.310178]   EA = 0, S1PTW = 0
[150372.310461] Data abort info:
[150372.310720]   ISV = 0, ISS = 0x00000005
[150372.311063]   CM = 0, WnR = 0
[150372.311333] swapper pgtable: 4k pages, 48-bit VAs, pgdp=000000000366b000
[150372.311925] [ffff0001f77bd7bf] pgd=00000000f7ff9003, p4d=00000000f7ff9003, pud=0000000000000000
[150372.312697] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[150372.313192] Modules linked in: softdog governor_performance cfg80211 rfkill r8152 snd_soc_hdmi_codec snd_soc_rockchip_i2s snd_soc_core snd_pcm_dmaengine snd_pcm rockchip_rga hantro_vpu(C) leds_pwm rockchip_vdec(C) snd_timer fusb302 videobuf2_dma_sg videobuf2_vmalloc tcpm snd gpio_charger v4l2_h264 panfrost videobuf2_dma_contig typec v4l2_mem2mem rockchipdrm soundcore videobuf2_memops videobuf2_v4l2 videobuf2_common gpu_sched dw_mipi_dsi videodev dw_hdmi mc analogix_dp drm_kms_helper cec sg rc_core drm drm_panel_orientation_quirks cpufreq_dt gpio_beeper ledtrig_netdev lm75 dm_mod sunrpc ip_tables x_tables autofs4 raid10 raid1 raid0 multipath linear raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx realtek md_mod dwmac_rk stmmac_platform stmmac pcs_xpcs adc_keys pwm_fan
[150372.319348] CPU: 5 PID: 2769 Comm: rtorrent main Tainted: G         C        5.10.60-rockchip64 #21.08.1
[150372.320178] Hardware name: Helios64 (DT)
[150372.320531] pstate: 20000085 (nzCv daIf -PAN -UAO -TCO BTYPE=--)
[150372.321074] pc : __mod_zone_page_state+0x50/0x108
[150372.321494] lr : __mod_zone_page_state+0x3c/0x108
[150372.321914] sp : ffff8000167db4f0
[150372.322212] x29: ffff8000167db4f0 x28: 0000000000000001
[150372.322687] x27: 0000000000000020 x26: ffff0000f77d9100
[150372.323160] x25: 0000000000000001 x24: ffff0000f77d9e00
[150372.323634] x23: 00000000ffff8000 x22: ffff8001115757bf

 

Now that the Kobol team has pulled the plug, I doubt these issues will ever get fixed.


4 hours ago, barnumbirr said:

Finally managed to catch something.

 

You are using kernel 5.10.60 (Armbian 21.08.1). Several Armbian patches did not compile with this version of the kernel, so it is unstable (see the parallel thread on upgrading to Bullseye). The kernel panic occurred after 150372 seconds = 41.77 hours of operation!
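
The uptime figure comes straight from the bracketed kernel timestamp, which counts seconds since boot. As a small sketch (the function name `kmsg_hours` is made up), the conversion can be done like this:

```shell
#!/bin/sh
# kmsg_hours is a hypothetical helper.
# $1: one kernel log line such as "[150372.308197] Unable to handle ..."
# Prints the whole-second part of the timestamp converted to hours.
kmsg_hours() {
    secs=$(printf '%s\n' "$1" | sed -n 's/^\[ *\([0-9][0-9]*\)\.[0-9]*\].*/\1/p')
    [ -n "$secs" ] || return 1
    awk -v s="$secs" 'BEGIN { printf "%.2f\n", s / 3600 }'
}

kmsg_hours "[150372.308197] Unable to handle kernel paging request"   # prints 41.77
```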

 


41 minutes ago, ebin-dev said:

 

You are using kernel 5.10.60 (Armbian 21.08.1). Several Armbian patches did not compile with this version of the kernel, so it is unstable (see the parallel thread on upgrading to Bullseye). The kernel panic occurred after 150372 seconds = 41.77 hours of operation!

 

Ahh, so it's essentially a ticking time bomb? I pushed the bootloader from an SD card and got my system back up yesterday.


13 hours ago, IcerJo said:

Ahh, so it's essentially a ticking time bomb? I pushed the bootloader from an SD card and got my system back up yesterday.

Can you explain how you got your system back?
Unfortunately, I also updated to 21.08.01 tonight and urgently need a downgrade to get the system back.
Would be very grateful to you!


TDCroPower said:

Can you explain how you got your system back?
Unfortunately, I also updated to 21.08.01 tonight and urgently need a downgrade to get the system back.
Would be very grateful to you!

If you can SSH in, or boot off an SD card with the latest image, you can reinstall the bootloader. That at least gets it up and running. For me that means I can get in, see my drives and write to them. I fear the eMMC is still somewhat locked down, but it did let me turn SSH back on.


5 hours ago, TDCroPower said:

Can you explain how you got your system back?

 

One possibility is discussed in the parallel thread.

 

You could also boot a fresh Armbian 21.05.4 from SD and use it to rsync the contents of the eMMC to another bootable SD card. Then boot from that second SD card, downgrade Linux on it, and rsync the result back to the eMMC.
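
The copy step might look roughly like the sketch below. It is not a tested recovery procedure: `clone_root` is a hypothetical helper, both mount points are assumptions, and it presumes the source and target root filesystems are already mounted. A full copy also brings over files such as /etc/fstab and /boot/armbianEnv.txt, which likely refer to the source's root device and would need checking before booting from the copy.

```shell
#!/bin/sh
# clone_root is a hypothetical helper; both mount points are assumptions.
# $1: mounted source root (e.g. the eMMC root mounted at /mnt/emmc)
# $2: mounted target root (e.g. the second SD card mounted at /mnt/sd2)
clone_root() {
    src=$1
    dst=$2
    if [ ! -d "$src" ] || [ ! -d "$dst" ]; then
        echo "both paths must be existing directories" >&2
        return 1
    fi
    # -a preserves permissions, owners and times; -H keeps hard links;
    # -x stays on one filesystem; --numeric-ids avoids remapping owners.
    rsync -aHx --numeric-ids "$src"/ "$dst"/
}
```

For example, `clone_root /mnt/emmc /mnt/sd2` run as root would copy the mounted eMMC tree onto the mounted second card.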

 

Maybe somebody else could explain how to downgrade the kernel on the eMMC using a chrooted environment.
