wurmfood

July 31, 2021

Sorry, the line about disabling was just to make sure you disable the armbian-zram-config service by setting ENABLED to false.

As a warning, though, I found some problems with this if you log to a ZFS share. It seems you have to make sure ZFS gets loaded before logging starts up; otherwise you can get an occasional kernel panic. I didn't really dig into how to fix this, so I now just log and swap to a USB drive instead.

July 23, 2021

That certainly could be added. I think the assumption was that as long as people were following the directions it would be fine. Otherwise, since the rule is that air exhausts on the side of the fan with the motor, it's clear from the picture that air would be pushed out the back of the case.

July 18, 2021

22 hours ago, digwer said:

I think you are missing one zero on this value. Also, it's not changeable and by default it's 950000

Yup, fixed. Thank you. It may be the default and non-changeable, but it's a suggestion that has been put in from the Kobol team, which is why I included it.

July 17, 2021

They apply to a variety of problems, but the CPU/voltage throttling can help with random freezes/reboots and the extraargs bit can help with errors accessing drives.

July 9, 2021

You can set up logging to go to a flash drive, but capturing the console is going to be the best bet.

Do you have another computer you can leave on with the serial console connected? For example, I have a small NUC where I keep picocom running in a tmux session, that way I don't lose anything. You can use the -g option to have picocom log to a file, as well.
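
A minimal sketch of that setup, assuming the serial adapter shows up as /dev/ttyUSB0 and the console runs at 1500000 baud (both are assumptions, so check your own system). The function name, tmux session name, and log path are made up for illustration; -g is the picocom log-file option mentioned above.

```shell
#!/bin/bash
# Build the picocom command line that logs the serial console to a file.
# Device, baud rate, and log path are illustrative defaults, not fixed values.
console_cmd() {
    local dev=${1:-/dev/ttyUSB0}
    local baud=${2:-1500000}
    local log=${3:-$HOME/helios64-console.log}
    printf 'picocom -b %s -g %s %s\n' "$baud" "$log" "$dev"
}

# Start it detached in tmux so it keeps running after you log out:
#   tmux new-session -d -s helios64 "$(console_cmd)"
# Reattach later with:
#   tmux attach -t helios64
console_cmd
```

Running it inside tmux means the capture survives a dropped SSH session, and the log file still has the last lines before a freeze even if nobody was watching.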

July 7, 2021

A number of modifications have been suggested to address certain issues.

The ones I can find are:

- In /boot/armbianEnv.txt:

extraargs=libata.force=3.0

- If doing debugging, also add:

verbosity=7
console=serial
extraargs=earlyprintk ignore_loglevel

- In /boot/boot.cmd:

regulator dev vdd_log
regulator value 930000
regulator dev vdd_center
regulator value 950000

and then run:

mkimage -C none -A arm -T script -d /boot/boot.cmd /boot/boot.scr

- In /etc/default/cpufrequtils:

ENABLE=true
MIN_SPEED=408000
MAX_SPEED=1800000
GOVERNOR=ondemand

(or MAX_SPEED=1200000 instead of 1800000)

- And if using ZFS:

for disk in /sys/block/sd[a-e]/queue/scheduler; do echo none > "$disk"; done

I've gathered these from a variety of threads. Am I missing any here?
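
To see which of the key=value settings above are actually in place, something like the following could be used. It is only a sketch: the function names are made up, and it assumes that when a key appears more than once in a file, the last line wins.

```shell
#!/bin/bash
# Print the value of KEY from a KEY=value file such as /boot/armbianEnv.txt
# or /etc/default/cpufrequtils. If the key appears more than once, the last
# occurrence is printed (assumption: a later line overrides an earlier one).
get_setting() {
    local file=$1 key=$2
    sed -n "s/^${key}=//p" "$file" | tail -n 1
}

# Report whether KEY in FILE currently has the wanted value.
check_setting() {
    local file=$1 key=$2 want=$3 have
    have=$(get_setting "$file" "$key")
    if [ "$have" = "$want" ]; then
        echo "ok: $key=$have"
    else
        echo "differs: $key=${have:-<unset>} (wanted $want)"
        return 1
    fi
}

# Example (run on the Helios64 itself):
#   check_setting /etc/default/cpufrequtils GOVERNOR ondemand
#   check_setting /boot/armbianEnv.txt verbosity 7
```

This only reads the files. It does not tell you whether boot.scr was regenerated after editing boot.cmd.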

June 18, 2021

After a recent update to non-kernel stuff, my Helios64 will no longer boot. When attempting to, I get this error:

[   32.425508] xhci-hcd xhci-hcd.3.auto: Host halt failed, -110
[   35.019496] xhci-hcd xhci-hcd.3.auto: Host halt failed, -110
[   35.020028] xhci-hcd xhci-hcd.3.auto: Host controller not halted, aborting reset.

I had been using the eMMC on board to boot with the SD card as a backup, but now neither one is working.

Anyone know if there's a way to fix this?

Here's more, using verbosity=4 in the armbianEnv.txt.

DDR Version 1.24 20191016
In
channel 0
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 1
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 0 training pass!
channel 1 training pass!
change freq to 416MHz 0,1
Channel 0: LPDDR4,416MHz
Bus Width=32 Col=10 Bank=8 Row=16 CS=1 Die Bus-Width=16 Size=2048MB
Channel 1: LPDDR4,416MHz
Bus Width=32 Col=10 Bank=8 Row=16 CS=1 Die Bus-Width=16 Size=2048MB
256B stride
channel 0
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 1
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 0 training pass!
channel 1 training pass!
channel 0, cs 0, advanced training done
channel 1, cs 0, advanced training done
change freq to 856MHz 1,0
ch 0 ddrconfig = 0x101, ddrsize = 0x40
ch 1 ddrconfig = 0x101, ddrsize = 0x40
pmugrf_os_reg[2] = 0x32C1F2C1, stride = 0xD
ddr_set_rate to 328MHZ
ddr_set_rate to 666MHZ
ddr_set_rate to 928MHZ
channel 0, cs 0, advanced training done
channel 1, cs 0, advanced training done
ddr_set_rate to 416MHZ, ctl_index 0
ddr_set_rate to 856MHZ, ctl_index 1
support 416 856 328 666 928 MHz, current 856MHz
OUT
Boot1: 2019-03-14, version: 1.19
CPUId = 0x0
ChipType = 0x10, 254
SdmmcInit=2 0
BootCapSize=100000
UserCapSize=14910MB
FwPartOffset=2000 , 100000
mmc0:cmd5,20
SdmmcInit=0 0
BootCapSize=0
UserCapSize=30436MB
FwPartOffset=2000 , 0
StorageInit ok = 253372
SecureMode = 0
SecureInit read PBA: 0x4
SecureInit read PBA: 0x404
SecureInit read PBA: 0x804
SecureInit read PBA: 0xc04
SecureInit read PBA: 0x1004
SecureInit read PBA: 0x1404
SecureInit read PBA: 0x1804
SecureInit read PBA: 0x1c04
SecureInit ret = 0, SecureMode = 0
atags_set_bootdev: ret:(0)
GPT 0x3380ec0 signature is wrong
recovery gpt...
GPT 0x3380ec0 signature is wrong
recovery gpt fail!
LoadTrust Addr:0x4000
No find bl30.bin
No find bl32.bin
Load uboot, ReadLba = 2000
Load OK, addr=0x200000, size=0xe5b60
RunBL31 0x40000
NOTICE:  BL31: v1.3(debug):42583b6
NOTICE:  BL31: Built : 07:55:13, Oct 15 2019
NOTICE:  BL31: Rockchip release version: v1.1
INFO:    GICv3 with legacy support detected. ARM GICV3 driver initialized in EL3
INFO:    Using opteed sec cpu_context!
INFO:    boot cpu mask: 0
INFO:    plat_rockchip_pmu_init(1190): pd status 3e
INFO:    BL31: Initializing runtime services
WARNING: No OPTEE provided by BL2 boot loader, Booting device without OPTEE initialization. SMC`s destined for OPTEE will return SMC_UNK
ERROR:   Error initializing runtime service opteed_fast
INFO:    BL31: Preparing for EL3 exit to normal world
INFO:    Entry point address = 0x200000
INFO:    SPSR = 0x3c9


U-Boot 2020.10-armbian (Mar 08 2021 - 14:54:58 +0000)

SoC: Rockchip rk3399
Reset cause: POR
DRAM:  3.9 GiB
PMIC:  RK808
SF: Detected w25q128 with page size 256 Bytes, erase size 4 KiB, total 16 MiB
MMC:   mmc@fe320000: 1, sdhci@fe330000: 0
Loading Environment from MMC... *** Warning - bad CRC, using default environment

In:    serial
Out:   serial
Err:   serial
Model: Helios64
Revision: 1.2 - 4GB non ECC
Net:   eth0: ethernet@fe300000
scanning bus for devices...
starting USB...
Bus usb@fe380000: USB EHCI 1.00
Bus dwc3: usb maximum-speed not found
Register 2000140 NbrPorts 2
Starting the controller
USB XHCI 1.10
scanning bus usb@fe380000 for devices... 1 USB Device(s) found
scanning bus dwc3 for devices... cannot reset port 4!?
5 USB Device(s) found
       scanning usb for storage devices... 1 Storage Device(s) found
Hit any key to stop autoboot:  0
switch to partitions #0, OK
mmc1 is current device
Scanning mmc 1:1...
Found U-Boot script /boot/boot.scr
3185 bytes read in 5 ms (622.1 KiB/s)
## Executing script at 00500000
Boot script loaded from mmc 1
166 bytes read in 4 ms (40 KiB/s)
16091509 bytes read in 687 ms (22.3 MiB/s)
28582400 bytes read in 1180 ms (23.1 MiB/s)
81913 bytes read in 14 ms (5.6 MiB/s)
2698 bytes read in 9 ms (292 KiB/s)
Applying kernel provided DT fixup script (rockchip-fixup.scr)
## Executing script at 09000000
Moving Image from 0x2080000 to 0x2200000, end=3de0000
## Loading init Ramdisk from Legacy Image at 06000000 ...
   Image Name:   uInitrd
   Image Type:   AArch64 Linux RAMDisk Image (gzip compressed)
   Data Size:    16091445 Bytes = 15.3 MiB
   Load Address: 00000000
   Entry Point:  00000000
   Verifying Checksum ... OK
## Flattened Device Tree blob at 01f00000
   Booting using the fdt blob at 0x1f00000
   Loading Ramdisk to f4f97000, end f5eef935 ... OK
   Loading Device Tree to 00000000f4f1a000, end 00000000f4f96fff ... OK

Starting kernel ...

[    2.884158] vcc3v3_sys_s0: failed to get the current voltage: -EPROBE_DEFER
[    3.019799] mmc1: tuning execution failed: -5
[    3.020211] mmc1: error -5 whilst initialising SD card
[    6.327083] rk_gmac-dwmac fe300000.ethernet: cannot get clock clk_mac_speed
[    9.753748] OF: graph: no port node found in /i2c@ff3d0000/typec-portc@22
[    9.799283] OF: graph: no port node found in /syscon@ff770000/usb2-phy@e450/otg-port
[   10.249703] r8152 4-1.4:1.0 (unnamed net_device) (uninitialized): netif_napi_add() called with weight 256
[   32.425508] xhci-hcd xhci-hcd.3.auto: Host halt failed, -110
[   35.019496] xhci-hcd xhci-hcd.3.auto: Host halt failed, -110
[   35.020028] xhci-hcd xhci-hcd.3.auto: Host controller not halted, aborting reset.
[   35.022321] reboot: Power down

April 23, 2021

Enjoy your well-deserved break. It's been a rough year for all of us, but I will say that getting my Helios64 set up and running has been a welcome diversion and bright spot during this time.

Thanks all and I can't wait to see what comes next.

April 22, 2021

It's showing it boot completely. There's a login prompt there on the console. Why it's showing the extra characters, I'm not sure. My picocom configuration matches yours and I don't see that.

What's interesting is you've got the SysRq coming up when, I believe, that's normally only triggered by a key combination. I wonder if something is causing extra characters to be sent over the serial console that's causing the problem.

April 10, 2021

Sigh. Except that doesn't solve the problem. Now it's just cron filling up the log.

New solution, using the sleep option. Modified helios64-ups.service:

[Unit]
Description=Helios64 UPS Action

[Install]
WantedBy=multi-user.target

[Service]
#Type=oneshot
#ExecStart=/usr/bin/helios64-ups.sh
Type=simple
ExecStart=/usr/local/sbin/powermon.sh

Modified powermon.sh:

#!/bin/bash

#7.0V   916     Recommended threshold to force shutdown system
TH=916

# Values can be info, warning, emerg
warnlevel="emerg"

while true
do
        main_power=$(cat '/sys/class/power_supply/gpio-charger/online')
        # Only use for testing:
        # main_power=0

        if [ "$main_power" == 0 ]; then
                val=$(cat '/sys/bus/iio/devices/iio:device0/in_voltage2_raw')
                sca=$(cat '/sys/bus/iio/devices/iio:device0/in_voltage_scale')
                # The division is required to make the test later work.
                adc=$(echo "$val * $sca /1" | bc)

                echo "Main power lost. Current charge: $adc" | systemd-cat -p $warnlevel
                echo "Shutdown at $TH" | systemd-cat -p $warnlevel

                # Uncomment for testing
                # echo "Current values:"
                # echo -e "\tMain Power = $main_power"
                # echo -e "\tRaw Voltage = $val"
                # echo -e "\tVoltage Scale = $sca"
                # echo -e "\tVoltage = $adc"

                if [ "$adc" -le $TH ]; then
                    echo "Critical power level reached. Powering off." | systemd-cat -p $warnlevel
                    /usr/sbin/poweroff
                fi
        fi
        sleep 20
done
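
The "/1" in the script matters because bc's division truncates the result to an integer at its default scale, and the -le test only accepts integers. Here is the same comparison as a small sketch that can be tried away from the board. It uses awk in place of bc, which is a swap on my part and not what the script above does, and the readings passed in are made up.

```shell
#!/bin/bash
# Convert a raw ADC reading and scale factor to an integer, dropping the
# fractional part so the result can be used with [ ... -le ... ].
# Uses awk rather than bc; int() truncates like bc's "/1" at scale 0.
scaled_adc() {
    awk -v raw="$1" -v scale="$2" 'BEGIN { printf "%d\n", int(raw * scale) }'
}

# Succeed (exit 0) when the scaled reading is at or below the threshold.
below_threshold() {
    local adc
    adc=$(scaled_adc "$1" "$2")
    [ "$adc" -le "$3" ]
}

# Made-up reading and scale, only to show the comparison; 916 is the
# threshold from the script above.
if below_threshold 1800 0.5 916; then
    echo "would power off"
else
    echo "still above threshold"
fi
```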

April 10, 2021

While something that sleeps for 20 seconds might be better, I've set up the following script in cron to go off every minute.

My crontab looks like:

I've tested this and it seems to work. The pipe to systemd-cat lets this log to the system journal at a set warn level.

-- Edit: Removed extra stuff here to clean things up a little. This didn't work the way I wanted.

April 10, 2021

It looks like you have been booting from the eMMC, right?

23 minutes ago, scottf007 said:

GPT 0x3380ec0 signature is wrong
recovery gpt...
GPT 0x3380ec0 signature is wrong
recovery gpt fail!

I think you're going to need to boot off an SD card and try to fix the GPT. This may help you with doing that.

April 9, 2021

@SIGSEGV

It is disabled by default, but since I had some time the last few days I've been digging into things and came across this. There's a corresponding service, but all it does is check to see the state of the battery and then exit. If something else is monitoring the battery, I don't know what it is.

The service could be rewritten as a script that has a loop that checks the state of the battery every 20 seconds and then sleeps.

April 9, 2021

While I like having the helios64-ups.timer in case of power failure, I don't like that my logs get three lines written to them every 20 seconds.

Apr 09 07:43:26 helios64 systemd[1]: Starting Helios64 UPS Action...
Apr 09 07:43:26 helios64 systemd[1]: helios64-ups.service: Succeeded.
Apr 09 07:43:26 helios64 systemd[1]: Finished Helios64 UPS Action.

Does anyone know if there is a way to keep the timer on but not fill the logs this way?

Everything I've found about silencing systemd messages is about the output of the command, not the systemd activity itself.

April 8, 2021

Well, for anyone else interested in trying this, here's the basic order I did:

stop armbian-ramlog
disable armbian-ramlog
create a zfs dataset and mount it at /var/log
cp -ar everything from /var/log.hdd to the new /var/log
modify /etc/logrotate to disable compression (since the dataset is already using compression)
modify /etc/default/armbian-ramlog to disable it there as well
modify /etc/default/armbian-zram-config to adjust for new numbers (I have ZRAM_PERCENTAGE and MEM_LIMIT_PERCENTAGE at 15).
reboot
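
Roughly, the first few steps could be sketched like this. Treat it as a sketch under assumptions: "tank/log" is a hypothetical dataset name, I'm assuming the switch in /etc/default/armbian-ramlog is called ENABLED, and mounting a dataset over a /var/log that already has files in it may need extra care. By default it only prints what it would do.

```shell
#!/bin/bash
# Sketch of the steps above. With DRY_RUN=1 (the default here) commands are
# only printed; set DRY_RUN=0 and run as root to actually execute them.
# "tank/log" is a hypothetical pool/dataset name.
DRY_RUN=${DRY_RUN:-1}
DATASET=${DATASET:-tank/log}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

migrate_logs() {
    run systemctl stop armbian-ramlog
    run systemctl disable armbian-ramlog
    run zfs create -o mountpoint=/var/log "$DATASET"
    run cp -ar /var/log.hdd/. /var/log/
    # Assumption: the switch in this file is named ENABLED, as it is for
    # armbian-zram-config.
    run sed -i 's/^ENABLED=.*/ENABLED=false/' /etc/default/armbian-ramlog
    echo "Still manual: logrotate compression, zram percentages, reboot."
}

migrate_logs
```

The logrotate and zram changes are left out on purpose since those depend on what you want the numbers to be.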

April 8, 2021

Is there an accepted procedure for moving from using ramlog to logging to disk? I've looked but everything I can find is about setting up ramlog.

I assume it's more complicated than just creating a partition (or ZFS dataset), mounting it at /var/log, disabling ramlog, and rebooting.

April 8, 2021

Thank you. I wanted to make sure I understood it correctly since I hadn't encountered this on other Linux systems.

April 8, 2021

I'm trying to understand the role of zram and having some difficulties with it. I've looked in several places and my understanding is this:

- It's a block device in RAM that uses compression

- It functions similar to swap

I notice that mine is set to about 2 GB:

wurmfood@helios64:~$ cat /proc/swaps
Filename                                Type            Size            Used            Priority
/dev/zram0                              partition       1945084         0               5

My questions:

1. Does that mean 2 GB of the device's RAM is used for the zram swap device?

2. I can understand this on devices with slower storage, but would it be reasonable to disable this and instead use swap on an SSD?
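
For reference, the Size column in /proc/swaps is in KiB, so the figure above works out to a bit under 2 GiB. A small sketch of the conversion (the function name is made up; it reads /proc/swaps unless given another file):

```shell
#!/bin/bash
# List swap devices with their size converted from KiB (the unit used by
# /proc/swaps) to GiB. Pass a file to read something other than /proc/swaps.
swap_sizes_gib() {
    awk 'NR > 1 { printf "%s %.2f GiB\n", $1, $3 / 1048576 }' "${1:-/proc/swaps}"
}

# Only run against the live file where it exists (Linux).
if [ -r /proc/swaps ]; then
    swap_sizes_gib
fi
```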

March 26, 2021

3 minutes ago, ShadowDance said:

Are you sure about that? Generally CMR is considered good, SMR is what you'd want to stay away from. Is it something related to these specific drives?

Ah! Sorry, yes, you're right. I've been looking at these back and forth for the last few days. A lot of people here have been having trouble with WD drives and at least some of that seems to come down to the difference between the two types of drives.

March 25, 2021

17 hours ago, Vin said:

WD120EFAX-68UNTN0

One note: It looks like you have CMR drives from WD in there. Those are known to have a variety of problems and may be contributing to the issues you're seeing.

March 25, 2021

3 hours ago, Wofferl said:

All reports here seems to use WD Red disks and you can find similar reports for this disks and zfs with other hardware.

Taking a look around, this definitely seems to be the case:

https://www.truenas.com/community/threads/warning-wd-red-wd60efrx-68l0bn1.58676/

Additionally, people see a lot more errors with the *EFAX drives instead of the *EFRX, since the *EFAX drives are CMR and don't hold up.

March 24, 2021

I do. My system has been rock solid and is acting as a media storage device, NFS, and backup server.

wurmfood@helios64:~$ uname -a
Linux helios64 5.10.21-rockchip64 #21.02.3 SMP PREEMPT Mon Mar 8 01:05:08 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux

ZFS on 4 Seagate 6 TB disks plus a 250 GB M.2 SSD. I haven't had to do any of the governor stuff other people have. The only problems I've run into have to do with remembering to rebuild the ZFS module when I update the kernel.

One thing I've noticed is that a number of people having problems seem to be using OMV and I'm not.

March 17, 2021

Might help to add to the ZFS page to do something like this when done:

for disk in /sys/block/sd[a-e]/queue/scheduler; do echo none > "$disk"; done

(Assuming all 5 disks are using ZFS.)
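
The same loop as a function, as a rough sketch. The function name and the optional second argument are my own additions; the second argument only exists so it can be tried against a scratch directory instead of the real /sys/block. Note that the setting is lost on reboot, so it would need to run at boot to stick.

```shell
#!/bin/bash
# Set the I/O scheduler for sda..sde. The sysfs root is a parameter only so
# the function can be tried against a scratch directory; on the real system
# it is /sys/block and the function must run as root.
set_scheduler() {
    local sched=$1 root=${2:-/sys/block} f
    for f in "$root"/sd[a-e]/queue/scheduler; do
        # Skip the unexpanded pattern and anything we cannot write to.
        [ -w "$f" ] || continue
        echo "$sched" > "$f"
        echo "$f: $(cat "$f")"
    done
}

# On the Helios64: set_scheduler none
```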

March 16, 2021

16 hours ago, 0utc45t said:

In case of disk failure... just copying the filesystem to a backup, which then is copied to a new disk, which is inserted to slot A and booted. It will not boot. I presume so, by the fact that I tried to repartition/resize that disk holding system in slot A and it failed to boot after that. So, there is something more involved, hints are visible by looking at mounts. I would like to get the system backupped. Know the needed bits and tweaks, so that I know that I can do it and it will work, in case of disk failure.

The problem with copying the file system is that the UUIDs of the new partitions don't match the ones the system expects. This is why dd will actually work (and yes, so will ddrescue): it copies the disk block by block, so the new partitions have the same UUIDs. I know this works because I just did it on my main computer.

The backup script above basically copies the file system and then adjusts /etc/fstab to have the correct UUIDs, so it should work.
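
The fstab adjustment itself can be sketched like this. This is not the backup script referred to above, just an illustration of the idea; the function name is made up and /dev/sdX1 in the usage comment is a placeholder.

```shell
#!/bin/bash
# Replace one filesystem UUID with another in an fstab-style file.
# A copy of the original is kept next to it with a .bak suffix.
swap_fstab_uuid() {
    local fstab=$1 old=$2 new=$3
    cp -- "$fstab" "$fstab.bak"
    # Rewrite from the backup so no in-place sed option is needed.
    sed "s/UUID=${old}/UUID=${new}/g" "$fstab.bak" > "$fstab"
}

# Usage on a freshly copied root mounted at /mnt/newroot (placeholder paths):
#   old=<UUID of the original partition, as listed in /etc/fstab>
#   new=$(blkid -s UUID -o value /dev/sdX1)
#   swap_fstab_uuid /mnt/newroot/etc/fstab "$old" "$new"
```

Other places that reference the root UUID (for example the boot environment file) would need the same treatment.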

March 11, 2021

21 hours ago, gprovost said:

Yes exactly and I think it's important to recommend to disable it at pool creation.

I believe it's not enabled by default. You have to choose to set it. At least, none of my pools have been created with it and I never specified not to.

As a side note, I double-checked the compression on all of my datasets and noticed that some of my docker data sees massive compression with zstd on. Most are in the 1-3x range, but I have several in the 8-9x range.

Posts posted by wurmfood
