Posts posted by wurmfood

  1. Sorry, the line about disabling was just to make sure you disable the armbian-zram-config service by setting ENABLED to false.

     

    As a warning, though, I found some problems with this if you log to a zfs share. It seems you have to make sure zfs gets loaded before the logging starts up, otherwise you can get a kernel panic on occasion. I didn't really dig into how to fix this, so I just log and swap to a usb drive instead now.
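
    For reference, a minimal sketch of the ENABLED change mentioned above. It assumes the setting lives in /etc/default/armbian-zram-config and that an ENABLED= line already exists there; the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: set ENABLED=false in a defaults file such as
# /etc/default/armbian-zram-config (assumed location of the setting).
disable_in_defaults() {
    file="$1"
    # Rewrite an existing ENABLED= line, whatever its current value.
    # If the file has no ENABLED= line, nothing is changed.
    sed -i 's/^ENABLED=.*/ENABLED=false/' "$file"
}

# On the real system (as root), something like:
#   disable_in_defaults /etc/default/armbian-zram-config
#   systemctl disable armbian-zram-config
```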

  2. 22 hours ago, digwer said:

    I think you are missing one zero on this value. Also, it's not changeable and by default it's 950000

    Yup, fixed. Thank you. It may be the default and non-changeable, but it's a suggestion that has been put in from the Kobol team, which is why I included it.

  3. You can set up logging to go to a flash drive, but capturing the console is going to be the best bet.

    Do you have another computer you can leave on with the serial console connected? For example, I have a small NUC where I keep picocom running in a tmux session, that way I don't lose anything. You can use the -g option to have picocom log to a file, as well.
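
    Roughly what that setup can look like, as a sketch. The serial device, the log path and the tmux session name are placeholders, and 1500000 is assumed to be the console baud rate for the board.

```shell
#!/bin/sh
# Build the picocom command line. -b sets the baud rate, -g the log file.
# Device and log path defaults are placeholders; paths with spaces are
# not handled here.
picocom_cmd() {
    dev="${1:-/dev/ttyUSB0}"
    log="${2:-$HOME/helios64-console.log}"
    printf 'picocom -b 1500000 -g %s %s\n' "$log" "$dev"
}

# Start picocom detached in a tmux session named "helios64" (made-up name).
start_console_log() {
    tmux new-session -d -s helios64 "$(picocom_cmd "$@")"
}
```

    After start_console_log, `tmux attach -t helios64` gets you back to the live console, and the log file keeps growing either way.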

  4. There are a number of modifications that people have suggested to address certain issues.

     

    The ones I can find are:

    - In /boot/armbianEnv.txt:

    extraargs=libata.force=3.0

    - If doing debugging, also add (merging the extraargs values into the existing extraargs line if you already have one):

    verbosity=7
    console=serial
    extraargs=earlyprintk ignore_loglevel

     

    - In /boot/boot.cmd

    regulator dev vdd_log
    regulator value 930000
    regulator dev vdd_center
    regulator value 950000

    and then run:

    mkimage -C none -A arm -T script -d /boot/boot.cmd /boot/boot.scr

     

    - In /etc/default/cpufrequtils:

    ENABLE=true
    MIN_SPEED=408000
    MAX_SPEED=1800000
    GOVERNOR=ondemand

    (or 1200000 instead of 1800000)

     

    - And if using ZFS:

    for disk in /sys/block/sd[a-e]/queue/scheduler; do echo none > "$disk"; done

     

     

    I've gathered these from a variety of threads. Am I missing any here?
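
    The ZFS scheduler loop from the list, sketched as a pair of functions so the result can be checked afterwards. The function names and the root parameter are my own additions (the parameter only exists so it can be tried against a scratch directory). Writes to /sys don't survive a reboot, so this would need to run again at each boot.

```shell
#!/bin/sh
# Set the I/O scheduler to "none" for sda..sde under a sysfs root.
set_scheduler_none() {
    root="${1:-/sys/block}"
    for sched in "$root"/sd[a-e]/queue/scheduler; do
        [ -e "$sched" ] || continue   # glob matched nothing
        echo none > "$sched"
    done
}

# Print the current scheduler setting for each of those disks.
show_schedulers() {
    root="${1:-/sys/block}"
    for sched in "$root"/sd[a-e]/queue/scheduler; do
        [ -e "$sched" ] || continue
        printf '%s: %s\n' "$sched" "$(cat "$sched")"
    done
}
```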

  5. After a recent update to non-kernel stuff, my Helios64 will no longer boot. When attempting to, I get this error:

    [   32.425508] xhci-hcd xhci-hcd.3.auto: Host halt failed, -110
    [   35.019496] xhci-hcd xhci-hcd.3.auto: Host halt failed, -110
    [   35.020028] xhci-hcd xhci-hcd.3.auto: Host controller not halted, aborting reset.

     

    I had been using the eMMC on board to boot with the SD card as a backup, but now neither one is working.

     

    Anyone know if there's a way to fix this?

     

    Here's more, using verbosity=4 in the armbianEnv.txt.

    DDR Version 1.24 20191016
    In
    channel 0
    CS = 0
    MR0=0x18
    MR4=0x1
    MR5=0x1
    MR8=0x10
    MR12=0x72
    MR14=0x72
    MR18=0x0
    MR19=0x0
    MR24=0x8
    MR25=0x0
    channel 1
    CS = 0
    MR0=0x18
    MR4=0x1
    MR5=0x1
    MR8=0x10
    MR12=0x72
    MR14=0x72
    MR18=0x0
    MR19=0x0
    MR24=0x8
    MR25=0x0
    channel 0 training pass!
    channel 1 training pass!
    change freq to 416MHz 0,1
    Channel 0: LPDDR4,416MHz
    Bus Width=32 Col=10 Bank=8 Row=16 CS=1 Die Bus-Width=16 Size=2048MB
    Channel 1: LPDDR4,416MHz
    Bus Width=32 Col=10 Bank=8 Row=16 CS=1 Die Bus-Width=16 Size=2048MB
    256B stride
    channel 0
    CS = 0
    MR0=0x18
    MR4=0x1
    MR5=0x1
    MR8=0x10
    MR12=0x72
    MR14=0x72
    MR18=0x0
    MR19=0x0
    MR24=0x8
    MR25=0x0
    channel 1
    CS = 0
    MR0=0x18
    MR4=0x1
    MR5=0x1
    MR8=0x10
    MR12=0x72
    MR14=0x72
    MR18=0x0
    MR19=0x0
    MR24=0x8
    MR25=0x0
    channel 0 training pass!
    channel 1 training pass!
    channel 0, cs 0, advanced training done
    channel 1, cs 0, advanced training done
    change freq to 856MHz 1,0
    ch 0 ddrconfig = 0x101, ddrsize = 0x40
    ch 1 ddrconfig = 0x101, ddrsize = 0x40
    pmugrf_os_reg[2] = 0x32C1F2C1, stride = 0xD
    ddr_set_rate to 328MHZ
    ddr_set_rate to 666MHZ
    ddr_set_rate to 928MHZ
    channel 0, cs 0, advanced training done
    channel 1, cs 0, advanced training done
    ddr_set_rate to 416MHZ, ctl_index 0
    ddr_set_rate to 856MHZ, ctl_index 1
    support 416 856 328 666 928 MHz, current 856MHz
    OUT
    Boot1: 2019-03-14, version: 1.19
    CPUId = 0x0
    ChipType = 0x10, 254
    SdmmcInit=2 0
    BootCapSize=100000
    UserCapSize=14910MB
    FwPartOffset=2000 , 100000
    mmc0:cmd5,20
    SdmmcInit=0 0
    BootCapSize=0
    UserCapSize=30436MB
    FwPartOffset=2000 , 0
    StorageInit ok = 253372
    SecureMode = 0
    SecureInit read PBA: 0x4
    SecureInit read PBA: 0x404
    SecureInit read PBA: 0x804
    SecureInit read PBA: 0xc04
    SecureInit read PBA: 0x1004
    SecureInit read PBA: 0x1404
    SecureInit read PBA: 0x1804
    SecureInit read PBA: 0x1c04
    SecureInit ret = 0, SecureMode = 0
    atags_set_bootdev: ret:(0)
    GPT 0x3380ec0 signature is wrong
    recovery gpt...
    GPT 0x3380ec0 signature is wrong
    recovery gpt fail!
    LoadTrust Addr:0x4000
    No find bl30.bin
    No find bl32.bin
    Load uboot, ReadLba = 2000
    Load OK, addr=0x200000, size=0xe5b60
    RunBL31 0x40000
    NOTICE:  BL31: v1.3(debug):42583b6
    NOTICE:  BL31: Built : 07:55:13, Oct 15 2019
    NOTICE:  BL31: Rockchip release version: v1.1
    INFO:    GICv3 with legacy support detected. ARM GICV3 driver initialized in EL3
    INFO:    Using opteed sec cpu_context!
    INFO:    boot cpu mask: 0
    INFO:    plat_rockchip_pmu_init(1190): pd status 3e
    INFO:    BL31: Initializing runtime services
    WARNING: No OPTEE provided by BL2 boot loader, Booting device without OPTEE initialization. SMC`s destined for OPTEE will return SMC_UNK
    ERROR:   Error initializing runtime service opteed_fast
    INFO:    BL31: Preparing for EL3 exit to normal world
    INFO:    Entry point address = 0x200000
    INFO:    SPSR = 0x3c9
    
    
    U-Boot 2020.10-armbian (Mar 08 2021 - 14:54:58 +0000)
    
    SoC: Rockchip rk3399
    Reset cause: POR
    DRAM:  3.9 GiB
    PMIC:  RK808
    SF: Detected w25q128 with page size 256 Bytes, erase size 4 KiB, total 16 MiB
    MMC:   mmc@fe320000: 1, sdhci@fe330000: 0
    Loading Environment from MMC... *** Warning - bad CRC, using default environment
    
    In:    serial
    Out:   serial
    Err:   serial
    Model: Helios64
    Revision: 1.2 - 4GB non ECC
    Net:   eth0: ethernet@fe300000
    scanning bus for devices...
    starting USB...
    Bus usb@fe380000: USB EHCI 1.00
    Bus dwc3: usb maximum-speed not found
    Register 2000140 NbrPorts 2
    Starting the controller
    USB XHCI 1.10
    scanning bus usb@fe380000 for devices... 1 USB Device(s) found
    scanning bus dwc3 for devices... cannot reset port 4!?
    5 USB Device(s) found
           scanning usb for storage devices... 1 Storage Device(s) found
    Hit any key to stop autoboot:  0
    switch to partitions #0, OK
    mmc1 is current device
    Scanning mmc 1:1...
    Found U-Boot script /boot/boot.scr
    3185 bytes read in 5 ms (622.1 KiB/s)
    ## Executing script at 00500000
    Boot script loaded from mmc 1
    166 bytes read in 4 ms (40 KiB/s)
    16091509 bytes read in 687 ms (22.3 MiB/s)
    28582400 bytes read in 1180 ms (23.1 MiB/s)
    81913 bytes read in 14 ms (5.6 MiB/s)
    2698 bytes read in 9 ms (292 KiB/s)
    Applying kernel provided DT fixup script (rockchip-fixup.scr)
    ## Executing script at 09000000
    Moving Image from 0x2080000 to 0x2200000, end=3de0000
    ## Loading init Ramdisk from Legacy Image at 06000000 ...
       Image Name:   uInitrd
       Image Type:   AArch64 Linux RAMDisk Image (gzip compressed)
       Data Size:    16091445 Bytes = 15.3 MiB
       Load Address: 00000000
       Entry Point:  00000000
       Verifying Checksum ... OK
    ## Flattened Device Tree blob at 01f00000
       Booting using the fdt blob at 0x1f00000
       Loading Ramdisk to f4f97000, end f5eef935 ... OK
       Loading Device Tree to 00000000f4f1a000, end 00000000f4f96fff ... OK
    
    Starting kernel ...
    
    [    2.884158] vcc3v3_sys_s0: failed to get the current voltage: -EPROBE_DEFER
    [    3.019799] mmc1: tuning execution failed: -5
    [    3.020211] mmc1: error -5 whilst initialising SD card
    [    6.327083] rk_gmac-dwmac fe300000.ethernet: cannot get clock clk_mac_speed
    [    9.753748] OF: graph: no port node found in /i2c@ff3d0000/typec-portc@22
    [    9.799283] OF: graph: no port node found in /syscon@ff770000/usb2-phy@e450/otg-port
    [   10.249703] r8152 4-1.4:1.0 (unnamed net_device) (uninitialized): netif_napi_add() called with weight 256
    [   32.425508] xhci-hcd xhci-hcd.3.auto: Host halt failed, -110
    [   35.019496] xhci-hcd xhci-hcd.3.auto: Host halt failed, -110
    [   35.020028] xhci-hcd xhci-hcd.3.auto: Host controller not halted, aborting reset.
    [   35.022321] reboot: Power down

     

     

  6. It's showing it boot completely. There's a login prompt there on the console. Why it's showing the extra characters, I'm not sure. My picocom configuration matches yours and I don't see that.

    What's interesting is you've got the SysRq coming up when, I believe, that's normally only triggered by a key combination. I wonder if something is causing extra characters to be sent over the serial console that's causing the problem.

  7. Sigh. Except that doesn't solve the problem. Now it's just cron filling up the log.

     

    New solution, using the sleep option. Modified helios64-ups.service:

    [Unit]
    Description=Helios64 UPS Action
    
    [Install]
    WantedBy=multi-user.target
    
    [Service]
    #Type=oneshot
    #ExecStart=/usr/bin/helios64-ups.sh
    Type=simple
    ExecStart=/usr/local/sbin/powermon.sh

     

    Modified powermon.sh:

    #!/bin/bash
    
    #7.0V   916     Recommended threshold to force shutdown system
    TH=916
    
    # Values can be info, warning, emerg
    warnlevel="emerg"
    
    while true
    do
            main_power=$(cat '/sys/class/power_supply/gpio-charger/online')
            # Only use for testing:
            # main_power=0
    
            if [ "$main_power" == 0 ]; then
                    val=$(cat '/sys/bus/iio/devices/iio:device0/in_voltage2_raw')
                    sca=$(cat '/sys/bus/iio/devices/iio:device0/in_voltage_scale')
                    # The division is required to make the test later work.
                    adc=$(echo "$val * $sca /1" | bc)
    
                    echo "Main power lost. Current charge: $adc" | systemd-cat -p $warnlevel
                    echo "Shutdown at $TH" | systemd-cat -p $warnlevel
    
                    # Uncomment for testing
                    # echo "Current values:"
                    # echo -e "\tMain Power = $main_power"
                    # echo -e "\tRaw Voltage = $val"
                    # echo -e "\tVoltage Scale = $sca"
                    # echo -e "\tVoltage = $adc"
    
                    if [ "$adc" -le $TH ]; then
                        echo "Critical power level reached. Powering off." | systemd-cat -p $warnlevel
                        /usr/sbin/poweroff
                    fi
            fi
            sleep 20
    done
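
    A small sketch of why the "/1" in the adc line matters. bc keeps the decimals from the multiplication, and the -le test only accepts integers, so dividing by 1 at bc's default scale of 0 drops the fraction. The function names and the sample numbers are made up for illustration, not real readings from the board.

```shell
#!/bin/sh
# Integer ADC value, computed the same way as in powermon.sh:
# with bc's default scale of 0, the trailing "/ 1" truncates the result.
scaled_adc() {
    echo "$1 * $2 / 1" | bc
}

# Succeeds (exit 0) when the scaled reading is at or below the threshold.
at_or_below() {
    [ "$(scaled_adc "$1" "$2")" -le "$3" ]
}

# Sample numbers only:
#   scaled_adc 150 6.25        prints 937 (without "/ 1" bc prints 937.50)
#   at_or_below 146 6.25 916   succeeds, since 912 <= 916
```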

     

  8. While something that sleeps for 20 seconds might be better, I've set up the following script in cron to go off every minute.

     

    <removed>

    My crontab looks like:

    <removed>

    I've tested this and it seems to work. The pipe to systemd-cat lets this log to the system journal at a set warn level.

     

    -- Edit: Removed extra stuff here to clean things up a little. This didn't work the way I wanted.

  9. It looks like you have been booting from the eMMC, right?

    23 minutes ago, scottf007 said:

    GPT 0x3380ec0 signature is wrong
    recovery gpt...
    GPT 0x3380ec0 signature is wrong
    recovery gpt fail!

     

    I think you're going to need to boot off an SD card and try to fix the GPT. This may help you with doing that.

  10. @SIGSEGV

    It is disabled by default, but since I had some time the last few days I've been digging into things and came across this. There's a corresponding service, but all it does is check to see the state of the battery and then exit. If something else is monitoring the battery, I don't know what it is.

     

    The service could be rewritten as a script with a loop that checks the state of the battery and then sleeps for 20 seconds.

  11. While I like having the helios64-ups.timer in case of power failure, I don't like that my logs get three lines written to them every 20 seconds.

    Apr 09 07:43:26 helios64 systemd[1]: Starting Helios64 UPS Action...
    Apr 09 07:43:26 helios64 systemd[1]: helios64-ups.service: Succeeded.
    Apr 09 07:43:26 helios64 systemd[1]: Finished Helios64 UPS Action.

    Does anyone know if there is a way to keep the timer on but not fill the logs this way?

     

    Everything I've found about silencing systemd messages is about the output of the command, not the systemd activity itself.

  12. Well, for anyone else interested in trying this, here's the basic order I did:

    • stop armbian-ramlog
    • disable armbian-ramlog
    • create a zfs dataset and mount it at /var/log
    • cp -ar everything from /var/log.hdd to the new /var/log
    • modify /etc/logrotate to disable compression (since the dataset is already using compression)
    • modify /etc/default/armbian-ramlog to disable it there as well
    • modify /etc/default/armbian-zram-config to adjust for new numbers (I have ZRAM_PERCENTAGE and MEM_LIMIT_PERCENTAGE at 15).
    • reboot
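
    The first four steps above, sketched as a script that only prints the commands unless told otherwise. "tank/log" is a made-up dataset name and the run/move_logs_to_zfs helpers are my own. The logrotate and /etc/default edits and the reboot are left as manual steps.

```shell
#!/bin/sh
# Prints the commands by default; set DRY_RUN=0 to really run them (as root).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

move_logs_to_zfs() {
    dataset="${1:-tank/log}"          # hypothetical pool/dataset name
    run systemctl stop armbian-ramlog
    run systemctl disable armbian-ramlog
    # Create the dataset and mount it at /var/log in one go.
    run zfs create -o mountpoint=/var/log "$dataset"
    # Copy the on-disk logs, preserving attributes.
    run cp -ar /var/log.hdd/. /var/log/
}

# move_logs_to_zfs tank/log
```

    Running it in the default dry-run mode first makes it easy to check the order before touching anything.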
  13. Is there an accepted procedure for moving from using ramlog to logging to disk? I've looked but everything I can find is about setting up ramlog.

     

    I assume it's more complicated than just creating a partition (or ZFS dataset), mounting it at /var/log, disabling ramlog, and rebooting.

  14. I'm trying to understand the role of zram and having some difficulties with it. I've looked in several places and my understanding is this:

    - It's a block device in RAM that uses compression

    - It functions similarly to swap

     

    I notice that mine is set to about 2 GB:

    wurmfood@helios64:~$ cat /proc/swaps
    Filename                                Type            Size            Used            Priority
    /dev/zram0                              partition       1945084         0               5

     

    My questions:

    1. Does that mean 2 GB of the device RAM is used for the zram swap device?

    2. I can understand this on devices with slower storage, but would it be reasonable to disable this and instead use swap on an SSD?
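
    For the "about 2 GB" figure: the Size column in /proc/swaps is in KiB, so the conversion works out as in this sketch (kib_to_gib is just a helper name I picked).

```shell
#!/bin/sh
# /proc/swaps reports Size and Used in KiB; convert a KiB count to GiB.
kib_to_gib() {
    awk -v kib="$1" 'BEGIN { printf "%.2f\n", kib / 1024 / 1024 }'
}

kib_to_gib 1945084   # the zram0 size from the output above; prints 1.85
```

    If zramctl (from util-linux) is installed, it lists the uncompressed data size next to the memory actually consumed, which is a more direct way to see what the device is using at any moment.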

  15. 3 minutes ago, ShadowDance said:

     

    Are you sure about that? Generally CMR is considered good, SMR is what you'd want to stay away from. Is it something related to these specific drives?

    Ah! Sorry, yes, you're right. I've been looking at these back and forth for the last few days. A lot of people here have been having trouble with WD drives and at least some of that seems to come down to the difference between the two types of drives.

  16. 3 hours ago, Wofferl said:

    All reports here seems to use WD Red disks and you can find similar reports for this disks and zfs with other hardware.

    Taking a look around, this definitely seems to be the case:

    https://www.truenas.com/community/threads/warning-wd-red-wd60efrx-68l0bn1.58676/

     

    Additionally, people see a lot more errors with the *EFAX drives instead of the *EFRX, since the *EFAX drives are CMR and don't hold up.

  17. I do. My system has been rock solid and is acting as a media storage device, NFS, and backup server.

     

    wurmfood@helios64:~$ uname -a
    Linux helios64 5.10.21-rockchip64 #21.02.3 SMP PREEMPT Mon Mar 8 01:05:08 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux
     

    ZFS on 4 Seagate 6 TB disks plus a 250 GB M.2 SSD. I haven't had to do any of the governor stuff other people have. The only problems I've run into have to do with remembering to rebuild the ZFS module when I update the kernel.

     

    One thing I've noticed is that a number of people having problems seem to be using OMV and I'm not.

  18. 16 hours ago, 0utc45t said:

    In case of disk failure... just copying the filesystem to a backup, which then is copied to a new disk, which is inserted to slot A and booted. It will not boot. I presume so, by the fact that I tried to repartition/resize that disk holding system in slot A and it failed to boot after that. So, there is something more involved, hints are visible by looking at mounts. I would like to get the system backupped. Know the needed bits and tweaks, so that I know that I can do it and it will work, in case of disk failure.

    The problem with copying the file system is that the UUIDs on the new disk won't match the ones the system expects. This is why dd will actually work (and yes, so will ddrescue): it copies the disk block by block, so the new partitions have the same UUIDs. I know this works because I just did it on my main computer. :)

     

    The backup script above basically copies the file system and then adjusts /etc/fstab to have the correct UUIDs, so it should work.
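
    This is not that backup script, just a minimal sketch of the fstab adjustment it would need to make after a file-level copy. The function name is made up, and /dev/sdX1 in the comment is a placeholder.

```shell
#!/bin/sh
# Replace one filesystem UUID with another in an fstab-style file.
# Both UUIDs would normally come from blkid, for example:
#   blkid -s UUID -o value /dev/sdX1
swap_fstab_uuid() {
    old="$1"
    new="$2"
    fstab="$3"
    # UUIDs are hex digits and hyphens, so they are safe inside the sed pattern.
    sed -i "s/UUID=$old/UUID=$new/g" "$fstab"
}

# Example, run against the copy's fstab rather than the live one:
#   swap_fstab_uuid "$old_uuid" "$new_uuid" /mnt/newroot/etc/fstab
```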

  19. 21 hours ago, gprovost said:

     

    Yes exactly and I think it's important to recommend to disable it at pool creation.

     

    I believe it's not enabled by default. You have to choose to set it. At least, none of my pools have been created with it and I never specified not to.

     

    As a side note, I double-checked my compression on all of my datasets and I noticed that some of my docker data sees massive compression with zstd on. Most are in the 1-3x range, but I have several in the 8-9x range.
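
    One way to pull out the high-ratio datasets, as a sketch. It assumes the tab-separated output of zfs get in scripted mode; high_ratio is a helper name I made up, and the pool name is whatever yours is.

```shell
#!/bin/sh
# Keep only datasets whose compressratio is at least MIN.
# Expects tab-separated "name<TAB>ratio" lines on stdin, as printed by:
#   zfs get -H -r -o name,value compressratio <pool>
high_ratio() {
    awk -F '\t' -v min="$1" '{
        r = $2
        sub(/x$/, "", r)                       # "8.52x" -> "8.52"
        if (r + 0 >= min + 0) print $1 "\t" $2
    }'
}

# Example:
#   zfs get -H -r -o name,value compressratio <pool> | high_ratio 8
```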
