lzb

Members
  • Content Count

    10
Posts posted by lzb

  1. I've been running watchdog on a BananaPi M1 for some months now. I think it reboots the board when the default gateway has issues, and that's fine, no problem with that. The board works 24/7 and it reached 14+ days of uptime without problems.

     

    But things have changed and the board now reboots at least a few times a day (it seems I have a problem with armbianmonitor -u: the board rebooted twice before it finished). And here's the problem: watchdog doesn't log anything, or I'm unable to find any useful info. :)

     

    (It seems that the board went offline, maybe because of internet connection issues, but I have another one running that's also "logless", and they have almost the same config except for the gateway IP.)

    And watchdog.conf:

    ping                    = 192.168.1.1
    #ping                   = 172.26.1.255
    interface               = eth0
    #file                   = /var/log/messages
    #change                 = 1407
    
    # Uncomment to enable test. Setting one of these values to '0' disables it.
    # These values will hopefully never reboot your machine during normal use
    # (if your machine is really hung, the loadavg will go much higher than 25)
    max-load-1              = 24
    max-load-5              = 18
    max-load-15             = 12
    
    # Note that this is the number of pages!
    # To get the real size, check how large the pagesize is on your machine.
    #min-memory             = 1
    #allocatable-memory     = 1
    
    #repair-binary          = /usr/sbin/repair
    #repair-timeout         = 60
    #test-binary            =
    #test-timeout           = 60
    
    # The retry-timeout and repair limit are used to handle errors in a more robust
    # manner. Errors must persist for longer than retry-timeout to action a repair
    # or reboot, and if repair-maximum attempts are made without the test passing a
    # reboot is initiated anyway.
    #retry-timeout          = 60
    #repair-maximum         = 1
    
    watchdog-device = /dev/watchdog
    
    # Defaults compiled into the binary
    temperature-sensor      = /sys/class/thermal/thermal_zone0/temp
    max-temperature = 80
    
    # Defaults compiled into the binary
    #admin                  = root
    interval                = 10
    logtick                = 60
    log-dir         = /var/log/watchdog
    
    # This greatly decreases the chance that watchdog won't be scheduled before
    # your machine is really loaded
    realtime                = yes
    priority                = 1
    
    # Check if rsyslogd is still running by enabling the following line
    #pidfile                = /var/run/rsyslogd.pid
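
Since most of the file above is commented out, it can help to see only the settings that are actually in effect, for example before comparing the two boards. A minimal sketch, assuming a POSIX shell with grep and sed; the function name `active_settings` is hypothetical and not part of watchdog:

```shell
# Print only the active (uncommented, non-empty) lines of a
# watchdog.conf-style file. active_settings is a hypothetical helper.
active_settings() {
    # Drop blank lines and lines whose first non-space character is '#',
    # then normalise "key   = value" to "key=value".
    grep -Ev '^[[:space:]]*(#|$)' "$1" | sed -E 's/[[:space:]]*=[[:space:]]*/=/'
}

# Usage (path is the usual location, adjust if yours differs):
#   active_settings /etc/watchdog.conf
```

Saving the output on each board and running diff on the two files should show whether anything besides the gateway IP really differs.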

     

  2. Is this patch already included in -next? I can test it with a BananaPi M1.

     

    ::edit

    Commit messages seem to confirm it is in -next. ;) Bleh, more like in -dev. I need more coffee...

     

    ::edit2

    4.9.38 (next), only one test made.
    fs: btrfs
                                                                  random    random
                  kB  reclen    write  rewrite    read    reread    read     write
              102400       4     3951     5919    14107    14547     1337     4955
              102400      16     9908    14394    17473    32988     3338    10818
              102400     512    26440    27888    62459    73201    31875    30057
              102400    1024    26543    29481    61408    74973    36113    28225
              102400   16384    30315    34723    57029   109099   103228    38344
    
    5.1.0-sunxi #5.86 SMP Mon May 13 21:11:09 CEST 2019 armv7l GNU/Linux (3 tests made)
    fs: btrfs
                                                                  random    random
                  kB  reclen    write  rewrite    read    reread    read     write
              102400       4     6056     7318    19250    19427     1376     4393
              102400      16    16210    16483    42932    45765     5118    16473
              102400     512    57882    44149    58178    69361    38018    61066
              102400    1024    49587    55798    51267    78644    45696    63254
              102400   16384    30345    66470    64869    82639    80843    58395
    
                                                                  random    random
                  kB  reclen    write  rewrite    read    reread    read     write
              102400       4     5115     5813    18911    17605     1259     5124
              102400      16    16791    20273    18971    38345     4750    13755
              102400     512    36974    56462    62740    80449    33872    62893
              102400    1024    44808    34004    51920    83809    43962    66327
              102400   16384    57777    47181    44560    78107    79185    53640
    
                                                                  random    random
                  kB  reclen    write  rewrite    read    reread    read     write
              102400       4     4989     6186    16698    16474     1138     3173
              102400      16    11646    10015    38828    42461     4455    15354
              102400     512    47299    49030    57053    82134    29582    61737
              102400    1024    45468    35288    58417    81450    32958    64807
              102400   16384    55345    65883    78657    99487   104514    56825
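
The three 5.1.0 runs above vary quite a bit between each other (for example the write column at reclen 16384), so a per-column average is a fairer comparison than any single run. A rough sketch, assuming the result tables have been pasted into a text file and that awk is available; `avg_column` is a hypothetical helper:

```shell
# Average one column of the benchmark tables for a given record length.
# avg_column is a hypothetical helper; in the tables above column 3 is
# "write", column 4 is "rewrite", and so on.
avg_column() {
    # $1 = reclen to select, $2 = column number, table rows on stdin.
    # Header lines are skipped because their first field is not numeric.
    awk -v reclen="$1" -v col="$2" '
        $1 ~ /^[0-9]+$/ && $2 == reclen { sum += $col; n++ }
        END { if (n) printf "%.0f\n", sum / n }
    '
}

# Usage (results.txt is an assumed file name holding the pasted tables):
#   avg_column 16384 3 < results.txt
```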

    This was run on a 1TB spinning-rust drive:

    Device Model:     ST1000LM035-1RK172
    User Capacity:    1,000,204,886,016 bytes [1.00 TB]

  3. On 5/1/2019 at 9:40 PM, alien said:

    I tried the option bootdelay = 20; the default is bootdelay = 1.

    I also tried the reset button.

    Unfortunately, it does not help.

    I installed onto a SATA hard drive a few weeks ago without issues, but I can see something weird here.

     

    I suppose you should use rootdelay, not bootdelay.

     

    Quote

    The "bootdelay" variable in U-Boot indicates how long U-Boot will wait before it begins booting the system into Linux. Usually, you see it set to 'bootdelay=3' or 'bootdelay=5'; which will give the user 3 (or 5) seconds to type any key and stop the system from booting.

    Quote

    rootdelay= n

    Wait n seconds before trying to mount the root filesystem. This can be useful if the root filesystem is on a USB or Firewire device, as those disk devices take a bit longer to be discovered by the kernel.
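
To confirm that rootdelay actually reached the kernel (it is a kernel parameter, not a U-Boot variable), the running command line can be inspected. A small sketch; `get_rootdelay` is a hypothetical helper name. How the parameter gets added depends on the image; on Armbian I would expect an `extraargs=` line in /boot/armbianEnv.txt to do it, but treat that as an assumption and check your own boot script:

```shell
# Extract the rootdelay value from a kernel command line string.
# get_rootdelay is a hypothetical helper; it prints nothing when the
# parameter is absent.
get_rootdelay() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^rootdelay=//p'
}

# Usage on a running system:
#   get_rootdelay "$(cat /proc/cmdline)"
```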

     

  4. A little update, as I've updated and rebooted the board, but it's still here:

    root@bananapi:~# armbianmonitor -m >> armmon.log
    /usr/bin/armbianmonitor: line 385: read: read error: 0: Connection timed out
    /usr/bin/armbianmonitor: line 386: [: 43.2: integer expression expected
    /usr/bin/armbianmonitor: line 385: read: read error: 0: Connection timed out
    /usr/bin/armbianmonitor: line 386: [: 43.0: integer expression expected
    /usr/bin/armbianmonitor: line 385: read: read error: 0: Connection timed out
    /usr/bin/armbianmonitor: line 386: [: 43.1: integer expression expected
    /usr/bin/armbianmonitor: line 385: read: read error: 0: Connection timed out
    /usr/bin/armbianmonitor: line 386: [: 43.0: integer expression expected
    /usr/bin/armbianmonitor: line 385: read: read error: 0: Connection timed out
    /usr/bin/armbianmonitor: line 386: [: 42.9: integer expression expected
    /usr/bin/armbianmonitor: line 385: read: read error: 0: Connection timed out
    *snip*
    root@bananapi:~# dmesg
    *snip*
    [216532.441032] thermal thermal_zone0: failed to read out thermal zone (-110)
    [216618.559142] thermal thermal_zone0: failed to read out thermal zone (-110)
    [217500.285255] thermal thermal_zone0: failed to read out thermal zone (-110)
    [217887.448338] thermal thermal_zone0: failed to read out thermal zone (-110)
    [218005.568728] thermal thermal_zone0: failed to read out thermal zone (-110)
    [218091.686665] thermal thermal_zone0: failed to read out thermal zone (-110)
    [218295.924984] thermal thermal_zone0: failed to read out thermal zone (-110)
    [218392.699797] thermal thermal_zone0: failed to read out thermal zone (-110)
    [218478.817813] thermal thermal_zone0: failed to read out thermal zone (-110)
    [218876.637673] thermal thermal_zone0: failed to read out thermal zone (-110)
    [219080.876099] thermal thermal_zone0: failed to read out thermal zone (-110)
    [219274.425570] thermal thermal_zone0: failed to read out thermal zone (-110)
    [219371.232445] thermal thermal_zone0: failed to read out thermal zone (-110)
    [219554.125184] thermal thermal_zone0: failed to read out thermal zone (-110)
    [219672.245547] thermal thermal_zone0: failed to read out thermal zone (-110)
    [219962.601960] thermal thermal_zone0: failed to read out thermal zone (-110)
    [220156.183531] thermal thermal_zone0: failed to read out thermal zone (-110)
    root@bananapi:~# uname -a
    Linux bananapi 4.19.38-sunxi #5.83 SMP Fri May 3 18:05:49 CEST 2019 armv7l GNU/Linux
    root@bananapi:~# armbianmonitor -u
    System diagnosis information will now be uploaded to http://ix.io/1Ipx
    Please post the URL in the forum where you've been asked for.
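
For what it's worth, -110 is -ETIMEDOUT on Linux, which matches the "Connection timed out" read error. The second message looks like a plain shell limitation: `[ ... -gt ... ]` only accepts integers, and values like 43.2 are not. I have not checked what line 386 of armbianmonitor really does, so the following is only a sketch of the general problem and one possible workaround; `to_int` and the threshold of 60 are made up for illustration:

```shell
# The shell's [ -gt ] test only accepts integers.
temp="43.2"                  # sample value taken from the output above
# [ "$temp" -gt 60 ]         # would fail: "integer expression expected"

# Hypothetical workaround: drop the fractional part before comparing.
to_int() {
    printf '%s\n' "${1%%.*}"
}

if [ "$(to_int "$temp")" -gt 60 ]; then   # 60 is an arbitrary threshold
    echo "hot"
else
    echo "ok"
fi
```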

     

  5. @guidol I can spot some difference between "a board just booted" and "a board with some uptime doing things". Mine is running urbackup (mostly active two or three times a day) and nextcloud (mostly idle at the moment). I had issues running this setup for more than 24h (freezes), but it seems to be improving:

    root@bananapi:~# uptime
     08:56:57 up 2 days,  8:13,  3 users,  load average: 0.59, 0.56, 0.55

    And current uptime is still growing after some serious testing (running stress, running btrfs scrub and btrfs balance to simulate "typical" workload).

    11 hours ago, guidol said:

    much cooler and mine does show 4.94V-4.97V not 4.76V

    Not sure where this 4.76V came from? (I can't see it in the uploaded log.) I've changed the governor and min_freq as the board is idle most of the time (as mentioned above). I think that my power supply isn't that bad:

    root@bananapi:~# grep -v Time armmon.log | cut -d\. -f1,4,5 | cut -d\  -f8 | sed '/^$/d' | sort | uniq -c
          1 4.84V
          1 4.85V
          6 4.86V
          2 4.87V
          6 4.88V
         24 4.89V
         32 4.90V
         69 4.91V
        107 4.92V
         85 4.93V
        210 4.94V
        335 4.95V
        339 4.96V
        458 4.97V
        929 4.98V
       1739 4.99V
       3566 5.00V
       7853 5.01V
       8978 5.02V
       7890 5.03V
       2703 5.04V
        367 5.05V
         21 5.06V
    root@bananapi:~# wc -l armmon.log
    38129 armmon.log
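
The histogram can be reduced to a single number by weighting each voltage with its count. A minimal sketch, assuming the `uniq -c` output format shown above (count, then a value ending in "V"); `mean_voltage` is a hypothetical helper:

```shell
# Weighted mean of a "uniq -c" voltage histogram, i.e. lines such as
# "     24 4.89V". mean_voltage is a hypothetical helper.
mean_voltage() {
    awk '
        { v = $2; sub(/V$/, "", v); sum += $1 * v; n += $1 }
        END { if (n) printf "%.2f\n", sum / n }
    '
}

# Usage: append "| mean_voltage" to the grep/cut/sort/uniq pipeline above.
```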

    I can certainly try a newer kernel if that would help. :) I could also ignore it, as my uptime has finally passed 24h, but it would be nice to get that bug solved.

     

  6. Sorry for grave digging, but I have the same error messages on a BananaPi M1:

    root@bananapi:~# armbianmonitor -m >> armmon.log
    /usr/bin/armbianmonitor: line 385: read: read error: 0: Connection timed out
    /usr/bin/armbianmonitor: line 386: [: 43.5: integer expression expected
    /usr/bin/armbianmonitor: line 385: read: read error: 0: Connection timed out
    /usr/bin/armbianmonitor: line 386: [: 43.3: integer expression expected
    /usr/bin/armbianmonitor: line 385: read: read error: 0: Connection timed out
    /usr/bin/armbianmonitor: line 386: [: 43.2: integer expression expected
    /usr/bin/armbianmonitor: line 385: read: read error: 0: Connection timed out
    /usr/bin/armbianmonitor: line 386: [: 43.2: integer expression expected
    /usr/bin/armbianmonitor: line 385: read: read error: 0: Connection timed out
    /usr/bin/armbianmonitor: line 386: [: 43.4: integer expression expected
    /usr/bin/armbianmonitor: line 385: read: read error: 0: Connection timed out
    /usr/bin/armbianmonitor: line 386: [: 43.4: integer expression expected
    *snip*
    root@bananapi:~# dmesg | tail
    [74843.472544] thermal thermal_zone0: failed to read out thermal zone (-110)
    [74961.592838] thermal thermal_zone0: failed to read out thermal zone (-110)
    [80058.430841] thermal thermal_zone0: failed to read out thermal zone (-110)
    [81036.963935] thermal thermal_zone0: failed to read out thermal zone (-110)
    [83488.688937] thermal thermal_zone0: failed to read out thermal zone (-110)
    [84273.576483] thermal thermal_zone0: failed to read out thermal zone (-110)
    [85252.109538] thermal thermal_zone0: failed to read out thermal zone (-110)
    [85553.122760] thermal thermal_zone0: failed to read out thermal zone (-110)
    [85660.618355] thermal thermal_zone0: failed to read out thermal zone (-110)
    [86133.835726] thermal thermal_zone0: failed to read out thermal zone (-110)
    root@bananapi:~# armbianmonitor -u
    System diagnosis information will now be uploaded to http://ix.io/1HbP
    Please post the URL in the forum where you've been asked for.

    Not that I care about those errors, but maybe this can help get some bug fixed. :) I'm running armbianmonitor >> file because my BananaPi setup has short uptimes (it freezes and I need to restart it, but that's another story).

     

    PS Keep up the good work on Armbian!

  7. 13 hours ago, Larry Bank said:

    Yes, the physical pins line up with the RPI original 26-pin GPIO. Remember that the connector is on the "other side" of the board with respect to the RPI way of doing it.

    Yeah, I remember from somewhere that OPi has "compatible" GPIO with RPI, but rotated (180 degrees?).

     

    (As an important side note: it would be nice if the pin description on the breadboard adapter matched the actual pins. I can live without this, or with a piece of paper, but it's useful to have a valid description for kids. :) )