Posted

Hi,

I have two Tinker Board servers running the latest stable Armbian (Debian GNU/Linux 9) with kernel 4.19.*-rockchip. I've connected some HDDs to them, and the log shows that all HDDs spin up at the same time every day, as seen in the spoiler below. This happens on both servers at the same time of day.

Spoiler

# Server 1
May 13 06:25:26 Tinkerboard hd-idle[29538]: sda spinup
May 13 06:25:26 Tinkerboard hd-idle[29538]: sdc spinup
May 13 06:25:26 Tinkerboard hd-idle[29538]: sdb spinup
May 13 06:35:36 Tinkerboard hd-idle[29538]: sda spindown
May 13 06:35:45 Tinkerboard hd-idle[29538]: sdc spindown
May 13 06:35:45 Tinkerboard hd-idle[29538]: sdb spindown
May 14 06:26:07 Tinkerboard hd-idle[29538]: sda spinup
May 14 06:26:07 Tinkerboard hd-idle[29538]: sdc spinup
May 14 06:26:07 Tinkerboard hd-idle[29538]: sdb spinup
May 14 06:36:17 Tinkerboard hd-idle[29538]: sda spindown
May 14 06:36:26 Tinkerboard hd-idle[29538]: sdc spindown
May 14 06:36:26 Tinkerboard hd-idle[29538]: sdb spindown
May 15 06:25:51 Tinkerboard hd-idle[29538]: sda spinup
May 15 06:25:51 Tinkerboard hd-idle[29538]: sdc spinup
May 15 06:25:51 Tinkerboard hd-idle[29538]: sdb spinup
May 15 06:36:01 Tinkerboard hd-idle[29538]: sda spindown
May 15 06:36:10 Tinkerboard hd-idle[29538]: sdc spindown
May 15 06:36:10 Tinkerboard hd-idle[29538]: sdb spindown
May 16 06:25:33 Tinkerboard hd-idle[29538]: sda spinup
May 16 06:25:33 Tinkerboard hd-idle[29538]: sdc spinup
May 16 06:25:33 Tinkerboard hd-idle[29538]: sdb spinup
May 16 06:35:43 Tinkerboard hd-idle[29538]: sda spindown
May 16 06:35:52 Tinkerboard hd-idle[29538]: sdc spindown
May 16 06:35:52 Tinkerboard hd-idle[29538]: sdb spindown

# Server 2
May 13 06:25:37 Tinkerboard hd-idle[794]: sda spinup
May 13 06:25:37 Tinkerboard hd-idle[794]: sdb spinup
May 13 06:35:42 Tinkerboard hd-idle[794]: sda spindown
May 13 06:35:43 Tinkerboard hd-idle[794]: sdb spindown
May 14 06:25:45 Tinkerboard hd-idle[794]: sda spinup
May 14 06:25:45 Tinkerboard hd-idle[794]: sdb spinup
May 14 06:35:52 Tinkerboard hd-idle[794]: sda spindown
May 14 06:35:53 Tinkerboard hd-idle[794]: sdb spindown
May 15 06:25:48 Tinkerboard hd-idle[794]: sda spinup
May 15 06:25:48 Tinkerboard hd-idle[794]: sdb spinup
May 15 06:35:56 Tinkerboard hd-idle[794]: sda spindown
May 15 06:35:56 Tinkerboard hd-idle[794]: sdb spindown
May 16 06:25:44 Tinkerboard hd-idle[794]: sda spinup
May 16 06:25:44 Tinkerboard hd-idle[794]: sdb spinup
May 16 06:35:54 Tinkerboard hd-idle[794]: sda spindown
May 16 06:35:54 Tinkerboard hd-idle[794]: sdb spindown

 

I've been thinking of using fatrace to trace file access events around that time to find out which service spins up my HDDs. However, when I run fatrace as shown below, I get the message "Cannot initialize fanotify: Function not implemented". I've come across two bug reports that seem to be related to this issue: "some archs/kernels define O_LARGEFILE" and "Hardcoded KERNEL_O_LARGEFILE does not work on ARM". So I guess fatrace is not an option, given that fanotify does not seem to be enabled in the kernel configuration.

$ sudo fatrace -o /tmp/trace -s 60
Cannot initialize fanotify: Function not implemented


btrace doesn't seem to work either, as seen below.

$ sudo btrace /dev/sdb
BLKTRACESETUP(2) /dev/sdb failed: 5/Input/output error


Neither does iotop, as seen below.

$ sudo iotop
Could not run iotop as some of the requirements are not met:
- Linux >= 2.6.20 with
  - I/O accounting support (CONFIG_TASKSTATS, CONFIG_TASK_DELAY_ACCT, CONFIG_TASK_IO_ACCOUNTING)


Any recommendations for tools I can use to find out which process is spinning up my HDDs every day?

Thanks in advance!:)

Posted

I suspected that "Function not implemented" was related to the kernel configuration. However, I'm looking for a way to find the processes that access all three HDDs at the same time every day without disabling processes one by one...

 

I've found blktrace, which seems to give some relevant information, as seen below. So I'm going to run the following command a minute before the time my HDDs spin up:

$ sudo blktrace -d /dev/sdb -o - | blkparse -i -
  8,16   1        1     0.000000000    17  C   N [0]
  8,16   2        1 1266874889.708500152   888  G   N [smartd]
  8,16   2        2 1266874889.708505985   888  I   N 0 [smartd]
  8,16   2        3 1266874889.708510360   888  D   N 0 [smartd]
  8,16   1        2     5.001279610    17  C   N [0]
  8,16   1        3     5.002521825    17  C   N [65531]
  8,16   2        4     5.000177978   888  G   N [smartd]
  8,16   2        5     5.000184978   888  I   N 0 [smartd]
  8,16   2        6     5.000188770   888  D   N 0 [smartd]
  8,16   2        7     5.001649445 30968  G   N [kworker/2:2]
  8,16   2        8     5.001652070 30968  I   N 0 [kworker/2:2]
  8,16   2        9     5.001653528 30968  D   N 0 [kworker/2:2]

 

I'll also run strace to see if I can get some more information regarding this

sudo strace -f -e open -t ls 2>&1

 

Tracking syscalls with auditctl (the Linux Audit framework) is another possibility I've found; it might work if the kernel's CONFIG_AUDIT option is enabled. I'll look further into auditctl tomorrow if blktrace or strace doesn't turn up anything relevant.
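The plan above (start blktrace shortly before the expected spin-up) could be scripted roughly as follows. This is only a sketch: the helper names seconds_until and trace_at are hypothetical, and it assumes GNU date (for -d) and blktrace's -w option to limit the run time.

```shell
#!/bin/sh
# Sketch: wait until shortly before the expected spin-up, then trace one disk.
# Hypothetical helper names; assumes GNU date (-d) and blktrace's -w option.

# Print the number of seconds from "now" until the next occurrence of HH:MM.
# $1 = target time (HH:MM), $2 = optional current epoch (used for testing).
seconds_until() {
    now="${2:-$(date +%s)}"
    today=$(date -d "@$now" +%F)
    target=$(date -d "$today $1" +%s)
    # If the target has already passed today, aim for tomorrow
    # (adds a fixed 24 h, so it can be off by an hour across a DST change).
    if [ "$target" -le "$now" ]; then
        target=$((target + 86400))
    fi
    echo $((target - now))
}

# Trace a device for a fixed number of seconds and save the parsed output.
# $1 = device, $2 = start time (HH:MM), $3 = duration (s), $4 = output file.
trace_at() {
    sleep "$(seconds_until "$2")"
    blktrace -d "$1" -w "$3" -o - | blkparse -i - > "$4"
}

# Example (run as root): trace_at /dev/sdb 06:24 300 /tmp/sdb-trace.txt
```

The process name in the last column of the blkparse output (for example [smartd] above) is the part to look at around the spin-up time.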

Posted

As I was in the middle of the exam period when I posted this thread, I stopped all daily cron jobs as a shortcut, which fixed the issue. I've since found out that the daily cron job /etc/cron.daily/armbian-ram-logging is the problem. armbian-ram-logging contains

#!/bin/sh
/usr/lib/armbian/armbian-ramlog write >/dev/null 2>&1
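
The spin-ups logged at about 06:25 line up with the stock Debian /etc/crontab, which starts the /etc/cron.daily jobs at 06:25 when anacron is not installed. A minimal sketch for checking this on a given system follows; the function names are hypothetical, and it assumes the default crontab layout where the daily line mentions /etc/cron.daily.

```shell
#!/bin/sh
# Sketch: find out when cron.daily runs and which jobs it would start.
# Hypothetical helper names; assumes the stock Debian /etc/crontab layout.

# Print the start time (HH:MM) of the cron.daily entry in a crontab-style file.
# $1 = crontab file (default /etc/crontab).
daily_schedule() {
    awk '!/^[[:space:]]*#/ && /\/etc\/cron\.daily/ {
             printf "%02d:%02d\n", $2, $1   # field 1 = minute, field 2 = hour
         }' "${1:-/etc/crontab}"
}

# List the jobs run-parts would execute, without actually running them.
# $1 = directory (default /etc/cron.daily).
list_daily_jobs() {
    run-parts --test "${1:-/etc/cron.daily}"
}

# Example:
#   daily_schedule
#   list_daily_jobs
```

If anacron is installed, the daily jobs are started by anacron instead, and the time in /etc/crontab does not apply.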

/usr/lib/armbian/armbian-ramlog contains

Spoiler

#!/bin/bash
#
# Copyright (c) Authors: http://www.armbian.com/authors
#
# This file is licensed under the terms of the GNU General Public
# License version 2. This program is licensed "as is" without any
# warranty of any kind, whether express or implied.

SIZE=50M
USE_RSYNC=true
ENABLED=false

[ -f /etc/default/armbian-ramlog ] && . /etc/default/armbian-ramlog

[ "$ENABLED" != true ] && exit 0

# Never touch anything below here. Only edit /etc/default/armbian-ramlog

HDD_LOG=/var/log.hdd/
RAM_LOG=/var/log/
LOG2RAM_LOG="${HDD_LOG}armbian-ramlog.log"
LOG_OUTPUT="tee -a $LOG2RAM_LOG"

isSafe () {
    [ -d $HDD_LOG ] || (echo "ERROR: $HDD_LOG doesn't exist! Can't sync." >&2 ; exit 1)
    NoCache=$(which nocache 2>/dev/null)
}

RecreateLogs (){
    # in case of crash those services don't start if there are no dirs & logs
    check_if_installed apache2 && [ ! -d /var/log/apache2 ] && mkdir -p /var/log/apache2
    check_if_installed cron-apt && [ ! -d /var/log/cron-apt ] && \
        (mkdir -p /var/log/cron-apt ; touch /var/log/cron-apt/log)
    check_if_installed proftpd-basic && [ ! -d /var/log/proftpd ] && \
        (mkdir -p /var/log/proftpd ; touch /var/log/proftpd/controls.log)
    check_if_installed nginx && [ ! -d /var/log/nginx ] && \
        (mkdir -p /var/log/nginx ; touch /var/log/nginx/access.log ; touch /var/log/nginx/error.log)
    check_if_installed samba && [ ! -d /var/log/samba ] && mkdir -p /var/log/samba
    check_if_installed unattended-upgrades && [ ! -d /var/log/unattended-upgrades ] && mkdir -p /var/log/unattended-upgrades
    return 0
}

syncToDisk () {
    isSafe

    echo -e "\n\n$(date): Syncing logs from $LOG_TYPE to storage\n" | $LOG_OUTPUT

    if [ "$USE_RSYNC" = true ]; then
        ${NoCache} rsync -aXWv --delete --exclude armbian-ramlog.log --links $RAM_LOG $HDD_LOG 2>&1 | $LOG_OUTPUT
    else
        ${NoCache} cp -rfup $RAM_LOG -T $HDD_LOG 2>&1 | $LOG_OUTPUT
    fi

    sync
}

syncFromDisk () {
    isSafe

    echo -e "\n\n$(date): Loading logs from storage to $LOG_TYPE\n" | $LOG_OUTPUT

    if [ "$USE_RSYNC" = true ]; then
        ${NoCache} rsync -aXWv --delete --exclude armbian-ramlog.log --exclude *.gz --exclude='*.[0-9]' --links $HDD_LOG $RAM_LOG 2>&1 | $LOG_OUTPUT
    else
        ${NoCache} find $HDD_LOG* -maxdepth 1 -type f -not \( -name '*.[0-9]' -or -name '*.xz*' -or -name '*.gz' \) | xargs cp -ut $RAM_LOG
    fi

    sync
}

check_if_installed () {
    local DPKG_Status="$(dpkg -s "$1" 2>/dev/null | awk -F": " '/^Status/ {print $2}')"
    if [[ "X${DPKG_Status}" = "X" || "${DPKG_Status}" = *deinstall* ]]; then
        return 1
    else
        return 0
    fi
}

# Check whether zram device is available or we need to use tmpfs
if [ "$(blkid -s TYPE /dev/zram0 | awk ' { print $2 } ' | grep ext4)" ]; then
    LOG_TYPE="zram"
else
    LOG_TYPE="tmpfs"
fi

case "$1" in
    start)
        [ -d $HDD_LOG ] || mkdir -p $HDD_LOG
        mount --bind $RAM_LOG $HDD_LOG
        mount --make-private $HDD_LOG

        case $LOG_TYPE in
            zram)
                echo -e "Mounting /dev/zram0 as $RAM_LOG \c" | $LOG_OUTPUT
                mount -o discard /dev/zram0 $RAM_LOG 2>&1 | $LOG_OUTPUT
                ;;
            tmpfs)
                echo -e "Setting up $RAM_LOG as tmpfs \c" | $LOG_OUTPUT
                mount -t tmpfs -o nosuid,noexec,nodev,mode=0755,size=$SIZE armbian-ramlog $RAM_LOG 2>&1 | $LOG_OUTPUT
                ;;
        esac

        syncFromDisk
        RecreateLogs
        ;;
    stop)
        syncToDisk
        umount -l $RAM_LOG
        umount -l $HDD_LOG
        ;;
    write)
        syncToDisk
        ;;
    *)
        echo "Usage: ${0##*/} {start|stop|write}" >&2
        exit 1
        ;;
esac

 

 

/etc/default/armbian-ramlog contains

Spoiler

# configuration values for the armbian-ram-logging service
#
# enable the armbian-ram-logging service?
ENABLED=true
#
# size of the tmpfs mount -- please keep in mind to adjust /etc/default/armbian-zram-config too when increasing
SIZE=50M
#
# use rsync instead of cp -r
# requires rsync installed, may provide better performance
# due to copying only new and changed files
USE_RSYNC=true

 

 

So, I'm going to modify the file /etc/default/armbian-ramlog to prevent it from waking up my external hard drives every day. I'd much appreciate it if anyone could point out which lines in the armbian-ramlog script might wake up the hard drives. I'm thinking about line:

  • 81, with "... blkid -s TYPE /dev/zram0 ...", which I think wakes up all drives. What do you think?
  • 65, with the find command "... find $HDD_LOG ...", although it should only search for files in the folder $HDD_LOG (/var/log.hdd/).

 

@Igor, others might find this issue interesting, as armbian-ramlog currently spins up all connected drives, which shouldn't be necessary.

Posted
3 hours ago, Z11ntal33r said:

which I think wake up all drives. What do you think?


Yes, it looks like that is the problem, so this part needs to be done differently.

Posted

Something like this would probably be just fine:

 

Spoiler

--- a/packages/bsp/common/usr/lib/armbian/armbian-ramlog
+++ b/packages/bsp/common/usr/lib/armbian/armbian-ramlog
@@ -43,7 +43,7 @@ RecreateLogs (){
 syncToDisk () {
        isSafe
 
-       echo -e "\n\n$(date): Syncing logs from $LOG_TYPE to storage\n" | $LOG_OUTPUT
+       echo -e "\n\n$(date): Syncing logs to storage\n" | $LOG_OUTPUT
 
        if [ "$USE_RSYNC" = true ]; then
                ${NoCache} rsync -aXWv --delete --exclude "lost+found" --exclude armbian-ramlog.log --links $RAM_LOG $HDD_LOG 2>&1 | $LOG_OUTPUT
@@ -57,7 +57,7 @@ syncToDisk () {
 syncFromDisk () {
        isSafe
 
-       echo -e "\n\n$(date): Loading logs from storage to $LOG_TYPE\n" | $LOG_OUTPUT
+       echo -e "\n\n$(date): Loading logs from storage\n" | $LOG_OUTPUT
 
        if [ "$USE_RSYNC" = true ]; then
                ${NoCache} rsync -aXWv --delete --exclude "lost+found" --exclude armbian-ramlog.log --exclude *.gz --exclude='*.[0-9]' --links $HDD_LOG $RAM_LOG 2>&1 | $LOG_OUTPUT
@@ -77,19 +77,19 @@ check_if_installed () {
        fi
 }
 
-# Check whether zram device is available or we need to use tmpfs
-if [ "$(blkid -s TYPE /dev/zram0 | awk ' { print $2 } ' | grep ext4)" ]; then
-       LOG_TYPE="zram"
-else
-       LOG_TYPE="tmpfs"
-fi
-
 case "$1" in
        start)
                [ -d $HDD_LOG ] || mkdir -p $HDD_LOG
                mount --bind $RAM_LOG $HDD_LOG
                mount --make-private $HDD_LOG
 
+               # Check whether zram device is available or we need to use tmpfs
+               if [ "$(blkid -s TYPE /dev/zram0 | awk ' { print $2 } ' | grep ext4)" ]; then
+                       LOG_TYPE="zram"
+               else
+                       LOG_TYPE="tmpfs"
+               fi
+
                case $LOG_TYPE in
                        zram)
                                echo -e "Mounting /dev/zram0 as $RAM_LOG \c" | $LOG_OUTPUT

 

 

Posted
49 minutes ago, Z11ntal33r said:

I changed the time for daily cron jobs to trigger etc/cron.daily/armbian-ram-logging. So, we are still not there yet... Please let me know if you want more testing etc


But now it should not touch the hard drives at all ... unless your /var/log is on the hard drive.

Edit: it works for me. The HDDs remain asleep after executing the command found in the cron job:
 

/usr/lib/armbian/armbian-ramlog write

 

Posted

I know, yet the hard drives spin up... :mellow: /var/log is on my microSD card. My hard drives are mounted via fstab to folders in "/mnt/usb*". Everything else is on my microSD card, so I have no clue why armbian-ramlog is still spinning up my hard drives...

Posted

 

22 minutes ago, Z11ntal33r said:

so I've no clue why armbian-ramlog is still spinning up my hard drives...


Perhaps restarting crond?

service cron restart

or by rebooting?

Posted

I rebooted after the changes last time and have now also tried restarting cron, but the result is still the same. All hard drives spin up during the daily cron run and spin down after 10 minutes. So there seems to be something we are missing.
 

Posted
9 hours ago, Z11ntal33r said:

All hard drives spin up during daily cron and spin down after 10 minutes. So there seems to be something we are missing. 


The armbian-ramlog part only executes "armbian-ramlog write", and if that command doesn't spin up the HDDs ... ?

 

Do you have anything else in daily cron?

Posted

I have several other jobs in daily cron, but none of them trigger HDD spin-ups. I looked into this by removing all jobs and adding them back one by one until the HDDs spun up during the daily cron run. Commenting out the contents of /etc/cron.daily/armbian-ram-logging, without removing the file, stops my HDDs from spinning up.

 

I’ll look into this later today when I get home. The way you have tested this is by running “/usr/lib/armbian/armbian-ramlog write”, correct?

Posted
8 minutes ago, Z11ntal33r said:

The way you have tested this is by running “/usr/lib/armbian/armbian-ramlog write”, correct?


Exactly. There was a problem in our script: the command blkid -s TYPE /dev/zram0 spins all hard drives up, but now it only gets executed at boot time, when the service is enabled.

Posted
2 hours ago, Z11ntal33r said:

I’ve several other jobs in daily cron,

Could there be a correlation?
After updating the code, you could try again by removing all of your jobs and adding them back one by one.

 

Posted

I've found the issue. I only checked the hd-idle log to see whether the HDDs were spinning, and it reports that the HDDs are spinning up when they actually are not. This is likely because hd-idle reads disk statistics from /proc/diskstats and then writes its log to the systemd journal, as seen below. The hard drives are not spinning up and everything seems fine, apart from the incorrect logging. Thanks for the great support! :)

Server 1

Spoiler

| => sudo /usr/lib/armbian/armbian-ramlog write
  
Mon Jun 17 17:36:44 CEST 2019: Syncing logs to storage

sending incremental file list
auth.log
daemon.log
lastlog
messages
syslog
user.log
wtmp

sent 10,126,622 bytes  received 159 bytes  20,253,562.00 bytes/sec
total size is 27,876,771  speedup is 2.75

 => journalctl -u hd-idle
   
Jun 17 17:36:59 Tinkerboard hd-idle[774]: sda spinup
Jun 17 17:36:59 Tinkerboard hd-idle[774]: sdb spinup

Running the same command again gives


| => sudo /usr/lib/armbian/armbian-ramlog write

Mon Jun 17 18:06:37 CEST 2019: Syncing logs to storage

sending incremental file list
aptitude
auth.log
daemon.log
dpkg.log
fail2ban.log
syslog
apt/
apt/eipp.log.xz
apt/history.log
apt/term.log
nginx/access.log

 => journalctl -u hd-idle
Jun 17 17:36:59 Tinkerboard hd-idle[774]: sda spinup
Jun 17 17:36:59 Tinkerboard hd-idle[774]: sdb spinup
Jun 17 17:47:09 Tinkerboard hd-idle[774]: sda spindown
Jun 17 17:47:09 Tinkerboard hd-idle[774]: sdb spindown

Third time running the command gives


| => sudo /usr/lib/armbian/armbian-ramlog write

Mon Jun 17 18:11:57 CEST 2019: Syncing logs to storage

sending incremental file list
auth.log
daemon.log
syslog

| => journalctl -u hd-idle
Jun 17 17:36:59 Tinkerboard hd-idle[774]: sda spinup
Jun 17 17:36:59 Tinkerboard hd-idle[774]: sdb spinup
Jun 17 17:47:09 Tinkerboard hd-idle[774]: sda spindown
Jun 17 17:47:09 Tinkerboard hd-idle[774]: sdb spindown
Jun 17 18:07:29 Tinkerboard hd-idle[774]: sda spinup
Jun 17 18:07:29 Tinkerboard hd-idle[774]: sdb spinup

However, checking both drives with smartctl verifies that the hard drives are not spinning up, even when the log says they are.
 


| => sudo smartctl -i -d sat -n standby /dev/sda
smartctl 6.6 2017-11-05 r4594 [armv7l-linux-4.19.41-rockchip] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

Device is in STANDBY mode, exit(2)
________________________________________________________________________________
| => sudo smartctl -i -d sat -n standby /dev/sdb
smartctl 6.6 2017-11-05 r4594 [armv7l-linux-4.19.41-rockchip] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

Device is in STANDBY mode, exit(2)

| => journalctl -u hd-idle
Jun 17 17:36:59 Tinkerboard hd-idle[774]: sda spinup
Jun 17 17:36:59 Tinkerboard hd-idle[774]: sdb spinup
Jun 17 17:47:09 Tinkerboard hd-idle[774]: sda spindown
Jun 17 17:47:09 Tinkerboard hd-idle[774]: sdb spindown
Jun 17 18:07:29 Tinkerboard hd-idle[774]: sda spinup
Jun 17 18:07:29 Tinkerboard hd-idle[774]: sdb spinup
Jun 17 18:22:44 Tinkerboard hd-idle[774]: sda spindown
Jun 17 18:22:44 Tinkerboard hd-idle[774]: sdb spindown

 



 

Server 2

Spoiler

| => sudo /usr/lib/armbian/armbian-ramlog write

Mon 17 Jun 2019 18:37:24 CEST: Syncing logs to storage

sending incremental file list
deleting user.log.1
deleting syslog.1
deleting php7.0-fpm.log.1
deleting messages.1
deleting kern.log.1
deleting fail2ban.log.1
deleting debug.1
deleting daemon.log.1
deleting auth.log.1
deleting armbian-hardware-monitor.log.1.gz
deleting nginx/access.log.1
deleting samba/log.smbd.1
deleting samba/log.nmbd.1
./
aptitude
armbian-hardware-monitor.log
auth.log
daemon.log
debug
dpkg.log
fail2ban.log
kern.log
lastlog
messages
openvpn.log
php7.0-fpm.log
syslog
user.log
wtmp
apt/
apt/eipp.log.xz
apt/history.log
apt/term.log
nginx/
nginx/access.log
samba/
samba/log.nmbd
samba/log.smbd

sent 18,116,509 bytes  received 681 bytes  36,234,380.00 bytes/sec
total size is 18,110,220  speedup is 1.00

| => journalctl -u hd-idle
Jun 17 18:37:32 Tinkerboard hd-idle[862]: sda spinup
Jun 17 18:37:32 Tinkerboard hd-idle[862]: sdb spinup
Jun 17 18:37:32 Tinkerboard hd-idle[862]: sdc spinup

| => sudo smartctl -i -d sat -n standby /dev/sda
smartctl 6.6 2016-05-31 r4324 [armv7l-linux-4.19.41-rockchip] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

Device is in STANDBY mode, exit(2)
________________________________________________________________________________
| => sudo smartctl -i -d sat -n standby /dev/sdb
smartctl 6.6 2016-05-31 r4324 [armv7l-linux-4.19.41-rockchip] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

Device is in STANDBY mode, exit(2)
________________________________________________________________________________
| => sudo smartctl -i -d sat -n standby /dev/sdc
smartctl 6.6 2016-05-31 r4324 [armv7l-linux-4.19.41-rockchip] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

Device is in STANDBY mode, exit(2)
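
Since hd-idle is said above to rely on /proc/diskstats, one way to cross-check its log is to snapshot a disk's counters before and after running the command. The following is a minimal sketch: the helper name disk_counters is hypothetical, and the choice of the sectors-read and sectors-written columns is an assumption about which counters matter, not something confirmed in this thread.

```shell
#!/bin/sh
# Sketch: print the sectors-read and sectors-written counters of one disk.
# Hypothetical helper; which counters hd-idle actually watches is an assumption.

# $1 = device name (e.g. sda), $2 = stats file (default /proc/diskstats).
# In /proc/diskstats, field 3 is the device name, field 6 the sectors read
# and field 10 the sectors written.
disk_counters() {
    awk -v dev="$1" '$3 == dev { print $6, $10 }' "${2:-/proc/diskstats}"
}

# Example (run as root):
#   before=$(disk_counters sda)
#   /usr/lib/armbian/armbian-ramlog write
#   after=$(disk_counters sda)
#   [ "$before" = "$after" ] && echo "sda counters unchanged"
```

If the counters change while "smartctl -n standby" still reports STANDBY, the spinup entry in the hd-idle log reflects counter activity rather than a platter that is actually spinning.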

 

Posted

I've found that SATA HDDs still spin up during the daily cron job. A short investigation showed that the cause is the "sync" command in the syncToDisk() function. Replacing it with "sync /var" (or "sync /", to be safe) completely fixes the problem.

 

Posted

@bxm, so in both functions, syncToDisk () and syncFromDisk (), it should be "sync /" instead of just "sync" (without the quotes), and then the issue is fixed? If that's the case, could you open a pull request on GitHub for the change?

 

Update

I can confirm that changing "sync" to either "sync /" or "sync /var" neither woke up my HDDs nor triggered any spin-up logging according to "systemctl status hd-idle". Thanks! :)
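
For context on why the operand matters: with GNU coreutils, a plain "sync" flushes every mounted file system, which includes the ones on the HDDs, whereas "sync FILE" only flushes the named file or directory, and "sync -f FILE" flushes the whole file system that contains it. A possible variant is sketched below; it assumes coreutils 8.24 or newer, the function name is hypothetical, and the -f form was not tested in this thread.

```shell
#!/bin/sh
# Sketch: flush only the file system that holds the on-disk log copy,
# instead of all mounted file systems.
# Assumes GNU coreutils >= 8.24 (sync with file operands and -f).
# sync_log_fs is a hypothetical name, not part of armbian-ramlog.

# $1 = directory on the target file system (default /var/log.hdd/).
sync_log_fs() {
    sync -f "${1:-/var/log.hdd/}"
}

# Example: call sync_log_fs at the end of syncToDisk in place of a bare "sync".
```

Compared with "sync /", which only flushes the root directory entry itself, the -f form also covers the log files that were just copied, while still leaving the file systems on the HDDs alone.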
