
How can I set the watchdog timeout of H2/H3 to 16 seconds?


edupv


I think 16 seconds is the maximum watchdog timeout on H2/H3. However, I cannot set it to 16 seconds.

 

With watchdog-timeout = 16 in /etc/watchdog.conf, the daemon reports a timeout of 11 seconds:

/etc/watchdog.conf 
max-load-1		= 24
watchdog-device	= /dev/watchdog
realtime		= yes
priority		= 1
watchdog-timeout = 16
interval = 3




root@orangepizero:~# systemctl status watchdog.service
● watchdog.service - watchdog daemon
   Loaded: loaded (/lib/systemd/system/watchdog.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2018-01-29 11:10:27 HKT; 3min 31s ago
  Process: 851 ExecStart=/bin/sh -c [ $run_watchdog != 1 ] || exec /usr/sbin/watchdog $watchdog_options (code=exited, status=0/SUCCESS)
  Process: 835 ExecStartPre=/bin/sh -c [ -z "${watchdog_module}" ] || [ "${watchdog_module}" = "none" ] || /sbin/modprobe $watchdog_module (code=exited, status=0/SUCCESS)
 Main PID: 855 (watchdog)
   CGroup: /system.slice/watchdog.service
           └─855 /usr/sbin/watchdog

Jan 29 11:10:27 orangepizero watchdog[855]: int=3s realtime=yes sync=no soft=no mla=24 mem=0
Jan 29 11:10:27 orangepizero watchdog[855]: ping: no machine to check
Jan 29 11:10:27 orangepizero watchdog[855]: file: no file to check
Jan 29 11:10:27 orangepizero systemd[1]: Started watchdog daemon.
Jan 29 11:10:27 orangepizero watchdog[855]: pidfile: no server process to check
Jan 29 11:10:27 orangepizero watchdog[855]: interface: no interface to check
Jan 29 11:10:27 orangepizero watchdog[855]: temperature: no sensors to check
Jan 29 11:10:27 orangepizero watchdog[855]: test=none(0) repair=none(0) alive=/dev/watchdog heartbeat=none to=root no_act=no force=no
Jan 29 11:10:27 orangepizero watchdog[855]: watchdog now set to 11 seconds
Jan 29 11:10:27 orangepizero watchdog[855]: hardware watchdog identity: sunxi_wdt
root@orangepizero:~#

 

With watchdog-timeout = 7 in /etc/watchdog.conf, the daemon reports a timeout of 6 seconds:

/etc/watchdog.conf
max-load-1		= 24
watchdog-device	= /dev/watchdog
realtime		= yes
priority		= 1
watchdog-timeout = 7
interval = 3


root@orangepizero:~# systemctl status watchdog.service
● watchdog.service - watchdog daemon
   Loaded: loaded (/lib/systemd/system/watchdog.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2018-01-29 11:16:49 HKT; 1min 12s ago
  Process: 846 ExecStart=/bin/sh -c [ $run_watchdog != 1 ] || exec /usr/sbin/watchdog $watchdog_options (code=exited, status=0/SUCCESS)
  Process: 823 ExecStartPre=/bin/sh -c [ -z "${watchdog_module}" ] || [ "${watchdog_module}" = "none" ] || /sbin/modprobe $watchdog_module (code=exited, status=0/SUCCESS)
 Main PID: 849 (watchdog)
   CGroup: /system.slice/watchdog.service
           └─849 /usr/sbin/watchdog

Jan 29 11:16:49 orangepizero watchdog[849]: int=3s realtime=yes sync=no soft=no mla=24 mem=0
Jan 29 11:16:49 orangepizero watchdog[849]: ping: no machine to check
Jan 29 11:16:49 orangepizero watchdog[849]: file: no file to check
Jan 29 11:16:49 orangepizero systemd[1]: Started watchdog daemon.
Jan 29 11:16:49 orangepizero watchdog[849]: pidfile: no server process to check
Jan 29 11:16:49 orangepizero watchdog[849]: interface: no interface to check
Jan 29 11:16:49 orangepizero watchdog[849]: temperature: no sensors to check
Jan 29 11:16:49 orangepizero watchdog[849]: test=none(0) repair=none(0) alive=/dev/watchdog heartbeat=none to=root no_act=no force=no
Jan 29 11:16:49 orangepizero watchdog[849]: watchdog now set to 6 seconds
Jan 29 11:16:49 orangepizero watchdog[849]: hardware watchdog identity: sunxi_wdt
root@orangepizero:~#

 

How can I set the watchdog timeout to 16 seconds?

 

Thanks.

 


7 hours ago, chrisf said:

Have you tested it?

According to this

https://github.com/torvalds/linux/blob/master/drivers/watchdog/sunxi_wdt.c#L71

There is no option for 11 seconds; the value 11 (0xB) maps to 16 seconds.
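For readers who would rather not read C, the lookup referred to above can be sketched in Python. The table is transcribed from my reading of wdt_timeout_map in the linked mainline driver and should be checked against that source. The names WDT_TIMEOUT_MAP and register_to_seconds are made up for this illustration, and the sketch assumes the number printed by the daemon is the register value rather than seconds.

```python
# Sketch only: seconds -> register value, transcribed from my reading of
# wdt_timeout_map in the linked sunxi_wdt.c (verify against the source).
WDT_TIMEOUT_MAP = {
    1: 0x1, 2: 0x2, 3: 0x3, 4: 0x4, 5: 0x5, 6: 0x6,
    8: 0x7, 10: 0x8, 12: 0x9, 14: 0xA, 16: 0xB,
}


def register_to_seconds(reg_value):
    """Return the timeout in seconds that a register value stands for."""
    for seconds, reg in WDT_TIMEOUT_MAP.items():
        if reg == reg_value:
            return seconds
    raise ValueError(f"no timeout uses register value {reg_value:#x}")


def supported_timeouts():
    """List the timeouts (in seconds) that have an entry in the table."""
    return sorted(WDT_TIMEOUT_MAP)
```

Under that reading, a reported value of 11 stands for 16 seconds, and 7 seconds is not among the supported timeouts at all, which would explain why a request for 7 ends up at a different value.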

Thanks for your reply.

I don't know C, so I seldom read the source code.

 

After reading your reply, I tested the timeout with a script. Yes, it is really 16 seconds.
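The script used for this test was not posted. A minimal sketch of one way to measure the real timeout is below; it is an assumption about how such a test could look, not the script that was actually run. The function name log_until_reset and the log path are hypothetical. The idea is to arm the watchdog with a single write, stop feeding it, and keep appending the elapsed time to a file that is synced to disk, so the last line that survives the reset shows roughly how long the board lasted.

```python
import os
import time


def log_until_reset(device="/dev/watchdog", log_path="/root/wdt_test.log",
                    max_seconds=60.0, tick=0.5):
    """Arm the watchdog once, never feed it again, and log elapsed seconds.

    On real hardware the board should reset before max_seconds is reached.
    Stop the watchdog daemon first: the device can only be opened once.
    """
    with open(device, "wb", buffering=0) as wdt, open(log_path, "w") as log:
        wdt.write(b"\0")            # a single write starts the countdown
        start = time.monotonic()
        while True:
            elapsed = time.monotonic() - start
            if elapsed >= max_seconds:
                break               # safety stop if no reset ever happens
            log.write(f"{elapsed:.1f}\n")
            log.flush()
            os.fsync(log.fileno())  # make sure the line survives the reset
            time.sleep(tick)
    # Closing the device without the magic 'V' character normally leaves the
    # watchdog running, so the board may still reset after this returns.
    return elapsed
```

After the reboot, the last line of the log file gives the approximate timeout, to within one tick.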

 

Thanks again for your help.


10 hours ago, rufik said:

How is your watchdog doing? :)

Does it work reliably over time?

I think it is working fine.

However, the watchdog only acts when the system hangs; it does nothing during normal operation. That is why I said "I think".

 


My OPI PC (mainline kernel) freezes from time to time: it responds to ping, but I cannot SSH into it (the session waits forever, then disconnects), and services do not respond either. It looks like an OOM or similar problem. I cannot check it via the serial console because the board is at a remote location.
So I thought a watchdog would be useful there, just to reset the board in such cases.


42 minutes ago, rufik said:

My OPI PC (mainline kernel) freezes from time to time: it responds to ping, but I cannot SSH into it (the session waits forever, then disconnects), and services do not respond either. It looks like an OOM or similar problem. I cannot check it via the serial console because the board is at a remote location.
So I thought a watchdog would be useful there, just to reset the board in such cases.

If your OPi PC still responds to ping, it is not frozen, and the watchdog will normally not reset it.

I think you have to check, for example, whether sshd is listening on the correct port/interface, the firewall rules, and so on. If your OPi PC is not directly connected to the internet, you should also check the port forwarding rules on your router.

 


I've already checked: the firewall is disabled all the time, because the OPI PC is inside my LAN. Nmap shows open ports 22, 8123 (HomeAssistant), 3306 (MySQL) and so on. But every service accepts the TCP connection and then does not respond at all, terminating the connection after some timeout.

Sshd accepts the connection, asks for the password and hangs... until timeout. Ping works :) So it looks like some internal OS problem, maybe with memory or spawning processes/threads? That's why I'd like to try out the watchdog.
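If the suspicion about spawning processes is right, the hardware watchdog alone may not fire, because the kernel is still alive. The watchdog daemon can run a user-supplied check via its test-binary option and act when that check exits non-zero. The sketch below is one hypothetical check, not something tried in this thread: it tests whether a trivial child process can be started and finishes within a timeout. The function name can_spawn, the default command /bin/true and the 5-second limit are all assumptions.

```python
import subprocess


def can_spawn(cmd=("/bin/true",), timeout=5.0):
    """Return True if a trivial child process starts and exits cleanly in time."""
    try:
        result = subprocess.run(list(cmd), timeout=timeout)
    except subprocess.TimeoutExpired:
        return False    # the child was started but did not finish in time
    except OSError:
        return False    # the child could not be started at all
    return result.returncode == 0
```

To use it as a test-binary, the script would end with something like `sys.exit(0 if can_spawn() else 1)`, so that a failed spawn is reported to the daemon as a non-zero exit code.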


I'm just getting an error when starting the watchdog service on an OPI2 (Ubuntu Bionic, kernel 4.14.70): cannot open /dev/watchdog (errno = 16 = 'Device or resource busy').

rufik@farmer:~$ sudo systemctl status watchdog
● watchdog.service - watchdog daemon
   Loaded: loaded (/lib/systemd/system/watchdog.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2018-10-11 10:52:08 CEST; 15s ago
  Process: 17410 ExecStopPost=/bin/sh -c [ $run_wd_keepalive != 1 ] || false (code=exited, status=1/FAILURE)
  Process: 17436 ExecStart=/bin/sh -c [ $run_watchdog != 1 ] || exec /usr/sbin/watchdog $watchdog_options (code=exited, status=0/SUCCESS)
  Process: 17433 ExecStartPre=/bin/sh -c [ -z "${watchdog_module}" ] || [ "${watchdog_module}" = "none" ] || /sbin/modprobe $watchdog_module (code=exited, status=
 Main PID: 17438 (watchdog)
   CGroup: /system.slice/watchdog.service
           └─17438 /usr/sbin/watchdog

Oct 11 10:52:08 farmer watchdog[17438]: starting daemon (5.14):
Oct 11 10:52:08 farmer watchdog[17438]: int=3s realtime=yes sync=no soft=no mla=0 mem=0
Oct 11 10:52:08 farmer watchdog[17438]: ping: no machine to check
Oct 11 10:52:08 farmer watchdog[17438]: file: no file to check
Oct 11 10:52:08 farmer watchdog[17438]: pidfile: no server process to check
Oct 11 10:52:08 farmer watchdog[17438]: interface: no interface to check
Oct 11 10:52:08 farmer watchdog[17438]: temperature: no sensors to check
Oct 11 10:52:08 farmer watchdog[17438]: test=none(0) repair=none(0) alive=/dev/watchdog heartbeat=none to=root no_act=no force=no
Oct 11 10:52:08 farmer watchdog[17438]: cannot open /dev/watchdog (errno = 16 = 'Device or resource busy')
Oct 11 10:52:08 farmer systemd[1]: Started watchdog daemon.

 

But /dev/watchdog does not seem to be open:

rufik@farmer:~$ sudo fuser -v /dev/watchdog
rufik@farmer:~$ sudo lsof /dev/watchdog

 

I have disabled the wd_keepalive daemon. Is it really required to run, or is it mutually exclusive with the watchdog daemon?
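As a cross-check of the empty fuser and lsof output, the open file descriptors under /proc can be scanned directly. The sketch below is only a diagnostic idea. It assumes, without confirmation from this thread, that the device might be held through a second node such as /dev/watchdog0. The function name find_holders is made up, and it needs to run as root to see other users' processes.

```python
import os


def find_holders(targets=("/dev/watchdog", "/dev/watchdog0"), proc_root="/proc"):
    """Return (pid, path) pairs for processes holding one of the targets open."""
    holders = []
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue                    # skip non-process entries like "self"
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue                    # process exited or access was denied
        for fd in fds:
            try:
                link = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue
            if link in targets:
                holders.append((int(pid), link))
    return holders
```

An empty list from find_holders() would suggest that no userspace process has the device open, so the busy state would have to come from somewhere else.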

