<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Daemon-Reload on RESEARCHUT</title><link>https://researchut.com/tags/daemon-reload/</link><description>Recent content in Daemon-Reload on RESEARCHUT</description><generator>Hugo -- gohugo.io</generator><language>en</language><managingEditor>rrs@researchut.com (Ritesh Raj Sarraf)</managingEditor><webMaster>rrs@researchut.com (Ritesh Raj Sarraf)</webMaster><lastBuildDate>Fri, 22 Apr 2022 00:00:00 +0000</lastBuildDate><atom:link href="https://researchut.com/tags/daemon-reload/index.xml" rel="self" type="application/rss+xml"/><item><title>Systemd Service Hang</title><link>https://researchut.com/blog/Systemd_Service_Hang/</link><pubDate>Fri, 22 Apr 2022 00:00:00 +0000</pubDate><author>rrs@researchut.com (Ritesh Raj Sarraf)</author><guid>https://researchut.com/blog/Systemd_Service_Hang/</guid><description>&lt;p>Finally, TIL what can cause systemd services to hang indefinitely. The internet is flooded with reports on this topic but offers no clear answers. So no more useless workarounds like &lt;code>systemctl daemon-reload&lt;/code> and &lt;code>systemctl daemon-reexec&lt;/code> for this scenario.&lt;/p>
&lt;p>The scene would be something along the lines of:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-shell" data-lang="shell">&lt;span style="display:flex;">&lt;span>rrs &lt;span style="color:#ae81ff">6467&lt;/span> 0.0 0.0 &lt;span style="color:#ae81ff">23088&lt;/span> &lt;span style="color:#ae81ff">15852&lt;/span> pts/1 Ss 12:53 0:00 | | &lt;span style="color:#ae81ff">\_&lt;/span> /bin/bash
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>rrs &lt;span style="color:#ae81ff">11512&lt;/span> 0.0 0.0 &lt;span style="color:#ae81ff">14876&lt;/span> &lt;span style="color:#ae81ff">4608&lt;/span> pts/1 S+ 13:18 0:00 | | | &lt;span style="color:#ae81ff">\_&lt;/span> systemctl restart snapper-timeline.timer
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>rrs &lt;span style="color:#ae81ff">11513&lt;/span> 0.0 0.0 &lt;span style="color:#ae81ff">14984&lt;/span> &lt;span style="color:#ae81ff">3076&lt;/span> pts/1 S+ 13:18 0:00 | | | &lt;span style="color:#ae81ff">\_&lt;/span> /bin/systemd-tty-ask-password-agent --watch
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>rrs &lt;span style="color:#ae81ff">11514&lt;/span> 0.0 0.0 &lt;span style="color:#ae81ff">234756&lt;/span> &lt;span style="color:#ae81ff">6752&lt;/span> pts/1 Sl+ 13:18 0:00 | | | &lt;span style="color:#ae81ff">\_&lt;/span> /usr/bin/pkttyagent --notify-fd &lt;span style="color:#ae81ff">5&lt;/span> --fallback
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The &lt;code>snapper-timeline&lt;/code> service is important to me, and it not running for months is a complete failure. Disappointingly, commands like &lt;code>systemctl --failed&lt;/code> do not report this oddity. The overall system status is reported to be fine, which is completely incorrect.&lt;/p>
&lt;p>Thankfully, a kind soul&amp;rsquo;s &lt;a href="https://github.com/NixOS/nixpkgs/issues/2584#issuecomment-42616675">comment&lt;/a> gave the hint. The problem is that certain services can be stuck in &lt;code>Activating&lt;/code> status, which then quietly blocks the other services queued behind them. So much for the unnecessary fun.&lt;/p>
&lt;p>Looking further, in my case, it was:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-shell" data-lang="shell">&lt;span style="display:flex;">&lt;span>rrs@priyasi:~$ systemctl list-jobs
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>JOB UNIT TYPE STATE
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ae81ff">81&lt;/span> timers.target start waiting
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ae81ff">85&lt;/span> man-db.timer start waiting
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ae81ff">88&lt;/span> fstrim.timer start waiting
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ae81ff">3832&lt;/span> snapper-timeline.service start waiting
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ae81ff">83&lt;/span> snapper-timeline.timer start waiting
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ae81ff">39&lt;/span> systemd-time-wait-sync.service start running
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ae81ff">87&lt;/span> logrotate.timer start waiting
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ae81ff">84&lt;/span> debspawn-clear-caches.timer start waiting
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ae81ff">89&lt;/span> plocate-updatedb.timer start waiting
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ae81ff">91&lt;/span> dpkg-db-backup.timer start waiting
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ae81ff">93&lt;/span> e2scrub_all.timer start waiting
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ae81ff">40&lt;/span> time-sync.target start waiting
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ae81ff">86&lt;/span> apt-listbugs.timer start waiting
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ae81ff">13&lt;/span> jobs listed.
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>13:12 ♒ ॐ ♅ ♄ ⛢ ☺ 😄
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>That was it. The &lt;code>systemd-timesyncd&lt;/code> service had given me enough headaches in the past. And so it was this time, quietly doing it all again.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-shell" data-lang="shell">&lt;span style="display:flex;">&lt;span>rrs@priyasi:~$ systemctl status systemd-time-wait-sync.service
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>● systemd-time-wait-sync.service - Wait Until Kernel Time Synchronized
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> Loaded: loaded &lt;span style="color:#f92672">(&lt;/span>/lib/systemd/system/systemd-time-wait-sync.service; enabled; vendor preset&amp;gt;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> Active: activating &lt;span style="color:#f92672">(&lt;/span>start&lt;span style="color:#f92672">)&lt;/span> since Fri 2022-04-22 13:14:25 IST; 1min 38s ago
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> Docs: man:systemd-time-wait-sync.service&lt;span style="color:#f92672">(&lt;/span>8&lt;span style="color:#f92672">)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> Main PID: &lt;span style="color:#ae81ff">11090&lt;/span> &lt;span style="color:#f92672">(&lt;/span>systemd-time-wa&lt;span style="color:#f92672">)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> Tasks: &lt;span style="color:#ae81ff">1&lt;/span> &lt;span style="color:#f92672">(&lt;/span>limit: 37051&lt;span style="color:#f92672">)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> Memory: 836.0K
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> CPU: 7ms
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> CGroup: /system.slice/systemd-time-wait-sync.service
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> └─11090 /lib/systemd/systemd-time-wait-sync
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Apr &lt;span style="color:#ae81ff">22&lt;/span> 13:14:25 priyasi systemd&lt;span style="color:#f92672">[&lt;/span>1&lt;span style="color:#f92672">]&lt;/span>: Starting Wait Until Kernel Time Synchronized...
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Apr &lt;span style="color:#ae81ff">22&lt;/span> 13:14:25 priyasi systemd-time-wait-sync&lt;span style="color:#f92672">[&lt;/span>11090&lt;span style="color:#f92672">]&lt;/span>: adjtime state &lt;span style="color:#ae81ff">5&lt;/span> status &lt;span style="color:#ae81ff">40&lt;/span> time Fri 2022-&amp;gt;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>13:16 ♒ ॐ ♅ ♄ ⛢ ☹ 😟&lt;span style="color:#f92672">=&lt;/span>&amp;gt; &lt;span style="color:#ae81ff">3&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Dear LazyWeb, does anybody know why the &lt;code>systemd-time-wait-sync&lt;/code> service would hang indefinitely? I&amp;rsquo;ve had identical setups on many machines, in the same network, and the others don&amp;rsquo;t exhibit this problem.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-shell" data-lang="shell">&lt;span style="display:flex;">&lt;span>rrs@priyasi:~$ systemctl cat systemd-time-wait-sync.service
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>...snipped...
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#f92672">[&lt;/span>Service&lt;span style="color:#f92672">]&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Type&lt;span style="color:#f92672">=&lt;/span>oneshot
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>ExecStart&lt;span style="color:#f92672">=&lt;/span>/lib/systemd/systemd-time-wait-sync
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>TimeoutStartSec&lt;span style="color:#f92672">=&lt;/span>infinity
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>RemainAfterExit&lt;span style="color:#f92672">=&lt;/span>yes
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#f92672">[&lt;/span>Install&lt;span style="color:#f92672">]&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>WantedBy&lt;span style="color:#f92672">=&lt;/span>sysinit.target
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>TimeoutStartSec=infinity&lt;/code> is definitely an attribute that shouldn&amp;rsquo;t be shipped in any system service. There are use cases for it, but that should be left for local admins to explicitly decide. Hanging for &lt;code>infinity&lt;/code> is not a desired behavior for a system service.&lt;/p>
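&lt;p>For a local admin who would rather cap the wait, a drop-in override is one way to do it. This is only a minimal sketch: the &lt;code>override.conf&lt;/code> file name and the &lt;code>90s&lt;/code> value are assumptions for illustration, not something shipped by systemd, and the right limit depends on how long the machine legitimately needs to get its first sync.&lt;/p>
&lt;pre tabindex="0">&lt;code># /etc/systemd/system/systemd-time-wait-sync.service.d/override.conf
# Assumed file name and value, for illustration only.
[Service]
# Give up on the start job after 90 seconds instead of waiting forever.
TimeoutStartSec=90s
&lt;/code>&lt;/pre>&lt;p>After a &lt;code>systemctl daemon-reload&lt;/code>, a start that exceeds the limit should be marked as failed, so it would at least show up in &lt;code>systemctl --failed&lt;/code> instead of silently holding up the queue.&lt;/p>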
&lt;p>While figuring all this out, I learnt the handy &lt;code>systemctl list-jobs&lt;/code> command, which lists the active &lt;code>running/blocked/waiting&lt;/code> jobs.&lt;/p>
&lt;h2 id="update-2024-08-15">Update: 2024-08-15&lt;/h2>
&lt;p>This week I finally found the cause of the issue. I have a bunch of bridge interfaces defined on my machine, and most of the time all of them would be &lt;code>DOWN&lt;/code>:&lt;/p>
&lt;pre tabindex="0">&lt;code>
@ ip a
1: lo: &amp;lt;LOOPBACK,UP,LOWER_UP&amp;gt; mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host noprefixroute
valid_lft forever preferred_lft forever
2: withnet: &amp;lt;NO-CARRIER,BROADCAST,MULTICAST,UP&amp;gt; mtu 1500 qdisc noqueue state DOWN group default qlen 1000
link/ether XXXXXXXXXXXXXXXXX brd ff:ff:ff:ff:ff:ff
3: nonet: &amp;lt;NO-CARRIER,BROADCAST,MULTICAST,UP&amp;gt; mtu 1500 qdisc noqueue state DOWN group default qlen 1000
link/ether XXXXXXXXXXXXXXXXX brd ff:ff:ff:ff:ff:ff
4: tap0: &amp;lt;NO-CARRIER,BROADCAST,MULTICAST,UP&amp;gt; mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
link/ether XXXXXXXXXXXXXXXXX brd ff:ff:ff:ff:ff:ff
6: wlan0: &amp;lt;BROADCAST,MULTICAST,UP,LOWER_UP&amp;gt; mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether XXXXXXXXXXXXXXXXX brd ff:ff:ff:ff:ff:ff
inet 10.42.1.66/24 brd 10.42.1.255 scope global dynamic noprefixroute wlan0
valid_lft 5325sec preferred_lft 5325sec
inet6 fe80::9bc0:e362:7c9d:be7c/64 scope link noprefixroute
valid_lft forever preferred_lft forever
7: email-laptop: &amp;lt;POINTOPOINT,NOARP,UP,LOWER_UP&amp;gt; mtu 1280 qdisc noqueue state UNKNOWN group default qlen 1000
link/none
inet XXXXXXXXXXXXXXXXX scope global noprefixroute email-laptop
valid_lft forever preferred_lft forever
inet6 XXXXXXXXXXXXXXXXX scope global noprefixroute
valid_lft forever preferred_lft forever
inet6 XXXXXXXXXXXXXXXXXXXXXXXX/64 scope link noprefixroute
valid_lft forever preferred_lft forever
⛢ 13:55:21 rrs@priyasi ~/pu/researchut-hugo researchut|+1…2
&lt;/code>&lt;/pre>&lt;p>And as it stands, that is the cause of the problem.&lt;/p>
&lt;p>&lt;code>systemd-timesyncd&lt;/code> would attempt to sync time for every interface defined under &lt;code>systemd-networkd&lt;/code>. Given the setup, that is bound to &lt;del>fail&lt;/del> time out, which is why I had added the override against:&lt;/p>
&lt;pre tabindex="0">&lt;code>TimeoutStartSec=infinity
&lt;/code>&lt;/pre>&lt;p>So, apparently, for every interface in &lt;code>networkd&lt;/code>, it&amp;rsquo;d attempt the NTP sync. It reports along the lines of:&lt;/p>
&lt;pre tabindex="0">&lt;code>Aug 15 13:24:06 priyasi systemd-timesyncd[1987]: Network configuration changed, trying to establish connection.
&lt;/code>&lt;/pre>&lt;p>So this week I learnt of the attribute to handle this, &lt;code>RequiredForOnline=no&lt;/code>:&lt;/p>
&lt;pre tabindex="0">&lt;code>@ cat sysbr0.network
[Match]
Name=withnet
[Network]
DHCPServer=yes
IPv4Forwarding=yes
IPv6Forwarding=yes
IPMasquerade=both
Address=192.168.1.1/24
LLMNR=yes
[Link]
RequiredForOnline=no
#[DHCPServer]
#DNS=192.168.1.1
&lt;/code>&lt;/pre>&lt;p>With that, NTP time synchronization is back to what it should be: precise synchronization of the system time.&lt;/p>
&lt;pre tabindex="0">&lt;code>ॐ 14:11:35 rrs@priyasi /etc/systemd/network
@ timedatectl show
LocalRTC=no
CanNTP=yes
NTP=yes
NTPSynchronized=yes
TimeUSec=Thu 2024-08-15 14:12:45 IST
RTCTimeUSec=Thu 2024-08-15 14:12:45 IST
&lt;/code>&lt;/pre></description></item></channel></rss>