Linux NTP Kernel unsync flag remains long after NTP&Kernel have PPL sync
Darryl Miles  
Newsgroups: comp.protocols.time.ntp
From: darryl-mailingli...@netbauds.net (Darryl Miles)
Date: Mon, 25 Aug 2008 08:52:20 GMT
Local: Mon 25 Aug 2008 09:52
Subject: Linux NTP Kernel unsync flag remains long after NTP&Kernel have PPL sync
I am currently using NTP 4.2.2p1.

My problem is with the management of the "unsync" flag in the kernel.  
This is visible from the "status" line in :

ntpdc> kerninfo
pll offset:           4.7e-05 s
pll frequency:        -62.146 ppm
maximum error:        16.384 s
estimated error:      16.384 s
status:               0041  pll unsync
pll time constant:    2
precision:            1e-06 s
frequency tolerance:  512 ppm
ntpdc>
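
For anyone scripting against this output, here is a minimal Python sketch of pulling the fields out of the text above. The function names are hypothetical, and it assumes the status word is printed in hexadecimal (consistent with the 0x41 = PLL|UNSYNC reading used later in this post) and that the numeric fields carry a trailing unit.

```python
# Sample text copied from the kerninfo output above.
SAMPLE = """\
pll offset:           4.7e-05 s
pll frequency:        -62.146 ppm
maximum error:        16.384 s
estimated error:      16.384 s
status:               0041  pll unsync
pll time constant:    2
precision:            1e-06 s
frequency tolerance:  512 ppm
"""


def parse_kerninfo(text):
    """Return a dict mapping each field name to its raw value string."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # skip prompt lines and anything else without a colon
            fields[name.strip()] = value.strip()
    return fields


def status_word(fields):
    """Status looks like '0041  pll unsync': a hex word, then flag names."""
    return int(fields["status"].split()[0], 16)


def numeric(fields, name):
    """Drop the unit suffix ('s', 'ppm') and return the number."""
    return float(fields[name].split()[0])


fields = parse_kerninfo(SAMPLE)
print(hex(status_word(fields)), numeric(fields, "maximum error"))
```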

Older versions of NTP used to manage this flag. This is my understanding
of how things used to work (based on observation rather than real understanding):

 * Kernel would bootup and by default the status would have the "unsync"
bit set.
 * Then NTP would be started.
 * NTP would take a few minutes to obtain PLL lock with multiple time
sources.
 * Then select a preferred source as candidate to configure the kernel.
 * NTP would then configure the kernel PLL to obtain convergence.
 *** Once convergence was complete the 0x40 UNSYNC bitwise flag would be
reset in the kernel by NTP. ***
 * NTP would continue to monitor/manage/update the kernel PLL.

It appears that somewhere between versions 4.2.0.a.20040617 and 4.2.2p1
the penultimate item in the list above stopped occurring.

Question 1) Can someone confirm whether the "UNSYNC" status flag held
inside the kernel is arbitrary, i.e. it's just an informational flag and
is independent of the operation / function of NTP?

Question 2) Am I correctly interpreting the purpose of the UNSYNC flag
?  I have a periodic script that runs and checks to see if adjtimex
reports the nominal status of 0x01 PLL, as opposed to (anything else for
example 0x41 = PLL|UNSYNC).   This used to provide me with a mechanism
to alert me to problems with NTP configuration when a host became UNSYNC
so that as administrator I could investigate why a system became unsync.
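
A sketch of the kind of check described in Question 2, in Python. The bit values are the STA_* constants from the Linux <sys/timex.h> header; the function names are made up for illustration, and how the status word is obtained (adjtimex, ntpdc, etc.) is left out.

```python
# Status bits as defined in Linux <sys/timex.h>.
STATUS_FLAGS = {
    0x0001: "PLL",
    0x0002: "PPSFREQ",
    0x0004: "PPSTIME",
    0x0008: "FLL",
    0x0010: "INS",
    0x0020: "DEL",
    0x0040: "UNSYNC",
    0x0080: "FREQHOLD",
}
STA_PLL = 0x0001
STA_UNSYNC = 0x0040


def decode_status(status):
    """List the names of the known flags set in the status word."""
    return [name for bit, name in STATUS_FLAGS.items() if status & bit]


def is_nominal(status):
    """The strict test from the post: exactly PLL and nothing else."""
    return status == STA_PLL


def is_unsync(status):
    """Looser test: only look at the UNSYNC bit, ignore the others."""
    return bool(status & STA_UNSYNC)


print(decode_status(0x41), is_nominal(0x41), is_unsync(0x41))
```

Testing only the UNSYNC bit is probably the more robust of the two, since other status bits may legitimately be set and would make the strict equality test fail.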

Question 3) If NTP exits/crashes, does the kernel automatically re-arm
the UNSYNC flag if the PLL data has not been updated within a specified
period of time (say within 3 minutes)?  I.e. will the kernel fail safe
back to UNSYNC if it can clearly observe that no application has called
the appropriate NTP API to keep the UNSYNC status flag cleared?  That
would be a sort of watchdog that does the correct thing in case of failure.

When I googled this problem I found a suggestion that the "enable kernel"
command can do the trick.  I do use NTP keys between my external data
sources, and when I tried this command in ntpdc it asked me for a key.

ntpdc> enable kernel
Keyid: 1
MD5 Password:
***Permission denied
ntpdc>

Both systems run ntp as non-root, both systems have the appropriate Linux kernel capability bit set CAP_SYS_TIME :

# cat /proc/3268/status
CapInh: 0000000002000000
CapPrm: 0000000002000000
CapEff: 0000000002000000

I guess in order to configure the PLL ntp is going to need that capability anyway.
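
To double-check that the masks above really carry CAP_SYS_TIME, a small Python sketch follows. It assumes CAP_SYS_TIME is capability number 25, as defined in <linux/capability.h>; the helper names are hypothetical.

```python
CAP_SYS_TIME = 25  # bit number from <linux/capability.h>

# The three lines quoted above from /proc/<pid>/status.
SAMPLE = """\
CapInh: 0000000002000000
CapPrm: 0000000002000000
CapEff: 0000000002000000
"""


def parse_caps(status_text):
    """Extract the Cap* hex masks from /proc/<pid>/status text."""
    caps = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.startswith("Cap"):
            caps[key] = int(value.strip(), 16)
    return caps


def has_capability(mask, cap):
    """True if bit number `cap` is set in the capability mask."""
    return bool(mask & (1 << cap))


caps = parse_caps(SAMPLE)
print(has_capability(caps["CapEff"], CAP_SYS_TIME))
```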

So I'm at a bit of a loss as to why the UNSYNC flag sticks long after
both NTP and the kernel have obtained a good enough PLL sync to
believe they are "in-step".

Thanks,

Darryl


Unruh  
Newsgroups: comp.protocols.time.ntp
From: Unruh <unruh-s...@physics.ubc.ca>
Date: Tue, 26 Aug 2008 19:22:57 GMT
Local: Tues 26 Aug 2008 20:22
Subject: Re: Linux NTP Kernel unsync flag remains long after NTP&Kernel have PPL sync

darryl-mailingli...@netbauds.net (Darryl Miles) writes:
>I am currently using NTP 4.2.2p1.
>My problem is with the management of the "unsync" flag in the kernel.  
>This is visible from the "status" line in :

The kernel manages that flag. It has problems, for example the 11 min
write-to-rtc, which can mess up any attempt to maintain rtc statistics and
drift. I.e. it is better to have it off.

ntp has nothing to do with the kernel. The kernel is Linus Torvalds's
business, not David Mills's (much to the latter's annoyance when the kernel
people mess up the kernel timekeeping).

As far as I know, the only purpose of that flag is to turn on the 11 min
rtc procedure (every 11 min the kernel resets the rtc to the current system
time), which is a very inaccurate procedure.

> * NTP would continue to monitor/manage/update the kernel PLL.
>It appears since between version 4.2.0.a.20040617 and 4.2.2p1 the
>penultimate item in the list above is no longer occurring.
>Question 1) Can someone confirm is the "UNSYNC" status flag held inside
>the kernel is arbitrary, i.e. its just an informational flag and is
>independent of the operation / function of NTP ?
>Question 2) Am I correctly interpreting the purpose of the UNSYNC flag
>?  I have a periodic script that runs and checks to see if adjtimex
>reports the nominal status of 0x01 PLL, as opposed to (anything else for
>example 0x41 = PLL|UNSYNC).   This used to provide me with a mechanism
>to alert me to problems with NTP configuration when a host became UNSYNC
>so that as administrator I could investigate why a system became unsync.

A far better idea is to monitor the offset from the ntp servers to let you
know if there is a clock problem.
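
One way to do that, sketched in Python, is to read the offset of the selected peer from the `ntpq -pn` billboard. The sample text below is hypothetical, not output from the poster's systems. The sketch assumes the usual column order (remote, refid, st, t, when, poll, reach, delay, offset, jitter), with offset in milliseconds and the system peer marked by a leading `*`. The function names and the 10 ms limit are made up for illustration.

```python
# Hypothetical billboard text, for illustration only.
SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*192.0.2.10      .GPS.            1 u   33   64  377    1.234    0.047   0.012
+192.0.2.11      192.0.2.10       2 u   12   64  377    2.345   -0.310   0.100
"""


def system_peer_offset_ms(billboard):
    """Return the offset (ms) of the '*' peer, or None if none is selected."""
    for line in billboard.splitlines():
        if line.startswith("*"):
            columns = line[1:].split()
            return float(columns[8])  # 9th column is the offset
    return None


def offset_alarm(offset_ms, limit_ms=10.0):
    """Alarm when there is no system peer or the offset is too large."""
    return offset_ms is None or abs(offset_ms) > limit_ms


offset = system_peer_offset_ms(SAMPLE)
print(offset, offset_alarm(offset))
```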

>Question 3) If NTP exits/crashes does the kernel automatically re-arm
>the UNSYNC flag if the PLL data has not been updated within a specified
>period of time (like within 3 minutes) ?  i.e. the kernel will fail-safe
>back to UNSYNC if it can clearly observe that no application has called
>the appropiate NTP API to keep the UNSYNC status flag muted.  This is a
>sort of watchdog that does the correct thing in the case of failure ?

I do not think so.

Leave it unsynced. It serves no useful purpose AFAIK. hwclock is a much
better tool for setting the rtc, and does a much better job of it (including
determining the drift of the rtc and compensating for it).


David Woolley  
Newsgroups: comp.protocols.time.ntp
From: David Woolley <da...@ex.djwhome.demon.co.uk.invalid>
Date: Tue, 26 Aug 2008 21:56:30 +0100
Local: Tues 26 Aug 2008 21:56
Subject: Re: Linux NTP Kernel unsync flag remains long after NTP&Kernel have PPL sync

Unruh wrote:
> darryl-mailingli...@netbauds.net (Darryl Miles) writes:

>> I am currently using NTP 4.2.2p1.

Rather old.

>> My problem is with the management of the "unsync" flag in the kernel.  
>> This is visible from the "status" line in :

> The kernel manages that flag. It has problems, for example the 11 min
> write-to-rtc which can mess up any attempt to maintain rtc statistics and
> drift. Ie it is better to have it off

ntpd manages that flag.  The kernel only changes it on a manual time set
or when the estimated error overflows.

Status is preset to just UNSYNC.

>> * Then NTP would be started.
>> * NTP would take a few minutes to obtain PLL lock with multiple time
>> sources.

ntpd obtains PLL lock with a mix of all the good time sources, unless
you force it to use one.  It chooses the reference source long before it
has tight PLL lock.

>> * Then select a preferred source as candidate to configure the kernel.

ntpd's source selection doesn't affect what is fed to the kernel.
Unless one specifically inhibits it, that is a mix of all the valid time
sources.

>> * NTP would then configure the kernel PLL to obtain convergence.
>> *** Once convergence was complete the 0x40 UNSYNC bitwise flag would be
>> reset in the kernel by NTP. ***

ntpd releases UNSYNC as soon as it starts disciplining the kernel, which
is long before the PLL has stabilised.

> ntp has nothing to do with the kernel. The kernel is Linus Torvald's
> business, not David Mills (much to that latter's annoyance when the kernel
> people mess up the kernel timekeeping).

This part of the kernel was contributed by the NTP project!  The UNSYNC
flag is not masked out in the API, so is controlled by ntpd.

> As far as I know, the only purpose of that flag is turn on the 11 min rtc
> procedure ( evey 11 min the kernel resets the rtc to the current system
> time) with a very inaccurate procedure.

It might not be optimal, but it does take steps to improve accuracy.

>> * NTP would continue to monitor/manage/update the kernel PLL.

>> It appears since between version 4.2.0.a.20040617 and 4.2.2p1 the
>> penultimate item in the list above is no longer occurring.

>> Question 1) Can someone confirm is the "UNSYNC" status flag held inside
>> the kernel is arbitrary, i.e. its just an informational flag and is
>> independent of the operation / function of NTP ?

As noted, I believe it controls setting of the RTC every 11 minutes.
It doesn't look like it affects the behaviour of the kernel
discipline.  But then you could have looked at the code, the same as I did.

>> Question 2) Am I correctly interpreting the purpose of the UNSYNC flag
>> ?  I have a periodic script that runs and checks to see if adjtimex
>> reports the nominal status of 0x01 PLL, as opposed to (anything else for
>> example 0x41 = PLL|UNSYNC).   This used to provide me with a mechanism
>> to alert me to problems with NTP configuration when a host became UNSYNC
>> so that as administrator I could investigate why a system became unsync.

> A far better idea is to monitor the offset from the ntp servers to let you
> know if there is a clock problem.

UNSYNC will tell you when you haven't had updates.

>> Question 3) If NTP exits/crashes does the kernel automatically re-arm
>> the UNSYNC flag if the PLL data has not been updated within a specified
>> period of time (like within 3 minutes) ?  i.e. the kernel will fail-safe
>> back to UNSYNC if it can clearly observe that no application has called
>> the appropiate NTP API to keep the UNSYNC status flag muted.  This is a
>> sort of watchdog that does the correct thing in the case of failure ?

UNSYNC gets set (2.4 kernel) when you manually set the time or the
estimated error overflows.  In your case, the estimated error is at the
end stop.  That could be cause or effect, as setting the time manually
also forces the error to maximum.

>> When I googled this problem I found a suggestion that "enable kernel"
>> command can do the trick.  I do use NTP keys between my external data
>> sources and when I tried this command into ntpdc it asked me for a key.

The kernel discipline is enabled by default, provided that you don't
have configuration parameters that are incompatible.  If you did have
incompatible parameters, I don't believe you would get a PLL state.

>> ntpdc> enable kernel
>> Keyid: 1
>> MD5 Password:
>> ***Permission denied
>> ntpdc>

>> Both systems run ntp as non-root, both systems have the appropriate Linux kernel capability bit set CAP_SYS_TIME :

I'm not sure if that mode is supported in the official code.

Being unsynced indicates a problem.  The end stop estimated errors also
indicate a problem.  If you don't want the 11 minute mode, build a
kernel without it.

David L. Mills  
Newsgroups: comp.protocols.time.ntp
From: "David L. Mills" <mi...@udel.edu>
Date: Tue, 26 Aug 2008 21:00:38 +0000
Local: Tues 26 Aug 2008 22:00
Subject: Re: Linux NTP Kernel unsync flag remains long after NTP&Kernel have PPL sync
Darryl,

It doesn't make sense to manage that bit. Use the maximum error
statistic instead.

From the time the client is first started until it sets the clock, this
statistic will be large (~16 s), as it is in your example. Once the clock
has been set, the statistic is set to the synchronization distance
determined by the daemon.

If the daemon crashes or loses all sources, the kernel will increase the
distance as required by the specification. Application programs can
establish their own bound (~1 s) above which they consider the clock
unsynchronized. The problem with managing the bit is that the kernel
doesn't know your particular bound.
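
A rough Python sketch of an application-side bound of this kind. The function names are hypothetical and the 1 s default is just the example figure above. The growth estimate assumes the maximum error grows at the "frequency tolerance" rate shown in the kerninfo output earlier in the thread (512 ppm) once updates stop. That rate is an assumption for illustration, not something confirmed here.

```python
def is_synchronized(max_error_s, bound_s=1.0):
    """Treat the clock as synchronized while max error is within the bound."""
    return max_error_s <= bound_s


def seconds_until_bound(max_error_s, bound_s=1.0, growth_ppm=512.0):
    """Estimate how long max error takes to reach the bound without updates.

    Assumes linear growth at growth_ppm microseconds per second.
    """
    if max_error_s >= bound_s:
        return 0.0
    return (bound_s - max_error_s) / (growth_ppm * 1e-6)


print(is_synchronized(16.384))    # the value reported in the original post
print(seconds_until_bound(0.1))   # headroom left from a 0.1 s max error
```

Under that assumption, a clock with a maximum error of 0.1 s would take roughly half an hour without updates to cross a 1 s bound, so each application can pick the bound that matches how quickly it needs to notice a dead daemon.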

Dave


David Woolley  
Newsgroups: comp.protocols.time.ntp
From: David Woolley <da...@ex.djwhome.demon.co.uk.invalid>
Date: Tue, 26 Aug 2008 22:56:30 +0100
Local: Tues 26 Aug 2008 22:56
Subject: Re: Linux NTP Kernel unsync flag remains long after NTP&Kernel have PPL sync

David L. Mills wrote:

> It doesn't make sense to manage that bit. Use the maximum error
> statistic instead.

Whilst that is a good suggestion, those statistics are also showing an
alarm state in this case.

David L. Mills  
Newsgroups: comp.protocols.time.ntp
From: "David L. Mills" <mi...@udel.edu>
Date: Wed, 27 Aug 2008 02:47:47 +0000
Local: Wed 27 Aug 2008 03:47
Subject: Re: Linux NTP Kernel unsync flag remains long after NTP&Kernel have PPL sync
David,

The bit is never set, so the system calls never show error.

Dave


David Woolley  
Newsgroups: comp.protocols.time.ntp
From: David Woolley <da...@ex.djwhome.demon.co.uk.invalid>
Date: Wed, 27 Aug 2008 07:29:20 +0100
Local: Wed 27 Aug 2008 07:29
Subject: Re: Linux NTP Kernel unsync flag remains long after NTP&Kernel have PPL sync

David L. Mills wrote:
> David,

> The bit is never set, so the system calls never show error.

That conflicts with the evidence presented by the questioner.  I think
it is true that ntpd never sets it in the kernel (although 4.2.4p4, which
is more recent than his, does set it in the user space copy).  However
the kernel does set it, as I already noted, on startup, when the time is
set manually, and when the estimated error hits its end stop.

However, that is largely irrelevant, as one could rephrase the question
as: earlier versions of ntpd used to set the estimated error to some
low value when started, so why is his version leaving it at 16+ seconds?

(I suspect user error.)


David L. Mills  
Newsgroups: comp.protocols.time.ntp
From: "David L. Mills" <mi...@udel.edu>
Date: Wed, 27 Aug 2008 15:21:52 +0000
Local: Wed 27 Aug 2008 16:21
Subject: Re: Linux NTP Kernel unsync flag remains long after NTP&Kernel have PPL sync
David,

The NTP development version on the web (p125) does not set the
STA_UNSYNC bit anywhere. A grep for this bit shows only legacy means for
ntpdc to clear it. While the production version on the web is dated one
day before the development version, its ntp_loopfilter.c file is dated
February 2007 and does set it.

Unfortunately, the production version and stable version are on two
different tracks and with different heritage of individual modules. I
would hope that a version of the release date would have been
synchronized to the development version of that date, but this is not
the case. Accordingly, you can't believe anything I say, nor can I fix
anything you report, unless you are using a relatively recent
development version. This holds true for all presumed features, bugs and
documentation.

As some of you know, I have been working full time since June 2007
cleaning up the code, aligning to the NTPv4 specification, adding new
features and rewriting much of the web documentation. The core protocol
modules in the production version date from late 2006 and early 2007, so
most of the work reported to this list and the hackers list is not in
the production version. So, if you suspect I have done something evil
and are using the production version, I can't help you.

Dave


Discussion subject changed to "Linux NTP Kernel unsync flag remains long after NTP & Kernel have PPL sync" by Steve Kostecke
Steve Kostecke  
Newsgroups: comp.protocols.time.ntp
From: Steve Kostecke <koste...@ntp.org>
Date: 27 Aug 2008 16:50:14 GMT
Local: Wed 27 Aug 2008 17:50
Subject: Re: Linux NTP Kernel unsync flag remains long after NTP & Kernel have PPL sync
On 2008-08-27, David L. Mills <mi...@udel.edu> wrote:

> David Woolley wrote:

>> David L. Mills wrote:

>>> The bit is never set, so the system calls never show error.

>> That conflicts with the evidence presented by the questioner.  I think
>> it is true that ntpd never sets it in the kernel(although 4.2.4p4 (which
>> is more recent than his) does set it in the user space copy.

> The NTP development version on the web (p125) does not set the
> STA_UNSYNC bit anywhere. A grep for this bit shows only legacy means for
> ntpdc to clear it. While the production version on the web is dated one
> day before the development version, its ntp_loopfilter.c file is dated
> February 2007 and does set it.

[snip]

> As some of you know, I have been working full time since June 2007
> cleaning up the code, aligning to the NTPv4 specification, adding new
> features and rewriting much of the web documentation. The core protocol
> modules in the production version date from late 2006 and early 2007, so
> most of the work reported to this list and the hackers list is not in
> the production version. So, if you suspect I have done something evil
> and are using the production version, I can't help you.

Both the NTP-stable and NTP-Dev releases are given equal billing on the
NTP Project download page (http://www.ntp.org/downloads.html) and the
NTP Public Services Project download page
(http://support.ntp.org/download).

All NTP releases are announced in a variety of ways which are detailed
at http://support.ntp.org/bin/view/Main/ReleaseNotifications. We do
not yet offer a dedicated release announcements mailing list but would
consider doing so if there was sufficient interest.

Notifications for every NTP-dev release are sent to the hackers@ mailing
list at the time of the release. These notifications contain change
information as well as download links.

We have made an effort to ensure that all releases are well publicized.
But we can't control what version of NTP is shipped with the many OSes
which are out there. Plus, quite a few people prefer to stick with the
software versions which are pre-packaged for and shipped with their
particular OS.

For BSD users FreshPorts currently lists ntp-devel 4.2.5p122
and ntp 4.2.4p4 at http://www.freshports.org/net/ntp-devel/ and
http://www.freshports.org/net/ntp/ respectively.

Debian does not ship any ntp-dev packages so I have set up a build
system for Debian packages of ntp-dev (against the current Debian
stable release "etch" on x86). These packages are available from
http://packages.ntp.org/debian.

If there is sufficient interest "we" could look at leveraging
the openSUSE Build Service, https://build.opensuse.org/, for
building packages for other Linux OSes and architectures. A list
of the supported RPM-based OSes is available about half-way down
http://en.opensuse.org/Build_Service/cross_distribution_package_how_to.
I may need assistance from someone well versed in building RPM packages
and setting up RPM archives.

--
Steve Kostecke <koste...@ntp.org>
NTP Public Services Project - http://support.ntp.org/

