: Differences between tags 'r2_2_1' and 'PPS_0_5_0' : Using module 'linux21' in CVS repository '/root/21REP/linux21' : This patch was created automatically at Mon Feb 15 22:18:33 MET 1999 Index: linux/Documentation/kernel-time.txt diff -u /dev/null linux/Documentation/kernel-time.txt:1.2.4.1 --- /dev/null Mon Feb 15 22:18:37 1999 +++ linux/Documentation/kernel-time.txt Mon Feb 15 22:14:46 1999 @@ -0,0 +1,322 @@ +Kernel Time Overview +==================== + +This document gives a short overview of the code that handles time +in the Linux kernel in order to make it understandable what all the +NTP stuff is about. The description concentrates on the user's view of +kernel time. + +1. Hardware +=========== + +The kernel uses the timer (programmable periodic interrupt at rate +``HZ'') to increment the kernel clock. The kernel clock consists of +two counters. The first one (``tv_sec'') counts the seconds since +1970-01-01, while the second one (``tv_usec'') manages the fraction of +the current second in microseconds. During boot the first value is +derived from the time found in the CMOS clock (also known as ``real +time clock'') which only has a resolution of one second. + +During each interrupt a specific amount (the "tick", 1000000/HZ) of +microseconds is added to the fractional second. If a second overflow +occurs, it is handled accordingly. Normally the new kernel time is not +written back to the CMOS clock, but it can be done. + +If an application program asks for the current time, it is possible to +get an estimate of the elapsed time since the last timer interrupt, by +extrapolating the time. That estimate is called ``fast time offset''. + +On the Intel 386 architecture the registers in the timer chip can be +read to get a resolution of less than one millisecond. + +On a Pentium the CPU's cycle counter can be used to estimate the time +since the last timer interrupt by scaling the cycles elapsed between +two successive timer interrupts. 
The resolution is significantly +higher than for the timer's counter register (close to or even below +one microsecond). Furthermore it can be read much faster than the +hardware registers of the timer chip. Unfortunately modern +motherboards may vary the CPU frequency to avoid overheating the +processor or to save power. + +Other architectures have similar fast timing facilities. + +These are the components that determine the "resolution" of the kernel +clock. Also the "stability" (reliability or predictability) and +precision (the error accumulated during reading the time) are +determined by the hardware. + +2. Software +=========== + +While the hardware is responsible for the resolution and precision +of the clock, the software is responsible for calibrating the clock +and programming the hardware. Linux uses an extended algorithm +to manage system time. That algorithm is basically the NTP kernel +clock model described in RFC 1589, RFC 1305 and related papers. + +As most clocks are not very precise or stable, it is necessary to +correct the kernel clock from time to time. Traditionally it is +possible to overwrite the current kernel counters to make the time +"step" (settimeofday). Also it's possible to initiate a more gradual +change in time by specifying the amount of time that is to be +changed. This is called "slewing" the clock (adjtime). + +When slewing the clock, the value of "tick" is modified by a small +amount, called "tickadj" (500/HZ). That way the clock seems to be +running faster or slower for some time, and then continues to run at +nominal speed. Obviously tickadj not only affects the speed at which +an adjustment is made, but also the resolution of the adjustment. +Usually there is no adjustment below tickadj. One reason against using +the smallest possible value for tickadj is that the correction must be +more than the clock's intrinsic error; otherwise you'll never keep up +with the error. 
+ +2.1 NTP extensions +================== + +Independent of which of the above two methods is used, the process has +to be repeated again and again to keep the deviation of the software +clock within a small interval. When you know the percentage of your +clock error (e.g. 15 seconds too fast within 24 hours), you can also +pass a correction value to the kernel, asking for automatic +correction of the frequency error. + +Such a correction value can be expressed in PPM (parts per million, +0.0001%) of the clock's "frequency". Additionally the enhanced clock +model can be told to learn from periodic adjustments to the time +offset, such that the virtual frequency of the kernel clock is +modified (you can't tune the hardware). Alternatively the frequency +can be set directly. As these corrections are limited by ``tickadj'' +(max <= HZ*tickadj), it's sometimes necessary to change the value of +``tick'' (if you want to keep tickadj small). + +Unfortunately specifying the frequency alone is not sufficient to make +a stable clock, because the frequency varies over time. These +variations in frequency are usually influenced by the environmental +temperature and are commonly named "drift". When the clock algorithm +learns from periodic adjustments, it also gets an estimate for the +drift. Thus at each timer interrupt a small "tolerance" (uncertainty) +is added to the clock's error. If the initial error is set when +synchronizing, the kernel maintains an estimate of the current maximum +error. + +The idea is to "synchronize" the kernel clock to a reference clock and +let it run free for a while, until the estimated error is too big to +be tolerated. 
(Here a few milliseconds is already "big", and typical +update intervals are between one minute and one hour.) + +To make the automatic correction work, the basic algorithm is modified +in a way that can deal with fractions of microseconds, and tick is +temporarily modified to yield a value that results in an error of less +than one microsecond per timer interrupt. + +Besides that regulation and error propagating machinery, the kernel +clock also handles insertion and deletion of leap seconds if it is +told to do so. + +Managing a precise and stable kernel time requires periodic updates to +the clock offset and estimated error (tolerance) of the kernel clock. +The xntpd daemon is a program to do that (see +http://www.eecis.udel.edu/~ntp), but there's also a way to calibrate +the kernel clock by external hardware. One of these methods is called +the PPS (Pulse Per Second) code. + +2.1.1 PPS calibration +===================== + +PPS calibration works very similarly to time calibration, but only on a +smaller scale. The code assumes the clock is closer than 0.5 seconds +to some reference time. + +Basically the code automatically compares the current clock offset for +each second with a high precision external pulse (with an error of +less than 200PPM, e.g. derived from a GPS receiver) and learns whether +the clock frequency is too slow or too fast. The code can be told to +adjust only the time offset, or to adjust the frequency of the kernel. +The precision achievable with PPS synchronization is significantly +higher than with usual adjustments. + +2.2 The new Linux implementation +================================ + +In the Linux implementation the CMOS clock is updated periodically +(every 11 minutes) from the kernel time when the clock is synchronized +(flag STA_UNSYNC is cleared). The flag STA_UNSYNC will automatically +be set, once the maximum error is above a fixed limit +(NTP_PHASE_LIMIT, currently about 16s). 
+ +Linux uses one function, adjtimex, to implement the functions for +classical adjtime() and the new ntp_gettime() and ntp_adjtime() +functions. In order to achieve that, the function uses one rather big +"struct kernel_timex" (currently just named ``timex'' to confuse +people) that contains a mode field and several state variables. By +setting individual bits in the modes field the specific function of +the adjtimex system call is selected while the remaining fields of the +structure are written or read as needed. + +struct kernel_timex { + unsigned int modes; /* mode selector */ + long offset; /* time offset (usec) */ + long freq; /* frequency offset (scaled ppm) */ + long maxerror; /* maximum error (usec) */ + long esterror; /* estimated error (usec) */ + int status; /* clock command/status */ + long constant; /* pll time constant */ + long precision; /* clock precision (usec) (read only) */ + long tolerance; /* clock frequency tolerance (ppm) + * (read only) + */ + struct timeval time; /* (read only) */ + long tick; /* (modified) usecs between clock ticks */ + + long ppsfreq; /* pps frequency (scaled ppm) (ro) */ + long jitter; /* pps jitter (us) (ro) */ + int shift; /* interval duration (s) (shift) (ro) */ + long stabil; /* pps stability (scaled ppm) (ro) */ + long jitcnt; /* jitter limit exceeded (ro) */ + long calcnt; /* calibration intervals (ro) */ + long errcnt; /* calibration errors (ro) */ + long stbcnt; /* stability limit exceeded (ro) */ + + int :32; int :32; int :32; int :32; + int tickadj; /* tickadj (rw) -- extension by UW */ + int :32; int :32; int :32; + int :32; int :32; int :32; int :32; +}; + +The official ntp_gettime() function only returns these data: + +struct ntptimeval { + struct timeval time; /* current time (ro) */ + long maxerror; /* maximum error (us) (ro) */ + long esterror; /* estimated error (us) (ro) */ +}; + +The official ntp_adjtime() function contains these data (note the lack +of `time'!): + +struct timex { + unsigned int 
modes; /* mode selector */ + long offset; /* time offset (usec) */ + long freq; /* frequency offset (scaled ppm) */ + long maxerror; /* maximum error (usec) */ + long esterror; /* estimated error (usec) */ + int status; /* clock command/status */ + long constant; /* pll time constant */ + long precision; /* clock precision (usec) (read only) */ + long tolerance; /* clock frequency tolerance (ppm) + * (read only) + */ + + long ppsfreq; /* pps frequency (scaled ppm) (ro) */ + long jitter; /* pps jitter (us) (ro) */ + int shift; /* interval duration (s) (shift) (ro) */ + long stabil; /* pps stability (scaled ppm) (ro) */ + long jitcnt; /* jitter limit exceeded (ro) */ + long calcnt; /* calibration intervals (ro) */ + long errcnt; /* calibration errors (ro) */ + long stbcnt; /* stability limit exceeded (ro) */ +}; + +The symbolic names of the bits used to select the operating mode +(`modes') of adjtimex() all start with "ADJ_" (being a superset of the +"MOD_" bits defined for the ntp_{get,adj}time() functions). The +specific status of the kernel clock (`status') is also expressed by a +set of bits, all starting with "STA_". 
+ +Thus the classical function "adjtime" can be implemented as follows: + +int adjtime(struct timeval *itv, struct timeval *otv) +{ + struct kernel_timex tx; + + tx.modes = 0; + if ( itv ) + { + tx.offset = itv->tv_sec * 1000000L + itv->tv_usec; + tx.modes = ADJ_OFFSET_SINGLESHOT; + } + if ( adjtimex(&tx) < 0 ) /* values >= 0 report the clock state */ + return -1; + if ( otv ) + { + otv->tv_sec = tx.offset / 1000000; + if ( tx.offset < 0 ) + otv->tv_usec = -(-tx.offset % 1000000); + else + otv->tv_usec = tx.offset % 1000000; + } + return 0; +} + +The "ntp_gettime" and "ntp_adjtime" functions can be implemented as +follows: + +int ntp_gettime(struct ntptimeval *tptr) +{ + struct kernel_timex tx; + int result; + + tx.modes = 0; + result = adjtimex(&tx); + tptr->time = tx.time; + tptr->maxerror = tx.maxerror; + tptr->esterror = tx.esterror; + return(result); +} + +int ntp_adjtime(struct timex *tptr) +{ + struct kernel_timex tx; + int result; + + tx.modes = tptr->modes; + tx.offset = tptr->offset; + tx.freq = tptr->freq; + tx.maxerror = tptr->maxerror; + tx.esterror = tptr->esterror; + tx.status = tptr->status; + tx.constant = tptr->constant; + /* precision is (ro) */ + /* tolerance is (ro) */ + /* ppsfreq is (ro) */ + /* jitter is (ro) */ + /* shift is (ro) */ + /* stabil is (ro) */ + /* jitcnt is (ro) */ + /* calcnt is (ro) */ + /* errcnt is (ro) */ + /* stbcnt is (ro) */ + result = adjtimex(&tx); + tptr->modes = tx.modes; + tptr->offset = tx.offset; + tptr->freq = tx.freq; + tptr->maxerror = tx.maxerror; + tptr->esterror = tx.esterror; + tptr->status = tx.status; + tptr->constant = tx.constant; + tptr->precision = tx.precision; + tptr->tolerance = tx.tolerance; + tptr->ppsfreq = tx.ppsfreq; + tptr->jitter = tx.jitter; + tptr->shift = tx.shift; + tptr->stabil = tx.stabil; + tptr->jitcnt = tx.jitcnt; + tptr->calcnt = tx.calcnt; + tptr->errcnt = tx.errcnt; + tptr->stbcnt = tx.stbcnt; + return(result); +} + + +2.3 Problems +============ + +Due to a naming conflict for "struct timex" the kernel's structure has +been 
renamed from "timex" to "kernel_timex" in the examples above. The +kernel's structure is an equivalent superset of NTP's timex structure. + +It is planned to keep these things separate. + +Ulrich Windl +10th January 1999 Index: linux/include/linux/ppsclock.h diff -u /dev/null linux/include/linux/ppsclock.h:1.1.4.1 --- /dev/null Mon Feb 15 22:20:03 1999 +++ linux/include/linux/ppsclock.h Mon Feb 15 22:07:03 1999 @@ -0,0 +1,34 @@ +/* This file is used by xntpd to read the exact time of an external pulse via + * ioctl CIOGETEV. + * + * Copyright (c) by Ulrich Windl and Harald Koenig + */ +#ifndef _LINUX_PPSCLOCK_H_ +#define _LINUX_PPSCLOCK_H_ + + +#if defined(__GNU_LIBRARY__) && __GNU_LIBRARY__ >= 6 /* libc6 or glibc2 */ +# warning "Compatibility with this library has not been tested a lot!" +# include /* to define CIOGETEV */ +#else +# include /* to define CIOGETEV */ +#endif +#define PPSCLOCKSTR "ppsclock" /* purpose unknown, but in xntp3-5 */ + +struct ppsclockev { + struct timeval tv; /* timestamp of event */ + u_int serial; /* event counter */ +}; + +#ifdef __KERNEL__ + +#define PPSCLOCK_MAGIC 0x5003 + +struct pps { + int magic; /* "polymorphic magic tag" PPSCLOCK_MAGIC */ + struct ppsclockev ev; /* event structure */ +}; + +#endif /* __KERNEL__ */ + +#endif /* _LINUX_PPSCLOCK_H_ */