Tips on Controlling Clock Skew
(Originally published in Electronic Design Magazine, July 1997)
Tiny differences
in propagation delay, when compounded across all the
clock nets in a complex digital product, often lead to
unacceptable degradations in overall system-timing
margins. This generic problem is often referred to as the
"clock skew" problem. Recently, our ability to
manage and control these tiny differences has been
improved by the introduction of a new generation of
multi-output, low-skew clock drivers. The latest examples
of these new parts will typically have one common clock
input and eight or more ganged outputs. The multi-output
parts are intended for use in a "spider" or
"tree" clock distribution topology, where each
output feeds only a single load (or a group of loads
located at the end of a long trace). Each trace is used
in point-to-point fashion. One working assumption of this
method is that the delays of all the clock traces will be
balanced.
Uncompensated
versions of the new, low-skew clock driver parts commonly
guarantee worst-case skew between all outputs of no more
than 500 ps. More sophisticated versions of these same
parts, with active skew correction circuitry included,
also are available. The active skew correction circuitry
can shave the worst-case output skew specification down
to as little as 50 ps. Both styles are a boon to
designers of high-speed digital circuitry, and I welcome
their development.
Regarding the
use of low-skew clock drivers, I would like to make two
points.
Point 1: The skew specification between outputs has nothing to do
with the variation in delay from input to output.
Parts with a 500-ps output skew specification often
will have a variability as large as 1000 to 2000 ps
from input to output. In applications that use only
one clock driver chip, this subtle point is of no
consequence. Applications that require multi-level
clock trees are different. In a multi-level clock
tree, we need to control the worst-case skew between
any of the leaf nodes, even if those leaf nodes are
sourced by different driver chips. Whenever the path
between two leaf nodes traverses a driver input, the
input-to-output skew specification for that driver
enters the overall skew equation. Good multi-level
trees need drivers with low input-to-output skew.
Some clock drivers use PLL technology to actually
advance their output timing to the point where it
closely matches the input timing, thus guaranteeing
good input-to-output skew performance. A pyramid of
these so-called zero-delay clock repeaters may be
cascaded to form large clock trees with very low
leaf-to-leaf skew.
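The arithmetic behind Point 1 can be sketched for a two-level tree. This is a minimal sketch, not a vendor formula: the function name and parameters are illustrative, trace delays are assumed perfectly balanced, and the input-to-output delay window is assumed to bound every output of every second-level part. The 500-ps and 1000-to-2000-ps figures are the ones quoted above.

```python
def worst_case_leaf_skew_ps(root_output_skew_ps, leaf_delay_window_ps, leaf_output_skew_ps):
    """Bound the skew between any two leaves of a two-level clock tree.

    Assumptions (illustrative, not from a data sheet): trace delays are
    balanced, and leaf_delay_window_ps (max minus min input-to-output delay)
    covers every output of every second-level driver.
    """
    # Two leaves on the same chip differ only by that chip's output skew.
    same_chip = leaf_output_skew_ps
    # Two leaves on different chips see the root driver's output skew plus
    # the part-to-part spread in input-to-output delay.
    different_chips = root_output_skew_ps + leaf_delay_window_ps
    return max(same_chip, different_chips)


for window_ps in (1000, 2000):
    bound = worst_case_leaf_skew_ps(500, window_ps, 500)
    print(f"delay window {window_ps} ps -> leaf-to-leaf bound {bound} ps")
```

Under these assumptions the bound comes out at 1500 to 2500 ps, several times the 500-ps output skew of any single chip, which is why the input-to-output specification dominates in a multi-level tree.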
Point 2: Our applications demand low skew
between clock signals as received. Our
low-skew clock drivers give us low skew between the
outputs as transmitted. To obtain the former
from the latter, we must do more than simply provide
equal length traces on all clock nets. We must use
traces of equal delay (remember that outer
layer, or microstrip, traces go a little faster than
inner layer, or stripline, traces), we must use the
same termination strategy on each trace, and we must
place the same loads at the end of each line. To the
extent that we have achieved these three objectives,
the trace delays will be properly balanced.
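As a rough illustration of why equal length is not equal delay, the sketch below finds the microstrip length whose delay matches that of a stripline trace. The 180 ps/inch stripline figure is the one this article uses for Figure 1; the 150 ps/inch microstrip figure is an assumed, illustrative value (the real number depends on the board stackup), and the function name is hypothetical. The sketch ignores termination and loading, which must also match.

```python
STRIPLINE_PS_PER_IN = 180   # inner-layer figure used for Figure 1
MICROSTRIP_PS_PER_IN = 150  # ASSUMED outer-layer value, for illustration only


def matched_length_in(ref_length_in, ref_ps_per_in, other_ps_per_in):
    """Length of a second trace whose delay equals the reference trace's delay."""
    ref_delay_ps = ref_length_in * ref_ps_per_in
    return ref_delay_ps / other_ps_per_in


stripline_len_in = 6
# Delay mismatch if both traces were simply routed to the same length.
mismatch_ps = stripline_len_in * (STRIPLINE_PS_PER_IN - MICROSTRIP_PS_PER_IN)
microstrip_len_in = matched_length_in(
    stripline_len_in, STRIPLINE_PS_PER_IN, MICROSTRIP_PS_PER_IN
)
print(f"equal-length mismatch: {mismatch_ps} ps")
print(f"microstrip length for equal delay: {microstrip_len_in:.1f} in.")
```

With the assumed microstrip value, two 6-in. traces on different layers would differ by 180 ps, a sizable fraction of a 500-ps driver skew budget.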
Figure 1
indicates how trace delay can vary with termination type
and capacitive load, even for short traces. This figure
charts the actual line delay versus line length for
various combinations of termination style and capacitive
loading. In the figure, the assumed signal risetime is
3.00 ns, the assumed trace impedance is 75 Ω, and the assumed trace
delay is 180 ps/inch (FR-4 stripline at 25°C).

Figure 1: Trace delay versus trace length, for various termination types
and values of loading.
Results for a
series-terminated line are shown in red. With zero
loading, the lowest red line shows an ideal delay of 180
ps/inch (540 ps at 3 inches). As each increment of 10 pF
is added to the line, the delay goes up by anywhere from
500 to 750 ps (this is approximately R*C, where R = 75 Ω and C is the load
capacitance). In this example the extra delay introduced
by loading is quite large compared to the driver skew
specification, but we get a nicely damped response with
almost no overshoot or ringing.
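The R*C approximation for the series-terminated case can be checked with a few lines of arithmetic. This is a rough sketch only: it adds the full R*C term to the unloaded trace delay, which corresponds to the upper end of the 500-to-750-ps range quoted above, so the curves in Figure 1 may sit somewhat lower. The constant and function names are illustrative.

```python
TRACE_PS_PER_IN = 180  # trace delay assumed for Figure 1
R_OHMS = 75            # 75-ohm trace impedance assumed for Figure 1


def estimated_delay_ps(length_in, load_pf):
    """Unloaded trace delay plus an R*C term for the capacitive load.

    Ohms times picofarads gives picoseconds directly.
    """
    return length_in * TRACE_PS_PER_IN + R_OHMS * load_pf


for load_pf in (0, 10, 20):
    print(f"3 in., {load_pf} pF: about {estimated_delay_ps(3, load_pf)} ps")
```

Each 10-pF step adds 750 ps in this estimate, already more than the 500-ps output skew of an uncompensated driver.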
Results for a
short, unterminated line are shown with blue dotted
lines. Note that these are non-linear curves. This
non-linearity renders most rule-of-thumb delay equations
quite useless. The non-linearity derives from minor
amounts of overshoot or undershoot at the end of the
unterminated line, which shift the precise time at which
the output signal crosses the clock threshold. This
non-linear effect dominates the delay performance of
short unterminated lines. (In this example we have
confined the line length to no more than 3 in., at which
point ringing is still controlled to an acceptable
degree.)
As a general
rule, shorter lines show less variation with capacitive
loading than longer lines. That is because a capacitive
load, if located near the driver, is charged directly
by the rather low source impedance of the driver. When
the load is removed from the driver, even if only by a couple of
inches, the interposing series inductance of the trace
slows the charging of the capacitor, thus producing a
somewhat slower risetime and a correspondingly longer
effective trace delay.
The ringing
effects cause unexpected results. For example, an
unterminated line with no load shows almost no variation
in line delay with length because, as the line gets
longer, the signal overshoot increases. The increasing
overshoot actually advances the clock threshold crossing
almost enough to compensate for the increase in line
length. As shown in Figure 1, the effective delay of an
unloaded 3-in. unterminated trace (with 3-ns risetime
logic) is only 100 ps, not 540 ps as you would expect
from 3 in. x 180 ps/in. Even minor amounts of overshoot
cause this effect. The overshoot at the end of the 3-in.
line in this example is only 20%.
If you are
serious about controlling clock skew, you will achieve
the best results by carefully balancing your clock
distribution tree. Pay close attention to the
specifications for input-to-output delay on the drivers.
Use the same drivers at every level of the clock
hierarchy. Balance the nominal trace delays at each
level. Use the same termination strategy on each line.
Balance the loading on each line, even if you have to add
dummy capacitors to one branch to balance out loads on
the other branches. Lastly, check your results with a
high-quality probe and high-bandwidth oscilloscope.
Tight control of
clock skew can be accomplished only with a complete
awareness of all the relevant circuit parameters. Merely
balancing the trace lengths is not enough.
All Publications by Dr. Howard Johnson except as noted.
© 1997 Signal Consulting, Inc., Dr. Howard
Johnson. All rights reserved.