Overview:
The mathematical foundation of common integrated
circuit yield models based on the assumption that the yield is dominated by
random point defects is discussed. Various mathematical models which are commonly
used to account for defect clustering are given a physical interpretation and
are compared mathematically and graphically. It is shown that the yield of
systems with circuit redundancy can be substantially affected by defect
clustering and hence that a correct understanding of defects and yield is
essential to predict the yields of wafer scale products.
Introduction:
This section introduces some basic definitions needed to understand the concepts of yield and yield modeling.
In our case, the yield of an integrated circuit is
the fraction of IC chips that meet a specific set of functional requirements
out of the total number of chips manufactured. In other words, it's the
probability that a given chip chosen randomly will be functional.
Yield is divided into two components:
1. Parametric Yield:
This quantifies deviations in device and material parameters such as:
a- Threshold voltage
b- Sheet resistance
Both of these parameters can cause circuits to fail to meet specific functional
requirements such as:
a- Speed margin
b- Noise margin
2. Random Defect Yield:
It is associated with problems like undesired short circuits and open circuits which result in incorrect logical functioning.
The mechanisms that cause defects are many and varied, including silicon material defects, sodium contamination, and particulate based pattern disruption. Every processing step can introduce or cause defects. The probability that a defect will result from a processing step, or steps, is a function of the chip area, pattern features, and the defect level of the process.
Defect Definition:
In modeling yield for redundancy purposes in a wafer scale system, we are mainly interested in random “point” defects. Large area and parametric defects are most likely to be associated with design and process problems or catastrophic contamination; these by their very nature are not random and hence can be more easily identified and corrected.
To define a random defect, we will use the following: "any pattern or material disturbance, which if properly located, would cause a circuit to fail to function within a desired set of specified limits." This distinction is the basis of the defect descriptors fatal and non-fatal; a fatal defect manifests itself by causing the circuit to fail while a non-fatal defect will not cause circuit failure [1].
Circuit Critical Area:
Yield models are expressed in terms of defect densities (number of defects per unit area) and circuit or chip area. It is important to consider not only the gross chip area but also the size and nature of the circuit patterns within that chip. This leads to the concept of circuit critical area.
The use of a critical area for yield calculations arises from two factors:
1. Integrated circuit patterns have finite dimensions.
2. Realistic point defects are not idealized mathematical points but have finite sizes on the order of the dimensions of IC features.
The combined effect of these two factors causes the number of fatal defects in a chip to be a function of pattern complexity, feature sizes, average defect density, and defect size distribution.
Therefore, the critical area is taken to be simply proportional to the actual area: Ac = qc A, where qc is the proportionality constant [3].
Yield Models:
Defects may be introduced during any of the many
processing steps that an IC undergoes. Functional chips must pass through
each and every processing step without collecting any fatal
defects. Thus, chip yield is the product of the yields of the individual
processing steps. Each processing step can be characterized by a critical
area and a defect density (number of defects per unit area). The critical
areas and defect densities are assumed to collectively represent the
constituent steps. The desired result is that the product of the critical
area (Ac) and the average defect density (D0) gives the expected number of fatal defects per
circuit. Non-fatal defects are taken into account in the computation of Ac; therefore D0 must include all defects present, not just fatal defects.
Simple Random Defect Model:
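This model is the simplest yield model and is developed by assuming that the defects occur randomly. Consider a wafer of area Awafer, populated with chips of area Achip. Assume that Nd defects occur randomly on the wafer and that all chips are equally susceptible to defects. The probability that a given defect occurs in a given chip is the ratio of Achip to Awafer. The number of defects per chip, i, is then a binomial random variable with parameters Nd and Achip/Awafer, and the probability P(i) of finding i defects in a chip is
$$P(i) = \binom{N_d}{i} \left(\frac{A_{chip}}{A_{wafer}}\right)^{i} \left(1 - \frac{A_{chip}}{A_{wafer}}\right)^{N_d - i}$$
The average defect density, D0, is
$$D_0 = \frac{N_d}{A_{wafer}}$$
After the substitution we have
$$P(i) = \binom{N_d}{i} \left(\frac{D_0 A_{chip}}{N_d}\right)^{i} \left(1 - \frac{D_0 A_{chip}}{N_d}\right)^{N_d - i}$$
For large Nd and moderate D0 Achip, the above equation for P(i) is approximated by a Poisson random variable [6, p. 102]:
$$P(i) = \frac{(D_0 A_{chip})^{i}}{i!} e^{-D_0 A_{chip}}$$
For identical chips and random defect placement, we only need to substitute the critical area Ac for Achip to count fatal defects. The yield, Y, is the probability of zero fatal defects in a chip, and is expressed as
$$Y = P(0) = e^{-D_0 A_c}$$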
This is known as the simple Poisson yield model. It is based on the
well known approximation of a binomial random variable by a Poisson random
variable.
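The binomial-to-Poisson approximation behind this model can be checked numerically. The following is a minimal Python sketch under stated assumptions: the wafer area, chip area, defect count, and critical-area fraction are hypothetical values chosen only for illustration, and the function names are not from the original text.

```python
import math

def binomial_pmf(i, n_defects, p_chip):
    """Probability that exactly i of n_defects land on one chip."""
    return math.comb(n_defects, i) * p_chip**i * (1.0 - p_chip)**(n_defects - i)

def poisson_pmf(i, mean):
    """Poisson probability of i defects when the expected count is `mean`."""
    return mean**i * math.exp(-mean) / math.factorial(i)

def poisson_yield(d0, a_crit):
    """Simple Poisson yield: probability of zero fatal defects."""
    return math.exp(-d0 * a_crit)

# Hypothetical numbers (assumptions, not taken from the source).
A_WAFER = 100.0   # wafer area, cm^2
A_CHIP = 0.5      # chip area, cm^2
N_DEFECTS = 200   # defects on the wafer
Q_C = 0.4         # critical-area fraction, so Ac = Q_C * A_CHIP

D0 = N_DEFECTS / A_WAFER  # average defect density, defects per cm^2

if __name__ == "__main__":
    for i in range(4):
        b = binomial_pmf(i, N_DEFECTS, A_CHIP / A_WAFER)
        p = poisson_pmf(i, D0 * A_CHIP)
        print(f"i={i}: binomial={b:.4f}  poisson={p:.4f}")
    print(f"simple Poisson yield = {poisson_yield(D0, Q_C * A_CHIP):.4f}")
```

With a few hundred defects per wafer the two probability columns should agree to roughly three decimal places, which is why the Poisson form is normally used directly.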
A random variable is a variable that has a definite value at each sample point. The terms come from listing the possible outcomes of an experiment: the set of all possible mutually exclusive outcomes is called a sample space, and each individual outcome is called a point of the sample space.
A more formal definition of probability is the following. If there are several equally likely, mutually exclusive, and collectively exhaustive outcomes of an experiment, the probability of an event E is
$$p = \frac{\text{number of outcomes favorable to } E}{\text{total number of outcomes}}$$
(This definition is taken from Mary L. Boas, "Mathematical Methods in the Physical Sciences", 2nd Edition, Wiley, chapter 16, page 686.)
Compound Poisson Statistics:
According to some authors, the simple Poisson yield formula is too pessimistic for single chips [4], [5], [6], [7], because defects are often not randomly distributed, but rather are clustered in certain regions. Defect clustering can cause large areas of a wafer to have fewer defects than a random distribution, such as the Poisson model, would predict, which in turn results in higher yields in those areas. This argument makes a model based on the assumption of a random distribution of defects questionable according to some authors.
To handle nonrandom defect distributions, compound Poisson statistics are used. The Poisson distribution is compounded with a function, f(D), which represents the normalized distribution of chip defect densities:
$$P(i) = \int_0^\infty \frac{(D A)^{i}}{i!} e^{-D A} f(D)\, dD$$
where A is the chip area. The function f(D) is a weighting function which accounts for the nonrandom distribution of defects.
Figure 2
Yield is the probability of zero fatal defects (i.e., Y = P(i = 0) for fatal defects). Define a partial yield, Y(i), as the conditional probability that all the defects in the die are non-fatal, given that there are i defects present. With no repair capability, and with each defect fatal with probability qc, this gives
$$Y(i) = (1 - q_c)^{i}$$
Die yield is the weighted sum of all the Y(i) conditional yields:
$$Y = \sum_{i=0}^{\infty} Y(i) P(i) = \sum_{i=0}^{\infty} (1 - q_c)^{i} \int_0^\infty \frac{(D A)^{i}}{i!} e^{-D A} f(D)\, dD$$
Interchanging the order of the summation and the integration gives a completely summed Poisson distribution inside the integral, which equals 1. With Ac = qc A, this results in Murphy’s formula [5]:
$$Y_0 = \int_0^\infty e^{-D A_c} f(D)\, dD$$
Y0 is the probability of zero fatal defects, or yield, in a chip.
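Murphy's formula can be evaluated numerically for any normalized weighting function. The sketch below is only an illustration under stated assumptions: it uses a plain trapezoidal rule, truncates the integral at a finite `d_max`, and picks an exponential density as one example of f(D); the function names and numbers are hypothetical.

```python
import math

def murphy_yield(f, a_crit, d_max, steps=20000):
    """Trapezoidal estimate of the integral of exp(-D * a_crit) * f(D) over [0, d_max]."""
    h = d_max / steps
    total = 0.0
    for k in range(steps + 1):
        d = k * h
        weight = 0.5 if k in (0, steps) else 1.0  # half weight at both endpoints
        total += weight * math.exp(-d * a_crit) * f(d)
    return total * h

def exponential_density(d0):
    """One possible normalized f(D): exponential with mean d0 (illustrative choice)."""
    return lambda d: math.exp(-d / d0) / d0

if __name__ == "__main__":
    d0, a_crit = 2.0, 0.5  # hypothetical defect density and critical area
    y = murphy_yield(exponential_density(d0), a_crit, d_max=40.0 * d0)
    print(f"numerical: {y:.5f}  closed form 1/(1 + D0*Ac): {1.0 / (1.0 + d0 * a_crit):.5f}")
```

For this particular density the integral has the closed form 1/(1 + D0 Ac), which gives a simple check on the numerical result.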
Distribution Functions, f(D):
When we assume that defects are randomly and uniformly distributed over the wafer, the wafer defect statistics can be characterized by a single parameter, D0, which is the average defect density. In that case f(D) is a delta function centered at D = D0, which results in a simple Poisson distribution and a yield given by:
$$Y = e^{-D_0 A_c}$$
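Integrating a Gaussian distribution in Murphy's formula is usually difficult. To carry out the integration more easily, an approximation to a Gaussian distribution can be used. Murphy proposed using a symmetrical triangle weighting function for f(D) [5]. Let Λ be the triangle function:
$$f(D) = \Lambda(D) = \begin{cases} D / D_0^{2}, & 0 \le D \le D_0 \\ (2 D_0 - D) / D_0^{2}, & D_0 < D \le 2 D_0 \\ 0, & \text{otherwise} \end{cases}$$
Substituting this into Murphy's formula gives
$$Y = \left(\frac{1 - e^{-D_0 A_c}}{D_0 A_c}\right)^{2}$$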
The exponential function is another distribution function used to express the defect distribution:
$$f(D) = \frac{1}{D_0} e^{-D / D_0}$$
The decaying form of the exponential function implies that higher defect densities are increasingly unlikely. Physically this means that high defect densities are restricted to small regions of a wafer. The exponential function can be used to represent severe clustering in small regions of a wafer. The resulting yield formula for the exponential function is
$$Y = \frac{1}{1 + D_0 A_c}$$
Seeds proposed this yield equation to model individual processing steps. The same result was derived by Price, who considered the total number of ways indistinguishable defects could be distributed among chips. When the number of chips is large, the probability of zero defects, P(i = 0), takes the form of the above equation [8, p. 779], which is why it is often referred to as Price’s yield formula.
As the product D0 Ac is increased, this yield decays less rapidly than the simple Poisson model does. If clustering is not significant, the Price model will result in unrealistically optimistic yield predictions.
The uniform (rectangular) function is constant between zero and 2D0, and zero elsewhere. Physically, this implies that chip defect densities are evenly distributed up to 2D0, but none have a higher value. Approximating a Gaussian with this function, we have:
$$f(D) = \begin{cases} 1 / (2 D_0), & 0 \le D \le 2 D_0 \\ 0, & \text{otherwise} \end{cases}$$
Substituting f(D) into Murphy’s formula will result in:
$$Y = \frac{1 - e^{-2 D_0 A_c}}{2 D_0 A_c}$$
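The gamma function is the most used weighting function. The form of f(D) is
$$f(D) = \frac{1}{\Gamma(\alpha)\, \beta^{\alpha}} D^{\alpha - 1} e^{-D / \beta}$$
where Γ(α) is the gamma function [9], and α and β are given by
$$\alpha = \frac{D_0^{2}}{\mathrm{var}(D)}, \qquad \beta = \frac{\mathrm{var}(D)}{D_0}$$
Substituting f(D) into the compound Poisson expression gives [8, p. 786],
$$P(i) = \frac{\Gamma(i + \alpha)}{i!\, \Gamma(\alpha)} \frac{(A \beta)^{i}}{(1 + A \beta)^{i + \alpha}}$$
This is the negative binomial distribution [1]. Using αβ = D0, the above equation may be written as
$$P(i) = \frac{\Gamma(i + \alpha)}{i!\, \Gamma(\alpha)} \frac{(A D_0 / \alpha)^{i}}{(1 + A D_0 / \alpha)^{i + \alpha}}$$
The case of i = 0 and A = AC gives the yield:
$$Y = \left(1 + \frac{D_0 A_C}{\alpha}\right)^{-\alpha}$$
The parameter α can be used to account for defect clustering. By varying α the model covers the entire range of yield predictions. The parameter has physical relevance in that it is related to the variance of D. The larger the variance (more clustering), the smaller the value of α.
If α = 1 the yield reduces to the Price formula (exponential weighting); for α = ∞ it becomes the Poisson formula (no clustering). The value of α must be experimentally determined. Smaller values of α reflect higher yield and occur as a process becomes more mature.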
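The weighting functions discussed in this section each lead to a closed-form yield, and it can help to tabulate them side by side. The sketch below uses the standard closed forms for the delta, triangle, uniform, exponential, and gamma weightings as functions of x = D0 Ac; the function names and the sample values of x and α are hypothetical choices for illustration.

```python
import math

def poisson_yield(x):
    """Delta-function weighting (no clustering)."""
    return math.exp(-x)

def triangle_yield(x):
    """Murphy's symmetrical triangle weighting."""
    return ((1.0 - math.exp(-x)) / x) ** 2

def uniform_yield(x):
    """Weighting that is constant between 0 and 2*D0."""
    return (1.0 - math.exp(-2.0 * x)) / (2.0 * x)

def seeds_price_yield(x):
    """Exponential weighting (Seeds / Price formula)."""
    return 1.0 / (1.0 + x)

def negative_binomial_yield(x, alpha):
    """Gamma weighting; alpha is the clustering parameter."""
    return (1.0 + x / alpha) ** (-alpha)

if __name__ == "__main__":
    print("  x   Poisson  triangle  uniform  NB(a=2)  Seeds/Price")
    for x in (0.5, 1.0, 2.0, 4.0):  # x = D0 * Ac, must be > 0
        print(f"{x:4.1f}  {poisson_yield(x):7.4f}  {triangle_yield(x):8.4f}  "
              f"{uniform_yield(x):7.4f}  {negative_binomial_yield(x, 2.0):7.4f}  "
              f"{seeds_price_yield(x):11.4f}")
```

Setting `alpha` to 1 reproduces the exponential (Price) result, and a very large `alpha` approaches the Poisson result, matching the limiting cases described above.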
Yield With Repair Capabilities:
Yield Equations
Here is an example. Consider a die consisting of two area components: a core area and a support area. We assume that defects are randomly distributed within the die, so the probability of an individual defect being in the core area, Pc, or in the support area, Ps, is simply the appropriate area fraction. Therefore, Pc is the core area divided by the total die area, and Ps is the support area divided by the total die area.
Assume further that within the core region, fatal defects can be independently repaired with a fixed probability. The repair may involve substituting redundant circuitry for defective circuitry, or repairing pattern defects. In the support area no repair is possible.
Support Area Yield
The support area yield is the probability that there are no fatal defects in the support area. We stated above that the distribution of defects between the core area and support area of a chip is random. Therefore, the method used to compute the probability mass distribution of fatal defects within a die can also be used to compute the probability mass distribution for defects in the support area. Fatal defects in the support area can be accounted for simply by substituting the critical area of the support region into the yield formulas derived earlier. Thus, the support yield is given by Murphy's formula evaluated with the support critical area. The exact form depends on the function used for f(D).
Core Area Yield
The core area yield must include the possibility of defects being repaired. We define a repair probability as the probability that the core area is made functional by repair, given that there are k fatal defects in the core area. The number of fatal defects in the core, k, given that there are i total defects in the die, is a binomial random variable. Then, if there are i defects in the die, the core yield is the sum over k of the binomial probability of k fatal core defects multiplied by the repair probability for k defects.
Assuming that every defect is repaired with an independent, constant probability of success, the repair probability for k defects is taken to be that per-defect probability raised to the power k.
If the repair efficiency is zero (for nonzero defect counts), the total expression for the core yield reduces to the same form as the support yield calculation. This is expected since, without repairs in the core, the two regions are identical from a yield perspective.
Die Yield
The die yield is simply the product of the support and core area yields
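The repair model can be sketched numerically for the simplest case. The code below is an illustration under explicit assumptions, not the source's exact equations: the defect count per die is Poisson with mean A D0 (no clustering), the same critical-area fraction `q_c` applies in the core and the support area, and each fatal core defect is repaired independently with probability `repair_prob`. The names `q_c` and `repair_prob` and their values are hypothetical.

```python
import math

def support_yield(mean_defects, p_s, q_c):
    """Poisson yield of the non-repairable support area."""
    return math.exp(-mean_defects * p_s * q_c)

def core_yield(mean_defects, p_c, q_c, repair_prob):
    """Poisson yield of the core when each fatal defect is repaired with repair_prob."""
    return math.exp(-mean_defects * p_c * q_c * (1.0 - repair_prob))

def die_yield_by_sum(mean_defects, p_c, q_c, repair_prob, max_defects=200):
    """Same die yield, computed by summing over the total defect count i."""
    p_s = 1.0 - p_c
    # Probability that a single defect kills the die (fatal and not repaired).
    p_kill = q_c * (p_s + p_c * (1.0 - repair_prob))
    total = 0.0
    pmf = math.exp(-mean_defects)          # Poisson P(i = 0)
    for i in range(max_defects + 1):
        total += pmf * (1.0 - p_kill) ** i
        pmf *= mean_defects / (i + 1)      # recurrence to P(i + 1)
    return total

if __name__ == "__main__":
    mean_defects, p_c = 5.0, 0.5           # A*D0 = 5.0 and Pc = 0.50
    q_c, repair_prob = 0.5, 0.9            # hypothetical values
    ys = support_yield(mean_defects, 1.0 - p_c, q_c)
    yc = core_yield(mean_defects, p_c, q_c, repair_prob)
    print(f"support={ys:.4f}  core={yc:.4f}  die={ys * yc:.4f}")
    print(f"die yield by direct sum: {die_yield_by_sum(mean_defects, p_c, q_c, repair_prob):.4f}")
```

Under these assumptions the product of the support and core yields agrees with the direct sum over defect counts, and setting `repair_prob` to zero makes the core yield take the same form as the support yield.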
Effects of Repair on Yield:
Two effects lower the yield as the number of defects grows. First, an individual defect in the support area will cause a failure with probability qC, so the more defects that are present (larger i), the higher the probability of a failure. Second, defects in the core area cannot be repaired with 100% efficiency, so the more that are present, the higher the probability that at least one will not be repaired.
A larger Pc value implies that defects are more likely to be in the core region, where they have a high probability of being accommodated by repair. Here Pc is the fraction of the die area occupied by the core and Ps is the fraction occupied by the support area.
Let Pc = 0.50 and AD0 = 5.0. Then the defect probability density function Pc(i) is broadened and the corresponding yield values Yc(i) do not match Pc(i). This reflects the effect of the non-repairable support area and the finite repair efficiency in the core. Repair can be quite effective in increasing yield, but only if most of the chip critical area has repair capability.
Clustering Effects On Wafer Scale Yields:
Intrawafer Defect Density Variation
For single chip yields, defect clustering increases yield. However, in a wafer scale system partitioned into macrocells, the system yield is a function of interacting macrocell yields rather than independent macrocell yields. In this case clustering can potentially result in decreased system yields [1].
Consider a system with 2-fold block redundancy [10]. This means that if either (or both) of the two blocks is functional, then the pair is functional from a system yield perspective. Choose a system of NB block pairs and let each individual block have critical area AC and average defect density D0. We compare system yields for systems with different numbers of block pairs using a random defect distribution and a pseudo-exponential defect distribution.
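The system yield is taken as the product of the block pair yields. A block pair in a region with local defect density Di fails only if both of its blocks fail. The expression for system yield using the pseudo-exponential distribution with three area components is
$$Y_{sys} = \prod_{i=1}^{M_{reg}} \left[1 - \left(1 - e^{-D_i A_C}\right)^{2}\right]^{N_{Bi}}$$
where Mreg denotes the number of regions into which a wafer is divided to achieve approximately uniform defect distributions in each (here Mreg = 3), and Di is the local defect density of the ith region. The values of the NBi are determined by NB and the distribution area fractions such that
$$N_{Bi} = f_i N_B$$
where fi is the areal fraction of the ith region.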
Clustering decreases system yield with the
deviation increasing with system size as measured by the value of NB.
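The effect of intrawafer clustering on a 2-fold block redundant system can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: each region has a locally random (Poisson) defect distribution, the number of block pairs in a region is its areal fraction times NB (allowed to be fractional), and the three-region areal fractions and local densities below are hypothetical values chosen so that the average density matches the random case.

```python
import math

def pair_yield(density, a_crit):
    """A 2-fold redundant pair works unless both of its blocks fail."""
    block = math.exp(-density * a_crit)
    return 1.0 - (1.0 - block) ** 2

def system_yield(n_pairs, a_crit, fractions, densities):
    """Product of pair yields, with fractions[i] * n_pairs pairs at densities[i]."""
    log_yield = sum(f * n_pairs * math.log(pair_yield(d, a_crit))
                    for f, d in zip(fractions, densities))
    return math.exp(log_yield)

# Hypothetical three-region distribution (assumption, not from the source).
FRACTIONS = (0.6, 0.3, 0.1)
DENSITIES = (0.25, 1.5, 4.0)
D0 = sum(f * d for f, d in zip(FRACTIONS, DENSITIES))  # average density, about 1.0
A_CRIT = 0.1                                           # block critical area

if __name__ == "__main__":
    for n_pairs in (10, 50, 100, 200):
        y_random = system_yield(n_pairs, A_CRIT, (1.0,), (D0,))
        y_clustered = system_yield(n_pairs, A_CRIT, FRACTIONS, DENSITIES)
        print(f"NB={n_pairs:4d}  random={y_random:.4f}  clustered={y_clustered:.4f}")
```

For these illustrative numbers the clustered yield falls below the random yield, and the gap widens as the number of block pairs grows.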
Triple modular redundancy (TMR) also suffers yield degradation due to intrawafer clustering. This is demonstrated by modeling a system as consisting of an intrinsic area, Ain, which is partitioned into small areas that are triplicated to implement TMR. Each partition is assumed to have a variable logic circuitry area, Alogic, and a constant voter circuitry area, Avoter. The triple yield, YT, is the probability that at least two of the three partitions are functional. YT is given by
$$Y_T = Y_P^{3} + 3 Y_P^{2} (1 - Y_P) = 3 Y_P^{2} - 2 Y_P^{3}$$
where YP is the yield of each logic-voter partition.
The system yield, Ysys, is given by the product of the triple yields
$$Y_{sys} = \prod_{i=1}^{M_{reg}} Y_{T,i}^{N_{Ti}}$$
where YT,i is the triple yield in the ith region. The values of the individual NTi are determined by the areal fractions assumed in the defect distribution such that
$$N_{Ti} = f_i N_T$$
where fi is the areal fraction of the ith region and the total number of triples, NT, is given by
$$N_T = \frac{A_{in}}{A_{logic}}$$
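The TMR case can be sketched in the same way. The code below is illustrative only and rests on several assumptions: the three partitions of a triple fail independently, each partition yield is a Poisson yield whose critical area is taken as Alogic + Avoter, the number of triples is Ain / Alogic, and the region fractions, local densities, and areas are hypothetical.

```python
import math

def triple_yield(y_p):
    """Probability that at least two of three independent partitions work."""
    return 3.0 * y_p**2 - 2.0 * y_p**3

def tmr_system_yield(a_in, a_logic, a_voter, fractions, densities):
    """Product of triple yields over regions with locally random defects."""
    n_triples = a_in / a_logic                     # assumed number of triples
    log_yield = 0.0
    for f, d in zip(fractions, densities):
        y_p = math.exp(-d * (a_logic + a_voter))   # Poisson partition yield (assumption)
        log_yield += f * n_triples * math.log(triple_yield(y_p))
    return math.exp(log_yield)

# Hypothetical values (assumptions, not from the source).
FRACTIONS = (0.6, 0.3, 0.1)
DENSITIES = (0.25, 1.5, 4.0)   # average density is about 1.0
A_IN, A_LOGIC, A_VOTER = 5.0, 0.05, 0.01

if __name__ == "__main__":
    d0 = sum(f * d for f, d in zip(FRACTIONS, DENSITIES))
    y_random = tmr_system_yield(A_IN, A_LOGIC, A_VOTER, (1.0,), (d0,))
    y_clustered = tmr_system_yield(A_IN, A_LOGIC, A_VOTER, FRACTIONS, DENSITIES)
    print(f"random={y_random:.4f}  clustered={y_clustered:.4f}")
```

Shrinking `A_LOGIC` increases the number of triples while the voter area per partition stays fixed, so the sketch can also be used to explore the trade-off in partition size.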
Interwafer Defect Density Variation
Another yield consideration is interwafer (wafer to wafer) defect density variation. This is conceptually similar to intrawafer variation. For single chip yield, Murphy’s formula can simply be applied twice: once with the intrawafer weighting function f(D), and once with a second weighting function that reflects the distribution of average defect densities among the wafers considered. Both functions could actually have temporal as well as spatial variations, but that is not considered here. Strictly, the interwafer function would be a discrete function (a sum of delta functions perhaps), but an appropriate continuous approximation may be used.
The Combination of Interwafer and Intrawafer Variation:
For a single chip yield, interwafer and intrawafer variation can be lumped together. The chips can be viewed as part of one super wafer which is characterized by an appropriate f(D). A processing line can then be characterized using the versatile gamma distribution function with an appropriate value of α.
In wafer scale products with redundancy, the product yield is more complicated. In general, the yield of a system on a single wafer must be averaged over the interwafer distribution of defect densities, and the exact form of the single-wafer system yield depends on the type of redundancy used.
In systems using 2-fold block redundancy and triple modular redundancy, intrawafer clustering can significantly reduce wafer yields, because the wafer yield is a product of macrocell yields, so any low yielding macrocell reduces the entire yield. With single chips, wafer yield is a spatial average over many small independent chips, and is thus more robust.
The Total Yield of a Batch of Wafer Scale Systems:
The total yield of a batch of wafer scale systems is again an average over independent systems, so interwafer variation for a given average defect density should increase yield. Consider the 2-fold block redundant system discussed earlier with the pseudo-exponential intrawafer distribution. Also consider interwafer variation as represented by a pseudo-triangle distribution (weighted delta functions with areal fractions < 0.2, 0.6, 0.2 > and local defect densities < 2.0, 5.0, 8.0 >). Both of these distributions have the same average defect density; for the pseudo-triangle it is 0.2(2.0) + 0.6(5.0) + 0.2(8.0) = 5.0.
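The averaging over wafers can be illustrated with the pseudo-triangle fractions and densities quoted above. This sketch makes simplifying assumptions that are not in the source: each wafer is treated as having a locally random (Poisson) defect distribution at its own density, with no intrawafer clustering, and the number of block pairs and the block critical area are hypothetical.

```python
import math

def system_yield(density, n_pairs, a_crit):
    """2-fold block redundant system on one wafer with random defects."""
    block = math.exp(-density * a_crit)
    return (1.0 - (1.0 - block) ** 2) ** n_pairs

def batch_yield(fractions, densities, n_pairs, a_crit):
    """Average of single-wafer system yields over the interwafer distribution."""
    return sum(f * system_yield(d, n_pairs, a_crit)
               for f, d in zip(fractions, densities))

WAFER_FRACTIONS = (0.2, 0.6, 0.2)   # pseudo-triangle weights from the text
WAFER_DENSITIES = (2.0, 5.0, 8.0)   # pseudo-triangle densities from the text
N_PAIRS, A_CRIT = 50, 0.05          # hypothetical system size and critical area

if __name__ == "__main__":
    mean_density = sum(f * d for f, d in zip(WAFER_FRACTIONS, WAFER_DENSITIES))
    y_uniform = system_yield(mean_density, N_PAIRS, A_CRIT)
    y_batch = batch_yield(WAFER_FRACTIONS, WAFER_DENSITIES, N_PAIRS, A_CRIT)
    print(f"mean density = {mean_density:.2f}")
    print(f"all wafers at the mean: {y_uniform:.4f}  with interwafer variation: {y_batch:.4f}")
```

For these illustrative numbers the batch average is higher than the yield obtained when every wafer sits at the mean density, because the low-density wafers contribute disproportionately to the average.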
Summary:
Yield prediction is a complex issue involving design and process
features, equipment characteristics, and human learning. An understanding of yield issues is
essential to understand the usefulness of repair capabilities and the
feasibility of projects like wafer scale integration.
Defect clustering has a significant effect on yield predictions. Defect clustering in single chips was treated in terms of weighting functions in Murphy’s formula. The most physically intuitive function uses weighted delta functions to physically partition a wafer into regions having locally random distributions. The most versatile function is the gamma function, which can approximate a wide variety of clustering modes with an adjustable parameter α, which is related to the variance of the defect density distribution.
A simple model was used to explore the effect of repair within a
chip. Repair is quite effective in
increasing yield, if the majority of a chip’s critical area is protected by the
repair capability.
Yield degradation in wafer scale systems due to intrawafer defect
clustering is a result of the multiplicative nature of the system yield. Degradation in systems using 2-fold
block redundancy and triple modular redundancy (TMR) is potentially
significant.
In systems using block redundancy or TMR, interwafer defect clustering increases yield numbers when yields are compared on the basis of an ensemble average defect density.
The conflicting effects of defect clustering within wafers and between wafers on which systems containing redundancy are fabricated highlight the importance of having a full understanding of what defect densities and yield statistics actually mean.