Yield Modeling with Redundancy

Timothy L. Michalka, Ramesh C. Varshney, and James D. Meindl

 

Overview:

The mathematical foundation of common integrated circuit yield models based on the assumption that the yield is dominated by random point defects is discussed. Various mathematical models which are commonly used to account for defect clustering are given a physical interpretation and are compared mathematically and graphically. It is shown that the yield of systems with circuit redundancy can be substantially affected by defect clustering and hence that a correct understanding of defects and yield is essential to predict the yields of wafer scale products.
 

Introduction:

We begin with some basic definitions needed to understand the concepts of yield and yield modeling.
 


In our case, the yield of an integrated circuit is the fraction of IC chips that meet a specific set of functional requirements out of the total number of chips manufactured. In other words, it's the probability that a given chip chosen randomly will be functional.

    Yield is divided into two components:

        1.    Parametric Yield:

              This quantifies deviations in device and material parameters such as:
                    a- Threshold voltage
                    b- Sheet resistance
              Both of these parameters can cause circuits to fail to meet specific functional requirements such as:
                    a- Speed margin
                    b- Noise margin

        2.    Random Defect Yield:

              This is associated with problems such as undesired short circuits and open circuits, which result in incorrect logical functioning.

The mechanisms that cause defects are many and varied, including silicon material defects, sodium contamination, and particulate based pattern disruption. Every processing step can introduce or cause defects. The probability that a defect will result from a processing step, or steps, is a function of the chip area, pattern features, and the defect level of the process.

 

Defect Definition:
 

In modeling yield for redundancy purposes in a wafer scale system, we are mainly interested in random “point” defects. Large area and parametric defects are most likely to be associated with design and process problems or catastrophic contamination; by their very nature these are not random and hence can be more easily identified and corrected.

To define a random defect, we will use the following: "any pattern or material disturbance, which if properly located, would cause a circuit to fail to function within a desired set of specified limits."  This distinction is the basis of the defect descriptors fatal and non-fatal; a fatal defect manifests itself by causing the circuit to fail while a non-fatal defect will not cause circuit failure [1].

Circuit Critical Area:

Yield models are expressed in terms of defect densities (number of defects per unit area) and circuit or chip area. What matters is not only the gross chip area but also the size and nature of the circuit patterns within that chip. This leads to the concept of circuit critical area.

The use of a critical area for yield calculations arises from two factors:

              1.    Integrated circuit patterns with finite dimensions.

              2.    Realistic point defects are not idealized mathematical points but have finite sizes on the order of the dimensions of IC features.

The combined effect of these two factors causes the number of fatal defects in a chip to be a function of pattern complexity, feature sizes, average defect density, and defect size distribution.

        Figure 1

                                   
        Therefore, critical area is simply proportional to actual area, $A_c = q_c A$, where qc is the proportionality constant [3].
 

Yield Models:

Defects may be introduced during any of the many processing steps that an IC undergoes. Functional chips must pass through every processing step without collecting any fatal defects. Thus, chip yield is the product of the yields of the individual processing steps. Each processing step can be characterized by a critical area and a defect density (number of defects per unit area). The critical areas and defect densities are assumed to collectively represent the constituent steps. The desired result is that the product of the critical area (Ac) and the average defect density (D0) gives the expected number of fatal defects per circuit. Non-fatal defects are taken into account in the computation of Ac; therefore D0 must include all defects present, not just fatal defects.
 

Simple Random Defect Model:

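This model is the simplest yield model and is developed by assuming that the defects occur randomly. Consider a wafer of area Awafer, populated with chips of area Achip. Assume that Nd defects occur randomly on the wafer and that all chips are equally susceptible to defects. The probability that a given defect occurs in a given chip is the ratio of Achip to Awafer. The number of defects per chip is then a binomial random variable with parameters Nd and Achip/Awafer, so the probability P(i) that a chip contains exactly i defects is

$$P(i) = \binom{N_d}{i} \left(\frac{A_{chip}}{A_{wafer}}\right)^{i} \left(1 - \frac{A_{chip}}{A_{wafer}}\right)^{N_d - i}$$

The average defect density, D0, is

$$D_0 = \frac{N_d}{A_{wafer}}$$

After the substitution we have

$$P(i) = \binom{N_d}{i} \left(\frac{D_0 A_{chip}}{N_d}\right)^{i} \left(1 - \frac{D_0 A_{chip}}{N_d}\right)^{N_d - i}$$

For large Nd and moderate D0 Achip, the above equation for P(i) is approximated by a Poisson random variable [6, p. 102]:

$$P(i) \approx \frac{(D_0 A_{chip})^{i}}{i!}\, e^{-D_0 A_{chip}}$$

For identical chips and random defect placement, we need only substitute the chip critical area Ac for Achip. The yield, Y, is the probability of zero fatal defects in a chip, and is expressed as

$$Y = P(i = 0) = e^{-D_0 A_c}$$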

This is known as the simple Poisson yield model.  It is based on the well known approximation of a binomial random variable by a Poisson random variable.
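The binomial-to-Poisson approximation can be checked numerically. The following is a minimal sketch, not part of the original paper: the wafer area, chip area, and defect count are hypothetical values chosen only for illustration, and the function names are assumptions.

```python
import math

def binomial_pmf(i, n_defects, area_ratio):
    """Probability that a chip collects exactly i of n_defects random defects."""
    return (math.comb(n_defects, i)
            * area_ratio**i
            * (1.0 - area_ratio)**(n_defects - i))

def poisson_pmf(i, mean):
    """Poisson probability of exactly i defects for a given mean count."""
    return math.exp(-mean) * mean**i / math.factorial(i)

def poisson_yield(d0, a_crit):
    """Simple Poisson yield: probability of zero fatal defects."""
    return math.exp(-d0 * a_crit)

# Hypothetical example values (not from the source).
A_WAFER = 100.0      # wafer area, cm^2
A_CHIP = 1.0         # chip area, cm^2
N_D = 200            # number of defects on the wafer
D0 = N_D / A_WAFER   # average defect density, defects per cm^2

if __name__ == "__main__":
    for i in range(5):
        b = binomial_pmf(i, N_D, A_CHIP / A_WAFER)
        p = poisson_pmf(i, D0 * A_CHIP)
        print(f"i={i}  binomial={b:.5f}  poisson={p:.5f}")
    print("Poisson yield for Ac = 0.5 cm^2:", poisson_yield(D0, 0.5))
```

With a few hundred defects per wafer the two distributions already agree closely, which is why the Poisson form is normally used directly.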
A random variable is a variable that has a definite value for each sample point. To see what this means, it is frequently convenient to make a list of the possible outcomes of an experiment. The set of all possible mutually exclusive outcomes is called a sample space; each individual outcome is called a point of the sample space.

Mathematically speaking, a more formal definition of probability is:

If there are several equally likely, mutually exclusive, and collectively exhaustive outcomes of an experiment, the probability of an event E is

$$p = \frac{\text{number of outcomes favorable to } E}{\text{total number of outcomes}}$$

(This definition is taken from Mary L. Boas, "Mathematical Methods in the Physical Sciences", 2nd Edition, Wiley, chapter 16, page 686.)
 

Compound Poisson Statistics:

According to some authors, the simple Poisson yield formula is too pessimistic for single chips [4], [5], [6], [7], because defects are often not randomly distributed but are clustered in certain regions. Defect clustering can leave large areas of a wafer with fewer defects than a random distribution, such as the Poisson model, would predict, which in turn results in higher yields in those areas. This makes a model based on the assumption of randomly distributed defects questionable.

To handle nonrandom defect distributions, compound Poisson statistics are used. The Poisson distribution is compounded with a function, f(D), which represents the normalized distribution of chip defect densities:

$$P(i) = \int_0^\infty \frac{e^{-DA} (DA)^{i}}{i!}\, f(D)\, dD$$

The  function  f (D) is a weighting function which accounts for the nonrandom distribution of defects.
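When f(D) is a sum of weighted delta functions, the compound Poisson integral collapses to a weighted sum of ordinary Poisson terms. The sketch below illustrates this under that assumption; the three weights and local defect densities are illustrative values, and the function names are not from the source.

```python
import math

def poisson_pmf(i, mean):
    """Poisson probability of exactly i defects for a given mean count."""
    return math.exp(-mean) * mean**i / math.factorial(i)

def compound_poisson_pmf(i, area, weights, densities):
    """P(i) when f(D) is a sum of weighted delta functions.

    weights[j] is the fraction of chips whose local defect density is
    densities[j]; the weights are assumed to sum to 1.
    """
    return sum(w * poisson_pmf(i, d * area)
               for w, d in zip(weights, densities))

# Illustrative three-level defect density distribution (assumed values).
WEIGHTS = (0.2, 0.6, 0.2)
DENSITIES = (2.0, 5.0, 8.0)
AREA = 1.0
MEAN_DENSITY = sum(w * d for w, d in zip(WEIGHTS, DENSITIES))

if __name__ == "__main__":
    clustered_p0 = compound_poisson_pmf(0, AREA, WEIGHTS, DENSITIES)
    random_p0 = poisson_pmf(0, MEAN_DENSITY * AREA)
    print("P(0) with clustering:", clustered_p0)
    print("P(0) simple Poisson :", random_p0)
```

For the same average defect density, the clustered distribution gives a larger probability of zero defects, which is the effect the text describes for single chips.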
 
            Figure 2                       

 

Fatal Defects and Murphy’s Formula

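Yield is the probability of zero fatal defects (i.e., Y = P(i = 0)). Define a partial yield, Y(i), as the conditional probability that the defects in the die are non-fatal, given that there are i defects present. Since each defect is fatal with probability qc = Ac/A, with no repair capability this gives

$$Y(i) = (1 - q_c)^{i}$$

Die yield is the weighted sum of all the Y(i) conditional yields

$$Y_0 = \sum_{i=0}^{\infty} Y(i)\, P(i) = \sum_{i=0}^{\infty} (1 - q_c)^{i} \int_0^\infty \frac{e^{-DA} (DA)^{i}}{i!}\, f(D)\, dD$$

Interchanging the order of the summation and the integration, and factoring out $e^{-q_c D A}$, leaves a completely summed Poisson distribution with mean $(1 - q_c) D A$ inside the integral, which equals 1:

$$Y_0 = \int_0^\infty e^{-q_c D A} \left[ \sum_{i=0}^{\infty} \frac{e^{-(1 - q_c) D A} \left( (1 - q_c) D A \right)^{i}}{i!} \right] f(D)\, dD$$

This results in Murphy’s formula [5]:

$$Y_0 = \int_0^\infty e^{-D A_c}\, f(D)\, dD \qquad \text{(Murphy’s formula)}$$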

 

Y0  is the probability of zero fatal defects or yield in a chip.
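The summation step behind Murphy's formula can be verified numerically for a single fixed defect density. This is a minimal sketch assuming each defect is independently fatal with probability q_c and that the defect count is Poisson distributed; the function names and the numbers used are illustrative only.

```python
import math

def yield_by_summation(mean_defects, q_c, terms=200):
    """Sum Y(i) * P(i) with Y(i) = (1 - q_c)**i and Poisson P(i)."""
    total = 0.0
    poisson_term = math.exp(-mean_defects)        # P(0)
    for i in range(terms):
        total += (1.0 - q_c)**i * poisson_term
        poisson_term *= mean_defects / (i + 1)    # P(i + 1) from P(i)
    return total

def yield_closed_form(mean_defects, q_c):
    """exp(-q_c * D * A): the integrand of Murphy's formula at fixed D."""
    return math.exp(-q_c * mean_defects)

if __name__ == "__main__":
    mean_defects = 5.0   # D * A, hypothetical
    q_c = 0.3            # fraction of defects that are fatal, hypothetical
    print(yield_by_summation(mean_defects, q_c))
    print(yield_closed_form(mean_defects, q_c))
```

The two values agree to floating-point precision, confirming that only the critical area Ac = qc A enters the final yield expression.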

 

 

 

Distribution Functions,  f(D):

 

 

 

·        Delta Function

When we assume that defects are randomly and uniformly distributed over the wafer, the wafer defect statistics can be characterized by a single parameter, D0, which is the average defect density. In that case f(D) is a delta function centered at D = D0, which results in a simple Poisson distribution and a yield given by:

$$f(D) = \delta(D - D_0)$$

$$Y_0 = \int_0^\infty e^{-D A_c}\, \delta(D - D_0)\, dD = e^{-D_0 A_c}$$

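·        Triangle Function

Integrating Murphy’s formula with a Gaussian distribution for f(D) is usually difficult. To carry out the integration more easily, an approximation to a Gaussian distribution can be used. Murphy proposed using a symmetrical triangle weighting function for f(D) [5]. Let Λ be the triangle function:

$$f(D) = \Lambda(D) = \begin{cases} D / D_0^{2}, & 0 \le D \le D_0 \\ (2 D_0 - D) / D_0^{2}, & D_0 < D \le 2 D_0 \\ 0, & \text{otherwise} \end{cases}$$

Substituting this into Murphy's formula gives

$$Y_0 = \int_0^{D_0} \frac{D}{D_0^{2}}\, e^{-D A_c}\, dD + \int_{D_0}^{2 D_0} \frac{2 D_0 - D}{D_0^{2}}\, e^{-D A_c}\, dD = \left( \frac{1 - e^{-D_0 A_c}}{D_0 A_c} \right)^{2}$$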

 

·        Exponential Function  

This is another weighting function used to describe the defect density distribution. The decaying form of the exponential function implies that higher defect densities are increasingly unlikely. Physically this means that high defect densities are restricted to small regions of a wafer. The exponential function can be used to represent severe clustering in small regions of a wafer. The weighting function and the resulting yield formula are:

$$f(D) = \frac{1}{D_0}\, e^{-D / D_0}$$

$$Y_0 = \int_0^\infty e^{-D A_c}\, \frac{1}{D_0}\, e^{-D / D_0}\, dD = \frac{1}{1 + D_0 A_c}$$

Seeds proposed this yield equation to model individual processing steps. The same result was derived by Price, who considered the total number of ways indistinguishable defects could be distributed among chips. When the number of chips is large, the probability of zero defects, P(i = 0), takes the form of the above equation [8, p. 779], resulting in what is often referred to as Price’s yield formula:

$$Y = \frac{1}{1 + D_0 A_c}$$

As the product D0 Ac increases, this yield decays less rapidly than the simple Poisson model does. If clustering is not significant, the Price model will give unrealistically optimistic yield predictions.

 

·        Rectangle Function

This function is constant between zero and 2D0, and zero elsewhere. Physically, this implies that chip defect densities are evenly distributed up to 2D0, but none have a higher value. Approximating a Gaussian with this function, we have:

$$f(D) = \begin{cases} 1 / (2 D_0), & 0 \le D \le 2 D_0 \\ 0, & \text{otherwise} \end{cases}$$

Substituting f(D) into Murphy’s formula results in:

$$Y_0 = \frac{1 - e^{-2 D_0 A_c}}{2 D_0 A_c}$$


·        Gamma Function

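It is the most widely used weighting function. The form of f(D) is

$$f(D) = \frac{1}{\Gamma(\alpha)\, \beta^{\alpha}}\, D^{\alpha - 1}\, e^{-D / \beta}$$

where Γ(α) is the gamma function [9], and α and β are given by

$$\alpha = \frac{D_0^{2}}{\mathrm{var}(D)}, \qquad \beta = \frac{\mathrm{var}(D)}{D_0}$$

Substituting f(D) into the compound Poisson formula gives [8, p. 786],

$$P(i) = \frac{\Gamma(i + \alpha)}{i!\, \Gamma(\alpha)}\, \frac{(A \beta)^{i}}{(1 + A \beta)^{i + \alpha}}$$

This is the negative binomial distribution [1]. Using αβ = D0, the above equation, P(i, α, β, D0, A), may be written as

$$P(i) = \frac{\Gamma(i + \alpha)}{i!\, \Gamma(\alpha)}\, \frac{(A D_0 / \alpha)^{i}}{(1 + A D_0 / \alpha)^{i + \alpha}}$$

The case of i = 0 and A = Ac gives the yield:

$$Y_0 = \left( 1 + \frac{D_0 A_c}{\alpha} \right)^{-\alpha}$$

The parameter α can be used to account for defect clustering. By varying α the model covers the entire range of yield predictions. The parameter has physical relevance in that it is related to the variance of D: the larger the variance (more clustering), the smaller the value of α.

If α = 1 the yield reduces to the Price formula (exponential weighting); for α → ∞ it becomes the Poisson formula (no clustering). The value of α must be experimentally determined. Smaller values of α reflect higher yield and occur as a process becomes more mature.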
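The closed-form yields for the different weighting functions can be compared side by side. The sketch below writes each as a function of x = D0 Ac and includes a simple midpoint-rule evaluation of Murphy's formula as a cross-check; the function names, the integration routine, and the sample value x = 2 are assumptions made for illustration.

```python
import math

def poisson_yield(x):
    """Delta-function weighting (no clustering)."""
    return math.exp(-x)

def murphy_triangle_yield(x):
    """Symmetrical triangle weighting."""
    return ((1.0 - math.exp(-x)) / x) ** 2

def rectangle_yield(x):
    """Uniform weighting between 0 and 2*D0."""
    return (1.0 - math.exp(-2.0 * x)) / (2.0 * x)

def price_yield(x):
    """Exponential weighting (Seeds / Price)."""
    return 1.0 / (1.0 + x)

def gamma_yield(x, alpha):
    """Gamma weighting (negative binomial yield)."""
    return (1.0 + x / alpha) ** (-alpha)

def triangle_density(d, d0):
    """Triangle weighting function f(D) with mean d0."""
    if 0.0 <= d <= d0:
        return d / d0**2
    if d0 < d <= 2.0 * d0:
        return (2.0 * d0 - d) / d0**2
    return 0.0

def murphy_numeric(f, a_crit, d_max, steps=20000):
    """Midpoint-rule evaluation of Murphy's formula on [0, d_max]."""
    h = d_max / steps
    return h * sum(math.exp(-(k + 0.5) * h * a_crit) * f((k + 0.5) * h)
                   for k in range(steps))

if __name__ == "__main__":
    x = 2.0  # D0 * Ac, hypothetical
    print("Poisson  :", poisson_yield(x))
    print("Triangle :", murphy_triangle_yield(x))
    print("Rectangle:", rectangle_yield(x))
    print("Price    :", price_yield(x))
    for alpha in (1.0, 2.0, 10.0, 1e6):
        print(f"Gamma, alpha={alpha:g}:", gamma_yield(x, alpha))
```

Setting alpha to 1 reproduces the Price value, and a very large alpha approaches the Poisson value, matching the limits stated in the text.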

 

Yield With Repair Capabilities:

Yield Equations

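Here is an example. Consider a die consisting of two area components, a core area and a support area. We assume that defects are randomly distributed within the die; then the probability of an individual defect being in the core area, Pc, or in the support area, Ps, is simply the appropriate area fraction. Therefore,

$$P_c = \frac{A_{\text{core}}}{A_{\text{core}} + A_{\text{support}}}, \qquad P_s = \frac{A_{\text{support}}}{A_{\text{core}} + A_{\text{support}}}$$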

Assume further that within the core region, fatal defects can be independently repaired with probability . The repair may involve using redundant circuitry for defective circuitry, or repairing pattern defects. In the support area no repair is possible.

Support Area Yield

The support area yield is the probability that there are no fatal defects in the support area. We stated above that the distribution of defects between the core area and support area of a chip is random. Therefore, we can use the way of computing the probability mass distribution of fatal defects within a die to compute the probability mass distribution for defects in the support area. Fatal defects in the support area can be accounted for simply by substituting the critical area , in the yield formulas derived earlier, Thus, the support yield is given by

The exact form depends on the function used for .

Core Area Yield

The core area yield must include the possibility of defects being repaired. We define  as the probability that the core area is made functional by repair given that there are k fatal defects in the core area. The probability of k fatal defects, given that there are i total defects, is a binomial random variable with . Then if there are i defects in the die, the core yield  will be

Assuming that every defect is repaired with an independent, constant probability of success.  will be taken to be

The total expression for the core yield is

If the repair efficiency is zero (for nonzero defect counts) this equation will reduce to the same form as the support yield calculation. This is expected since without repairs in the core, the two regions are identical from a yield perspective.

Die Yield

The die yield is simply the product of the support and core area yields
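The repair model can be sketched numerically under explicitly stated assumptions, none of which are confirmed notation from the source: each defect independently lands in the core with probability Pc and is fatal with probability qc (so `p_fatal_core` = Pc qc), each fatal core defect is repaired independently with probability `repair_prob`, and the defect count is Poisson distributed (delta-function f(D)). All names and numbers are hypothetical.

```python
import math

def core_yield_given_defects(i, p_fatal_core, repair_prob):
    """Core yield given i defects in the die.

    Sums over k fatal core defects (binomial in i and p_fatal_core),
    each of which must be repaired for the core to function.
    """
    return sum(math.comb(i, k)
               * p_fatal_core**k * (1.0 - p_fatal_core)**(i - k)
               * repair_prob**k
               for k in range(i + 1))

def die_yield_poisson(mean_defects, q_c, p_core, repair_prob):
    """Die yield = support yield * core yield for Poisson defects."""
    p_support = 1.0 - p_core
    support = math.exp(-mean_defects * q_c * p_support)
    core = math.exp(-mean_defects * q_c * p_core * (1.0 - repair_prob))
    return support * core

if __name__ == "__main__":
    mean_defects, q_c, repair_prob = 5.0, 0.4, 0.95   # hypothetical
    for p_core in (0.5, 0.7, 0.9):
        y = die_yield_poisson(mean_defects, q_c, p_core, repair_prob)
        print(f"Pc={p_core:.1f}  die yield={y:.4f}")
```

Under these assumptions the binomial sum collapses to (1 - Pc qc (1 - r))^i, and with zero repair probability the core term takes the same form as the support term.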

 

Effects of Repair on Yield:

First, an individual defect in the support area will cause a failure with probability qC, so the more defects that are present (larger i ), the higher the probability of a failure.  Second, defects in the core area cannot be repaired with 100% efficiency, so the more that are present, the higher the probability that at least one will not be repaired.

A larger Pc value implies that defects are more likely to be in the core region where they have a high probability of being accommodated by repair. 

Here Pc is the core area fraction and Ps is the support area fraction, so that

$$P_c + P_s = 1$$

Let Pc = 0.50 and AD0 = 5.0. Then the defect probability density function Pc ( i ) is broadened and the corresponding yield values Yc ( i ) do not match Pc ( i ). This reflects the effect of the non-repairable support area and the finite repair efficiency in the core. Repair can be quite effective in increasing yield, but only if most of the chip critical area has repair capability.

 

Clustering Effects On Wafer Scale Yields:

Intrawafer Defect Density Variation

For single chip yields, defect clustering increases yield.  However, in a wafer scale system partitioned into macrocells, the system yield is a function of interacting macrocell yields rather than independent macrocell yields.  In this case clustering can potentially result in decreased system yields [1].

Let us consider a system with 2-fold block redundancy [10]. This implies that if either (or both) of the two blocks is functional, then the pair is functional from a system yield perspective. Choose a system of NB block pairs and let each individual block have critical area Ac and average defect density D0. We compare system yields for systems with different numbers of block pairs using a random defect distribution and a pseudo-exponential defect distribution.

The system yield is taken as the product of the block pair yields. The expression for system yield using the pseudo-exponential distribution with three area components is

$$Y_{sys} = \prod_{i=1}^{M_{reg}} \left[ 1 - \left( 1 - e^{-D_i A_c} \right)^{2} \right]^{N_{Bi}}$$

where Di is the local defect density of the ith region and Mreg denotes the number of regions into which a wafer is divided to achieve approximately uniform defect distributions in each. The values of the NBi are determined by NB and the distribution area fractions such that

$$N_{Bi} = f_i N_B, \qquad \sum_{i=1}^{M_{reg}} N_{Bi} = N_B$$

where fi is the areal fraction of the ith region.

Clustering decreases system yield with the deviation increasing with system size as measured by the value of NB.
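The effect can be illustrated with a small calculation. This sketch assumes a Poisson yield for each block within a region of locally uniform defect density and takes a block pair as functional when at least one block works; the three area fractions and local densities are hypothetical, chosen so that the average density matches the random case.

```python
import math

def block_pair_yield(density, a_crit):
    """Yield of a 2-fold redundant pair: at least one block functional."""
    y_block = math.exp(-density * a_crit)
    return 1.0 - (1.0 - y_block) ** 2

def system_yield(n_pairs, a_crit, fractions, densities):
    """Product of pair yields, with fractions[i] * n_pairs pairs in region i."""
    y = 1.0
    for f, d in zip(fractions, densities):
        y *= block_pair_yield(d, a_crit) ** (f * n_pairs)
    return y

# Hypothetical three-region wafer (values are assumptions).
FRACTIONS = (0.6, 0.3, 0.1)
DENSITIES = (0.5, 1.5, 2.5)
D0 = sum(f * d for f, d in zip(FRACTIONS, DENSITIES))   # average density
A_C = 0.1

if __name__ == "__main__":
    for n_pairs in (10, 50, 100):
        clustered = system_yield(n_pairs, A_C, FRACTIONS, DENSITIES)
        random_ = system_yield(n_pairs, A_C, (1.0,), (D0,))
        print(f"NB={n_pairs:3d}  random={random_:.4f}  clustered={clustered:.4f}")
```

For these values the clustered system yield falls below the random one, and the gap widens as the number of block pairs grows.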

Triple modular redundancy (TMR) also suffers yield degradation due to intrawafer clustering. This is demonstrated by modeling a system as consisting of an intrinsic area, Ain, which is partitioned into small areas that are triplicated to implement TMR. Each partition is assumed to have a variable logic circuitry area, Alogic, and a constant voter circuitry area, Avoter. The triple yield, YT, is the probability that at least two of the three partitions are functional. YT is given by

$$Y_T = Y_P^{3} + 3 Y_P^{2} (1 - Y_P) = 3 Y_P^{2} - 2 Y_P^{3}$$

where YP is the yield of each logic-voter partition.
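The triple yield is easy to evaluate directly. The sketch below assumes the three partitions fail independently and, as a further assumption not stated in the source, models each partition with a simple Poisson yield over its logic plus voter area; the names and numbers are illustrative.

```python
import math

def partition_yield(d0, a_logic, a_voter):
    """Assumed Poisson yield of one logic-plus-voter partition."""
    return math.exp(-d0 * (a_logic + a_voter))

def triple_yield(y_partition):
    """Probability that at least two of three independent partitions work."""
    return y_partition**3 + 3.0 * y_partition**2 * (1.0 - y_partition)

if __name__ == "__main__":
    d0, a_voter = 1.0, 0.02   # hypothetical defect density and voter area
    for a_logic in (0.05, 0.1, 0.5, 1.0):
        y_p = partition_yield(d0, a_logic, a_voter)
        print(f"Alogic={a_logic:4.2f}  YP={y_p:.4f}  YT={triple_yield(y_p):.4f}")
```

Triplication raises the yield only when the partition yield exceeds one half; below that point the triple is less likely to work than a single partition, which is one reason the partitions are kept small.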

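The system yield, Ysys, is given by the product of the triple yields

$$Y_{sys} = \prod_{i=1}^{M_{reg}} \left( Y_{Ti} \right)^{N_{Ti}}$$

where YTi is the triple yield evaluated at the local defect density of the ith region. The values of the individual NTi are determined by the areal fractions assumed in the defect distribution such that

$$N_{Ti} = f_i N_T$$

where fi is the areal fraction of the ith region and the total number of triples, NT, is given by

$$N_T = \frac{A_{in}}{A_{logic}}$$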

 

Interwafer Defect Density Variation

Another yield consideration is interwafer (wafer to wafer) defect density variation. This is conceptually similar to intrawafer variation. For single chip yield, Murphy’s formula can simply be applied twice:

 

 is the average function reflecting the distribution of defect densities among the wafers considered. Both could actually have temporal as well as spatial variations, but that is not considered here. Strictly,  would be a discrete function (a sum of delta functions perhaps), but an appropriate continuous approximation may be used.

 

The Combination of Interwafer and Intrawafer Variation:

 

For a single chip yield, interwafer and intrawafer variation can be lumped together. The chips can be viewed as part of one super wafer which is characterized by an appropriate f(D). A processing line can then be characterized using the versatile gamma distribution function with an appropriate value of α.

In wafer scale products with redundancy, the product yield is more complicated. A general expression is

The exact form of  depends on the type of redundancy used.

In systems using 2-fold block redundancy and triple modular redundancy, intrawafer clustering can significantly reduce wafer yields. This is because the wafer yield is a product of macrocell yields, so any low-yielding macrocell reduces the entire yield. With single chips, wafer yield is a spatial average over many small independent chips, and is thus more robust.

 

The Total Yield of a Batch of Wafer Scale Systems:

 

The total yield of a batch of wafer scale systems is again an average over independent systems, so interwafer variation for a given average defect density should increase yield. Consider the 2-fold block redundant system discussed earlier with the pseudo-exponential intrawafer distribution. Also consider interwafer variation as represented by a pseudo-triangle distribution (weighted delta functions with areal fractions < 0.2, 0.6, 0.2 > and local defect densities < 2.0, 5.0, 8.0 >). Both of these distributions have the same average defect density; for the pseudo-triangle values above it is 5.0.
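The batch average can be sketched using the pseudo-triangle interwafer values quoted above. For simplicity this sketch assumes defects are random within each wafer (no intrawafer clustering) and uses a 2-fold block redundant system; the block critical area and number of block pairs are hypothetical, and the result applies only to these parameter choices.

```python
import math

def wafer_system_yield(d0_wafer, a_crit, n_pairs):
    """2-fold block redundant system yield on one wafer with density d0_wafer."""
    y_block = math.exp(-d0_wafer * a_crit)
    return (1.0 - (1.0 - y_block) ** 2) ** n_pairs

def batch_yield(weights, wafer_densities, a_crit, n_pairs):
    """Average system yield over a batch of independent wafers."""
    return sum(w * wafer_system_yield(d, a_crit, n_pairs)
               for w, d in zip(weights, wafer_densities))

# Pseudo-triangle interwafer distribution from the text.
WEIGHTS = (0.2, 0.6, 0.2)
WAFER_DENSITIES = (2.0, 5.0, 8.0)
MEAN_D0 = sum(w * d for w, d in zip(WEIGHTS, WAFER_DENSITIES))

# Hypothetical system parameters (assumptions).
A_C = 0.05
N_PAIRS = 20

if __name__ == "__main__":
    print("no interwafer variation  :", wafer_system_yield(MEAN_D0, A_C, N_PAIRS))
    print("with interwafer variation:", batch_yield(WEIGHTS, WAFER_DENSITIES, A_C, N_PAIRS))
```

Here the low-density wafers contribute more working systems than the high-density wafers lose, so the batch average exceeds the yield computed at the mean defect density.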

 

Summary:

Yield prediction is a complex issue involving design and process features, equipment characteristics, and human learning.  An understanding of yield issues is essential to understand the usefulness of repair capabilities and the feasibility of projects like wafer scale integration. 

Defect clustering has a significant effect on yield predictions. Defect clustering in single chips was treated in terms of weighting functions in Murphy’s formula. The most physically intuitive function uses weighted delta functions to partition a wafer into regions having locally random distributions. The most versatile function is the gamma function, which can approximate a wide variety of clustering modes through its adjustable parameter α, which is related to the variance of the defect density distribution.

A simple model was used to explore the effect of repair within a chip. Repair is quite effective in increasing yield if the majority of a chip’s critical area is protected by the repair capability.

Yield degradation in wafer scale systems due to intrawafer defect clustering is a result of the multiplicative nature of the system yield.  Degradation in systems using 2-fold block redundancy and triple modular redundancy (TMR) is potentially significant.

In systems using block redundancy or TMR, interwafer defect clustering increases yield when yields are compared on the basis of an ensemble average defect density. The conflicting effects of defect clustering within wafers and between wafers on which systems containing redundancy are fabricated highlight the importance of fully understanding what defect densities and yield statistics actually mean.

 

 
