 
      
     
 
    
    
    
    The objective of this
        exercise is to learn as much as we can about the underlying
        properties of a digitized signal using the signal processing
        tools in this book and, if possible, to obtain a mathematical
        description of the signal (script).
At
        first glance, the signal (MysteryPeak.mat) appears to be a single, asymmetrical peak
        with a maximum at x=55.5. The signal-to-noise ratio seems to be
        very good - there is little visible noise unless you look very
        closely - and the signal begins and ends near zero, so baseline
        correction is likely not an issue. The bad news is that we do
        not know anything else. The asymmetry might be due to some
        asymmetrical process applied to an originally symmetrical peak
        shape, but it could be a group of closely spaced overlapping
        peaks, which is suggested by the faint bumps in the shape. Some
        quick preliminary curve fitting can be done using command line peakfit.m: 
        
       for a
        two-Gaussian model (shape 1)
     [FitResults,GOF]=peakfit([x
          y],0,0,2,1)
          
        for
                a single Logistic model (shape 3)
       
    [FitResults,GOF]=peakfit([x
                y],0,0,1,3)
    for a 4
                  exponentially broadened Gaussian model (shape 39)
            [FitResults,GOF]=peakfit([x y],0,0,4,39,1)
    
        Alternatively, it is even easier to use the interactive peak
        fitter ipf.m for this
        purpose; you can quickly change the model shape, number of
        peaks, starting guesses, data region to be fitted, etc., with
        single keystrokes and mouse clicks. Either way, the three
        initial fits in the figures show that the signal contains a
        small amount of random noise, which appears to be white (so the
        signal has probably not been smoothed, which is fortunate) and
        which has a relative standard deviation of about 0.2%, based on
        1/5th of the visual peak-to-peak value in the
        residual plot. But
        unfortunately these fits are not successful because their
        fitting errors (0.5 to 0.8%) are all significantly larger than
        the 0.2% random noise. Trying different shapes and greater
        numbers of peaks does not help either, resulting in either
        higher fitting errors, unstable fits, or zero peak heights for
        additional peaks (as in the four-Gaussian fit).  
      
       Another approach
        to the problem of asymmetrical peaks is to use the technique of
        first-derivative
          symmetrization described previously. This applies specifically to exponential
        broadening, a commonly-    encountered peak broadening mechanism. The
        idea is that if you compute the first derivative of an
        exponentially broadened peak, multiply it by a weighting factor
        equal to the time constant of the exponential, and add it to the
        original broadened signal, the result will be the original peak
        before broadening, which makes the peak overlap less severe.
        This works for any
original
          peak shape. Even if you
        do not know the time constant beforehand, you can try different
        values until the baseline after the peak is as low as possible
        but not negative, as shown in this
GIF
          animation. This is easily
        done interactively in iSignal, version 7, which has smoothing,
        derivatives, symmetrization, and curve fitting, or by using the
      symmetrize.m function. The derivative of y with respect
        to x, by the "derivxy.m" function, shown by the green line in the
        figure above, is quite noisy. As usual, we must smooth the
        derivatives of noisy signals to make them useful, but we must
        not over-smooth and distort the signals. As a rule of thumb, a
        smooth width equal to 1/10th of the number of data
        points in the halfwidth does not distort the signal visibly.
Another approach
        to the problem of asymmetrical peaks is to use the technique of
        first-derivative
          symmetrization described previously. This applies specifically to exponential
        broadening, a commonly-    encountered peak broadening mechanism. The
        idea is that if you compute the first derivative of an
        exponentially broadened peak, multiply it by a weighting factor
        equal to the time constant of the exponential, and add it to the
        original broadened signal, the result will be the original peak
        before broadening, which makes the peak overlap less severe.
        This works for any
original
          peak shape. Even if you
        do not know the time constant beforehand, you can try different
        values until the baseline after the peak is as low as possible
        but not negative, as shown in this
GIF
          animation. This is easily
        done interactively in iSignal, version 7, which has smoothing,
        derivatives, symmetrization, and curve fitting, or by using the
      symmetrize.m function. The derivative of y with respect
        to x, by the "derivxy.m" function, shown by the green line in the
        figure above, is quite noisy. As usual, we must smooth the
        derivatives of noisy signals to make them useful, but we must
        not over-smooth and distort the signals. As a rule of thumb, a
        smooth width equal to 1/10th of the number of data
        points in the halfwidth does not distort the signal visibly.  Our peak has about
        530 points (=halfwidth(1:length(y),y)), so a smooth width near 53
        will not distort the signal peak, but it does eliminate most of
        the noise from the derivative (red line, above right). Also, we
        can see that the derivative, in y/x units, is comparable in
        numerical magnitude to the original signal, so the weighting
        factor (in x units) is probably somewhere near 1.0. Next, we add
        the first derivative times that factor to the original signal,
        looking at the trailing edge as we try six different factor
        values near 1.0. The graph on the left shows that the optimum
        value is about 1.25.
Our peak has about
        530 points (=halfwidth(1:length(y),y)), so a smooth width near 53
        will not distort the signal peak, but it does eliminate most of
        the noise from the derivative (red line, above right). Also, we
        can see that the derivative, in y/x units, is comparable in
        numerical magnitude to the original signal, so the weighting
        factor (in x units) is probably somewhere near 1.0. Next, we add
        the first derivative times that factor to the original signal,
        looking at the trailing edge as we try six different factor
        values near 1.0. The graph on the left shows that the optimum
        value is about 1.25. 
      
       When this symmetrization is applied to the
        entire signal, the result, shown below on the right, has
        more distinct bumps. When that modified signal is used for curve
        fitting, it is found that a 3-Gaussian model works quite well,
        with a fitting error of only 0.25%. This is evidence that the
        signal actually consists of three exponentially broadened
        Gaussians. Ordinarily there is no independent way to check the
        accuracy of the peak parameters so measured, but - full
        disclosure - the signal in the case was not actually unknown but
        rather was generated by the file MysteryPeaks.m. The true time constant is 1.3, the peak
        positions are 53, 55, and 57.5 and the peak areas are each equal
        to 1.0. The curve-fit results after symmetrization (shown 
        on the right) are within 0.1% for the peak positions and within
        2% of the areas. In contrast, direct fitting of three
        equal-alpha exponential Gaussians to the original data, which is
        now possible because we have determined a value for the time
        constant (expressed as the number of data points: symmfactor./(x(2)-x(1)), gives less accurate results
        in this case (graphic), probably because the
        overlap of the asymmetrical peak is greater. Moreover, it takes
        three times longer to compute.
 on the right, has
        more distinct bumps. When that modified signal is used for curve
        fitting, it is found that a 3-Gaussian model works quite well,
        with a fitting error of only 0.25%. This is evidence that the
        signal actually consists of three exponentially broadened
        Gaussians. Ordinarily there is no independent way to check the
        accuracy of the peak parameters so measured, but - full
        disclosure - the signal in the case was not actually unknown but
        rather was generated by the file MysteryPeaks.m. The true time constant is 1.3, the peak
        positions are 53, 55, and 57.5 and the peak areas are each equal
        to 1.0. The curve-fit results after symmetrization (shown 
        on the right) are within 0.1% for the peak positions and within
        2% of the areas. In contrast, direct fitting of three
        equal-alpha exponential Gaussians to the original data, which is
        now possible because we have determined a value for the time
        constant (expressed as the number of data points: symmfactor./(x(2)-x(1)), gives less accurate results
        in this case (graphic), probably because the
        overlap of the asymmetrical peak is greater. Moreover, it takes
        three times longer to compute. 
        
      This page is part of "A Pragmatic Introduction to Signal
        Processing", created and maintained by Tom O'Haver, Professor
    Emeritus, Department of Chemistry and Biochemistry, The University
    of Maryland at College Park. Comments, suggestions and questions
    should be directed to Prof. O'Haver at toh@umd.edu. Updated July, 2022.