Wednesday, January 21, 2015

Think DSP Chapter 5: Autocorrelation

It's time for Chapter 5: Autocorrelation!  If you don't think autocorrelation is interesting and fun, you are mistaken.  It turns out that autocorrelation, or something like it, explains the Missing Fundamental phenomenon I presented in last week's article.

Here are links to the previous installments:

Chapter 1: Signals and Spectrums
Chapter 2: Harmonics
Chapter 3: Chirps
Chapter 4: Noise

Autocorrelation

In Chapter 5, I explain autocorrelation by starting with the statistical definition of correlation, moving on to serial correlation, then generalizing to the autocorrelation function.

Signals often represent measurements of quantities that vary in time. For example, sound signals represent air pressure varying over time.
Measurements like this almost always have serial correlation, which is the correlation between each element and the next (or the previous). To compute serial correlation, we can shift a signal and then compute the correlation of the shifted version with the original.
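import numpy

def corrcoef(xs, ys):
    # numpy.corrcoef returns a 2x2 matrix; the off-diagonal entry is the correlation.
    return numpy.corrcoef(xs, ys)[0, 1]

def serial_corr(wave, lag=1):
    # Correlate the wave with a copy of itself shifted by lag samples.
    n = len(wave.ys)
    y1 = wave.ys[lag:]
    y2 = wave.ys[:n-lag]
    return corrcoef(y1, y2)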
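
As a quick sanity check of serial_corr, here is a minimal sketch that compares white noise with a sine wave at lag 1. SimpleWave is a hypothetical stand-in that only stores the samples in ys (it is not the Wave class from the book's code), and the seed, frequency, and frame rate are arbitrary choices for illustration.

```python
import numpy

class SimpleWave:
    """Hypothetical stand-in for a Wave object: it only stores the samples in ys."""
    def __init__(self, ys):
        self.ys = numpy.asarray(ys, dtype=float)

def serial_corr(wave, lag=1):
    # Correlate the samples with a copy of themselves shifted by lag.
    n = len(wave.ys)
    y1 = wave.ys[lag:]
    y2 = wave.ys[:n - lag]
    return numpy.corrcoef(y1, y2)[0, 1]

rng = numpy.random.default_rng(17)  # fixed seed so the run is repeatable

# White noise: each sample is drawn independently of its neighbors.
noise = SimpleWave(rng.normal(size=10000))

# A 440 Hz sine sampled at 11025 frames per second: neighboring samples are nearly equal.
ts = numpy.arange(10000) / 11025
sine = SimpleWave(numpy.sin(2 * numpy.pi * 440 * ts))

noise_corr = serial_corr(noise)
sine_corr = serial_corr(sine)
print(noise_corr)  # close to 0
print(sine_corr)   # close to 1
```

The noise has almost no serial correlation, while the smoothly varying sine is strongly correlated with its shifted copy.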

corrcoef is a convenience function that simplifies the interface to numpy.corrcoef, which computes the correlation coefficient.
serial_corr takes a Wave object and lag, which is the integer number of places to shift the wave.  You can think of serial_corr as a function that maps from each value of lag to the corresponding correlation, and we can evaluate that function by looping through values of lag:
def autocorr(wave):
    lags = range(len(wave.ys)//2)
    corrs = [serial_corr(wave, lag) for lag in lags]
    return lags, corrs

autocorr takes a Wave object and returns the autocorrelation function as a pair of sequences: lags is a sequence of integers from 0 up to, but not including, half the length of the wave; corrs is the sequence of serial correlations for each lag.
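
To see what the autocorrelation function looks like for a periodic signal, here is a minimal sketch that applies autocorr to a 100 Hz sine and reads the period off the highest peak away from lag 0. SimpleWave is a hypothetical stand-in for a Wave object, and the frame rate, signal length, and search window (low, high) are assumptions chosen for this example.

```python
import numpy

class SimpleWave:
    """Hypothetical stand-in for a Wave object: it only stores the samples in ys."""
    def __init__(self, ys):
        self.ys = numpy.asarray(ys, dtype=float)

def serial_corr(wave, lag=1):
    n = len(wave.ys)
    y1 = wave.ys[lag:]
    y2 = wave.ys[:n - lag]
    return numpy.corrcoef(y1, y2)[0, 1]

def autocorr(wave):
    lags = range(len(wave.ys) // 2)
    corrs = [serial_corr(wave, lag) for lag in lags]
    return lags, corrs

framerate = 4000                      # samples per second (assumed)
ts = numpy.arange(800) / framerate    # 0.2 seconds of signal
wave = SimpleWave(numpy.sin(2 * numpy.pi * 100 * ts))

lags, corrs = autocorr(wave)

# Look for the strongest correlation in a window that excludes lag 0,
# where the correlation is trivially 1.
low, high = 10, 60
peak_lag = low + int(numpy.argmax(corrs[low:high]))
estimated_freq = framerate / peak_lag
print(peak_lag, estimated_freq)  # 40 100.0
```

The peak lands at a lag of 40 samples, which is one period of a 100 Hz signal at 4000 samples per second. The search window matters: a periodic signal also peaks at every multiple of its period, so the window has to bracket the lag you expect.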
The following figure shows autocorrelation functions for pink noise with three values of β (see Chapter 4: Noise). For low values of β, the signal is less correlated, and the autocorrelation function drops toward zero relatively quickly. For larger values, serial correlation is stronger and drops off more slowly. With β=1.2, serial correlation is quite strong even for long lags; this phenomenon is called long-range dependence, because it indicates that each value in the signal depends on many preceding values.
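
The trend can be reproduced with a rough sketch. It assumes pink noise can be approximated by scaling the spectrum of white noise so that power is proportional to 1/f^β. make_pink_noise is a hypothetical helper written for this example, not the book's implementation, and of the β values below only 1.2 appears in the text; the other two are illustrative.

```python
import numpy

def make_pink_noise(n, beta, seed=17):
    """Hypothetical helper: noise whose power is proportional to 1/f**beta."""
    rng = numpy.random.default_rng(seed)      # same seed, so every beta shapes the same white noise
    white = rng.normal(size=n)
    spectrum = numpy.fft.rfft(white)
    freqs = numpy.fft.rfftfreq(n)
    scale = numpy.zeros_like(freqs)           # leaves the zero-frequency term at 0
    scale[1:] = freqs[1:] ** (-beta / 2)      # amplitude ~ f**(-beta/2), so power ~ f**(-beta)
    return numpy.fft.irfft(spectrum * scale, n)

def serial_corr(ys, lag=1):
    # Same computation as in the text, applied directly to an array of samples.
    n = len(ys)
    return numpy.corrcoef(ys[lag:], ys[:n - lag])[0, 1]

betas = [0.3, 0.7, 1.2]   # assumed values for illustration
results = {}
for beta in betas:
    ys = make_pink_noise(16384, beta)
    results[beta] = [serial_corr(ys, lag) for lag in (1, 10, 100)]
    print(beta, results[beta])
```

Each printed row lists the serial correlation at lags 1, 10, and 100. Under these assumptions the values should rise with β and, for the largest β, stay well above zero even at lag 100.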

With that introduction, you should check out this IPython notebook, which presents this example, then uses autocorrelation to estimate the fundamental frequency of a vocal recording.

The missing fundamental: found!

The previous installment included this notebook, which presents the phenomenon of the Missing Fundamental.  It turns out that autocorrelation sheds some light on this phenomenon.  You can read an explanation in this notebook.
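
To make the connection concrete, here is a small sketch with a synthetic signal rather than the recording used in the notebooks. It assumes a tone built from components at 200, 300, and 400 Hz, which are harmonics of 100 Hz, with nothing at 100 Hz itself. The frame rate, duration, and search window are arbitrary choices for this example.

```python
import numpy

framerate = 4000                      # samples per second (assumed)
ts = numpy.arange(800) / framerate    # 0.2 seconds of signal

# Harmonics of 100 Hz, with the 100 Hz fundamental left out.
ys = sum(numpy.cos(2 * numpy.pi * f * ts) for f in (200, 300, 400))

def serial_corr(ys, lag=1):
    n = len(ys)
    return numpy.corrcoef(ys[lag:], ys[:n - lag])[0, 1]

# The spectrum has essentially no energy at 100 Hz...
amps = numpy.abs(numpy.fft.rfft(ys))
freqs = numpy.fft.rfftfreq(len(ys), d=1 / framerate)
index_100 = int(numpy.argmin(numpy.abs(freqs - 100)))
print(amps[index_100] / amps.max())   # tiny compared with the harmonics

# ...but the autocorrelation function still peaks at a lag of one 100 Hz period.
corrs = [serial_corr(ys, lag) for lag in range(len(ys) // 2)]
low, high = 10, 60
peak_lag = low + int(numpy.argmax(corrs[low:high]))
print(peak_lag, framerate / peak_lag)  # 40 100.0
```

The three components only line up again after 1/100 of a second, so the signal repeats with the period of the absent fundamental, and that shared period is what the autocorrelation function picks up.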

Next week, for people who like the math, I derive the Discrete Cosine Transform.

