//LinearInterpolation.c
void linearInterpolateBuffer(float* previousFrame, int numChannels, float* input, int inNumFrames, float* output, int outNumFrames)
{
  int i, j, index;
  double distance, prevValue, nextValue;
  for(i=0; i<outNumFrames; i++)
    {
      for(j=0; j<numChannels; j++)
        {
          //position of output frame i, measured in input frames
          distance = i * (inNumFrames / (double)outNumFrames);
          //index of channel j in the right-hand known frame
          index = ((int)distance) * numChannels + j;
          nextValue = input[index];
          //the left-hand known frame is one frame earlier, or previousFrame at the start of the buffer
          prevValue = distance < 1 ? previousFrame[j] : input[index-numChannels];
          //keep only the fractional part: how far the new sample lies between the known samples
          distance -= (int)distance;
          output[i*numChannels+j] = (nextValue-prevValue) * distance + prevValue;
        }
    }
  //save the last input frame so that the next call can continue from it
  for(j=0; j<numChannels; j++)
    previousFrame[j] = input[(inNumFrames - 1) * numChannels + j];
}
Builds On
AIFF Template
Explanation of the Concepts
This is a linear interpolation algorithm that is capable of stretching out or contracting a buffer of interleaved audio data.
Linear interpolation fills in unknown samples by drawing straight line segments between the known samples and choosing values that fall along those segments.
Remember that, mathematically, if we want to draw a straight line that passes through two adjacent known samples, the equation is:
y=mx+b
where, in this case, if we always choose a coordinate system such that the left known sample lies on x=0 and the right on x=1, then y is the value of the unknown sample, m is the slope of the line, x represents where exactly the unknown point is in relation to the known points (halfway between, a third of the way...), and b is the value of the left known sample. The slope of a line is given by:
slope=rise/run
and since our coordinate system was chosen to make the run 1, the slope is just equal to the rise, or the right known sample minus the left known sample. So the equation becomes:
y=(rightSample - leftSample)*x+leftSample
In the example above, x is 0.5 for each line segment (i.e. the unknown sample lies halfway between the known samples), but in practice it needs to be calculated for each sample based on the ratio between the input and output lengths.
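As a minimal sketch, the equation can be wrapped in a small helper that computes one interpolated value. The name interpolateSample is hypothetical and does not appear in LinearInterpolation.c; it assumes that x has already been reduced to the range 0 to 1.

```c
#include <assert.h>

/* Hypothetical helper (not part of LinearInterpolation.c).
   Returns the value at position x on the straight line that passes
   through (0, leftSample) and (1, rightSample). */
double interpolateSample(double leftSample, double rightSample, double x)
{
    /* x is the fraction of the way from the left sample to the right one */
    assert(x >= 0.0 && x <= 1.0);
    return (rightSample - leftSample) * x + leftSample;
}
```

With x equal to 0.5 this returns the midpoint of the two known samples, and with x equal to 0 it returns the left sample unchanged.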
The above graph raises an issue. The new sample points lie halfway between the old sample points, which indicates that the audio was intended to be stretched to twice its original duration. In the graph, however, it has only been stretched to 7/4 of its original duration (the rightmost sample still has a duration, even though the graph ends there). For musical applications, this could have a noticeable effect on pitch. To get the intended ratio, we will need one more interpolated sample that comes before (or after, although this entire textbook uses before) the first known sample, so that there can be exactly one interpolated sample for each known sample. To find the first interpolated sample, we will need to know the value of the previous known sample.
In many cases a long file will be interpolated in small chunks, so the last known sample from the previous chunk can be used to find the first interpolated sample of the new chunk. Of course, at the beginning of the audio file there is no previous sample, so one will have to be made up. It could be 0, it could be the value of the first sample, or it could be found by extrapolating backwards from the known samples. For audio, extrapolating is more trouble than it is worth, and is likely to cause clipping.
Another problem is that, when working with multichannel audio, each channel has to be interpolated separately. If a harpsichord is being sent on the right channel, and a trumpet on the left, then it would not normally make much sense to try to find values that lie between them. Because the channels are interleaved, this will be somewhat tricky. We will just have to remember that the right-hand known sample will not necessarily be the next sample in the buffer: it will be numChannels samples beyond the left-hand known sample.
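The indexing rule for interleaved audio can be sketched as follows. Both helper names are hypothetical and are not part of LinearInterpolation.c; the sketch assumes that the samples of one frame are stored next to each other, one per channel.

```c
#include <assert.h>

/* Hypothetical helper: position of one channel's sample inside an
   interleaved buffer. */
int interleavedIndex(int frame, int channel, int numChannels)
{
    assert(channel >= 0 && channel < numChannels);
    return frame * numChannels + channel;
}

/* Hypothetical helper: value halfway between two adjacent frames of a
   single channel. The right-hand sample sits numChannels samples after
   the left-hand one, so the other channels never enter the result. */
float channelMidpoint(const float* buffer, int frame, int channel, int numChannels)
{
    float left  = buffer[interleavedIndex(frame,     channel, numChannels)];
    float right = buffer[interleavedIndex(frame + 1, channel, numChannels)];
    return (right - left) * 0.5f + left;
}
```

For a stereo buffer holding the two frames {0.0, 1.0} and {0.5, -1.0}, the midpoint of channel 0 is computed from 0.0 and 0.5 only, and the midpoint of channel 1 from 1.0 and -1.0 only.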
Explanation of the Code
As arguments, this function takes the previous frame of audio (used to find the first interpolated sample[s]), the number of interleaved audio channels, the input buffer, the number of frames in the input buffer, an empty buffer where the output is to be written, and the number of frames in the output buffer. previousFrame is the only one of these that is not self-explanatory. It should be an array with enough space for one frame of audio data, that is, numChannels samples. If you are interpolating the beginning of some audio data, and there is no previous frame, then its values should probably be 0. If, however, you are interpolating a large audio file in small segments, then this function will save the last frame from the previous segment in this array, so that you can just pass it in as-is for the next chunk of data. In other words, the function just uses this argument as storage space that outlives the function call. If the number of channels were not variable, then this could be accomplished with an 'extern' variable or a return value. In any event, the last frame of the un-interpolated data is written into this array by the final loop of the function, after all of the output has been computed.
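The following sketch shows one way previousFrame might be carried from one chunk to the next. The driver stretchMonoByTwo is hypothetical, and it assumes mono audio, a fixed stretch factor of two, and a total length that is a whole number of chunks. linearInterpolateBuffer is repeated here only so that the sketch compiles on its own.

```c
#include <assert.h>

void linearInterpolateBuffer(float* previousFrame, int numChannels, float* input, int inNumFrames, float* output, int outNumFrames)
{
    int i, j, index;
    double distance, prevValue, nextValue;
    for(i=0; i<outNumFrames; i++)
    {
        for(j=0; j<numChannels; j++)
        {
            distance = i * (inNumFrames / (double)outNumFrames);
            index = ((int)distance) * numChannels + j;
            nextValue = input[index];
            prevValue = distance < 1 ? previousFrame[j] : input[index-numChannels];
            distance -= (int)distance;
            output[i*numChannels+j] = (nextValue-prevValue) * distance + prevValue;
        }
    }
    for(j=0; j<numChannels; j++)
        previousFrame[j] = input[(inNumFrames - 1) * numChannels + j];
}

/* Hypothetical driver (an assumption for illustration): stretches a mono
   signal to twice its length, chunkFrames input frames at a time.
   output must have room for 2 * numFrames samples. */
void stretchMonoByTwo(float* input, int numFrames, int chunkFrames, float* output)
{
    /* nothing precedes the first chunk, so start from silence */
    float previousFrame[1] = {0.0f};
    int start;
    assert(chunkFrames > 0 && numFrames % chunkFrames == 0);
    for(start = 0; start < numFrames; start += chunkFrames)
    {
        /* each call reads previousFrame, then overwrites it with the
           last frame of the chunk it has just processed */
        linearInterpolateBuffer(previousFrame, 1, input + start, chunkFrames,
                                output + 2 * start, 2 * chunkFrames);
    }
}
```

Because the same previousFrame array is passed to every call, the first output sample of each chunk is interpolated from the last input sample of the chunk before it, and the chunk boundaries leave no discontinuity in the output.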
The equation given above:
y=(rightSample - leftSample)*x+leftSample
is actually evaluated by the last statement of the inner loop:
output[i*numChannels+j] = (nextValue-prevValue) * distance + prevValue;
All of the other lines are dedicated to figuring out what, exactly, the previous value, next value, and distance actually are.
For every frame in the output buffer, the first statement of the inner loop, distance = i * (inNumFrames / (double)outNumFrames), finds the analogous location in the input buffer, which may lie between frames. The next statement truncates that location to a whole frame number and turns it into the actual index of the current channel's sample in that frame, which serves as the next frame. The sample at that index is then stored in nextValue, to be used as the right-hand sample of the line segment that is being calculated. The assignment to prevValue finds the left-hand sample: it takes the sample one frame before the right-hand sample, unless the right-hand sample is in the first frame of the input buffer, in which case it takes the corresponding sample from previousFrame. Finally, distance -= (int)distance finds how far between the known samples the interpolated sample lies, by removing the integer portion of the total distance into the input buffer. For example, if the current frame of the output buffer is analogous to the point 435.8 frames from the beginning of the input buffer, then the interpolated sample lies at 0.8 on the x axis (where the previous frame is at 0 and the next frame is at 1).
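The position calculation on its own can be sketched as below. The helper locateInInput is hypothetical and is not part of LinearInterpolation.c; it only isolates the two steps that split a position into a whole frame number and a fraction.

```c
#include <assert.h>

/* Hypothetical helper: maps an output frame onto the input buffer and
   splits the resulting position into a whole number of input frames and
   the fraction left over. */
void locateInInput(int outFrame, int inNumFrames, int outNumFrames, int* wholeFrames, double* fraction)
{
    double distance;
    assert(outFrame >= 0 && outNumFrames > 0);
    distance = outFrame * (inNumFrames / (double)outNumFrames);
    /* the cast truncates, which rounds down because distance is never negative */
    *wholeFrames = (int)distance;
    /* what remains is the x coordinate between the two known samples */
    *fraction = distance - *wholeFrames;
}
```

When 3 input frames are contracted from 4 output frames, output frame 3 lands 2.25 frames into the input, so the helper reports whole frame 2 and a fraction of 0.25. A position of 435.8 splits the same way into 435 and a fraction of roughly 0.8, subject to floating-point rounding.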