My battle with shy_fft.h and what it taught me.... (ShyFFT Quick Guide)

I have been trouble shooting a phase vocoder based pitch shift effect for nearly 6 months. Today I finally made a breakthrough and I would like to share what I have learned about the stmlib FFT header function, shy_fft.h! Any good info on using this FFT function is few and far between so I figured this must be helpful to someone out there.

There are definitely somethings I don’t know so perhaps people could share other valuable information they may have in the replies. I won’t be going in to any of the DSP theory behind Fast Fourier Transform here but will rather focus purely on using the ShyFFT class

Using shy_fft in your Daisy project
Simply copy shy_fft.h from DaisyExamples/stmlib/fft/ into your project folder and #include it in your c++ file.
Now you can instantiate a shy_fft template class as follows

#define FFT_SIZE 1024
ShyFFT<float, FFT_SIZE, RotationPhasor> i_fft;

The template class needs the following parameters:

  1. Data type (default is float)
  2. The FFT length
  3. The Phasor

So here is the first thing I don’t know a whole lot about and thats the Phasor class parameter. I understand it effects how the trigonometric data used in the FFT calculation is derived but I can’t really comment on the pros/cons of either. Options include either RotationPhasor or LutPhasor. The definitions of these phasor classes can also be found in the shy_fft.h header

FFT Analysis
FFT analysis is performed on an array of audio samples as follows:

float input_buffer[FFT_SIZE];
float fft_buffer[FFT_SIZE];
i_fft.Direct(input_buffer, fft_buffer);

Here, a fast fourier transform is performed on the input buffer. The phase and magnitude information of the different frequency bins is stored in the fft_buffer (more on how that looks later).

FFT Synthesis
FFT synthesis is performed on the processed fft buffer as follows:

float output_buffer[FFT_SIZE];
float fft_buffer[FFT_SIZE];
i_fft.Inverse(fft_buffer, output_buffer);

Here, an inverse fast fourier transform is performed on the fft buffer, converting it back to a buffer of audio samples

Arrangement of FFT Buffer Phase Bins !! (The big issue)
The FFT of a length-N signal is always a length-N complex signal. When the input is real (i.e. our audio buffer), that complex output signal has some symmetry, so all the information lives in the first N / 2 bins. Real FFT functions return those N / 2 complex numbers as an array of N real numbers, but there isn’t a standard convention for how those numbers should be arranged.

Here is the issue that kept me scratching my head for so long. I had read somewhere that these N/2 complex values were stored in the fft output buffer as follows:
{real[0], real[1], real[2], …, real[N / 2 - 1], imag[0], imag[1], imag[2], …, imag[N / 2 - 1]}.

After a whole bunch of careful testing with a controlled source, what I discovered is that the actual arrangement is the following:
{imag[0], imag[1], imag[2], …, imag[N / 2 - 1], real[0], real[1], real[2], …, real[N / 2 - 1]}.

The imaginary component of the different complex frequency bin data comes first. This is then followed by the real components.

Therefore, the magnitude and phase of the different frequency bins can be calculated as follows:

float phase[FFT_SIZE/2];
float magnitude[FFT_SIZE/2];

for (int i = 0; i < FFT_SIZE/2; i++) {
     magnitude[i] = sqrt(fft_buffer[i + FFT_SIZE/2] * fft_buffer[i + FFT_SIZE/2] + fft_buffer[i] * fft_buffer[i]);
     phase[i] = atan2_approx(fft_buffer[i], fft_buffer[i + FFT_SIZE/2]);
}

My pitchshifters main issue was solved by swapping the two arguments in the atan2 function and by correcting how I stored the processed fft_buffer prior to synthesis

Weird Scaling of the output buffer
The final thing to mention is that after calling the inverse FFT it is necessary to divide the output audio buffer by FFT_SIZE. I haven’t a notion why this is the case but it is needed in order to get unity input/output volume with no frequency processing of the fft_buffer.

This seems to be a theme with stmlib functions as I found it necessary to divide the stmlib atan2 function output by 10,000 aswell when testing with that. Weird!

Hope this proves helpful to someone!

3 Likes

I have been facing similar battles with the GitHub - bkshepherd/DaisySeedProjects: A collection of hardware and software projects based around the Electro-Smith Daisy Seed project!

How is it going now that you got past those hurdles?

I am VERY happy with my delay based pitch shifter, but the latency for more than few semitones (like an octave) is pretty unusable. I still have a blast with it and have implemented nearly everything from the Digitech Ricochet (latching and momentary modes with ramp to/from times in momentary).

I then dabbled with trying a FFT version Comparing main...refactorToUseACommonFFT · bkshepherd/DaisySeedProjects · GitHub because another user got a working shy_fft effect into the project for a granular delay.

and then today I was dabbling with PSOLA Comparing main...pitchShifterPSOLA · bkshepherd/DaisySeedProjects · GitHub

I haven’t had any luck with FFT or PSOLA, it is a huge uphill battle against CPU usage I think.

I have to throw in the towel I think and “be happy” with my time based pitch shifter and get back to actually playing guitar

Wow thats a lot of interesting projects! Will definitely give these a try!

I haven’t had a chance to run it through the ringer just yet but now that it isn’t a phasey incoherent mess, I’d say its pretty good so far! I have only implemented it on my pod which I tend to rely on to verify more complex effects before putting it into my terrarium. The next step will definitely be to build the whammy I always wanted but never cared to pay for :slight_smile:

The CPU utilisation battle is an entire other chapter of my struggle I didn’t go into here. I basically built a dedicated project for testing run time of different math functions (atan2, sqrt, sine, cosine) in order to best optimise the processing time of my FFT window. See here

NOTE: After a good nights rest, I am starting to think maybe my understanding of exactly what fixed my project is a little off, especially regarding what I had to say regarding Arrangement of FFT Buffer Phase Bins. I will be doing more testing later to confirm I am not spreading misinfo on the web again :slight_smile: In the meantime, if anyone wants to correct me feel free

But hey, who needs to understand their code when it works?

1 Like

OK ! My testing has found that I am in fact wrong and that the arrangement of phase bins is as follows:
{real[0], real[1], real[2], …, real[N / 2 - 1], imag[0], imag[1], imag[2], …, imag[N / 2 - 1]}.

The true source of my pitch shifters woes came to be an assumption I made regarding a frequency bins phase advance between succesive FFT windows. I now have a theoretically sound algorithm rather than one that happens to work if If i treat the FFT buffer the way I mentioned above :slight_smile: Its funny that I never would have worked out the true problem without testing + experimenting my way into a working but inexplainable solution

Hope this is still a help to people out there looking to use ShyFFT.h