Clicking at end of sample playback

The following snippet of code alternates between recording into and playing back from a 1-second buffer. It works OK, but at the transition between the two I hear a small click. I don’t know much about audio engineering, but I’m guessing it’s something related to the discontinuity between the live signal and the sample playback (?). Is there a standard, simple way to filter for this? Guessing I’m missing something fundamental…

#include "daisy_patch.h"
#include "daisysp.h"

using namespace daisy;
using namespace daisysp;

DaisyPatch patch;

const size_t BUFFER_SIZE = 48000;
float DSY_SDRAM_BSS buffer[BUFFER_SIZE];
bool record = false;
size_t idx = 0;

void AudioCallback(AudioHandle::InputBuffer in,
                   AudioHandle::OutputBuffer out,
                   size_t size) {
  for (size_t i = 0; i < size; i++) {
    if (record) {
      buffer[idx] = in[0][i];
    }
    out[0][i] = buffer[idx];
    idx++;
    if (idx==BUFFER_SIZE) {
      record = !record;
      idx = 0;
    }
  }
}

int main(void) {
  patch.Init();
  patch.StartAdc();
  patch.StartAudio(AudioCallback);
  while(true) { }
}

You need to fade samples in and out at transitions, as you’ve guessed. If you’re using the default block size of 48, you can just fade linearly for one block’s duration.
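As a rough illustration of that linear fade, here is a minimal sketch. `EdgeFadeGain` is a hypothetical helper, not part of libDaisy or DaisySP, and the 48-sample fade length is just the one-block figure mentioned above. It returns a gain that ramps from 0 up to 1 at the start of the buffer and back down to 0 at the end, so the output passes through silence at each record/playback boundary instead of jumping.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper (not a DaisySP function): linear fade gain for the
// sample at position idx in a buffer of `length` samples. The gain is 0 at
// the first and last sample and reaches 1 once the sample is at least
// fade_len samples away from both edges. Assumes idx < length.
float EdgeFadeGain(std::size_t idx, std::size_t length, std::size_t fade_len) {
  if (fade_len == 0 || length == 0) {
    return 1.0f;  // fading disabled
  }
  const std::size_t from_start = idx;
  const std::size_t from_end = length - 1 - idx;
  const std::size_t dist = std::min(from_start, from_end);
  if (dist >= fade_len) {
    return 1.0f;  // middle of the buffer, full level
  }
  return static_cast<float>(dist) / static_cast<float>(fade_len);
}
```

In the callback from the question this could be applied as `out[0][i] = buffer[idx] * EdgeFadeGain(idx, BUFFER_SIZE, 48);`. Note that this produces a very short dip to silence at each transition rather than a gapless crossfade.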

Note that if you want to crossfade audio transitions without gaps (fade out, read from a different place in the sample or from the next sample, fade in, sum the results), linear transitions give you a gain dip of about 30% at the intersection point, where the combined level of two uncorrelated signals falls to roughly 0.71. To solve this, you should use an equal-power curve for the transitions.
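To make the size of that dip concrete, here is a small sketch comparing the two curves. The names (`XfadeGains`, `LinearGains`, `EqualPowerGains`, `CombinedRmsGain`) are made up for illustration, and the sine/cosine pair is one common equal-power curve, not necessarily the exact one DaisySP uses. The combined level assumes the two signals are uncorrelated and of equal level, so their powers add.

```cpp
#include <cmath>

// Gains applied to the outgoing (a) and incoming (b) signal.
struct XfadeGains {
  float a;
  float b;
};

// Linear crossfade: gains sum to 1 at every position t in [0, 1].
XfadeGains LinearGains(float t) {
  return {1.0f - t, t};
}

// Equal-power crossfade (sine/cosine variant): the squared gains sum to 1.
XfadeGains EqualPowerGains(float t) {
  const float kHalfPi = 1.57079632679f;
  return {std::cos(t * kHalfPi), std::sin(t * kHalfPi)};
}

// Combined level for two uncorrelated, equal-level signals: powers add,
// so the resulting RMS gain is sqrt(a^2 + b^2).
float CombinedRmsGain(XfadeGains g) {
  return std::sqrt(g.a * g.a + g.b * g.b);
}
```

At the midpoint (`t = 0.5`) the linear curve gives a combined level of about 0.707, which is the roughly 30% drop, while the equal-power curve stays at 1.0.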

DaisySP has a crossfader unit that includes an equal-power (EP) crossfade curve.

awesome, thanks @antisvin . i had just stumbled onto the CrossFade utilities but wasn’t sure how to use it; i’ll give that a go!

the CROSSFADE_CPOW works perfectly, thanks for the pointer!

do you know offhand if there is a generalisation of this for N signals? I really notice the volume drop you mention when i’m just mixing, say, 30 signals using just the mean value. (as part of a hand rolled granular sampler)

I’m not sure what would be a good way to generalize that to multichannel mixing, but it’s also not necessarily what you need for a granular processor. My understanding is that a linear crossfade is preferred for correlated audio and equal power for non-correlated audio. When you crossfade 2 random samples, you typically get the latter, but with granular synthesis the former is more common.

I assume that you have variable shape/length for grains. In that case, the volume drop you might get is due to oversimplified normalization. If your grain envelopes are not simple squares, that volume calculation doesn’t account for the level drop during the envelope transients. Just normalizing by 1 / number of grains assumes that you always have grains with magnitude = 1.0.

So maybe try something like this:

  1. calculate sum of envelopes for all grains as you generate them
  2. find maximum value of that per block of audio, normalization level should be 1 / this value and not greater than 1.0
  3. linearly crossfade normalization from previous block’s level to current for every audio block EDIT: using lowpass filtering for norm sounds like a better idea
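The steps above could be sketched roughly as follows. This is only one possible reading of the suggestion, using the lowpass variant from the EDIT; `BlockNormTarget`, `NormSmoother`, and the default coefficient are hypothetical names and values chosen for illustration. The caller is assumed to fill `env_sum` with the per-sample sum of all active grain envelopes for the current block.

```cpp
#include <algorithm>
#include <cstddef>

// Steps 1-2: given the per-sample sum of grain envelopes for one block,
// return the normalization level 1 / peak, never greater than 1.0.
float BlockNormTarget(const float* env_sum, std::size_t size) {
  float peak = 0.0f;
  for (std::size_t i = 0; i < size; i++) {
    peak = std::max(peak, env_sum[i]);
  }
  return peak > 1.0f ? 1.0f / peak : 1.0f;
}

// Step 3 (lowpass variant): one-pole smoother that moves the applied
// normalization gradually towards the target to avoid gain jumps.
struct NormSmoother {
  float coeff = 0.01f;  // assumed smoothing amount per call; tune by ear
  float value = 1.0f;   // current normalization level

  float Process(float target) {
    value += coeff * (target - value);
    return value;
  }
};
```

One way to use it: compute the target once per block, then call `Process(target)` per sample and multiply the summed grain output by the returned value.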

Not sure if this would work, just something I would try myself after seeing that the naive approach is not good enough.

If your grains are not correlated, you’d probably do better to take the square root of the sum of the squares of the envelopes (as step 1 of antisvin’s method).
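A minimal sketch of that variant, assuming `env` holds the current envelope value of each active grain (the function name is made up for illustration):

```cpp
#include <cmath>
#include <cstddef>

// Combined level of n uncorrelated grains: the square root of the sum of
// the squared envelope values, used in place of the plain sum in step 1.
float UncorrelatedEnvelopeLevel(const float* env, std::size_t n) {
  float sum_sq = 0.0f;
  for (std::size_t i = 0; i < n; i++) {
    sum_sq += env[i] * env[i];
  }
  return std::sqrt(sum_sq);
}
```

For four grains that are all at full level this gives 2.0 rather than 4.0, so the resulting normalization is less aggressive than with the plain sum.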

Coming back to this after a bit of hacking around. The best result I’ve had to date is by normalising with 1/sqrt(num_grains), based on ideas put down by a friend in this notebook: “Constant Power Mixing of Multiple Signals” by Mark Reid (Observable).
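For reference, a 1/sqrt(num_grains) mix might look like the sketch below. This is not the code from the notebook; `MixConstantPower` is a hypothetical name, and `samples` is assumed to hold the current output sample of each grain.

```cpp
#include <cmath>
#include <cstddef>

// Mix n grain samples with 1/sqrt(n) normalization instead of the mean
// (1/n). For uncorrelated grains this keeps the expected output power
// roughly constant as n changes.
float MixConstantPower(const float* samples, std::size_t n) {
  if (n == 0) {
    return 0.0f;
  }
  float sum = 0.0f;
  for (std::size_t i = 0; i < n; i++) {
    sum += samples[i];
  }
  return sum / std::sqrt(static_cast<float>(n));
}
```

One caveat: if the grains happen to be strongly correlated, this can exceed full scale (four identical samples of 1.0 mix to 2.0), so some headroom or a limiter after the mix may be worth considering.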