Optimizing performance

Hello! I am working on a 16-voice virtual analog drumkit for Daisy Patch (code). The most complex voices so far are the tom and the 808-style hi-hat noise source.

I think I’ve hit a performance limit – if I add a third tom to my kit, the whole thing stops working. I don’t think it’s a bug, since I can run two toms with no problems. I’m at about 10 voices overall. I can start working on optimizations (e.g. sharing the filtered noise component across all tom voices), but before I get into that, I have some questions:

  1. What’s a good way to verify my guess that I’m running out of processing power?
  2. Is there any info on how much CPU the various DaisySP objects require, and the overall capacity (e.g. how many oscillators can be run at once, or how much polyphony have people gotten with a simple synth patch?)
  3. Any settings I should mess with? I assume larger block size could help, but I’m already at 128 which I think is the max
  4. Any guidelines on optimizing performance, particularly in the process loop? At the moment, I just iterate through my voices and call Process() on each

I’ve done this kind of thing before, e.g. when building similar drumkits for the Nord G1 and G2, but I feel like I’m flying blind here without the Nord GUI’s helpful DSP percentage meter.

Yes, you can measure CPU utilization. You would probably have to measure your own code to answer questions #2 and #3; they are too vague to answer in general. Generally speaking, you have up to 10k CPU cycles per sample at a 48 kHz sample rate (480 MHz / 48 kHz), although some operations take more than one cycle.
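To make the cycle budget concrete, here is a minimal sketch of the arithmetic. The function names are hypothetical and the clock figures are the 400 MHz default and 480 MHz boost values mentioned elsewhere in this thread; it only divides clock rate by sample rate and says nothing about how many cycles a given DaisySP module actually costs.

```cpp
// Cycles available per audio sample: core clock divided by sample rate.
constexpr double CyclesPerSample(double clock_hz, double sample_rate_hz)
{
    return clock_hz / sample_rate_hz;
}

// Cycles available per audio callback for a given block size.
constexpr double CyclesPerBlock(double clock_hz,
                                double sample_rate_hz,
                                int    block_size)
{
    return CyclesPerSample(clock_hz, sample_rate_hz) * block_size;
}
```

For example, CyclesPerSample(480e6, 48000.0) gives 10,000, while the 400 MHz default gives roughly 8,333, so the "10k" figure is the boosted best case.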

Increasing block size may be less useful than you think, because most of the DaisySP stuff does not have block-based processing code. The compiler might be able to inline/optimize some things, but ultimately in some cases you would need to allocate buffers in advance for block-based implementations of certain algorithms, and that won’t happen automatically.

Another thing worth exploring is finding places in your code that could be precomputed and placed in LUTs on flash.
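As a rough illustration of the LUT idea, here is a sketch of a sine table with linear interpolation. The table size and function names are assumptions made for this example, not anything from DaisySP; on the Daisy the table would more likely be generated offline and stored as a const array so it lands in flash, rather than being built at startup as it is here.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical table size chosen for this sketch.
constexpr std::size_t kTableSize = 1024;

// One extra guard entry so interpolation never reads past the end.
using SineTable = std::array<float, kTableSize + 1>;

// Build one cycle of a sine wave. Done once, outside the audio callback.
inline SineTable MakeSineTable()
{
    SineTable t{};
    for(std::size_t i = 0; i <= kTableSize; ++i)
    {
        const double angle
            = 2.0 * 3.14159265358979323846 * static_cast<double>(i) / kTableSize;
        t[i] = static_cast<float>(std::sin(angle));
    }
    return t;
}

// Approximate sin(2*pi*phase) for phase in [0, 1) using linear interpolation
// between neighbouring table entries instead of calling std::sin per sample.
inline float SineLookup(const SineTable& table, float phase)
{
    const float       pos  = phase * kTableSize;
    const std::size_t idx  = static_cast<std::size_t>(pos);
    const float       frac = pos - static_cast<float>(idx);
    return table[idx] + frac * (table[idx + 1] - table[idx]);
}
```

The trade-off is memory for cycles: each lookup costs a multiply, two table reads and an interpolation, at the price of about 4 KB for this table size.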

Thank you for the pointer to CpuLoadMeter! That should help me understand what’s going on and map out what I might need to optimize. Maybe I can make a picture of typical CPU utilization of objects like Oscillator, Svf, etc.

There should be opportunities to precompute sounds or parts of sounds, thanks for the suggestion.

This is great! I added in a CPU usage display and added in the sounds one at a time. When it gets to about 95% the timing starts to suffer, and when I add in one more, it crashes. And the toms are definitely the problem right now – each one is about 19% of CPU.

Note: the default clock frequency for Seed is 400 MHz. You can run it at 480 MHz.

Is that what the “boost” boolean in the .Init() is for?

Yes, boost is for 480.

I’ve seen some people use a heatsink, but I’m not convinced it’s needed.

I created a small utility to measure the CPU usage of various DaisySP modules, currently here.

Results:

  • Oscillator: 2.3%
  • SVF: 1.3%
  • WhiteNoise: 0.4%
  • AdEnv: 1.3%
  • ADSR: 0.8%
  • OnePole: 0.6%
  • AnalogBassDrum: 21.9%
  • AnalogSnareDrum: 87.2%
  • SquareNoise: 1.8%
  • ZOscillator: 6.0%
  • ModalVoice: 47.8%
  • Wavefolder: 0.9%
  • ClockedNoise: 1.4%
  • FormantOscillator: 3.1%

Note that, in my informal testing, response time (to MIDI anyway) starts to suffer at around 85% of CPU, so you probably can’t make a patch with 43 oscillators, or a simple analog bass + snare drum module (using those models anyway; you can make simpler ones).
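The arithmetic behind those numbers can be sketched as below. The helper names are hypothetical, and the sketch assumes per-module costs simply add up, which the measurements above suggest but do not prove.

```cpp
#include <cmath>
#include <initializer_list>

// How many copies of one module fit under a CPU ceiling, given the
// measured cost of a single instance (both in percent).
inline int MaxInstances(double ceiling_percent, double cost_percent)
{
    return static_cast<int>(std::floor(ceiling_percent / cost_percent));
}

// Sum of per-module costs in percent, assuming costs are additive.
inline double TotalLoad(std::initializer_list<double> costs)
{
    double sum = 0.0;
    for(double c : costs)
        sum += c;
    return sum;
}
```

With the figures above, MaxInstances(100.0, 2.3) is 43 oscillators at the theoretical limit, but only 36 under an 85% ceiling, and TotalLoad({21.9, 87.2}) for AnalogBassDrum plus AnalogSnareDrum comes to about 109%, which is why that pair does not fit.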

Interesting.
Did oscillator waveform affect utilization?
Was this at 400 MHz or 480 MHz?

This was at 400 MHz, and everything is just default settings. I haven’t tried any param tweaking yet.

This is super helpful. I’m prototyping a simple sequencer on Seed using some of the preset modules. I can run the hihat, bassdrum, and oscillator plus one AdEnv together, but when I add one extra oscillator plus AdEnv, things fall apart: it sounds bad and the attached OLED display starts skipping frames. The snaredrum seems especially demanding, which is in line with your measurements. I’m developing in Arduino / DaisyDuino and have not been able to get the CPU monitor running. If anyone can point me towards a CpuLoadMeter example or some basic documentation for Arduino (not C++), I’d be grateful. Also, info on moving stuff into SDRAM for Arduino (not C++) seems hard to find.

Just a note that DaisySP is actually sample-based (meaning most of the algorithms process a single sample per call). Switching to block processing within your DSP code will give you a significant performance improvement, especially if you have multiple algorithms chained one after another.

Quick update here. I implemented a simple optimization to make sure I’m not processing sounds when they’re not making noise (seems obvious, but I wasn’t doing it and wasn’t tracking when envelopes actually finished). That got me a lot of CPU back, though of course it can run into trouble if you play too many sounds at once.
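In outline, the idea looks something like the sketch below. The Voice struct, its threshold, and MixActive() are simplified, hypothetical stand-ins (a bare decaying envelope instead of the real drum DSP), not the code from my kit.

```cpp
#include <cstddef>

// Envelope level below which a voice is treated as finished (assumed value).
constexpr float kSilenceThreshold = 1e-4f;

// Hypothetical drum voice: the envelope itself stands in for the
// expensive oscillator/filter work a real voice would do.
struct Voice
{
    float env    = 0.0f;   // current envelope level
    float decay  = 0.999f; // per-sample decay multiplier
    bool  active = false;  // true while the voice is audible

    void Trigger()
    {
        env    = 1.0f;
        active = true;
    }

    float Process()
    {
        const float out = env;
        env *= decay;
        if(env < kSilenceThreshold)
        {
            env    = 0.0f;
            active = false; // envelope finished: stop processing this voice
        }
        return out;
    }
};

// Mix one output sample, calling Process() only on active voices.
// `processed` reports how many voices were actually run.
inline float MixActive(Voice* voices, std::size_t count, std::size_t& processed)
{
    float sum = 0.0f;
    processed = 0;
    for(std::size_t i = 0; i < count; ++i)
    {
        if(!voices[i].active)
            continue;
        sum += voices[i].Process();
        ++processed;
    }
    return sum;
}
```

This lowers the average load but not the worst case: if every voice is triggered at once, all of them are processed again.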

I started on adding a function to essentially sample the percussion sounds when they’re played and just play back from the samples next time. That works pretty well and really brings down the CPU meter, but I ran into trouble trying to get it to resample when I change sound parameters. I should be able to get it working, but have to do some refactoring and put it aside for the moment.