My code runs on a Daisy Seed, built with the Arduino IDE.
I’m using uint8_t throughout the code. This is intentional: it’s supposed to work with MIDI, so all the data is in the 0…127 range. Reading through the forums, I’ve often seen it said that uint32_t would be faster because the MCU is 32-bit. So I decided to compare the two versions. Here is the project I’ve created for the comparison.
I’m counting cycles using CYCCNT. My measurements show that the uint8_t code needs ~20% fewer cycles to finish than the same code operating on uint32_t. So far I have two theories about it:
- It’s because I’m not doing any math there: 99% of the work is iterating through arrays and assigning variables.
- I’m just measuring it wrong.
I’d really appreciate it if someone could help me understand what’s going on there.
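For the second theory, one way to sanity-check the measurement is to separate the counter arithmetic from the code under test. The following is only a minimal sketch, not the code from the linked project: `elapsed_cycles` and `measure_cycles` are hypothetical helper names, and the counter is passed in as a callable so the logic can also be checked off-target. On the Daisy Seed the callable would presumably read `DWT->CYCCNT`, assuming the CMSIS headers are available and the cycle counter has already been enabled.

```cpp
#include <cstdint>

// Cycles between two readings of a free-running 32-bit counter.
// Unsigned subtraction stays correct even if the counter wrapped once.
inline uint32_t elapsed_cycles(uint32_t start, uint32_t end) {
    return end - start;
}

// Hypothetical helper: runs `work` once and returns the cycles it took,
// as seen by `read_counter`. On target, `read_counter` could be
//   [] { return DWT->CYCCNT; }
// (assumption: CMSIS headers present, cycle counter already enabled).
template <typename ReadCounter, typename Work>
uint32_t measure_cycles(ReadCounter read_counter, Work work) {
    const uint32_t start = read_counter();
    work();
    const uint32_t end = read_counter();
    return elapsed_cycles(start, end);
}
```

Two things commonly skew this kind of comparison: the optimizer removing work whose result is never used (writing the result to a `volatile` variable prevents that), and interrupts firing inside the measured region.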
You’d have to look at the disassembled code to understand why it performs differently. There are several reasons why the 8-bit version can perform better:
- better cache utilization (your data takes less memory, so more of it ends up in the cache)
- more data fits into registers instead of being spilled to the stack (e.g. a single Note in the 8-bit version can fit in a single MCU register)
- less data to copy (if you count bits rather than the number of variables)
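The size argument can be made concrete with a small sketch. `Note` here is a hypothetical four-field struct (the layout in the real project may differ), templated on the field type so both versions can be compared side by side.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical note record; field names are illustrative only.
template <typename T>
struct Note {
    T pitch;
    T velocity;
    T channel;
    T length;
};

using Note8  = Note<uint8_t>;   // 4 bytes: fits in one 32-bit register
using Note32 = Note<uint32_t>;  // 16 bytes: four registers or a memory copy

// Bytes needed to hold `count` notes of type N.
template <typename N>
constexpr std::size_t buffer_bytes(std::size_t count) {
    return count * sizeof(N);
}

static_assert(sizeof(Note8) == 4, "8-bit note packs into one word");
static_assert(sizeof(Note32) == 16, "32-bit note takes four words");
```

For 256 notes that is 1 KiB versus 4 KiB of data to move through the cache for the same loop.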
You also have the option of getting up to 4x the throughput with 8-bit data compared to 32-bit by using SIMD instructions, although in practice it tends to be hard to find good uses for low-precision integer ops in DSP code.
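To illustrate the idea of handling four 8-bit values at once, here is a portable sketch that adds four packed bytes inside one 32-bit word. It is only an emulation in plain C++, and the names `packed_add_u8` and `pack_u8x4` are made up for this example. On a Cortex-M core with the DSP extension, CMSIS exposes intrinsics such as `__UADD8` that perform this kind of lane-wise add in a single instruction.

```cpp
#include <cstdint>

// Adds four unsigned 8-bit lanes packed into 32-bit words.
// Each lane wraps around on overflow; carries never leak into the next lane.
inline uint32_t packed_add_u8(uint32_t a, uint32_t b) {
    const uint32_t low7 = 0x7F7F7F7Fu;  // lower 7 bits of every lane
    const uint32_t top  = 0x80808080u;  // top bit of every lane
    // Add the low 7 bits (cannot carry across lanes), then fix up the top bit.
    const uint32_t sum = (a & low7) + (b & low7);
    return sum ^ ((a ^ b) & top);
}

// Packs four bytes into one word, b0 in the lowest lane.
inline uint32_t pack_u8x4(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3) {
    return static_cast<uint32_t>(b0) | (static_cast<uint32_t>(b1) << 8) |
           (static_cast<uint32_t>(b2) << 16) | (static_cast<uint32_t>(b3) << 24);
}
```

One call processes four values, which is where the 4x figure comes from. Whether a given loop can actually be arranged this way depends on the data layout.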
Thank you for the insights! Yeah, reading disassembled code is still hard for me, but it looks like it’s time to dig deeper.
Is there a convenient way to disassemble the result of Arduino builds?
The ARM toolchain plus VSCode for viewing looks fine to me, but maybe there’s something more convenient.
The original post was from an Arduino user. Yes, it’s easy to look at the assembler output with the normal C++ tools.
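One practical way to compare the two variants in a disassembly is to isolate each one in a small function that the compiler is not allowed to inline, so each shows up under its own symbol. The sketch below assumes GCC (which the ARM toolchain uses), and `copy_notes` is a hypothetical stand-in for the real loop. The ELF file produced by the build can then be inspected with `arm-none-eabi-objdump -d -C` followed by the path to that file. The Arduino IDE prints the build directory when verbose compilation output is enabled in the preferences.

```cpp
#include <cstddef>
#include <cstdint>

// noinline keeps each instantiation as a separate, findable symbol
// in the objdump listing (GCC/Clang attribute).
template <typename T>
__attribute__((noinline))
void copy_notes(const T* src, T* dst, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = src[i];
    }
}

// Explicit instantiations so both versions are emitted even if unused.
template void copy_notes<uint8_t>(const uint8_t*, uint8_t*, std::size_t);
template void copy_notes<uint32_t>(const uint32_t*, uint32_t*, std::size_t);
```

Searching the listing for `copy_notes` should then show the two loops next to each other, which makes differences in load/store instructions easier to spot.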