Storing of QSPI buffer causing data corruption and hard faults

nettech15 · September 16, 2023, 12:56pm

Hello all!

I have been having issues with successfully loading and storing data to QSPI on Daisy Seed v1.1 rev boards.

Loading and saving of small amounts of data (< 1 kB mixture of floats and ints) seems to work fine, but on occasion when I read the data back in, some of the data shows different values than the original saved values.

I am now trying to save larger amounts of data (192 kB of floats per sample). The result is a combination of corrupted data (either noise or a mixture of noise and the sample sound) and hard faults (bus errors).

Here is the snippet of code that seems to be the culprit:

// QSPI - load and save
// 4kB blocks
#define FLASH_BLOCK 4096

uint8_t DSY_QSPI_BSS qspi_buffer[FLASH_BLOCK * 2048]; // allocate all for sample storage

void FlashLoad(uint8_t aSlot)
{
hardware.qspi.DeInit();
QSPIHandle::Config qspi_config;
qspi_config.device = QSPIHandle::Config::Device::IS25LP064A;
qspi_config.mode = QSPIHandle::Config::Mode::MEMORY_MAPPED;
qspi_config.pin_config.io0 = {DSY_GPIOF, 8};
qspi_config.pin_config.io1 = {DSY_GPIOF, 9};
qspi_config.pin_config.io2 = {DSY_GPIOF, 7};
qspi_config.pin_config.io3 = {DSY_GPIOF, 6};
qspi_config.pin_config.clk = {DSY_GPIOF, 10};
qspi_config.pin_config.ncs = {DSY_GPIOG, 6};
hardware.qspi.Init(qspi_config);

size_t size = (48000 * bufferTime);

//memcpy(*dest, *src, sizet);
memcpy(&sBufferL, &qspi_buffer[(48000 * bufferTime * aSlot)], size);

}

void FlashSave(uint8_t aSlot)
{
hardware.qspi.DeInit();
QSPIHandle::Config qspi_config;
qspi_config.device = QSPIHandle::Config::Device::IS25LP064A;
qspi_config.mode = QSPIHandle::Config::Mode::INDIRECT_POLLING;
qspi_config.pin_config.io0 = {DSY_GPIOF, 8};
qspi_config.pin_config.io1 = {DSY_GPIOF, 9};
qspi_config.pin_config.io2 = {DSY_GPIOF, 7};
qspi_config.pin_config.io3 = {DSY_GPIOF, 6};
qspi_config.pin_config.clk = {DSY_GPIOF, 10};
qspi_config.pin_config.ncs = {DSY_GPIOG, 6};
hardware.qspi.Init(qspi_config);

size_t start_address = (48000 * bufferTime * aSlot);

size_t slot_address = start_address;

size_t size = (48000 * bufferTime);

hardware.qspi.Erase(slot_address, slot_address + size);
hardware.qspi.Write(slot_address, size, (uint8_t*)&sBufferL);

}

The entire code project is here.

Is there something I am overlooking as far as proper coding in the QSPI load/save subroutines?

Does anybody have any insights as to why the code is failing?

Thanks!

antisvin · September 16, 2023, 2:54pm

There’s a possibility that your code is totally fine, but you’ll still run into a bus fault or something similar. You might be running into several unrelated issues here, as it seems like you describe more than one symptom. You’d have to read parts of the STM32H7 MCU reference manual for an explanation of the specific fault type and for some advanced topics (cache, MPU, memory barriers). It’s a fairly complex microcontroller with things like instruction/data caching, prefetched code execution with branch prediction and more fun stuff, so some errors occur simply because you’re doing something that goes against the normal patterns it’s designed to handle gracefully.

If your QSPI flash memory is configured to be cacheable by the MPU settings, you would have to invalidate the cache after writing, since writes happen when it’s not in memory-mapped mode, so cached values from previous reads may persist. The cache is not aware that the underlying contents changed while you were sending flash commands to reprogram the chip. An alternative is to configure QSPI access to be non-cacheable, but you’ll lose some read performance (which might or might not be acceptable depending on how you use the flash).
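As a rough sketch of the invalidation step: the Cortex-M7 data cache works on 32-byte lines, so the range handed to a cache maintenance call is normally widened to whole lines first. The helper below only does that arithmetic and is hypothetical (the names `CacheAlignedRange` and `CacheRange` are not from libDaisy). The CMSIS call shown in the trailing comment, `SCB_InvalidateDCache_by_Addr`, exists in `core_cm7.h`, but its exact pointer type differs between CMSIS versions, and whether libDaisy maps QSPI as cacheable is an assumption you would need to verify.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Cortex-M7 data cache lines are 32 bytes long.
constexpr std::uintptr_t kCacheLine = 32;

struct CacheRange
{
    std::uintptr_t start; // address rounded down to a cache line
    std::size_t    size;  // length, a multiple of the cache line size
};

// Widen [addr, addr + size) so that it covers whole cache lines.
inline CacheRange CacheAlignedRange(std::uintptr_t addr, std::size_t size)
{
    assert(size > 0);
    const std::uintptr_t start = addr & ~(kCacheLine - 1);
    const std::uintptr_t end
        = (addr + size + kCacheLine - 1) & ~(kCacheLine - 1);
    return {start, static_cast<std::size_t>(end - start)};
}

// On the target (assumption: QSPI region is cacheable), after Write() and
// after switching back to memory-mapped mode, something like:
//   CacheRange r = CacheAlignedRange((std::uintptr_t)&qspi_buffer[offset], size);
//   SCB_InvalidateDCache_by_Addr((uint32_t*)r.start, (int32_t)r.size);
```

For example, a 40-byte region starting at 0x90000010 (an address in the STM32H7 memory-mapped QSPI window) widens to 64 bytes starting at 0x90000000.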

Troubleshooting bus faults is more complicated, but I would guess that code prefetch might be the culprit here. In order to run your code at maximum speed, the MCU can start executing in advance, while the hardware is still dealing with QSPI and you can’t fetch data from it. Or some of your code can be reordered if that theoretically gives the same results from computations (and so become unsafe for things like switching QSPI peripheral modes). This can be resolved by adding ISB/DSB instructions via the “__ISB”/“__DSB” macros (defined in CMSIS, IIRC) to ensure that memory barriers are added in the required places. This would force the MCU to finish previous code/data accesses before continuing after the barrier.

However, I’ve only needed to use barriers once, and that was for dynamic code loading, so I’m not sure whether QSPI mode switching would require it or not.

nettech15 · September 16, 2023, 8:17pm

Thank you, @antisvin for replying. I agree that this may be why the data corruption is happening. It makes sense now, here’s why.

If I store one sample in slot 1 and another sample in slot 2, and then load the sample from slot 1 and play it, I hear what sounds like an addition of both samples playing back simultaneously, plus all sorts of noise as well.

If I use more slots to store more samples, and then I try to load any sample for playback afterwards, then I get complete noise, or else the debugger halts with a bus error.

I am relying on libDaisy libraries and default linker scripts, so I have not considered that libDaisy may be over-optimizing code, or enabling MPU cache, or as you say, caching QSPI writes.

I will have to go through the entire libDaisy code to see if I can spot how the library handles caches for the MPU, QSPI, etc.

Also, I will experiment around with some of the linker script settings to see if I can create code with less optimization than the current default for libDaisy, which I believe is -O3.

Thanks again for responding so quickly!

nettech15 · September 17, 2023, 3:56pm

I found that the solution to the bus faults was to make sure I call qspi.Erase with a size value divisible by the minimum block erase size (4096 bytes) and then subtract one from the total size value.

Because the erase function pads out the remainder of a block with FF’s, when I load the sample, the code ends up seeing “FFFFFFFF” as a float value, and that is NaN (Not a number). That seems to be what triggers the bus fault (even though it is truly not a bus fault in reality, it is a NaN error.)
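The NaN observation is easy to check on a desktop machine. The sketch below reinterprets the erased-flash pattern 0xFFFFFFFF as an IEEE 754 single-precision float and scans a buffer for NaN values; the helper names `FloatFromBits` and `ContainsNaN` are made up for illustration and are not part of libDaisy. It only shows that erased flash reads back as NaN; it does not show what the Cortex-M7 does when it processes one.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as an IEEE 754 single-precision float.
inline float FloatFromBits(std::uint32_t bits)
{
    float f;
    static_assert(sizeof f == sizeof bits, "float must be 32 bits");
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Return true if any of the first `count` samples is NaN, e.g. because
// the region was erased (all 0xFF) but never programmed.
inline bool ContainsNaN(const float* data, std::size_t count)
{
    assert(data != nullptr);
    for(std::size_t i = 0; i < count; i++)
    {
        if(std::isnan(data[i]))
            return true;
    }
    return false;
}
```

Running a check like `ContainsNaN` on the sample buffer right after loading would reveal whether unprogrammed flash has ended up in the audio data.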

So, everything is working now. I can save and load 8 different samples from QSPI with no issues.

Thanks again for the advice!

Here is the working code:

void FlashLoad(uint32_t aSlot)
{
gPlay = PLAY_OFF;

size_t size = (48000 * bufferTime * 4);

//memcpy(*dest, *src, sizet);
memcpy(&sBufferL, &qspi_buffer[aSlot], size);

gPlay = PLAY_ON;

}

void FlashSave(uint32_t aSlot)
{
gPlay = PLAY_OFF;

size_t start_address = aSlot;

size_t size = (48000 * bufferTime * 4);

hardware.qspi.Erase(start_address, start_address + 0xFFFFF);
hardware.qspi.Write(start_address, size, (uint8_t*)&sBufferL);

gPlay = PLAY_ON;

}

antisvin · September 18, 2023, 9:55am

Actually, it’s not a feature of the erase function, but rather how flash erase works: an erased flash cell reads back as all ones.

There’s a separate register for floating point exceptions and it should be much easier to notice when this sort of thing happens. I’m not sure that it’s supposed to lead to a bus fault.

As for data alignment, I’ve done it like this in the past: size = (size + (alignment - 1)) & -alignment; It looks like you’d have to align bufferTime to 32 to get the data size aligned to 4k when you multiply it by 48000.
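A small sketch of that rounding, using the 4096-byte erase block size from the original post. `AlignUp` is a hypothetical helper name; it uses `~(alignment - 1)`, which is equivalent to `-alignment` for unsigned power-of-two values.

```cpp
#include <cassert>
#include <cstddef>

// Minimum erase block size used in the thread (4 kB).
constexpr std::size_t kEraseBlock = 4096;

// Round `size` up to the next multiple of `alignment`.
// `alignment` must be a power of two.
inline std::size_t AlignUp(std::size_t size, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + (alignment - 1)) & ~(alignment - 1);
}
```

With a 2-second buffer of floats, 48000 * 2 * 4 = 384000 bytes, which rounds up to 385024 bytes (94 blocks). The factor of 32 comes from 48000 = 2^7 * 375: another 2^5 is needed to reach 2^12 = 4096.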

nettech15 · September 18, 2023, 3:45pm

Yes, this is excellent advice to follow. My code probably only works now because I erase an entire 1 MB of flash per sample storage slot and then write only 384 KB per sample slot.
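That slot layout can be written down explicitly. This is only a sketch under assumptions: it treats the IS25LP064A as an 8 MB (64 Mbit) part used entirely for samples, and the names `kSlotSize`, `SlotOffset` and `FitsInSlot` are hypothetical. The offsets are relative to the start of the flash; how they map onto the addresses that the Erase and Write calls expect should be checked against the libDaisy version in use.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kFlashSize = 8 * 1024 * 1024;       // 64 Mbit part
constexpr std::size_t kSlotSize  = 0x100000;              // 1 MB per slot
constexpr std::size_t kSlotCount = kFlashSize / kSlotSize;

// Byte offset of a slot from the start of the flash.
inline std::size_t SlotOffset(std::size_t slot)
{
    assert(slot < kSlotCount);
    return slot * kSlotSize;
}

// True if a payload of `bytes` fits inside one erased slot.
inline bool FitsInSlot(std::size_t bytes)
{
    return bytes <= kSlotSize;
}
```

Because 1 MB is a multiple of the 4096-byte erase block, every slot starts on a block boundary, and a 384000-byte sample leaves room in the same slot for extra per-sample parameters.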

Soon I need to add additional storage space per sample slot to include Sample Start/End address, Loop Start/End address, Loop Off/On, as well as all the parameters for the Synth Engine (VCA & VCF controls and settings, Midi, Channel, etc) so I will need to keep in mind the rules of data alignment in the future.

BTW, the crazy superimposed-sample errors (and the noise as well) were solved by simply clearing the sample buffers, writing 0.0f to all locations, before loading sample data from QSPI. Again, this just forces the cache to flush the previously stored sample before another sample is loaded.

Here is the link to the current code, still far from final release, but it is getting there little by little.