DTCMRAM overflow trying to save large buffer

Hi friends,

new day, new problem: I am trying to save my cv-recorder buffer[1920000] to the preset, which gives me a DTCMRAM overflow. Usually the buffer lives in SDRAM, code runs from SRAM.

I assumed that I could use the full 8MB of QSPI, but now the DTCMRAM is in the way. Can this be solved?

Here is a small example:

Struct:

//Setting Struct containing parameters we want to save to flash
struct Settings {

int current_preset;
float p[5][6];
int modAssign[3][6];
float modAmt[3][6];
int modSource[3][6];
float modprm[14];
float cvaudioBuffer[1920000];
};

Load / Save:

//Persistent Storage Declaration. Using type Settings and passed the devices qspi handle
PersistentStorage<Settings>storage(hw.qspi);

bool trigger_save = false;

void Load() {
    // Ensure storage.Init() has been called before Load()
    storage.Save();
    Settings &LocalSettings = storage.GetSettings();
    current_preset = LocalSettings.current_preset;
    for (int i = 0; i < 5; i++)
    {
        for (int y = 0; y < 6; y++)
        {
            p[i][y] = LocalSettings.p[i][y];
        }
    }
    for (int i = 0; i < 3; i++)
    {
        for (int y = 0; y < 6; y++)
        {
            modAssign[i][y] = LocalSettings.modAssign[i][y];
            modAmt[i][y] = LocalSettings.modAmt[i][y];
            modSource[i][y] = LocalSettings.modSource[i][y];
        }
    }
    for (int i = 0; i < 14; i++)
    {
        modprm[i] = LocalSettings.modprm[i];
    }
    current_preset = LocalSettings.current_preset;

    for (int i = 0; i < 5; i++)
    {
        stutter_1->prm[i] = p[i][0] / 48;
        gdelay_2->prm[i] = p[i][1] / 48;
        dist->prm[i] = p[i][2] / 48;
        filter_4->prm[i] = p[i][3] / 48;
        delay_5->prm[i] = p[i][4] / 48;
        reverb_6->prm[i] = p[i][5] / 48;
    }

    for (int x = 0; x < 14; x++)
    {
        modprm[x] = modprms[x] / 48;
    }
    for (int x = 0; x < 1920000; x++)
    {
        cvrec->cvaudioBufferL[x] = LocalSettings.cvaudioBuffer[x];
    }
    for (int i = 0; i < 3; i++)
    {
        for (int x = 0; x < 6; x++)
        {
            mod[i][x] = modAmt[i][x] / 48;
        }
    }
}

void Save() {
    Settings &LocalSettings = storage.GetSettings();
    LocalSettings.current_preset = current_preset;
    for (int i = 0; i < 5; i++)
    {
        for (int y = 0; y < 6; y++)
        {
            LocalSettings.p[i][y] = p[i][y];
        }
    }
    for (int i = 0; i < 3; i++)
    {
        for (int y = 0; y < 6; y++)
        {
            LocalSettings.modAssign[i][y] = modAssign[i][y];
            LocalSettings.modAmt[i][y] = modAmt[i][y];
            LocalSettings.modSource[i][y] = modSource[i][y];
        }
    }

    for (int i = 0; i < 14; i++)
    {
        LocalSettings.modprm[i] = modprm[i];
    }

    for (int x = 0; x < 1920000; x++)
    {
        LocalSettings.cvaudioBuffer[x] = cvrec->cvaudioBufferL[x];
    }

    LocalSettings.current_preset = current_preset;
    System::Delay(100);
    trigger_save = true;
}

// Function to read the button state
bool ReadButton() {
    return dsy_gpio_read(&button_pin) != 0;       // Return true if the button is pressed
}
bool ReadClock() {
    return dsy_gpio_read(&clock_pin) != 0;       // Return true if the clock input is high
}

Thanks!

No ideas? I really don’t understand why the compiler wants to reserve DTCMRAM for my preset data… shouldn’t it be possible to use persistent storage for an audio buffer?

Short Answer

You can definitely save a buffer like that into QSPI flash but PersistentStorage wasn’t really designed with that use case in mind and probably won’t perform well at that task regardless of the memory issues you’re hitting.

Long Answer

I really don’t understand why the compiler wants to reserve DTCMRAM for my preset data

PersistentStorage internally has a member variable which is an entire copy of your SettingsStruct. So, because you’ve globally allocated your instance of PersistentStorage, it will end up in .bss which is mapped to DTCMRAM in the default linker script for BOOT_SRAM apps.

In other words, your float cvaudioBuffer[1920000] ends up contributing to the necessary static allocation size of PersistentStorage.
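
To make the size issue concrete, here is a minimal host-side sketch. `StorageSketch` is a hypothetical stand-in, not the libDaisy class; the only thing it models is the property described above, that the wrapper holds a whole copy of the settings struct as a member. `kDtcmBytes` and `FitsInDtcm` are likewise made-up names, and the 128 kB figure is the DTCMRAM size quoted in this thread.

```cpp
#include <cassert>
#include <cstddef>

// Same members as the Settings struct from the question.
struct Settings
{
    int   current_preset;
    float p[5][6];
    int   modAssign[3][6];
    float modAmt[3][6];
    int   modSource[3][6];
    float modprm[14];
    float cvaudioBuffer[1920000];
};

// Hypothetical stand-in for PersistentStorage<T>: it keeps a whole T
// as a member, so its own size is at least sizeof(T).
template <typename T>
struct StorageSketch
{
    T settings;
};

// DTCMRAM size quoted in this thread (128 kB).
constexpr std::size_t kDtcmBytes = 128 * 1024;

// Size of the audio buffer alone: 1920000 floats.
constexpr std::size_t kBufferBytes = sizeof(Settings::cvaudioBuffer);

// True if a statically allocated T could fit in DTCMRAM, ignoring
// everything else that also has to live there.
template <typename T>
constexpr bool FitsInDtcm()
{
    return sizeof(T) <= kDtcmBytes;
}

static_assert(kBufferBytes == 7680000, "assumes a 4-byte float");
static_assert(!FitsInDtcm<StorageSketch<Settings>>(),
              "the wrapper inherits the size of the buffer");
```

A global `StorageSketch<Settings>` would need roughly 7.3 MB of static storage, which is why the linker reports an overflow for a 128 kB region before the program ever runs.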

But there’s another problem: 1920000 single-precision floats is 7,680,000 bytes, or about 7.3 MB. That fits comfortably into SDRAM, but QSPI flash on the Seed is only 8MB - and remember that with bootloader apps, your program data is also stored in QSPI. So this will be a tight squeeze no matter what. If you want to store that whole buffer in contiguous memory in QSPI, you’ll need to store it after the program data - starting at the first sector address after 0x90040000 + program_size.

And lastly, in order to write all that data to flash, you’d need to first erase all the space required (on every save) - which will take quite a bit of time every time you save. libDaisy doesn’t currently implement full-block erasing in QSPIHandle, which would save a bit of time vs. the sector-based erase currently implemented. The flash can only be erased in aligned sectors or blocks, so you also have to be conscious of your data alignment, i.e. which blocks you’re erasing/writing to in order to avoid inadvertently erasing other data.
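
As a rough illustration of the alignment point, the sketch below works out which whole sectors an erase has to cover for a given write, and whether that erase would spill over into neighbouring data. `ComputeEraseRange` and `ErasesNeighbouringData` are hypothetical helpers, not part of libDaisy, and the 4 kB sector size is an assumption that should be checked against the flash datasheet.

```cpp
#include <cassert>
#include <cstdint>

// Assumed erase granularity: 4 kB sectors.
constexpr uint32_t kSectorSize = 4096;

struct EraseRange
{
    uint32_t start;   // first erased byte (sector aligned)
    uint32_t end;     // one past the last erased byte (sector aligned)
    uint32_t sectors; // number of sectors covered
};

// Work out which whole sectors must be erased before `size` bytes
// can be written at `address`.
inline EraseRange ComputeEraseRange(uint32_t address, uint32_t size)
{
    assert(size > 0);
    const uint32_t start = address - (address % kSectorSize);
    const uint32_t last  = address + size - 1; // last byte written
    const uint32_t end   = last - (last % kSectorSize) + kSectorSize;
    return {start, end, (end - start) / kSectorSize};
}

// True if the erase would also wipe bytes outside
// [address, address + size), i.e. somebody else's data.
inline bool ErasesNeighbouringData(uint32_t address, uint32_t size)
{
    const EraseRange r = ComputeEraseRange(address, size);
    return r.start != address || r.end != address + size;
}
```

A write that starts on a sector boundary and is a whole number of sectors long erases nothing extra; anything else means the partially covered sectors have to be read back and rewritten, or simply kept free of other data.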

I would suggest reading up on the QSPI flash - study the datasheet and QSPIHandle to better understand how it works and what the limitations are. Honestly, you’re probably better off using an SD Card if you want to store audio buffers this big. And either way, for that much data, a streaming read/write strategy is going to be more efficient than doing it all in one go. Ultimately PersistentStorage is just probably not the best tool for this use case.
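
For what a streaming strategy could look like, here is a minimal sketch that pushes a large buffer to some destination in small pieces instead of building one huge copy first. `StreamOut` and the writer callback are hypothetical; the callback stands in for whatever actually stores the bytes (a flash write, a file write on an SD card), and its signature is an assumption made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stream `total` bytes from `src` to `write` in pieces of at most
// `chunk` bytes. `write(offset, data, size)` is a caller-supplied
// callable that stores one piece and returns false on failure.
// Returns the number of bytes that were written successfully.
template <typename Writer>
std::size_t StreamOut(const uint8_t* src,
                      std::size_t    total,
                      std::size_t    chunk,
                      Writer&&       write)
{
    assert(chunk > 0);
    std::size_t done = 0;
    while(done < total)
    {
        // The final piece may be shorter than `chunk`.
        const std::size_t n = std::min(chunk, total - done);
        if(!write(done, src + done, n))
            break; // stop at the first failed piece
        done += n;
    }
    return done;
}
```

With this shape the only extra RAM needed is whatever the writer uses for one piece, so the source buffer can stay in SDRAM and no second multi-megabyte copy has to exist anywhere.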

Thank you!

In other words, if I have my code in SRAM, I cannot save large files in QSPI?

This seems odd to me. Is there any way around this?

In other words, if I have my code in SRAM, I cannot save large files in QSPI?

That’s not quite what I said - you can save “large” files in QSPI, you just have to take care specifically where you’re saving them, make sure they’re not too large, and (I would recommend) not use PersistentStorage to do it.

The Daisy Bootloader stores code starting at address 0x90040000 in the flash. This means that there’s 256kB available at the start of the flash address space (starting at 0x90000000) for user data, OR whatever space is left over after the program data. However, you can only erase the flash in aligned, whole sectors of 4kB.

So for example, let’s say your program size is 162kB. That means it will occupy 162kB, or 40.5 sectors of 4kB, starting at 0x90040000 in the QSPI flash. Rounding up, since you would have to start at the next sector boundary, that gives you an offset of 256kB at the start + 41 * 4kB sectors for program data = 420kB, or address 0x90069000, where you can safely* begin storing more than 256kB of contiguous user data after program data ends, and 8MB - 420kB = 7772kB is how much space you have.

* “Safely” is relative - usually you want to leave several sectors of empty space after program data to account for the program size growing to occupy one or more additional sectors in the future as firmware is developed
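
The arithmetic above can be written down as a small sketch. The constant and function names are made up for illustration; the values (flash mapped at 0x90000000, program at +256kB, 4kB sectors, 8MB total) are the ones given in this post.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kQspiBase   = 0x90000000u;        // start of the QSPI address space
constexpr uint32_t kQspiSize   = 8u * 1024u * 1024u; // 8MB of flash
constexpr uint32_t kAppOffset  = 0x40000u;           // bootloader puts code at base + 256kB
constexpr uint32_t kSectorSize = 4096u;              // smallest erasable unit

// Round a byte count up to a whole number of sectors.
constexpr uint32_t RoundUpToSector(uint32_t bytes)
{
    return (bytes + kSectorSize - 1u) / kSectorSize * kSectorSize;
}

// First sector-aligned address after the program data.
constexpr uint32_t FirstFreeAddress(uint32_t program_bytes)
{
    return kQspiBase + kAppOffset + RoundUpToSector(program_bytes);
}

// Bytes between that address and the end of the flash.
constexpr uint32_t BytesAvailable(uint32_t program_bytes)
{
    return kQspiBase + kQspiSize - FirstFreeAddress(program_bytes);
}

// The 162kB example: 41 sectors of program data, 7772kB left over.
static_assert(FirstFreeAddress(162u * 1024u) == 0x90069000u, "420kB offset");
static_assert(BytesAvailable(162u * 1024u) == 7772u * 1024u, "7772kB free");
```

This does not include the safety margin mentioned in the footnote; in practice you would add a few sectors to the program size before calling `FirstFreeAddress`.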

Now, your buffer from the example code (1920000 floats) is 7500kB or ~7.3 MB - depending on how big your program data is, it’s possible this might fit in the flash space after program data ends (subject to the limitations I mentioned), so you can do this if you want to. However, as I mentioned before, using PersistentStorage to do this will be slow and inefficient. If nothing else, you’d want to use full-block erases instead of sector erases on the chip (which are not currently implemented in libDaisy but are possible - I’ve done it in my own code) to speed things up.

I hope that’s helpful. What you’re trying to do isn’t impossible; it’s just not the ideal use case for PersistentStorage, and it requires a bit more care when dealing with the QSPI flash, especially when using the Daisy Bootloader.

Oh and also the QSPI flash is characterized for “More than 100,000 Erase/Program Cycles” but this isn’t a guarantee. The more erases you perform repeatedly on the same sectors, the greater wear you’re putting on the flash lifetime. For smaller data sizes it’s possible to implement a wear-leveling strategy and “parsimonious writes” which don’t necessarily require erases every time you want to write new data, but for an amount of data that big… you’re erasing almost the whole chip every time you need to overwrite it.

This is partly why I suggested an SD Card for this use case. Modern SD Cards are optimized for write performance and (I believe) handle wear-leveling at a hardware level, so these are things you typically don’t have to worry about with an SD Card.

many thanks for your reply.

Unfortunately, I have no idea how to go on from here. When I build the code with preset data that exceeds the 128kB of DTCMRAM, the build fails due to overflow. So this is the bottleneck, and I have no idea how to overcome it.

This is not only about saving a big audio buffer.
The other day I just wanted to expand to a few hundreds of parameter presets and I ran into the same overflow.

Do I have to edit the linker script, and if yes, I would not know how (and why :))

So atm I seem to have to accept a 128kB limit while the chip offers 8MB. It just makes no sense.

If you can bear with a bloody beginner, maybe you can show how I can save data without the DTCMRAM interfering?

Sorry if I am overlooking something obvious…

“The chip offers 8MB” true, but you have to have your preset data in actual RAM at some point… I think that’s the distinction. And yes, the default BOOT_SRAM app linker script puts .bss (explained most simply as “global variables”) into DTCMRAM so it’s fairly limited in size. The reason you’d edit the linker script is to map .bss to something else - for example splitting SRAM between code (.text) and data (.bss).

If the struct itself is sufficiently big you also may run into problems making copies of it on the stack if you’re not careful to use reference semantics everywhere, regardless of whether the static variable fits into memory during compilation/linking or not.
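
To show what reference semantics means here, a minimal sketch with a made-up `BigSettings` struct (not the one from this thread) and two hypothetical functions that differ only in how they take their argument:

```cpp
#include <cassert>
#include <cstddef>

// Made-up settings struct, about 400 kB: far too big to copy around
// on a small embedded stack.
struct BigSettings
{
    int   current_preset;
    float data[100000];
};

// Takes a reference: only an address is passed, nothing is copied,
// and the caller's object is the one that gets modified.
inline void ApplyPreset(BigSettings& s, int preset)
{
    s.current_preset = preset;
}

// Takes the struct by value: the whole object is copied for the call
// (typically onto the stack) and the change is lost on return.
inline void ApplyPresetByValue(BigSettings s, int preset)
{
    s.current_preset = preset;
}

static_assert(sizeof(BigSettings) >= 100000 * sizeof(float),
              "every by-value copy costs at least this much memory");
```

The same applies to local variables: `Settings &s = storage.GetSettings();` refers to the existing object, while `Settings s = storage.GetSettings();` would create a second full copy.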

How big is your data struct, and are you using PersistentStorage?