Congratulations on finishing this epic post! Actually, I'm not sure if it is fully finished, as it covers a lot of topics at once.
While it is certainly true that SDMMC DMA can access only specific memory regions, there is a workaround that allows an arbitrary destination, including SDRAM, while still using DMA transfers. The idea is to keep only a small buffer for a single transaction in AXI SRAM and to copy from it to the final destination buffer in the SDMMC callback.
IIRC, CubeMX generates this sort of code, and an adapted version of it worked well for adding SD card support on Owlsy. This has some overhead, as we have to perform an extra copy between different areas of memory, but it was necessary because memory gets allocated dynamically and can end up in any region.
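To make the bounce buffer idea a bit more concrete, here is a minimal sketch of how it could look. This is not the CubeMX code. The names `sd_read_any`, `sd_dma_read_fn`, `BOUNCE_BLOCKS` and `fake_dma_read` are made up for illustration, and the read is modelled as a blocking call so it can be tried on a PC. In an interrupt-driven driver the `memcpy` step would live in the SDMMC transfer-complete callback instead.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SD_BLOCK_SIZE 512u
#define BOUNCE_BLOCKS 8u /* hypothetical: blocks per single DMA transaction */

/* On target this buffer would have to be placed in AXI SRAM (for example
 * with a section attribute matching your linker script) so that SDMMC DMA
 * can reach it. Here it is just a static array. */
static uint8_t bounce[SD_BLOCK_SIZE * BOUNCE_BLOCKS];

/* Hypothetical driver hook: reads `count` blocks starting at `first_block`
 * into `dma_buf` via DMA and returns 0 once the transfer has completed. */
typedef int (*sd_dma_read_fn)(uint8_t *dma_buf, uint32_t first_block,
                              uint32_t count);

/* Read `count` blocks into `dest`, which may live in any memory region. */
int sd_read_any(sd_dma_read_fn dma_read, uint8_t *dest,
                uint32_t first_block, uint32_t count)
{
    assert(dma_read != NULL && dest != NULL);
    while (count > 0) {
        uint32_t n = count < BOUNCE_BLOCKS ? count : BOUNCE_BLOCKS;
        int err = dma_read(bounce, first_block, n); /* DMA into bounce buffer */
        if (err != 0)
            return err;
        /* With D-cache enabled, the bounce buffer range would need to be
         * invalidated here before the CPU reads it. */
        memcpy(dest, bounce, (size_t)n * SD_BLOCK_SIZE); /* the extra copy */
        dest += (size_t)n * SD_BLOCK_SIZE;
        first_block += n;
        count -= n;
    }
    return 0;
}

/* PC-side stand-in for the real driver: fills every block with its own
 * block number so the result is easy to check. */
int fake_dma_read(uint8_t *dma_buf, uint32_t first_block, uint32_t count)
{
    for (uint32_t i = 0; i < count; i++)
        memset(dma_buf + (size_t)i * SD_BLOCK_SIZE, (int)(uint8_t)(first_block + i),
               SD_BLOCK_SIZE);
    return 0;
}
```

The size of the bounce buffer is the trade-off: a larger one means fewer DMA transactions per read, but it takes more AXI SRAM away from everything else.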
Note that you have to modify the startup file in order to get an extra .data section in SDRAM. For this to work, SDRAM must be initialized before your firmware runs. This means that it can only work if you run a bootloader first; otherwise SDRAM will only be initialized by the firmware at a later time, after the startup code has already tried to copy the section.
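For reference, the startup change boils down to one more copy loop of the same kind that already initializes the regular .data section. Below is a rough sketch in C. Real startup files are often written in assembly, and the function name `copy_init_section` and the linker symbols mentioned in the comment are assumptions that would have to match your own linker script.

```c
#include <assert.h>
#include <stdint.h>

/* Copy initial values from their load address (flash) to their run
 * address, word by word, like the stock .data init loop does. */
void copy_init_section(const uint32_t *load, uint32_t *start,
                       const uint32_t *end)
{
    assert(start <= end);
    while (start < end)
        *start++ = *load++;
}

/* On target this would be called from the startup code, before main(),
 * with symbols exported by the linker script (hypothetical names):
 *
 *   extern uint32_t _sisdram_data, _ssdram_data, _esdram_data;
 *   copy_init_section(&_sisdram_data, &_ssdram_data, &_esdram_data);
 *
 * The SDRAM controller must already be configured at that point,
 * otherwise the writes go nowhere useful. */
```

This is exactly why the bootloader matters here: the copy runs before any of the firmware's own peripheral init code gets a chance to bring SDRAM up.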
Your core processing code will likely run as fast as possible from any area thanks to the MCU cache! Only the parts of it that are accessed less frequently will benefit from running from faster memories. And even in those cases the MCU would prefetch your code in advance, reducing the wait time to some extent. My guess would be that the best use for *TCMRAM is storing the most frequently called DMA callbacks.