Maximum SPI transmission size

You should be able to double the amount of data moved per DMA transaction by switching to 16-bit transfers. I wrote some details about this before. Note that you don’t have to program the registers directly; there should be equivalent functions in the STM32 HAL libraries.
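To put rough numbers on that, here is a small sketch of the arithmetic. It assumes the DMA item counter is 16 bits wide (at most 65535 items per transaction), which is what I would expect on these STM32 parts but is worth checking against the reference manual. The function names and the 240000-byte frame size in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the DMA item counter is 16 bits wide, so a single
 * transaction can move at most 65535 items. */
#define DMA_MAX_ITEMS 65535u

/* Bytes one DMA transaction can move for a given item width
 * (1 byte for 8-bit transfers, 2 bytes for 16-bit transfers). */
static uint32_t max_bytes_per_transaction(uint32_t item_bytes)
{
    assert(item_bytes == 1u || item_bytes == 2u);
    return DMA_MAX_ITEMS * item_bytes;
}

/* Number of DMA transactions needed to send total_bytes,
 * rounding up so a final partial chunk is counted. */
static uint32_t transactions_needed(uint32_t total_bytes, uint32_t item_bytes)
{
    uint32_t chunk = max_bytes_per_transaction(item_bytes);
    return (total_bytes + chunk - 1u) / chunk;
}
```

For a hypothetical 240000-byte frame, `transactions_needed(240000, 1)` gives 4 and `transactions_needed(240000, 2)` gives 2. With 16-bit items the chunk length in bytes has to be even.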

However, using 2 transfers instead of 4 will not actually make much difference. The likely reason for your low throughput is that the rendering and transfer process has pauses in it. To get anywhere close to maximum speed, you would have to start sending the next buffer immediately after the previous transfer finishes. This means rendering it in advance (using 2 buffers instead of just one) and maybe giving the SPI DMA a higher priority than the SAI DMA, so that a high audio DSP load won’t slow down display operations.
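The two-buffer idea could look roughly like the sketch below. It is hardware-free on purpose: `pp_start` only records which buffer is in flight, and on a real board that is where you would kick off the SPI DMA transfer. All names here (`PingPong`, `pp_render`, `pp_on_transfer_complete`, `CHUNK_BYTES`) are hypothetical, and the tiny buffer size is just for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_BYTES 8u  /* tiny on purpose; a real chunk would be much larger */

typedef struct {
    uint8_t  buf[2][CHUNK_BYTES];
    bool     ready[2];  /* rendered and waiting to be sent */
    int      sending;   /* index of the buffer in flight, -1 when idle */
    unsigned started;   /* how many transfers have been started */
} PingPong;

static void pp_init(PingPong *pp)
{
    memset(pp, 0, sizeof *pp);
    pp->sending = -1;
}

/* Stand-in for starting the SPI DMA transfer of buffer idx.
 * On real hardware the transfer would be started here. */
static void pp_start(PingPong *pp, int idx)
{
    pp->ready[idx] = false;
    pp->sending = idx;
    pp->started++;
}

/* Render into whichever buffer is neither in flight nor already queued.
 * Returns false if both buffers are busy, so the caller has to wait. */
static bool pp_render(PingPong *pp, uint8_t fill)
{
    for (int i = 0; i < 2; i++) {
        if (i != pp->sending && !pp->ready[i]) {
            memset(pp->buf[i], fill, CHUNK_BYTES);  /* "rendering" */
            pp->ready[i] = true;
            if (pp->sending < 0) {
                pp_start(pp, i);  /* bus is idle, send right away */
            }
            return true;
        }
    }
    return false;
}

/* Meant to be called from the transfer-complete interrupt: if the other
 * buffer is already rendered, it goes out with no gap in between. */
static void pp_on_transfer_complete(PingPong *pp)
{
    if (pp->sending < 0) {
        return;
    }
    int next = 1 - pp->sending;
    pp->sending = -1;
    if (pp->ready[next]) {
        pp_start(pp, next);
    }
}
```

The point is that the next transfer is started from the completion callback rather than from the main loop, so the bus only goes idle when rendering falls behind. In real firmware the state shared with the interrupt would need `volatile` or similar care, which this sketch leaves out.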