Your math is correct. The wording is probably vague because it's meant to answer an equally vague question like "how much audio can it store?", which is often asked without saying which format the audio is stored in.
You can fit more than 10 minutes of high-quality audio at 48 kHz if it's stored as 16-bit integers. In fact, that's the reasonable thing to do if the audio is read from a WAV file in that format: storing it as 32-bit float would double the buffer size, and it might even end up slower, since you'd also be reading twice as much data from SDRAM.
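To make the arithmetic concrete, here is a minimal sketch of the calculation. The 64 MiB SDRAM size, the mono channel count and the names `BufferSeconds` and `kSdramBytes` are assumptions for illustration only; plug in the numbers for your own board.

```cpp
#include <cstddef>
#include <cstdint>

// Seconds of audio that fit into `bytes` of memory for a given sample format.
constexpr double BufferSeconds(std::size_t bytes,
                               std::size_t bytes_per_sample,
                               unsigned    sample_rate_hz,
                               unsigned    channels = 1)
{
    return static_cast<double>(bytes)
           / (static_cast<double>(bytes_per_sample) * sample_rate_hz * channels);
}

// Assumption: 64 MiB of external SDRAM, all of it used as one mono buffer.
constexpr std::size_t kSdramBytes = 64u * 1024u * 1024u;

// 16-bit integers: about 699 s (roughly 11.6 minutes) at 48 kHz.
constexpr double kSecondsInt16 = BufferSeconds(kSdramBytes, sizeof(int16_t), 48000);

// 32-bit floats: half of that, about 350 s (roughly 5.8 minutes).
constexpr double kSecondsFloat = BufferSeconds(kSdramBytes, sizeof(float), 48000);
```

A stereo buffer halves both figures again, which is usually where the "how much audio can it store?" answers start to diverge.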
When you're capturing audio from the inputs, storing it as 16-bit data does throw some information away, but the loss may be negligible, because the lowest bits coming from the audio codec are mostly noise. You're still working with "CD quality" audio. If you still want to avoid losing details that are below the hearing threshold, you could try writing a custom 24-bit integer serialization class. For normalized audio it would keep more or less the same precision as a 32-bit float (whose significand is 24 bits wide) while using three bytes per sample instead of four.
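As a rough idea of what such a 24-bit class could look like, here is a minimal sketch. `Int24Codec` is a hypothetical name, not something from an existing library, and the little-endian byte order and the scale factor of 2^23 - 1 are my own choices.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper: packs normalized float samples into 3 bytes each.
struct Int24Codec
{
    static constexpr float kScale = 8388607.0f; // 2^23 - 1

    // Clamp to [-1, 1], quantize to a signed 24-bit value, store little-endian.
    static void Write(float in, uint8_t* dst)
    {
        const float    clamped = std::max(-1.0f, std::min(1.0f, in));
        const int32_t  v = static_cast<int32_t>(std::lrintf(clamped * kScale));
        const uint32_t u = static_cast<uint32_t>(v);
        dst[0] = static_cast<uint8_t>(u & 0xFF);
        dst[1] = static_cast<uint8_t>((u >> 8) & 0xFF);
        dst[2] = static_cast<uint8_t>((u >> 16) & 0xFF);
    }

    // Rebuild the 24-bit value, sign-extend it to 32 bits, convert to float.
    static float Read(const uint8_t* src)
    {
        uint32_t u = static_cast<uint32_t>(src[0])
                     | (static_cast<uint32_t>(src[1]) << 8)
                     | (static_cast<uint32_t>(src[2]) << 16);
        if(u & 0x800000u)
            u |= 0xFF000000u; // negative sample: fill the top byte with ones
        return static_cast<float>(static_cast<int32_t>(u)) / kScale;
    }
};
```

Sample `i` then lives at byte offset `3 * i` in the buffer. Keep in mind that these accesses are unaligned byte reads and writes, so it's worth measuring whether the memory saved is worth the extra work per sample.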
Even something as popular as MI Clouds supported different formats for its audio buffer, going as lo-fi as 8-bit with companding at 16 kHz. Of course, that was done out of necessity, as it had no external SDRAM chip and had to fit everything into 192 KB of on-chip SRAM.
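For reference, 8-bit companding can be sketched with the textbook continuous µ-law curve. This is a generic illustration, not the code Clouds actually uses; `MuLawEncode`, `MuLawDecode` and the choice of µ = 255 with a signed 8-bit code are assumptions.

```cpp
#include <cmath>
#include <cstdint>

constexpr float kMu = 255.0f; // standard mu-law constant (assumed here)

// Compress a normalized sample into a signed 8-bit code.
inline int8_t MuLawEncode(float x)
{
    x = std::fmax(-1.0f, std::fmin(1.0f, x));
    const float mag = std::log1p(kMu * std::fabs(x)) / std::log1p(kMu);
    return static_cast<int8_t>(std::lrintf(std::copysign(mag, x) * 127.0f));
}

// Expand an 8-bit code back into a normalized sample.
inline float MuLawDecode(int8_t code)
{
    const float y   = static_cast<float>(code) / 127.0f;
    const float mag = (std::pow(1.0f + kMu, std::fabs(y)) - 1.0f) / kMu;
    return std::copysign(mag, y);
}
```

The logarithmic curve spends most of the 256 codes on quiet signals, which is why 8 bits with companding sounds noticeably better than plain linear 8-bit audio.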