Unexpected data on reads from SD card

Hi!

I’m trying to figure out unexpected corrupted reads from an SD card. Whenever I call f_lseek before reading a file, some of the bytes come back corrupted. The behavior is quite consistent between the 2 cards I tested. Maybe it’s not corruption but rather a race condition of some sort. When I read a WAV file, there is an interesting pattern of reordered bytes. Here is a chunk of mismatched bytes (compared to the hex editor on my Mac):

ID[618] hw = 0x66 pc = 0x00
ID[619] hw = 0x6d pc = 0x00
ID[620] hw = 0x74 pc = 0x00
ID[621] hw = 0x20 pc = 0x66
ID[622] hw = 0x10 pc = 0x6d
ID[623] hw = 0x00 pc = 0x74
ID[624] hw = 0x00 pc = 0x20
ID[625] hw = 0x00 pc = 0x10
ID[626] hw = 0x01 pc = 0x00
ID[628] hw = 0x02 pc = 0x00
ID[629] hw = 0x00 pc = 0x01
ID[630] hw = 0x44 pc = 0x00
ID[631] hw = 0xac pc = 0x02
ID[633] hw = 0x00 pc = 0x44
ID[634] hw = 0x98 pc = 0xac
ID[635] hw = 0x09 pc = 0x00
ID[636] hw = 0x04 pc = 0x00
ID[637] hw = 0x00 pc = 0x98
ID[638] hw = 0x06 pc = 0x09
ID[639] hw = 0x00 pc = 0x04
ID[640] hw = 0x18 pc = 0x00
ID[641] hw = 0x00 pc = 0x06
ID[642] hw = 0x64 pc = 0x00
ID[643] hw = 0x61 pc = 0x18

I’m using a Daisy Pod, with firmware running from internal flash. Here’s the code I used for testing:

and a bin file filled with 0xFF bytes:
output.bin.zip (221 Bytes)

I appreciate any ideas/suggestions!

It seems like this issue might be related to cache maintenance. However, I have not been able to fix it with the solutions I found online. Here’s another pattern of messed-up bytes:
original: 220900 F50500 7A0A00 EE0600
read: 220905 007A0A 00EE06

As you can see, bytes 3 and 4 are missing from the read pattern, which might indicate a caching cleanup issue.

@shensley by any chance does this ring a bell?

Thinking out loud here… does using ordinary RAM for the buffer make a difference?

No, unfortunately. I believe that the root cause is a combination of non-aligned buffers and cache invalidation, but it is kinda over my head.

When I see cache issues they usually present as no change in the data, but that can also be hard to tell with the SDRAM because it isn’t guaranteed to be zero at startup.

This does kind of seem like some sort of SD Reading issue, and possibly errors due to hardware/diskio.
Things to try to identify if that’s the issue or not:

  1. Set the SD card bus-width to 1-bit instead of 4
  2. Reduce the speed of SD Card down to MEDIUM_SLOW

If either of these resolve the issue then it’s most likely related to routing of the SD card specific traces.

Otherwise, there are a few things to try to at least rule out cache as the issue:

  1. Disable the DCache entirely using SCB_DisableDCache() after calling Init on your daisy object.
  2. Add a dsy_dma_invalidate_cache_for_buffer(inbuff, length); before your f_read call.
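
For anyone trying the second suggestion by hand: the Cortex-M7 data cache works on 32-byte lines, so cache maintenance on a buffer that does not start and end on a 32-byte boundary also touches whatever shares those lines. Below is a minimal sketch of the address arithmetic only. The names `CacheRange` and `CacheAlignedRange` are made up for illustration and are not part of libDaisy or CMSIS.

```cpp
#include <cstddef>
#include <cstdint>

// Cortex-M7 data cache line size in bytes.
constexpr std::uintptr_t kCacheLine = 32;

// Hypothetical helper type: a range that covers whole cache lines.
struct CacheRange
{
    std::uintptr_t start; // rounded down to a cache line boundary
    std::size_t    size;  // always a multiple of the cache line size
};

// Expand [addr, addr + len) outward so that it covers whole cache lines.
inline CacheRange CacheAlignedRange(std::uintptr_t addr, std::size_t len)
{
    const std::uintptr_t start = addr & ~(kCacheLine - 1);
    const std::uintptr_t end
        = (addr + len + kCacheLine - 1) & ~(kCacheLine - 1);
    return {start, static_cast<std::size_t>(end - start)};
}
```

If the resulting range is larger than the buffer itself, the buffer shares cache lines with neighbouring data, which is one reason to give DMA buffers 32-byte alignment and a size that is a multiple of 32.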

Note: the OP is using Daisy Pod.

Thanks!

I’m using a Pod, so I hope it’s not a hardware issue. Given that the issue is consistent with 2 different cards, as well as on my custom hardware, I’d say it’s a software issue.

I tried your suggestions, but unfortunately none of them worked. The only thing that helps is reading in 512-byte chunks, but that’s not ideal for my use case.
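
A rough sketch of what the 512-byte chunk workaround can look like, written against a generic reader callback so it can be tried off-target. `ReadFn` and `ReadInSectorChunks` are hypothetical names; on the Daisy the callback would wrap f_read.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical reader: copy up to `len` bytes into `dst`,
// return how many bytes were actually copied.
using ReadFn = std::function<std::size_t(std::uint8_t* dst, std::size_t len)>;

constexpr std::size_t kSectorSize = 512;

// Read `total` bytes, but never ask the reader for more than one
// sector at a time. Returns the number of bytes actually read.
inline std::size_t
ReadInSectorChunks(const ReadFn& readFn, std::uint8_t* dst, std::size_t total)
{
    std::size_t done = 0;
    while(done < total)
    {
        const std::size_t want = std::min(kSectorSize, total - done);
        const std::size_t got  = readFn(dst + done, want);
        done += got;
        if(got < want)
            break; // short read: end of file or an error
    }
    return done;
}
```

The cost is one call per 512 bytes, which is why this is a workaround rather than a fix.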

hm, so I was just looking over the SD Diskio stuff, and it should be doing a clean before reading, and an invalidate after reading. So cache should be totally safe on that, as long as the memory being used is cacheable, and the memory sections are compatible with SDMMC (e.g. either AXI SRAM or SDRAM).

I think we recently spoke about something similar, and identified that there might be some switch internally in FatFS that will use an internal buffer for blocks <512, and then operate directly on the buffer if it’s larger.

If you’ve got a debugger hooked up, it might be worth putting a breakpoint in the SD_read function in util/sd_diskio.c and seeing if the address for the *buff is your SDRAM buffer or something internal.
That won’t directly solve the issue, but it might at least hint at where a problem might be occurring.
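
To make sense of the address seen at that breakpoint, here is a small sketch that maps it to a memory region. The address ranges are assumptions based on the usual STM32H750 / Daisy Seed memory map (DTCM at 0x20000000, 128 KB; AXI SRAM at 0x24000000, 512 KB; SDRAM at 0xC0000000, 64 MB), so check them against your linker script. The function names are made up for illustration.

```cpp
#include <cstdint>

enum class MemRegion
{
    Dtcm,
    AxiSram,
    Sdram,
    Other
};

// Assumed address ranges; verify against the linker script in use.
inline MemRegion ClassifyAddress(std::uint32_t addr)
{
    if(addr >= 0x20000000u && addr < 0x20020000u)
        return MemRegion::Dtcm;
    if(addr >= 0x24000000u && addr < 0x24080000u)
        return MemRegion::AxiSram;
    if(addr >= 0xC0000000u && addr < 0xC4000000u)
        return MemRegion::Sdram;
    return MemRegion::Other;
}

// True for the two regions named above as compatible with SDMMC.
inline bool IsSdmmcFriendly(std::uint32_t addr)
{
    const MemRegion r = ClassifyAddress(addr);
    return r == MemRegion::AxiSram || r == MemRegion::Sdram;
}
```

If the buffer pointer falls outside those two regions, it is probably an internal FatFS or stack buffer rather than the one passed to f_read.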

Also, we did just do a few releases on libDaisy that include a huge HAL driver update. So if you haven’t updated yet, it might be worth seeing if that behavior persists after an update (in case there was some bug/error in the HAL itself causing this issue).

I’ll look into buff address, thanks.

I did switch to the new HAL, but it hasn’t changed anything. I also tried the latest FatFS version and the more recent sd_diskio implementation from ST (Middlewares/Third_Party/FatFs/src/drivers/sd_diskio_dma_template_bspv2.c in the STMicroelectronics/STM32CubeH7 repository on GitHub).

One interesting thing I noticed that affects the corruption is the f_lseek amount. If I read from the beginning of the file (without calling f_lseek), the received data is correct. But when I move the read pointer, it gets tricky. In some cases I get correct data (when the pointer is a multiple of 32, or a bit lower), in others corrupted data. Also, it seems that making the buffer size a multiple of 32 makes things a bit more stable across various f_lseek positions.

Interestingly, when I switched the sd_diskio.c implementation to blocking transfers, I no longer get any data corruption, and the timing is about the same. So I guess the root issue is somewhere within the DMA configuration?

I think I figured it out. The problem is coming from the SystemClocks setup. Whenever I comment out all the PeriphClkInitStruct setup, I get stable correct reads.

@shensley What do you think? Seems like something is up with the clocks, or other peripherals that might interfere with sdmmc controller :thinking:

I was wrong :slight_smile: Turns out it was an issue with my test code, so the clock configuration does not have any effect.

In my test code, I do repeated reads while advancing offset that is used to f_lseek the read pointer.

From my tests, it seems that the f_lseek works in increments of 4, meaning for values [0, 3], f_read will load the values from 0, for values [4, 7], it will load values from 4 etc.
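
If seeking really does land on multiples of 4, as observed here, one workaround is to seek to the aligned position and throw away the leftover bytes. The sketch below only plans the seek. `SeekPlan` and `PlanAlignedSeek` are hypothetical names, and the 4-byte granularity is what this thread observed, not documented FatFS behavior.

```cpp
#include <cstdint>

// Hypothetical result type: where to seek, and how many bytes to
// read and discard afterwards to reach the requested offset.
struct SeekPlan
{
    std::uint32_t seekTo;
    std::uint32_t discard;
};

// Split `offset` into an aligned seek position plus a remainder.
// The default alignment of 4 matches the behavior seen in this thread.
inline SeekPlan PlanAlignedSeek(std::uint32_t offset,
                                std::uint32_t alignment = 4)
{
    const std::uint32_t seekTo = offset - (offset % alignment);
    return {seekTo, offset - seekTo};
}
```

On the target this would be followed by f_lseek to `seekTo` and a read that drops the first `discard` bytes before handing data to the caller.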

This behavior is kinda unnoticeable when reading data in even chunks (i.e. when working with native types int8, int16 etc). The worst that might happen here is missing or duplicated data, which isn’t a big deal.

In my case, I am working with 24bit values (24bit PCM wav format, converted to 32bit floats), so this behavior leads to shifted bytes, which means corrupted 32bit representation.
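
To show why a one-byte shift is so destructive for 24-bit audio, here is a minimal sketch of decoding little-endian signed 24-bit PCM to float. The function names are made up, and this is not the conversion code used in the original project.

```cpp
#include <cstdint>

// Decode one little-endian signed 24-bit sample into a 32-bit integer.
inline std::int32_t Decode24(const std::uint8_t* p)
{
    const std::uint32_t u = static_cast<std::uint32_t>(p[0])
                            | (static_cast<std::uint32_t>(p[1]) << 8)
                            | (static_cast<std::uint32_t>(p[2]) << 16);
    std::int32_t v = static_cast<std::int32_t>(u);
    if(u & 0x800000u)
        v -= 0x1000000; // sign bit set: wrap into the negative range
    return v;
}

// Scale to the range [-1.0, 1.0).
inline float S24ToFloat(const std::uint8_t* p)
{
    return static_cast<float>(Decode24(p)) / 8388608.0f;
}
```

With the bytes `00 00 40 00 00 40`, decoding at offset 0 gives 0.5, while decoding the same data one byte late gives about 0.00195, so every sample after a shifted read is wrong.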

Below are the streams of 516 bytes when reading from offsets 0 to 4 (the file contains bytes rising from 0 to ff, repeated):

Reading from: 0  | ID[0]  0  1 ... fd fe ff  0  1  ... fc fd fe ff | 0  1  2  3 ...
Reading from: 1  | ID[1]  1  2 ... fd fe ff  0  1  ... f9 fa fb fc | 0  1  2  3 ...
Reading from: 2  | ID[2]  2  3 ... fd fe ff  0  1  ... fa fb fc fd | 0  1  2  3 ... 
Reading from: 3  | ID[3]  3  4 ... fd fe ff  0  1  ... fb fc fd fe | 0  1  2  3 ...
Reading from: 4  | ID[4]  4  5 ... fd fe ff  0  1  ... fc fd fe ff | 0  1  2  3 ...

As you can see, for cases 1-3, the last few bytes of the 512-byte chunk (to the left of |) are missing. It feels like a DMA data alignment thing, but none of my attempts to apply __attribute__((aligned(32))) to every buffer I can find have helped.

Oh fascinating!
Thanks for the detailed follow up! Glad it wasn’t something weird with the clock tree.

It makes sense that seeking would be 32-bit aligned, but I never realized it was.

Depending on what types you’re using, and how things are getting converted, you may actually want to use the packed attribute to ensure the minimum number of bytes used is what you expect (esp. if you made a custom union or struct for your 24-bit datatype).
Also fwiw, I believe both aligned and packed use the number of bytes, not the number of bits for their arguments (in case your usage above was meant to be 32-bit alignment).
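
A small sketch of both attributes, assuming GCC or Clang (as used for Daisy builds). Note that `aligned` and `alignas` take a byte count, while GCC’s `packed` attribute takes no argument. All type and variable names here are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Three plain bytes need no packing: size 3, alignment 1.
struct Sample24
{
    std::uint8_t bytes[3];
};
static_assert(sizeof(Sample24) == 3, "three bytes, no padding");

// Mixing member sizes is where padding appears. Without packed this
// struct is typically 8 bytes; with packed it is 5 (GCC/Clang only).
struct __attribute__((packed)) MixedPacked
{
    std::uint8_t  tag;
    std::uint32_t value;
};
static_assert(sizeof(MixedPacked) == 5, "packed removes the padding");

// alignas(32) means 32 bytes (one Cortex-M7 cache line), not 32 bits.
alignas(32) static std::uint8_t gReadBuffer[512];

// Check whether a pointer is aligned to `n` bytes.
inline bool IsAligned(const void* p, std::size_t n)
{
    return reinterpret_cast<std::uintptr_t>(p) % n == 0;
}
```

For 32-bit alignment the argument would be 4, so `aligned(32)` in the post above requests cache-line alignment, which is the more useful value for DMA buffers anyway.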

Very interesting. Is this something which should be documented as a difference from normal disk-based seeking, or just a subtle programming error?

It seems like a limitation posed by DMA. Blocking IO works fine in my tests, meaning I can move the pointer by a single byte.

With that figured out, I’m kinda back to square one :smiley:

My initial issue was unexpected hard faults when reading a lot of small chunks of data. After defining HAL_SD_ErrorCallback I see that I’m getting RX overrun errors (SDMMC_ERROR_RX_OVERRUN), and I’m not really sure what to do next. I’ve kinda found a “sweet” spot in my configuration, plus some hacks from the internet, which reduces the frequency of these errors, but they still happen.

Ideally, I would have a solid implementation in daisy, so I wouldn’t need to compromise performance to avoid bugs.

I’m getting similar issues reading data for DrMP3 and DrFlac. I tested the same code on PC and Daisy. Using FatFS on the Daisy with the bootloader, it looks like randomly sized f_reads occasionally put 3 bytes of random data in the buffer. I’ve tried invalidating the cache, but no joy with that. So I’m now writing a circular-buffer read/consume system that reads in fixed-size chunks (64 KB etc.). Just posting this here in case anyone else gets the same trouble. Here’s the raw data load that is getting corrupted (3 bytes in the wrong places in a small 741-byte read):

CC,99,99 is missing from the daisy seed read and then 44,22,BB is blatted in further down the file - freaky!

This is the same file - on the same SD card and all using the same code.

More info on this:

My SD reader is on the SDIO pins - 4bit - it’s pretty quick - I do a music file scan at boot up and it takes 800ms to scan 1400 files in 97 directories. SD card is exfat 128gb.

This is after I setup my spi LCD, but before I start the Audio callback.
I did change the clk source on SPI, so it’s faster and takes around 13ms for a full 240x280 16-bit blit. Can’t remember what internal code I changed. Like an idiot I forgot to mark it, doh!

I know the DrMp3 decode works just fine because I’ve loaded the entire mp3 into sdram and decoded with no problems - seems one big file load is fine.

SD card speed is set to sd_cfg.speed = daisy::SdmmcHandler::Speed::MEDIUM_SLOW; - which is a fine data rate for me.

My code uses no Seeking - it’s purely read after read e.t.c, all small sized chunks of random sized data.

It’s taken me days to track this down so far. I think that reading a file in power-of-2 chunks will work fine, and that it’s some kind of non-aligned buffer issue combined with odd-sized file reads that causes the problem.

Did a raw file test on PC and Daisy with identical results, so my post above is probably another issue!

Here’s the daisy code I used:

#define SerialLogVA(str, ...) hardware.PrintLine(str, __VA_ARGS__);
#define SerialLog(str) hardware.PrintLine(str);
unsigned long crctable[256] = {

0x00000000, 0x77073096, 0xee0e612c, 0x990951ba, 0x076dc419, 0x706af48f,

0xe963a535, 0x9e6495a3, 0x0edb8832, 0x79dcb8a4, 0xe0d5e91e, 0x97d2d988,

0x09b64c2b, 0x7eb17cbd, 0xe7b82d07, 0x90bf1d91, 0x1db71064, 0x6ab020f2,

0xf3b97148, 0x84be41de, 0x1adad47d, 0x6ddde4eb, 0xf4d4b551, 0x83d385c7,

0x136c9856, 0x646ba8c0, 0xfd62f97a, 0x8a65c9ec, 0x14015c4f, 0x63066cd9,

0xfa0f3d63, 0x8d080df5, 0x3b6e20c8, 0x4c69105e, 0xd56041e4, 0xa2677172,

0x3c03e4d1, 0x4b04d447, 0xd20d85fd, 0xa50ab56b, 0x35b5a8fa, 0x42b2986c,

0xdbbbc9d6, 0xacbcf940, 0x32d86ce3, 0x45df5c75, 0xdcd60dcf, 0xabd13d59,

0x26d930ac, 0x51de003a, 0xc8d75180, 0xbfd06116, 0x21b4f4b5, 0x56b3c423,

0xcfba9599, 0xb8bda50f, 0x2802b89e, 0x5f058808, 0xc60cd9b2, 0xb10be924,

0x2f6f7c87, 0x58684c11, 0xc1611dab, 0xb6662d3d, 0x76dc4190, 0x01db7106,

0x98d220bc, 0xefd5102a, 0x71b18589, 0x06b6b51f, 0x9fbfe4a5, 0xe8b8d433,

0x7807c9a2, 0x0f00f934, 0x9609a88e, 0xe10e9818, 0x7f6a0dbb, 0x086d3d2d,

0x91646c97, 0xe6635c01, 0x6b6b51f4, 0x1c6c6162, 0x856530d8, 0xf262004e,

0x6c0695ed, 0x1b01a57b, 0x8208f4c1, 0xf50fc457, 0x65b0d9c6, 0x12b7e950,

0x8bbeb8ea, 0xfcb9887c, 0x62dd1ddf, 0x15da2d49, 0x8cd37cf3, 0xfbd44c65,

0x4db26158, 0x3ab551ce, 0xa3bc0074, 0xd4bb30e2, 0x4adfa541, 0x3dd895d7,

0xa4d1c46d, 0xd3d6f4fb, 0x4369e96a, 0x346ed9fc, 0xad678846, 0xda60b8d0,

0x44042d73, 0x33031de5, 0xaa0a4c5f, 0xdd0d7cc9, 0x5005713c, 0x270241aa,

0xbe0b1010, 0xc90c2086, 0x5768b525, 0x206f85b3, 0xb966d409, 0xce61e49f,

0x5edef90e, 0x29d9c998, 0xb0d09822, 0xc7d7a8b4, 0x59b33d17, 0x2eb40d81,

0xb7bd5c3b, 0xc0ba6cad, 0xedb88320, 0x9abfb3b6, 0x03b6e20c, 0x74b1d29a,

0xead54739, 0x9dd277af, 0x04db2615, 0x73dc1683, 0xe3630b12, 0x94643b84,

0x0d6d6a3e, 0x7a6a5aa8, 0xe40ecf0b, 0x9309ff9d, 0x0a00ae27, 0x7d079eb1,

0xf00f9344, 0x8708a3d2, 0x1e01f268, 0x6906c2fe, 0xf762575d, 0x806567cb,

0x196c3671, 0x6e6b06e7, 0xfed41b76, 0x89d32be0, 0x10da7a5a, 0x67dd4acc,

0xf9b9df6f, 0x8ebeeff9, 0x17b7be43, 0x60b08ed5, 0xd6d6a3e8, 0xa1d1937e,

0x38d8c2c4, 0x4fdff252, 0xd1bb67f1, 0xa6bc5767, 0x3fb506dd, 0x48b2364b,

0xd80d2bda, 0xaf0a1b4c, 0x36034af6, 0x41047a60, 0xdf60efc3, 0xa867df55,

0x316e8eef, 0x4669be79, 0xcb61b38c, 0xbc66831a, 0x256fd2a0, 0x5268e236,

0xcc0c7795, 0xbb0b4703, 0x220216b9, 0x5505262f, 0xc5ba3bbe, 0xb2bd0b28,

0x2bb45a92, 0x5cb36a04, 0xc2d7ffa7, 0xb5d0cf31, 0x2cd99e8b, 0x5bdeae1d,

0x9b64c2b0, 0xec63f226, 0x756aa39c, 0x026d930a, 0x9c0906a9, 0xeb0e363f,

0x72076785, 0x05005713, 0x95bf4a82, 0xe2b87a14, 0x7bb12bae, 0x0cb61b38,

0x92d28e9b, 0xe5d5be0d, 0x7cdcefb7, 0x0bdbdf21, 0x86d3d2d4, 0xf1d4e242,

0x68ddb3f8, 0x1fda836e, 0x81be16cd, 0xf6b9265b, 0x6fb077e1, 0x18b74777,

0x88085ae6, 0xff0f6a70, 0x66063bca, 0x11010b5c, 0x8f659eff, 0xf862ae69,

0x616bffd3, 0x166ccf45, 0xa00ae278, 0xd70dd2ee, 0x4e048354, 0x3903b3c2,

0xa7672661, 0xd06016f7, 0x4969474d, 0x3e6e77db, 0xaed16a4a, 0xd9d65adc,

0x40df0b66, 0x37d83bf0, 0xa9bcae53, 0xdebb9ec5, 0x47b2cf7f, 0x30b5ffe9,

0xbdbdf21c, 0xcabac28a, 0x53b39330, 0x24b4a3a6, 0xbad03605, 0xcdd70693,

0x54de5729, 0x23d967bf, 0xb3667a2e, 0xc4614ab8, 0x5d681b02, 0x2a6f2b94,

0xb40bbe37, 0xc30c8ea1, 0x5a05df1b, 0x2d02ef8d};

// Table-driven CRC over `size` bytes (no final bit inversion is applied).
u32 CalcCRC(u8* data, u32 size)
{
    u32 iCRC = 0xFFFFFFFF;
    for(u32 i = 0; i < size; i++)
        iCRC = ((iCRC >> 8) & 0x00FFFFFF) ^ crctable[(iCRC ^ data[i]) & 0xFF];
    return iCRC;
}
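
As a side note, the 256-entry table above does not have to be pasted in by hand. It is the standard reflected CRC-32 table for polynomial 0xEDB88320 and can be generated at startup. A minimal sketch, with `MakeCrcTable` being a hypothetical name:

```cpp
#include <array>
#include <cstdint>

// Build the reflected CRC-32 lookup table (polynomial 0xEDB88320).
inline std::array<std::uint32_t, 256> MakeCrcTable()
{
    std::array<std::uint32_t, 256> table{};
    for(std::uint32_t n = 0; n < 256; n++)
    {
        std::uint32_t c = n;
        // Process the 8 bits of the index, LSB first.
        for(int k = 0; k < 8; k++)
            c = (c & 1u) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        table[n] = c;
    }
    return table;
}
```

The posted CalcCRC skips the final bit inversion that standard CRC-32 applies, so its values will not match tools like zip or cksum, but that does not matter when the same function runs on both the PC and the Daisy.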

#define CRCCHUNKSIZE 0x10000
u8 DSY_SDRAM_BSS CrcFileBuffer[CRCCHUNKSIZE];
DSY_TEXT FIL tf; // must be in DSY_TEXT or DSY_SDRAM_BSS - else things start to go wrong eof not working e.t.c.

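void CRCFileChecker(string fn)
{
    // Bail out early if the file cannot be opened.
    if(f_open(&tf, fn.c_str(), FA_READ) != FR_OK)
    {
        SerialLogVA("Could not open %s", fn.c_str());
        return;
    }
    u32 Chunk = 0;
    SerialLogVA("CRC Check File: %s", fn.c_str());
    while(!f_eof(&tf))
    {
        u32 BytesRead = 0;
        // Stop on a read error or an empty read so the loop cannot spin forever.
        if(f_read(&tf, CrcFileBuffer, CRCCHUNKSIZE, &BytesRead) != FR_OK || BytesRead == 0)
            break;
        hardware.DelayMs(10);
        u32 chunkcrc = CalcCRC(CrcFileBuffer, BytesRead);
        SerialLogVA("Chunk %d CRC 0x%08x ChunkSize %d", Chunk++, chunkcrc, BytesRead);
    }
    f_close(&tf);
    SerialLog("FileCheck done!");
    hardware.DelayMs(500000);
}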

here’s the PC code:

#define CRCCHUNKSIZE 0x10000
u8 CrcFileBuffer[CRCCHUNKSIZE];
FILE *tf;

void CRCFileChecker(string fn)
{
    tf = fopen(fn.c_str(), "rb");
    u32 Chunk = 0;
    SerialLogVA("CRC Check File: %s\n", fn.c_str());
    while(!feof(tf))
    {
        u32 BytesRead = fread(CrcFileBuffer, 1, CRCCHUNKSIZE, tf);
        u32 chunkcrc  = CalcCRC(CrcFileBuffer, BytesRead);
        SerialLogVA("Chunk %d CRC 0x%08x ChunkSize %d\n", Chunk++, chunkcrc, BytesRead);
    }
    fclose(tf);
    SerialLog("FileCheck done!\n");
}

and the results are identical, so it’s something else going on!

Finally fixed my FatFS issues. I’m sure there is a read/seek problem with it, so I wrote my own buffered file system, and that works a treat. It’s been tested with over 1470 files.

This is pretty scrappy coding - but should point you in the right direction if u get read/seek issues e.t.c.

I’m really surprised just how powerful the Daisy is - my current tests on cpu usage decoding MP3/Flac/Wav give this (including file io time):

5-12% flac
3-4% mp3
5-12% wav

> #define DSY_TEXT __attribute__((section(".text")))
> // Must be declared in DSY_TEXT
> typedef struct BufferedFile
> {
>     // 8kb SD sectors seems to give decent performance
>     #define FBSize 0x2000
>     FIL file;
>     u8  FileBuffer[FBSize];
>     
>     volatile u32 DataAvail=0;
>     volatile u32 FileRP=0;
>     volatile u32 FileWP=0;
>     volatile u32 VirtualPos=0;
>     string FileName;
> 
>     BufferedFile()      {  }
>     void ClearBuffers() { return;for (int a=0;a<FBSize;a++) FileBuffer[a]=0; }
> 
>     void Reset()
>     {
>     DataAvail=0;
>     FileRP=0;
>     FileWP=0;
>     VirtualPos=0;
>     ClearBuffers();
>     }
>     
>     void Open(string fn)
>     {
>     Close();// just in case!
>     Reset();
>     FileName=fn;
>     FRESULT fr=f_open(&file,fn.c_str(),FA_READ);
>     }
> 
>     void Reopen()
>     {
>     f_close(&file);
>     Open(FileName);
>     }
>   // internal functions 
>   // Don't call inside the Audio callback - ever!
>   u32 iread(u8* buf,u32 count)
>     {
> 
>   u32 ActualBytesRead=0;
>   u32 WantedBytes=count;
>   while (count>0)
>     {
>     // Load in a new chunk if there's space!
>     // probably the best place to round robin if u need multiple streams e.t.c.
>     if ((f_eof(&file)==0) && (DataAvail==0))
>             {
>             u32 BytesRead=0; 
>             FRESULT fr=f_read(&file,FileBuffer,FBSize,&BytesRead);          
>             DataAvail+=BytesRead;          
>             }
>     if (DataAvail>0)
>       {
>       if (buf) buf[ActualBytesRead]=FileBuffer[FileRP];
>       FileRP=(FileRP+1)&(FBSize-1);
>       DataAvail--;
>       ActualBytesRead++;
>       VirtualPos++;
>       }
>     count--;
>     }
>     return ActualBytesRead;
> }
> 
> // dirty and inefficient but it works!
> void iseek(u32 offset)
> {
>   f_lseek(&file,offset&0xFFFFFFF0);   // Seek to 16byte aligned pos from file beginning 
>   Reset();  
>   VirtualPos=offset&0xFFFFFFF0;
>   iread(0,offset&0xF); // consume till we get to the right offset
> 
> }
> 
> u32 Tell()    { return VirtualPos;}
> void Close()  { f_close(&file);}
> bool IsEof()  { return (f_eof(&file)!=0) && (DataAvail==0); } // buffered data may remain after the file itself hits EOF
> u32 size()    { return f_size(&file);}
> 
> // FatFS replacements
> 
> // stdio  replacements
> 
> };
> 
> BufferedFile DSY_TEXT gBF;

Suggestion: use ``` (three back ticks) before and after your code to preserve formatting.
