Recurring DB corruption we don't understand

Hi,

We are seeing recurring corruption on one of our DB nodes, running MariaDB 10.11.10 on RHEL 9 under VMware.
2025-05-13 6:32:08 0 [ERROR] InnoDB: tried to purge non-delete-marked record in index `cardChipNumber` of table `mydb`.`idm_usc_kaart_status`: tuple: TUPLE (info_bits=0, 2 fields): {[17]36147687118764292(0x3336313437363837313138373634323932),[4] " (0x9722A2FB)}, record: COMPACT RECORD(info_bits=0, 2 fields): {[17]36147687118764292(0x3336313437363837313138373634323932),[4] " (0x9722A2FB)}
2025-05-13 6:32:08 0 [ERROR] InnoDB: Flagged corruption of `cardChipNumber` in table `mydb`.`idm_usc_kaart_status` in purge
2025-05-13 10:39:20 442752 [ERROR] Got error 180 when reading table './mydb/idm_usc_kaart_status'
2025-05-13 12:44:51 0 [ERROR] InnoDB: tried to purge non-delete-marked record in index `inschrijving_id` of table `mydb`.`tinschrijving`: tuple: TUPLE (info_bits=0, 2 fields): {[4] ](0x800D0C5D),[4] \Y[(0x805C595B)}, record: COMPACT RECORD(info_bits=0, 2 fields): {[4] ](0x800D0C5D),[4] \Y[(0x805C595B)}
2025-05-13 12:44:51 0 [ERROR] InnoDB: Flagged corruption of `inschrijving_id` in table `mydb`.`tinschrijving` in purge
2025-05-13 12:45:59 453516 [ERROR] Got error 180 when reading table './mydb/tinschrijving'
2025-05-13 12:46:03 453524 [ERROR] Got error 180 when reading table './mydb/tinschrijving'
2025-05-13 12:46:21 453540 [ERROR] Got error 180 when reading table './mydb/tinschrijving'
2025-05-13 12:46:26 453541 [ERROR] Got error 180 when reading table './mydb/tinschrijving'
And last Thursday, suddenly again:
2025-05-29 6:32:22 290727 [ERROR] Got error 180 when reading table './mydb/idm_usc'
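To see at a glance which tables and indexes entries like these flag, the error log can be reduced to one line per affected index. This is only a minimal sketch: the function name `flagged_indexes` is made up for illustration, and the log path in the usage comment is an assumption (the `log_error` setting gives the real location).

```shell
# Hypothetical helper: print each "db.table index" pair that InnoDB has
# flagged as corrupt, reading log lines from the given files or from stdin.
flagged_indexes() {
  sed -n 's/.*Flagged corruption of `\([^`]*\)` in table `\([^`]*\)`\.`\([^`]*\)`.*/\2.\3 \1/p' "$@" | sort -u
}

# Usage (the log path is an assumption; adjust it to your log_error setting):
#   flagged_indexes /var/log/mariadb/mariadb.log
```

Only the "Flagged corruption" lines are matched, so tables that show up solely through "Got error 180" are not listed by this sketch.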
It starts without an obvious (to us) reason. We have verified that we are running with what seem to be safe settings, such as innodb_doublewrite = ON and innodb_flush_method = O_DIRECT. The system has enough RAM, swap is configured but none is in use, and no disks or partitions are full. At a certain point (after May 13) it became so bad that I took a full mysqldump of all databases:
mysqldump --all-databases --single-transaction --routines --triggers --events > /mnt/sdc/full_admin_backup.sql
emptied the datadir, and imported it all fresh again:
mysql < /mnt/sdc/full_admin_backup.sql
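After a reload like this, it may be worth confirming that every table checks clean, so that any later corruption is known to be new. A minimal sketch, assuming `mysqlcheck` is on the path and prints one `db.table  OK` line per healthy table; the function name `tables_not_ok` is hypothetical.

```shell
# Hypothetical helper: drop the "db.table   OK" lines from mysqlcheck
# output so that only lines for tables with warnings or errors remain.
tables_not_ok() {
  sed '/[[:space:]]OK$/d' "$@"
}

# Usage (empty output means every table reported OK):
#   mysqlcheck --all-databases --check | tables_not_ok
```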
And to my surprise, last week Thursday, it happened again. The filesystem is XFS, mounted like:
/dev/sdb on /data type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
No filesystem corruption was detected; we checked with:
xfs_scrub /data
xfs_spaceman -c 'health -c ' /data/ | grep -v ok
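Since the filesystem itself reports healthy, the kernel log is one more place where trouble with the virtual disk might show up. A rough sketch, assuming that relevant messages mention an I/O error or the XFS filesystem on `sdb`; both the pattern and the function name `storage_errors` are illustrative guesses, not a complete list.

```shell
# Hypothetical helper: keep only kernel log lines that mention an I/O error
# or the XFS filesystem on sdb (the pattern is an assumption, not exhaustive).
storage_errors() {
  grep -iE 'i/o error|xfs \(sdb\)' "$@" || true   # no match is not a failure
}

# Usage:
#   journalctl -k --no-pager | storage_errors
```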
We are unsure what causes this. It should not be possible (I guess?) for userspace / PHP scripts (this is a LAMP setup) to cause issues like this. We are unsure how to troubleshoot further. Does anyone have suggestions or insights? Or is more information required?

Thanks in advance!
MJ