
Sergei Golubchik <serg@askmonty.org> writes:
On Dec 14, Arjen Lentz wrote:
Can we adopt/implement http://forge.mysql.com/worklog/task.php?id=4925 in MariaDB?
P.S. These are the changesets:
final version:
  http://lists.mysql.com/commits/113309
  http://lists.mysql.com/commits/113306
  http://lists.mysql.com/commits/113307
review comments:
  http://lists.mysql.com/commits/116965
  http://lists.mysql.com/commits/121478
The implementation of this makes me very uneasy.

The problem is that, at least from a quick read-through, I see nothing in either the worklog or the patch that properly handles partial writes into the binlog. Just the fact that this is not clearly described up-front in the worklog is very worrying!

The worklog says this:

    "For replication threads, when reading the latest binary log, getting actual size information is needed to check EOF [...] If binlog size is not set, 4KB is read so bogus data is read if actual binlog size is smaller than 4KB. This makes slave i/o thread terminated)"

But there is no guarantee that "bogus data" will be detected as such. We don't even have a checksum on events.

So basically, after a crash the last binlog event may be corrupt, with no sure way to detect this corruption. In other words, we lose crash recovery, which is the whole point of setting sync_binlog=1 in the first place.

[I would love to learn that I am wrong, as this is a very nice feature. But the whole reason fsync() is slow when appending to files is handling the difficult issue of partial writes, so I would be really curious how the patch manages to handle this properly.]

 - Kristian.
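
To illustrate the checksum point, here is a minimal sketch of how a per-event checksum could let recovery find the end of the valid data after a crash. Everything in it is an assumption for illustration only: the event layout (4-byte length, 4-byte CRC32, payload), the rule that a zero-length event is invalid, and the names encode_event and valid_prefix_length are all hypothetical. This is not the MySQL binlog format and not what the patch does. A CRC32 also only makes undetected corruption unlikely; it is not a guarantee.

```python
import struct
import zlib

# Hypothetical event header: payload length and CRC32 of the payload,
# both little-endian unsigned 32-bit integers.
HEADER = struct.Struct("<II")


def encode_event(payload: bytes) -> bytes:
    """Frame one event as length + CRC32 + payload."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def valid_prefix_length(log: bytes) -> int:
    """Return how many leading bytes of `log` form complete, intact events.

    Scanning stops at the first event that is truncated, has a zero
    length (assumed invalid here, so a zero-filled preallocated tail is
    not mistaken for events), or fails its checksum.
    """
    offset = 0
    while offset + HEADER.size <= len(log):
        length, crc = HEADER.unpack_from(log, offset)
        start = offset + HEADER.size
        end = start + length
        if length == 0 or end > len(log):
            break  # empty or truncated event: treat as end of valid data
        if zlib.crc32(log[start:end]) != crc:
            break  # payload does not match its checksum: torn write
        offset = end
    return offset
```

With this framing, recovery would truncate the log to valid_prefix_length(log) bytes, discarding a torn final event or a zero-filled tail instead of handing it to the slave I/O thread. Without a checksum in the event, the same scan has nothing to compare the payload against, which is the concern raised above.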