On Fri, Aug 25, 2023 at 9:40 AM Marko Mäkelä <marko.makela@mariadb.com> wrote:
Like Gordan has said in this thread, you might just let the file system handle compression if you need it. But, there is no free lunch. I suppose that ZFS would not support O_DIRECT.
Doesn't support it YET. But if the concern is double caching, ZFS has a solution for that: setting primarycache=metadata. Obviously buffered writes require an extra memory copy, so things will get faster when the O_DIRECT implementation lands, but this is generally not where the significant bottlenecks are at the moment, especially when you can set sync=disabled and still preserve write ordering (which makes it safer than disabling innodb_flush_log_at_trx_commit, sync_binlog, sync_master_info, and similar).
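For reference, a minimal sketch of how the two properties mentioned above could be applied. The dataset name tank/mysql is a placeholder (an assumption, not something from this thread), and the script only prints the `zfs set` commands rather than running them, so they can be reviewed first.

```python
import shlex

# Hypothetical dataset name; substitute the dataset that holds the data files.
DATASET = "tank/mysql"

# The two ZFS properties discussed above.
PROPERTIES = {
    # Cache only metadata in the ARC, so data pages are not cached twice
    # (once by ZFS and once by the InnoDB buffer pool).
    "primarycache": "metadata",
    # Do not wait for synchronous writes to reach stable storage.
    # The most recent writes can be lost on a crash.
    "sync": "disabled",
}


def zfs_set_commands(dataset, properties):
    """Return one `zfs set` argument list per property, without running anything."""
    return [
        ["zfs", "set", f"{name}={value}", dataset]
        for name, value in properties.items()
    ]


if __name__ == "__main__":
    for argv in zfs_set_commands(DATASET, PROPERTIES):
        print(shlex.join(argv))
```

Note that sync=disabled still gives up durability of the last few seconds of writes on power loss, so whether that is acceptable depends on the workload.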
In any case, with file system compression, page writes would become more than a simple matter of sending the data to a DMA controller. You could also let the storage layer handle compression. I was really impressed by the performance of ScaleFlux when we tested it some years ago.
ZFS compression is fast enough that it isn't really a problem. In many cases (spinning rust, slower SSDs) it makes things faster, because disk throughput is a bigger bottleneck than the compression cost.
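A back-of-the-envelope sketch of that argument, assuming compression and disk I/O overlap so that the slower stage sets the pace. The function name and all numbers below are illustrative assumptions, not measurements from this thread.

```python
def effective_write_mb_s(disk_mb_s, compress_mb_s, ratio):
    """Logical (uncompressed) MB/s sustained when data is compressed before writing.

    disk_mb_s:     raw device throughput
    compress_mb_s: compressor throughput, measured on uncompressed input
    ratio:         uncompressed size / compressed size
    """
    # The disk only has to absorb the compressed bytes, so it can keep up
    # with disk_mb_s * ratio of logical data; the compressor caps the rest.
    return min(compress_mb_s, disk_mb_s * ratio)


if __name__ == "__main__":
    # Made-up figures: a 150 MB/s spinning disk and a 3000 MB/s NVMe device,
    # with a 500 MB/s compressor and a 2:1 compression ratio.
    for name, disk in [("spinning disk", 150), ("fast NVMe", 3000)]:
        with_compression = effective_write_mb_s(disk, 500, 2.0)
        print(f"{name}: {disk} MB/s raw, {with_compression} MB/s with compression")
```

With these assumed figures the slow device roughly doubles its logical throughput, while on the fast device the compressor becomes the limit instead.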