I am writing this to describe work in progress for the Facebook patch. Comments are welcome. In the future we hope to get parts of the Facebook patch into public forks.

We have two big projects in progress:
* admission control
* InnoDB compression for OLTP

---

Admission control enforces a limit on the number of concurrent queries per account. MySQL already limits the number of concurrent connections per account, but we want to set the connection limit to a much larger value than the query limit. The limit will be enforced at statement start and will allow queries to time out in the queue while waiting to run.

By "descheduled" I mean that a connection decrements the count of concurrent queries for a particular account. Connections will be descheduled when:
* a statement ends
* a connection is about to send results back to a client -- only when the socket write is to be done, not when data is buffered
* srv_suspend_mysql_thread is called on a row lock wait
* possibly when a disk read is done within InnoDB
* enter_cond is called

By "scheduled" I mean that a connection increments the count of concurrent queries for an account. If done with "noforce" then the caller waits until the increment does not push the account over its concurrency limit. If done with "force" then the caller does not wait and the limit can be exceeded. A minimal sketch of this scheduling logic is at the end of this note.

Schedule with noforce is done:
* at statement start
* after a socket write

Schedule with force is done:
* when exit_cond is called
* when a thread wakes after calling srv_suspend_mysql_thread

This change will allow us to run with much higher values for max concurrent connections, and we assume that will significantly increase contention on kernel_mutex, so we have begun to evaluate:
* using the pthread mutex implementation (which does not spin) in more cases
* limiting the max number of threads that can do concurrent busy-wait loops on InnoDB mutexes and rw-locks
* removing use of the sync array within InnoDB
* changing InnoDB to wake one rather than all waiters when a mutex is unlocked
* somehow making read_view_open_now faster

---

InnoDB compression for OLTP tries to make compression faster for OLTP workloads. We have work in progress to change InnoDB to not log page images for many, but not all, insert/update/delete operations. When InnoDB compression is used, InnoDB can log page images. That was not done before, and for my production workloads it doubles the rate at which IO is done to the transaction log and increases the dirty page write rate by at least 20% because checkpoint IO must be done sooner.

We previously added an option that allows the LZ compression level to be set globally. I think the default is level 6.

We will evaluate:
* using QuickLZ, which provides faster compression and decompression than zlib
* adding compress_ops and compress_ops_ok per table to information_schema.table_statistics
* adding an option to limit compression to the clustered index and not compress secondary indexes

--
Mark Callaghan
mdcallag@gmail.com
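
Below is the sketch of the per-account scheduling logic mentioned above. It is illustrative only and not code from the patch: the names (AccountLimiter, schedule_noforce, schedule_force, deschedule) and the use of a condition variable are my assumptions about one simple way to get the force/noforce behavior and the queue timeout.

// Minimal sketch of per-account admission control. Hypothetical names, not the
// actual patch code.
#include <chrono>
#include <condition_variable>
#include <cstdio>
#include <mutex>

class AccountLimiter {
 public:
  explicit AccountLimiter(int max_queries) : max_queries_(max_queries) {}

  // "noforce": block until a slot is free or the timeout expires. Returns
  // false when the query should fail because it timed out in the queue.
  bool schedule_noforce(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lk(mu_);
    if (!cv_.wait_for(lk, timeout, [this] { return running_ < max_queries_; }))
      return false;
    ++running_;
    return true;
  }

  // "force": always admit, even if the limit is exceeded. Used when a thread
  // resumes after a wait (exit_cond, waking from srv_suspend_mysql_thread)
  // and must be allowed to finish its statement.
  void schedule_force() {
    std::lock_guard<std::mutex> lk(mu_);
    ++running_;
  }

  // Deschedule: statement end, before a socket write, on a row lock wait, etc.
  void deschedule() {
    {
      std::lock_guard<std::mutex> lk(mu_);
      --running_;
    }
    cv_.notify_one();
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  int max_queries_;
  int running_ = 0;
};

int main() {
  AccountLimiter limiter(2);  // at most 2 concurrent queries for this account
  if (limiter.schedule_noforce(std::chrono::milliseconds(100))) {
    // ... run the statement ...
    limiter.deschedule();
  } else {
    std::printf("query timed out waiting in the admission queue\n");
  }
  return 0;
}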