Laurynas Biveinis <laurynas.biveinis@gmail.com> writes:
Did you test InnoDB or XtraDB?
It should be XtraDB, which is the default in MariaDB 10 now. This is the version info from univ.i:

  #define INNODB_VERSION_MAJOR 5
  #define INNODB_VERSION_MINOR 6
  #define INNODB_VERSION_BUGFIX 15
  #ifndef PERCONA_INNODB_VERSION
  #define PERCONA_INNODB_VERSION 63.0
  #endif
The affected waits are those that go to wait on events in the sync array(s). No global mutex is used if locking is completed through spinning.
We have implemented priority mutexes/rwlocks in XtraDB for a different issue, but they indirectly help here: they allow high-priority waiters to wait on their own designated event. When the mutex/rwlock is released, only the high-priority waiters are signalled. There are far fewer high-priority waiter threads than regular ones.
We also have innodb_thread_concurrency.

All of these seem to be work-arounds for the underlying problem that the InnoDB locking primitives are fundamentally non-scalable. The global mutex on the sync arrays is bad enough, but wake-all is the real killer, as it creates O(N**2) cost when N threads are waiting on the same lock.

But of course, this is the view of an outsider. I appreciate that the issue is much more complex once one gets down to the real code. The locking primitives are the very core of a complex legacy codebase. And the InnoDB locking primitives provide a lot of status information to the DBA that would not be available from a simple pthread_mutex_t.
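To make the O(N**2) remark concrete, here is a toy count of thread wakeups. It is only a sketch under simplifying assumptions that are mine, not taken from the InnoDB code: N threads are already asleep on the same lock, no new waiters arrive, nobody acquires the lock by spinning, and each thread takes the lock exactly once. The function names are made up for illustration.

```python
def wakeups_wake_all(n_waiters: int) -> int:
    """Count wakeups when every release wakes all remaining waiters."""
    wakeups = 0
    waiting = n_waiters
    while waiting > 0:
        wakeups += waiting  # the release signals the event: every waiter wakes up
        waiting -= 1        # one waiter gets the lock, the rest go back to sleep
    return wakeups          # n + (n-1) + ... + 1 = n*(n+1)/2


def wakeups_wake_one(n_waiters: int) -> int:
    """Count wakeups when each release wakes exactly one waiter."""
    return n_waiters        # one wakeup per lock hand-over


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, wakeups_wake_all(n), wakeups_wake_one(n))
```

Under these assumptions wake-all costs N*(N+1)/2 wakeups against N for wake-one, and that is before counting any contention on the global sync array mutex.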
dict_mem_table_create() creating mutexes and rwlocks all the time is a known issue: http://bugs.mysql.com/bug.php?id=71708. It has been there forever, was made worse in Oracle 5.6.16, and is fully fixed in Percona 5.6.16. Oracle should have a partial fix in 5.6.19 and a full one in 5.7.
Ah, nice, thanks for the pointer!
I wonder if the InnoDB team @ Oracle is doing something for this in 5.7? Does anyone know? I vaguely recall reading something about it, but I am not sure.
5.7 allows different mutex implementations to co-exist, and there is a new implementation that uses futexes. The sync array implementation is still there too. The code pushed so far seems to focus on getting the framework right and adding implementations more than on performance. I'd expect that to change in the later pushes.
Ok, so that sounds promising.
It would seem a waste to duplicate their efforts.
There are Percona's efforts too ;)
Indeed! I wouldn't want to duplicate any of that effort, though with Percona's effort it's a lot easier due to better communication.

It seems that you great people at Percona have a good handle on the InnoDB issues, together with whatever the Oracle InnoDB team might come up with, so it makes sense for me to focus on other stuff. Though it still makes me cry to look at that sync array code in InnoDB...

Thanks,

 - Kristian.