Hi Sergei,

I just realized that I didn't share benchmark results (read-only OLTP that XL did):

    5.6:               ~18k tps
    10.0:              ~9k tps
    10.0 + MDEV-4956:  ~11k tps

I estimate tc_acquire_table and tc_release_table are eating up ~6k tps (2k per list).

Regards,
Sergey

On Tue, Sep 10, 2013 at 09:11:16PM +0200, Sergei Golubchik wrote:
Hi, Sergey!
On Sep 10, Sergey Vojtovich wrote:
Hi Sergei,
thanks for looking into this patch. Frankly speaking, I find it a bit questionable too. Below are links that should answer your questions...

What problem do I attempt to solve:
https://lists.launchpad.net/maria-developers/msg06118.html

How do I attempt to solve it:
https://mariadb.atlassian.net/browse/MDEV-4956
Yes, I've seen and remember both, but they don't answer my question, which was about specific changes that you've done, not about the goal. But ok, see below.
For every statement we acquire a table from the table cache and then release the table back to the cache. That involves updating 3 lists: unused_tables, per-share used_tables and free_tables. These lists are protected by LOCK_open (see tc_acquire_table() and tc_release_table()).
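To make that concrete, here is a simplified sketch of the pattern, assuming intrusive singly-linked lists; the function and list names mirror the ones above, but the structure is illustrative only, not the actual server code:

    /*
      Simplified sketch (not the actual server code): both acquire and
      release funnel through one global mutex, and every release writes
      the single global unused_tables head.
    */
    #include <pthread.h>

    struct TABLE
    {
      TABLE *next_free;     /* link in per-share free_tables */
      TABLE *next_used;     /* link in per-share used_tables */
      TABLE *next_unused;   /* link in global unused_tables */
    };

    struct TABLE_SHARE
    {
      TABLE *used_tables;   /* per-share list of TABLEs in use */
      TABLE *free_tables;   /* per-share list of TABLEs ready for reuse */
    };

    static pthread_mutex_t LOCK_open= PTHREAD_MUTEX_INITIALIZER;
    static TABLE *unused_tables;  /* single global head: every thread writes it */

    TABLE *tc_acquire_table(TABLE_SHARE *share)
    {
      pthread_mutex_lock(&LOCK_open);
      TABLE *t= share->free_tables;
      if (t)
      {
        share->free_tables= t->next_free;   /* update per-share free list */
        t->next_used= share->used_tables;   /* update per-share used list */
        share->used_tables= t;
        /* the real code also unlinks t from unused_tables here */
      }
      pthread_mutex_unlock(&LOCK_open);
      return t;
    }

    void tc_release_table(TABLE_SHARE *share, TABLE *t)
    {
      pthread_mutex_lock(&LOCK_open);
      /* unlinking t from used_tables is omitted for brevity */
      t->next_free= share->free_tables;     /* per-share free list */
      share->free_tables= t;
      t->next_unused= unused_tables;        /* write to the global head: */
      unused_tables= t;                     /* this is the contended cache line */
      pthread_mutex_unlock(&LOCK_open);
    }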
Why are the per-share lists updated under the global mutex?
Every time we update a global pointer, the corresponding cache lines of sibling CPUs have to be invalidated. This causes expensive memory reads while LOCK_open is held.
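As an aside, the effect is easy to reproduce outside the server. A minimal stand-alone microbenchmark (everything here is made up for the illustration, nothing comes from the server code): eight threads hammering one shared location versus each thread writing its own cache-line-padded slot.

    /*
      Illustration only: the shared counter's cache line bounces between
      cores on every write; the padded per-thread slots stay core-local.
    */
    #include <atomic>
    #include <chrono>
    #include <cstdio>
    #include <thread>
    #include <vector>

    static std::atomic<long> shared_head{0};        /* one "global pointer" */

    struct alignas(64) PaddedSlot { std::atomic<long> v{0}; };
    static PaddedSlot per_thread[64];               /* one cache line each */

    static void contended(int, int iters)
    {
      for (int i= 0; i < iters; i++)
        shared_head.fetch_add(1, std::memory_order_relaxed);
    }

    static void partitioned(int id, int iters)
    {
      for (int i= 0; i < iters; i++)
        per_thread[id].v.fetch_add(1, std::memory_order_relaxed);
    }

    int main()
    {
      const int nthreads= 8, iters= 10 * 1000 * 1000;
      auto run= [&](void (*fn)(int, int), const char *name)
      {
        auto start= std::chrono::steady_clock::now();
        std::vector<std::thread> threads;
        for (int i= 0; i < nthreads; i++)
          threads.emplace_back(fn, i, iters);
        for (auto &t : threads)
          t.join();
        auto ms= std::chrono::duration_cast<std::chrono::milliseconds>(
                   std::chrono::steady_clock::now() - start).count();
        std::printf("%-14s %lld ms\n", name, (long long) ms);
      };
      run(contended,   "shared head:");
      run(partitioned, "padded slots:");
    }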
Oracle solved this problem by partitioning the table cache, which allows emulating something like per-CPU lists.
We attempted to split LOCK_open logically, and succeeded at everything but these 3 lists. I attempted a lock-free list for free_tables, but the TPS rate didn't improve.
How did you do the lock-free list, could you show, please?
What we need is to reduce the number of these expensive memory reads, and there are two solutions: partition these lists or get rid of them. As we agreed not to partition, I'm trying the latter.
Well, you can partition the list. With 32 list head pointers. And a thread adding a table only to "this thread's" list. Of course, it's not complete partitioning between CPUs, as any thread can remove a table from any list. But at least there won't be one global list head pointer.
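Roughly, that could look like the following (a sketch of the idea only, not existing code; TC_PARTITIONS, tc_insert()/tc_remove() and the per-partition mutex are names made up for the illustration):

    /*
      32 list heads, each under its own mutex. A thread inserts only into
      "its" partition; a removal may touch any partition, so every node
      remembers which head it hangs off.
    */
    #include <atomic>
    #include <mutex>

    static const int TC_PARTITIONS= 32;

    struct TABLE
    {
      TABLE *next= nullptr;
      TABLE **prev= nullptr;   /* doubly linked for O(1) removal by any thread */
      int partition= 0;
    };

    struct TC_partition
    {
      std::mutex lock;
      TABLE *head= nullptr;
    };

    static TC_partition tc_free[TC_PARTITIONS];  /* e.g. replacing one global head */

    static int my_partition()                    /* stable id per thread */
    {
      static std::atomic<int> counter{0};
      static thread_local int id= counter.fetch_add(1) % TC_PARTITIONS;
      return id;
    }

    void tc_insert(TABLE *t)                     /* only into "this thread's" list */
    {
      TC_partition *p= &tc_free[t->partition= my_partition()];
      std::lock_guard<std::mutex> g(p->lock);
      t->next= p->head;
      t->prev= &p->head;
      if (p->head)
        p->head->prev= &t->next;
      p->head= t;
    }

    void tc_remove(TABLE *t)                     /* may be called by any thread */
    {
      TC_partition *p= &tc_free[t->partition];
      std::lock_guard<std::mutex> g(p->lock);
      *t->prev= t->next;
      if (t->next)
        t->next->prev= t->prev;
    }

Removal still locks the owning partition, so the benefit is mainly on the insert path, which matches the "not complete partitioning between CPUs" caveat above.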
Why do I find this patch questionable? It reduces LOCK_open wait time by 30%; to get close to Oracle's wait time, we need to reduce it by 90%. We could remove unused_tables as well, but that would be 60%, not 90%.
Hmm, if you're only interested in optimizing this specific use case - one table, many threads - then yes, maybe. But if you have many tables, then modifying per-share lists under the share's own mutex is, basically, a must.
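For illustration, moving these lists under a per-share mutex could look roughly like this (a sketch under assumptions: the LOCK_share member name and the simplified singly-linked lists are made up here, not the shape of any actual patch):

    /*
      Per-share locking sketch: LOCK_open is no longer taken to move a
      TABLE between a share's own used/free lists, so two threads only
      contend when they touch the same share.
    */
    #include <mutex>

    struct TABLE { TABLE *next= nullptr; };

    struct TABLE_SHARE
    {
      std::mutex LOCK_share;       /* hypothetical per-share mutex */
      TABLE *used_tables= nullptr;
      TABLE *free_tables= nullptr;
    };

    TABLE *tc_acquire_table(TABLE_SHARE *share)
    {
      std::lock_guard<std::mutex> g(share->LOCK_share);
      TABLE *t= share->free_tables;
      if (t)
      {
        share->free_tables= t->next;
        t->next= share->used_tables;
        share->used_tables= t;
      }
      return t;
    }

    void tc_release_table(TABLE_SHARE *share, TABLE *t)
    {
      std::lock_guard<std::mutex> g(share->LOCK_share);
      /* removal from used_tables omitted for brevity */
      t->next= share->free_tables;
      share->free_tables= t;
    }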
Regards,
Sergei