Hi Sergei,

Comments inline, and a question: in a specific case 10.0 throughput is
half that of 5.6. It is known to be caused by tc_acquire_table() and
tc_release_table(). Do we want to fix it? If yes, how?

On Thu, Sep 12, 2013 at 10:13:30PM +0200, Sergei Golubchik wrote:
> Hi, Sergey!
>
> On Sep 11, Sergey Vojtovich wrote:
> > > > For every statement we acquire a table from the table cache and
> > > > then release it back to the cache. That involves updating 3 lists:
> > > > unused_tables, per-share used_tables and free_tables. These lists
> > > > are protected by LOCK_open (see tc_acquire_table() and
> > > > tc_release_table()).
> > > Why are the per-share lists updated under the global mutex?
> > I would have done that already if it gave us a considerable
> > performance gain. Alas, it doesn't solve the CPU cache coherence
> > problem.
> It doesn't solve the CPU cache coherence problem, yes. And it doesn't
> help if you have only one hot table.
>
> But it certainly helps if many threads access many tables.

Ok, let's agree to agree: it will help in certain cases. Most probably
it won't improve the situation much if all threads access a single
table.
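To make the discussion concrete, here is a minimal model of the
acquire/release pattern quoted at the top of this mail. It is my sketch,
not the server code (the real lists are intrusive doubly-linked lists
and the lock is a mysql_mutex_t), but it shows every statement taking
one global mutex twice and touching all three list heads each time:

  #include <list>
  #include <mutex>

  struct TABLE_SHARE_model;

  struct TABLE_model {
    TABLE_SHARE_model *share;
  };

  struct TABLE_SHARE_model {
    std::list<TABLE_model*> used_tables;  // per-share lists...
    std::list<TABLE_model*> free_tables;  // ...updated under the global lock
  };

  std::mutex LOCK_open_model;                   // stands in for LOCK_open
  std::list<TABLE_model*> unused_tables_model;  // global unused-TABLE list

  TABLE_model *tc_acquire_table_model(TABLE_SHARE_model *share) {
    std::lock_guard<std::mutex> g(LOCK_open_model);
    if (share->free_tables.empty())
      return nullptr;                     // caller would open a new TABLE
    TABLE_model *t = share->free_tables.front();
    share->free_tables.pop_front();       // list update 1
    unused_tables_model.remove(t);        // list update 2 (O(n) only here;
                                          // the real list is intrusive)
    share->used_tables.push_front(t);     // list update 3
    return t;
  }

  void tc_release_table_model(TABLE_model *t) {
    std::lock_guard<std::mutex> g(LOCK_open_model);
    t->share->used_tables.remove(t);      // same three lists, in reverse
    t->share->free_tables.push_front(t);
    unused_tables_model.push_front(t);
  }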
We could try to ensure that the per-share mutex is on the same cache
line as the free_tables and used_tables list heads. In this case I guess
mysql_mutex_lock(&share->tdc.LOCK_table_share) would load the list heads
into the CPU cache along with the mutex structure. OTOH we still have to
read the per-TABLE prev/next pointers. And in 5.6 the per-partition
mutex should fall out of the CPU cache less frequently than our
per-share mutex would. Worth trying?
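Something like this is what I have in mind (a sketch with std::mutex for
self-containment; the real field is a mysql_mutex_t, and 64 bytes
assumes a typical x86-64 cache line):

  #include <mutex>

  struct TABLE;                    // opaque here

  // Keep the mutex and both list heads within one 64-byte cache line,
  // so that locking LOCK_table_share loads the list heads "for free".
  struct alignas(64) Share_tdc_sketch {
    std::mutex LOCK_table_share;   // 40 bytes on typical Linux/glibc
    TABLE *used_tables;            // per-share list head
    TABLE *free_tables;            // per-share list head
  };

  // Caveat from above: each TABLE's prev/next pointers still live on
  // other cache lines, so this only saves the misses on the heads.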
> How did you do the lock-free list, could you show, please?

Please find it attached. It is mixed with different changes, just search
for my_atomic_casptr.
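For the list archives, since the code is only attached: the general
shape of a CAS-based free list is a Treiber stack. A sketch in
std::atomic terms (the patch itself uses my_atomic_casptr; all names
below are mine):

  #include <atomic>

  struct TABLE_node { TABLE_node *next; };  // stand-in for TABLE

  std::atomic<TABLE_node*> free_list{nullptr};

  // Push a released table onto the lock-free list.
  void lf_release(TABLE_node *t) {
    TABLE_node *head = free_list.load(std::memory_order_relaxed);
    do {
      t->next = head;                       // link above the current head
    } while (!free_list.compare_exchange_weak(head, t,
                                              std::memory_order_release,
                                              std::memory_order_relaxed));
  }

  // Pop a free table, or return nullptr if the list is empty.
  // Note: reading head->next is where a real implementation must handle
  // ABA and concurrent frees; this sketch glosses over that.
  TABLE_node *lf_acquire() {
    TABLE_node *head = free_list.load(std::memory_order_acquire);
    while (head &&
           !free_list.compare_exchange_weak(head, head->next,
                                            std::memory_order_acquire,
                                            std::memory_order_acquire))
      ;
    return head;
  }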
> > > > What we need is to reduce the number of these expensive memory
> > > > reads, and there are two solutions: partition these lists or get
> > > > rid of them. As we agreed not to partition, I'm trying the latter
> > > > solution.
> > > Well, you can partition the list. With 32 list head pointers. And a
> > > thread adding a table only to "this thread's" list. Of course, it's
> > > not complete partitioning between CPUs, as any thread can remove a
> > > table from any list. But at least there won't be one global list
> > > head pointer.
> >
> > Yes, that's what Oracle did and what we're trying to avoid.
> I thought they'd partitioned the TDC itself. And sometimes they need to
> lock all the partitions. If you only partition the unused_tables list,
> the TDC is shared by all threads and you always lock only one
> unused_tables list, never all of them.
Since they didn't split the locks logically, yes, they had to do a more
complex solution: they have a global hash of TABLE_SHARE objects
(protected by LOCK_open) + a per-partition hash of Table_cache_element
objects (protected by a per-partition lock):

  class Table_cache_element
  {
    TABLE_list used_tables;
    TABLE_list free_tables;
    TABLE_SHARE *share;
  };

  class Table_cache                // a table cache partition
  {
    mysql_mutex_t m_lock;
    HASH m_cache;                  // collection of Table_cache_element objects
    TABLE *m_unused_tables;
    uint m_table_count;
  };

Except for "m_cache", the per-partition mutex protects exactly what is
protected by our LOCK_open currently.

Thanks,
Sergey
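P.S. Purely for illustration, a sketch of the "32 list heads"
partitioning discussed above (my code, not from any patch): each thread
pushes to "its own" partition, but removal may touch any partition, so
it divides the contention rather than eliminating it.

  #include <mutex>

  struct TABLE_node { TABLE_node *next; };  // stand-in for TABLE

  static const unsigned N_PARTS = 32;

  struct Unused_partition {
    std::mutex lock;             // per-partition lock instead of LOCK_open
    TABLE_node *head = nullptr;  // this partition's unused_tables head
  };

  static Unused_partition unused_parts[N_PARTS];

  // A thread always pushes to "its own" partition.
  void part_release(TABLE_node *t, unsigned thread_id) {
    Unused_partition &p = unused_parts[thread_id % N_PARTS];
    std::lock_guard<std::mutex> g(p.lock);
    t->next = p.head;
    p.head = t;
  }

  // But it may have to scan other partitions for a free table, so this
  // is not a complete per-CPU split: it spreads the lock and cache-line
  // traffic 32 ways instead of removing it.
  TABLE_node *part_acquire(unsigned thread_id) {
    for (unsigned i = 0; i < N_PARTS; i++) {
      Unused_partition &p = unused_parts[(thread_id + i) % N_PARTS];
      std::lock_guard<std::mutex> g(p.lock);
      if (p.head) {
        TABLE_node *t = p.head;
        p.head = t->next;
        return t;
      }
    }
    return nullptr;              // caller opens a new TABLE
  }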