I'm not closing them explicitly. The process running on the machine does the following (summarized):

Management event:

  STOP SLAVE 'slave-name';
  FLUSH TABLES replicated_table;
  DROP TABLE replicated_table_batch;
  ALTER TABLE replicated_table RENAME TO replicated_table_batch;
  CREATE TABLE replicated_table LIKE replicated_table_batch;
  START SLAVE 'slave-name';

Processing thread (the one that gets stuck):

  INSERT INTO accumulating_table
  SELECT ... FROM replicated_table_batch ...
  ON DUPLICATE KEY UPDATE ...

with a few triggers on accumulating_table running to update a few other summary tables. There is a status table that maintains state and makes sure replicated_table/replicated_table_batch doesn't get rotated while the processing is running. The management event NEVER runs while the processing event is running, so the table flushes it does shouldn't be related (and if they were happening I'd expect to see them in the SHOW FULL PROCESSLIST output, which is not the case).

As an aside, there are interesting side-effects of replication on transactions when you start renaming tables. If you use an InnoDB table, stopping the slave stops it mid-transaction. After the table rename, the outcome becomes highly "undefined": I have not been able to establish whether the transaction continues on the old, renamed table or on the new table with the same name that was created in the meantime. This throws data consistency out of sync if there are multiple processes running on the rename target table, as some transactions appear to arrive after the rename. With a non-transactional table type (e.g. MyISAM) this doesn't happen - the slave thread knows to continue committing until the end of the transaction before returning. Arguably, this should also happen for transaction-capable tables (or at least the behaviour should be configurable in this regard). Similarly for ALTER TABLE ... RENAME TO ... - it should probably ensure a clean break in the chain of transaction commits so that pre-rename transactions don't arrive after the rename has completed. This could be considered a bug, but perhaps it is worthy of a discussion WRT what the correct behaviour _should_ be.

As for the variables - any particular variables you're interested in? The full list is quite large.

Gordan

On Wed, 20 Mar 2013 14:47:29 +0100, Jesper Staun Hansen <jesper@staunhansen.dk> wrote:
Hello Gordan,
What does your system variables look like? What I am wondering about myself is why you're even closing the tables.
Jesper
On 20-03-2013 14:25, Gordan Bobic wrote:
Just as an additional data point, the query does seem to eventually complete (and the transaction it is in commits) after about 2 hours. But 2 hours seems an excessively long time for "closing tables". It is a simple INSERT ... SELECT ... ON DUPLICATE KEY UPDATE, with a few triggers on the table being inserted into firing off. The SELECT on its own takes only seconds to complete. The UPDATEs fired off by the triggers complete in milliseconds.
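For reference, the shape of that workload (a rotated batch table aggregated into an accumulating table whose trigger maintains a summary table) can be sketched as below. This is only an illustration, not the real setup: Python's sqlite3 stands in for MariaDB, sqlite spells ON DUPLICATE KEY UPDATE as ON CONFLICT ... DO UPDATE, and every table, column and trigger name here is hypothetical rather than taken from the actual schema.

```python
# Illustration only: sqlite3 stands in for MariaDB, and all names here
# (replicated_table_batch, accumulating_table, summary_table, acc_after_insert)
# are hypothetical, not taken from the real schema.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# A freshly rotated batch table, the accumulating table, and a trigger that
# maintains a summary table - mirroring the shape of the workload described.
cur.executescript("""
CREATE TABLE replicated_table_batch (k INTEGER, v INTEGER);
CREATE TABLE accumulating_table (k INTEGER PRIMARY KEY, v INTEGER);
CREATE TABLE summary_table (id INTEGER PRIMARY KEY, total INTEGER);
INSERT INTO summary_table VALUES (1, 0);
CREATE TRIGGER acc_after_insert AFTER INSERT ON accumulating_table
BEGIN
    UPDATE summary_table SET total = total + NEW.v WHERE id = 1;
END;
""")

# 1000 batch rows spread over 100 keys, v = 1 each.
cur.executemany("INSERT INTO replicated_table_batch VALUES (?, ?)",
                [(i % 100, 1) for i in range(1000)])

# The statement shape that gets stuck in MariaDB. sqlite's upsert syntax is
# ON CONFLICT ... DO UPDATE, and the "WHERE true" is required to keep the
# parser from reading ON as a join constraint.
upsert = """
INSERT INTO accumulating_table
SELECT k, SUM(v) FROM replicated_table_batch WHERE true GROUP BY k
ON CONFLICT(k) DO UPDATE SET v = v + excluded.v
"""
cur.execute(upsert)
conn.commit()
print(cur.execute("SELECT COUNT(*) FROM accumulating_table").fetchone()[0])  # 100
print(cur.execute("SELECT total FROM summary_table").fetchone()[0])          # 1000

# Re-running the same statement exercises the duplicate-key path; note the
# AFTER INSERT trigger does not fire for rows taken by the DO UPDATE branch.
cur.execute(upsert)
conn.commit()
print(cur.execute("SELECT v FROM accumulating_table WHERE k = 0").fetchone()[0])  # 20
print(cur.execute("SELECT total FROM summary_table").fetchone()[0])               # still 1000
```

On this toy data the INSERT and the per-row trigger firings complete instantly, of course; the sketch only shows the statement shape, it does not reproduce the stall.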
Gordan
On Wed, 20 Mar 2013 12:51:50 +0000, Gordan Bobic <gordan@bobich.net> wrote:
On Wed, 20 Mar 2013 13:20:28 +0100, Sergei Golubchik <serg@askmonty.org> wrote:
Hi, Gordan!
On Mar 20, Gordan Bobic wrote:
On Mar 20, Gordan Bobic wrote:
> I have a situation with MariaDB 10.0.0 where an INSERT INTO ...
> SELECT query that takes 3 seconds to run ends up stuck in the
> closing tables state for 15+ minutes while burning 100% of CPU.
> Stack trace attached in a text file.

Does this provide enough information for a worthwhile bug report?
Was this stack trace created when you had your problem with a stuck query?
Yes. And the same query is still stuck in the same state 35 minutes later.
I have attached a fresh stack trace.
Because it doesn't look like there's any thread stuck in the "closing table" state.
In which case the state shown in SHOW FULL PROCESSLIST is incorrect. Not that I think that helps.
There's one event running, a couple of idle connection threads, a couple of dead connection threads (waiting in the thread cache for resurrection), and internal service threads: aria, innodb, signal thread, etc.
The one event that is running must be it, then. I specifically disabled everything else on the box (slaves, event scheduler) to get a clean trace.
Gordan
_______________________________________________
Mailing list: https://launchpad.net/~maria-developers
Post to     : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp