Hi!
"Kristian" == Kristian Nielsen <knielsen@knielsen-hq.org> writes:
<cut>
The problem is the following:
The alarm code now makes sure that we don't send the signal if we are not waiting for it; it may not be safe for the thread to receive the kill signal at any point in time (for example in thread engine code, which we don't want to interrupt).
Kristian> I do not see any problems sending the signal at any time. Of course, there
Kristian> should be an appropriate handler set up (so we do not kill ourselves), but any
Kristian> interruptible system call (like socket read()/write()) should in any case be
Kristian> coded in a way that is safe for EAGAIN interruption. But maybe I did not
Kristian> understand what particular problem you had in mind, not sure what you mean by
Kristian> "thread engine code" and why we do not want to interrupt it.

It depends on how good all the other libraries in use are.
For example, assume that we send a signal while a storage engine is
doing a read on a file. There is a notable chance that the storage
engine will not retry the read in case of interrupts, especially if it
uses some library to do its reads/writes. This is because in normal
cases one never gets a signal during read/write in MySQL.
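To make the concern concrete, here is a minimal sketch in C of what "retrying the read in case of interrupts" means. The helper name read_retry_eintr() is hypothetical and not taken from the MySQL source; the sketch only relies on the POSIX behaviour that a read() interrupted by a signal handler before any data is transferred fails with errno set to EINTR. A library that calls read() directly, without such a loop, would report the interruption as an I/O error.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the MySQL source): a read() that is
 * restarted when a signal handler interrupts it before any data has
 * been transferred. Other errors and short reads are returned as-is.
 */
ssize_t read_retry_eintr(int fd, void *buf, size_t count)
{
    ssize_t n;

    do {
        n = read(fd, buf, count);
    } while (n < 0 && errno == EINTR);   /* interrupted: try again */

    return n;
}
```

Note that a loop like this also swallows the wake-up, so code that is supposed to react to a kill would have to check a kill flag before retrying.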
The alarm code makes sure that the signal is never missed. For example, if we would send the signal just before we enter read with SO_SNDTIMEO, the thread would miss the signal and the 'kill command' would not have any effect.
Kristian> Yes, you are right, this would be prone to races with missed signal.
Kristian> One option might be to call shutdown(2) on the socket, and then send the
Kristian> signal. But this only works for killing the connection, not for just killing a
Kristian> query. So not sure if this is a good idea.

Yes, we can't use shutdown() as we also want to be able to just kill
queries. The other problem is that if we do a shutdown() we can't
tell the client that we did a 'graceful kill' and it didn't hit a bug.
To solve this, we would need to add the following mechanism:
- Add a flag to THD that signals if we are in a read() call on a connection. This flag should be modified under a mutex to ensure that the 'kill thread-id' code knows if it should send a signal or not.
Kristian> I did not understand why it is important not to send a signal if we are not in
Kristian> read().
Kristian> (Protecting with a mutex seems a bit of a problem, as I think there is no way
Kristian> to atomically unlock the mutex and initiate the read() call?)

The above is needed to ensure that we really get a signal during read
and we don't miss it.

Pseudo code:

Thread 1:

  get_mutex();
  thd->in_read= 1;
  release_mutex();
  if (!thd->killed)
    read();
  get_mutex();
  thd->in_read= 0;
  release_mutex();

The mutex would of course be a local mutex so there is never a
conflict from this, except if someone wants to send a kill signal.

When sending a kill in thread 2:

  do
  {
    get_mutex();
    in_read= thd->in_read;
    thd->killed= 1;
    release_mutex();
    if (!in_read)
      break;
    send_kill();
    sleep(1);
  } while (1);

As you see, we don't need to hold the mutex over the read. We do
however need the mutex to ensure that we don't miss the kill signal
whatever happens. In the above code, a single kill signal may be
missed, but this is ok as we will retry until thread 2 succeeds in
breaking the read.

Without a mutex, there is a chance that thread 2 will not detect that
thread 1 is about to do a read and will just set the killed flag, while
thread 1 may not see the killed flag but instead block in the read.
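The pseudo code above could be sketched with POSIX threads roughly as follows. This is only an illustration under stated assumptions: struct conn is a hypothetical stand-in for the relevant part of THD, the function names are made up, SIGUSR1 is an arbitrary choice for the wake-up signal, and the killed flag is sampled inside the same critical section that sets in_read (equivalent to the pseudo code, but it avoids reading the flag without the lock). The handler is installed without SA_RESTART so that a blocked read() returns with EINTR instead of being restarted by the kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for the relevant fields of THD. */
struct conn {
    pthread_mutex_t lock;     /* the "local mutex" guarding the two flags */
    pthread_t       thread;   /* thread that performs the read            */
    int             in_read;  /* 1 while the thread may block in read()   */
    int             killed;   /* set once by the killer                   */
};

/* Empty handler: its only job is to make a blocked read() fail with EINTR. */
static void wake_handler(int sig) { (void) sig; }

/* Install the handler without SA_RESTART so read() is not auto-restarted. */
int install_wake_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = wake_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Thread 1: announce the read, skip it if already killed, then clear the flag. */
ssize_t guarded_read(struct conn *c, int fd, void *buf, size_t len)
{
    ssize_t n = -1;
    int killed;

    pthread_mutex_lock(&c->lock);
    c->in_read = 1;
    killed = c->killed;
    pthread_mutex_unlock(&c->lock);

    if (!killed)
        n = read(fd, buf, len);   /* may return -1 with errno == EINTR */
    else
        errno = EINTR;            /* report the kill like an interrupted read */

    pthread_mutex_lock(&c->lock);
    c->in_read = 0;
    pthread_mutex_unlock(&c->lock);
    return n;
}

/* Thread 2: set killed, then keep signalling until the reader has left read(). */
void kill_conn(struct conn *c)
{
    for (;;) {
        int in_read;

        pthread_mutex_lock(&c->lock);
        in_read = c->in_read;
        c->killed = 1;
        pthread_mutex_unlock(&c->lock);

        if (!in_read)
            break;
        pthread_kill(c->thread, SIGUSR1);
        sleep(1);                 /* give the reader time to react, then recheck */
    }
}
```

If the killer runs its critical section first, the reader sees killed and never enters read(); if the reader goes first, the killer sees in_read and keeps signalling until the flag is cleared. A signal that arrives between releasing the mutex and entering read() is lost, which is why the loop retries.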
- The kill code should send multiple kill commands to the thread, until the 'read()' flag changes state to 'not in read'.
Kristian> If this is acceptable (looping, sending kill and waiting a bit for the thread
Kristian> to respond), then the race can be solved easily enough this way.

Yes, but you need a mutex to make this foolproof.

Regards,
Monty