Michael Widenius <monty@askmonty.org> writes:
"Kristian" == Kristian Nielsen <knielsen@knielsen-hq.org> writes:
Kristian> Well, I checked the code, and it seems to wake up the thread using
Kristian> pthread_kill(thread, signal) for the 'kill connection-id' command. This should
Kristian> work fine also when using SO_SNDTIMEO for timeouts on the socket.

Kristian> Just send the signal to the thread blocking on the socket with SO_SNDTIMEO,
Kristian> and the blocking socket call will return with EAGAIN or similar.
> It's not that easy.
> The problem is the following:
> The alarm code now makes sure that we don't send the signal if we are not waiting for it; it may not be safe for the thread to receive the kill signal at any point in time (for example in thread engine code, which we don't want to interrupt).
I do not see any problems sending the signal at any time. Of course, there should be an appropriate handler set up (so we do not kill ourselves), but any interruptible system call (like socket read()/write()) should in any case be coded in a way that is safe for EINTR or EAGAIN interruption. But maybe I did not understand what particular problem you had in mind; I am not sure what you mean by "thread engine code" and why we do not want to interrupt it.
> The alarm code makes sure that the signal is never missed. For example, if we sent the signal just before we enter read with SO_SNDTIMEO, the thread would miss the signal and the 'kill command' would not have any effect.
Yes, you are right, this would be prone to races with a missed signal. One option might be to call shutdown(2) on the socket, and then send the signal. But this only works for killing the connection, not for just killing a query. So I am not sure this is a good idea.
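A rough sketch of the shutdown(2)-then-signal idea, assuming a hypothetical helper kill_connection() (not an existing server function). Once shutdown() has run, a recv() that is entered later returns 0 (end of file) at once instead of blocking, so a signal that arrives too early no longer matters. The signal is still sent to wake a thread that is already blocked.

```c
#include <pthread.h>
#include <signal.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: make all further reads on fd fail fast, then wake
 * the thread that may currently be blocked on it.
 * Assumes a handler for signo is installed (or the signal is ignored).
 * Returns 0 on success, -1 if shutdown() failed, or the pthread_kill()
 * error number otherwise.
 */
int kill_connection(int fd, pthread_t thread, int signo)
{
    /* After this, recv() on fd returns 0 immediately. */
    if (shutdown(fd, SHUT_RDWR) != 0)
        return -1;

    /* Wake the thread in case it is already inside a blocking call. */
    return pthread_kill(thread, signo);
}
```

As noted, this closes the connection for good, so it cannot be used to abort only the running query.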
> To solve this, we would need to add the following mechanism:
> - Add a flag to THD that signals if we are in a read() call on a connection. This flag should be modified under a mutex to ensure that the 'kill thread-id' code knows if it should send a signal or not.
I did not understand why it is important not to send a signal if we are not in read(). (Protecting with a mutex seems a bit of a problem, as I think there is no way to atomically unlock the mutex and initiate the read() call?)
> - The kill code should send multiple kill commands to the thread, until the 'read()' flag changes state to 'not in read'.
If this is acceptable (looping, sending kill and waiting a bit for the thread to respond), then the race can be solved easily enough this way.
> Possible to do, but still a little bit of work and test.
Yes.

 - Kristian.