Re: [Maria-developers] Review request: SHOW EXPLAIN

26 Jun 2012

Hi Sergei,

I've made notes from review phone call, as well as replied to some
comments below. (Comments that have been addressed were silently deletedx ,
comments that are without reply were not adressed are still on my todo).
...
...
=== added file 'mysql-test/r/show_explain.result'

--- mysql-test/r/show_explain.result	1970-01-01 00:00:00 +0000
+++ mysql-test/r/show_explain.result	2012-06-19 09:51:33 +0000
@@ -0,0 +1,698 @@
+max(c)
+9
+# We can catch EXPLAIN, too.
+set @show_expl_tmp= @@optimizer_switch;
+set optimizer_switch='index_condition_pushdown=on,mrr=on,mrr_sort_keys=on';
+explain select max(c) from t1 where a < 10;
+show explain for $thr2;
+id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
+1	SIMPLE	t1	range	a	a	5	NULL	10	Using index condition; Rowid-ordered scan
+Warnings:
+Note	1003	explain select max(c) from t1 where a < 10
+id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
+1	SIMPLE	t1	range	a	a	5	NULL	10	Using index condition; Rowid-ordered scan
+set optimizer_switch= @show_expl_tmp;
+# UNION, first branch 
+set @show_explain_probe_select_id=1;
+set debug_dbug='d,show_explain_probe_join_exec_start';
generally, we have debug_sync facility for synchronizing threads.
it should be much easier and less error prone to use, than coding
around dbug.
would debug_sync work here?
As agreed on the phone: I was unable to make use of debug sync for this
purpose, SergeiG will investigate.
...
...
+explain select a from t0 A union select a+1 from t0 B;
+show explain for $thr2;
+id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
+1	PRIMARY	A	ALL	NULL	NULL	NULL	NULL	10	
+2	UNION	B	ALL	NULL	NULL	NULL	NULL	10	
+NULL	UNION RESULT	<union1,2>	ALL	NULL	NULL	NULL	NULL	NULL	
+Warnings:
+Note	1003	explain select a from t0 A union select a+1 from t0 B
=== modified file 'sql/filesort.cc'
--- sql/filesort.cc	2012-05-21 13:30:25 +0000
+++ sql/filesort.cc	2012-06-12 19:04:11 +0000
@@ -1231,18 +1235,18 @@ int merge_buffers(SORTPARAM *param, IO_C
   void *first_cmp_arg;
   element_count dupl_count= 0;
   uchar *src;
-  killed_state not_killable;
   uchar *unique_buff= param->unique_buff;
-  volatile killed_state *killed= ¤t_thd->killed;
+  const bool killable= !param->not_killable;
+  THD* const thd=current_thd;
This "not_killable" thing is only used once, in uniques.cc. Added in 2001.
Can you ask Monty why filesort shouldn't be killed if
invoked from uniques.cc? I'd rather prefer to remove this
code, than to let it drag along for another eleven years
Ok, I can ask Monty and remove not_killable after the SHOW EXPLAIN work is
done. I do not want to mix SHOW EXPLAIN patch with random changes in filesort.
...
...
=== modified file 'sql/sql_select.cc'
--- sql/sql_select.cc	2012-06-04 15:26:11 +0000
+++ sql/sql_select.cc	2012-06-12 19:05:07 +0000
@@ -272,6 +272,53 @@ Item_equal *find_item_equal(COND_EQUAL *
 JOIN_TAB *first_depth_first_tab(JOIN* join);
 JOIN_TAB *next_depth_first_tab(JOIN* join, JOIN_TAB* tab);
+#ifndef DBUG_OFF
+// psergey:
+void dbug_serve_apcs(THD *thd, int n_calls)
+{
+  // TODO how do we signal that we're SHOW-EXPLAIN-READY? 
+  const char *save_proc_info= thd->proc_info;
+  thd_proc_info(thd, "show_explain_trap");
+  
+  int n_apcs= thd->apc_target.n_calls_processed + n_calls;
+  while (thd->apc_target.n_calls_processed < n_apcs)
+  {
+    my_sleep(300);
Argh. Sleeps in the test. debug_sync was created to allow tests
with reliable syncronization and without sleeps.
This seems to be the only way with current synchonization scheme. If we use
dbug sync this may change.

Currently show_explain.test takes 3.5 sec on the laptop, for 76 SHOW EXPLAIN
commands.
...
...
@@ -7861,6 +7961,13 @@ static bool create_ref_for_key(JOIN *joi
       if (keyuse->null_rejecting) 
         j->ref.null_rejecting |= 1 << i;
       keyuse_uses_no_tables= keyuse_uses_no_tables && !keyuse->used_tables;
+      /*
+        Todo: we should remove this check for thd->lex->describe on the next
+        line. With SHOW EXPLAIN code, EXPLAIN printout code no longer depends
+        on it. However, removing the check caused change in lots of query
+        plans! Does the optimizer depend on the contents of
+        table_ref->key_copy ? If yes, do we produce incorrect EXPLAINs? 
+      */
Are you going to do something about it, or just leave the comment around?
I'll just leave a comment, for now. I don't want SHOW EXPLAIN patch to include
a rewrite of parts of the optimizer.  I think, eventually we will figure out
the reason for the difference and remove dependency on thd->lex->describe.
...
...
=== modified file 'sql/sql_class.h'
--- sql/sql_class.h	2012-05-21 13:30:25 +0000
+++ sql/sql_class.h	2012-06-19 14:08:12 +0000
@@ -3243,6 +3318,37 @@ public:
/*
+  A select result sink that collects the sent data and then can flush it to
+  network when requested.
+
+  This class is targeted at collecting EXPLAIN output:
+  - Unoptimized data storage (can't handle big datasets)
+  - Unlike select_result class, we don't assume that the sent data is an 
+    output of a SELECT_LEX_UNIT (and so we dont apply "LIMIT x,y" from the
+    unit)
+*/
+
+class select_result_explain_buffer : public select_result_sink
Eh... Why haven't you done it as an I_S table?
It doesn't have to be available via SELECT * FROM I_S.xxx
but it should still use the same interface.
Why did you prefer to code a completely new way of implementing SHOW
commands?
(As discussed on the phone) I'll look at using I_S tables and SHOW command
handling for storing EXPLAIN output.
The rationale for this is "we've had switch other SHOW commands to using I_S".
...
...
+{
+public:
+  THD *thd;
+  Protocol *protocol;
+  select_result_explain_buffer(){};
+
+  /* The following is called in the child thread: */
+  int send_data(List<Item> &items);
+
+  /* this will be called in the parent thread: */
+  void flush_data();
+
+  void discard_data();
+
+  List<String> data_rows;
+};
+
+
+
+/*
   Base class for select_result descendands which intercept and
   transform result set rows. As the rows are not sent to the client,
   sending of result set metadata should be suppressed as well.
=== modified file 'sql/protocol.h'
--- sql/protocol.h	2012-01-13 14:50:02 +0000
+++ sql/protocol.h	2012-06-12 19:04:11 +0000
@@ -73,6 +78,20 @@ public:
   virtual bool send_result_set_metadata(List<Item> *list, uint flags);
   bool send_result_set_row(List<Item> *row_items);
+  void get_packet(const char **start, size_t *length) 
+  {
+    *start= packet->ptr();
+    *length= packet->length(); 
+  }
+  void set_packet(const char *start, size_t len)
+  {
+    packet->length(0);
+    packet->append(start, len);
+#ifndef DBUG_OFF
+  field_pos= field_count - 1;
+#endif
+  }
This is part of your select_result_explain_buffer code.
I don't understand the question? Anyway, this part is likely to be gone
after SHOW EXPLAIN is made to use I_S tables.
...
...
+
   bool store(I_List<i_string> *str_list);
   bool store(const char *from, CHARSET_INFO *cs);
   String *storage_packet() { return packet; }
=== modified file 'sql/sql_class.cc'
--- sql/sql_class.cc	2012-05-21 13:30:25 +0000
+++ sql/sql_class.cc	2012-06-19 14:02:19 +0000
@@ -2151,6 +2166,21 @@ void THD::rollback_item_tree_changes()
 }
+/*
+  Check if the thread has been killed, and also process "APC requests"
+
+  @retval true  The thread is killed, execution should be interrupted
+  @retval false Not killed, continue execution
+*/
+
+bool THD::check_killed()
+{
+  if (killed)
+    return TRUE;
+  apc_target.process_apc_requests(); 
+  return FALSE;
+}
I imagined it slightly differently. Without special code for killing, but
simply
bool THD::check_killed()
{
  return apc_target.process_apc_requests(); 
}
where process_apc_requests() returns the return value of
request->func(request->func_arg)
This way the function can decide whether to abort the statement processing
(and how) or not. And to kill the thread one would need to do not
thd->killed=1;
but
thd->killed=apc_kill_thread_func;
Independently from the above, process_apc_requests() should do as little as
possible in the normal case, when there are no apc requests. It used to be one
pointer dereference for killed, now it's a function calls.
Suggestion, move the first if(get_first_in_queue()) to a separate wrapper function,
which is declared inline, in the my_apc.h and call the real process_apc_requests
from this wrapper function. Like
class Apc_target
 {
   void process_apc_requests()
   {
     if (get_first_in_queue())
       really_process_apc_requests();
   }
   ...
This way the compiler will inline process_apc_requests and get_first_in_queue,
making the most common code path into one if (apc_calls)
And check_killed should be an inline function too, in the sql_class.h
Also, you forgot to change thd_killed() function to use thd->check_killed().
...
+
 /*****************************************************************************
 ** Functions to provide a interface to select results
 *****************************************************************************/
@@ -3198,6 +3322,42 @@ void THD::restore_active_arena(Query_are
   DBUG_VOID_RETURN;
 }
...
+
+
 Statement::~Statement()
 {
 }
=== modified file 'sql/sql_parse.cc'
--- sql/sql_parse.cc	2012-05-21 18:54:41 +0000
+++ sql/sql_parse.cc	2012-06-12 19:04:11 +0000
@@ -3127,6 +3128,32 @@ end_with_restore_list:
                            thd->security_ctx->priv_user),
                           lex->verbose);
     break;
+  case SQLCOM_SHOW_EXPLAIN:
+  {
+    /* Same security as SHOW PROCESSLIST (TODO check this) */
please add a test for it
Ok, will do.
...
...
+    if (!thd->security_ctx->priv_user[0] &&
+        check_global_access(thd,PROCESS_ACL))
+      break;
+
+    Item *it= (Item *)lex->value_list.head();
+
+    if (lex->table_or_sp_used())
+    {
+      my_error(ER_NOT_SUPPORTED_YET, MYF(0), "Usage of subqueries or stored "
+               "function calls as part of this statement");
This doesn't allow to localize the error message.
generally, hard-coding of error messages should be avoided, whenever possible
I'd simply made it a syntax error in the parser.
On the other hand... why not to allow it, if KILL does.
KILL doesn't allow this, grep for 'case SQLCOM_KILL' in sql_parse.cc
...
...
+      break;
+    }
+
+    if ((!it->fixed && it->fix_fields(lex->thd, &it)) || it->check_cols(1))
+    {
+      my_message(ER_SET_CONSTANTS_ONLY, ER(ER_SET_CONSTANTS_ONLY),
What does it have to do with SET?
"You may only use constant expressions with SET"
This check was copied from KILL statement handling.
...
...
+		 MYF(0));
+      goto error;
+    }
+
+    mysqld_show_explain(thd, (ulong)it->val_int());
+    break;
+  }
   case SQLCOM_SHOW_AUTHORS:
     res= mysqld_show_authors(thd);
     break;
=== modified file 'sql/sql_show.cc'
--- sql/sql_show.cc	2012-05-21 18:54:41 +0000
+++ sql/sql_show.cc	2012-06-12 19:04:11 +0000
@@ -1998,6 +1998,116 @@ void mysqld_list_processes(THD *thd,cons
   DBUG_VOID_RETURN;
 }
+
+/*
+  SHOW EXPLAIN FOR command handler
+
+  @param  thd         Current thread's thd
+  @param  thread_id   Thread whose explain we need
+
+  @notes
+  - Attempt to do "SHOW EXPLAIN FOR <myself>" will properly produce "target not
+    running EXPLAINable command".
+  - todo: check how all this can/will work when using thread pools
Done?
There is no testcase for it, but I've had a discussion with Wlad and
we came to conclusion that SHOW EXPLAIN should not have any issues with thread
pools.
I've removed the todo:
...
...
+*/
+
+void mysqld_show_explain(THD *thd, ulong thread_id)
+{
+  THD *tmp;
+  Protocol *protocol= thd->protocol;
+  List<Item> field_list;
+  DBUG_ENTER("mysqld_show_explain");
+
You don't seem to check the privileges here.
I check them inside mysql_execute_command(), like KILL does.
...
...
+  thd->make_explain_field_list(field_list);
+  if (protocol->send_result_set_metadata(&field_list, Protocol::SEND_NUM_ROWS |
+                                                      Protocol::SEND_EOF))
+    DBUG_VOID_RETURN;
+   
+  /* 
+    Find the thread we need EXPLAIN for. Thread search code was copied from
+    kill_one_thread()
+  */
+  mysql_mutex_lock(&LOCK_thread_count); // For unlink from list
+  I_List_iterator<THD> it(threads);
+  while ((tmp=it++))
+  {
+    if (tmp->command == COM_DAEMON)
+      continue;
+    if (tmp->thread_id == thread_id)
+    {
+      mysql_mutex_lock(&tmp->LOCK_thd_data);	// Lock from delete
+      break;
+    }
+  }
+  mysql_mutex_unlock(&LOCK_thread_count);
+  
+  if (tmp)
+  {
+    bool bres;
+    /* 
+      Ok we've found the thread of interest and it won't go away because 
+        we're holding its LOCK_thd data.
+      Post it an EXPLAIN request.
+      todo: where to get timeout from?
+    */
+    bool timed_out;
+    int timeout_sec= 30;
+    Show_explain_request explain_req;
+    select_result_explain_buffer *explain_buf;
+    
+    explain_buf= new select_result_explain_buffer;
+    explain_buf->thd=thd;
+    explain_buf->protocol= thd->protocol;
+
+    explain_req.explain_buf= explain_buf;
+    explain_req.target_thd= tmp;
+    explain_req.request_thd= thd;
+    explain_req.failed_to_produce= FALSE;
+    
+    /* Ok, we have a lock on target->LOCK_thd_data, can call: */
+    bres= tmp->apc_target.make_apc_call(Show_explain_request::get_explain_data,
+                                        (void*)&explain_req,
+                                        timeout_sec, &timed_out);
+
+    if (bres || explain_req.failed_to_produce)
+    {
+      /* TODO not enabled or time out */
+      if (timed_out)
+      {
+        my_error(ER_ERROR_WHEN_EXECUTING_COMMAND, MYF(0), 
+                 "SHOW EXPLAIN",
+                 "Timeout");
+      }
+      else
+      {
+        my_error(ER_ERROR_WHEN_EXECUTING_COMMAND, MYF(0), 
+                 "SHOW EXPLAIN",
+                 "Target is not running EXPLAINable command");
This doesn't allow to localize the error message.
generally, hard-coding of error messages should be avoided, whenever possible
(as discussed on the phone) Will look into adding an error message that will
be a template for all EXPLAIN errors.
...
...
=== added file 'sql/my_apc.h'
--- sql/my_apc.h	1970-01-01 00:00:00 +0000
+++ sql/my_apc.h	2012-06-19 13:59:00 +0000
...
+
+    @retval FALSE - Ok, the call has been made
+    @retval TRUE  - Call wasnt made (either the target is in disabled state or
+                    timeout occured)
+  */
+  bool make_apc_call(apc_func_t func, void *func_arg, 
+                     int timeout_sec, bool *timed_out);
Hmm. Either you make a C++ interface or a C interface. If it's C++,
then you pass an object instance, with a execute (for example) method,
and this object has all the data. If you pass a function and a void pointer -
it's C-style interface, but then why did you write the rest of apc code in C++?
(agreed to make a C++ interface with function-class)
...
...
+
+#ifndef DBUG_OFF
+  int n_calls_processed; /* Number of calls served by this target */
+#endif
+private:
+  class Call_request;
+
+  /* 
+    Non-zero value means we're enabled. It's an int, not bool, because one can
+    call enable() N times (and then needs to call disable() N times before the 
+    target is really disabled)
+  */
+  int enabled;
+
+  /* 
+    Circular, double-linked list of all enqueued call requests. 
+    We use this structure, because we 
+     - process requests sequentially (i.e. they are removed from the front)
+     - a thread that has posted a request may time out (or be KILLed) and 
+       cancel the request, which means we need a fast request-removal
+       operation.
This explains why you have a list, and why it's doubly-linked.
But why is it circular?
...
+  */
+  Call_request *apc_calls;
+ 
+  class Call_request
+  {
+  public:
+    apc_func_t func; /* Function to call */
+    void *func_arg;  /* Argument to pass it */
+
+    /* The caller will actually wait for "processed==TRUE" */
+    bool processed;
+
+    /* Condition that will be signalled when the request has been served */
+    mysql_cond_t COND_request;
+    
+    /* Double linked-list linkage */
+    Call_request *next;
+    Call_request *prev;
+    
+    const char *what; /* (debug) state of the request */
+  };
+
+  void enqueue_request(Call_request *qe);
+  void dequeue_request(Call_request *qe);
+
+  /* return the first call request in queue, or NULL if there are none enqueued */
+  Call_request *get_first_in_queue()
+  {
+    return apc_calls;
+  }
+};
+
+///////////////////////////////////////////////////////////////////////
+
=== added file 'sql/my_apc.cc'
--- sql/my_apc.cc	1970-01-01 00:00:00 +0000
+++ sql/my_apc.cc	2012-06-12 19:04:11 +0000
@@ -0,0 +1,377 @@
+/*
+  TODO: MP AB Copyright
Right. Please do.
...
+*/
+
+
+#ifdef MY_APC_STANDALONE
+
+#include <my_global.h>
+#include <my_pthread.h>
+#include <my_sys.h>
+
+#include "my_apc.h"
+
+#else
+
+#include "sql_priv.h"
+#include "sql_class.h"
+
+#endif
+
+
+/*
+  Standalone testing:
+    g++ -c -DMY_APC_STANDALONE -g -I.. -I../include -o my_apc.o my_apc.cc
+    g++ -L../mysys -L../dbug -L../strings my_apc.o -lmysys -ldbug -lmystrings -lpthread -lrt
Eh, please no. Nobody will be doing that.
Add unit tests properly, into unittest/
...
+*/
+
+
+/* 
+  Initialize the target. 
+   
+  @note 
+  Initialization must be done prior to enabling/disabling the target, or making
+  any call requests to it.
+  Initial state after initialization is 'disabled'.
+*/
+void Apc_target::init(mysql_mutex_t *target_mutex)
+{
+  DBUG_ASSERT(!enabled);
+  LOCK_thd_data_ptr= target_mutex;
This doesn't seem to make a lot of sense. Apc_target is inside the THD,
and it needs to store a pointer to the THD's mutex?
I understand that originally, it has its own mutex and then you simply
changed it to use the THD's mutex. But the result isn't particularly logical.
On the other hand, you don't remember the THD itself, and the apc function
will always need to use current_thd.
It doesn't use it now.
...
I'm not sure what a good solution could be. For example, it apc could be more
tightly bound to a THD, by inheriting THD class from Apc_target.
What exactly is wrong with the current solution? Apc_target is not integrated
with THD, there is only one point where they interact, and it's the mutex.

Maybe this question/comment will become irrelevant once Apc_target is changed
to be the universal APC delivery mechanism (see next comment).
...
Or less bound, and providing a C interface, by putting it in mysys_var.
But I think that making it very tightly THD-dependent is the right approach.
Otherwise, it is assumed that it can be used for non-THD threads,
and it puts unnecessary limitations on what a callback function can do.
...
+#ifndef DBUG_OFF
+  n_calls_processed= 0;
+#endif
+}
+
+
+/* 
+  Destroy the target. The target must be disabled when this call is made.
+*/
+void Apc_target::destroy()
+{
+  DBUG_ASSERT(!enabled);
+}
+
+
+/* 
+  Enter ther state where the target is available for serving APC requests
+*/
+void Apc_target::enable()
+{
+  /* Ok to do without getting/releasing the mutex: */
1. Why?
2. I don't see why you need enable/disable calls at all.
You use that to disable SHOW EXPLAIN for certain parts of the code,
but this is wrong. Apc is not SHOW EXPLAIN, you should not disable the
whole apc mechanism only because SHOW EXPLAIN cannot be served.
Instead the show-explain-generating callback function should
detect when it can and cannot be used and react appropriately.
(as discussed on the phone) will try to
- get the APC mechanism to work always, including ability to wake up
  threads that are sleeping/waiting.
- EXPLAIN will check whether the APC is available in the context of the target
  thread.
...
...
+  enabled++;
+}
+
+
+/* 
+  Make the target unavailable for serving APC requests. 
+  
+  @note
+    This call will serve all requests that were already enqueued
+*/
+
+void Apc_target::disable()
+{
+  bool process= FALSE;
+  mysql_mutex_lock(LOCK_thd_data_ptr);
+  if (!(--enabled))
+    process= TRUE;
+  mysql_mutex_unlock(LOCK_thd_data_ptr);
+  if (process)
+    process_apc_requests();
+}
+
+
+/* [internal] Put request qe into the request list */
+
+void Apc_target::enqueue_request(Call_request *qe)
+{
...
+/*
+  Make an APC (Async Procedure Call) to another thread. 
+
+  - The caller is responsible for making sure he's not calling to the same
+    thread.
+
+  - The caller should have locked target_thread_mutex.
+
+
+  psergey-todo: Should waits here be KILLable? (it seems one needs 
+  to use thd->enter_cond() calls to be killable)
I think yes, with a timeout of 30 seconds - killable.
...
+*/
+
+  if (enabled)
+  {
+    /* Create and post the request */
+    Call_request apc_request;
+    apc_request.func= func;
+    apc_request.func_arg= func_arg;
+    apc_request.processed= FALSE;
+    mysql_cond_init(0 /* do not track in PS */, &apc_request.COND_request, NULL);
Why not?
...
+    enqueue_request(&apc_request);
+    apc_request.what="enqueued by make_apc_call";
+ 
+    struct timespec abstime;
+    const int timeout= timeout_sec;
+    set_timespec(abstime, timeout);
...
+
+class Apc_order
+{
+public:
+  int value;   // The value 
+  int *where_to;  // Where to write it
+  Apc_order(int a, int *b) : value(a), where_to(b) {}
+};
+
+void test_apc_func(void *arg)
+{
+  Apc_order *order=(Apc_order*)arg;
+  usleep(int_rand(1000));
+  *(order->where_to) = order->value;
+  __sync_fetch_and_add(&apcs_served, 1);
It's gcc-only. Use my_atomic instead.
See other concurrent tests in unittest/mysys
...
+}
+
BR
 Sergei
-- 
Sergei Petrunia, Software Developer
Monty Program AB, http://askmonty.org
Blog: http://s.petrunia.net/blog