[Maria-developers] MDEV-5019 - THREADPOOL - Create Information Schema Table for Threadpool
Hi guys!
I opened a new MDEV: https://mariadb.atlassian.net/browse/MDEV-5019
The idea is to expose the threadpool group information, like the Oracle thread pool does (more information about the Oracle thread pool here: http://mikaelronstrom.blogspot.com.br/2011_10_01_archive.html)
I can help, but I don't know how to access the threadpool at scheduler.h/scheduler.c from an information schema plugin:
1) lock the important mutexes and other MDL locks
2) access the variables
3) write the information schema table based on the variables
I don't know how to do 1 and 2 :(
Can anyone help? :)
Thanks guys :)
-- Roberto Spadim
Hi Roberto,
The structures you need are in sql/threadpool_unix.cc.
The global all_groups array contains all thread groups. Every thread_group_t has a list of waiting threads, called "waiting_threads", a queue of not yet handled requests, called "queue" (a request is represented by connection_t), a listener, etc.
"Active" connections, i.e. connections that are currently executing queries, cannot be found in the threadpool's waiting_threads, nor is there a separate "active" list for them. So you may want to introduce special handling for those (i.e. iterate the global "threads" list looking for active connections, get the corresponding connection_t* struct from thd->event_scheduler.data, and do something with it, e.g. look up which thread group it belongs to).
On Windows, I doubt you can implement any information_schema plugin for the threadpool. This is because the threadpool is the native OS threadpool, and the OS structure representing it, PTP_POOL, is opaque, so there is not much info you can extract from it (well, maybe you can if you debug with a kernel debugger, but not otherwise).
Wlad

From: Maria-developers [mailto:maria-developers-bounces+wlad=montyprogram.com@lists.launchpad.net] On Behalf Of Roberto Spadim
Sent: Monday, 16 September 2013 21:20
To: maria-developers@lists.launchpad.net
Subject: [Maria-developers] MDEV-5019 - THREADPOOL - Create Information Schema Table for Threadpool

> Hi guys!
> I openned a new MDEV https://mariadb.atlassian.net/browse/MDEV-5019
> the idea is expose the threadpool group information like oracle thread pool (more information about oracle thread pool here: http://mikaelronstrom.blogspot.com.br/2011_10_01_archive.html)
> I can help, but i don't know how access the threadpool at scheduler.h/scheduler.c from a information schema plugin
> 1) lock important mutex and others mdl
> 2) acess variables
> 3) write information schema based on variables
> 1 and 2 i don't know how to do :(
> can anyone help ? :)
> thanks guys :)
> -- Roberto Spadim
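The layout described in this reply can be sketched as follows. This is only a minimal illustration, not the code from sql/threadpool_unix.cc: the structs are simplified stand-ins, std::mutex and std::list replace the server's own mutex and list types, and `group_row` and `snapshot_groups` are hypothetical names. It shows the general idea of taking each group's mutex briefly and copying out the numbers an information schema table would report.

```cpp
#include <cstddef>
#include <list>
#include <mutex>
#include <vector>

// Simplified stand-ins for the structs named in the mail (assumptions, not the real definitions).
struct connection_t { unsigned long thd_id; };
struct worker_thread_t { unsigned long long event_count; };

struct thread_group_t {
  std::mutex mutex;                             // guards the members below
  std::list<connection_t*> queue;               // requests not yet handled
  std::list<worker_thread_t*> waiting_threads;  // idle workers
  worker_thread_t *listener = nullptr;          // thread waiting for network events
};

// Hypothetical row type: one row per thread group.
struct group_row {
  std::size_t group_id;
  std::size_t queued_requests;
  std::size_t waiting_threads;
  bool has_listener;
};

// Copy per-group counters while holding each group's mutex only briefly.
std::vector<group_row> snapshot_groups(thread_group_t *groups, std::size_t count) {
  std::vector<group_row> rows;
  rows.reserve(count);
  for (std::size_t i = 0; i < count; i++) {
    std::lock_guard<std::mutex> guard(groups[i].mutex);
    rows.push_back({i, groups[i].queue.size(),
                    groups[i].waiting_threads.size(),
                    groups[i].listener != nullptr});
  }
  return rows;
}
```

Copying into plain rows first keeps the time spent under each mutex short; the rows can then be written to the table without holding any threadpool lock.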
Hi Wlad! Thanks for the explanation... For now, just Unix is OK; Windows is a second step =] Just some questions...

2013/9/17 Vladislav Vaintroub <wlad@montyprogram.com>
> Hi Roberto,
> The structures you need are in sql/threadpool_unix.cc
> Global all_groups array contains all thread groups. Every thread_group_t has list of waiting threads, called “waiting_threads”, and queue of not yet handled requests, called “queue” (request is represented by connection_t), a listener etc.

Is connection_t the THD class, or is it a different structure/class? I haven't read the source yet, it's just a high-level question...

> “Active” connections, i.e connections that currently are executing queries, can neither be found in threadpool’s waiting_threads, nor there is a different “active” list for them.

Where are they? OK, I understood that we don't have a flag to show "active/not active"; this should be done with a loop over the threads to find out whether each one is active or not. But what should I look at? waiting_threads? queue? The "processlist" threads? Does the thread group only have threads that are not executing (just queue and waiting_threads), with the executing threads outside the thread pool?
Another question: is a waiting query (inside the threadpool and not executing, inside "waiting_threads") already parsed? Does it have a THD structure, or something that has the THD->lex, or does it just have the network packets "saved in memory"? When it starts execution, will it be parsed and then the THD executed?

> So you may want to introduce special handling for those (i.e iterate the global “threads” list looking for active connections, get corresponding connection_t* struct from thd->event_scheduler.data, and do something with it, e.g look which thread group it belongs)

Hmm, in this case, would it be more intelligent to have the thread group in the PROCESSLIST table? Then a SELECT could do the job at the SQL layer of finding out "which connection is active". It's nice to have a table with all threads, but I think that's the job of PROCESSLIST, right? The point here is the THREAD_POOL information. In other words, with your explanation, we will have information about the queue (connections without queries, maybe in sleep state) and about connections waiting to start (I don't know what state these queries have in the processlist, do you know?)
In this case (adding information to the processlist), we will probably have two kinds of PROCESSLIST, one for Windows and another for Unix. Am I right or wrong?
I'm thinking about something like this. The current processlist:

ID | QUERY_ID | USER | HOST | DB | COMMAND | TIME | STATE | INFO | TIME_MS | STAGE | MAX_STAGE | PROGRESS | MEMORY_USED | EXAMINED_ROWS
1962436279 rspadim ***:48785 spd_cashflow Query 0 executing select * from information_schema.PROCESSLIST 0.321000.000832800

Add two columns:
SCHEDULER ENUM('no-threads','one-thread-per-connection','pool-of-threads','daemon-plugin') NOT NULL DEFAULT 'one-thread-per-connection',
THREADPOOL_GROUP BIGINT UNSIGNED DEFAULT NULL COMMENT 'NULL = not thread pool connection'

Should the THREADPOOL_GROUP column be an INT? BIGINT? UNSIGNED? NOT NULL? Maybe it could be allowed without the thread pool, and we could use NULL to show that it's a "one-thread-per-connection", "no-threads" or "daemon-plugin" connection. What do you think?
In the patch for the QUERY_ID column (MDEV-4911) we had *many* test case changes, but that's not a problem =] Better to change everything one time, instead of adding one column, writing a patch, adding another column, writing another patch...

> On Windows, I doubt you can implement any information_schema plugin for the threadpool. This is because threadpool is native OS threadpool, and OS structure representing it PTP_POOL structure is opaque, and there is not much info you can extract from it (well, maybe you can, if you debug with a kernel debugger, but not otherwise)

Windows will be another MDEV (https://mariadb.atlassian.net/browse/MDEV-5046) =) Let's start with Linux, it's easier than Windows hehehe :P
-- Roberto Spadim SPAEmpresarial
Hi!
Some reading/studying I'm doing about the thread pool, to understand what could be exposed with information_schema tables... Check whether I'm grouping ("creating tables") the right information.
I'm reading threadpool_unix.cc and the other threadpool* files to start this work.

-----
From your mail, Vladislav:
"Global all_groups array contains all thread groups."

=> static thread_group_t all_groups[MAX_THREAD_GROUPS];
=> static uint group_count;
Is "group_count" the upper bound of a loop over the all_groups variable? Something related to "threadpool_max_threads", with a limit of group_count<MAX_THREAD_GROUPS?

"Every thread_group_t has list of waiting threads, called “waiting_threads”, and queue of not yet handled requests, called “queue” (request is represented by connection_t), a listener etc."
Is a request a "php mysql_connect"? Does each new TCP/IP connection create a new request?

=> struct thread_group_t
{
  mysql_mutex_t mutex;
  connection_queue_t queue;
  worker_list_t waiting_threads;
  worker_thread_t *listener; (listener thread? maybe it has a THD and we could show the query_id/thread_id or some other information?)
  pthread_attr_t *pthread_attr; (what is pthread_attr? I didn't find it in threadpool*)
  int pollfd; (an fd for kevent and other libs?)
  int thread_count; (number of threads in this thread group?!)
  int active_thread_count; (active threads running in this thread group?!)
  int connection_count; (active connections in this thread group?!)
  int io_event_count; (io event count? what is this? network io?)
  int queue_event_count; (queue event count? maybe the COUNT(*) of connection_queue_t queue?)
  ulonglong last_thread_creation_time; (last thread creation time; is it a unix timestamp * 1.000.000 (us)?)
  int shutdown_pipe[2]; (maybe an RPC to call a server shutdown?)
  bool shutdown; (shutdown information?)
  bool stalled; (hmm... stall (nice name). Well, I still have to study threadpool stalls, but I think it's nice information to report to the DBA =] )
} MY_ALIGNED(512);

I don't know yet how I_P_List<> and I_P_List_adapter<> work, but I will search for them in the code... Is it like a C++ object with ->length and other easy-to-use tools? Something like "foreach" (connection_queue_t as xxx), interacting with each connection_t inside connection_queue_t using xxx?

typedef I_P_List<connection_t,
                 I_P_List_adapter<connection_t,
                                  &connection_t::next_in_queue,
                                  &connection_t::prev_in_queue>,
                 I_P_List_null_counter,
                 I_P_List_fast_push_back<connection_t> >
=> connection_queue_t;

typedef I_P_List<worker_thread_t, I_P_List_adapter<worker_thread_t,
                 &worker_thread_t::next_in_list,
                 &worker_thread_t::prev_in_list>
                 >
=> worker_list_t;

---
=> struct worker_thread_t
{
  ulonglong event_count; /* number of request handled by this thread */
  thread_group_t* thread_group; (does it point to a thread_group inside all_groups? is there an ID telling which index of all_groups we are at?)
  worker_thread_t *next_in_list;
  worker_thread_t **prev_in_list;
  mysql_cond_t cond; (what is this?)
  bool woken; (what is this?)
};

=> struct connection_t
{
  THD *thd;
  thread_group_t *thread_group;
  connection_t *next_in_queue;
  connection_t **prev_in_queue;
  ulonglong abs_wait_timeout; (what is this?)
  bool logged_in; (hmm, waiting for the password?)
  bool bound_to_poll_descriptor; (what is this?)
  bool waiting; (connection in the waiting queue state?)
};

=> pthread_attr_t ??? (I didn't find it in the threadpool* files; I will search for it when I have time, maybe it's something from the Linux/Unix pthread lib?)

===========================================================================================================

Well, now I'm thinking about the information_schema tables, using the variables from the top of this email:

TABLES:

1) all_groups variable => THREADPOOL_THREAD_GROUPS table
columns:
  THREAD_GROUP_ID INT NOT NULL DEFAULT 0,
  THREAD_COUNT INT NOT NULL DEFAULT 0,
  ACTIVE_THREAD_COUNT INT NOT NULL DEFAULT 0,
  CONNECTION_COUNT INT NOT NULL DEFAULT 0,
  IO_EVENT_COUNT INT NOT NULL DEFAULT 0 COMMENT "WHAT IS THIS? NETWORK I/O?",
  QUEUE_EVENT_COUNT INT NOT NULL DEFAULT 0,
  LAST_THREAD_CREATION_TIME DECIMAL UNSIGNED? UNIXTIMESTAMP?
  SHUTDOWN ENUM('Y','N') NOT NULL DEFAULT 'N'
  STALLED ENUM('Y','N') NOT NULL DEFAULT 'N'
maybe we should also put in information about the listener thread (thd->query_id / thd->thread_id)?
  LISTENER_QUERY_ID BIGINT NOT NULL DEFAULT 0,
  LISTENER_THREAD_ID BIGINT NOT NULL DEFAULT 0

2) connection_queue_t queue => THREADPOOL_CONNECTION_QUEUE table
columns from queue->thd:
  QUERY_ID INT NOT NULL DEFAULT 0, (does it exist already, or only after the parser?)
  THREAD_ID INT NOT NULL DEFAULT 0,
from queue->thread_group:
  THREAD_GROUP_ID INT NOT NULL DEFAULT 0, (I didn't find an ID for each thread_group; is there something for it, like an all_groups[] index?)
  queue->cond (mysql_cond_t) ??? (what is this?)
  queue->woken (bool) ???? (what is this?)

3) worker_list_t waiting_threads => THREADPOOL_WAITING_THREADS table, maybe WAITING_QUEUE?
  THREAD_GROUP_ID INT NOT NULL DEFAULT 0, (thread_group_t* thread_group)
  EVENT_COUNT BIGINT UNSIGNED NOT NULL DEFAULT 0, (ulonglong event_count; /* number of request handled by this thread */ what is this?)
  ->cond (mysql_cond_t) ??? (what is this?)
  ->woken (bool) ???? (what is this?)
I didn't find a THD for this (query_id/thread_id)
  pthread_attr_t *pthread_attr; what is this?

-----
Something that I don't understand yet... For example, if I have a waiting_thread, how do I know which THD or which 'worker' thread will "do the job" of this query? I know it's so fast that I will not catch it with information_schema, but it's a deterministic function, right? Is there a pool of worker threads that handles the waiting threads?

About how the thread pool works... If the query is attached to one thread_group, could it 'jump' to another thread group? Is the "worker thread" outside the thread group, and at the end of the execution will it reallocate the connection to another thread_group?

If anyone has more ideas about these information tables, and about where I could get the information for each column, please reply with the answer or with ideas =)
Thanks guys!
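The question above about how I_P_List<> is used can be illustrated with a much smaller intrusive list. This is a hedged sketch, not the server's implementation: `worker_list`, its methods and `total_events` are hypothetical names, and the only things taken from the mail are the element's link fields (`next_in_list` as a pointer, `prev_in_list` as a pointer to a pointer) and the `it++` iteration style. The assumption is that `prev_in_list` holds the address of whichever pointer currently points at the element, which is what the field types suggest.

```cpp
#include <cstddef>

// Stand-in element type; only the two link fields matter for the list.
struct worker_thread_t {
  unsigned long long event_count = 0;
  worker_thread_t *next_in_list = nullptr;   // next element, or nullptr at the end
  worker_thread_t **prev_in_list = nullptr;  // address of the pointer that points at this element
};

// Minimal intrusive list: it owns no memory and only threads the link fields.
class worker_list {
  worker_thread_t *first_ = nullptr;
public:
  void push_front(worker_thread_t *w) {
    w->next_in_list = first_;
    if (first_) first_->prev_in_list = &w->next_in_list;
    w->prev_in_list = &first_;
    first_ = w;
  }
  // Unlink in O(1); the list head is not needed because prev_in_list
  // already points at the pointer that must be rewritten.
  static void remove(worker_thread_t *w) {
    if (w->next_in_list) w->next_in_list->prev_in_list = w->prev_in_list;
    *w->prev_in_list = w->next_in_list;
    w->next_in_list = nullptr;
    w->prev_in_list = nullptr;
  }
  bool is_empty() const { return first_ == nullptr; }

  class Iterator {
    worker_thread_t *current_;
  public:
    explicit Iterator(const worker_list &list) : current_(list.first_) {}
    // Post-increment returns the current element, then advances.
    worker_thread_t *operator++(int) {
      worker_thread_t *result = current_;
      if (current_) current_ = current_->next_in_list;
      return result;
    }
  };
};

// "foreach" over the list: sum the event counters of all elements.
unsigned long long total_events(const worker_list &list) {
  worker_list::Iterator it(list);
  unsigned long long total = 0;
  while (worker_thread_t *w = it++) total += w->event_count;
  return total;
}
```

So there is no `->length` style container API in this sketch; a "foreach" is simply a loop that keeps calling `it++` until it returns a null pointer.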
From: Roberto Spadim [mailto:roberto@spadim.com.br]
Sent: Friday, 20 September 2013 21:12
To: Vladislav Vaintroub
Cc: maria-developers@lists.launchpad.net
Subject: Re: [Maria-developers] MDEV-5019 - THREADPOOL - Create Information Schema Table for Threadpool

> Hi!
> Some read/study i'm doing about thread pool to understand what could be exposed with information_schema tables...
> check if i'm grouping ("creating tables") with the right information...
> i'm reading threadgroup_unix.cc and others threadgroup* files to start this work

I strongly suggest reading the worklog before jumping straight to the code. It is not trivial, so reading first can save time.
http://worklog.askmonty.org/worklog/Server-BackLog/?tid=246

> -----
> from your mail Vladislav:
> "Global all_groups array contains all thread groups ."
>
> => static thread_group_t all_groups[MAX_THREAD_GROUPS];
> => static uint group_count;
> "group_count" is the max value of a loop using all_groups var?
> something realted to "threadpool_max_threads", with a limit of group_count<MAX_THREAD_GROUPS?
>
> "Every thread_group_t has list of waiting threads , called "waiting_threads", and queue of not yet handled requests, called "queue" (request is represented by connection_t ), a listener etc."
> a request is a "php mysql_connect"? each new tcp/ip connection create a new request?

Usually, a request is an SQL query. More strictly, a request in this context is a network packet from the client (it can be an SQL query, a QUIT packet that informs the server that the connection is about to be terminated, one of the handshake packets during connection establishment, etc.)

> => struct thread_group_t
> {
> mysql_mutex_t mutex;
> connection_queue_t queue;
> worker_list_t waiting_threads;
> worker_thread_t *listener; (listener thread? maybe it have a THD and we could show the query_id/thread_id or some information?)

No, it does not have a THD. It is an OS thread waiting for network events.

> pthread_attr_t *pthread_attr; (what is pthread_attr? i didn't found in threadpool*)

It is not important for the discussion. You can find it in the Unix man pages.

> int pollfd; (a fd to kevent and others libs?)

Yes. Kevent, epoll, etc. all have a special file descriptor. The listener thread waits on it.

> int thread_count; (number of threads in this thread group?!)

Yes

> int active_thread_count; (active threads running this thread group?!)

Yes

> int connection_count; (active connections in this thread group?!)

No, all connections, idle or active. A connection is bound to a thread group.

> int io_event_count; (io event count? what is this? network io?)

Yes. It is used to avoid stalls. Please look in the code to see how it is used.

> int queue_event_count; (queue event count? maybe the COUNT(*) for connection_queue_t queue ?)

Also used to avoid stalls. It is the number of connections that were explicitly added into the "queue". This is something that only happens during the connect phase: the polling thread (a dedicated MySQL thread that only handles new connections) adds a new connection to the queue.

> ulonglong last_thread_creation_time; (last thread create time, it's a unixtimestamp * 1.000.000 (us)? )

Internal statistics, look at how it is used. The idea is not to create too many threads too quickly, and the last thread creation time tells you if you're creating threads too quickly.

> int shutdown_pipe[2]; (maybe rpc to call a server shutdown?)

This is a pipe, used to wake the listener thread for shutdown.

> bool shutdown; (shutdown information?)

true if the group is shut down

> bool stalled; (hum... stall (nice name), well i must study about threadpool stall yet, but i think it's a nice information to report to DBA =] )

used to determine stalls.

> } MY_ALIGNED(512);
>
> I don't know yet how I_P_List<> and I_P_List_adapter<> work, but i will search about it in code... it's like a C++ object with ->legth and others easy to use tools? something like "foreach" (connection_queue_t as xxx) and interact with each connection_t inside connection_queue_t using xxx?

You can traverse the list with an iterator. Something like

connection_queue_t::Iterator it(group->queue);
connection_t *con;
while((con= it++)) {
  // use con somehow, e.g con->thd
}

> typedef I_P_List<connection_t,
>                  I_P_List_adapter<connection_t,
>                                   &connection_t::next_in_queue,
>                                   &connection_t::prev_in_queue>,
>                  I_P_List_null_counter,
>                  I_P_List_fast_push_back<connection_t> >
> => connection_queue_t;
>
> typedef I_P_List<worker_thread_t, I_P_List_adapter<worker_thread_t,
>                  &worker_thread_t::next_in_list,
>                  &worker_thread_t::prev_in_list>
>                  >
> => worker_list_t;
>
> ---
> => struct worker_thread_t
> {
> ulonglong event_count; /* number of request handled by this thread */
> thread_group_t* thread_group; (it's point to thread_group inside all_groups? does it have an ID about what index of all_groups we are?)

You can calculate its offset in the all_groups array, e.g. (thread_group - all_groups); pointer subtraction already yields the element index, so no division by sizeof(thread_group_t) is needed.

> worker_thread_t *next_in_list;
> worker_thread_t **prev_in_list;
>
> mysql_cond_t cond; (what is this?)

A condition variable. A waiting thread waits on the condition variable.

> bool woken; (what is this?)

It avoids spurious wakeups: if the thread wakes up and "woken" is set, then it has really been woken.

> };
>
> => struct connection_t
> {
> THD *thd;
> thread_group_t *thread_group;
> connection_t *next_in_queue;
> connection_t **prev_in_queue;
> ulonglong abs_wait_timeout; (what is this?)

Look at how wait_timeout is handled; it needs special handling in threadpools. Here, there is a timer thread that wakes up periodically, or when the first query timer expires. Then all connections are examined to see whether the query timeout has expired. If so, the "expired" connection is shut down.

> bool logged_in; (hummm waiting password?)

Sort of: waiting for the handshake response from the client.

> bool bound_to_poll_descriptor; (what is this?)

Internal stuff, ignore

> bool waiting; (connection in waiting queue state?)

Waiting for some internal mutex (row lock, table lock, stuff like that). Or inside SELECT SLEEP(N)

> };
>
> => pthread_attr_t ??? (didn't found in threadpool* files, i will search with time about it, maybe something to linux/unix pthread lib?)

Not interesting for our discussion. It is an OS structure, opaque to its users. It is used to set the thread stack size.

> ===========================================================================================================
>
> well now i'm thinking right about information_schema tables using the variables from the top of this email...
> <skip>

Here is what I could think of:

Threadpool_Threads (combined from all waiting_threads lists in groups and the global "threads" list):
thread id, group id, THD id (0 if currently idle), event_count, is_listener, is_waiting
(we do not store the OS thread id btw, because there was no need; you could use the address of the worker_thread_t struct, at least temporarily)

Threadpool_pending_requests (combined from all "queue" lists in groups): THD id, group id

Threadpool_Group_info (from the thread_group_t struct): group id, thread_count, active_thread_count, connection count, microseconds since last thread creation

I do not think there is much more interesting info to show.

> something that i didn't understand yet...
> for example... if i got a waiting_thread, how i know what THD or what 'worker' thread will "do the job" of this query? i know it's too fast that i will not get it with information_schema, but it's a deterministic function right?

No, you do not know that for sure. The logic is rather complicated, and you should read the worklog to understand how it works. But threads are woken in LIFO order, and requests are processed in FIFO order, so you can imagine that the last thread in waiting_threads will be woken and take the first request from the queue. More often than not, though, the listener thread will do the job.

> about the thread pool work... if the query is attached to one thread_group, it could 'jump' to another thread group? the "worker thread" is outside from thread group and at end of the execution it will realloc the connection to another thread_group?

A connection does not usually change its group, but it can when someone changes thread_pool_size online. The connection with id N belongs to the group N%thread_group_size

> if anyone have more ideas about this information tables, and where i could get information about each column, please reply with the answer or with ideas =)
> thanks guys!
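The LIFO/FIFO pairing and the modulo mapping described in the last answers can be sketched in isolation. This is a toy model under stated assumptions, not the scheduler's real logic (which, as the reply says, is more complicated and often lets the listener do the work): `mini_group`, `park`, `enqueue`, `dispatch` and `group_for_connection` are hypothetical names, plain integers stand in for threads and requests, and `group_count` is assumed to be the number of thread groups.

```cpp
#include <deque>
#include <vector>

// Toy model of one thread group: integers stand in for workers and requests.
struct mini_group {
  std::deque<int> queue;             // pending requests, oldest at the front (FIFO)
  std::vector<int> waiting_threads;  // idle workers, most recently parked at the back (LIFO)

  void park(int worker_id) { waiting_threads.push_back(worker_id); }
  void enqueue(int request_id) { queue.push_back(request_id); }

  // Pair the most recently parked worker with the oldest pending request.
  // Returns false when there is no request or no idle worker.
  bool dispatch(int &worker_id, int &request_id) {
    if (queue.empty() || waiting_threads.empty()) return false;
    worker_id = waiting_threads.back();
    waiting_threads.pop_back();
    request_id = queue.front();
    queue.pop_front();
    return true;
  }
};

// Connection id N maps to group N modulo the number of groups (assumed meaning of the mail).
unsigned group_for_connection(unsigned long connection_id, unsigned group_count) {
  return static_cast<unsigned>(connection_id % group_count);
}
```

Waking the most recently parked worker first tends to reuse a thread whose state is still warm in the CPU caches, while serving the oldest request first keeps any single request from waiting indefinitely.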
Wow! A lot of information; maybe it's something we could put in a blog, or a blog for beginner developers, or something like that.
Well, I will study the worklog and start some work soon =)
Thanks a lot, Vladislav! With time I will send a patch and finish this MDEV =)
Bye! Have a good weekend :) It's 21:03 here in Brazil :)

2013/9/20 Vladislav Vaintroub <wlad@montyprogram.com>
-- Roberto Spadim SPAEmpresarial
Hi Vladislav!
I did some initial work; check whether I'm going in the right direction: https://mariadb.atlassian.net/browse/MDEV-5019
Some things still to do:
ENUM('Y','N'): I have never used it... I will search in sql_show.cc to get this information.
thd->thread_id, thd->query_id, is listener, is waiting: I didn't find out how I could get the thd, or how to know whether a thread is the listener or whether a thread is waiting. I know that I have all_groups[].listener to get the listener, but I don't know how to check, for example, whether one of the waiting_threads is the listener or not. About a waiting thread, I don't understand yet where I could check whether it's waiting or not.
Well, just first steps; I haven't compiled it yet, I just wrote the draft =)
Thanks!!
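One plausible way to derive the group id and the "is listener" flag from the fields quoted earlier in the thread is sketched below. It is an assumption, not something confirmed in this thread: the structs are cut down to the two pointers involved, `MAX_THREAD_GROUPS` is given an arbitrary illustrative value, and `group_id_of` and `is_listener` are hypothetical helpers. Real code would also need to hold the group's mutex while reading `listener`.

```cpp
#include <cstddef>

struct thread_group_t;

// Cut-down stand-ins: only the fields needed for the two checks.
struct worker_thread_t { thread_group_t *thread_group; };
struct thread_group_t { worker_thread_t *listener; };

const std::size_t MAX_THREAD_GROUPS = 4;  // illustrative value only
static thread_group_t all_groups[MAX_THREAD_GROUPS];

// Index of a group inside all_groups. Subtracting two pointers of the same
// type already counts in elements, so there is no division by sizeof().
std::size_t group_id_of(const thread_group_t *group) {
  return static_cast<std::size_t>(group - all_groups);
}

// A worker is the listener if its group's listener pointer refers to it.
bool is_listener(const worker_thread_t *worker) {
  return worker->thread_group->listener == worker;
}
```

Both values come from pointer comparisons alone, so no extra id field would need to be stored in either struct.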
Hi guys! New questions:

1) threadpool.h has no information about the threadpool_unix.cc structures/interactors. Maybe we should split threadpool_unix.cc into a .cc file and a .h header file? Same thing for the Windows threadpool.

2) How do I know which threadpool is being used, Windows or Unix? Is there a #define that I could use with #ifdef XXX #else #endif?

Thanks!
-- Roberto Spadim SPAEmpresarial
From: Roberto Spadim [mailto:roberto@spadim.com.br]
Sent: Sunday, 22 September 2013 07:46
To: Vladislav Vaintroub
Cc: maria-developers@lists.launchpad.net
Subject: Re: [Maria-developers] MDEV-5019 - THREADPOOL - Create Information Schema Table for Threadpool

Hi guys! New questions... 1) threadpool.h has no information about the threadpool_unix.cc structures/interactors

Why should it? The structs are only used in threadpool_unix.cc itself.

Maybe we should split threadpool_unix.cc into a .cc file and a .h header file?

You can if you wish, but I see no need. If you plan to write an I_S plugin that works only with the Unix threadpool, you can define it in the same file, threadpool_unix.cc, right?

Same thing for the Windows threadpool

As already discussed, MariaDB on Windows relies on the OS to manage the threadpool, and there is no simple way to get threadpool diagnostics (apart from learning kernel debugging and equipping yourself with the "Windows Internals" book that describes the structures). There are no interesting structures either; the pool is represented by the opaque PTP_POOL.

2) How do I know which threadpool is being used, Windows or Unix? Is there a #define that I could use with #ifdef XXX #else #endif?

On Windows, the Windows threadpool is used. Otherwise, the Unix threadpool is used. CMake conditionally compiles either threadpool_win.cc or threadpool_unix.cc, depending on the OS. See sql/CMakeLists.txt.

Thanks!
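On the #ifdef question: since the choice between threadpool_win.cc and threadpool_unix.cc follows the target OS, a plugin could probably branch on the same condition. The sketch below uses only the compiler-predefined _WIN32 macro; threadpool_implementation and check_selection are hypothetical names, and whether the source tree offers a dedicated macro for this is not stated in this thread.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: report which threadpool implementation this build
// would contain, assuming the choice follows the target OS as described above.
inline const char* threadpool_implementation() {
#ifdef _WIN32
  return "windows";  // threadpool_win.cc is compiled
#else
  return "unix";     // threadpool_unix.cc is compiled
#endif
}

// Self-check: exactly one of the two branches is selected at compile time.
inline void check_selection() {
  const char* impl = threadpool_implementation();
  assert(std::strcmp(impl, "windows") == 0 || std::strcmp(impl, "unix") == 0);
}
```

An I_S plugin restricted to the Unix threadpool could wrap its table-filling code in the #else branch and expose nothing on Windows.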
connection_t = THD class? Is it a different structure/class? I haven't read the source yet, this is just a high-level question...

It is a different class.

"Active" connections, i.e. connections that are currently executing queries, can neither be found in the threadpool's waiting_threads, nor is there a separate "active" list for them.

Where are they?

In the global "threads" list. It is the list of THDs, not threads in the OS sense. Please avoid mixing "threads" with THDs; I know MySQL terminology gets confusing if one-thread-per-connection is not used. If you see "thread" mentioned in threadpool_xxx.cc, you can be sure it is about OS threads. For example, the entries in the waiting_threads list are OS threads, and they are not associated with any THDs.

OK, I understood that we don't have a flag to show "active/not active". This should be done with a loop over threads to know whether each one is active or not, but what should I look at? waiting_thread? queue? "processlist" threads?

Since you say "active/not active", you are talking about connections, and not threads. The connections (THDs) are in the global "threads" list. You already know whether they are active or not; nothing needs to be done for that. In the current processlist, if Command=Sleep, then the connection is not active.

The thread group only has threads that are not executing, just queue and waiting_threads, and executing threads are outside the thread pool?

I think you are mixing something up. Again, I remind you that the correct naming is that "thread" means an OS thread, and not a "THD" or "connection". So, back to basics: the threadpool consists of worker threads. At any given moment, there are possibly some threads that do work, and there are other threads that are idle. Of those that are idle, typically one would be listening for network events. All threads are part of the threadpool. When a thread has done some work but becomes idle because there is no work to do (all clients are idle), it adds itself to the list of "idle" threads, aka waiting_threads.

Another question... Is a waiting query (inside the threadpool and not executing, inside "waiting_threads") already parsed? Does it have a THD structure or something that holds the THD->lex? Or does it just have the network packets "saved in memory"? When it starts execution, will it be parsed and the THD executed?

Waiting threads do absolutely nothing. They wait until they are woken. They are not associated with a THD until they do some work. waiting_threads are not THDs; those are OS threads that wait on a condition variable. Usually the listener thread I was talking about above will find out that there is some work, take an idle thread out of the waiting_threads list, hand it some work to do, and wake it.

So you may want to introduce special handling for those (i.e. iterate the global "threads" list looking for active connections, get the corresponding connection_t* struct from thd->event_scheduler.data, and do something with it, e.g. look at which thread group it belongs to)

Hmm, in this case would it be more 'intelligent' to have the thread_group in the PROCESSLIST table? Then a SQL SELECT could do the job at the SQL layer of finding "which connection is active". It's nice to have a table with all threads, but I think that is the job of PROCESSLIST, right? The point here is the THREAD_POOL information. In other words, now with your explanation, we will have information about the queue (connections without queries, maybe in the sleep state) and connections waiting to start (I don't know what state these queries have in the processlist, do you know?)

PROCESSLIST is about connections, or THDs. It is not about OS threads. It just happened historically that each THD always had one thread, so the terminology got confusing.

In this case (adding information to the processlist), we would probably have two kinds of PROCESSLIST, one for Windows and another for Unix. Am I right or wrong?

Well, I would prefer not to have more in the processlist than there is now. And I do not like having a different processlist for different OSes.

<skip>

On Windows, I doubt you can implement any information_schema plugin for the threadpool. This is because the threadpool is the native OS threadpool, and the OS structure representing it, PTP_POOL, is opaque, and there is not much info you can extract from it (well, maybe you can if you debug with a kernel debugger, but not otherwise)

Windows will be another MDEV (https://mariadb.atlassian.net/browse/MDEV-5046) =) Let's start with Linux, it's easier than Windows hehehe :P

This is wrong when you talk about the threadpool. The Windows implementation is half of the Unix one in terms of lines of code, and one tenth in terms of complexity. The Windows implementation could also be half of what it is now (in LOC) if we did not need to run on XP. The Unix implementation is also much more complicated than it had to be, because it is built as a workaround for multithreaded Linux epoll not scaling (http://markmail.org/message/piebp4xosrjv6w7k). There would be no need for thread groups and complicated stuff like that if multithreaded epoll worked.

Wlad
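The suggestion above (walk the list of connections, find each one's thread group, and aggregate) could be sketched roughly as follows. ConnModel and active_per_group are hypothetical stand-ins for illustration; the real THD and connection_t carry far more state, and the real code would need to hold the appropriate locks while iterating.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a connection; not the real THD or connection_t.
struct ConnModel {
  unsigned long id;   // connection id
  bool active;        // assumption: true when the processlist Command is not Sleep
  std::size_t group;  // index of the thread group the connection belongs to
};

// Count active connections per thread group, e.g. to fill one column of a
// per-group information_schema row.
inline std::vector<std::size_t> active_per_group(
    const std::vector<ConnModel>& conns, std::size_t group_count) {
  std::vector<std::size_t> counts(group_count, 0);
  for (const ConnModel& c : conns) {
    assert(c.group < group_count);  // every connection must map to a valid group
    if (c.active) ++counts[c.group];
  }
  return counts;
}
```

Idle connections are skipped on purpose, since the point of this pass is to recover the "active" set that has no list of its own.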
participants (2)
- Roberto Spadim
- Vladislav Vaintroub