Re: [Maria-discuss] Doubts about Thread Pool

17 Sep 2013

MDEV created for this topic:
https://mariadb.atlassian.net/browse/MDEV-5019

I will need some help with the internals. Could anyone give me some pointers
on where I can find the variables for the thread pool groups?

2013/9/16 Roberto Spadim <roberto@spadim.com.br>
...
Hi Michael!
2013/9/16 Michael Paulini <michael.paulini@wormbase.org>
...
Hi Roberto,
I do believe the idea of the thread pool was to get rid of the one
thread/connection paradigm, so all connections will be served by
potentially all threads.
Yes, the idea of the thread pool described in
https://mariadb.com/kb/en/thread-pool-in-mariadb-51-53/ and
https://mariadb.com/kb/en/threadpool-in-55/ is fine: handle many
connections with fewer threads and make better use of hardware resources
(with some problems around metadata locks, deadlocks, and other locks).
But let me explain my doubts...
1) What is the internal name of this "task selector"? In the source (and in
worklog 246) I see a scheduler.cc file. Should I call it the thread scheduler?
From that file there are three kinds of schedulers: "one-thread-per-connection",
"pool-of-threads" and "no-threads", right?
I think "scheduler" is the right name for the part of the code that selects
which query will "work", isn't it? I will use it in this email.
2) What is the internal, or maybe the complete, name of the ID column of the
processlist?
Internally I know it is the "->thread_id" member of the THD class, from
sql_show.cc:
/* ID */
      table->field[0]->store((longlong) tmp->thread_id, TRUE);
I know that processlist is very old code, maybe from MySQL 3.23, and at
that time we had only threads and no thread pool.
Now we have three schedulers, and maybe ->thread_id is an old variable name
that should be called "connection_id".
But changing this name would be a big patch with no reward, plus the cost of
incompatibility with plugins and other external tools that use the THD class.
Am I right?
If yes, maybe we could add more information to
information_schema.processlist, with a comment about the real name, just to
remove the wrong idea about "ID", something like:
CREATE TABLE `PROCESSLIST` (
  `ID` BIGINT(4) NOT NULL DEFAULT '0' COMMENT "INTERNAL CONNECTION ID or something better?",
  `QUERY_ID` BIGINT(4) NOT NULL DEFAULT '0',
  `USER` VARCHAR(128) NOT NULL DEFAULT '',
  `HOST` VARCHAR(64) NOT NULL DEFAULT '',
  `DB` VARCHAR(64) NULL DEFAULT NULL,
  `COMMAND` VARCHAR(16) NOT NULL DEFAULT '',
  `TIME` INT(7) NOT NULL DEFAULT '0',
  `STATE` VARCHAR(64) NULL DEFAULT NULL,
  `INFO` LONGTEXT NULL,
  `TIME_MS` DECIMAL(22,3) NOT NULL DEFAULT '0.000',
  `STAGE` TINYINT(2) NOT NULL DEFAULT '0',
  `MAX_STAGE` TINYINT(2) NOT NULL DEFAULT '0',
  `PROGRESS` DECIMAL(7,3) NOT NULL DEFAULT '0.000',
  `MEMORY_USED` INT(7) NOT NULL DEFAULT '0',
  `EXAMINED_ROWS` INT(7) NOT NULL DEFAULT '0'
) COLLATE='utf8_general_ci' ENGINE=Aria;
3) Now that ID is better explained in processlist, could we show more
information?
I don't know if the processlist table is the right place; maybe another
table (a thread_pool table) is better. Check what I'm talking about...
I need information about the thread pool groups, from the "Threadpool
implementation on Unix." worklog (maybe the Windows implementation too,
but let's check Unix for now), to see what each thread pool group is doing.
From the worklog we have this information (I will mark what I think is
important):
Each *group* in itself is a complete small pool with a *listener thread* (the
one waiting for network events), a work queue (is it a thread or shared
memory?) and worker threads (see the "s"? threads, more than one worker
thread).
A group has the responsibility of keeping one thread running, if there is
work to be done. More than one thread in a group can be running, depending on
circumstances (more about this later).
Clients are *assigned to the groups in a round-robin fashion* (here is my
doubt: which client is running in which thread pool group?). This will keep
(statistically) about the same ratio of clients per group.
Listener and worker roles are dynamically assigned. A listener can become a
worker: after waiting for network events, it can pick an event and handle it,
thus becoming a worker. Vice versa, once a worker has finished processing a
query, it can become the listener.
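To make the round-robin assignment above concrete, here is a minimal sketch.
It assumes connection IDs are handed out sequentially and that a client's
group is its ID modulo the number of groups; I have not confirmed that the
server does exactly this. ThreadGroup, ThreadPoolSketch, group_for, assign and
clients_in are hypothetical names for illustration only, not server
identifiers.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one thread pool group. The real group also has
// a listener, a work queue and worker threads; only the client list is
// modelled here.
struct ThreadGroup {
    std::vector<std::uint64_t> connection_ids;  // clients assigned to this group
};

// Hypothetical pool with a fixed number of groups (group_count must be > 0).
class ThreadPoolSketch {
public:
    explicit ThreadPoolSketch(std::size_t group_count) : groups_(group_count) {}

    // Assumption: sequential IDs taken modulo the group count spread the
    // clients over the groups in round-robin order.
    std::size_t group_for(std::uint64_t connection_id) const {
        return static_cast<std::size_t>(connection_id % groups_.size());
    }

    // Record the connection in its group and return the group index.
    std::size_t assign(std::uint64_t connection_id) {
        const std::size_t g = group_for(connection_id);
        groups_[g].connection_ids.push_back(connection_id);
        return g;
    }

    // Number of clients currently recorded in the given group.
    std::size_t clients_in(std::size_t group) const {
        return groups_[group].connection_ids.size();
    }

private:
    std::vector<ThreadGroup> groups_;
};
```

If the assignment really works this way, the group of a client could be
derived from the processlist ID alone, but that still needs to be checked
against the scheduler code.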
wow *-*! It's very, very interesting code; schedulers are something that
I really like =)
Now... what is the pool doing? I'm thinking of something similar to an
information schema table to help here, check:
CREATE TABLE information_schema.THREAD_POOL(
  thread_pool_group BIGINT NOT NULL DEFAULT 0,
  -- not the THD->thread_id variable: this is the real system thread, maybe a
  -- listener, a worker or a work queue (is the work queue a thread?)
  thread_id BIGINT NOT NULL DEFAULT 0,
  -- what this thread does in this thread pool group
  work ENUM('listener','worker','queue') NOT NULL DEFAULT 'queue',
  -- this is the THD->thread_id variable
  connection_id BIGINT DEFAULT 0
  -- other columns? maybe timeouts and other information from the internal
  -- scheduler? maybe timers for Windows or other vars... must check the
  -- scheduler code
)
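On the server side, one row of this proposed table could be modelled roughly
as below. This is only a sketch of the idea: Work, ThreadPoolRow, work_name
and rows_for_group are hypothetical names that mirror the proposed columns,
and none of them exist in the server.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Role of a system thread inside its group, mirroring the proposed ENUM.
enum class Work { Listener, Worker, Queue };

// One row of the proposed THREAD_POOL table (hypothetical layout).
struct ThreadPoolRow {
    std::uint64_t thread_pool_group;
    std::uint64_t thread_id;      // real system thread, not THD->thread_id
    Work work;
    std::uint64_t connection_id;  // THD->thread_id, 0 when serving no client
};

// Text value that the proposed "work" column would show.
std::string work_name(Work w) {
    switch (w) {
        case Work::Listener: return "listener";
        case Work::Worker:   return "worker";
        case Work::Queue:    return "queue";
    }
    return "unknown";
}

// Rows of a single group, i.e. the answer to "what is this group doing?".
std::vector<ThreadPoolRow> rows_for_group(const std::vector<ThreadPoolRow>& rows,
                                          std::uint64_t group) {
    std::vector<ThreadPoolRow> out;
    for (const ThreadPoolRow& r : rows) {
        if (r.thread_pool_group == group) out.push_back(r);
    }
    return out;
}
```

Filtering by group like this corresponds to a WHERE on thread_pool_group
against the proposed table.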
Reading Mikael's post:
http://mikaelronstrom.blogspot.com.br/2011/10/mysql-thread-pool-information-...
I think I'm not wrong to create new tables...
thanks guys
--
Roberto Spadim
SPAEmpresarial