[Maria-discuss] Doubts about Thread Pool
Hi guys,
I'm testing the thread pool on Windows with MariaDB 10.0.4.
I'm running the server with this command line:
C:\Program Files\mariadb-10.0.4-win32\bin>mysqld --datadir=..\data\ --log_error=..\log --port=3306 --thread_pool_max_threads=3
Now I'm connected with 10 clients:
Id  User  Host             db     Command  Time  State  Info              Progress
2   root  localhost:41859  mysql  Sleep    5                              0.000
3   root  localhost:41860         Sleep    18                             0.000
4   root  localhost:41861         Sleep    13                             0.000
5   root  localhost:41862         Sleep    11                             0.000
6   root  localhost:41867         Sleep    17                             0.000
7   root  localhost:41955         Sleep    9                              0.000
8   root  localhost:41966         Sleep    4                              0.000
9   root  localhost:41969         Query    0     init   show processlist  0.000
10  root  localhost:42029         Sleep    9                              0.000
11  root  localhost:42032         Sleep    4                              0.000
*generated 2013-09-15 22:02:41 by HeidiSQL 8.0.0.4464 <http://www.heidisql.com/>*
My question is: how do I know which thread pool is running each connection?
-- Roberto Spadim SPAEmpresarial
Hi Roberto,
I do believe the idea of the thread pool was to get rid of the one thread/connection paradigm, so all connections will be served by potentially all threads.
Michael
On 16 September 2013 02:03, Roberto Spadim <roberto@spadim.com.br> wrote:
Hi Michael! 2013/9/16 Michael Paulini <michael.paulini@wormbase.org>
Hi Roberto,
I do believe the idea of the thread pool was to get rid of the one thread/connection paradigm, so all connections will be served by potentially all threads.
Yes, the idea of the thread pool described in https://mariadb.com/kb/en/thread-pool-in-mariadb-51-53/ and https://mariadb.com/kb/en/threadpool-in-55/ is fine: handle many connections with fewer threads and make better use of hardware resources (with some problems around metadata locks, deadlocks, and other locks). But let me explain my doubts...
*1)* What is the name of this "task selector"? Internally (and in worklog 246) I see a scheduler.cc file; should I call it the thread scheduler? From that file there are three kinds of scheduler: "one-thread-per-connection", "pool-of-threads" and "no-threads", right? I think "scheduler" is the right name for the part of the code that selects which query will run, isn't it? I will use that name in this email.
*2)* What is the internal, or perhaps the complete, name of the ID column of the processlist? Internally I know it is the "->thread_id" member of the THD class; from sql_show.cc:
    /* ID */
    table->field[0]->store((longlong) tmp->thread_id, TRUE);
I know that the processlist is very old code, maybe from MySQL 3.23, and at that time we had only threads and no thread pool. Now we have three schedulers, and ->thread_id may be an old variable name that should really be called "connection_id". But renaming it would be a big patch with no reward, and it would break compatibility with plugins and other external tools that use the THD class. Am I right?
If so, maybe we could add more information to information_schema.processlist, with a comment giving the real meaning, just to remove the wrong idea about "ID". Something like:
    CREATE TABLE `PROCESSLIST` (
      `ID` BIGINT(4) NOT NULL DEFAULT '0' COMMENT "INTERNAL CONNECTION ID or something better?",
      `QUERY_ID` BIGINT(4) NOT NULL DEFAULT '0',
      `USER` VARCHAR(128) NOT NULL DEFAULT '',
      `HOST` VARCHAR(64) NOT NULL DEFAULT '',
      `DB` VARCHAR(64) NULL DEFAULT NULL,
      `COMMAND` VARCHAR(16) NOT NULL DEFAULT '',
      `TIME` INT(7) NOT NULL DEFAULT '0',
      `STATE` VARCHAR(64) NULL DEFAULT NULL,
      `INFO` LONGTEXT NULL,
      `TIME_MS` DECIMAL(22,3) NOT NULL DEFAULT '0.000',
      `STAGE` TINYINT(2) NOT NULL DEFAULT '0',
      `MAX_STAGE` TINYINT(2) NOT NULL DEFAULT '0',
      `PROGRESS` DECIMAL(7,3) NOT NULL DEFAULT '0.000',
      `MEMORY_USED` INT(7) NOT NULL DEFAULT '0',
      `EXAMINED_ROWS` INT(7) NOT NULL DEFAULT '0'
    ) COLLATE='utf8_general_ci' ENGINE=Aria;
*3)* Now that ID is better explained in the processlist, could we show more information? I don't know whether the processlist table is the right place; maybe another table (a thread_pool table) is better. Here is what I mean...
I need information about the thread pool groups, from the "Threadpool implementation on Unix." worklog (maybe the Windows implementation too, but let's look at Unix for now), so that I can see what each thread pool group is doing.
From the worklog, we have this information (I have marked what I think is important):
Each *group* in itself is a complete small pool with a *listener thread* (the one waiting for network events), a work queue (is it a thread or shared memory?) and worker threads (see the "s"? threads, more than one worker thread). A group has the responsibility of keeping one thread running, if there is work to be done. More than one thread in a group can be running, depending on circumstances (more about this later). Clients are *assigned to the groups in a round-robin fashion* (here is my doubt: which client is running in which thread pool group?). This will keep (statistically) about the same ratio of clients per group. Listener and worker roles are dynamically assigned. A listener can become a worker: after waiting for network events, it can pick an event and handle it, thus becoming a worker. Vice versa, once a worker has finished processing a query, it can become the listener.
Wow! This is very, very interesting code; schedulers are something that I really like =) Now... what is the pool doing? I'm thinking of something like an information schema table to help here:
    CREATE TABLE information_schema.THREAD_POOL (
      thread_pool_group BIGINT NOT NULL DEFAULT 0,
      thread_id BIGINT NOT NULL DEFAULT 0,  -- not the THD->thread_id variable; the real system thread, maybe a listener, a worker or a work queue (is the work queue a thread?)
      work ENUM('listener','worker','queue') NOT NULL DEFAULT 'queue',  -- what this thread does in this thread pool group
      connection_id BIGINT DEFAULT 0  -- this is the THD->thread_id variable
      -- other columns? maybe timeouts and other information from the internal scheduler? maybe timers for Windows or other variables... I must check the scheduler code
    )
Reading Mikael's post: http://mikaelronstrom.blogspot.com.br/2011/10/mysql-thread-pool-information-... I think I'm not wrong to create new tables...
Thanks guys
-- Roberto Spadim SPAEmpresarial
MDEV created for this topic: https://mariadb.atlassian.net/browse/MDEV-5019
I will need some help with the internals; could anyone point me to where I can find the variables that describe the thread pool groups?
2013/9/16 Roberto Spadim <roberto@spadim.com.br>
-- Roberto Spadim SPAEmpresarial
participants (2)
- Michael Paulini
- Roberto Spadim