March 2010 - developers - lists.mariadb.org

[Maria-developers] Progress (by Hakan): Benchmark suite for sysbench (100)
by worklog-noreply＠askmonty.org 09 Mar '10

-----------------------------------------------------------------------
                              WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Benchmark suite for sysbench
CREATION DATE..: Thu, 04 Mar 2010, 17:46
SUPERVISOR.....: Igor
IMPLEMENTOR....: Hakan
COPIES TO......:
CATEGORY.......: Other
TASK ID........: 100 (http://askmonty.org/worklog/?tid=100)
VERSION........: Benchmarks-3.0
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 12
ESTIMATE.......: 28 (hours remain)
ORIG. ESTIMATE.: 40

PROGRESS NOTES:

-=-=(Hakan - Tue, 09 Mar 2010, 14:10)=-=-
* Added run-sysbench-myisam.sh for running MyISAM related benchmarks with sysbench.

Worked 4 hours and estimate 28 hours remain (original estimate unchanged).

-=-=(Hakan - Tue, 09 Mar 2010, 14:09)=-=-
Low Level Design modified.
--- /tmp/wklog.100.old.18803 2010-03-09 14:09:37.000000000 +0000
+++ /tmp/wklog.100.new.18803 2010-03-09 14:09:37.000000000 +0000
@@ -9,6 +9,12 @@
  ** Number of concurrent clients is hardcoded
  ** Machine specific configuration like location of binaries and
     directories needed are in separate config files located at conf/<hostname>.inc
+ ** Set random seed of sysbench to have better comparison
+ ** Restart mysqld from scratch for each run and copy away
+    DATA_DIR of the database for faster starts.
+ ** Between each run, run sync and clear file system caches with
+    echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc)
+ ** Write out mysqld and sysbench options for reference.
 
 The main loop of run-sysbench.sh is:
 
@@ -31,12 +37,6 @@
  ** sar -u (CPU utilization) hook
  ** Crash detection
  ** Error detection
- ** Set random seed of sysbench to have better comparision
- ** Restart mysqld from scratch for each run and copy away
-    DATA_DIR of the database for faster starts.
- ** Between each run, run sync and clear file system caches with
-    echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc)
- ** Write out mysqld and sysbench options for reference.
 
 * Analyze numbers
 This is implemented in analyze-sysbench.php

-=-=(Hakan - Tue, 09 Mar 2010, 14:06)=-=-
Added:
 ** Set random seed of sysbench to have better comparison
 ** Restart mysqld from scratch for each run and copy away
    DATA_DIR of the database for faster starts.
 ** Between each run, run sync and clear file system caches with
    echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc)
 ** Write out mysqld and sysbench options for reference.

Worked 8 hours and estimate 32 hours remain (original estimate unchanged).

-=-=(Hakan - Mon, 08 Mar 2010, 12:27)=-=-
Low Level Design modified.
--- /tmp/wklog.100.old.21404 2010-03-08 12:27:08.000000000 +0000
+++ /tmp/wklog.100.new.21404 2010-03-08 12:27:08.000000000 +0000
@@ -36,6 +36,7 @@
     DATA_DIR of the database for faster starts.
  ** Between each run, run sync and clear file system caches with
     echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc)
+ ** Write out mysqld and sysbench options for reference.
 
 * Analyze numbers
 This is implemented in analyze-sysbench.php

-=-=(Guest - Thu, 04 Mar 2010, 18:15)=-=-
Low Level Design modified.
--- /tmp/wklog.100.old.2695 2010-03-04 18:15:46.000000000 +0000
+++ /tmp/wklog.100.new.2695 2010-03-04 18:15:46.000000000 +0000
@@ -1 +1,112 @@
+All scripts can be found at lp:mariadb-tools/sysbench
+* Run sysbench tests on a machine and collect numbers
+This is implemented in run-sysbench.sh. Currently it supports:
+ ** Optionally pull of latest source from Launchpad and compile
+ ** Starting the server
+ ** Running each sysbench test for $LOOP_COUNT times and
+    $RUN_TIME time.
+ ** Number of concurrent clients is hardcoded
+ ** Machine specific configuration like location of binaries and
+    directories needed are in separate config files located at conf/<hostname>.inc
+
+The main loop of run-sysbench.sh is:
+
+start_mysqld
+for SYSBENCH_TEST in $SYSBENCH_TESTS
+  for THREADS in $NUM_THREADS
+    while [ $k -lt $LOOP_COUNT ]
+      drop schema sbtest
+      create schema sbtest
+
+      $SYSBENCH $SYSBENCH_OPTIONS prepare
+      $SYSBENCH $SYSBENCH_OPTIONS run
+    done
+  done
+done
+
+Open items:
+ ** OProfile hook
+ ** iostat hook
+ ** sar -u (CPU utilization) hook
+ ** Crash detection
+ ** Error detection
+ ** Set random seed of sysbench to have better comparision
+ ** Restart mysqld from scratch for each run and copy away
+    DATA_DIR of the database for faster starts.
+ ** Between each run, run sync and clear file system caches with
+    echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc)
+
+* Analyze numbers
+This is implemented in analyze-sysbench.php
+
+Open items:
+ ** Read result files from
+    ${RESULT_DIR}/${TODAY}/${PRODUCT}/${SYSBENCH_TEST}/${THREADS}/results.txt
+
+ ** Collect OProfile, iostat, cpu utilization, and machine info
+ ** Detect errors and crashes.
+ ** Generate SQL INSERT strings for presentation usage
+
+The layout for storing the numbers is:
+  CREATE TABLE sysbench_run (
+    id int unsigned NOT NULL auto_increment,
+    host varchar(80),             -- Hostname we ran the test on.
+    run_date date,                -- The day we ran the test.
+    sysbench_version varchar(32), -- Version of sysbench we used.
+    test_name varchar(32),        -- Name of the sysbench test.
+    run_time int unsigned,        -- Run time in seconds.
+    runs int unsigned,            -- Number of iterations of the test.
+    PRIMARY KEY (id),
+    KEY (host),
+    KEY (run_date
+  );
+
+  CREATE TABLE sysbench_comment (
+    id int unsigned NOT NULL auto_increment,
+    sysbench_run_id int unsigned NOT NULL, -- FK pointing to sysbench_run.
+    compile_info text,                     -- Compile options we used.
+    machine_info text,                     -- Details about the hardware.
+    sysbench_options text,                 -- The sysbench options we used.
+    PRIMARY KEY (id),
+    KEY (sysbench_run_id)
+  );
+
+  CREATE TABLE sysbench_result (
+    id int unsigned NOT NULL auto_increment,
+    sysbench_run_id int unsigned NOT NULL, -- FK pointing to sysbench_run.
+    concurrency int unsigned,              -- Concurrency level we used.
+    result decimal(7,2),                   -- The actual result.
+    io varchar(80),                        -- The IO from iostat.
+    cpu varchar(80),                       -- CPU utilization.
+    profile text,                          -- Profiling information.
+    error text,                            -- Error messages and stack traces.
+    PRIMARY KEY (id),
+    KEY (sysbench_run_id
+  );
+
+* Generate a report out of the numbers
+This script will generate a HTML version for putting up on the web and a
+txt version for email usage.
+
+Open items:
+ ** Generate an overview table in the form of
+                 Number of threads
+                 1       4       8       16      32      64      128
+sysbench test
+ delete          121.52  144.77  117.70  115.15  100.48  75.39   66.56
+   mean value of runs
+   1 first run
+   2 second run
+   3 third run
+   STDEV
+   STDEV in % of mean
+   CPU utilization (usr/sys/wait/idle)
+   IO (read/write)
+
+For HTML version additionally generate a graph with JPGraph.
+
+* Get machine(s) and run the test on a weekly basis and for each release
+comparing with the prior release.
+
+* Email weekly reports and blog about it.



DESCRIPTION:

Create a benchmark suite for running sysbench
* Run sysbench tests on a machine and collect numbers
* Analyze numbers
* Generate a report out of the numbers


LOW-LEVEL DESIGN:

All scripts can be found at lp:mariadb-tools/sysbench
* Run sysbench tests on a machine and collect numbers
This is implemented in run-sysbench.sh. Currently it supports:
 ** Optionally pull of latest source from Launchpad and compile
 ** Starting the server
 ** Running each sysbench test for $LOOP_COUNT times and
    $RUN_TIME time.
 ** Number of concurrent clients is hardcoded
 ** Machine specific configuration like location of binaries and
    directories needed are in separate config files located at conf/<hostname>.inc
 ** Set random seed of sysbench to have better comparison
 ** Restart mysqld from scratch for each run and copy away
    DATA_DIR of the database for faster starts.
 ** Between each run, run sync and clear file system caches with
    echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc)
 ** Write out mysqld and sysbench options for reference.

The main loop of run-sysbench.sh is:

start_mysqld
for SYSBENCH_TEST in $SYSBENCH_TESTS
  for THREADS in $NUM_THREADS
    while [ $k -lt $LOOP_COUNT ]
      drop schema sbtest
      create schema sbtest

      $SYSBENCH $SYSBENCH_OPTIONS prepare
      $SYSBENCH $SYSBENCH_OPTIONS run
    done
  done
done

Open items:
 ** OProfile hook
 ** iostat hook
 ** sar -u (CPU utilization) hook
 ** Crash detection
 ** Error detection

* Analyze numbers
This is implemented in analyze-sysbench.php

Open items:
 ** Read result files from
    ${RESULT_DIR}/${TODAY}/${PRODUCT}/${SYSBENCH_TEST}/${THREADS}/results.txt

 ** Collect OProfile, iostat, cpu utilization, and machine info
 ** Detect errors and crashes.
 ** Generate SQL INSERT strings for presentation usage

The layout for storing the numbers is:
  CREATE TABLE sysbench_run (
    id int unsigned NOT NULL auto_increment,
    host varchar(80),             -- Hostname we ran the test on.
    run_date date,                -- The day we ran the test.
    sysbench_version varchar(32), -- Version of sysbench we used.
    test_name varchar(32),        -- Name of the sysbench test.
    run_time int unsigned,        -- Run time in seconds.
    runs int unsigned,            -- Number of iterations of the test.
    PRIMARY KEY (id),
    KEY (host),
    KEY (run_date)
  );

  CREATE TABLE sysbench_comment (
    id int unsigned NOT NULL auto_increment,
    sysbench_run_id int unsigned NOT NULL, -- FK pointing to sysbench_run.
    compile_info text,                     -- Compile options we used.
    machine_info text,                     -- Details about the hardware.
    sysbench_options text,                 -- The sysbench options we used.
    PRIMARY KEY (id),
    KEY (sysbench_run_id)
  );

  CREATE TABLE sysbench_result (
    id int unsigned NOT NULL auto_increment,
    sysbench_run_id int unsigned NOT NULL, -- FK pointing to sysbench_run.
    concurrency int unsigned,              -- Concurrency level we used.
    result decimal(7,2),                   -- The actual result.
    io varchar(80),                        -- The IO from iostat.
    cpu varchar(80),                       -- CPU utilization.
    profile text,                          -- Profiling information.
    error text,                            -- Error messages and stack traces.
    PRIMARY KEY (id),
    KEY (sysbench_run_id)
  );

* Generate a report out of the numbers
This script will generate a HTML version for putting up on the web and a
txt version for email usage.

Open items:
 ** Generate an overview table in the form of
                 Number of threads
                 1       4       8       16      32      64      128
sysbench test
 delete          121.52  144.77  117.70  115.15  100.48  75.39   66.56
   mean value of runs
   1 first run
   2 second run
   3 third run
   STDEV
   STDEV in % of mean
   CPU utilization (usr/sys/wait/idle)
   IO (read/write)

For HTML version additionally generate a graph with JPGraph.

* Get machine(s) and run the test on a weekly basis and for each release
comparing with the prior release.

* Email weekly reports and blog about it.


ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
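
The main loop in the low-level design is given as pseudocode. The following is a minimal runnable sketch of it, not the actual run-sysbench.sh. The DRY_RUN switch, the run_cmd and clear_caches helpers, the default values, and the use of the mysql command-line client for the schema statements are all assumptions made for illustration. start_mysqld is assumed to be defined elsewhere, as in the design. Where exactly the cache clearing happens relative to each run is also a guess. Writing to /proc/sys/vm/drop_caches requires root, so the sketch defaults to printing the commands instead of running them.

```shell
#!/bin/sh
# Sketch of the run-sysbench.sh main loop. All defaults are placeholders.
# DRY_RUN=1 (the default here) only prints the commands that would run.
DRY_RUN=${DRY_RUN:-1}
SYSBENCH=${SYSBENCH:-sysbench}
SYSBENCH_OPTIONS=${SYSBENCH_OPTIONS:-}
SYSBENCH_TESTS=${SYSBENCH_TESTS:-"delete"}
NUM_THREADS=${NUM_THREADS:-"1 4"}
LOOP_COUNT=${LOOP_COUNT:-3}

# Hypothetical helper: print the command in dry-run mode, else execute it.
run_cmd() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: flush dirty pages, then drop the file system caches.
clear_caches() {
    run_cmd sync
    run_cmd sh -c 'echo 3 > /proc/sys/vm/drop_caches'
}

main_loop() {
    run_cmd start_mysqld   # assumed to be provided by the real script
    for SYSBENCH_TEST in $SYSBENCH_TESTS; do
        for THREADS in $NUM_THREADS; do
            k=0
            while [ "$k" -lt "$LOOP_COUNT" ]; do
                run_cmd mysql -e "DROP SCHEMA IF EXISTS sbtest"
                run_cmd mysql -e "CREATE SCHEMA sbtest"
                run_cmd $SYSBENCH $SYSBENCH_OPTIONS prepare
                clear_caches   # assumed placement: right before the timed run
                run_cmd $SYSBENCH $SYSBENCH_OPTIONS run
                k=$((k + 1))
            done
        done
    done
}

main_loop
```

How the test name and thread count reach sysbench is left to $SYSBENCH_OPTIONS, as in the pseudocode; the sketch does not guess at the option names.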

[Maria-developers] Updated (by Hakan): Benchmark suite for sysbench (100)
by worklog-noreply＠askmonty.org 09 Mar '10

-----------------------------------------------------------------------
                              WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Benchmark suite for sysbench
CREATION DATE..: Thu, 04 Mar 2010, 17:46
SUPERVISOR.....: Igor
IMPLEMENTOR....: Hakan
COPIES TO......:
CATEGORY.......: Other
TASK ID........: 100 (http://askmonty.org/worklog/?tid=100)
VERSION........: Benchmarks-3.0
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 8
ESTIMATE.......: 32 (hours remain)
ORIG. ESTIMATE.: 40

PROGRESS NOTES:

-=-=(Hakan - Tue, 09 Mar 2010, 14:09)=-=-
Low Level Design modified.
--- /tmp/wklog.100.old.18803 2010-03-09 14:09:37.000000000 +0000
+++ /tmp/wklog.100.new.18803 2010-03-09 14:09:37.000000000 +0000
@@ -9,6 +9,12 @@
  ** Number of concurrent clients is hardcoded
  ** Machine specific configuration like location of binaries and
     directories needed are in separate config files located at conf/<hostname>.inc
+ ** Set random seed of sysbench to have better comparison
+ ** Restart mysqld from scratch for each run and copy away
+    DATA_DIR of the database for faster starts.
+ ** Between each run, run sync and clear file system caches with
+    echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc)
+ ** Write out mysqld and sysbench options for reference.
 
 The main loop of run-sysbench.sh is:
 
@@ -31,12 +37,6 @@
  ** sar -u (CPU utilization) hook
  ** Crash detection
  ** Error detection
- ** Set random seed of sysbench to have better comparision
- ** Restart mysqld from scratch for each run and copy away
-    DATA_DIR of the database for faster starts.
- ** Between each run, run sync and clear file system caches with
-    echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc)
- ** Write out mysqld and sysbench options for reference.
 
 * Analyze numbers
 This is implemented in analyze-sysbench.php

-=-=(Hakan - Tue, 09 Mar 2010, 14:06)=-=-
Added:
 ** Set random seed of sysbench to have better comparison
 ** Restart mysqld from scratch for each run and copy away
    DATA_DIR of the database for faster starts.
 ** Between each run, run sync and clear file system caches with
    echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc)
 ** Write out mysqld and sysbench options for reference.

Worked 8 hours and estimate 32 hours remain (original estimate unchanged).

-=-=(Hakan - Mon, 08 Mar 2010, 12:27)=-=-
Low Level Design modified.
--- /tmp/wklog.100.old.21404 2010-03-08 12:27:08.000000000 +0000
+++ /tmp/wklog.100.new.21404 2010-03-08 12:27:08.000000000 +0000
@@ -36,6 +36,7 @@
     DATA_DIR of the database for faster starts.
  ** Between each run, run sync and clear file system caches with
     echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc)
+ ** Write out mysqld and sysbench options for reference.
 
 * Analyze numbers
 This is implemented in analyze-sysbench.php

-=-=(Guest - Thu, 04 Mar 2010, 18:15)=-=-
Low Level Design modified.
--- /tmp/wklog.100.old.2695 2010-03-04 18:15:46.000000000 +0000
+++ /tmp/wklog.100.new.2695 2010-03-04 18:15:46.000000000 +0000
@@ -1 +1,112 @@
+All scripts can be found at lp:mariadb-tools/sysbench
+* Run sysbench tests on a machine and collect numbers
+This is implemented in run-sysbench.sh. Currently it supports:
+ ** Optionally pull of latest source from Launchpad and compile
+ ** Starting the server
+ ** Running each sysbench test for $LOOP_COUNT times and
+    $RUN_TIME time.
+ ** Number of concurrent clients is hardcoded
+ ** Machine specific configuration like location of binaries and
+    directories needed are in separate config files located at conf/<hostname>.inc
+
+The main loop of run-sysbench.sh is:
+
+start_mysqld
+for SYSBENCH_TEST in $SYSBENCH_TESTS
+  for THREADS in $NUM_THREADS
+    while [ $k -lt $LOOP_COUNT ]
+      drop schema sbtest
+      create schema sbtest
+
+      $SYSBENCH $SYSBENCH_OPTIONS prepare
+      $SYSBENCH $SYSBENCH_OPTIONS run
+    done
+  done
+done
+
+Open items:
+ ** OProfile hook
+ ** iostat hook
+ ** sar -u (CPU utilization) hook
+ ** Crash detection
+ ** Error detection
+ ** Set random seed of sysbench to have better comparision
+ ** Restart mysqld from scratch for each run and copy away
+    DATA_DIR of the database for faster starts.
+ ** Between each run, run sync and clear file system caches with
+    echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc)
+
+* Analyze numbers
+This is implemented in analyze-sysbench.php
+
+Open items:
+ ** Read result files from
+    ${RESULT_DIR}/${TODAY}/${PRODUCT}/${SYSBENCH_TEST}/${THREADS}/results.txt
+
+ ** Collect OProfile, iostat, cpu utilization, and machine info
+ ** Detect errors and crashes.
+ ** Generate SQL INSERT strings for presentation usage
+
+The layout for storing the numbers is:
+  CREATE TABLE sysbench_run (
+    id int unsigned NOT NULL auto_increment,
+    host varchar(80),             -- Hostname we ran the test on.
+    run_date date,                -- The day we ran the test.
+    sysbench_version varchar(32), -- Version of sysbench we used.
+    test_name varchar(32),        -- Name of the sysbench test.
+    run_time int unsigned,        -- Run time in seconds.
+    runs int unsigned,            -- Number of iterations of the test.
+    PRIMARY KEY (id),
+    KEY (host),
+    KEY (run_date
+  );
+
+  CREATE TABLE sysbench_comment (
+    id int unsigned NOT NULL auto_increment,
+    sysbench_run_id int unsigned NOT NULL, -- FK pointing to sysbench_run.
+    compile_info text,                     -- Compile options we used.
+    machine_info text,                     -- Details about the hardware.
+    sysbench_options text,                 -- The sysbench options we used.
+    PRIMARY KEY (id),
+    KEY (sysbench_run_id)
+  );
+
+  CREATE TABLE sysbench_result (
+    id int unsigned NOT NULL auto_increment,
+    sysbench_run_id int unsigned NOT NULL, -- FK pointing to sysbench_run.
+    concurrency int unsigned,              -- Concurrency level we used.
+    result decimal(7,2),                   -- The actual result.
+    io varchar(80),                        -- The IO from iostat.
+    cpu varchar(80),                       -- CPU utilization.
+    profile text,                          -- Profiling information.
+    error text,                            -- Error messages and stack traces.
+    PRIMARY KEY (id),
+    KEY (sysbench_run_id
+  );
+
+* Generate a report out of the numbers
+This script will generate a HTML version for putting up on the web and a
+txt version for email usage.
+
+Open items:
+ ** Generate an overview table in the form of
+                 Number of threads
+                 1       4       8       16      32      64      128
+sysbench test
+ delete          121.52  144.77  117.70  115.15  100.48  75.39   66.56
+   mean value of runs
+   1 first run
+   2 second run
+   3 third run
+   STDEV
+   STDEV in % of mean
+   CPU utilization (usr/sys/wait/idle)
+   IO (read/write)
+
+For HTML version additionally generate a graph with JPGraph.
+
+* Get machine(s) and run the test on a weekly basis and for each release
+comparing with the prior release.
+
+* Email weekly reports and blog about it.



DESCRIPTION:

Create a benchmark suite for running sysbench
* Run sysbench tests on a machine and collect numbers
* Analyze numbers
* Generate a report out of the numbers


LOW-LEVEL DESIGN:

All scripts can be found at lp:mariadb-tools/sysbench
* Run sysbench tests on a machine and collect numbers
This is implemented in run-sysbench.sh. Currently it supports:
 ** Optionally pull of latest source from Launchpad and compile
 ** Starting the server
 ** Running each sysbench test for $LOOP_COUNT times and
    $RUN_TIME time.
 ** Number of concurrent clients is hardcoded
 ** Machine specific configuration like location of binaries and
    directories needed are in separate config files located at conf/<hostname>.inc
 ** Set random seed of sysbench to have better comparison
 ** Restart mysqld from scratch for each run and copy away
    DATA_DIR of the database for faster starts.
 ** Between each run, run sync and clear file system caches with
    echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc)
 ** Write out mysqld and sysbench options for reference.

The main loop of run-sysbench.sh is:

start_mysqld
for SYSBENCH_TEST in $SYSBENCH_TESTS
  for THREADS in $NUM_THREADS
    while [ $k -lt $LOOP_COUNT ]
      drop schema sbtest
      create schema sbtest

      $SYSBENCH $SYSBENCH_OPTIONS prepare
      $SYSBENCH $SYSBENCH_OPTIONS run
    done
  done
done

Open items:
 ** OProfile hook
 ** iostat hook
 ** sar -u (CPU utilization) hook
 ** Crash detection
 ** Error detection

* Analyze numbers
This is implemented in analyze-sysbench.php

Open items:
 ** Read result files from
    ${RESULT_DIR}/${TODAY}/${PRODUCT}/${SYSBENCH_TEST}/${THREADS}/results.txt

 ** Collect OProfile, iostat, cpu utilization, and machine info
 ** Detect errors and crashes.
 ** Generate SQL INSERT strings for presentation usage

The layout for storing the numbers is:
  CREATE TABLE sysbench_run (
    id int unsigned NOT NULL auto_increment,
    host varchar(80),             -- Hostname we ran the test on.
    run_date date,                -- The day we ran the test.
    sysbench_version varchar(32), -- Version of sysbench we used.
    test_name varchar(32),        -- Name of the sysbench test.
    run_time int unsigned,        -- Run time in seconds.
    runs int unsigned,            -- Number of iterations of the test.
    PRIMARY KEY (id),
    KEY (host),
    KEY (run_date)
  );

  CREATE TABLE sysbench_comment (
    id int unsigned NOT NULL auto_increment,
    sysbench_run_id int unsigned NOT NULL, -- FK pointing to sysbench_run.
    compile_info text,                     -- Compile options we used.
    machine_info text,                     -- Details about the hardware.
    sysbench_options text,                 -- The sysbench options we used.
    PRIMARY KEY (id),
    KEY (sysbench_run_id)
  );

  CREATE TABLE sysbench_result (
    id int unsigned NOT NULL auto_increment,
    sysbench_run_id int unsigned NOT NULL, -- FK pointing to sysbench_run.
    concurrency int unsigned,              -- Concurrency level we used.
    result decimal(7,2),                   -- The actual result.
    io varchar(80),                        -- The IO from iostat.
    cpu varchar(80),                       -- CPU utilization.
    profile text,                          -- Profiling information.
    error text,                            -- Error messages and stack traces.
    PRIMARY KEY (id),
    KEY (sysbench_run_id)
  );

* Generate a report out of the numbers
This script will generate a HTML version for putting up on the web and a
txt version for email usage.

Open items:
 ** Generate an overview table in the form of
                 Number of threads
                 1       4       8       16      32      64      128
sysbench test
 delete          121.52  144.77  117.70  115.15  100.48  75.39   66.56
   mean value of runs
   1 first run
   2 second run
   3 third run
   STDEV
   STDEV in % of mean
   CPU utilization (usr/sys/wait/idle)
   IO (read/write)

For HTML version additionally generate a graph with JPGraph.

* Get machine(s) and run the test on a weekly basis and for each release
comparing with the prior release.

* Email weekly reports and blog about it.


ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
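
The overview table in the design lists, per test and thread count, the mean of the runs, STDEV, and STDEV in % of mean. A minimal sketch of that arithmetic follows, assuming the per-run results are already available as plain numbers. The function name summarize_runs and the output format are hypothetical, and the choice of the sample standard deviation (dividing by n - 1) is an assumption, since the design does not say which variant is meant. The design places the analysis in analyze-sysbench.php; this shell and awk version only illustrates the calculation.

```shell
#!/bin/sh
# Hypothetical helper: summarize the results of several runs.
# Usage: summarize_runs RESULT [RESULT ...]
# Prints the mean, the sample standard deviation, and the standard
# deviation as a percentage of the mean, each with two decimals.
summarize_runs() {
    printf '%s\n' "$@" | awk '
        { sum += $1; sumsq += $1 * $1; n++ }
        END {
            if (n < 2) {
                print "need at least two runs" > "/dev/stderr"
                exit 1
            }
            mean = sum / n
            # Sample variance (n - 1 in the denominator); this is an assumption.
            var = (sumsq - n * mean * mean) / (n - 1)
            if (var < 0) var = 0   # guard against rounding below zero
            sd = sqrt(var)
            pct = (mean != 0) ? 100 * sd / mean : 0
            printf "mean=%.2f stdev=%.2f stdev_pct=%.2f\n", mean, sd, pct
        }'
}
```

For example, `summarize_runs 2 4 6` prints `mean=4.00 stdev=2.00 stdev_pct=50.00`. The input values here are made up for illustration and are not benchmark results.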

[Maria-developers] Updated (by Hakan): Benchmark suite for sysbench (100)
by worklog-noreply＠askmonty.org 09 Mar '10

09 Mar '10

----------------------------------------------------------------------- WORKLOG TASK -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- TASK...........: Benchmark suite for sysbench CREATION DATE..: Thu, 04 Mar 2010, 17:46 SUPERVISOR.....: Igor IMPLEMENTOR....: Hakan COPIES TO......: CATEGORY.......: Other TASK ID........: 100 (http://askmonty.org/worklog/?tid=100) VERSION........: Benchmarks-3.0 STATUS.........: Assigned PRIORITY.......: 60 WORKED HOURS...: 8 ESTIMATE.......: 32 (hours remain) ORIG. ESTIMATE.: 40 PROGRESS NOTES: -=-=(Hakan - Tue, 09 Mar 2010, 14:09)=-=- Low Level Design modified. --- /tmp/wklog.100.old.18803 2010-03-09 14:09:37.000000000 +0000 +++ /tmp/wklog.100.new.18803 2010-03-09 14:09:37.000000000 +0000 @@ -9,6 +9,12 @@ ** Number of concurrent clients is hardcoded ** Machine specific configuration like location of binaries and directories needed are in separate config files located at conf/<hostname>.inc + ** Set random seed of sysbench to have better comparison + ** Restart mysqld from scratch for each run and copy away + DATA_DIR of the database for faster starts. + ** Between each run, run sync and clear file system caches with + echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc) + ** Write out mysqld and sysbench options for reference. The main loop of run-sysbench.sh is: @@ -31,12 +37,6 @@ ** sar -u (CPU utilization) hook ** Crash detection ** Error detection - ** Set random seed of sysbench to have better comparision - ** Restart mysqld from scratch for each run and copy away - DATA_DIR of the database for faster starts. - ** Between each run, run sync and clear file system caches with - echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc) - ** Write out mysqld and sysbench options for reference. 
* Analyze numbers This is implemented in analyze-sysbench.php -=-=(Hakan - Tue, 09 Mar 2010, 14:06)=-=- Added: ** Set random seed of sysbench to have better comparison ** Restart mysqld from scratch for each run and copy away DATA_DIR of the database for faster starts. ** Between each run, run sync and clear file system caches with echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc) ** Write out mysqld and sysbench options for reference. Worked 8 hours and estimate 32 hours remain (original estimate unchanged). -=-=(Hakan - Mon, 08 Mar 2010, 12:27)=-=- Low Level Design modified. --- /tmp/wklog.100.old.21404 2010-03-08 12:27:08.000000000 +0000 +++ /tmp/wklog.100.new.21404 2010-03-08 12:27:08.000000000 +0000 @@ -36,6 +36,7 @@ DATA_DIR of the database for faster starts. ** Between each run, run sync and clear file system caches with echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc) + ** Write out mysqld and sysbench options for reference. * Analyze numbers This is implemented in analyze-sysbench.php -=-=(Guest - Thu, 04 Mar 2010, 18:15)=-=- Low Level Design modified. --- /tmp/wklog.100.old.2695 2010-03-04 18:15:46.000000000 +0000 +++ /tmp/wklog.100.new.2695 2010-03-04 18:15:46.000000000 +0000 @@ -1 +1,112 @@ +All scripts can be found at lp:mariadb-tools/sysbench +* Run sysbench tests on a machine and collect numbers +This is implemented in run-sysbench.sh. Currently it supports: + ** Optionally pull of latest source from Launchpad and compile + ** Starting the server + ** Running each sysbench test for $LOOP_COUNT times and + $RUN_TIME time. 
+ ** Number of concurrent clients is hardcoded + ** Machine specific configuration like location of binaries and + directories needed are in separate config files located at conf/<hostname>.inc + +The main loop of run-sysbench.sh is: + +start_mysqld +for SYSBENCH_TEST in $SYSBENCH_TESTS + for THREADS in $NUM_THREADS + while [ $k -lt $LOOP_COUNT ] + drop schema sbtest + create schema sbtest + + $SYSBENCH $SYSBENCH_OPTIONS prepare + $SYSBENCH $SYSBENCH_OPTIONS run + done + done +done + +Open items: + ** OProfile hook + ** iostat hook + ** sar -u (CPU utilization) hook + ** Crash detection + ** Error detection + ** Set random seed of sysbench to have better comparision + ** Restart mysqld from scratch for each run and copy away + DATA_DIR of the database for faster starts. + ** Between each run, run sync and clear file system caches with + echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc) + +* Analyze numbers +This is implemented in analyze-sysbench.php + +Open items: + ** Read result files from + ${RESULT_DIR}/${TODAY}/${PRODUCT}/${SYSBENCH_TEST}/${THREADS}/results.txt + + ** Collect OProfile, iostat, cpu utilization, and machine info + ** Detect errors and crashes. + ** Generate SQL INSERT strings for presentation usage + +The layout for storing the numbers is: + CREATE TABLE sysbench_run ( + id int unsigned NOT NULL auto_increment, + host varchar(80), -- Hostname we ran the test on. + run_date date, -- The day we ran the test. + sysbench_version varchar(32), -- Version of sysbench we used. + test_name varchar(32), -- Name of the sysbench test. + run_time int unsigned, -- Run time in seconds. + runs int unsigned, -- Number of iterations of the test. + PRIMARY KEY (id), + KEY (host), + KEY (run_date + ); + + CREATE TABLE sysbench_comment ( + id int unsigned NOT NULL auto_increment, + sysbench_run_id int unsigned NOT NULL, -- FK pointing to sysbench_run. + compile_info text, -- Compile options we used. 
+ machine_info text, -- Details about the hardware. + sysbench_options text, -- The sysbench options we used. + PRIMARY KEY (id), + KEY (sysbench_run_id) + ); + + CREATE TABLE sysbench_result ( + id int unsigned NOT NULL auto_increment, + sysbench_run_id int unsigned NOT NULL, -- FK pointing to sysbench_run. + concurrency int unsigned, -- Concurrency level we used. + result decimal(7,2), -- The actual result. + io varchar(80), -- The IO from iostat. + cpu varchar(80), -- CPU utilization. + profile text, -- Profiling information. + error text, -- Error messages and stack traces. + PRIMARY KEY (id), + KEY (sysbench_run_id + ); + +* Generate a report out of the numbers +This script will generate a HTML version for putting up on the web and a +txt version for email usage. + +Open items: + ** Generate an overview table in the form of + Number of threads + 1 4 8 16 32 64 128 +sysbench test + delete 121.52 144.77 117.70 115.15 100.48 75.39 66.56 + mean value of runs + 1 first run + 2 second run + 3 third run + STDEV + STDEV in % of mean + CPU utilization (usr/sys/wait/idle) + IO (read/write) + +For HTML version additionally generate a graph with JPGraph. + +* Get machine(s) and run the test on a weekly basis and for each release +comparing with the prior release. + +* Email weekly reports and blog about it. DESCRIPTION: Create a benchmark suite for running sysbench * Run sysbench tests on a machine and collect numbers * Analyze numbers * Generate a report out of the numbers LOW-LEVEL DESIGN: All scripts can be found at lp:mariadb-tools/sysbench * Run sysbench tests on a machine and collect numbers This is implemented in run-sysbench.sh. Currently it supports: ** Optionally pull of latest source from Launchpad and compile ** Starting the server ** Running each sysbench test for $LOOP_COUNT times and $RUN_TIME time. 
** Number of concurrent clients is hardcoded ** Machine specific configuration like location of binaries and directories needed are in separate config files located at conf/<hostname>.inc ** Set random seed of sysbench to have better comparison ** Restart mysqld from scratch for each run and copy away DATA_DIR of the database for faster starts. ** Between each run, run sync and clear file system caches with echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc) ** Write out mysqld and sysbench options for reference. The main loop of run-sysbench.sh is: start_mysqld for SYSBENCH_TEST in $SYSBENCH_TESTS for THREADS in $NUM_THREADS while [ $k -lt $LOOP_COUNT ] drop schema sbtest create schema sbtest $SYSBENCH $SYSBENCH_OPTIONS prepare $SYSBENCH $SYSBENCH_OPTIONS run done done done Open items: ** OProfile hook ** iostat hook ** sar -u (CPU utilization) hook ** Crash detection ** Error detection * Analyze numbers This is implemented in analyze-sysbench.php Open items: ** Read result files from ${RESULT_DIR}/${TODAY}/${PRODUCT}/${SYSBENCH_TEST}/${THREADS}/results.txt ** Collect OProfile, iostat, cpu utilization, and machine info ** Detect errors and crashes. ** Generate SQL INSERT strings for presentation usage The layout for storing the numbers is: CREATE TABLE sysbench_run ( id int unsigned NOT NULL auto_increment, host varchar(80), -- Hostname we ran the test on. run_date date, -- The day we ran the test. sysbench_version varchar(32), -- Version of sysbench we used. test_name varchar(32), -- Name of the sysbench test. run_time int unsigned, -- Run time in seconds. runs int unsigned, -- Number of iterations of the test. PRIMARY KEY (id), KEY (host), KEY (run_date ); CREATE TABLE sysbench_comment ( id int unsigned NOT NULL auto_increment, sysbench_run_id int unsigned NOT NULL, -- FK pointing to sysbench_run. compile_info text, -- Compile options we used. machine_info text, -- Details about the hardware. 
sysbench_options text, -- The sysbench options we used. PRIMARY KEY (id), KEY (sysbench_run_id) ); CREATE TABLE sysbench_result ( id int unsigned NOT NULL auto_increment, sysbench_run_id int unsigned NOT NULL, -- FK pointing to sysbench_run. concurrency int unsigned, -- Concurrency level we used. result decimal(7,2), -- The actual result. io varchar(80), -- The IO from iostat. cpu varchar(80), -- CPU utilization. profile text, -- Profiling information. error text, -- Error messages and stack traces. PRIMARY KEY (id), KEY (sysbench_run_id ); * Generate a report out of the numbers This script will generate a HTML version for putting up on the web and a txt version for email usage. Open items: ** Generate an overview table in the form of Number of threads 1 4 8 16 32 64 128 sysbench test delete 121.52 144.77 117.70 115.15 100.48 75.39 66.56 mean value of runs 1 first run 2 second run 3 third run STDEV STDEV in % of mean CPU utilization (usr/sys/wait/idle) IO (read/write) For HTML version additionally generate a graph with JPGraph. * Get machine(s) and run the test on a weekly basis and for each release comparing with the prior release. * Email weekly reports and blog about it. ESTIMATED WORK TIME ESTIMATED COMPLETION DATE ----------------------------------------------------------------------- WorkLog (v3.5.9)

[Maria-developers] Progress (by Hakan): Benchmark suite for sysbench (100)
by worklog-noreply＠askmonty.org 09 Mar '10

09 Mar '10

-----------------------------------------------------------------------
                              WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Benchmark suite for sysbench
CREATION DATE..: Thu, 04 Mar 2010, 17:46
SUPERVISOR.....: Igor
IMPLEMENTOR....: Hakan
COPIES TO......:
CATEGORY.......: Other
TASK ID........: 100 (http://askmonty.org/worklog/?tid=100)
VERSION........: Benchmarks-3.0
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 8
ESTIMATE.......: 32 (hours remain)
ORIG. ESTIMATE.: 40

PROGRESS NOTES:

-=-=(Hakan - Tue, 09 Mar 2010, 14:06)=-=-
Added:
  ** Set random seed of sysbench to have better comparison
  ** Restart mysqld from scratch for each run and copy away
     DATA_DIR of the database for faster starts.
  ** Between each run, run sync and clear file system caches with
     echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc)
  ** Write out mysqld and sysbench options for reference.

Worked 8 hours and estimate 32 hours remain (original estimate unchanged).

-=-=(Hakan - Mon, 08 Mar 2010, 12:27)=-=-
Low Level Design modified.
--- /tmp/wklog.100.old.21404 2010-03-08 12:27:08.000000000 +0000
+++ /tmp/wklog.100.new.21404 2010-03-08 12:27:08.000000000 +0000
@@ -36,6 +36,7 @@
      DATA_DIR of the database for faster starts.
   ** Between each run, run sync and clear file system caches with
      echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc)
+  ** Write out mysqld and sysbench options for reference.
 
 * Analyze numbers
 This is implemented in analyze-sysbench.php

-=-=(Guest - Thu, 04 Mar 2010, 18:15)=-=-
Low Level Design modified.
--- /tmp/wklog.100.old.2695 2010-03-04 18:15:46.000000000 +0000
+++ /tmp/wklog.100.new.2695 2010-03-04 18:15:46.000000000 +0000
@@ -1 +1,112 @@
+All scripts can be found at lp:mariadb-tools/sysbench
+* Run sysbench tests on a machine and collect numbers
+This is implemented in run-sysbench.sh. Currently it supports:
+  ** Optionally pull of latest source from Launchpad and compile
+  ** Starting the server
+  ** Running each sysbench test for $LOOP_COUNT times and
+     $RUN_TIME time.
+  ** Number of concurrent clients is hardcoded
+  ** Machine specific configuration like location of binaries and
+     directories needed are in separate config files located at conf/<hostname>.inc
+
+The main loop of run-sysbench.sh is:
+
+start_mysqld
+for SYSBENCH_TEST in $SYSBENCH_TESTS
+  for THREADS in $NUM_THREADS
+    while [ $k -lt $LOOP_COUNT ]
+      drop schema sbtest
+      create schema sbtest
+
+      $SYSBENCH $SYSBENCH_OPTIONS prepare
+      $SYSBENCH $SYSBENCH_OPTIONS run
+    done
+  done
+done
+
+Open items:
+  ** OProfile hook
+  ** iostat hook
+  ** sar -u (CPU utilization) hook
+  ** Crash detection
+  ** Error detection
+  ** Set random seed of sysbench to have better comparision
+  ** Restart mysqld from scratch for each run and copy away
+     DATA_DIR of the database for faster starts.
+  ** Between each run, run sync and clear file system caches with
+     echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc)
+
+* Analyze numbers
+This is implemented in analyze-sysbench.php
+
+Open items:
+  ** Read result files from
+     ${RESULT_DIR}/${TODAY}/${PRODUCT}/${SYSBENCH_TEST}/${THREADS}/results.txt
+
+  ** Collect OProfile, iostat, cpu utilization, and machine info
+  ** Detect errors and crashes.
+  ** Generate SQL INSERT strings for presentation usage
+
+The layout for storing the numbers is:
+  CREATE TABLE sysbench_run (
+    id int unsigned NOT NULL auto_increment,
+    host varchar(80), -- Hostname we ran the test on.
+    run_date date, -- The day we ran the test.
+    sysbench_version varchar(32), -- Version of sysbench we used.
+    test_name varchar(32), -- Name of the sysbench test.
+    run_time int unsigned, -- Run time in seconds.
+    runs int unsigned, -- Number of iterations of the test.
+    PRIMARY KEY (id),
+    KEY (host),
+    KEY (run_date
+  );
+
+  CREATE TABLE sysbench_comment (
+    id int unsigned NOT NULL auto_increment,
+    sysbench_run_id int unsigned NOT NULL, -- FK pointing to sysbench_run.
+    compile_info text, -- Compile options we used.
+    machine_info text, -- Details about the hardware.
+    sysbench_options text, -- The sysbench options we used.
+    PRIMARY KEY (id),
+    KEY (sysbench_run_id)
+  );
+
+  CREATE TABLE sysbench_result (
+    id int unsigned NOT NULL auto_increment,
+    sysbench_run_id int unsigned NOT NULL, -- FK pointing to sysbench_run.
+    concurrency int unsigned, -- Concurrency level we used.
+    result decimal(7,2), -- The actual result.
+    io varchar(80), -- The IO from iostat.
+    cpu varchar(80), -- CPU utilization.
+    profile text, -- Profiling information.
+    error text, -- Error messages and stack traces.
+    PRIMARY KEY (id),
+    KEY (sysbench_run_id
+  );
+
+* Generate a report out of the numbers
+This script will generate a HTML version for putting up on the web and a
+txt version for email usage.
+
+Open items:
+  ** Generate an overview table in the form of
+                        Number of threads
+                1       4       8       16      32      64      128
+sysbench test
+ delete         121.52  144.77  117.70  115.15  100.48  75.39   66.56
+   mean value of runs
+   1 first run
+   2 second run
+   3 third run
+   STDEV
+   STDEV in % of mean
+   CPU utilization (usr/sys/wait/idle)
+   IO (read/write)
+
+For HTML version additionally generate a graph with JPGraph.
+
+* Get machine(s) and run the test on a weekly basis and for each release
+comparing with the prior release.
+
+* Email weekly reports and blog about it.

DESCRIPTION:

Create a benchmark suite for running sysbench
* Run sysbench tests on a machine and collect numbers
* Analyze numbers
* Generate a report out of the numbers

LOW-LEVEL DESIGN:

All scripts can be found at lp:mariadb-tools/sysbench
* Run sysbench tests on a machine and collect numbers
This is implemented in run-sysbench.sh. Currently it supports:
  ** Optionally pulling the latest source from Launchpad and compiling it
  ** Starting the server
  ** Running each sysbench test $LOOP_COUNT times, for
     $RUN_TIME each.
  ** Number of concurrent clients is hardcoded
  ** Machine-specific configuration, such as the location of binaries and
     required directories, is kept in separate config files located at
     conf/<hostname>.inc

The main loop of run-sysbench.sh is:

start_mysqld
for SYSBENCH_TEST in $SYSBENCH_TESTS
  for THREADS in $NUM_THREADS
    while [ $k -lt $LOOP_COUNT ]
      drop schema sbtest
      create schema sbtest

      $SYSBENCH $SYSBENCH_OPTIONS prepare
      $SYSBENCH $SYSBENCH_OPTIONS run
    done
  done
done

Open items:
  ** OProfile hook
  ** iostat hook
  ** sar -u (CPU utilization) hook
  ** Crash detection
  ** Error detection
  ** Set random seed of sysbench to have better comparison
  ** Restart mysqld from scratch for each run and copy away
     DATA_DIR of the database for faster starts.
  ** Between each run, run sync and clear file system caches with
     echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc)
  ** Write out mysqld and sysbench options for reference.

* Analyze numbers
This is implemented in analyze-sysbench.php

Open items:
  ** Read result files from
     ${RESULT_DIR}/${TODAY}/${PRODUCT}/${SYSBENCH_TEST}/${THREADS}/results.txt

  ** Collect OProfile, iostat, CPU utilization, and machine info
  ** Detect errors and crashes.
  ** Generate SQL INSERT strings for presentation usage

The layout for storing the numbers is:
  CREATE TABLE sysbench_run (
    id int unsigned NOT NULL auto_increment,
    host varchar(80), -- Hostname we ran the test on.
    run_date date, -- The day we ran the test.
    sysbench_version varchar(32), -- Version of sysbench we used.
    test_name varchar(32), -- Name of the sysbench test.
    run_time int unsigned, -- Run time in seconds.
    runs int unsigned, -- Number of iterations of the test.
    PRIMARY KEY (id),
    KEY (host),
    KEY (run_date)
  );

  CREATE TABLE sysbench_comment (
    id int unsigned NOT NULL auto_increment,
    sysbench_run_id int unsigned NOT NULL, -- FK pointing to sysbench_run.
    compile_info text, -- Compile options we used.
    machine_info text, -- Details about the hardware.
    sysbench_options text, -- The sysbench options we used.
    PRIMARY KEY (id),
    KEY (sysbench_run_id)
  );

  CREATE TABLE sysbench_result (
    id int unsigned NOT NULL auto_increment,
    sysbench_run_id int unsigned NOT NULL, -- FK pointing to sysbench_run.
    concurrency int unsigned, -- Concurrency level we used.
    result decimal(7,2), -- The actual result.
    io varchar(80), -- The IO from iostat.
    cpu varchar(80), -- CPU utilization.
    profile text, -- Profiling information.
    error text, -- Error messages and stack traces.
    PRIMARY KEY (id),
    KEY (sysbench_run_id)
  );

* Generate a report out of the numbers
This script will generate an HTML version for putting up on the web and a
txt version for email usage.

Open items:
  ** Generate an overview table in the form of
                        Number of threads
                1       4       8       16      32      64      128
sysbench test
 delete         121.52  144.77  117.70  115.15  100.48  75.39   66.56
   mean value of runs
   1 first run
   2 second run
   3 third run
   STDEV
   STDEV in % of mean
   CPU utilization (usr/sys/wait/idle)
   IO (read/write)

For the HTML version, additionally generate a graph with JPGraph.

* Get machine(s) and run the test on a weekly basis and for each release
comparing with the prior release.

* Email weekly reports and blog about it.

ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
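
The report section above lists a mean, a STDEV and a "STDEV in % of mean" row per test and thread count. A minimal sketch of that arithmetic follows. It assumes the sample standard deviation (n - 1 in the denominator; the worklog does not say which variant is intended), and run_stats is a hypothetical helper name, not part of the scripts in lp:mariadb-tools.

```shell
#!/bin/sh
# run_stats V1 V2 [V3 ...]
# Prints "mean stdev stdev_pct" for the given per-run results.
# Assumption: sample standard deviation (divides by n - 1).
run_stats() {
    printf '%s\n' "$@" | awk '
        { sum += $1; sumsq += $1 * $1; n++ }
        END {
            if (n < 2) exit 1                 # stdev is undefined for one value
            mean = sum / n
            var = (sumsq - n * mean * mean) / (n - 1)
            if (var < 0) var = 0              # guard against rounding noise
            sd = sqrt(var)
            pct = (mean != 0) ? 100 * sd / mean : 0
            printf "%.2f %.2f %.2f\n", mean, sd, pct
        }'
}
```

It would be called with the results of the individual runs of one test at one concurrency level, for example the three values behind one cell of the overview table.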


[Maria-developers] Rev 20: * Fixed variable name. in file:///Users/hakan/work/monty_program/mariadb-tools/
by Hakan Kuecuekyilmaz 09 Mar '10

09 Mar '10

At file:///Users/hakan/work/monty_program/mariadb-tools/

------------------------------------------------------------
revno: 20
revision-id: hakan(a)askmonty.org-20100309140451-ehuc08gs4x851gtn
parent: hakan(a)askmonty.org-20100309140429-75fbhpdrejalq31d
committer: Hakan Kuecuekyilmaz <hakan(a)askmonty.org>
branch nick: mariadb-tools
timestamp: Tue 2010-03-09 15:04:51 +0100
message:
  * Fixed variable name.
=== modified file 'sql-bench/run-sql-bench.sh'
--- a/sql-bench/run-sql-bench.sh 2010-02-04 01:19:56 +0000
+++ b/sql-bench/run-sql-bench.sh 2010-03-09 14:04:51 +0000
@@ -61,7 +61,7 @@
 AVAILABLE=$(df $WORK_DIR | grep -v Filesystem | awk '{ print $4 }')
 
 if [ $AVAILABLE -lt $SPACE_LIMIT ]; then
-    echo "[ERROR]: We need at least $ONE_GB space in $WORK_DIR."
+    echo "[ERROR]: We need at least $SPACE_LIMIT space in $WORK_DIR."
     echo 'Exiting.'
 
     exit 1
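
A note on the disk space check touched by this revision: the unit of the fourth df column depends on the platform's default block size (GNU df reports 1K blocks by default, while some systems such as Mac OS X default to 512-byte blocks). A minimal sketch that pins the unit with the POSIX -P and -k options is shown below. The helper names available_kb and has_space are hypothetical and not part of run-sql-bench.sh.

```shell
#!/bin/sh
# available_kb DIR
# Prints the free space, in 1024-byte blocks, of the file system holding DIR.
# -P gives the portable one-line-per-file-system format, -k fixes the unit.
available_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# has_space DIR LIMIT_KB
# Succeeds if at least LIMIT_KB kilobytes are available in DIR.
has_space() {
    [ "$(available_kb "$1")" -ge "$2" ]
}
```

With the names from the diff, the check could then read: has_space "$WORK_DIR" "$SPACE_LIMIT" || exit 1.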

[Maria-developers] Rev 19: * Added sudo for perro and work.inc in file:///Users/hakan/work/monty_program/mariadb-tools/
by Hakan Kuecuekyilmaz 09 Mar '10

09 Mar '10

At file:///Users/hakan/work/monty_program/mariadb-tools/

------------------------------------------------------------
revno: 19
revision-id: hakan(a)askmonty.org-20100309140429-75fbhpdrejalq31d
parent: hakan(a)askmonty.org-20100304020337-9xmeeklcn4uccvdt
committer: Hakan Kuecuekyilmaz <hakan(a)askmonty.org>
branch nick: mariadb-tools
timestamp: Tue 2010-03-09 15:04:29 +0100
message:
  * Added sudo for perro and work.inc
  * Set random seed of sysbench to have better comparison
  * Restart mysqld from scratch for each run and copy away
    DATA_DIR of the database for faster starts.
  * Between each run, run sync and clear file system caches with
    echo 3 > /proc/sys/vm/drop_caches (http://linux.die.net/man/5/proc)
  * Write out mysqld and sysbench options for reference.
=== modified file 'sysbench/analyze-sysbench.php'
--- a/sysbench/analyze-sysbench.php 2010-03-04 02:03:37 +0000
+++ b/sysbench/analyze-sysbench.php 2010-03-09 14:04:29 +0000
@@ -2,10 +2,10 @@
 /**
  * Analyze sysbench v0.5 results
  *
- * We take one directories as an argument and produce
+ * We take one directory as an argument and produce
  * SQL INSERT statements for further usage.
  *
- * The directory structure is:
+ * The current directory structure is:
  * ${RESULT_DIR}/${TODAY}/${PRODUCT}/${SYSBENCH_TEST}/${THREADS}
  *
  * For instance:
@@ -19,7 +19,7 @@
  * 21749.94
  *
  * The current layout of the tables for storing the
- * benchmark results of sysbench is:
+ * benchmark results of a sysbench run is:
  * CREATE TABLE sysbench_run (
  * id int unsigned NOT NULL auto_increment,
  * host varchar(80), -- Hostname we ran the test on.
@@ -62,7 +62,7 @@
  */
 
 /**
- * Base path to our result files
+ * Base path to our result files.
  */
 define('BASE_PATH', $_SERVER['HOME'] . '/work/sysbench-results/' . RUN_DATE . '/' . PRODUCT);

=== modified file 'sysbench/conf/perro.inc'
--- a/sysbench/conf/perro.inc 2010-03-04 02:03:03 +0000
+++ b/sysbench/conf/perro.inc 2010-03-09 14:04:29 +0000
@@ -24,6 +24,9 @@
 IOSTAT_DEVICE='/dev/sda'
 SAR='/usr/bin/sar'
 
+# Other binaries.
+SUDO=/my/local/bin/sur
+
 # Directories.
 TEMP_DIR='/mnt/data/sysbench'
 DATA_DIR="${TEMP_DIR}/data"

=== modified file 'sysbench/conf/work.inc'
--- a/sysbench/conf/work.inc 2010-03-04 02:03:03 +0000
+++ b/sysbench/conf/work.inc 2010-03-09 14:04:29 +0000
@@ -24,6 +24,9 @@
 IOSTAT_DEVICE='/dev/sda'
 SAR='/usr/bin/sar'
 
+# Other binaries.
+SUDO=/my/local/bin/sur
+
 # Directories.
 TEMP_DIR="${HOME}/tmp"
 DATA_DIR="${TEMP_DIR}/data"

=== modified file 'sysbench/run-sysbench-myisam.sh'
--- a/sysbench/run-sysbench-myisam.sh 2010-03-04 02:03:03 +0000
+++ b/sysbench/run-sysbench-myisam.sh 2010-03-09 14:04:29 +0000
@@ -91,6 +91,10 @@
     update_index.lua \
     update_non_index.lua"
 
+#
+# Note: myisam-max-rows has to match or exceed oltp-table-size
+# otherwise we get a table full error while preparing the run.
+#
 SYSBENCH_OPTIONS="--oltp-table-size=$TABLE_SIZE \
     --max-time=$RUN_TIME \
     --max-requests=0 \

=== modified file 'sysbench/run-sysbench.sh'
--- a/sysbench/run-sysbench.sh 2010-03-04 02:03:03 +0000
+++ b/sysbench/run-sysbench.sh 2010-03-09 14:04:29 +0000
@@ -93,6 +93,9 @@
 # How many times we run each test.
 LOOP_COUNT=3
 
+# We need at least 1 GB disk space in our $WORK_DIR.
+SPACE_LIMIT=1000000
+
 SYSBENCH_TESTS="delete.lua \
     insert.lua \
     oltp_complex_ro.lua \
@@ -107,7 +110,9 @@
     --max-requests=0 \
     --mysql-table-engine=InnoDB \
     --mysql-user=root \
-    --mysql-engine-trx=yes"
+    --mysql-engine-trx=yes \
+    --rand-init=on \
+    --rand-seed=303"
 
 # Timeout in seconds for waiting for mysqld to start.
 TIMEOUT=100
@@ -118,12 +123,25 @@
 BASE="${HOME}/work"
 TEST_DIR="${BASE}/monty_program/sysbench/sysbench/tests/db"
 RESULT_DIR="${BASE}/sysbench-results"
+SYSBENCH_DB_BACKUP="${TEMP_DIR}/sysbench_db"
 
 #
 # Files
 #
 BUILD_LOG="${WORK_DIR}/${PRODUCT}_build.log"
 
+#
+# Check system.
+#
+# We should at least have $SPACE_LIMIT in $WORKDIR.
+AVAILABLE=$(df $WORK_DIR | grep -v Filesystem | awk '{ print $4 }')
+
+if [ $AVAILABLE -lt $SPACE_LIMIT ]; then
+    echo "[ERROR]: We need at least $SPACE_LIMIT space in $WORK_DIR."
+    echo 'Exiting.'
+
+    exit 1
+fi
 
 if [ ! -d $LOCAL_MASTER ]; then
     echo "[ERROR]: Supplied local master $LOCAL_MASTER does not exists."
@@ -209,40 +227,72 @@
 mkdir ${RESULT_DIR}/${TODAY}
 mkdir ${RESULT_DIR}/${TODAY}/${PRODUCT}
 
-killall -9 mysqld
-rm -rf $DATA_DIR
-rm -f $MY_SOCKET
-mkdir $DATA_DIR
-
-sql/mysqld $MYSQL_OPTIONS &
-
-j=0
-STARTED=-1
-while [ $j -le $TIMEOUT ]
-    do
-    $MYSQLADMIN $MYSQLADMIN_OPTIONS ping > /dev/null 2>&1
-    if [ $? = 0 ]; then
-        STARTED=0
+function kill_mysqld {
+    killall -9 mysqld
+    rm -rf $DATA_DIR
+    rm -f $MY_SOCKET
+    mkdir $DATA_DIR
+}
+
+function start_mysqld {
+    sql/mysqld $MYSQL_OPTIONS &
+
+    j=0
+    STARTED=-1
+    while [ $j -le $TIMEOUT ]
+    do
+        $MYSQLADMIN $MYSQLADMIN_OPTIONS ping > /dev/null 2>&1
+        if [ $? = 0 ]; then
+            STARTED=0
+
+            break
+        fi
 
-        break
+        sleep 1
+        j=$(($j + 1))
+    done
+
+    if [ $STARTED != 0 ]; then
+        echo '[ERROR]: Start of mysqld failed.'
+        echo ' Please check your error log.'
+        echo ' Exiting.'
+
+        exit 1
     fi
-
-    sleep 1
-    j=$(($j + 1))
-done
-
-if [ $STARTED != 0 ]; then
-    echo '[ERROR]: Start of mysqld failed.'
-    echo ' Please check your error log.'
-    echo ' Exiting.'
-
-    exit 1
-fi
+}
+
+#
+# Write out configurations used for future refernce.
+#
+echo $MYSQL_OPTIONS > ${RESULT_DIR}/${TODAY}/${PRODUCT}/mysqld_options.txt
+echo $SYSBENCH_OPTIONS > ${RESULT_DIR}/${TODAY}/${PRODUCT}/sysbench_options.txt
 
 for SYSBENCH_TEST in $SYSBENCH_TESTS
 do
     mkdir ${RESULT_DIR}/${TODAY}/${PRODUCT}/${SYSBENCH_TEST}
 
+    kill_mysqld
+    start_mysqld
+    $MYSQLADMIN $MYSQLADMIN_OPTIONS create sbtest
+    if [ $? != 0 ]; then
+        echo "[ERROR]: Create of sbtest database failed"
+        echo " Please check your setup."
+        echo " Exiting"
+        exit 1
+    fi
+
+    echo "[$(date "+%Y-%m-%d %H:%M:%S")] Preparing and loading data for $SYSBENCH_TEST."
+    SYSBENCH_OPTIONS="${SYSBENCH_OPTIONS} --test=${TEST_DIR}/${SYSBENCH_TEST}"
+    $SYSBENCH $SYSBENCH_OPTIONS prepare
+
+    $MYSQLADMIN $MYSQLADMIN_OPTIONS shutdown
+    sync
+    rm -rf ${SYSBENCH_DB_BACKUP}
+    mkdir ${SYSBENCH_DB_BACKUP}
+
+    echo "[$(date "+%Y-%m-%d %H:%M:%S")] Copying $DATA_DIR of $SYSBENCH_TEST for later usage."
+    cp -a ${DATA_DIR}/* ${SYSBENCH_DB_BACKUP}/
+
     for THREADS in $NUM_THREADS
     do
         THIS_RESULT_DIR="${RESULT_DIR}/${TODAY}/${PRODUCT}/${SYSBENCH_TEST}/${THREADS}"
@@ -250,23 +300,24 @@
         echo "[$(date "+%Y-%m-%d %H:%M:%S")] Running $SYSBENCH_TEST with $THREADS threads and $LOOP_COUNT iterations for $PRODUCT" | tee ${THIS_RESULT_DIR}/results.txt
         echo '' >> ${THIS_RESULT_DIR}/results.txt
 
+        SYSBENCH_OPTIONS="$SYSBENCH_OPTIONS --num-threads=$THREADS"
+
         k=0
         while [ $k -lt $LOOP_COUNT ]
         do
-            $MYSQLADMIN $MYSQLADMIN_OPTIONS -f drop sbtest
-            $MYSQLADMIN $MYSQLADMIN_OPTIONS create sbtest
-            if [ $? != 0 ]; then
-                echo "[ERROR]: Create of sbtest database failed"
-                echo " Please check your setup."
-                echo " Exiting"
-                exit 1
-            fi
-
-            SYSBENCH_OPTIONS="$SYSBENCH_OPTIONS --num-threads=$THREADS --test=${TEST_DIR}/${SYSBENCH_TEST}"
-            $SYSBENCH $SYSBENCH_OPTIONS prepare
-
-            sync
-            sleep 3
+            echo ''
+            echo "[$(date "+%Y-%m-%d %H:%M:%S")] Killing mysqld and copying back $DATA_DIR for $SYSBENCH_TEST."
+            kill_mysqld
+            cp -a ${SYSBENCH_DB_BACKUP}/* ${DATA_DIR}
+
+            # Clear file system cache. This works only with Linux >= 2.6.16.
+            # On Mac OS X we can use sync; purge.
+            sync
+            echo 3 | $SUDO tee /proc/sys/vm/drop_caches
+
+            echo "[$(date "+%Y-%m-%d %H:%M:%S")] Starting mysqld for running $SYSBENCH_TEST with $THREADS threads and $LOOP_COUNT iterations for $PRODUCT"
+            start_mysqld
+            sync
 
             $SYSBENCH $SYSBENCH_OPTIONS run > ${THIS_RESULT_DIR}/result${k}.txt 2>&1
 
@@ -274,6 +325,9 @@
 
             k=$(($k + 1))
         done
+
+        echo '' >> ${THIS_RESULT_DIR}/results.txt
+        echo "[$(date "+%Y-%m-%d %H:%M:%S")] Finnished $SYSBENCH_TEST with $THREADS threads and $LOOP_COUNT iterations for $PRODUCT" | tee -a ${THIS_RESULT_DIR}/results.txt
     done
 done
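
The start_mysqld function in the diff above polls mysqladmin ping once per second until the server answers or $TIMEOUT is exceeded. The same pattern can be sketched as a generic helper. This is only an illustration under stated assumptions: wait_for is a hypothetical name that does not appear in run-sysbench.sh, and it returns a status instead of exiting so that the caller decides how to report the failure.

```shell
#!/bin/sh
# wait_for TIMEOUT CMD [ARGS...]
# Runs CMD once per second until it succeeds or TIMEOUT seconds have passed.
# Returns 0 as soon as CMD succeeds, 1 if it never did.
wait_for() {
    timeout=$1
    shift
    elapsed=0
    while [ "$elapsed" -le "$timeout" ]; do
        if "$@" > /dev/null 2>&1; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

With the variables from the script, the readiness check would look like: wait_for "$TIMEOUT" $MYSQLADMIN $MYSQLADMIN_OPTIONS ping, followed by the existing error message and exit 1 when it fails.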

[Maria-developers] Rev 2766: MWL#68 Subquery optimization: Efficient NOT IN execution with NULLs in file:///home/tsk/mprog/src/5.3-mwl68-merge.base-mwl68/
by timour＠askmonty.org 09 Mar '10

09 Mar '10

At file:///home/tsk/mprog/src/5.3-mwl68-merge.base-mwl68/

------------------------------------------------------------
revno: 2766 [merge]
revision-id: timour(a)askmonty.org-20100309103615-dzmm6xt7ye5xfs25
parent: timour(a)askmonty.org-20100309101406-xygkt2sgftvjvevg
parent: psergey(a)askmonty.org-20100307154145-ksby2b1l0sqm1xne
committer: timour(a)askmonty.org
branch nick: 5.3-mwl68-merge.base-mwl68
timestamp: Tue 2010-03-09 12:36:15 +0200
message:
  MWL#68 Subquery optimization: Efficient NOT IN execution with NULLs

  Automerge with 5.3-subqueries
modified:
  mysql-test/r/join_cache.result join_cache.result-20091221012827-jfu65h0x5bmixhh3-1
  mysql-test/r/subselect_sj.result subselect_sj.result-20100117143926-nrop4ku355g3kv8b-1
  mysql-test/r/subselect_sj2.result subselect_sj2.result-20100117143927-4k8x8d6czjviugog-1
  mysql-test/r/subselect_sj2_jcl6.result subselect_sj2_jcl6.r-20100117143927-r3uxj2zuyjtrnokh-1
  mysql-test/r/subselect_sj_jcl6.result subselect_sj_jcl6.re-20100117143928-7vzk51yaf29cdavp-1
  mysql-test/suite/pbxt/r/group_min_max.result group_min_max.result-20090402100035-4ilk9i91sh65vjcb-69
  mysql-test/suite/pbxt/r/subselect.result subselect.result-20090402100035-4ilk9i91sh65vjcb-146
  mysql-test/t/join_cache.test join_cache.test-20091221012705-n3szmbc9blgmmu84-1
  mysql-test/t/subselect_sj.test subselect_sj.test-20100117143931-qp396ufpe3k0scre-1
  mysql-test/t/subselect_sj2.test subselect_sj2.test-20100117143932-vxp9ugyy3s0mdo5j-1
  mysql-test/t/subselect_sj_jcl6.test subselect_sj_jcl6.te-20100117144012-tmbazng78xjyw6m1-1
  sql/item.cc sp1f-item.cc-19700101030959-u7hxqopwpfly4kf5ctlyk2dvrq4l3dhn
  sql/opt_subselect.cc opt_subselect.cc-20100215190428-nekkl8wisp0k6nlk-1
  sql/sql_join_cache.cc sql_join_cache.cc-20091221012625-ipp8zu28iijhjmq2-1
  sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
  sql/sql_select.h sp1f-sql_select.h-19700101030959-oqegfxr76xlgmrzd6qlevonoibfnwzoz
=== modified file 'mysql-test/r/join_cache.result'
--- a/mysql-test/r/join_cache.result 2010-02-11 21:59:32 +0000
+++ b/mysql-test/r/join_cache.result 2010-03-06 19:14:55 +0000
@@ -4142,3 +4142,46 @@
 2 2 tt uu 2 2
 set join_cache_level=default;
 DROP TABLE t1,t2;
+#
+# Bug #51092: linked join buffer is used for a 3-way cross join query
+# that selects only records of the first table
+#
+create table t1 (a int, b int);
+insert into t1 values (1,1),(2,2);
+create table t2 (a int, b int);
+insert into t2 values (1,1),(2,2);
+create table t3 (a int, b int);
+insert into t3 values (1,1),(2,2);
+explain select t1.* from t1,t2,t3;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 2
+1 SIMPLE t2 ALL NULL NULL NULL NULL 2 Using join buffer
+1 SIMPLE t3 ALL NULL NULL NULL NULL 2 Using join buffer
+select t1.* from t1,t2,t3;
+a b
+1 1
+2 2
+1 1
+2 2
+1 1
+2 2
+1 1
+2 2
+set join_cache_level=2;
+explain select t1.* from t1,t2,t3;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 2
+1 SIMPLE t2 ALL NULL NULL NULL NULL 2 Using join buffer
+1 SIMPLE t3 ALL NULL NULL NULL NULL 2 Using join buffer
+select t1.* from t1,t2,t3;
+a b
+1 1
+2 2
+1 1
+2 2
+1 1
+2 2
+1 1
+2 2
+set join_cache_level=default;
+drop table t1,t2,t3;

=== modified file 'mysql-test/r/subselect_sj.result'
--- a/mysql-test/r/subselect_sj.result 2010-02-21 07:53:12 +0000
+++ b/mysql-test/r/subselect_sj.result 2010-02-24 11:33:42 +0000
@@ -824,3 +824,50 @@
 3
 2
 drop table t1, t2, t3;
+#
+# Bug#49198 Wrong result for second call of procedure
+# with view in subselect.
+#
+CREATE TABLE t1 (t1field integer, primary key (t1field));
+CREATE TABLE t2 (t2field integer, primary key (t2field));
+CREATE TABLE t3 (t3field integer, primary key (t3field));
+CREATE VIEW v2 AS SELECT * FROM t2;
+CREATE VIEW v3 AS SELECT * FROM t3;
+INSERT INTO t1 VALUES(1),(2);
+INSERT INTO t2 VALUES(1),(2);
+INSERT INTO t3 VALUES(1),(2);
+PREPARE stmt FROM
+"
+SELECT t1field
+FROM t1
+WHERE t1field IN (SELECT * FROM v2);
+";
+EXECUTE stmt;
+t1field
+1
+2
+EXECUTE stmt;
+t1field
+1
+2
+PREPARE stmt FROM
+"
+EXPLAIN
+SELECT t1field
+FROM t1
+WHERE t1field IN (SELECT * FROM v2)
+ AND t1field IN (SELECT * FROM v3)
+";
+EXECUTE stmt;
+id select_type table type possible_keys key key_len ref rows Extra
+1 PRIMARY t1 index PRIMARY PRIMARY 4 NULL 2 Using index
+1 PRIMARY t2 eq_ref PRIMARY PRIMARY 4 test.t1.t1field 1 Using index
+1 PRIMARY t3 eq_ref PRIMARY PRIMARY 4 test.t1.t1field 1 Using index
+EXECUTE stmt;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 index PRIMARY PRIMARY 4 NULL 2 Using index
+1 SIMPLE t2 eq_ref PRIMARY PRIMARY 4 test.t1.t1field 1 Using index
+1 SIMPLE t3 eq_ref PRIMARY PRIMARY 4 test.t1.t1field 1 Using index
+DROP TABLE t1, t2, t3;
+DROP VIEW v2, v3;
+# End of Bug#49198

=== modified file 'mysql-test/r/subselect_sj2.result'
--- a/mysql-test/r/subselect_sj2.result 2010-02-17 10:47:55 +0000
+++ b/mysql-test/r/subselect_sj2.result 2010-03-07 15:41:45 +0000
@@ -264,8 +264,8 @@
 from t0 where a in
 (select t2.a+t3.a from t1 left join (t2 join t3) on t2.a=t1.a and t3.a=t1.a);
 id select_type table type possible_keys key key_len ref rows Extra
-1 PRIMARY t0 ALL NULL NULL NULL NULL 10
-1 PRIMARY t1 index NULL a 5 NULL 10 Using index; Start temporary; Using join buffer
+1 PRIMARY t0 ALL NULL NULL NULL NULL 10 Start temporary
+1 PRIMARY t1 index NULL a 5 NULL 10 Using index; Using join buffer
 1 PRIMARY t2 ref a a 5 test.t1.a 1 Using index
 1 PRIMARY t3 ref a a 5 test.t1.a 1 Using where; Using index; End temporary
 drop table t0, t1,t2,t3;

=== modified file 'mysql-test/r/subselect_sj2_jcl6.result'
--- a/mysql-test/r/subselect_sj2_jcl6.result 2010-02-17 10:47:55 +0000
+++ b/mysql-test/r/subselect_sj2_jcl6.result 2010-03-07 15:41:45 +0000
@@ -268,8 +268,8 @@
 from t0 where a in
 (select t2.a+t3.a from t1 left join (t2 join t3) on t2.a=t1.a and t3.a=t1.a);
 id select_type table type possible_keys key key_len ref rows Extra
-1 PRIMARY t0 ALL NULL NULL NULL NULL 10
-1 PRIMARY t1 index NULL a 5 NULL 10 Using index; Start temporary; Using join buffer
+1 PRIMARY t0 ALL NULL NULL NULL NULL 10 Start temporary
+1 PRIMARY t1 index NULL a 5 NULL 10 Using index; Using join buffer
 1 PRIMARY t2 ref a a 5 test.t1.a 1 Using index
 1 PRIMARY t3 ref a a 5 test.t1.a 1 Using where; Using index; End temporary
 drop table t0, t1,t2,t3;
@@ -421,20 +421,23 @@
 where t0.a in
 ( select t1.a from t1,t2 where t2.a=t0.a and t1.b=t2.b);
 id select_type table type possible_keys key key_len ref rows filtered Extra
-1 PRIMARY t0 ALL NULL NULL NULL NULL 5 100.00
-1 PRIMARY t1 ref a a 5 test.t0.a 1 100.00 Start temporary; Using join buffer
+1 PRIMARY t0 ALL NULL NULL NULL NULL 5 100.00 Start temporary
+1 PRIMARY t1 ref a a 5 test.t0.a 1 100.00 Using join buffer
 1 PRIMARY t2 eq_ref PRIMARY PRIMARY 4 test.t0.a 1 100.00 Using where; End temporary; Using join buffer
 Warnings:
 Note 1276 Field or reference 'test.t0.a' of SELECT #2 was resolved in SELECT #1
 Note 1003 select `test`.`t0`.`a` AS `a` from `test`.`t2` semi join (`test`.`t1`) join `test`.`t0` where ((`test`.`t2`.`b` = `test`.`t1`.`b`) and (`test`.`t1`.`a` = `test`.`t0`.`a`) and (`test`.`t2`.`a` = `test`.`t0`.`a`))
 update t1 set a=3, b=11 where a=4;
 update t2 set b=11 where a=3;
-
+# Not anymore:
 # The following query gives wrong result due to Bug#49129
 select * from t0 where t0.a in
 (select t1.a from t1, t2 where t2.a=t0.a and t1.b=t2.b);
 a
 0
+1
+2
+3
 drop table t0, t1, t2;
 CREATE TABLE t1 (
 id int(11) NOT NULL,
@@ -713,9 +716,9 @@
 c1 in (select convert(c6,char(1)) from t2);
 id select_type table type possible_keys key key_len ref rows Extra
 1 PRIMARY t2 ALL NULL NULL NULL NULL 1 Using where
-1 PRIMARY t2 ALL NULL NULL NULL NULL 1
-1 PRIMARY t2 ALL NULL NULL NULL NULL 1 Using where
-1 PRIMARY t3 ALL NULL NULL NULL NULL 2 FirstMatch(t2)
+1 PRIMARY t2 ALL NULL NULL NULL NULL 1 Using join buffer
+1 PRIMARY t2 ALL NULL NULL NULL NULL 1 Using where; Using join buffer
+1 PRIMARY t3 ALL NULL NULL NULL NULL 2 FirstMatch(t2); Using join buffer
 drop table t2, t3;
 set join_cache_level=default;
 show variables like 'join_cache_level';

=== modified file 'mysql-test/r/subselect_sj_jcl6.result'
--- a/mysql-test/r/subselect_sj_jcl6.result 2010-02-21 07:53:12 +0000
+++ b/mysql-test/r/subselect_sj_jcl6.result 2010-03-07 15:41:45 +0000
@@ -374,8 +374,8 @@
 (SELECT PNUM FROM PROJ));
 id select_type table type possible_keys key key_len ref rows Extra
 1 PRIMARY STAFF ALL NULL NULL NULL NULL 5
-1 PRIMARY PROJ ALL NULL NULL NULL NULL 6
-1 PRIMARY WORKS ALL NULL NULL NULL NULL 12 Using where; FirstMatch(STAFF)
+1 PRIMARY PROJ ALL NULL NULL NULL NULL 6 Using join buffer
+1 PRIMARY WORKS ALL NULL NULL NULL NULL 12 Using where; FirstMatch(STAFF); Using join buffer
 SELECT EMPNUM, EMPNAME
 FROM STAFF
 WHERE EMPNUM IN
@@ -828,6 +828,84 @@
 3
 2
 drop table t1, t2, t3;
+#
+# Bug#49198 Wrong result for second call of procedure
+# with view in subselect.
+# +CREATE TABLE t1 (t1field integer, primary key (t1field)); +CREATE TABLE t2 (t2field integer, primary key (t2field)); +CREATE TABLE t3 (t3field integer, primary key (t3field)); +CREATE VIEW v2 AS SELECT * FROM t2; +CREATE VIEW v3 AS SELECT * FROM t3; +INSERT INTO t1 VALUES(1),(2); +INSERT INTO t2 VALUES(1),(2); +INSERT INTO t3 VALUES(1),(2); +PREPARE stmt FROM +" +SELECT t1field +FROM t1 +WHERE t1field IN (SELECT * FROM v2); +"; +EXECUTE stmt; +t1field +1 +2 +EXECUTE stmt; +t1field +1 +2 +PREPARE stmt FROM +" +EXPLAIN +SELECT t1field +FROM t1 +WHERE t1field IN (SELECT * FROM v2) + AND t1field IN (SELECT * FROM v3) +"; +EXECUTE stmt; +id select_type table type possible_keys key key_len ref rows Extra +1 PRIMARY t1 index PRIMARY PRIMARY 4 NULL 2 Using index +1 PRIMARY t2 eq_ref PRIMARY PRIMARY 4 test.t1.t1field 1 Using index +1 PRIMARY t3 eq_ref PRIMARY PRIMARY 4 test.t1.t1field 1 Using index +EXECUTE stmt; +id select_type table type possible_keys key key_len ref rows Extra +1 SIMPLE t1 index PRIMARY PRIMARY 4 NULL 2 Using index +1 SIMPLE t2 eq_ref PRIMARY PRIMARY 4 test.t1.t1field 1 Using index +1 SIMPLE t3 eq_ref PRIMARY PRIMARY 4 test.t1.t1field 1 Using index +DROP TABLE t1, t2, t3; +DROP VIEW v2, v3; +# End of Bug#49198 +# +# BUG#49129: Wrong result with IN-subquery with join_cache_level=6 and firstmatch=off +# +CREATE TABLE t0 (a INT); +INSERT INTO t0 VALUES (0),(1),(2),(3),(4); +CREATE TABLE t1 (a INT, b INT, KEY(a)); +INSERT INTO t1 SELECT a, a from t0; +CREATE TABLE t2 (a INT, b INT, PRIMARY KEY(a)); +INSERT INTO t2 SELECT * FROM t1; +UPDATE t1 SET a=3, b=11 WHERE a=4; +UPDATE t2 SET b=11 WHERE a=3; +set @save_optimizer_switch=@@optimizer_switch; +set optimizer_switch='firstmatch=off'; +The following should use a join order of t0,t1,t2, with DuplicateElimination: +explain +SELECT * FROM t0 WHERE t0.a IN +(SELECT t1.a FROM t1, t2 WHERE t2.a=t0.a AND t1.b=t2.b); +id select_type table type possible_keys key key_len ref rows Extra +1 PRIMARY t0 ALL NULL NULL 
NULL NULL 5 Start temporary +1 PRIMARY t1 ref a a 5 test.t0.a 1 Using join buffer +1 PRIMARY t2 eq_ref PRIMARY PRIMARY 4 test.t0.a 1 Using where; End temporary; Using join buffer +SELECT * FROM t0 WHERE t0.a IN +(SELECT t1.a FROM t1, t2 WHERE t2.a=t0.a AND t1.b=t2.b); +a +0 +1 +2 +3 +set optimizer_switch=@save_optimizer_switch; +drop table t0, t1, t2; +# End set join_cache_level=default; show variables like 'join_cache_level'; Variable_name Value === modified file 'mysql-test/suite/pbxt/r/group_min_max.result' --- a/mysql-test/suite/pbxt/r/group_min_max.result 2009-08-17 15:57:58 +0000 +++ b/mysql-test/suite/pbxt/r/group_min_max.result 2010-02-23 09:22:02 +0000 @@ -2257,7 +2257,7 @@ a IN (SELECT max(b) FROM t1 GROUP BY a HAVING a < 2); id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY t1_outer index NULL a 10 NULL 15 Using where; Using index -2 DEPENDENT SUBQUERY t1 index NULL a 10 NULL 1 Using index +2 SUBQUERY t1 index NULL a 10 NULL 15 Using index EXPLAIN SELECT 1 FROM t1 AS t1_outer GROUP BY a HAVING a > (SELECT max(b) FROM t1 GROUP BY a HAVING a < 2); id select_type table type possible_keys key key_len ref rows Extra === modified file 'mysql-test/suite/pbxt/r/subselect.result' --- a/mysql-test/suite/pbxt/r/subselect.result 2009-12-16 09:28:51 +0000 +++ b/mysql-test/suite/pbxt/r/subselect.result 2010-02-23 09:22:02 +0000 @@ -1293,31 +1293,31 @@ 4 explain extended select * from t2 where t2.a in (select a from t1); id select_type table type possible_keys key key_len ref rows filtered Extra -1 PRIMARY t2 index NULL PRIMARY 4 NULL 4 100.00 Using where; Using index -2 DEPENDENT SUBQUERY t1 unique_subquery PRIMARY PRIMARY 4 func 1 100.00 Using index +1 PRIMARY t2 index PRIMARY PRIMARY 4 NULL 4 100.00 Using index +1 PRIMARY t1 index PRIMARY PRIMARY 4 NULL 4 75.00 Using where; Using index; Using join buffer Warnings: -Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t2` where 
<in_optimizer>(`test`.`t2`.`a`,<exists>(<primary_index_lookup>(<cache>(`test`.`t2`.`a`) in t1 on PRIMARY))) +Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t1` join `test`.`t2` where (`test`.`t1`.`a` = `test`.`t2`.`a`) select * from t2 where t2.a in (select a from t1 where t1.b <> 30); a 2 4 explain extended select * from t2 where t2.a in (select a from t1 where t1.b <> 30); id select_type table type possible_keys key key_len ref rows filtered Extra -1 PRIMARY t2 index NULL PRIMARY 4 NULL 4 100.00 Using where; Using index -2 DEPENDENT SUBQUERY t1 unique_subquery PRIMARY PRIMARY 4 func 1 100.00 Using where +1 PRIMARY t2 index PRIMARY PRIMARY 4 NULL 4 100.00 Using index +1 PRIMARY t1 ALL PRIMARY NULL NULL NULL 4 75.00 Using where; Using join buffer Warnings: -Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t2` where <in_optimizer>(`test`.`t2`.`a`,<exists>(<primary_index_lookup>(<cache>(`test`.`t2`.`a`) in t1 on PRIMARY where ((`test`.`t1`.`b` <> 30) and (<cache>(`test`.`t2`.`a`) = `test`.`t1`.`a`))))) +Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t1` join `test`.`t2` where ((`test`.`t1`.`a` = `test`.`t2`.`a`) and (`test`.`t1`.`b` <> 30)) select * from t2 where t2.a in (select t1.a from t1,t3 where t1.b=t3.a); a 2 3 explain extended select * from t2 where t2.a in (select t1.a from t1,t3 where t1.b=t3.a); id select_type table type possible_keys key key_len ref rows filtered Extra -1 PRIMARY t2 index NULL PRIMARY 4 NULL 4 100.00 Using where; Using index -2 DEPENDENT SUBQUERY t1 eq_ref PRIMARY PRIMARY 4 func 1 100.00 -2 DEPENDENT SUBQUERY t3 eq_ref PRIMARY PRIMARY 4 test.t1.b 1 100.00 Using index +1 PRIMARY t2 index PRIMARY PRIMARY 4 NULL 4 100.00 Using index +1 PRIMARY t1 ALL PRIMARY NULL NULL NULL 4 75.00 Using where; Using join buffer +1 PRIMARY t3 eq_ref PRIMARY PRIMARY 4 test.t1.b 1 100.00 Using index Warnings: -Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t2` where <in_optimizer>(`test`.`t2`.`a`,<exists>(select 1 AS `Not_used` from 
`test`.`t1` join `test`.`t3` where ((`test`.`t3`.`a` = `test`.`t1`.`b`) and (<cache>(`test`.`t2`.`a`) = `test`.`t1`.`a`)))) +Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t1` join `test`.`t3` join `test`.`t2` where ((`test`.`t1`.`a` = `test`.`t2`.`a`) and (`test`.`t3`.`a` = `test`.`t1`.`b`)) drop table t1, t2, t3; create table t1 (a int, b int, index a (a,b)); create table t2 (a int, index a (a)); @@ -1332,31 +1332,31 @@ 4 explain extended select * from t2 where t2.a in (select a from t1); id select_type table type possible_keys key key_len ref rows filtered Extra -1 PRIMARY t2 index NULL a 5 NULL 4 100.00 Using where; Using index -2 DEPENDENT SUBQUERY t1 index_subquery a a 5 func 1 100.00 Using index +1 PRIMARY t2 index a a 5 NULL 4 100.00 Using index +1 PRIMARY t1 ref a a 5 test.t2.a 1 100.00 Using index; FirstMatch(t2) Warnings: -Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t2` where <in_optimizer>(`test`.`t2`.`a`,<exists>(<index_lookup>(<cache>(`test`.`t2`.`a`) in t1 on a))) +Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t2` semi join (`test`.`t1`) where (`test`.`t1`.`a` = `test`.`t2`.`a`) select * from t2 where t2.a in (select a from t1 where t1.b <> 30); a 2 4 explain extended select * from t2 where t2.a in (select a from t1 where t1.b <> 30); id select_type table type possible_keys key key_len ref rows filtered Extra -1 PRIMARY t2 index NULL a 5 NULL 4 100.00 Using where; Using index -2 DEPENDENT SUBQUERY t1 index_subquery a a 5 func 1 100.00 Using index; Using where +1 PRIMARY t2 index a a 5 NULL 4 100.00 Using index +1 PRIMARY t1 ref a a 5 test.t2.a 1 100.00 Using where; Using index; FirstMatch(t2) Warnings: -Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t2` where <in_optimizer>(`test`.`t2`.`a`,<exists>(<index_lookup>(<cache>(`test`.`t2`.`a`) in t1 on a where ((`test`.`t1`.`b` <> 30) and (<cache>(`test`.`t2`.`a`) = `test`.`t1`.`a`))))) +Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t2` semi join (`test`.`t1`) where 
((`test`.`t1`.`a` = `test`.`t2`.`a`) and (`test`.`t1`.`b` <> 30)) select * from t2 where t2.a in (select t1.a from t1,t3 where t1.b=t3.a); a 2 3 explain extended select * from t2 where t2.a in (select t1.a from t1,t3 where t1.b=t3.a); id select_type table type possible_keys key key_len ref rows filtered Extra -1 PRIMARY t2 index NULL a 5 NULL 4 100.00 Using where; Using index -2 DEPENDENT SUBQUERY t1 ref a a 5 func 1 100.00 Using index -2 DEPENDENT SUBQUERY t3 ref a a 5 test.t1.b 1 100.00 Using index +1 PRIMARY t2 index a a 5 NULL 4 100.00 Using index +1 PRIMARY t1 ref a a 5 test.t2.a 1 100.00 Using index +1 PRIMARY t3 ref a a 5 test.t1.b 1 100.00 Using index; FirstMatch(t2) Warnings: -Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t2` where <in_optimizer>(`test`.`t2`.`a`,<exists>(select 1 AS `Not_used` from `test`.`t1` join `test`.`t3` where ((`test`.`t3`.`a` = `test`.`t1`.`b`) and (<cache>(`test`.`t2`.`a`) = `test`.`t1`.`a`)))) +Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t2` semi join (`test`.`t1` join `test`.`t3`) where ((`test`.`t1`.`a` = `test`.`t2`.`a`) and (`test`.`t3`.`a` = `test`.`t1`.`b`)) insert into t1 values (3,31); select * from t2 where t2.a in (select a from t1 where t1.b <> 30); a @@ -1369,10 +1369,10 @@ 4 explain extended select * from t2 where t2.a in (select a from t1 where t1.b <> 30); id select_type table type possible_keys key key_len ref rows filtered Extra -1 PRIMARY t2 index NULL a 5 NULL 4 100.00 Using where; Using index -2 DEPENDENT SUBQUERY t1 index_subquery a a 5 func 1 100.00 Using index; Using where +1 PRIMARY t2 index a a 5 NULL 4 100.00 Using index +1 PRIMARY t1 ref a a 5 test.t2.a 1 100.00 Using where; Using index; FirstMatch(t2) Warnings: -Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t2` where <in_optimizer>(`test`.`t2`.`a`,<exists>(<index_lookup>(<cache>(`test`.`t2`.`a`) in t1 on a where ((`test`.`t1`.`b` <> 30) and (<cache>(`test`.`t2`.`a`) = `test`.`t1`.`a`))))) +Note 1003 select `test`.`t2`.`a` AS `a` 
from `test`.`t2` semi join (`test`.`t1`) where ((`test`.`t1`.`a` = `test`.`t2`.`a`) and (`test`.`t1`.`b` <> 30)) drop table t1, t2, t3; create table t1 (a int, b int); create table t2 (a int, b int); @@ -2823,10 +2823,10 @@ Note 1003 select `test`.`t1`.`one` AS `one`,`test`.`t1`.`two` AS `two`,<in_optimizer>((`test`.`t1`.`one`,`test`.`t1`.`two`),<exists>(select `test`.`t2`.`one` AS `one`,`test`.`t2`.`two` AS `two` from `test`.`t2` where ((`test`.`t2`.`flag` = '0') and trigcond(((<cache>(`test`.`t1`.`one`) = `test`.`t2`.`one`) or isnull(`test`.`t2`.`one`))) and trigcond(((<cache>(`test`.`t1`.`two`) = `test`.`t2`.`two`) or isnull(`test`.`t2`.`two`)))) having (trigcond(<is_not_null_test>(`test`.`t2`.`one`)) and trigcond(<is_not_null_test>(`test`.`t2`.`two`))))) AS `test` from `test`.`t1` explain extended SELECT one,two from t1 where ROW(one,two) IN (SELECT one,two FROM t2 WHERE flag = 'N'); id select_type table type possible_keys key key_len ref rows filtered Extra -1 PRIMARY t1 ALL NULL NULL NULL NULL 8 100.00 Using where -2 DEPENDENT SUBQUERY t2 ALL NULL NULL NULL NULL 9 100.00 Using where +1 PRIMARY t1 ALL NULL NULL NULL NULL 8 100.00 +1 PRIMARY t2 ALL NULL NULL NULL NULL 9 100.00 Using where; FirstMatch(t1) Warnings: -Note 1003 select `test`.`t1`.`one` AS `one`,`test`.`t1`.`two` AS `two` from `test`.`t1` where <in_optimizer>((`test`.`t1`.`one`,`test`.`t1`.`two`),<exists>(select `test`.`t2`.`one` AS `one`,`test`.`t2`.`two` AS `two` from `test`.`t2` where ((`test`.`t2`.`flag` = 'N') and (<cache>(`test`.`t1`.`one`) = `test`.`t2`.`one`) and (<cache>(`test`.`t1`.`two`) = `test`.`t2`.`two`)))) +Note 1003 select `test`.`t1`.`one` AS `one`,`test`.`t1`.`two` AS `two` from `test`.`t1` semi join (`test`.`t2`) where ((`test`.`t2`.`two` = `test`.`t1`.`two`) and (`test`.`t2`.`one` = `test`.`t1`.`one`) and (`test`.`t2`.`flag` = 'N')) explain extended SELECT one,two,ROW(one,two) IN (SELECT one,two FROM t2 WHERE flag = '0' group by one,two) as 'test' from t1; id select_type table 
type possible_keys key key_len ref rows filtered Extra 1 PRIMARY t1 ALL NULL NULL NULL NULL 8 100.00 @@ -3412,7 +3412,7 @@ SELECT * FROM t1 WHERE (a,b) = ANY (SELECT a, max(b) FROM t1 GROUP BY a); id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY t1 ALL NULL NULL NULL NULL 9 Using where -2 DEPENDENT SUBQUERY t1 ALL NULL NULL NULL NULL 9 Using temporary; Using filesort +2 SUBQUERY t1 ALL NULL NULL NULL NULL 9 Using temporary; Using filesort ALTER TABLE t1 ADD INDEX(a); SELECT * FROM t1 WHERE (a,b) = ANY (SELECT a, max(b) FROM t1 GROUP BY a); a b @@ -3423,7 +3423,7 @@ SELECT * FROM t1 WHERE (a,b) = ANY (SELECT a, max(b) FROM t1 GROUP BY a); id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY t1 ALL NULL NULL NULL NULL 9 Using where -2 DEPENDENT SUBQUERY t1 index NULL a 8 NULL 1 Using filesort +2 SUBQUERY t1 ALL NULL NULL NULL NULL 9 Using temporary; Using filesort DROP TABLE t1; create table t1( f1 int,f2 int); insert into t1 values (1,1),(2,2); @@ -4213,8 +4213,8 @@ CREATE INDEX I2 ON t1 (b); EXPLAIN SELECT a,b FROM t1 WHERE b IN (SELECT a FROM t1); id select_type table type possible_keys key key_len ref rows Extra -1 PRIMARY t1 ALL NULL NULL NULL NULL 2 Using where -2 DEPENDENT SUBQUERY t1 index_subquery I1 I1 2 func 1 Using index; Using where +1 PRIMARY t1 index I1 I1 2 NULL 2 Using index; LooseScan +1 PRIMARY t1 ref I2 I2 13 test.t1.a 1 Using where SELECT a,b FROM t1 WHERE b IN (SELECT a FROM t1); a b CREATE TABLE t2 (a VARCHAR(1), b VARCHAR(10)); @@ -4223,15 +4223,15 @@ CREATE INDEX I2 ON t2 (b); EXPLAIN SELECT a,b FROM t2 WHERE b IN (SELECT a FROM t2); id select_type table type possible_keys key key_len ref rows Extra -1 PRIMARY t2 ALL NULL NULL NULL NULL 2 Using where -2 DEPENDENT SUBQUERY t2 index_subquery I1 I1 4 func 1 Using index; Using where +1 PRIMARY t2 index I1 I1 4 NULL 2 Using index; LooseScan +1 PRIMARY t2 ref I2 I2 13 test.t2.a 1 Using where SELECT a,b FROM t2 WHERE b IN (SELECT a FROM t2); a b 
EXPLAIN SELECT a,b FROM t1 WHERE b IN (SELECT a FROM t1 WHERE LENGTH(a)<500); id select_type table type possible_keys key key_len ref rows Extra -1 PRIMARY t1 ALL NULL NULL NULL NULL 2 Using where -2 DEPENDENT SUBQUERY t1 index_subquery I1 I1 2 func 1 Using index; Using where +1 PRIMARY t1 index I1 I1 2 NULL 2 Using where; Using index; LooseScan +1 PRIMARY t1 ref I2 I2 13 test.t1.a 1 Using where SELECT a,b FROM t1 WHERE b IN (SELECT a FROM t1 WHERE LENGTH(a)<500); a b DROP TABLE t1,t2; === modified file 'mysql-test/t/join_cache.test' --- a/mysql-test/t/join_cache.test 2009-12-21 02:26:15 +0000 +++ b/mysql-test/t/join_cache.test 2010-03-06 19:14:55 +0000 @@ -1823,3 +1823,27 @@ set join_cache_level=default; DROP TABLE t1,t2; + +--echo # +--echo # Bug #51092: linked join buffer is used for a 3-way cross join query +--echo # that selects only records of the first table +--echo # + +create table t1 (a int, b int); +insert into t1 values (1,1),(2,2); +create table t2 (a int, b int); +insert into t2 values (1,1),(2,2); +create table t3 (a int, b int); +insert into t3 values (1,1),(2,2); + +explain select t1.* from t1,t2,t3; +select t1.* from t1,t2,t3; + +set join_cache_level=2; + +explain select t1.* from t1,t2,t3; +select t1.* from t1,t2,t3; + +set join_cache_level=default; + +drop table t1,t2,t3; === modified file 'mysql-test/t/subselect_sj.test' --- a/mysql-test/t/subselect_sj.test 2010-02-21 07:53:12 +0000 +++ b/mysql-test/t/subselect_sj.test 2010-02-24 11:33:42 +0000 @@ -728,3 +728,45 @@ drop table t1, t2, t3; +--echo # +--echo # Bug#49198 Wrong result for second call of procedure +--echo # with view in subselect. 
+--echo # + +CREATE TABLE t1 (t1field integer, primary key (t1field)); +CREATE TABLE t2 (t2field integer, primary key (t2field)); +CREATE TABLE t3 (t3field integer, primary key (t3field)); + +CREATE VIEW v2 AS SELECT * FROM t2; +CREATE VIEW v3 AS SELECT * FROM t3; + +INSERT INTO t1 VALUES(1),(2); +INSERT INTO t2 VALUES(1),(2); +INSERT INTO t3 VALUES(1),(2); + +PREPARE stmt FROM +" +SELECT t1field +FROM t1 +WHERE t1field IN (SELECT * FROM v2); +"; + +EXECUTE stmt; +EXECUTE stmt; + +PREPARE stmt FROM +" +EXPLAIN +SELECT t1field +FROM t1 +WHERE t1field IN (SELECT * FROM v2) + AND t1field IN (SELECT * FROM v3) +"; + +EXECUTE stmt; +EXECUTE stmt; + +DROP TABLE t1, t2, t3; +DROP VIEW v2, v3; + +--echo # End of Bug#49198 === modified file 'mysql-test/t/subselect_sj2.test' --- a/mysql-test/t/subselect_sj2.test 2010-01-17 14:51:10 +0000 +++ b/mysql-test/t/subselect_sj2.test 2010-03-07 15:41:45 +0000 @@ -583,7 +583,7 @@ if (`select @@join_cache_level=6`) { - --echo + --echo # Not anymore: --echo # The following query gives wrong result due to Bug#49129 } select * from t0 where t0.a in === modified file 'mysql-test/t/subselect_sj_jcl6.test' --- a/mysql-test/t/subselect_sj_jcl6.test 2010-01-17 14:51:10 +0000 +++ b/mysql-test/t/subselect_sj_jcl6.test 2010-03-07 15:41:45 +0000 @@ -7,5 +7,33 @@ --source t/subselect_sj.test +--echo # +--echo # BUG#49129: Wrong result with IN-subquery with join_cache_level=6 and firstmatch=off +--echo # +CREATE TABLE t0 (a INT); +INSERT INTO t0 VALUES (0),(1),(2),(3),(4); +CREATE TABLE t1 (a INT, b INT, KEY(a)); +INSERT INTO t1 SELECT a, a from t0; +CREATE TABLE t2 (a INT, b INT, PRIMARY KEY(a)); +INSERT INTO t2 SELECT * FROM t1; +UPDATE t1 SET a=3, b=11 WHERE a=4; +UPDATE t2 SET b=11 WHERE a=3; + +set @save_optimizer_switch=@@optimizer_switch; +set optimizer_switch='firstmatch=off'; + +--echo The following should use a join order of t0,t1,t2, with DuplicateElimination: +explain +SELECT * FROM t0 WHERE t0.a IN + (SELECT t1.a FROM t1, t2 WHERE 
t2.a=t0.a AND t1.b=t2.b); + +SELECT * FROM t0 WHERE t0.a IN + (SELECT t1.a FROM t1, t2 WHERE t2.a=t0.a AND t1.b=t2.b); + +set optimizer_switch=@save_optimizer_switch; +drop table t0, t1, t2; + +--echo # End + set join_cache_level=default; show variables like 'join_cache_level'; === modified file 'sql/item.cc' --- a/sql/item.cc 2010-02-21 06:32:23 +0000 +++ b/sql/item.cc 2010-02-24 11:33:42 +0000 @@ -6491,11 +6491,9 @@ void Item_ref::fix_after_pullout(st_select_lex *new_parent, Item **refptr) { + (*ref)->fix_after_pullout(new_parent, ref); if (depended_from == new_parent) - { - (*ref)->fix_after_pullout(new_parent, ref); depended_from= NULL; - } } === modified file 'sql/opt_subselect.cc' --- a/sql/opt_subselect.cc 2010-02-19 21:55:57 +0000 +++ b/sql/opt_subselect.cc 2010-03-09 10:36:15 +0000 @@ -531,7 +531,6 @@ *expr= new_cond; if (do_fix_fields) new_cond->fix_fields(join->thd, expr); - join->select_lex->where= *expr; return FALSE; } @@ -3031,10 +3030,24 @@ forwards, but do not destroy other duplicate elimination methods. */ uint first_table= i; + uint join_cache_level= join->thd->variables.join_cache_level; for (uint j= i; j < i + pos->n_sj_tables; j++) { - if (join->best_positions[j].use_join_buffer && j <= no_jbuf_after) + /* + When we'll properly take join buffering into account during + join optimization, the below check should be changed to + "if (join->best_positions[j].use_join_buffer && + j <= no_jbuf_after)". 
+ For now, use a rough criteria: + */ + JOIN_TAB *js_tab=join->join_tab + j; + if (j != join->const_tables && js_tab->use_quick != 2 && + j <= no_jbuf_after && + ((js_tab->type == JT_ALL && join_cache_level != 0) || + (join_cache_level > 4 && (tab->type == JT_REF || + tab->type == JT_EQ_REF)))) { + /* Looks like we'll be using join buffer */ first_table= join->const_tables; break; } @@ -3112,7 +3125,12 @@ JOIN_TAB *j, *jump_to= tab-1; for (j= tab; j != tab + pos->n_sj_tables; j++) { - if (!tab->emb_sj_nest) + /* + NOTE: this loop probably doesn't do the right thing for the case + where FirstMatch's duplicate-generating range is interleaved with + "unrelated" tables (as specified in WL#3750, section 2.2). + */ + if (!j->emb_sj_nest) jump_to= tab; else { === modified file 'sql/sql_join_cache.cc' --- a/sql/sql_join_cache.cc 2010-02-15 21:53:06 +0000 +++ b/sql/sql_join_cache.cc 2010-03-07 15:41:45 +0000 @@ -31,6 +31,8 @@ #include "sql_select.h" #include "opt_subselect.h" +#define NO_MORE_RECORDS_IN_BUFFER (uint)(-1) + /***************************************************************************** * Join cache module @@ -407,8 +409,10 @@ However at this moment we don't know whether we have referenced fields for the cache or not. Later when a referenced field is registered for the cache we adjust the value of the flag 'with_length'. - */ - with_length= is_key_access() || with_match_flag; + */ + with_length= is_key_access() || + join_tab->is_inner_table_of_semi_join_with_first_match() || + join_tab->is_inner_table_of_outer_join(); /* At this moment we don't know yet the value of 'referenced_fields', but in any case it can't be greater than the value of 'fields'. 
@@ -604,7 +608,12 @@ copy_end= cache->field_descr+cache->fields; for (copy= cache->field_descr+cache->flag_fields; copy < copy_end; copy++) { - if (copy->field->table == tab->table && + /* + (1) - when we store rowids for DuplicateWeedout, they have + copy->field==NULL + */ + if (copy->field && // (1) + copy->field->table == tab->table && bitmap_is_set(key_read_set, copy->field->field_index)) { *copy_ptr++= copy; @@ -1235,7 +1244,7 @@ prev_rec_ptr= prev_cache->get_rec_ref(pos); } curr_rec_pos= pos; - if (!(res= read_all_record_fields() == 0)) + if (!(res= read_all_record_fields() == NO_MORE_RECORDS_IN_BUFFER)) { pos+= referenced_fields*size_of_fld_ofs; if (prev_cache) @@ -1304,7 +1313,7 @@ uchar *prev_rec_ptr= prev_cache->get_rec_ref(rec_ptr); return prev_cache->get_match_flag_by_pos(prev_rec_ptr); } - DBUG_ASSERT(1); + DBUG_ASSERT(0); return FALSE; } @@ -1324,7 +1333,8 @@ read data. RETURN - length of the data read from the join buffer + (-1) - if there is no more records in the join buffer + length of the data read from the join buffer - otherwise */ uint JOIN_CACHE::read_all_record_fields() @@ -1332,7 +1342,7 @@ uchar *init_pos= pos; if (pos > last_rec_pos || !records) - return 0; + return NO_MORE_RECORDS_IN_BUFFER; /* First match flag, read null bitmaps and null_row flag for each table */ read_flag_fields(); @@ -1538,12 +1548,12 @@ bool JOIN_CACHE::skip_record_if_match() { - DBUG_ASSERT(with_match_flag && with_length); + DBUG_ASSERT(with_length); uint offset= size_of_rec_len; if (prev_cache) offset+= prev_cache->get_size_of_rec_offset(); /* Check whether the match flag is on */ - if (test(*(pos+offset))) + if (get_match_flag_by_pos(pos+offset)) { pos+= size_of_rec_len + get_rec_length(pos); return TRUE; === modified file 'sql/sql_select.cc' --- a/sql/sql_select.cc 2010-02-19 21:55:57 +0000 +++ b/sql/sql_select.cc 2010-03-09 10:36:15 +0000 @@ -5635,7 +5635,11 @@ uint blob_length=(uint) (join_tab->table->file->stats.mean_rec_length- 
(join_tab->table->s->reclength-rec_length)); rec_length+=(uint) max(4,blob_length); - } + } + /* + psergey-todo: why we don't count here rowid that we might need to store + when using DuplicateElimination? + */ join_tab->used_fields=fields; join_tab->used_fieldlength=rec_length; join_tab->used_blobs=blobs; @@ -6355,10 +6359,17 @@ } if (!tab->first_inner) tab->first_inner= nested_join->first_nested; + if (tab->table->reginfo.not_exists_optimize) + tab->first_inner->table->reginfo.not_exists_optimize= 1; if (++nested_join->counter < nested_join->n_tables) break; /* Table tab is the last inner table for nested join. */ nested_join->first_nested->last_inner= tab; + if (tab->first_inner->table->reginfo.not_exists_optimize) + { + for (JOIN_TAB *join_tab= tab->first_inner; join_tab <= tab; join_tab++) + join_tab->table->reginfo.not_exists_optimize= 1; + } } } DBUG_VOID_RETURN; @@ -7112,18 +7123,14 @@ if (tab->use_quick == 2) goto no_join_cache; /* - Use join cache with FirstMatch semi-join strategy only when semi-join - contains only one table. 
- */ - if (tab->is_inner_table_of_semi_join_with_first_match() && - !tab->is_single_inner_of_semi_join_with_first_match()) - goto no_join_cache; - /* Non-linked join buffers can't guarantee one match */ - if (force_unlinked_cache && - (tab->is_inner_table_of_outer_join() && - !tab->is_single_inner_of_outer_join())) + if (force_unlinked_cache && + (!tab->type == JT_ALL || cache_level <= 4) && + ((tab->is_inner_table_of_semi_join_with_first_match() && + !tab->is_single_inner_of_semi_join_with_first_match()) || + (tab->is_inner_table_of_outer_join() && + !tab->is_single_inner_of_outer_join()))) goto no_join_cache; /* === modified file 'sql/sql_select.h' --- a/sql/sql_select.h 2010-02-15 21:53:06 +0000 +++ b/sql/sql_select.h 2010-03-05 18:54:48 +0000 @@ -321,8 +321,8 @@ } bool check_only_first_match() { - return last_sj_inner_tab == this || - (first_inner && first_inner->last_inner == this && + return is_inner_table_of_semi_join_with_first_match() || + (is_inner_table_of_outer_join() && table->reginfo.not_exists_optimize); } bool is_last_inner_table()
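Note on the sql_join_cache.cc hunks above: they change `JOIN_CACHE::read_all_record_fields()` to return `NO_MORE_RECORDS_IN_BUFFER` (that is, `(uint)(-1)`) instead of `0` when the buffer is exhausted, and the caller now compares against that sentinel. A plausible reading, given the Bug #51092 test case where only columns of the first table are selected, is that a record may legitimately have a data length of 0, so `0` cannot also mean "end of buffer". The following is a minimal standalone sketch of that idea, not the server code: `ToyJoinBuffer`, `count_records` and `count_records_zero_as_end` are hypothetical names invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sentinel, mirroring the macro added in the diff.
static const unsigned int NO_MORE_RECORDS = (unsigned int)(-1);

// Toy stand-in for a join buffer (an assumption, not JOIN_CACHE):
// each entry is the payload length of one buffered record.
struct ToyJoinBuffer {
  std::vector<unsigned int> record_lengths;
  std::size_t pos = 0;

  // Returns the payload length of the next record,
  // or NO_MORE_RECORDS once the buffer is exhausted.
  unsigned int read_next() {
    if (pos >= record_lengths.size())
      return NO_MORE_RECORDS;
    return record_lengths[pos++];
  }
};

// Sentinel-based loop: zero-length records are still counted.
unsigned int count_records(ToyJoinBuffer &buf) {
  unsigned int n = 0;
  while (buf.read_next() != NO_MORE_RECORDS)
    n++;
  return n;
}

// Loop in the style of the old check: a 0 return is taken as
// "end of buffer", so zero-length records stop the scan early.
unsigned int count_records_zero_as_end(ToyJoinBuffer &buf) {
  unsigned int n = 0;
  unsigned int len;
  while ((len = buf.read_next()) != 0 && len != NO_MORE_RECORDS)
    n++;
  return n;
}
```

With three zero-length records in the toy buffer, the sentinel-based loop visits all three, while the loop that treats `0` as the end stops before the first one.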


[Maria-developers] bzr commit into file:///home/tsk/mprog/src/5.3-mwl68-merge.base-mwl68/ branch (timour:2766)
by timour＠askmonty.org 09 Mar '10


#At file:///home/tsk/mprog/src/5.3-mwl68-merge.base-mwl68/ based on revid:timour@askmonty.org-20100309101406-xygkt2sgftvjvevg 2766 timour(a)askmonty.org 2010-03-09 [merge] MWL#68 Subquery optimization: Efficient NOT IN execution with NULLs Automerge with 5.3-subqueries modified: mysql-test/r/join_cache.result mysql-test/r/subselect_sj.result mysql-test/r/subselect_sj2.result mysql-test/r/subselect_sj2_jcl6.result mysql-test/r/subselect_sj_jcl6.result mysql-test/suite/pbxt/r/group_min_max.result mysql-test/suite/pbxt/r/subselect.result mysql-test/t/join_cache.test mysql-test/t/subselect_sj.test mysql-test/t/subselect_sj2.test mysql-test/t/subselect_sj_jcl6.test sql/item.cc sql/opt_subselect.cc sql/sql_join_cache.cc sql/sql_select.cc sql/sql_select.h === modified file 'mysql-test/r/join_cache.result' --- a/mysql-test/r/join_cache.result 2010-02-11 21:59:32 +0000 +++ b/mysql-test/r/join_cache.result 2010-03-06 19:14:55 +0000 @@ -4142,3 +4142,46 @@ c1 c2 c1 c2 LENGTH(t2.c1) LENGTH(t2.c2) 2 2 tt uu 2 2 set join_cache_level=default; DROP TABLE t1,t2; +# +# Bug #51092: linked join buffer is used for a 3-way cross join query +# that selects only records of the first table +# +create table t1 (a int, b int); +insert into t1 values (1,1),(2,2); +create table t2 (a int, b int); +insert into t2 values (1,1),(2,2); +create table t3 (a int, b int); +insert into t3 values (1,1),(2,2); +explain select t1.* from t1,t2,t3; +id select_type table type possible_keys key key_len ref rows Extra +1 SIMPLE t1 ALL NULL NULL NULL NULL 2 +1 SIMPLE t2 ALL NULL NULL NULL NULL 2 Using join buffer +1 SIMPLE t3 ALL NULL NULL NULL NULL 2 Using join buffer +select t1.* from t1,t2,t3; +a b +1 1 +2 2 +1 1 +2 2 +1 1 +2 2 +1 1 +2 2 +set join_cache_level=2; +explain select t1.* from t1,t2,t3; +id select_type table type possible_keys key key_len ref rows Extra +1 SIMPLE t1 ALL NULL NULL NULL NULL 2 +1 SIMPLE t2 ALL NULL NULL NULL NULL 2 Using join buffer +1 SIMPLE t3 ALL NULL NULL NULL NULL 2 Using join 
buffer +select t1.* from t1,t2,t3; +a b +1 1 +2 2 +1 1 +2 2 +1 1 +2 2 +1 1 +2 2 +set join_cache_level=default; +drop table t1,t2,t3; === modified file 'mysql-test/r/subselect_sj.result' --- a/mysql-test/r/subselect_sj.result 2010-02-21 07:53:12 +0000 +++ b/mysql-test/r/subselect_sj.result 2010-02-24 11:33:42 +0000 @@ -824,3 +824,50 @@ a 3 2 drop table t1, t2, t3; +# +# Bug#49198 Wrong result for second call of procedure +# with view in subselect. +# +CREATE TABLE t1 (t1field integer, primary key (t1field)); +CREATE TABLE t2 (t2field integer, primary key (t2field)); +CREATE TABLE t3 (t3field integer, primary key (t3field)); +CREATE VIEW v2 AS SELECT * FROM t2; +CREATE VIEW v3 AS SELECT * FROM t3; +INSERT INTO t1 VALUES(1),(2); +INSERT INTO t2 VALUES(1),(2); +INSERT INTO t3 VALUES(1),(2); +PREPARE stmt FROM +" +SELECT t1field +FROM t1 +WHERE t1field IN (SELECT * FROM v2); +"; +EXECUTE stmt; +t1field +1 +2 +EXECUTE stmt; +t1field +1 +2 +PREPARE stmt FROM +" +EXPLAIN +SELECT t1field +FROM t1 +WHERE t1field IN (SELECT * FROM v2) + AND t1field IN (SELECT * FROM v3) +"; +EXECUTE stmt; +id select_type table type possible_keys key key_len ref rows Extra +1 PRIMARY t1 index PRIMARY PRIMARY 4 NULL 2 Using index +1 PRIMARY t2 eq_ref PRIMARY PRIMARY 4 test.t1.t1field 1 Using index +1 PRIMARY t3 eq_ref PRIMARY PRIMARY 4 test.t1.t1field 1 Using index +EXECUTE stmt; +id select_type table type possible_keys key key_len ref rows Extra +1 SIMPLE t1 index PRIMARY PRIMARY 4 NULL 2 Using index +1 SIMPLE t2 eq_ref PRIMARY PRIMARY 4 test.t1.t1field 1 Using index +1 SIMPLE t3 eq_ref PRIMARY PRIMARY 4 test.t1.t1field 1 Using index +DROP TABLE t1, t2, t3; +DROP VIEW v2, v3; +# End of Bug#49198 === modified file 'mysql-test/r/subselect_sj2.result' --- a/mysql-test/r/subselect_sj2.result 2010-02-17 10:47:55 +0000 +++ b/mysql-test/r/subselect_sj2.result 2010-03-07 15:41:45 +0000 @@ -264,8 +264,8 @@ explain select * from t0 where a in (select t2.a+t3.a from t1 left join (t2 join t3) on t2.a=t1.a 
and t3.a=t1.a); id select_type table type possible_keys key key_len ref rows Extra -1 PRIMARY t0 ALL NULL NULL NULL NULL 10 -1 PRIMARY t1 index NULL a 5 NULL 10 Using index; Start temporary; Using join buffer +1 PRIMARY t0 ALL NULL NULL NULL NULL 10 Start temporary +1 PRIMARY t1 index NULL a 5 NULL 10 Using index; Using join buffer 1 PRIMARY t2 ref a a 5 test.t1.a 1 Using index 1 PRIMARY t3 ref a a 5 test.t1.a 1 Using where; Using index; End temporary drop table t0, t1,t2,t3; === modified file 'mysql-test/r/subselect_sj2_jcl6.result' --- a/mysql-test/r/subselect_sj2_jcl6.result 2010-02-17 10:47:55 +0000 +++ b/mysql-test/r/subselect_sj2_jcl6.result 2010-03-07 15:41:45 +0000 @@ -268,8 +268,8 @@ explain select * from t0 where a in (select t2.a+t3.a from t1 left join (t2 join t3) on t2.a=t1.a and t3.a=t1.a); id select_type table type possible_keys key key_len ref rows Extra -1 PRIMARY t0 ALL NULL NULL NULL NULL 10 -1 PRIMARY t1 index NULL a 5 NULL 10 Using index; Start temporary; Using join buffer +1 PRIMARY t0 ALL NULL NULL NULL NULL 10 Start temporary +1 PRIMARY t1 index NULL a 5 NULL 10 Using index; Using join buffer 1 PRIMARY t2 ref a a 5 test.t1.a 1 Using index 1 PRIMARY t3 ref a a 5 test.t1.a 1 Using where; Using index; End temporary drop table t0, t1,t2,t3; @@ -421,20 +421,23 @@ explain extended select * from t0 where t0.a in ( select t1.a from t1,t2 where t2.a=t0.a and t1.b=t2.b); id select_type table type possible_keys key key_len ref rows filtered Extra -1 PRIMARY t0 ALL NULL NULL NULL NULL 5 100.00 -1 PRIMARY t1 ref a a 5 test.t0.a 1 100.00 Start temporary; Using join buffer +1 PRIMARY t0 ALL NULL NULL NULL NULL 5 100.00 Start temporary +1 PRIMARY t1 ref a a 5 test.t0.a 1 100.00 Using join buffer 1 PRIMARY t2 eq_ref PRIMARY PRIMARY 4 test.t0.a 1 100.00 Using where; End temporary; Using join buffer Warnings: Note 1276 Field or reference 'test.t0.a' of SELECT #2 was resolved in SELECT #1 Note 1003 select `test`.`t0`.`a` AS `a` from `test`.`t2` semi join 
(`test`.`t1`) join `test`.`t0` where ((`test`.`t2`.`b` = `test`.`t1`.`b`) and (`test`.`t1`.`a` = `test`.`t0`.`a`) and (`test`.`t2`.`a` = `test`.`t0`.`a`)) update t1 set a=3, b=11 where a=4; update t2 set b=11 where a=3; - +# Not anymore: # The following query gives wrong result due to Bug#49129 select * from t0 where t0.a in (select t1.a from t1, t2 where t2.a=t0.a and t1.b=t2.b); a 0 +1 +2 +3 drop table t0, t1, t2; CREATE TABLE t1 ( id int(11) NOT NULL, @@ -713,9 +716,9 @@ c2 in (select 1 from t3, t2) and c1 in (select convert(c6,char(1)) from t2); id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY t2 ALL NULL NULL NULL NULL 1 Using where -1 PRIMARY t2 ALL NULL NULL NULL NULL 1 -1 PRIMARY t2 ALL NULL NULL NULL NULL 1 Using where -1 PRIMARY t3 ALL NULL NULL NULL NULL 2 FirstMatch(t2) +1 PRIMARY t2 ALL NULL NULL NULL NULL 1 Using join buffer +1 PRIMARY t2 ALL NULL NULL NULL NULL 1 Using where; Using join buffer +1 PRIMARY t3 ALL NULL NULL NULL NULL 2 FirstMatch(t2); Using join buffer drop table t2, t3; set join_cache_level=default; show variables like 'join_cache_level'; === modified file 'mysql-test/r/subselect_sj_jcl6.result' --- a/mysql-test/r/subselect_sj_jcl6.result 2010-02-21 07:53:12 +0000 +++ b/mysql-test/r/subselect_sj_jcl6.result 2010-03-07 15:41:45 +0000 @@ -374,8 +374,8 @@ WHERE PNUM IN (SELECT PNUM FROM PROJ)); id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY STAFF ALL NULL NULL NULL NULL 5 -1 PRIMARY PROJ ALL NULL NULL NULL NULL 6 -1 PRIMARY WORKS ALL NULL NULL NULL NULL 12 Using where; FirstMatch(STAFF) +1 PRIMARY PROJ ALL NULL NULL NULL NULL 6 Using join buffer +1 PRIMARY WORKS ALL NULL NULL NULL NULL 12 Using where; FirstMatch(STAFF); Using join buffer SELECT EMPNUM, EMPNAME FROM STAFF WHERE EMPNUM IN @@ -828,6 +828,84 @@ a 3 2 drop table t1, t2, t3; +# +# Bug#49198 Wrong result for second call of procedure +# with view in subselect. 
+# +CREATE TABLE t1 (t1field integer, primary key (t1field)); +CREATE TABLE t2 (t2field integer, primary key (t2field)); +CREATE TABLE t3 (t3field integer, primary key (t3field)); +CREATE VIEW v2 AS SELECT * FROM t2; +CREATE VIEW v3 AS SELECT * FROM t3; +INSERT INTO t1 VALUES(1),(2); +INSERT INTO t2 VALUES(1),(2); +INSERT INTO t3 VALUES(1),(2); +PREPARE stmt FROM +" +SELECT t1field +FROM t1 +WHERE t1field IN (SELECT * FROM v2); +"; +EXECUTE stmt; +t1field +1 +2 +EXECUTE stmt; +t1field +1 +2 +PREPARE stmt FROM +" +EXPLAIN +SELECT t1field +FROM t1 +WHERE t1field IN (SELECT * FROM v2) + AND t1field IN (SELECT * FROM v3) +"; +EXECUTE stmt; +id select_type table type possible_keys key key_len ref rows Extra +1 PRIMARY t1 index PRIMARY PRIMARY 4 NULL 2 Using index +1 PRIMARY t2 eq_ref PRIMARY PRIMARY 4 test.t1.t1field 1 Using index +1 PRIMARY t3 eq_ref PRIMARY PRIMARY 4 test.t1.t1field 1 Using index +EXECUTE stmt; +id select_type table type possible_keys key key_len ref rows Extra +1 SIMPLE t1 index PRIMARY PRIMARY 4 NULL 2 Using index +1 SIMPLE t2 eq_ref PRIMARY PRIMARY 4 test.t1.t1field 1 Using index +1 SIMPLE t3 eq_ref PRIMARY PRIMARY 4 test.t1.t1field 1 Using index +DROP TABLE t1, t2, t3; +DROP VIEW v2, v3; +# End of Bug#49198 +# +# BUG#49129: Wrong result with IN-subquery with join_cache_level=6 and firstmatch=off +# +CREATE TABLE t0 (a INT); +INSERT INTO t0 VALUES (0),(1),(2),(3),(4); +CREATE TABLE t1 (a INT, b INT, KEY(a)); +INSERT INTO t1 SELECT a, a from t0; +CREATE TABLE t2 (a INT, b INT, PRIMARY KEY(a)); +INSERT INTO t2 SELECT * FROM t1; +UPDATE t1 SET a=3, b=11 WHERE a=4; +UPDATE t2 SET b=11 WHERE a=3; +set @save_optimizer_switch=@@optimizer_switch; +set optimizer_switch='firstmatch=off'; +The following should use a join order of t0,t1,t2, with DuplicateElimination: +explain +SELECT * FROM t0 WHERE t0.a IN +(SELECT t1.a FROM t1, t2 WHERE t2.a=t0.a AND t1.b=t2.b); +id select_type table type possible_keys key key_len ref rows Extra +1 PRIMARY t0 ALL NULL NULL 
NULL NULL 5 Start temporary +1 PRIMARY t1 ref a a 5 test.t0.a 1 Using join buffer +1 PRIMARY t2 eq_ref PRIMARY PRIMARY 4 test.t0.a 1 Using where; End temporary; Using join buffer +SELECT * FROM t0 WHERE t0.a IN +(SELECT t1.a FROM t1, t2 WHERE t2.a=t0.a AND t1.b=t2.b); +a +0 +1 +2 +3 +set optimizer_switch=@save_optimizer_switch; +drop table t0, t1, t2; +# End set join_cache_level=default; show variables like 'join_cache_level'; Variable_name Value === modified file 'mysql-test/suite/pbxt/r/group_min_max.result' --- a/mysql-test/suite/pbxt/r/group_min_max.result 2009-08-17 15:57:58 +0000 +++ b/mysql-test/suite/pbxt/r/group_min_max.result 2010-02-23 09:22:02 +0000 @@ -2257,7 +2257,7 @@ EXPLAIN SELECT 1 FROM t1 AS t1_outer WHE a IN (SELECT max(b) FROM t1 GROUP BY a HAVING a < 2); id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY t1_outer index NULL a 10 NULL 15 Using where; Using index -2 DEPENDENT SUBQUERY t1 index NULL a 10 NULL 1 Using index +2 SUBQUERY t1 index NULL a 10 NULL 15 Using index EXPLAIN SELECT 1 FROM t1 AS t1_outer GROUP BY a HAVING a > (SELECT max(b) FROM t1 GROUP BY a HAVING a < 2); id select_type table type possible_keys key key_len ref rows Extra === modified file 'mysql-test/suite/pbxt/r/subselect.result' --- a/mysql-test/suite/pbxt/r/subselect.result 2009-12-16 09:28:51 +0000 +++ b/mysql-test/suite/pbxt/r/subselect.result 2010-02-23 09:22:02 +0000 @@ -1293,31 +1293,31 @@ a 4 explain extended select * from t2 where t2.a in (select a from t1); id select_type table type possible_keys key key_len ref rows filtered Extra -1 PRIMARY t2 index NULL PRIMARY 4 NULL 4 100.00 Using where; Using index -2 DEPENDENT SUBQUERY t1 unique_subquery PRIMARY PRIMARY 4 func 1 100.00 Using index +1 PRIMARY t2 index PRIMARY PRIMARY 4 NULL 4 100.00 Using index +1 PRIMARY t1 index PRIMARY PRIMARY 4 NULL 4 75.00 Using where; Using index; Using join buffer Warnings: -Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t2` where 
<in_optimizer>(`test`.`t2`.`a`,<exists>(<primary_index_lookup>(<cache>(`test`.`t2`.`a`) in t1 on PRIMARY))) +Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t1` join `test`.`t2` where (`test`.`t1`.`a` = `test`.`t2`.`a`) select * from t2 where t2.a in (select a from t1 where t1.b <> 30); a 2 4 explain extended select * from t2 where t2.a in (select a from t1 where t1.b <> 30); id select_type table type possible_keys key key_len ref rows filtered Extra -1 PRIMARY t2 index NULL PRIMARY 4 NULL 4 100.00 Using where; Using index -2 DEPENDENT SUBQUERY t1 unique_subquery PRIMARY PRIMARY 4 func 1 100.00 Using where +1 PRIMARY t2 index PRIMARY PRIMARY 4 NULL 4 100.00 Using index +1 PRIMARY t1 ALL PRIMARY NULL NULL NULL 4 75.00 Using where; Using join buffer Warnings: -Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t2` where <in_optimizer>(`test`.`t2`.`a`,<exists>(<primary_index_lookup>(<cache>(`test`.`t2`.`a`) in t1 on PRIMARY where ((`test`.`t1`.`b` <> 30) and (<cache>(`test`.`t2`.`a`) = `test`.`t1`.`a`))))) +Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t1` join `test`.`t2` where ((`test`.`t1`.`a` = `test`.`t2`.`a`) and (`test`.`t1`.`b` <> 30)) select * from t2 where t2.a in (select t1.a from t1,t3 where t1.b=t3.a); a 2 3 explain extended select * from t2 where t2.a in (select t1.a from t1,t3 where t1.b=t3.a); id select_type table type possible_keys key key_len ref rows filtered Extra -1 PRIMARY t2 index NULL PRIMARY 4 NULL 4 100.00 Using where; Using index -2 DEPENDENT SUBQUERY t1 eq_ref PRIMARY PRIMARY 4 func 1 100.00 -2 DEPENDENT SUBQUERY t3 eq_ref PRIMARY PRIMARY 4 test.t1.b 1 100.00 Using index +1 PRIMARY t2 index PRIMARY PRIMARY 4 NULL 4 100.00 Using index +1 PRIMARY t1 ALL PRIMARY NULL NULL NULL 4 75.00 Using where; Using join buffer +1 PRIMARY t3 eq_ref PRIMARY PRIMARY 4 test.t1.b 1 100.00 Using index Warnings: -Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t2` where <in_optimizer>(`test`.`t2`.`a`,<exists>(select 1 AS `Not_used` from 
`test`.`t1` join `test`.`t3` where ((`test`.`t3`.`a` = `test`.`t1`.`b`) and (<cache>(`test`.`t2`.`a`) = `test`.`t1`.`a`)))) +Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t1` join `test`.`t3` join `test`.`t2` where ((`test`.`t1`.`a` = `test`.`t2`.`a`) and (`test`.`t3`.`a` = `test`.`t1`.`b`)) drop table t1, t2, t3; create table t1 (a int, b int, index a (a,b)); create table t2 (a int, index a (a)); @@ -1332,31 +1332,31 @@ a 4 explain extended select * from t2 where t2.a in (select a from t1); id select_type table type possible_keys key key_len ref rows filtered Extra -1 PRIMARY t2 index NULL a 5 NULL 4 100.00 Using where; Using index -2 DEPENDENT SUBQUERY t1 index_subquery a a 5 func 1 100.00 Using index +1 PRIMARY t2 index a a 5 NULL 4 100.00 Using index +1 PRIMARY t1 ref a a 5 test.t2.a 1 100.00 Using index; FirstMatch(t2) Warnings: -Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t2` where <in_optimizer>(`test`.`t2`.`a`,<exists>(<index_lookup>(<cache>(`test`.`t2`.`a`) in t1 on a))) +Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t2` semi join (`test`.`t1`) where (`test`.`t1`.`a` = `test`.`t2`.`a`) select * from t2 where t2.a in (select a from t1 where t1.b <> 30); a 2 4 explain extended select * from t2 where t2.a in (select a from t1 where t1.b <> 30); id select_type table type possible_keys key key_len ref rows filtered Extra -1 PRIMARY t2 index NULL a 5 NULL 4 100.00 Using where; Using index -2 DEPENDENT SUBQUERY t1 index_subquery a a 5 func 1 100.00 Using index; Using where +1 PRIMARY t2 index a a 5 NULL 4 100.00 Using index +1 PRIMARY t1 ref a a 5 test.t2.a 1 100.00 Using where; Using index; FirstMatch(t2) Warnings: -Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t2` where <in_optimizer>(`test`.`t2`.`a`,<exists>(<index_lookup>(<cache>(`test`.`t2`.`a`) in t1 on a where ((`test`.`t1`.`b` <> 30) and (<cache>(`test`.`t2`.`a`) = `test`.`t1`.`a`))))) +Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t2` semi join (`test`.`t1`) where 
((`test`.`t1`.`a` = `test`.`t2`.`a`) and (`test`.`t1`.`b` <> 30)) select * from t2 where t2.a in (select t1.a from t1,t3 where t1.b=t3.a); a 2 3 explain extended select * from t2 where t2.a in (select t1.a from t1,t3 where t1.b=t3.a); id select_type table type possible_keys key key_len ref rows filtered Extra -1 PRIMARY t2 index NULL a 5 NULL 4 100.00 Using where; Using index -2 DEPENDENT SUBQUERY t1 ref a a 5 func 1 100.00 Using index -2 DEPENDENT SUBQUERY t3 ref a a 5 test.t1.b 1 100.00 Using index +1 PRIMARY t2 index a a 5 NULL 4 100.00 Using index +1 PRIMARY t1 ref a a 5 test.t2.a 1 100.00 Using index +1 PRIMARY t3 ref a a 5 test.t1.b 1 100.00 Using index; FirstMatch(t2) Warnings: -Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t2` where <in_optimizer>(`test`.`t2`.`a`,<exists>(select 1 AS `Not_used` from `test`.`t1` join `test`.`t3` where ((`test`.`t3`.`a` = `test`.`t1`.`b`) and (<cache>(`test`.`t2`.`a`) = `test`.`t1`.`a`)))) +Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t2` semi join (`test`.`t1` join `test`.`t3`) where ((`test`.`t1`.`a` = `test`.`t2`.`a`) and (`test`.`t3`.`a` = `test`.`t1`.`b`)) insert into t1 values (3,31); select * from t2 where t2.a in (select a from t1 where t1.b <> 30); a @@ -1369,10 +1369,10 @@ a 4 explain extended select * from t2 where t2.a in (select a from t1 where t1.b <> 30); id select_type table type possible_keys key key_len ref rows filtered Extra -1 PRIMARY t2 index NULL a 5 NULL 4 100.00 Using where; Using index -2 DEPENDENT SUBQUERY t1 index_subquery a a 5 func 1 100.00 Using index; Using where +1 PRIMARY t2 index a a 5 NULL 4 100.00 Using index +1 PRIMARY t1 ref a a 5 test.t2.a 1 100.00 Using where; Using index; FirstMatch(t2) Warnings: -Note 1003 select `test`.`t2`.`a` AS `a` from `test`.`t2` where <in_optimizer>(`test`.`t2`.`a`,<exists>(<index_lookup>(<cache>(`test`.`t2`.`a`) in t1 on a where ((`test`.`t1`.`b` <> 30) and (<cache>(`test`.`t2`.`a`) = `test`.`t1`.`a`))))) +Note 1003 select `test`.`t2`.`a` AS 
`a` from `test`.`t2` semi join (`test`.`t1`) where ((`test`.`t1`.`a` = `test`.`t2`.`a`) and (`test`.`t1`.`b` <> 30)) drop table t1, t2, t3; create table t1 (a int, b int); create table t2 (a int, b int); @@ -2823,10 +2823,10 @@ Warnings: Note 1003 select `test`.`t1`.`one` AS `one`,`test`.`t1`.`two` AS `two`,<in_optimizer>((`test`.`t1`.`one`,`test`.`t1`.`two`),<exists>(select `test`.`t2`.`one` AS `one`,`test`.`t2`.`two` AS `two` from `test`.`t2` where ((`test`.`t2`.`flag` = '0') and trigcond(((<cache>(`test`.`t1`.`one`) = `test`.`t2`.`one`) or isnull(`test`.`t2`.`one`))) and trigcond(((<cache>(`test`.`t1`.`two`) = `test`.`t2`.`two`) or isnull(`test`.`t2`.`two`)))) having (trigcond(<is_not_null_test>(`test`.`t2`.`one`)) and trigcond(<is_not_null_test>(`test`.`t2`.`two`))))) AS `test` from `test`.`t1` explain extended SELECT one,two from t1 where ROW(one,two) IN (SELECT one,two FROM t2 WHERE flag = 'N'); id select_type table type possible_keys key key_len ref rows filtered Extra -1 PRIMARY t1 ALL NULL NULL NULL NULL 8 100.00 Using where -2 DEPENDENT SUBQUERY t2 ALL NULL NULL NULL NULL 9 100.00 Using where +1 PRIMARY t1 ALL NULL NULL NULL NULL 8 100.00 +1 PRIMARY t2 ALL NULL NULL NULL NULL 9 100.00 Using where; FirstMatch(t1) Warnings: -Note 1003 select `test`.`t1`.`one` AS `one`,`test`.`t1`.`two` AS `two` from `test`.`t1` where <in_optimizer>((`test`.`t1`.`one`,`test`.`t1`.`two`),<exists>(select `test`.`t2`.`one` AS `one`,`test`.`t2`.`two` AS `two` from `test`.`t2` where ((`test`.`t2`.`flag` = 'N') and (<cache>(`test`.`t1`.`one`) = `test`.`t2`.`one`) and (<cache>(`test`.`t1`.`two`) = `test`.`t2`.`two`)))) +Note 1003 select `test`.`t1`.`one` AS `one`,`test`.`t1`.`two` AS `two` from `test`.`t1` semi join (`test`.`t2`) where ((`test`.`t2`.`two` = `test`.`t1`.`two`) and (`test`.`t2`.`one` = `test`.`t1`.`one`) and (`test`.`t2`.`flag` = 'N')) explain extended SELECT one,two,ROW(one,two) IN (SELECT one,two FROM t2 WHERE flag = '0' group by one,two) as 'test' from t1; id 
select_type table type possible_keys key key_len ref rows filtered Extra 1 PRIMARY t1 ALL NULL NULL NULL NULL 8 100.00 @@ -3412,7 +3412,7 @@ EXPLAIN SELECT * FROM t1 WHERE (a,b) = ANY (SELECT a, max(b) FROM t1 GROUP BY a); id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY t1 ALL NULL NULL NULL NULL 9 Using where -2 DEPENDENT SUBQUERY t1 ALL NULL NULL NULL NULL 9 Using temporary; Using filesort +2 SUBQUERY t1 ALL NULL NULL NULL NULL 9 Using temporary; Using filesort ALTER TABLE t1 ADD INDEX(a); SELECT * FROM t1 WHERE (a,b) = ANY (SELECT a, max(b) FROM t1 GROUP BY a); a b @@ -3423,7 +3423,7 @@ EXPLAIN SELECT * FROM t1 WHERE (a,b) = ANY (SELECT a, max(b) FROM t1 GROUP BY a); id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY t1 ALL NULL NULL NULL NULL 9 Using where -2 DEPENDENT SUBQUERY t1 index NULL a 8 NULL 1 Using filesort +2 SUBQUERY t1 ALL NULL NULL NULL NULL 9 Using temporary; Using filesort DROP TABLE t1; create table t1( f1 int,f2 int); insert into t1 values (1,1),(2,2); @@ -4213,8 +4213,8 @@ CREATE INDEX I1 ON t1 (a); CREATE INDEX I2 ON t1 (b); EXPLAIN SELECT a,b FROM t1 WHERE b IN (SELECT a FROM t1); id select_type table type possible_keys key key_len ref rows Extra -1 PRIMARY t1 ALL NULL NULL NULL NULL 2 Using where -2 DEPENDENT SUBQUERY t1 index_subquery I1 I1 2 func 1 Using index; Using where +1 PRIMARY t1 index I1 I1 2 NULL 2 Using index; LooseScan +1 PRIMARY t1 ref I2 I2 13 test.t1.a 1 Using where SELECT a,b FROM t1 WHERE b IN (SELECT a FROM t1); a b CREATE TABLE t2 (a VARCHAR(1), b VARCHAR(10)); @@ -4223,15 +4223,15 @@ CREATE INDEX I1 ON t2 (a); CREATE INDEX I2 ON t2 (b); EXPLAIN SELECT a,b FROM t2 WHERE b IN (SELECT a FROM t2); id select_type table type possible_keys key key_len ref rows Extra -1 PRIMARY t2 ALL NULL NULL NULL NULL 2 Using where -2 DEPENDENT SUBQUERY t2 index_subquery I1 I1 4 func 1 Using index; Using where +1 PRIMARY t2 index I1 I1 4 NULL 2 Using index; LooseScan +1 PRIMARY t2 ref 
I2 I2 13 test.t2.a 1 Using where SELECT a,b FROM t2 WHERE b IN (SELECT a FROM t2); a b EXPLAIN SELECT a,b FROM t1 WHERE b IN (SELECT a FROM t1 WHERE LENGTH(a)<500); id select_type table type possible_keys key key_len ref rows Extra -1 PRIMARY t1 ALL NULL NULL NULL NULL 2 Using where -2 DEPENDENT SUBQUERY t1 index_subquery I1 I1 2 func 1 Using index; Using where +1 PRIMARY t1 index I1 I1 2 NULL 2 Using where; Using index; LooseScan +1 PRIMARY t1 ref I2 I2 13 test.t1.a 1 Using where SELECT a,b FROM t1 WHERE b IN (SELECT a FROM t1 WHERE LENGTH(a)<500); a b DROP TABLE t1,t2; === modified file 'mysql-test/t/join_cache.test' --- a/mysql-test/t/join_cache.test 2009-12-21 02:26:15 +0000 +++ b/mysql-test/t/join_cache.test 2010-03-06 19:14:55 +0000 @@ -1823,3 +1823,27 @@ SELECT t1.*, t2.*, LENGTH(t2.c1), LENGTH set join_cache_level=default; DROP TABLE t1,t2; + +--echo # +--echo # Bug #51092: linked join buffer is used for a 3-way cross join query +--echo # that selects only records of the first table +--echo # + +create table t1 (a int, b int); +insert into t1 values (1,1),(2,2); +create table t2 (a int, b int); +insert into t2 values (1,1),(2,2); +create table t3 (a int, b int); +insert into t3 values (1,1),(2,2); + +explain select t1.* from t1,t2,t3; +select t1.* from t1,t2,t3; + +set join_cache_level=2; + +explain select t1.* from t1,t2,t3; +select t1.* from t1,t2,t3; + +set join_cache_level=default; + +drop table t1,t2,t3; === modified file 'mysql-test/t/subselect_sj.test' --- a/mysql-test/t/subselect_sj.test 2010-02-21 07:53:12 +0000 +++ b/mysql-test/t/subselect_sj.test 2010-02-24 11:33:42 +0000 @@ -728,3 +728,45 @@ where a in (select c from t2 where d >= drop table t1, t2, t3; +--echo # +--echo # Bug#49198 Wrong result for second call of procedure +--echo # with view in subselect. 
+--echo # + +CREATE TABLE t1 (t1field integer, primary key (t1field)); +CREATE TABLE t2 (t2field integer, primary key (t2field)); +CREATE TABLE t3 (t3field integer, primary key (t3field)); + +CREATE VIEW v2 AS SELECT * FROM t2; +CREATE VIEW v3 AS SELECT * FROM t3; + +INSERT INTO t1 VALUES(1),(2); +INSERT INTO t2 VALUES(1),(2); +INSERT INTO t3 VALUES(1),(2); + +PREPARE stmt FROM +" +SELECT t1field +FROM t1 +WHERE t1field IN (SELECT * FROM v2); +"; + +EXECUTE stmt; +EXECUTE stmt; + +PREPARE stmt FROM +" +EXPLAIN +SELECT t1field +FROM t1 +WHERE t1field IN (SELECT * FROM v2) + AND t1field IN (SELECT * FROM v3) +"; + +EXECUTE stmt; +EXECUTE stmt; + +DROP TABLE t1, t2, t3; +DROP VIEW v2, v3; + +--echo # End of Bug#49198 === modified file 'mysql-test/t/subselect_sj2.test' --- a/mysql-test/t/subselect_sj2.test 2010-01-17 14:51:10 +0000 +++ b/mysql-test/t/subselect_sj2.test 2010-03-07 15:41:45 +0000 @@ -583,7 +583,7 @@ update t2 set b=11 where a=3; if (`select @@join_cache_level=6`) { - --echo + --echo # Not anymore: --echo # The following query gives wrong result due to Bug#49129 } select * from t0 where t0.a in === modified file 'mysql-test/t/subselect_sj_jcl6.test' --- a/mysql-test/t/subselect_sj_jcl6.test 2010-01-17 14:51:10 +0000 +++ b/mysql-test/t/subselect_sj_jcl6.test 2010-03-07 15:41:45 +0000 @@ -7,5 +7,33 @@ show variables like 'join_cache_level'; --source t/subselect_sj.test +--echo # +--echo # BUG#49129: Wrong result with IN-subquery with join_cache_level=6 and firstmatch=off +--echo # +CREATE TABLE t0 (a INT); +INSERT INTO t0 VALUES (0),(1),(2),(3),(4); +CREATE TABLE t1 (a INT, b INT, KEY(a)); +INSERT INTO t1 SELECT a, a from t0; +CREATE TABLE t2 (a INT, b INT, PRIMARY KEY(a)); +INSERT INTO t2 SELECT * FROM t1; +UPDATE t1 SET a=3, b=11 WHERE a=4; +UPDATE t2 SET b=11 WHERE a=3; + +set @save_optimizer_switch=@@optimizer_switch; +set optimizer_switch='firstmatch=off'; + +--echo The following should use a join order of t0,t1,t2, with DuplicateElimination: +explain 
+SELECT * FROM t0 WHERE t0.a IN + (SELECT t1.a FROM t1, t2 WHERE t2.a=t0.a AND t1.b=t2.b); + +SELECT * FROM t0 WHERE t0.a IN + (SELECT t1.a FROM t1, t2 WHERE t2.a=t0.a AND t1.b=t2.b); + +set optimizer_switch=@save_optimizer_switch; +drop table t0, t1, t2; + +--echo # End + set join_cache_level=default; show variables like 'join_cache_level'; === modified file 'sql/item.cc' --- a/sql/item.cc 2010-02-21 06:32:23 +0000 +++ b/sql/item.cc 2010-02-24 11:33:42 +0000 @@ -6491,11 +6491,9 @@ void Item_outer_ref::fix_after_pullout(s void Item_ref::fix_after_pullout(st_select_lex *new_parent, Item **refptr) { + (*ref)->fix_after_pullout(new_parent, ref); if (depended_from == new_parent) - { - (*ref)->fix_after_pullout(new_parent, ref); depended_from= NULL; - } } === modified file 'sql/opt_subselect.cc' --- a/sql/opt_subselect.cc 2010-02-19 21:55:57 +0000 +++ b/sql/opt_subselect.cc 2010-03-09 10:36:15 +0000 @@ -531,7 +531,6 @@ static bool replace_where_subcondition(J *expr= new_cond; if (do_fix_fields) new_cond->fix_fields(join->thd, expr); - join->select_lex->where= *expr; return FALSE; } @@ -3031,10 +3030,24 @@ int setup_semijoin_dups_elimination(JOIN forwards, but do not destroy other duplicate elimination methods. */ uint first_table= i; + uint join_cache_level= join->thd->variables.join_cache_level; for (uint j= i; j < i + pos->n_sj_tables; j++) { - if (join->best_positions[j].use_join_buffer && j <= no_jbuf_after) + /* + When we'll properly take join buffering into account during + join optimization, the below check should be changed to + "if (join->best_positions[j].use_join_buffer && + j <= no_jbuf_after)". 
+ For now, use a rough criteria: + */ + JOIN_TAB *js_tab=join->join_tab + j; + if (j != join->const_tables && js_tab->use_quick != 2 && + j <= no_jbuf_after && + ((js_tab->type == JT_ALL && join_cache_level != 0) || + (join_cache_level > 4 && (tab->type == JT_REF || + tab->type == JT_EQ_REF)))) { + /* Looks like we'll be using join buffer */ first_table= join->const_tables; break; } @@ -3112,7 +3125,12 @@ int setup_semijoin_dups_elimination(JOIN JOIN_TAB *j, *jump_to= tab-1; for (j= tab; j != tab + pos->n_sj_tables; j++) { - if (!tab->emb_sj_nest) + /* + NOTE: this loop probably doesn't do the right thing for the case + where FirstMatch's duplicate-generating range is interleaved with + "unrelated" tables (as specified in WL#3750, section 2.2). + */ + if (!j->emb_sj_nest) jump_to= tab; else { === modified file 'sql/sql_join_cache.cc' --- a/sql/sql_join_cache.cc 2010-02-15 21:53:06 +0000 +++ b/sql/sql_join_cache.cc 2010-03-07 15:41:45 +0000 @@ -31,6 +31,8 @@ #include "sql_select.h" #include "opt_subselect.h" +#define NO_MORE_RECORDS_IN_BUFFER (uint)(-1) + /***************************************************************************** * Join cache module @@ -407,8 +409,10 @@ void JOIN_CACHE::set_constants() However at this moment we don't know whether we have referenced fields for the cache or not. Later when a referenced field is registered for the cache we adjust the value of the flag 'with_length'. - */ - with_length= is_key_access() || with_match_flag; + */ + with_length= is_key_access() || + join_tab->is_inner_table_of_semi_join_with_first_match() || + join_tab->is_inner_table_of_outer_join(); /* At this moment we don't know yet the value of 'referenced_fields', but in any case it can't be greater than the value of 'fields'. 
@@ -604,7 +608,12 @@ int JOIN_CACHE_BKA::init() copy_end= cache->field_descr+cache->fields; for (copy= cache->field_descr+cache->flag_fields; copy < copy_end; copy++) { - if (copy->field->table == tab->table && + /* + (1) - when we store rowids for DuplicateWeedout, they have + copy->field==NULL + */ + if (copy->field && // (1) + copy->field->table == tab->table && bitmap_is_set(key_read_set, copy->field->field_index)) { *copy_ptr++= copy; @@ -1235,7 +1244,7 @@ bool JOIN_CACHE::get_record() prev_rec_ptr= prev_cache->get_rec_ref(pos); } curr_rec_pos= pos; - if (!(res= read_all_record_fields() == 0)) + if (!(res= read_all_record_fields() == NO_MORE_RECORDS_IN_BUFFER)) { pos+= referenced_fields*size_of_fld_ofs; if (prev_cache) @@ -1304,7 +1313,7 @@ bool JOIN_CACHE::get_match_flag_by_pos(u uchar *prev_rec_ptr= prev_cache->get_rec_ref(rec_ptr); return prev_cache->get_match_flag_by_pos(prev_rec_ptr); } - DBUG_ASSERT(1); + DBUG_ASSERT(0); return FALSE; } @@ -1324,7 +1333,8 @@ bool JOIN_CACHE::get_match_flag_by_pos(u read data. 
RETURN - length of the data read from the join buffer + (-1) - if there is no more records in the join buffer + length of the data read from the join buffer - otherwise */ uint JOIN_CACHE::read_all_record_fields() @@ -1332,7 +1342,7 @@ uint JOIN_CACHE::read_all_record_fields( uchar *init_pos= pos; if (pos > last_rec_pos || !records) - return 0; + return NO_MORE_RECORDS_IN_BUFFER; /* First match flag, read null bitmaps and null_row flag for each table */ read_flag_fields(); @@ -1538,12 +1548,12 @@ bool JOIN_CACHE::read_referenced_field(C bool JOIN_CACHE::skip_record_if_match() { - DBUG_ASSERT(with_match_flag && with_length); + DBUG_ASSERT(with_length); uint offset= size_of_rec_len; if (prev_cache) offset+= prev_cache->get_size_of_rec_offset(); /* Check whether the match flag is on */ - if (test(*(pos+offset))) + if (get_match_flag_by_pos(pos+offset)) { pos+= size_of_rec_len + get_rec_length(pos); return TRUE; === modified file 'sql/sql_select.cc' --- a/sql/sql_select.cc 2010-02-19 21:55:57 +0000 +++ b/sql/sql_select.cc 2010-03-09 10:36:15 +0000 @@ -5635,7 +5635,11 @@ void calc_used_field_length(THD *thd, JO uint blob_length=(uint) (join_tab->table->file->stats.mean_rec_length- (join_tab->table->s->reclength-rec_length)); rec_length+=(uint) max(4,blob_length); - } + } + /* + psergey-todo: why we don't count here rowid that we might need to store + when using DuplicateElimination? + */ join_tab->used_fields=fields; join_tab->used_fieldlength=rec_length; join_tab->used_blobs=blobs; @@ -6355,10 +6359,17 @@ make_outerjoin_info(JOIN *join) } if (!tab->first_inner) tab->first_inner= nested_join->first_nested; + if (tab->table->reginfo.not_exists_optimize) + tab->first_inner->table->reginfo.not_exists_optimize= 1; if (++nested_join->counter < nested_join->n_tables) break; /* Table tab is the last inner table for nested join. 
*/ nested_join->first_nested->last_inner= tab; + if (tab->first_inner->table->reginfo.not_exists_optimize) + { + for (JOIN_TAB *join_tab= tab->first_inner; join_tab <= tab; join_tab++) + join_tab->table->reginfo.not_exists_optimize= 1; + } } } DBUG_VOID_RETURN; @@ -7112,18 +7123,14 @@ uint check_join_cache_usage(JOIN_TAB *ta if (tab->use_quick == 2) goto no_join_cache; /* - Use join cache with FirstMatch semi-join strategy only when semi-join - contains only one table. - */ - if (tab->is_inner_table_of_semi_join_with_first_match() && - !tab->is_single_inner_of_semi_join_with_first_match()) - goto no_join_cache; - /* Non-linked join buffers can't guarantee one match */ - if (force_unlinked_cache && - (tab->is_inner_table_of_outer_join() && - !tab->is_single_inner_of_outer_join())) + if (force_unlinked_cache && + (!tab->type == JT_ALL || cache_level <= 4) && + ((tab->is_inner_table_of_semi_join_with_first_match() && + !tab->is_single_inner_of_semi_join_with_first_match()) || + (tab->is_inner_table_of_outer_join() && + !tab->is_single_inner_of_outer_join()))) goto no_join_cache; /* === modified file 'sql/sql_select.h' --- a/sql/sql_select.h 2010-02-15 21:53:06 +0000 +++ b/sql/sql_select.h 2010-03-05 18:54:48 +0000 @@ -321,8 +321,8 @@ typedef struct st_join_table { } bool check_only_first_match() { - return last_sj_inner_tab == this || - (first_inner && first_inner->last_inner == this && + return is_inner_table_of_semi_join_with_first_match() || + (is_inner_table_of_outer_join() && table->reginfo.not_exists_optimize); } bool is_last_inner_table()
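
A note on the sql_join_cache.cc hunks above: read_all_record_fields() used to return 0 for "no more records", and get_record() compared the result against 0. The patch switches both sides to the NO_MORE_RECORDS_IN_BUFFER marker, (uint)(-1). The sketch below illustrates why a dedicated marker is safer than 0. It assumes (the patch does not state this) that a buffered record can legitimately have a data length of 0, which is plausible for a query such as the Bug #51092 test that selects columns of the first table only. ToyRecordBuffer, NO_MORE_RECORDS and the two count_* helpers are hypothetical names used for illustration, not server code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the patch's NO_MORE_RECORDS_IN_BUFFER macro.
static const unsigned NO_MORE_RECORDS = static_cast<unsigned>(-1);

// Toy buffer: each entry is the data length of one stored record.
// A length of 0 models a record that carries no field data.
class ToyRecordBuffer {
public:
  explicit ToyRecordBuffer(const std::vector<unsigned> &lengths)
    : lengths_(lengths), pos_(0) {}

  // Old convention: 0 doubles as the end-of-buffer marker.
  unsigned read_next_old() {
    if (pos_ >= lengths_.size())
      return 0;
    return lengths_[pos_++];
  }

  // New convention: a value no real length can take marks the end.
  unsigned read_next() {
    if (pos_ >= lengths_.size())
      return NO_MORE_RECORDS;
    return lengths_[pos_++];
  }

private:
  std::vector<unsigned> lengths_;
  std::size_t pos_;
};

// Counts records using the old convention; stops at the first 0.
unsigned count_old(ToyRecordBuffer buf) {
  unsigned n = 0;
  while (buf.read_next_old() != 0)
    n++;
  return n;
}

// Counts records using the dedicated end-of-buffer marker.
unsigned count_new(ToyRecordBuffer buf) {
  unsigned n = 0;
  while (buf.read_next() != NO_MORE_RECORDS)
    n++;
  return n;
}
```

With record lengths {4, 0, 7}, count_old() stops at the zero-length record and reports 1 record, while count_new() reports all 3. Under the stated assumption, this is the kind of premature end-of-buffer that the marker avoids.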


[Maria-developers] Rev 2765: MWL#68 Subquery optimization: Efficient NOT IN execution with NULLs in file:///home/tsk/mprog/src/5.3-mwl68/
by timour＠askmonty.org 09 Mar '10


At file:///home/tsk/mprog/src/5.3-mwl68/

------------------------------------------------------------
revno: 2765
revision-id: timour(a)askmonty.org-20100309101406-xygkt2sgftvjvevg
parent: timour(a)askmonty.org-20100222151655-ltjv0rlv6z2sdiiu
committer: timour(a)askmonty.org
branch nick: 5.3-mwl68
timestamp: Tue 2010-03-09 12:14:06 +0200
message:
  MWL#68 Subquery optimization: Efficient NOT IN execution with NULLs

  * Implemented a second partial matching strategy via table scan.
    This strategy is a fallback when there is no memory for rowid merging.
  * Refactored the selection and creation of partial matching strategies,
    so that the choice of strategy is encapsulated in a separate method
    choose_partial_match_strategy().
  * Refactored the representation of partial match strategies so that:
    - each strategy is represented by a polymorphic class, and
    - the base class for all partial match strategies contains common
      execution code.
  * Added an estimate of the memory needed for the rowid merge strategy,
    and the system variable "rowid_merge_buff_size" to control the maximum
    memory to be used by the rowid merge algorithm.
  * Added two optimizer_switch system variables to control the choice of
    partial match strategy:
    "partial_match_rowid_merge", "partial_match_table_scan".
  * Fixed multiple problems with deallocation of resources by the partial
    match strategies.

=== modified file 'sql/item_subselect.cc' --- a/sql/item_subselect.cc 2010-02-22 15:16:55 +0000 +++ b/sql/item_subselect.cc 2010-03-09 10:14:06 +0000 @@ -2910,13 +2910,7 @@ /* - TIMOUR: this needs more thinking, as exec() is a wrong IMO because: - - we don't need empty_result_set, as it is == 1 <=> when - item->value == 0 - - scan_table() returns >0 even when there was no actuall error, - but we only found EOF while scanning.
- - scan_table should not check table->status, but it should check - HA_ERR_END_OF_FILE + TIMOUR: write comment */ int subselect_uniquesubquery_engine::index_lookup() @@ -2924,8 +2918,6 @@ DBUG_ENTER("subselect_uniquesubquery_engine::index_lookup"); int error; TABLE *table= tab->table; - empty_result_set= TRUE; - table->status= 0; if (!table->file->inited) table->file->ha_index_init(tab->ref.key, 0); @@ -2934,25 +2926,25 @@ make_prev_keypart_map(tab-> ref.key_parts), HA_READ_KEY_EXACT); - DBUG_PRINT("info", ("lookup result: %i", error)); - if (error && - error != HA_ERR_KEY_NOT_FOUND && error != HA_ERR_END_OF_FILE) + + if (error && error != HA_ERR_KEY_NOT_FOUND && error != HA_ERR_END_OF_FILE) + { + /* + TIMOUR: I don't understand at all when do we need to call report_error. + In most places where we access an index, we don't do this. Why here? + */ error= report_error(table, error); + DBUG_RETURN(error); + } + + table->null_row= 0; + if (!error && (!cond || cond->val_int())) + ((Item_in_subselect *) item)->value= 1; else - { - error= 0; - table->null_row= 0; - if (!table->status && (!cond || cond->val_int())) - { - ((Item_in_subselect *) item)->value= 1; - empty_result_set= FALSE; - } - else - ((Item_in_subselect *) item)->value= 0; - } + ((Item_in_subselect *) item)->value= 0; - DBUG_RETURN(error); + DBUG_RETURN(0); } @@ -3415,19 +3407,24 @@ If max_keys > 1, then we need partial matching because there are more indexes than just the one we use during materialization to remove duplicates. + + @note + TIMOUR: The schema-based analysis for partial matching can be done once for + prepared statement and remembered. It is done here to remove the need to + save/restore all related variables between each re-execution, thus making + the code simpler. 
+ + @retval PARTIAL_MATCH if a partial match should be used + @retval COMPLETE_MATCH if a complete match (index lookup) should be used */ -void subselect_hash_sj_engine::set_strategy_using_schema() +subselect_hash_sj_engine::exec_strategy +subselect_hash_sj_engine::get_strategy_using_schema() { Item_in_subselect *item_in= (Item_in_subselect *) item; - DBUG_ENTER("subselect_hash_sj_engine::set_strategy_using_schema"); - if (item_in->is_top_level_item()) - { - strategy= COMPLETE_MATCH; - DBUG_VOID_RETURN; - } + return COMPLETE_MATCH; else { List_iterator<Item> inner_col_it(*item_in->unit->get_unit_column_types()); @@ -3450,10 +3447,8 @@ /* If no column contains NULLs use regular hash index lookups. */ if (count_partial_match_columns) - strategy= PARTIAL_MATCH; - else - strategy= COMPLETE_MATCH; - DBUG_VOID_RETURN; + return PARTIAL_MATCH; + return COMPLETE_MATCH; } @@ -3465,19 +3460,25 @@ matching type of columns that cannot be NULL or that contain only NULLs. Based on this, the procedure determines the final execution strategy for the [NOT] IN predicate. + + @retval PARTIAL_MATCH if a partial match should be used + @retval COMPLETE_MATCH if a complete match (index lookup) should be used */ -void subselect_hash_sj_engine::set_strategy_using_data() +subselect_hash_sj_engine::exec_strategy +subselect_hash_sj_engine::get_strategy_using_data() { Item_in_subselect *item_in= (Item_in_subselect *) item; select_materialize_with_stats *result_sink= (select_materialize_with_stats *) result; Item *outer_col; - DBUG_ENTER("subselect_hash_sj_engine::set_strategy_using_data"); - - /* Call this procedure only if already selected partial matching. */ - DBUG_ASSERT(strategy == PARTIAL_MATCH); + /* + If we already determined that a complete match is enough based on schema + information, nothing can be better. 
+ */ + if (strategy == COMPLETE_MATCH) + return COMPLETE_MATCH; for (uint i= 0; i < item_in->left_expr->cols(); i++) { @@ -3501,9 +3502,117 @@ /* If no column contains NULLs use regular hash index lookups. */ if (!count_partial_match_columns) - strategy= COMPLETE_MATCH; - - DBUG_VOID_RETURN; + return COMPLETE_MATCH; + return PARTIAL_MATCH; +} + + +void +subselect_hash_sj_engine::choose_partial_match_strategy( + bool has_non_null_key, bool has_covering_null_row, + MY_BITMAP *partial_match_key_parts) +{ + size_t pm_buff_size; + + DBUG_ASSERT(strategy == PARTIAL_MATCH); + /* + Choose according to global optimizer switch. If only one of the switches is + 'ON', then the remaining strategy is the only possible one. The only cases + when this will be overriden is when the total size of all buffers for the + merge strategy is bigger than the 'rowid_merge_buff_size' system variable, + or if there isn't enough physical memory to allocate the buffers. + */ + if (!optimizer_flag(thd, OPTIMIZER_SWITCH_PARTIAL_MATCH_ROWID_MERGE) && + optimizer_flag(thd, OPTIMIZER_SWITCH_PARTIAL_MATCH_TABLE_SCAN)) + strategy= PARTIAL_MATCH_SCAN; + else if + ( optimizer_flag(thd, OPTIMIZER_SWITCH_PARTIAL_MATCH_ROWID_MERGE) && + !optimizer_flag(thd, OPTIMIZER_SWITCH_PARTIAL_MATCH_TABLE_SCAN)) + strategy= PARTIAL_MATCH_MERGE; + + /* + If both switches are ON, or both are OFF, we interpret that as "let the + optimizer decide". Perform a cost based choice between the two partial + matching strategies. + */ + /* + TIMOUR: the above interpretation of the switch values could be changed to: + - if both are ON - let the optimizer decide, + - if both are OFF - do not use partial matching, therefore do not use + materialization in non-top-level predicates. + The problem with this is that we know for sure if we need partial matching + only after the subquery is materialized, and this is too late to revert to + the IN=>EXISTS strategy. 
+ */ + if (strategy == PARTIAL_MATCH) + { + /* + TIMOUR: Currently we use a super simplistic measure. This will be + addressed in a separate task. + */ + if (tmp_table->file->stats.records < 100) + strategy= PARTIAL_MATCH_SCAN; + else + strategy= PARTIAL_MATCH_MERGE; + } + + /* Check if there is enough memory for the rowid merge strategy. */ + if (strategy == PARTIAL_MATCH_MERGE) + { + pm_buff_size= rowid_merge_buff_size(has_non_null_key, + has_covering_null_row, + partial_match_key_parts); + if (pm_buff_size > thd->variables.rowid_merge_buff_size) + strategy= PARTIAL_MATCH_SCAN; + } +} + + +/* + Compute the memory size of all buffers proportional to the number of rows + in tmp_table. + + @details + If the result is bigger than thd->variables.rowid_merge_buff_size, partial + matching via merging is not applicable. +*/ + +size_t subselect_hash_sj_engine::rowid_merge_buff_size( + bool has_non_null_key, bool has_covering_null_row, + MY_BITMAP *partial_match_key_parts) +{ + size_t buff_size; /* Total size of all buffers used by partial matching. */ + ha_rows row_count= tmp_table->file->stats.records; + uint rowid_length= tmp_table->file->ref_length; + select_materialize_with_stats *result_sink= + (select_materialize_with_stats *) result; + + /* Size of the subselect_rowid_merge_engine::row_num_to_rowid buffer. */ + buff_size= row_count * rowid_length * sizeof(uchar); + + if (has_non_null_key) + { + /* Add the size of Ordered_key::key_buff of the only non-NULL key. */ + buff_size+= row_count * sizeof(rownum_t); + } + + if (!has_covering_null_row) + { + for (uint i= 0; i < partial_match_key_parts->n_bits; i++) + { + if (!bitmap_is_set(partial_match_key_parts, i) || + result_sink->get_null_count_of_col(i) == row_count) + continue; /* In these cases we wouldn't construct Ordered keys. 
*/ + + /* Add the size of Ordered_key::key_buff */ + buff_size+= (row_count - result_sink->get_null_count_of_col(i)) * + sizeof(rownum_t); + /* Add the size of Ordered_key::null_key */ + buff_size+= bitmap_buffer_size(result_sink->get_max_null_of_col(i)); + } + } + + return buff_size; } @@ -3561,7 +3670,6 @@ thd->mem_root)) DBUG_RETURN(TRUE); - set_strategy_using_schema(); /* Create and initialize a select result interceptor that stores the result stream in a temporary table. The temporary table itself is @@ -3623,7 +3731,9 @@ ((Item_in_subselect *) item)->left_expr->cols() == tmp_table->key_info->key_parts); - if (make_semi_join_conds()) + if (make_semi_join_conds() || + /* A unique_engine is used both for complete and partial matching. */ + !(lookup_engine= make_unique_engine())) DBUG_RETURN(TRUE); DBUG_RETURN(FALSE); @@ -3691,7 +3801,7 @@ DBUG_RETURN(TRUE); } } - if (semi_join_conds->fix_fields(thd, &semi_join_conds)) + if (semi_join_conds->fix_fields(thd, (Item**)&semi_join_conds)) DBUG_RETURN(TRUE); DBUG_RETURN(FALSE); @@ -3791,7 +3901,7 @@ clause of the query, and it is not 'fixed' during JOIN::prepare. */ if (semi_join_conds && !semi_join_conds->fixed && - semi_join_conds->fix_fields(thd, &semi_join_conds)) + semi_join_conds->fix_fields(thd, (Item**)&semi_join_conds)) return TRUE; /* Let our engine reuse this query plan for materialization. 
*/ materialize_join= materialize_engine->join; @@ -3802,6 +3912,7 @@ subselect_hash_sj_engine::~subselect_hash_sj_engine() { + delete lookup_engine; delete result; if (tmp_table) free_tmp_table(thd, tmp_table); @@ -3817,9 +3928,30 @@ void subselect_hash_sj_engine::cleanup() { + enum_engine_type lookup_engine_type= lookup_engine->engine_type(); is_materialized= FALSE; + bitmap_clear_all(&non_null_key_parts); + bitmap_clear_all(&partial_match_key_parts); + count_partial_match_columns= 0; + count_null_only_columns= 0; + strategy= UNDEFINED; + materialize_engine->cleanup(); + if (lookup_engine_type == TABLE_SCAN_ENGINE || + lookup_engine_type == ROWID_MERGE_ENGINE) + { + subselect_engine *inner_lookup_engine; + inner_lookup_engine= + ((subselect_partial_match_engine*) lookup_engine)->lookup_engine; + /* + Partial match engines are recreated for each PS execution inside + subselect_hash_sj_engine::exec(). + */ + delete lookup_engine; + lookup_engine= inner_lookup_engine; + } + DBUG_ASSERT(lookup_engine->engine_type() == UNIQUESUBQUERY_ENGINE); + lookup_engine->cleanup(); result->cleanup(); /* Resets the temp table as well. */ - materialize_engine->cleanup(); } @@ -3838,6 +3970,7 @@ { Item_in_subselect *item_in= (Item_in_subselect *) item; SELECT_LEX *save_select= thd->lex->current_select; + subselect_partial_match_engine *pm_engine= NULL; int res= 0; DBUG_ENTER("subselect_hash_sj_engine::exec"); @@ -3881,59 +4014,86 @@ DBUG_RETURN(FALSE); } - if (strategy == PARTIAL_MATCH) - set_strategy_using_data(); - - /* A unique_engine is used both for complete and partial matching. */ - if (!(lookup_engine= make_unique_engine())) - { - res= 1; - goto err; - } - - if (strategy == PARTIAL_MATCH) - { - subselect_rowid_merge_engine *rowid_merge_engine; - uint count_pm_keys; - MY_BITMAP *nn_key_parts; - bool has_covering_null_row; + /* + TIMOUR: The schema-based analysis for partial matching can be done once for + prepared statement and remembered. 
It is done here to remove the need to + save/restore all related variables between each re-execution, thus making + the code simpler. + */ + strategy= get_strategy_using_schema(); + /* This call may discover that we don't need partial matching at all. */ + strategy= get_strategy_using_data(); + if (strategy == PARTIAL_MATCH) + { + uint count_pm_keys; /* Total number of keys needed for partial matching. */ + MY_BITMAP *nn_key_parts; /* The key parts of the only non-NULL index. */ + uint covering_null_row_width; select_materialize_with_stats *result_sink= (select_materialize_with_stats *) result; - /* Total number of keys needed for partial matching. */ nn_key_parts= (count_partial_match_columns < tmp_table->s->fields) ? &non_null_key_parts : NULL; - has_covering_null_row= (result_sink->get_max_nulls_in_row() == - tmp_table->s->fields - - (nn_key_parts ? bitmap_bits_set(nn_key_parts) : 0)); + if (result_sink->get_max_nulls_in_row() == + tmp_table->s->fields - + (nn_key_parts ? bitmap_bits_set(nn_key_parts) : 0)) + covering_null_row_width= result_sink->get_max_nulls_in_row(); + else + covering_null_row_width= 0; - if (has_covering_null_row) + if (covering_null_row_width) count_pm_keys= nn_key_parts ? 1 : 0; else count_pm_keys= count_partial_match_columns - count_null_only_columns + (nn_key_parts ? 1 : 0); - if (!(rowid_merge_engine= - new subselect_rowid_merge_engine((subselect_uniquesubquery_engine*) - lookup_engine, - tmp_table, - count_pm_keys, - has_covering_null_row, - item, result)) || - rowid_merge_engine->init(nn_key_parts, &partial_match_key_parts)) + choose_partial_match_strategy(test(nn_key_parts), + test(covering_null_row_width), + &partial_match_key_parts); + DBUG_ASSERT(strategy == PARTIAL_MATCH_MERGE || + strategy == PARTIAL_MATCH_SCAN); + if (strategy == PARTIAL_MATCH_MERGE) { - strategy= PARTIAL_MATCH_SCAN; - delete rowid_merge_engine; - /* TIMOUR: setup execution structures for partial match via scanning. 
*/ + pm_engine= + new subselect_rowid_merge_engine((subselect_uniquesubquery_engine*) + lookup_engine, tmp_table, + count_pm_keys, + covering_null_row_width, + item, result, + semi_join_conds->argument_list()); + if (!pm_engine || + ((subselect_rowid_merge_engine*) pm_engine)-> + init(nn_key_parts, &partial_match_key_parts)) + { + /* + The call to init() would fail if there was not enough memory to allocate + all buffers for the rowid merge strategy. In this case revert to table + scanning which doesn't need any big buffers. + */ + delete pm_engine; + pm_engine= NULL; + strategy= PARTIAL_MATCH_SCAN; + } } - else + + if (strategy == PARTIAL_MATCH_SCAN) { - strategy= PARTIAL_MATCH_INDEX; - lookup_engine= rowid_merge_engine; + if (!(pm_engine= + new subselect_table_scan_engine((subselect_uniquesubquery_engine*) + lookup_engine, tmp_table, + item, result, + semi_join_conds->argument_list(), + covering_null_row_width))) + { + /* This is an irrecoverable error. */ + res= 1; + goto err; + } } } + if (pm_engine) + lookup_engine= pm_engine; item_in->change_engine(lookup_engine); err: @@ -4009,10 +4169,8 @@ Ordered_key::~Ordered_key() { - /* - All data structures are allocated on thd->mem_root, thus we don't - free them here. - */ + my_free((char*) key_buff, MYF(0)); + bitmap_free(&null_key); } @@ -4030,6 +4188,7 @@ */ } + /* Initialize a multi-column index. */ @@ -4103,14 +4262,16 @@ } +/* + Allocate the buffers for both the row number, and the NULL-bitmap indexes. +*/ + bool Ordered_key::alloc_keys_buffers() { - THD *thd= tbl->in_use; - DBUG_ASSERT(key_buff_elements > 0); - if (!(key_buff= (rownum_t*) thd->alloc(key_buff_elements * - sizeof(rownum_t)))) + if (!(key_buff= (rownum_t*) my_malloc(key_buff_elements * sizeof(rownum_t), + MYF(MY_WME)))) return TRUE; /* @@ -4118,10 +4279,8 @@ (max_null_row - min_null_row), and then use min_null_row as lookup offset. */ - if (bitmap_init_memroot(&null_key, - /* this is max array index, we need count, so +1. 
*/ - max_null_row + 1, - thd->mem_root)) + /* Notice that max_null_row is max array index, we need count, so +1. */ + if (bitmap_init(&null_key, NULL, max_null_row + 1, FALSE)) return TRUE; cur_key_idx= HA_POS_ERROR; @@ -4193,8 +4352,9 @@ /* - The probability that a certain row does not contain a NULL in some row in - a NULL-indexed column. + The fraction of rows that do not contain NULL in the columns indexed by + this key. + @retval 1 if there are no NULLs @retval 0 if only NULLs */ @@ -4353,10 +4513,122 @@ } +subselect_partial_match_engine::subselect_partial_match_engine( + subselect_uniquesubquery_engine *engine_arg, + TABLE *tmp_table_arg, Item_subselect *item_arg, + select_result_interceptor *result_arg, + List<Item> *equi_join_conds_arg, + uint covering_null_row_width_arg) + :subselect_engine(item_arg, result_arg), + tmp_table(tmp_table_arg), lookup_engine(engine_arg), + equi_join_conds(equi_join_conds_arg), + covering_null_row_width(covering_null_row_width_arg) +{} + + +int subselect_partial_match_engine::exec() +{ + Item_in_subselect *item_in= (Item_in_subselect *) item; + int res; + + /* Try to find a matching row by index lookup. */ + res= lookup_engine->copy_ref_key_simple(); + if (res == -1) + { + /* The result is FALSE based on the outer reference. */ + item_in->value= 0; + item_in->null_value= 0; + return 0; + } + else if (res == 0) + { + /* Search for a complete match. */ + if ((res= lookup_engine->index_lookup())) + { + /* An error occured during lookup(). */ + item_in->value= 0; + item_in->null_value= 0; + return res; + } + else if (item_in->value) + { + /* + A complete match was found, the result of IN is TRUE. + Notice: (this->item == lookup_engine->item) + */ + return 0; + } + } + + if (covering_null_row_width == tmp_table->s->fields) + { + /* + If there is a NULL-only row that coveres all columns the result of IN + is UNKNOWN. + */ + item_in->value= 0; + /* + TIMOUR: which one is the right way to propagate an UNKNOWN result? 
+ Should we also set empty_result_set= FALSE; ??? + */ + //item_in->was_null= 1; + item_in->null_value= 1; + return 0; + } + + /* + There is no complete match. Look for a partial match (UNKNOWN result), or + no match (FALSE). + */ + if (tmp_table->file->inited) + tmp_table->file->ha_index_end(); + + if (partial_match()) + { + /* The result of IN is UNKNOWN. */ + item_in->value= 0; + /* + TIMOUR: which one is the right way to propagate an UNKNOWN result? + Should we also set empty_result_set= FALSE; ??? + */ + //item_in->was_null= 1; + item_in->null_value= 1; + } + else + { + /* The result of IN is FALSE. */ + item_in->value= 0; + /* + TIMOUR: which one is the right way to propagate an UNKNOWN result? + Should we also set empty_result_set= FALSE; ??? + */ + //item_in->was_null= 0; + item_in->null_value= 0; + } + + return 0; +} + + +void subselect_partial_match_engine::print(String *str, + enum_query_type query_type) +{ + /* + Should never be called as the actual engine cannot be known at query + optimization time. + */ + DBUG_ASSERT(FALSE); +} + + /* @param non_null_key_parts @param partial_match_key_parts A union of all single-column NULL key parts. @param count_partial_match_columns Number of NULL keyparts (set bits above). + + @retval FALSE the engine was initialized successfully + @retval TRUE there was some (memory allocation) error during initialization, + such errors should be interpreted as revert to other strategy */ bool @@ -4379,14 +4651,17 @@ return FALSE; } - DBUG_ASSERT(!has_covering_null_row || (has_covering_null_row && - keys_count == 1 && - non_null_key_parts)); - + DBUG_ASSERT(!covering_null_row_width || (covering_null_row_width && + keys_count == 1 && + non_null_key_parts)); + /* + Allocate buffers to hold the merged keys and the mapping between rowids and + row numbers. 
+ */ if (!(merge_keys= (Ordered_key**) thd->alloc(keys_count * sizeof(Ordered_key*))) || - !(row_num_to_rowid= (uchar*) thd->alloc(row_count * rowid_length * - sizeof(uchar)))) + !(row_num_to_rowid= (uchar*) my_malloc(row_count * rowid_length * + sizeof(uchar), MYF(MY_WME)))) return TRUE; /* Create the only non-NULL key if there is any. */ @@ -4395,10 +4670,7 @@ non_null_key= new Ordered_key(cur_keyid, tmp_table, item_in->left_expr, 0, 0, 0, row_num_to_rowid); if (non_null_key->init(non_null_key_parts)) - { - // TIMOUR: revert to partial matching via scanning return TRUE; - } merge_keys[cur_keyid]= non_null_key; merge_keys[cur_keyid]->first(); ++cur_keyid; @@ -4406,9 +4678,10 @@ /* If there is a covering NULL row, the only key that is needed is the - only non-NULL key that is already created above. + only non-NULL key that is already created above. We create keys on + NULL-able columns only if there is no covering NULL row. */ - if (!has_covering_null_row) + if (!covering_null_row_width) { if (bitmap_init_memroot(&matching_keys, keys_count, thd->mem_root) || bitmap_init_memroot(&matching_outer_cols, keys_count, thd->mem_root) || @@ -4436,10 +4709,7 @@ result_sink->get_max_null_of_col(i), row_num_to_rowid); if (merge_keys[cur_keyid]->init(i)) - { - // TIMOUR: revert to partial matching via scanning return TRUE; - } merge_keys[cur_keyid]->first(); } ++cur_keyid; @@ -4510,10 +4780,7 @@ if (init_queue(&pq, keys_count, 0, FALSE, subselect_rowid_merge_engine::cmp_keys_by_cur_rownum, NULL)) - { - // TIMOUR: revert to partial matching via scanning return TRUE; - } return FALSE; } @@ -4521,26 +4788,21 @@ subselect_rowid_merge_engine::~subselect_rowid_merge_engine() { - delete_queue(&pq); + /* None of the resources below is allocated if there are no ordered keys. 
*/ + if (keys_count) + { + my_free((char*) row_num_to_rowid, MYF(0)); + for (uint i= 0; i < keys_count; i++) + delete merge_keys[i]; + delete_queue(&pq); + if (tmp_table->file->inited == handler::RND) + tmp_table->file->ha_rnd_end(); + } } void subselect_rowid_merge_engine::cleanup() { - lookup_engine->cleanup(); - /* Tell handler we don't need the index anymore */ - if (tmp_table->file->inited) - tmp_table->file->ha_rnd_end(); - queue_remove_all(&pq); -} - - -void subselect_rowid_merge_engine::print(String *str, enum_query_type query_type) -{ - str->append(STRING_WITH_LEN("<rowid_merge>(")); - for (uint i= 0; i < keys_count; i++) - merge_keys[i]->print(str); - str->append(')'); } @@ -4627,20 +4889,31 @@ Ordered_key *cur_key; rownum_t cur_row_num; uint count_nulls_in_search_key= 0; + bool res= FALSE; /* If there is a non-NULL key, it must be the first key in the keys array. */ DBUG_ASSERT(!non_null_key || (non_null_key && merge_keys[0] == non_null_key)); + + /* All data accesses during execution are via handler::ha_rnd_pos() */ + tmp_table->file->ha_rnd_init(0); + /* Check if there is a match for the columns of the only non-NULL key. */ if (non_null_key && !non_null_key->lookup()) - return FALSE; + { + res= FALSE; + goto end; + } /* If there is a NULL (sub)row that covers all NULL-able columns, then there is a guranteed partial match, and we don't need to search for the matching row. */ - if (has_covering_null_row) - return TRUE; + if (covering_null_row_width) + { + res= TRUE; + goto end; + } if (non_null_key) queue_insert(&pq, (uchar *) non_null_key); @@ -4667,14 +4940,20 @@ if (count_nulls_in_search_key == ((Item_in_subselect *) item)->left_expr->cols() - (non_null_key ? non_null_key->get_column_count() : 0)) - return TRUE; + { + res= TRUE; + goto end; + } /* If there is no NULL (sub)row that covers all NULL columns, and there is no single match for any of the NULL columns, the result is FALSE. 
*/ if (pq.elements - test(non_null_key) == 0) - return FALSE; + { + res= FALSE; + goto end; + } DBUG_ASSERT(pq.elements); @@ -4692,10 +4971,8 @@ Check the only matching row of the only key min_key for NULL matches in the other columns. */ - if (test_null_row(min_row_num)) - return TRUE; - else - return FALSE; + res= test_null_row(min_row_num); + goto end; } while (TRUE) @@ -4710,7 +4987,10 @@ /* Follows from the correct use of priority queue. */ DBUG_ASSERT(cur_row_num > min_row_num); if (test_null_row(min_row_num)) - return TRUE; + { + res= TRUE; + goto end; + } else { min_key= cur_key; @@ -4727,99 +5007,112 @@ if (pq.elements == 0) { /* Check the last row of the last column in PQ for NULL matches. */ - if (test_null_row(min_row_num)) - return TRUE; - else - return FALSE; + res= test_null_row(min_row_num); + goto end; } } - /* We should never get here. */ + /* We should never get here - all branches must be handled explicitly above. */ DBUG_ASSERT(FALSE); - return FALSE; + +end: + tmp_table->file->ha_rnd_end(); + return res; } -int subselect_rowid_merge_engine::exec() +subselect_table_scan_engine::subselect_table_scan_engine( + subselect_uniquesubquery_engine *engine_arg, + TABLE *tmp_table_arg, + Item_subselect *item_arg, + select_result_interceptor *result_arg, + List<Item> *equi_join_conds_arg, + uint covering_null_row_width_arg) + :subselect_partial_match_engine(engine_arg, tmp_table_arg, item_arg, + result_arg, equi_join_conds_arg, + covering_null_row_width_arg) +{} + + +/* + TIMOUR: + This method is based on subselect_uniquesubquery_engine::scan_table(). + Consider refactoring somehow, 80% of the code is the same. 
+ + for each row_i in tmp_table + { + count_matches= 0; + for each row element row_i[j] + { + if (outer_ref[j] is NULL || row_i[j] is NULL || outer_ref[j] == row_i[j]) + ++count_matches; + } + if (count_matches == outer_ref.elements) + return TRUE + } + return FALSE +*/ + +bool subselect_table_scan_engine::partial_match() { - Item_in_subselect *item_in= (Item_in_subselect *) item; - int res; - - /* Try to find a matching row by index lookup. */ - res= lookup_engine->copy_ref_key_simple(); - if (res == -1) - { - /* The result is FALSE based on the outer reference. */ - item_in->value= 0; - item_in->null_value= 0; - return 0; - } - else if (res == 0) - { - if ((res= lookup_engine->index_lookup())) - { - /* An error occured during lookup(). */ - item_in->value= 0; - item_in->null_value= 0; - return res; - } - else if (item_in->value) - { - /* - A complete match was found, the result of IN is TRUE. - Notice: (this->item == lookup_engine->item) - */ - return 0; - } - } - - if (has_covering_null_row && !keys_count) - { - /* - If there is a NULL-only row that coveres all columns the result of IN - is UNKNOWN. - */ - item_in->value= 0; - /* - TIMOUR: which one is the right way to propagate an UNKNOWN result? - Should we also set empty_result_set= FALSE; ??? - */ - //item_in->was_null= 1; - item_in->null_value= 1; - return 0; - } - - /* All data accesses during execution are via handler::ha_rnd_pos() */ - if (tmp_table->file->inited) - tmp_table->file->ha_index_end(); - tmp_table->file->ha_rnd_init(0); + List_iterator_fast<Item> equality_it(*equi_join_conds); + Item *cur_eq; + uint count_matches; + int error; + bool res; + + tmp_table->file->ha_rnd_init(1); + tmp_table->file->extra_opt(HA_EXTRA_CACHE, + current_thd->variables.read_buff_size); /* - There is no complete match. Look for a partial match (UNKNOWN result), or - no match (FALSE). + TIMOUR: + scan_table() also calls "table->null_row= 0;", why, do we need it? 
*/ - if (partial_match()) - { - /* The result of IN is UNKNOWN. */ - item_in->value= 0; - /* - TIMOUR: which one is the right way to propagate an UNKNOWN result? - Should we also set empty_result_set= FALSE; ??? - */ - //item_in->was_null= 1; - item_in->null_value= 1; - } - else - { - /* The result of IN is FALSE. */ - item_in->value= 0; - /* - TIMOUR: which one is the right way to propagate an UNKNOWN result? - Should we also set empty_result_set= FALSE; ??? - */ - //item_in->was_null= 0; - item_in->null_value= 0; - } + for (;;) + { + error= tmp_table->file->ha_rnd_next(tmp_table->record[0]); + if (error) { + if (error == HA_ERR_RECORD_DELETED) + { + error= 0; + continue; + } + if (error == HA_ERR_END_OF_FILE) + { + error= 0; + break; + } + else + { + error= report_error(tmp_table, error); + break; + } + } + + equality_it.rewind(); + count_matches= 0; + while ((cur_eq= equality_it++)) + { + DBUG_ASSERT(cur_eq->type() == Item::FUNC_ITEM && + ((Item_func*)cur_eq)->functype() == Item_func::EQ_FUNC); + if (!cur_eq->val_int() && !cur_eq->null_value) + break; + ++count_matches; + } + if (count_matches == tmp_table->s->fields) + { + res= TRUE; /* Found a matching row. 
*/ + goto end; + } + } + + res= FALSE; +end: tmp_table->file->ha_rnd_end(); - - return 0; + return res; +} + + +void subselect_table_scan_engine::cleanup() +{ } === modified file 'sql/item_subselect.h' --- a/sql/item_subselect.h 2010-02-22 15:16:55 +0000 +++ b/sql/item_subselect.h 2010-03-09 10:14:06 +0000 @@ -436,7 +436,7 @@ friend class Item_in_optimizer; friend class subselect_indexsubquery_engine; friend class subselect_hash_sj_engine; - friend class subselect_rowid_merge_engine; + friend class subselect_partial_match_engine; }; @@ -472,7 +472,7 @@ enum enum_engine_type {ABSTRACT_ENGINE, SINGLE_SELECT_ENGINE, UNION_ENGINE, UNIQUESUBQUERY_ENGINE, INDEXSUBQUERY_ENGINE, HASH_SJ_ENGINE, - ROR_INTERSECT_ENGINE}; + ROWID_MERGE_ENGINE, TABLE_SCAN_ENGINE}; subselect_engine(Item_subselect *si, select_result_interceptor *res) :thd(0) @@ -716,6 +716,109 @@ } +/** + Compute an IN predicate via a hash semi-join. This class is responsible for + the materialization of the subquery, and the selection of the correct and + optimal execution method (e.g. direct index lookup, or partial matching) for + the IN predicate. +*/ + +class subselect_hash_sj_engine : public subselect_engine +{ +protected: + /* The table into which the subquery is materialized. */ + TABLE *tmp_table; + /* TRUE if the subquery was materialized into a temp table. */ + bool is_materialized; + /* + The old engine already chosen at parse time and stored in permanent memory. + Through this member we can re-create and re-prepare materialize_join for + each execution of a prepared statement. We also reuse the functionality + of subselect_single_select_engine::[prepare | cols]. + */ + subselect_single_select_engine *materialize_engine; + /* The engine used to compute the IN predicate. */ + subselect_engine *lookup_engine; + /* + QEP to execute the subquery and materialize its result into a + temporary table. Created during the first call to exec(). 
+ */ + JOIN *materialize_join; + + /* Keyparts of the only non-NULL composite index in a rowid merge. */ + MY_BITMAP non_null_key_parts; + /* Keyparts of the single column indexes with NULL, one keypart per index. */ + MY_BITMAP partial_match_key_parts; + uint count_partial_match_columns; + uint count_null_only_columns; + /* + A conjunction of all the equality condtions between all pairs of expressions + that are arguments of an IN predicate. We need these to post-filter some + IN results because index lookups sometimes match values that are actually + not equal to the search key in SQL terms. + */ + Item_cond_and *semi_join_conds; + /* Possible execution strategies that can be used to compute hash semi-join.*/ + enum exec_strategy { + UNDEFINED, + COMPLETE_MATCH, /* Use regular index lookups. */ + PARTIAL_MATCH, /* Use some partial matching strategy. */ + PARTIAL_MATCH_MERGE, /* Use partial matching through index merging. */ + PARTIAL_MATCH_SCAN, /* Use partial matching through table scan. */ + IMPOSSIBLE /* Subquery materialization is not applicable. */ + }; + /* The chosen execution strategy. Computed after materialization. 
*/ + exec_strategy strategy; +protected: + exec_strategy get_strategy_using_schema(); + exec_strategy get_strategy_using_data(); + size_t rowid_merge_buff_size(bool has_non_null_key, + bool has_covering_null_row, + MY_BITMAP *partial_match_key_parts); + void choose_partial_match_strategy(bool has_non_null_key, + bool has_covering_null_row, + MY_BITMAP *partial_match_key_parts); + bool make_semi_join_conds(); + subselect_uniquesubquery_engine* make_unique_engine(); + +public: + subselect_hash_sj_engine(THD *thd, Item_subselect *in_predicate, + subselect_single_select_engine *old_engine) + :subselect_engine(in_predicate, NULL), tmp_table(NULL), + is_materialized(FALSE), materialize_engine(old_engine), lookup_engine(NULL), + materialize_join(NULL), count_partial_match_columns(0), + count_null_only_columns(0), semi_join_conds(NULL), strategy(UNDEFINED) + { + set_thd(thd); + } + ~subselect_hash_sj_engine(); + + bool init_permanent(List<Item> *tmp_columns); + bool init_runtime(); + void cleanup(); + int prepare() { return 0; } /* Override virtual function in base class. */ + int exec(); + virtual void print(String *str, enum_query_type query_type); + uint cols() + { + return materialize_engine->cols(); + } + uint8 uncacheable() { return UNCACHEABLE_DEPENDENT; } + table_map upper_select_const_tables() { return 0; } + bool no_rows() { return !tmp_table->file->stats.records; } + virtual enum_engine_type engine_type() { return HASH_SJ_ENGINE; } + /* + TODO: factor out all these methods in a base subselect_index_engine class + because all of them have dummy implementations and should never be called. + */ + void fix_length_and_dec(Item_cache** row);//=>base class + void exclude(); //=>base class + //=>base class + bool change_result(Item_subselect *si, select_result_interceptor *result); + bool no_tables();//=>base class +}; + + /* Distinguish the type od (0-based) row numbers from the type of the index into an array of row numbers. 
@@ -745,7 +848,7 @@ PS (re)execution, however most of the comprising objects can be reused. */ -class Ordered_key +class Ordered_key : public Sql_alloc { protected: /* @@ -761,6 +864,8 @@ uint key_column_count; /* An expression, or sequence of expressions that forms the search key. + The search key is a sequence when it is Item_row. Each element of the + sequence is accessible via Item::element_index(int i). */ Item *search_key; @@ -808,8 +913,6 @@ int cmp_key_with_search_key(rownum_t row_num); public: - static void *operator new(size_t size) throw () - { return sql_alloc(size); } Ordered_key(uint keyid_arg, TABLE *tbl_arg, Item *search_key_arg, ha_rows null_count_arg, ha_rows min_null_row_arg, ha_rows max_null_row_arg, @@ -828,6 +931,10 @@ DBUG_ASSERT(i < key_column_count); return key_columns[i]->field->field_index; } + /* + Get the search key element that corresponds to the i-th key part of this + index. + */ Item *get_search_key(uint i) { return search_key->element_index(key_columns[i]->field->field_index); @@ -899,7 +1006,7 @@ }; -class subselect_rowid_merge_engine: public subselect_engine +class subselect_partial_match_engine : public subselect_engine { protected: /* The temporary table that contains a materialized subquery. */ @@ -910,6 +1017,51 @@ FALSE and UNKNOWN. */ subselect_uniquesubquery_engine *lookup_engine; + /* A list of equalities between each pair of IN operands. */ + List<Item> *equi_join_conds; + /* + If there is a row, such that all its NULL-able components are NULL, this + member is set to the number of covered columns. If there is no covering + row, then this is 0. 
+  */
+  uint covering_null_row_width;
+protected:
+  virtual bool partial_match()= 0;
+public:
+  subselect_partial_match_engine(subselect_uniquesubquery_engine *engine_arg,
+                                 TABLE *tmp_table_arg, Item_subselect *item_arg,
+                                 select_result_interceptor *result_arg,
+                                 List<Item> *equi_join_conds_arg,
+                                 uint covering_null_row_width_arg);
+  int prepare() { return 0; }
+  int exec();
+  void fix_length_and_dec(Item_cache**) {}
+  uint cols() { /* TODO: what is the correct value? */ return 1; }
+  uint8 uncacheable() { return UNCACHEABLE_DEPENDENT; }
+  void exclude() {}
+  table_map upper_select_const_tables() { return 0; }
+  bool change_result(Item_subselect*, select_result_interceptor*)
+  { DBUG_ASSERT(FALSE); return false; }
+  bool no_tables() { return false; }
+  bool no_rows()
+  {
+    /*
+      TODO: It is completely unclear what is the semantics of this
+      method. The current result is computed so that the call to no_rows()
+      from Item_in_optimizer::val_int() sets Item_in_optimizer::null_value
+      correctly.
+    */
+    return !(((Item_in_subselect *) item)->null_value);
+  }
+  void print(String*, enum_query_type);
+
+  friend void subselect_hash_sj_engine::cleanup();
+};
+
+
+class subselect_rowid_merge_engine: public subselect_partial_match_engine
+{
+protected:
   /*
     Mapping from row numbers to row ids. The rowids are stored sequentially
     in the array - rowid[i] is located in row_num_to_rowid + i * rowid_length.
@@ -953,8 +1105,6 @@
     This queue is used by the partial match algorithm in method exec().
   */
   QUEUE pq;
-  /* True if there is a NULL (sub)row that covers all NULLable columns. */
-  bool has_covering_null_row;
 protected:
   /*
     Comparison function to compare keys in order of decreasing bitmap
@@ -972,143 +1122,34 @@
 public:
   subselect_rowid_merge_engine(subselect_uniquesubquery_engine *engine_arg,
                                TABLE *tmp_table_arg, uint keys_count_arg,
-                               uint has_covering_null_row_arg,
+                               uint covering_null_row_width_arg,
                                Item_subselect *item_arg,
-                               select_result_interceptor *result_arg)
-    :subselect_engine(item_arg, result_arg),
-    tmp_table(tmp_table_arg), lookup_engine(engine_arg),
-    keys_count(keys_count_arg), non_null_key(NULL),
-    has_covering_null_row(has_covering_null_row_arg)
+                               select_result_interceptor *result_arg,
+                               List<Item> *equi_join_conds_arg)
+    :subselect_partial_match_engine(engine_arg, tmp_table_arg, item_arg,
+                                    result_arg, equi_join_conds_arg,
+                                    covering_null_row_width_arg),
+    keys_count(keys_count_arg), non_null_key(NULL)
   { thd= lookup_engine->get_thd(); }
   ~subselect_rowid_merge_engine();
   bool init(MY_BITMAP *non_null_key_parts, MY_BITMAP *partial_match_key_parts);
   void cleanup();
-  int prepare() { return 0; }
-  void fix_length_and_dec(Item_cache**) {}
-  int exec();
-  uint cols() { /* TODO: what is the correct value? */ return 1; }
-  uint8 uncacheable() { return UNCACHEABLE_DEPENDENT; }
-  void exclude() {}
-  table_map upper_select_const_tables() { return 0; }
-  void print(String*, enum_query_type);
-  bool change_result(Item_subselect*, select_result_interceptor*)
-  { DBUG_ASSERT(FALSE); return false; }
-  bool no_tables() { return false; }
-  bool no_rows()
-  {
-    /*
-      TODO: It is completely unclear what is the semantics of this
-      method. The current result is computed so that the call to no_rows()
-      from Item_in_optimizer::val_int() sets Item_in_optimizer::null_value
-      correctly.
-    */
-    return !(((Item_in_subselect *) item)->null_value);
-  }
+  virtual enum_engine_type engine_type() { return ROWID_MERGE_ENGINE; }
 };
 
 
-/**
-  Compute an IN predicate via a hash semi-join. This class is responsible for
-  the materialization of the subquery, and the selection of the correct and
-  optimal execution method (e.g. direct index lookup, or partial matching) for
-  the IN predicate.
-*/
-
-class subselect_hash_sj_engine : public subselect_engine
+class subselect_table_scan_engine: public subselect_partial_match_engine
 {
 protected:
-  /* The table into which the subquery is materialized. */
-  TABLE *tmp_table;
-  /* TRUE if the subquery was materialized into a temp table. */
-  bool is_materialized;
-  /*
-    The old engine already chosen at parse time and stored in permanent memory.
-    Through this member we can re-create and re-prepare materialize_join for
-    each execution of a prepared statement. We also reuse the functionality
-    of subselect_single_select_engine::[prepare | cols].
-  */
-  subselect_single_select_engine *materialize_engine;
-  /* The engine used to compute the IN predicate. */
-  subselect_engine *lookup_engine;
-  /*
-    QEP to execute the subquery and materialize its result into a
-    temporary table. Created during the first call to exec().
-  */
-  JOIN *materialize_join;
-  /*
-    TRUE if the subquery result has an all-NULL column, which means that
-    there at best can be a partial match for any IN execution.
-  */
-  bool inner_partial_match;
-  /*
-    TRUE if the materialized subquery contains a whole row only of NULLs.
-  */
-  bool has_null_row;
-
-  /* Keyparts of the only non-NULL composite index in a rowid merge. */
-  MY_BITMAP non_null_key_parts;
-  /* Keyparts of the single column indexes with NULL, one keypart per index. */
-  MY_BITMAP partial_match_key_parts;
-  uint count_partial_match_columns;
-  uint count_null_only_columns;
-  /*
-    A conjunction of all the equality condtions between all pairs of expressions
-    that are arguments of an IN predicate. We need these to post-filter some
-    IN results because index lookups sometimes match values that are actually
-    not equal to the search key in SQL terms.
-  */
-  Item *semi_join_conds;
-  /* Possible execution strategies that can be used to compute hash semi-join.*/
-  enum exec_strategy {
-    COMPLETE_MATCH, /* Use regular index lookups. */
-    PARTIAL_MATCH, /* Use some partial matching strategy. */
-    PARTIAL_MATCH_INDEX, /* Use partial matching through index merging. */
-    PARTIAL_MATCH_SCAN, /* Use partial matching through table scan. */
-    IMPOSSIBLE /* Subquery materialization is not applicable. */
-  };
-  /* The chosen execution strategy. Computed after materialization. */
-  exec_strategy strategy;
-protected:
-  void set_strategy_using_schema();
-  void set_strategy_using_data();
-  bool make_semi_join_conds();
-  subselect_uniquesubquery_engine* make_unique_engine();
-
+  bool partial_match();
 public:
-  subselect_hash_sj_engine(THD *thd, Item_subselect *in_predicate,
-                           subselect_single_select_engine *old_engine)
-    :subselect_engine(in_predicate, NULL), tmp_table(NULL),
-    is_materialized(FALSE), materialize_engine(old_engine), lookup_engine(NULL),
-    materialize_join(NULL), count_partial_match_columns(0),
-    count_null_only_columns(0), semi_join_conds(NULL)
-  {
-    set_thd(thd);
-  }
-  ~subselect_hash_sj_engine();
-
-  bool init_permanent(List<Item> *tmp_columns);
-  bool init_runtime();
+  subselect_table_scan_engine(subselect_uniquesubquery_engine *engine_arg,
+                              TABLE *tmp_table_arg, Item_subselect *item_arg,
+                              select_result_interceptor *result_arg,
+                              List<Item> *equi_join_conds_arg,
+                              uint covering_null_row_width_arg);
   void cleanup();
-  int prepare() { return 0; } /* Override virtual function in base class. */
-  int exec();
-  virtual void print (String *str, enum_query_type query_type);
-  uint cols()
-  {
-    return materialize_engine->cols();
-  }
-  uint8 uncacheable() { return UNCACHEABLE_DEPENDENT; }
-  table_map upper_select_const_tables() { return 0; }
-  bool no_rows() { return !tmp_table->file->stats.records; }
-  virtual enum_engine_type engine_type() { return HASH_SJ_ENGINE; }
-  /*
-    TODO: factor out all these methods in a base subselect_index_engine class
-    because all of them have dummy implementations and should never be called.
-  */
-  void fix_length_and_dec(Item_cache** row);//=>base class
-  void exclude(); //=>base class
-  //=>base class
-  bool change_result(Item_subselect *si, select_result_interceptor *result);
-  bool no_tables();//=>base class
+  virtual enum_engine_type engine_type() { return TABLE_SCAN_ENGINE; }
 };
 
 

=== modified file 'sql/mysql_priv.h'
--- a/sql/mysql_priv.h 2010-01-17 14:55:08 +0000
+++ b/sql/mysql_priv.h 2010-03-09 10:14:06 +0000
@@ -552,12 +552,14 @@
 #define OPTIMIZER_SWITCH_LOOSE_SCAN 64
 #define OPTIMIZER_SWITCH_MATERIALIZATION 128
 #define OPTIMIZER_SWITCH_SEMIJOIN 256
+#define OPTIMIZER_SWITCH_PARTIAL_MATCH_ROWID_MERGE 512
+#define OPTIMIZER_SWITCH_PARTIAL_MATCH_TABLE_SCAN 1024
 
 #ifdef DBUG_OFF
-# define OPTIMIZER_SWITCH_LAST 512
+# define OPTIMIZER_SWITCH_LAST 2048
 #else
-# define OPTIMIZER_SWITCH_TABLE_ELIMINATION 512
-# define OPTIMIZER_SWITCH_LAST 1024
+# define OPTIMIZER_SWITCH_TABLE_ELIMINATION 2048
+# define OPTIMIZER_SWITCH_LAST 4096
 #endif
 
 #ifdef DBUG_OFF
@@ -570,8 +572,10 @@
                                    OPTIMIZER_SWITCH_FIRSTMATCH | \
                                    OPTIMIZER_SWITCH_LOOSE_SCAN | \
                                    OPTIMIZER_SWITCH_MATERIALIZATION | \
-                                   OPTIMIZER_SWITCH_SEMIJOIN)
-#else
+                                   OPTIMIZER_SWITCH_SEMIJOIN | \
+                                   OPTIMIZER_SWITCH_PARTIAL_MATCH_ROWID_MERGE|\
+                                   OPTIMIZER_SWITCH_PARTIAL_MATCH_TABLE_SCAN)
+#else
 # define OPTIMIZER_SWITCH_DEFAULT (OPTIMIZER_SWITCH_INDEX_MERGE | \
                                    OPTIMIZER_SWITCH_INDEX_MERGE_UNION | \
                                    OPTIMIZER_SWITCH_INDEX_MERGE_SORT_UNION | \
@@ -581,7 +585,9 @@
                                    OPTIMIZER_SWITCH_FIRSTMATCH | \
                                    OPTIMIZER_SWITCH_LOOSE_SCAN | \
                                    OPTIMIZER_SWITCH_MATERIALIZATION | \
-                                   OPTIMIZER_SWITCH_SEMIJOIN)
+                                   OPTIMIZER_SWITCH_SEMIJOIN | \
+                                   OPTIMIZER_SWITCH_PARTIAL_MATCH_ROWID_MERGE|\
+                                   OPTIMIZER_SWITCH_PARTIAL_MATCH_TABLE_SCAN)
 #endif
 
 /*

=== modified file 'sql/mysqld.cc'
--- a/sql/mysqld.cc 2010-01-17 14:55:08 +0000
+++ b/sql/mysqld.cc 2010-03-09 10:14:06 +0000
@@ -301,7 +301,9 @@
   "index_merge","index_merge_union","index_merge_sort_union",
   "index_merge_intersection",
   "index_condition_pushdown",
-  "firstmatch","loosescan","materialization", "semijoin",
+  "firstmatch","loosescan","materialization", "semijoin",
+  "partial_match_rowid_merge",
+  "partial_match_table_scan",
 #ifndef DBUG_OFF
   "table_elimination",
 #endif
@@ -320,6 +322,8 @@
   sizeof("loosescan") - 1,
   sizeof("materialization") - 1,
   sizeof("semijoin") - 1,
+  sizeof("partial_match_rowid_merge") - 1,
+  sizeof("partial_match_table_scan") - 1,
 #ifndef DBUG_OFF
   sizeof("table_elimination") - 1,
 #endif
@@ -5794,7 +5798,8 @@
   OPT_RECORD_RND_BUFFER, OPT_DIV_PRECINCREMENT,
   OPT_RELAY_LOG_SPACE_LIMIT, OPT_RELAY_LOG_PURGE,
   OPT_SLAVE_NET_TIMEOUT, OPT_SLAVE_COMPRESSED_PROTOCOL, OPT_SLOW_LAUNCH_TIME,
-  OPT_SLAVE_TRANS_RETRIES, OPT_READONLY, OPT_DEBUGGING, OPT_DEBUG_FLUSH,
+  OPT_SLAVE_TRANS_RETRIES, OPT_READONLY, OPT_ROWID_MERGE_BUFF_SIZE,
+  OPT_DEBUGGING, OPT_DEBUG_FLUSH,
   OPT_SORT_BUFFER, OPT_TABLE_OPEN_CACHE, OPT_TABLE_DEF_CACHE,
   OPT_THREAD_CONCURRENCY, OPT_THREAD_CACHE_SIZE,
   OPT_TMP_TABLE_SIZE, OPT_THREAD_STACK,
@@ -7130,6 +7135,11 @@
    (uchar**) &max_system_variables.range_alloc_block_size, 0, GET_ULONG,
    REQUIRED_ARG, RANGE_ALLOC_BLOCK_SIZE, RANGE_ALLOC_BLOCK_SIZE,
    (longlong) ULONG_MAX, 0, 1024, 0},
+  {"rowid_merge_buff_size", OPT_ROWID_MERGE_BUFF_SIZE,
+   "The size of the buffers used [NOT] IN evaluation via partial matching.",
+   (uchar**) &global_system_variables.rowid_merge_buff_size,
+   (uchar**) &max_system_variables.rowid_merge_buff_size, 0, GET_ULONG,
+   REQUIRED_ARG, 8*1024*1024L, 0, MAX_MEM_TABLE_SIZE/2, 0, 1, 0},
   {"read_buffer_size", OPT_RECORD_BUFFER,
    "Each thread that does a sequential scan allocates a buffer of this size for each table it scans. If you do many sequential scans, you may want to increase this value.",
    (uchar**) &global_system_variables.read_buff_size,

=== modified file 'sql/set_var.cc'
--- a/sql/set_var.cc 2009-12-22 12:49:15 +0000
+++ b/sql/set_var.cc 2010-03-09 10:14:06 +0000
@@ -540,6 +540,9 @@
 static sys_var_thd_ulong sys_range_alloc_block_size(&vars, "range_alloc_block_size",
                                                     &SV::range_alloc_block_size);
 
+static sys_var_thd_ulong sys_rowid_merge_buff_size(&vars, "rowid_merge_buff_size",
+                                                    &SV::rowid_merge_buff_size);
+
 static sys_var_thd_ulong sys_query_alloc_block_size(&vars, "query_alloc_block_size",
                                                     &SV::query_alloc_block_size,
                                                     0, fix_thd_mem_root);

=== modified file 'sql/sql_class.h'
--- a/sql/sql_class.h 2010-02-19 21:55:57 +0000
+++ b/sql/sql_class.h 2010-03-09 10:14:06 +0000
@@ -343,6 +343,8 @@
   ulong mrr_buff_size;
   ulong div_precincrement;
   ulong sortbuff_size;
+  /* Total size of all buffers used by the subselect_rowid_merge_engine. */
+  ulong rowid_merge_buff_size;
   ulong thread_handling;
   ulong tx_isolation;
   ulong completion_type;

=== modified file 'support-files/build-tags'
--- a/support-files/build-tags 2009-12-15 07:16:46 +0000
+++ b/support-files/build-tags 2010-03-09 10:14:06 +0000
@@ -4,7 +4,7 @@
 filter='\.cc$\|\.c$\|\.h$\|\.yy$'
 
 list="find . -type f"
-bzr root >/dev/null 2>/dev/null && list="bzr ls --from-root --kind=file --versioned"
+bzr root >/dev/null 2>/dev/null && list="bzr ls --from-root -R --kind=file --versioned"
 
 $list |grep $filter |while read f;
 do
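
Note on the sql/mysql_priv.h hunks in the patch above: each optimizer switch is a single bit, so adding the two partial_match flags at 512 and 1024 pushes OPTIMIZER_SWITCH_LAST (and, in debug builds, OPTIMIZER_SWITCH_TABLE_ELIMINATION) up by two bit positions. The following is only a sketch of that arithmetic. The numeric values are taken from the patch, but the shortened SW_* names and the helper functions are hypothetical and do not exist in the server source.

```cpp
#include <cassert>

typedef unsigned long switch_mask;

// Values as in the patched mysql_priv.h; SW_* names are illustrative only.
const switch_mask SW_LOOSE_SCAN                = 64;
const switch_mask SW_MATERIALIZATION           = 128;
const switch_mask SW_SEMIJOIN                  = 256;
const switch_mask SW_PARTIAL_MATCH_ROWID_MERGE = 512;
const switch_mask SW_PARTIAL_MATCH_TABLE_SCAN  = 1024;
const switch_mask SW_LAST                      = 2048;  // DBUG_OFF build

// True if exactly one bit is set, i.e. the value is usable as a flag.
inline bool is_single_bit(switch_mask v)
{
  return v != 0 && (v & (v - 1)) == 0;
}

// True if the given flag is turned on in a switch mask.
inline bool switch_enabled(switch_mask switches, switch_mask flag)
{
  return (switches & flag) != 0;
}

// A subset of the new default mask: only the flags visible in the hunk.
inline switch_mask default_subset()
{
  return SW_LOOSE_SCAN | SW_MATERIALIZATION | SW_SEMIJOIN |
         SW_PARTIAL_MATCH_ROWID_MERGE | SW_PARTIAL_MATCH_TABLE_SCAN;
}
```

With these definitions, clearing one bit (for example default_subset() & ~SW_PARTIAL_MATCH_ROWID_MERGE) disables rowid merge while leaving table scan enabled, which is the effect one would expect from turning a single switch off.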

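
Note on the class refactoring in the patch above: subselect_partial_match_engine declares exec() together with a pure virtual partial_match(), and the rowid-merge and table-scan engines derive from it. The body of exec() is not part of this excerpt, so the sketch below simply assumes that the base class drives execution and delegates the matching step to the subclass. All class and function names here are hypothetical stand-ins, and the matching rule (a row partially matches when every non-NULL component equals the key) is the usual NULL-aware IN reading, not code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A nullable integer; stands in for one column of the materialized subquery.
struct Cell
{
  bool is_null;
  int value;
};
typedef std::vector<Cell> Row;

inline Cell val(int v) { Cell c; c.is_null= false; c.value= v; return c; }
inline Cell null_cell() { Cell c; c.is_null= true; c.value= 0; return c; }

// Hypothetical stand-in for subselect_partial_match_engine.
class PartialMatchEngine
{
public:
  virtual ~PartialMatchEngine() {}
  // Shared driver: remember the search key, then delegate to the strategy.
  bool exec(const Row &key) { search_key= key; return partial_match(); }
protected:
  virtual bool partial_match()= 0;
  Row search_key;
};

// Hypothetical stand-in for subselect_table_scan_engine.
class TableScanEngine : public PartialMatchEngine
{
public:
  explicit TableScanEngine(const std::vector<Row> &rows) : table(rows) {}
protected:
  // Scan every row; NULL on either side is treated as "may match".
  // Assumes each row has at least as many columns as the search key.
  bool partial_match()
  {
    for (std::size_t r= 0; r < table.size(); ++r)
    {
      bool row_matches= true;
      for (std::size_t c= 0; c < search_key.size() && row_matches; ++c)
      {
        const Cell &cell= table[r][c];
        if (cell.is_null || search_key[c].is_null)
          continue;
        if (cell.value != search_key[c].value)
          row_matches= false;
      }
      if (row_matches)
        return true;
    }
    return false;
  }
private:
  std::vector<Row> table;
};
```

A second subclass with an index-merge based partial_match() would plug into the same exec() without touching the callers, which appears to be the motivation for factoring the common methods into the base class.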