developers
Threads by month
- ----- 2024 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2023 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2022 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2021 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2020 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2019 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2018 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2017 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2016 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2015 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2014 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2013 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2012 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2011 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2010 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2009 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
December 2009
- 20 participants
- 156 discussions
Re: [Maria-developers] [Merge] lp:~paul-mccullagh/maria/maria-pbxt-rc3 into lp:maria
by Kristian Nielsen 21 Dec '09
by Kristian Nielsen 21 Dec '09
21 Dec '09
Hi Daniel,
I have merged this from Paul into MariaDB.
This means that the next MariaDB release (5.1.41) will include PBXT version
1.0.09f RC3 as described by Paul below.
I hope you can use this to prepare MariaDB 5.1.41 release notes.
- Kristian.
Paul McCullagh <paul.mccullagh(a)primebase.org> writes:
> Paul McCullagh has proposed merging lp:~paul-mccullagh/maria/maria-pbxt-rc3 into lp:maria.
>
> Requested reviews:
> Maria-captains (maria-captains)
> Related bugs:
> #345524 pbxt does not compile on 64 bit windows
> https://bugs.launchpad.net/bugs/345524
>
>
> This is the 3rd RC release of PBXT. It includes a number of bug fixes (some specifically for MariaDB), and 2 features:
>
> - XA/2-Phase commit support
> - Online backup native driver for PBXT
>
> The backup uses the MySQL backup API (MySQL 6.0/5.4), and is not yet available in MariaDB.
>
> Release notes since the RC2 version already in MariaDB are as follows:
>
> ------- 1.0.09f RC3 - 2009-11-30
>
> RN291: Fixed bug #489088: On shutdown MySQL reports: [Warning] Plugin 'PBXT' will be forced to shutdown.
>
> RN290: Fixed bug #345524: pbxt does not compile on 64 bit windows. Currently atomic operations are not supported on this platform.
>
> RN286: Fixed a bug introduced in RN281, which could cause an index scan to hang. The original change was to prevent a warning in Valgrind.
>
> RN285: Merged changes required to compile with Drizzle.
>
> RN284: Fixed bug that cause the error "[ERROR] Invalid (old?) table or database name 'mysqld.1'", when running temp_table.test under MariaDB (thanks to Monty for his initial bug fix). Added a fix for partition table names as well.
>
> RN283: Added win_inttypes.h to the distribution. This file is only required for the Windows build.
>
> RN282: Fixed bug #451101: jump or move depends on uninitialised value in myxt_get_key_length
>
> RN281: Fixed bug #451080: Uninitialised memory write in XTDatabaseLog::xlog_append
>
> RN280: Fixed bug #451085: jump or move depends on uninitialised value in my_type_to_string
>
> RN279: Fixed bug #441000: xtstat crashes with segmentation fault on startup if max_pbxt_threads exceeded.
>
> ------- 1.0.09e RC3 - 2009-11-20
>
> RN278: Fixed compile error with MySQL 5.1.41.
>
> ------- 1.0.09d RC3 - 2009-09-30
>
> RN277: Added r/o flag to pbxt_max_threads server variable (this fix is related to bug #430637)
>
> RN276: Added test case for replication on tables w/o PKs (see bug #430716)
>
> RN275: Fixed bug #430600: 'Failed to read auto-increment value from storage engine' error.
>
> RN274: Fixed bug #431240: This report is public edit xtstat fails if no PBXT table has been created. xtstat now accepts --database=information_schema or --database=pbxt. Depending on this setting PBXT will either use the information_schema.pbxt_statistics or the pbxt.statistics table. If information_schema is used, then the statistics are displayed even when no PBXT table exists. Recovery activity is also displayed, unless pbxt_support_xa=1, in which case MySQL will wait for PBXT recovery to complete before allowing connections.
>
> RN273: Fixed bug #430633: XA_RBDEADLOCK is not returned on XA END after the transacting ended with a deadlock.
>
> RN272: Fixed bug #430596: Backup/restore does not work well even on a basic PBXT table with auto-increment.
>
> ------- 1.0.09c RC3 - 2009-09-16
>
> RN271: Windows build update: now you can simply put the pbxt directory under <mysql-root>/storage and build the PBXT engine as a part of the source tree. The engine will be linked statically. Be sure to specify the WITH_PBXT_STORAGE_ENGINE option when running win\configure.js
>
> RN270: Correctly disabled PBMS so that this version now compiles under Windows. If PBMS_ENABLED is defined, PBXT will not compile under Windows becaause of a getpid() call in pbms.h.
>
> ------- 1.0.09 RC3 - 2009-09-09
>
> RN269: Implemented online backup. A native online backup driver now performs BACKUP and RESTORE DATABASE operations for PBXT. NOTE: This feature is only supported by MySQL 6.0.9 or later.
>
> RN268: Implemented XA support. PBXT now supports all XA related MySQL statements. The variable pbxt_support_xa determines if XA support is enabled. Note: due to MySQL bug #47134, enabling XA support could lead to a crash.
1
0
[Maria-developers] [Branch ~maria-captains/maria/5.1] Rev 2784: Automatic merge of PBXT 1.0.09f RC3 into MariaDB trunk.
by noreply@launchpad.net 21 Dec '09
by noreply@launchpad.net 21 Dec '09
21 Dec '09
Merge authors:
Kristian Nielsen (knielsen)
Paul McCullagh (paul-mccullagh)
Related merge proposals:
https://code.launchpad.net/~paul-mccullagh/maria/maria-pbxt-rc3/+merge/15934
proposed by: Paul McCullagh (paul-mccullagh)
------------------------------------------------------------
revno: 2784 [merge]
committer: knielsen(a)knielsen-hq.org
branch nick: mariadb-5.1
timestamp: Mon 2009-12-21 10:42:28 +0100
message:
Automatic merge of PBXT 1.0.09f RC3 into MariaDB trunk.
added:
storage/pbxt/src/backup_xt.cc
storage/pbxt/src/backup_xt.h
modified:
sql/handler.cc
storage/pbxt/ChangeLog
storage/pbxt/src/Makefile.am
storage/pbxt/src/cache_xt.cc
storage/pbxt/src/cache_xt.h
storage/pbxt/src/database_xt.cc
storage/pbxt/src/database_xt.h
storage/pbxt/src/datadic_xt.cc
storage/pbxt/src/datalog_xt.cc
storage/pbxt/src/discover_xt.cc
storage/pbxt/src/filesys_xt.cc
storage/pbxt/src/filesys_xt.h
storage/pbxt/src/ha_pbxt.cc
storage/pbxt/src/ha_pbxt.h
storage/pbxt/src/ha_xtsys.h
storage/pbxt/src/heap_xt.cc
storage/pbxt/src/index_xt.cc
storage/pbxt/src/lock_xt.cc
storage/pbxt/src/locklist_xt.h
storage/pbxt/src/memory_xt.cc
storage/pbxt/src/myxt_xt.cc
storage/pbxt/src/myxt_xt.h
storage/pbxt/src/pbms.h
storage/pbxt/src/pbms_enabled.cc
storage/pbxt/src/pbms_enabled.h
storage/pbxt/src/pthread_xt.cc
storage/pbxt/src/restart_xt.cc
storage/pbxt/src/restart_xt.h
storage/pbxt/src/strutil_xt.cc
storage/pbxt/src/systab_xt.cc
storage/pbxt/src/tabcache_xt.cc
storage/pbxt/src/tabcache_xt.h
storage/pbxt/src/table_xt.cc
storage/pbxt/src/table_xt.h
storage/pbxt/src/thread_xt.cc
storage/pbxt/src/thread_xt.h
storage/pbxt/src/util_xt.cc
storage/pbxt/src/util_xt.h
storage/pbxt/src/xaction_xt.cc
storage/pbxt/src/xaction_xt.h
storage/pbxt/src/xactlog_xt.cc
storage/pbxt/src/xactlog_xt.h
storage/pbxt/src/xt_config.h
storage/pbxt/src/xt_defs.h
storage/pbxt/src/xt_errno.h
The size of the diff (5758 lines) is larger than your specified limit of 5000 lines
--
lp:maria
https://code.launchpad.net/~maria-captains/maria/5.1
Your team Maria developers is subscribed to branch lp:maria.
To unsubscribe from this branch go to https://code.launchpad.net/~maria-captains/maria/5.1/+edit-subscription.
1
0
[Maria-developers] bzr commit into MariaDB 5.1, with Maria 1.5:maria branch (knielsen:2784)
by knielsen@knielsen-hq.org 21 Dec '09
by knielsen@knielsen-hq.org 21 Dec '09
21 Dec '09
#At lp:maria
2784 knielsen(a)knielsen-hq.org 2009-12-21 [merge]
Automatic merge of PBXT 1.0.09f RC3 into MariaDB trunk.
added:
storage/pbxt/src/backup_xt.cc
storage/pbxt/src/backup_xt.h
modified:
sql/handler.cc
storage/pbxt/ChangeLog
storage/pbxt/src/Makefile.am
storage/pbxt/src/cache_xt.cc
storage/pbxt/src/cache_xt.h
storage/pbxt/src/database_xt.cc
storage/pbxt/src/database_xt.h
storage/pbxt/src/datadic_xt.cc
storage/pbxt/src/datalog_xt.cc
storage/pbxt/src/discover_xt.cc
storage/pbxt/src/filesys_xt.cc
storage/pbxt/src/filesys_xt.h
storage/pbxt/src/ha_pbxt.cc
storage/pbxt/src/ha_pbxt.h
storage/pbxt/src/ha_xtsys.h
storage/pbxt/src/heap_xt.cc
storage/pbxt/src/index_xt.cc
storage/pbxt/src/lock_xt.cc
storage/pbxt/src/locklist_xt.h
storage/pbxt/src/memory_xt.cc
storage/pbxt/src/myxt_xt.cc
storage/pbxt/src/myxt_xt.h
storage/pbxt/src/pbms.h
storage/pbxt/src/pbms_enabled.cc
storage/pbxt/src/pbms_enabled.h
storage/pbxt/src/pthread_xt.cc
storage/pbxt/src/restart_xt.cc
storage/pbxt/src/restart_xt.h
storage/pbxt/src/strutil_xt.cc
storage/pbxt/src/systab_xt.cc
storage/pbxt/src/tabcache_xt.cc
storage/pbxt/src/tabcache_xt.h
storage/pbxt/src/table_xt.cc
storage/pbxt/src/table_xt.h
storage/pbxt/src/thread_xt.cc
storage/pbxt/src/thread_xt.h
storage/pbxt/src/util_xt.cc
storage/pbxt/src/util_xt.h
storage/pbxt/src/xaction_xt.cc
storage/pbxt/src/xaction_xt.h
storage/pbxt/src/xactlog_xt.cc
storage/pbxt/src/xactlog_xt.h
storage/pbxt/src/xt_config.h
storage/pbxt/src/xt_defs.h
storage/pbxt/src/xt_errno.h
=== modified file 'sql/handler.cc'
--- a/sql/handler.cc 2009-12-03 11:19:05 +0000
+++ b/sql/handler.cc 2009-12-16 08:13:18 +0000
@@ -1584,16 +1584,6 @@ int ha_recover(HASH *commit_list)
if (info.commit_list)
sql_print_information("Starting crash recovery...");
-#ifndef WILL_BE_DELETED_LATER
- /*
- for now, only InnoDB supports 2pc. It means we can always safely
- rollback all pending transactions, without risking inconsistent data
- */
- DBUG_ASSERT(total_ha_2pc == (ulong) opt_bin_log+1); // only InnoDB and binlog
- tc_heuristic_recover= TC_HEURISTIC_RECOVER_ROLLBACK; // forcing ROLLBACK
- info.dry_run=FALSE;
-#endif
-
for (info.len= MAX_XID_LIST_SIZE ;
info.list==0 && info.len > MIN_XID_LIST_SIZE; info.len/=2)
{
=== modified file 'storage/pbxt/ChangeLog'
--- a/storage/pbxt/ChangeLog 2009-09-03 06:15:03 +0000
+++ b/storage/pbxt/ChangeLog 2009-12-01 09:50:46 +0000
@@ -1,6 +1,58 @@
PBXT Release Notes
==================
+------- 1.0.09f RC3 - 2009-11-30
+
+RN291: Fixed bug #489088: On shutdown MySQL reports: [Warning] Plugin 'PBXT' will be forced to shutdown.
+
+RN290: Fixed bug #345524: pbxt does not compile on 64 bit windows. Currently atomic operations are not supported on this platform.
+
+RN286: Fixed a bug introduced in RN281, which could cause an index scan to hang. The original change was to prevent a warning in Valgrind.
+
+RN285: Merged changes required to compile with Drizzle.
+
+RN284: Fixed bug that cause the error "[ERROR] Invalid (old?) table or database name 'mysqld.1'", when running temp_table.test under MariaDB (thanks to Monty for his initial bug fix). Added a fix for partition table names as well.
+
+RN283: Added win_inttypes.h to the distribution. This file is only required for the Windows build.
+
+RN282: Fixed bug #451101: jump or move depends on uninitialised value in myxt_get_key_length
+
+RN281: Fixed bug #451080: Uninitialised memory write in XTDatabaseLog::xlog_append
+
+RN280: Fixed bug #451085: jump or move depends on uninitialised value in my_type_to_string
+
+RN279: Fixed bug #441000: xtstat crashes with segmentation fault on startup if max_pbxt_threads exceeded.
+
+------- 1.0.09e RC3 - 2009-11-20
+
+RN278: Fixed compile error with MySQL 5.1.41.
+
+------- 1.0.09d RC3 - 2009-09-30
+
+RN277: Added r/o flag to pbxt_max_threads server variable (this fix is related to bug #430637)
+
+RN276: Added test case for replication on tables w/o PKs (see bug #430716)
+
+RN275: Fixed bug #430600: 'Failed to read auto-increment value from storage engine' error.
+
+RN274: Fixed bug #431240: This report is public edit xtstat fails if no PBXT table has been created. xtstat now accepts --database=information_schema or --database=pbxt. Depending on this setting PBXT will either use the information_schema.pbxt_statistics or the pbxt.statistics table. If information_schema is used, then the statistics are displayed even when no PBXT table exists. Recovery activity is also displayed, unless pbxt_support_xa=1, in which case MySQL will wait for PBXT recovery to complete before allowing connections.
+
+RN273: Fixed bug #430633: XA_RBDEADLOCK is not returned on XA END after the transacting ended with a deadlock.
+
+RN272: Fixed bug #430596: Backup/restore does not work well even on a basic PBXT table with auto-increment.
+
+------- 1.0.09c RC3 - 2009-09-16
+
+RN271: Windows build update: now you can simply put the pbxt directory under <mysql-root>/storage and build the PBXT engine as a part of the source tree. The engine will be linked statically. Be sure to specify the WITH_PBXT_STORAGE_ENGINE option when running win\configure.js
+
+RN270: Correctly disabled PBMS so that this version now compiles under Windows. If PBMS_ENABLED is defined, PBXT will not compile under Windows becaause of a getpid() call in pbms.h.
+
+------- 1.0.09 RC3 - 2009-09-09
+
+RN269: Implemented online backup. A native online backup driver now performs BACKUP and RESTORE DATABASE operations for PBXT. NOTE: This feature is only supported by MySQL 6.0.9 or later.
+
+RN268: Implemented XA support. PBXT now supports all XA related MySQL statements. The variable pbxt_support_xa determines if XA support is enabled. Note: due to MySQL bug #47134, enabling XA support could lead to a crash.
+
------- 1.0.08d RC2 - 2009-09-02
RN267: Fixed a bug that caused MySQL to crash on shutdown, after an incorrect command line parameter was given. The crash occurred because the background recovery task was not cleaned up before the PBXT engine was de-initialized.
=== modified file 'storage/pbxt/src/Makefile.am'
--- a/storage/pbxt/src/Makefile.am 2009-10-07 07:40:56 +0000
+++ b/storage/pbxt/src/Makefile.am 2009-11-24 10:55:06 +0000
@@ -22,7 +22,7 @@ noinst_HEADERS = bsearch_xt.h cache_xt.
pbms_enabled.h sortedlist_xt.h strutil_xt.h \
tabcache_xt.h table_xt.h trace_xt.h thread_xt.h \
util_xt.h xaction_xt.h xactlog_xt.h lock_xt.h \
- systab_xt.h ha_xtsys.h discover_xt.h \
+ systab_xt.h ha_xtsys.h discover_xt.h backup_xt.h \
pbms.h xt_config.h xt_defs.h xt_errno.h locklist_xt.h
EXTRA_LTLIBRARIES = libpbxt.la
@@ -32,7 +32,7 @@ libpbxt_la_SOURCES = bsearch_xt.cc cache
memory_xt.cc myxt_xt.cc pthread_xt.cc restart_xt.cc \
sortedlist_xt.cc strutil_xt.cc \
tabcache_xt.cc table_xt.cc trace_xt.cc thread_xt.cc \
- systab_xt.cc ha_xtsys.cc discover_xt.cc \
+ systab_xt.cc ha_xtsys.cc discover_xt.cc backup_xt.cc \
util_xt.cc xaction_xt.cc xactlog_xt.cc lock_xt.cc locklist_xt.cc
libpbxt_la_LDFLAGS = -module
=== added file 'storage/pbxt/src/backup_xt.cc'
--- a/storage/pbxt/src/backup_xt.cc 1970-01-01 00:00:00 +0000
+++ b/storage/pbxt/src/backup_xt.cc 2009-11-24 10:55:06 +0000
@@ -0,0 +1,802 @@
+/* Copyright (c) 2009 PrimeBase Technologies GmbH
+ *
+ * PrimeBase XT
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * 2009-09-07 Paul McCullagh
+ *
+ * H&G2JCtL
+ */
+
+#include "xt_config.h"
+
+#ifdef MYSQL_SUPPORTS_BACKUP
+
+#include <string.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <time.h>
+#include <ctype.h>
+
+#include "mysql_priv.h"
+#include <backup/api_types.h>
+#include <backup/backup_engine.h>
+#include <backup/backup_aux.h> // for build_table_list()
+#include <hash.h>
+
+#include "ha_pbxt.h"
+
+#include "backup_xt.h"
+#include "pthread_xt.h"
+#include "filesys_xt.h"
+#include "database_xt.h"
+#include "strutil_xt.h"
+#include "memory_xt.h"
+#include "trace_xt.h"
+#include "myxt_xt.h"
+
+#ifdef OK
+#undef OK
+#endif
+
+#ifdef byte
+#undef byte
+#endif
+
+#ifdef DEBUG
+//#define TRACE_BACKUP_CALLS
+//#define TEST_SMALL_BLOCK 100000
+#endif
+
+using backup::byte;
+using backup::result_t;
+using backup::version_t;
+using backup::Table_list;
+using backup::Table_ref;
+using backup::Buffer;
+
+#ifdef TRACE_BACKUP_CALLS
+#define XT_TRACE_CALL() ha_trace_function(__FUNC__, NULL)
+#else
+#define XT_TRACE_CALL()
+#endif
+
+#define XT_RESTORE_BATCH_SIZE 10000
+
+#define BUP_STATE_BEFORE_LOCK 0
+#define BUP_STATE_AFTER_LOCK 1
+
+#define BUP_STANDARD_VAR_RECORD 1
+#define BUP_RECORD_BLOCK_4_START 2 // Part of a record, with a 4 byte total length, and 4 byte data length
+#define BUP_RECORD_BLOCK_4 3 // Part of a record, with a 4 byte length
+#define BUP_RECORD_BLOCK_4_END 4 // Last part of a record with a 4 byte length
+
+/*
+ * -----------------------------------------------------------------------
+ * UTILITIES
+ */
+
+#ifdef TRACE_BACKUP_CALLS
+static void ha_trace_function(const char *function, char *table)
+{
+ char func_buf[50], *ptr;
+ XTThreadPtr thread = xt_get_self();
+
+ if ((ptr = strchr(function, '('))) {
+ ptr--;
+ while (ptr > function) {
+ if (!(isalnum(*ptr) || *ptr == '_'))
+ break;
+ ptr--;
+ }
+ ptr++;
+ xt_strcpy(50, func_buf, ptr);
+ if ((ptr = strchr(func_buf, '(')))
+ *ptr = 0;
+ }
+ else
+ xt_strcpy(50, func_buf, function);
+ if (table)
+ printf("%s %s (%s)\n", thread ? thread->t_name : "-unknown-", func_buf, table);
+ else
+ printf("%s %s\n", thread ? thread->t_name : "-unknown-", func_buf);
+}
+#endif
+
+/*
+ * -----------------------------------------------------------------------
+ * BACKUP DRIVER
+ */
+
+class PBXTBackupDriver: public Backup_driver
+{
+ public:
+ PBXTBackupDriver(const Table_list &);
+ virtual ~PBXTBackupDriver();
+
+ virtual size_t size();
+ virtual size_t init_size();
+ virtual result_t begin(const size_t);
+ virtual result_t end();
+ virtual result_t get_data(Buffer &);
+ virtual result_t prelock();
+ virtual result_t lock();
+ virtual result_t unlock();
+ virtual result_t cancel();
+ virtual void free();
+ void lock_tables_TL_READ_NO_INSERT();
+
+ private:
+ XTThreadPtr bd_thread;
+ int bd_state;
+ u_int bd_table_no;
+ XTOpenTablePtr bd_ot;
+ xtWord1 *bd_row_buf;
+
+ /* Non-zero if we last returned only part of
+ * a row.
+ */
+ xtWord1 *db_write_block(xtWord1 *buffer, xtWord1 bup_type, size_t *size, xtWord4 row_len);
+ xtWord1 *db_write_block(xtWord1 *buffer, xtWord1 bup_type, size_t *size, xtWord4 total_len, xtWord4 row_len);
+
+ xtWord4 bd_row_offset;
+ xtWord4 bd_row_size;
+};
+
+
+PBXTBackupDriver::PBXTBackupDriver(const Table_list &tables):
+Backup_driver(tables),
+bd_state(BUP_STATE_BEFORE_LOCK),
+bd_table_no(0),
+bd_ot(NULL),
+bd_row_buf(NULL),
+bd_row_offset(0),
+bd_row_size(0)
+{
+}
+
+PBXTBackupDriver::~PBXTBackupDriver()
+{
+}
+
+/** Estimates total size of backup. @todo improve it */
+size_t PBXTBackupDriver::size()
+{
+ XT_TRACE_CALL();
+ return UNKNOWN_SIZE;
+}
+
+/** Estimates size of backup before lock. @todo improve it */
+size_t PBXTBackupDriver::init_size()
+{
+ XT_TRACE_CALL();
+ return 0;
+}
+
+result_t PBXTBackupDriver::begin(const size_t)
+{
+ THD *thd = current_thd;
+ XTExceptionRec e;
+
+ XT_TRACE_CALL();
+
+ if (!(bd_thread = xt_ha_set_current_thread(thd, &e))) {
+ xt_log_exception(NULL, &e, XT_LOG_DEFAULT);
+ return backup::ERROR;
+ }
+
+ return backup::OK;
+}
+
+result_t PBXTBackupDriver::end()
+{
+ XT_TRACE_CALL();
+ if (bd_ot) {
+ xt_tab_seq_exit(bd_ot);
+ xt_db_return_table_to_pool_ns(bd_ot);
+ bd_ot = NULL;
+ }
+ if (bd_thread->st_xact_data) {
+ if (!xt_xn_commit(bd_thread))
+ return backup::ERROR;
+ }
+ return backup::OK;
+}
+
+xtWord1 *PBXTBackupDriver::db_write_block(xtWord1 *buffer, xtWord1 bup_type, size_t *ret_size, xtWord4 row_len)
+{
+ register size_t size = *ret_size;
+
+ *buffer = bup_type; // Record type identifier.
+ buffer++;
+ size--;
+ memcpy(buffer, bd_ot->ot_row_wbuffer, row_len);
+ buffer += row_len;
+ size -= row_len;
+ *ret_size = size;
+ return buffer;
+}
+
+xtWord1 *PBXTBackupDriver::db_write_block(xtWord1 *buffer, xtWord1 bup_type, size_t *ret_size, xtWord4 total_len, xtWord4 row_len)
+{
+ register size_t size = *ret_size;
+
+ *buffer = bup_type; // Record type identifier.
+ buffer++;
+ size--;
+ if (bup_type == BUP_RECORD_BLOCK_4_START) {
+ XT_SET_DISK_4(buffer, total_len);
+ buffer += 4;
+ size -= 4;
+ }
+ XT_SET_DISK_4(buffer, row_len);
+ buffer += 4;
+ size -= 4;
+ memcpy(buffer, bd_ot->ot_row_wbuffer+bd_row_offset, row_len);
+ buffer += row_len;
+ size -= row_len;
+ bd_row_size -= row_len;
+ bd_row_offset += row_len;
+ *ret_size = size;
+ return buffer;
+}
+
+result_t PBXTBackupDriver::get_data(Buffer &buf)
+{
+ xtBool eof = FALSE;
+ size_t size;
+ xtWord4 row_len;
+ xtWord1 *buffer;
+
+ XT_TRACE_CALL();
+
+ if (bd_state == BUP_STATE_BEFORE_LOCK) {
+ buf.table_num = 0;
+ buf.size = 0;
+ buf.last = FALSE;
+ return backup::READY;
+ }
+
+ /* Open the backup table: */
+ if (!bd_ot) {
+ XTThreadPtr self = bd_thread;
+ XTTableHPtr tab;
+ char path[PATH_MAX];
+
+ if (bd_table_no == m_tables.count()) {
+ buf.size = 0;
+ buf.table_num = 0;
+ buf.last = TRUE;
+ return backup::DONE;
+ }
+
+ m_tables[bd_table_no].internal_name(path, sizeof(path));
+ bd_table_no++;
+ try_(a) {
+ xt_ha_open_database_of_table(self, (XTPathStrPtr) path);
+ tab = xt_use_table(self, (XTPathStrPtr) path, FALSE, FALSE, NULL);
+ pushr_(xt_heap_release, tab);
+ if (!(bd_ot = xt_db_open_table_using_tab(tab, bd_thread)))
+ xt_throw(self);
+ freer_(); // xt_heap_release(tab)
+
+ /* Prepare the seqential scan: */
+ xt_tab_seq_exit(bd_ot);
+ if (!xt_tab_seq_init(bd_ot))
+ xt_throw(self);
+
+ if (bd_row_buf) {
+ xt_free(self, bd_row_buf);
+ bd_row_buf = NULL;
+ }
+ bd_row_buf = (xtWord1 *) xt_malloc(self, bd_ot->ot_table->tab_dic.dic_mysql_buf_size);
+ bd_ot->ot_cols_req = bd_ot->ot_table->tab_dic.dic_no_of_cols;
+ }
+ catch_(a) {
+ ;
+ }
+ cont_(a);
+
+ if (!bd_ot)
+ goto failed;
+ }
+
+ buf.table_num = bd_table_no;
+#ifdef TEST_SMALL_BLOCK
+ buf.size = TEST_SMALL_BLOCK;
+#endif
+ size = buf.size;
+ buffer = (xtWord1 *) buf.data;
+ ASSERT_NS(size > 9);
+
+ /* First check of a record was partically written
+ * last time.
+ */
+ write_row:
+ if (bd_row_size > 0) {
+ row_len = bd_row_size;
+ if (bd_row_offset == 0) {
+ if (row_len+1 > size) {
+ ASSERT_NS(size > 9);
+ row_len = size - 9;
+ buffer = db_write_block(buffer, BUP_RECORD_BLOCK_4_START, &size, bd_row_size, row_len);
+ goto done;
+ }
+ buffer = db_write_block(buffer, BUP_STANDARD_VAR_RECORD, &size, row_len);
+ bd_row_size = 0;
+ }
+ else {
+ if (row_len+5 > size) {
+ row_len = size - 5;
+ buffer = db_write_block(buffer, BUP_RECORD_BLOCK_4, &size, 0, row_len);
+ goto done;
+ }
+ buffer = db_write_block(buffer, BUP_RECORD_BLOCK_4_END, &size, 0, row_len);
+ }
+ }
+
+ /* Now continue with the sequential scan. */
+ while (size > 1) {
+ if (!xt_tab_seq_next(bd_ot, bd_row_buf, &eof))
+ goto failed;
+ if (eof) {
+ /* We will go the next table, on the next call. */
+ xt_tab_seq_exit(bd_ot);
+ xt_db_return_table_to_pool_ns(bd_ot);
+ bd_ot = NULL;
+ break;
+ }
+ if (!(row_len = myxt_store_row_data(bd_ot, 0, (char *) bd_row_buf)))
+ goto failed;
+ if (row_len+1 > size) {
+ /* Does not fit: */
+ bd_row_offset = 0;
+ bd_row_size = row_len;
+ /* Only add part of the row, if there is still
+ * quite a bit of space left:
+ */
+ if (size >= (32 * 1024))
+ goto write_row;
+ break;
+ }
+ buffer = db_write_block(buffer, BUP_STANDARD_VAR_RECORD, &size, row_len);
+ }
+
+ done:
+ buf.size = buf.size - size;
+ /* This indicates wnd of data for a table! */
+ buf.last = eof;
+
+ return backup::OK;
+
+ failed:
+ xt_log_and_clear_exception(bd_thread);
+ return backup::ERROR;
+}
+
+result_t PBXTBackupDriver::prelock()
+{
+ XT_TRACE_CALL();
+ return backup::READY;
+}
+
+result_t PBXTBackupDriver::lock()
+{
+ XT_TRACE_CALL();
+ bd_thread->st_xact_mode = XT_XACT_COMMITTED_READ;
+ bd_thread->st_ignore_fkeys = FALSE;
+ bd_thread->st_auto_commit = FALSE;
+ bd_thread->st_table_trans = FALSE;
+ bd_thread->st_abort_trans = FALSE;
+ bd_thread->st_stat_ended = FALSE;
+ bd_thread->st_stat_trans = FALSE;
+ bd_thread->st_is_update = FALSE;
+ if (!xt_xn_begin(bd_thread))
+ return backup::ERROR;
+ bd_state = BUP_STATE_AFTER_LOCK;
+ return backup::OK;
+}
+
+result_t PBXTBackupDriver::unlock()
+{
+ XT_TRACE_CALL();
+ return backup::OK;
+}
+
+result_t PBXTBackupDriver::cancel()
+{
+ XT_TRACE_CALL();
+ return backup::OK; // free() will be called and suffice
+}
+
+void PBXTBackupDriver::free()
+{
+ XT_TRACE_CALL();
+ if (bd_ot) {
+ xt_tab_seq_exit(bd_ot);
+ xt_db_return_table_to_pool_ns(bd_ot);
+ bd_ot = NULL;
+ }
+ if (bd_row_buf) {
+ xt_free_ns(bd_row_buf);
+ bd_row_buf = NULL;
+ }
+ if (bd_thread->st_xact_data)
+ xt_xn_rollback(bd_thread);
+ delete this;
+}
+
+void PBXTBackupDriver::lock_tables_TL_READ_NO_INSERT()
+{
+ XT_TRACE_CALL();
+}
+
+/*
+ * -----------------------------------------------------------------------
+ * BACKUP DRIVER
+ */
+
+class PBXTRestoreDriver: public Restore_driver
+{
+ public:
+ PBXTRestoreDriver(const Table_list &tables);
+ virtual ~PBXTRestoreDriver();
+
+ virtual result_t begin(const size_t);
+ virtual result_t end();
+ virtual result_t send_data(Buffer &buf);
+ virtual result_t cancel();
+ virtual void free();
+
+ private:
+ XTThreadPtr rd_thread;
+ u_int rd_table_no;
+ XTOpenTablePtr rd_ot;
+ STRUCT_TABLE *rd_my_table;
+ xtWord1 *rb_row_buf;
+ u_int rb_col_cnt;
+ u_int rb_insert_count;
+
+ /* Long rows are accumulated here: */
+ xtWord4 rb_row_len;
+ xtWord4 rb_data_size;
+ xtWord1 *rb_row_data;
+};
+
+PBXTRestoreDriver::PBXTRestoreDriver(const Table_list &tables):
+Restore_driver(tables),
+rd_thread(NULL),
+rd_table_no(0),
+rd_ot(NULL),
+rb_row_buf(NULL),
+rb_row_len(0),
+rb_data_size(0),
+rb_row_data(NULL)
+{
+}
+
+PBXTRestoreDriver::~PBXTRestoreDriver()
+{
+}
+
+result_t PBXTRestoreDriver::begin(const size_t)
+{
+ THD *thd = current_thd;
+ XTExceptionRec e;
+
+ XT_TRACE_CALL();
+
+ if (!(rd_thread = xt_ha_set_current_thread(thd, &e))) {
+ xt_log_exception(NULL, &e, XT_LOG_DEFAULT);
+ return backup::ERROR;
+ }
+
+ return backup::OK;
+}
+
+result_t PBXTRestoreDriver::end()
+{
+ XT_TRACE_CALL();
+ if (rd_ot) {
+ xt_db_return_table_to_pool_ns(rd_ot);
+ rd_ot = NULL;
+ }
+ //if (rb_row_buf) {
+ // xt_free_ns(rb_row_buf);
+ // rb_row_buf = NULL;
+ //}
+ if (rb_row_data) {
+ xt_free_ns(rb_row_data);
+ rb_row_data = NULL;
+ }
+ if (rd_thread->st_xact_data) {
+ if (!xt_xn_commit(rd_thread))
+ return backup::ERROR;
+ }
+ return backup::OK;
+}
+
+
+result_t PBXTRestoreDriver::send_data(Buffer &buf)
+{
+ size_t size;
+ xtWord1 type;
+ xtWord1 *buffer;
+ xtWord4 row_len;
+ xtWord1 *rec_data;
+
+ XT_TRACE_CALL();
+
+ if (buf.table_num != rd_table_no) {
+ XTThreadPtr self = rd_thread;
+ XTTableHPtr tab;
+ char path[PATH_MAX];
+
+ if (rd_ot) {
+ xt_db_return_table_to_pool_ns(rd_ot);
+ rd_ot = NULL;
+ }
+
+ if (rd_thread->st_xact_data) {
+ if (!xt_xn_commit(rd_thread))
+ goto failed;
+ }
+ if (!xt_xn_begin(rd_thread))
+ goto failed;
+ rb_insert_count = 0;
+
+ rd_table_no = buf.table_num;
+ m_tables[rd_table_no-1].internal_name(path, sizeof(path));
+ try_(a) {
+ xt_ha_open_database_of_table(self, (XTPathStrPtr) path);
+ tab = xt_use_table(self, (XTPathStrPtr) path, FALSE, FALSE, NULL);
+ pushr_(xt_heap_release, tab);
+ if (!(rd_ot = xt_db_open_table_using_tab(tab, rd_thread)))
+ xt_throw(self);
+ freer_(); // xt_heap_release(tab)
+
+ rd_my_table = rd_ot->ot_table->tab_dic.dic_my_table;
+ if (rd_my_table->found_next_number_field) {
+ rd_my_table->in_use = current_thd;
+ rd_my_table->next_number_field = rd_my_table->found_next_number_field;
+ rd_my_table->mark_columns_used_by_index_no_reset(rd_my_table->s->next_number_index, rd_my_table->read_set);
+ }
+
+ /* This is safe because only one thread can restore a table at
+ * a time!
+ */
+ rb_row_buf = (xtWord1 *) rd_my_table->record[0];
+ //if (rb_row_buf) {
+ // xt_free(self, rb_row_buf);
+ // rb_row_buf = NULL;
+ //}
+ //rb_row_buf = (xtWord1 *) xt_malloc(self, rd_ot->ot_table->tab_dic.dic_mysql_buf_size);
+
+ rb_col_cnt = rd_ot->ot_table->tab_dic.dic_no_of_cols;
+
+ }
+ catch_(a) {
+ ;
+ }
+ cont_(a);
+
+ if (!rd_ot)
+ goto failed;
+ }
+
+ buffer = (xtWord1 *) buf.data;
+ size = buf.size;
+
+ while (size > 0) {
+ type = *buffer;
+ switch (type) {
+ case BUP_STANDARD_VAR_RECORD:
+ rec_data = buffer + 1;
+ break;
+ case BUP_RECORD_BLOCK_4_START:
+ buffer++;
+ row_len = XT_GET_DISK_4(buffer);
+ buffer += 4;
+ if (rb_data_size < row_len) {
+ if (!xt_realloc_ns((void **) &rb_row_data, row_len))
+ goto failed;
+ rb_data_size = row_len;
+ }
+ row_len = XT_GET_DISK_4(buffer);
+ buffer += 4;
+ ASSERT_NS(row_len <= rb_data_size);
+ if (row_len > rb_data_size) {
+ xt_register_xterr(XT_REG_CONTEXT, XT_ERR_BAD_BACKUP_FORMAT);
+ goto failed;
+ }
+ memcpy(rb_row_data, buffer, row_len);
+ rb_row_len = row_len;
+ buffer += row_len;
+ if (row_len + 9 > size) {
+ xt_register_xterr(XT_REG_CONTEXT, XT_ERR_BAD_BACKUP_FORMAT);
+ goto failed;
+ }
+ size -= row_len + 9;
+ continue;
+ case BUP_RECORD_BLOCK_4:
+ buffer++;
+ row_len = XT_GET_DISK_4(buffer);
+ buffer += 4;
+ ASSERT_NS(rb_row_len + row_len <= rb_data_size);
+ if (rb_row_len + row_len > rb_data_size) {
+ xt_register_xterr(XT_REG_CONTEXT, XT_ERR_BAD_BACKUP_FORMAT);
+ goto failed;
+ }
+ memcpy(rb_row_data + rb_row_len, buffer, row_len);
+ rb_row_len += row_len;
+ buffer += row_len;
+ if (row_len + 5 > size) {
+ xt_register_xterr(XT_REG_CONTEXT, XT_ERR_BAD_BACKUP_FORMAT);
+ goto failed;
+ }
+ size -= row_len + 5;
+ continue;
+ case BUP_RECORD_BLOCK_4_END:
+ buffer++;
+ row_len = XT_GET_DISK_4(buffer);
+ buffer += 4;
+ ASSERT_NS(rb_row_len + row_len <= rb_data_size);
+ if (rb_row_len + row_len > rb_data_size) {
+ xt_register_xterr(XT_REG_CONTEXT, XT_ERR_BAD_BACKUP_FORMAT);
+ goto failed;
+ }
+ memcpy(rb_row_data + rb_row_len, buffer, row_len);
+ buffer += row_len;
+ if (row_len + 5 > size) {
+ xt_register_xterr(XT_REG_CONTEXT, XT_ERR_BAD_BACKUP_FORMAT);
+ goto failed;
+ }
+ size -= row_len + 5;
+ rec_data = rb_row_data;
+ break;
+ default:
+ xt_register_xterr(XT_REG_CONTEXT, XT_ERR_BAD_BACKUP_FORMAT);
+ goto failed;
+ }
+
+ if (!(row_len = myxt_load_row_data(rd_ot, rec_data, rb_row_buf, rb_col_cnt)))
+ goto failed;
+
+ if (rd_ot->ot_table->tab_dic.dic_my_table->found_next_number_field)
+ ha_set_auto_increment(rd_ot, rd_ot->ot_table->tab_dic.dic_my_table->found_next_number_field);
+
+ if (!xt_tab_new_record(rd_ot, rb_row_buf))
+ goto failed;
+
+ if (type == BUP_STANDARD_VAR_RECORD) {
+ buffer += row_len+1;
+ if (row_len + 1 > size) {
+ xt_register_xterr(XT_REG_CONTEXT, XT_ERR_BAD_BACKUP_FORMAT);
+ goto failed;
+ }
+ size -= row_len + 1;
+ }
+
+ rb_insert_count++;
+ if (rb_insert_count == XT_RESTORE_BATCH_SIZE) {
+ if (!xt_xn_commit(rd_thread))
+ goto failed;
+ if (!xt_xn_begin(rd_thread))
+ goto failed;
+ rb_insert_count = 0;
+ }
+ }
+
+ return backup::OK;
+
+ failed:
+ xt_log_and_clear_exception(rd_thread);
+ return backup::ERROR;
+}
+
+
+result_t PBXTRestoreDriver::cancel()
+{
+ XT_TRACE_CALL();
+ /* Nothing to do in cancel(); free() will suffice */
+ return backup::OK;
+}
+
+void PBXTRestoreDriver::free()
+{
+ XT_TRACE_CALL();
+ if (rd_ot) {
+ xt_db_return_table_to_pool_ns(rd_ot);
+ rd_ot = NULL;
+ }
+ //if (rb_row_buf) {
+ // xt_free_ns(rb_row_buf);
+ // rb_row_buf = NULL;
+ //}
+ if (rb_row_data) {
+ xt_free_ns(rb_row_data);
+ rb_row_data = NULL;
+ }
+ if (rd_thread->st_xact_data)
+ xt_xn_rollback(rd_thread);
+ delete this;
+}
+
+/*
+ * -----------------------------------------------------------------------
+ * BACKUP ENGINE FACTORY
+ */
+
+#define PBXT_BACKUP_VERSION 1
+
+
+class PBXTBackupEngine: public Backup_engine
+{
+ public:
+ PBXTBackupEngine() { };
+
+ virtual version_t version() const {
+ return PBXT_BACKUP_VERSION;
+ };
+
+ virtual result_t get_backup(const uint32, const Table_list &, Backup_driver* &);
+
+ virtual result_t get_restore(const version_t, const uint32, const Table_list &,Restore_driver* &);
+
+ virtual void free()
+ {
+ delete this;
+ }
+};
+
+result_t PBXTBackupEngine::get_backup(const u_int count, const Table_list &tables, Backup_driver* &drv)
+{
+ PBXTBackupDriver *ptr = new PBXTBackupDriver(tables);
+
+ if (!ptr)
+ return backup::ERROR;
+ drv = ptr;
+ return backup::OK;
+}
+
+result_t PBXTBackupEngine::get_restore(const version_t ver, const uint32,
+ const Table_list &tables, Restore_driver* &drv)
+{
+ if (ver > PBXT_BACKUP_VERSION)
+ {
+ return backup::ERROR;
+ }
+
+ PBXTRestoreDriver *ptr = new PBXTRestoreDriver(tables);
+
+ if (!ptr)
+ return backup::ERROR;
+ drv = (Restore_driver *) ptr;
+ return backup::OK;
+}
+
+
+Backup_result_t pbxt_backup_engine(handlerton *self, Backup_engine* &be)
+{
+ be = new PBXTBackupEngine();
+
+ if (!be)
+ return backup::ERROR;
+
+ return backup::OK;
+}
+
+#endif
=== added file 'storage/pbxt/src/backup_xt.h'
--- a/storage/pbxt/src/backup_xt.h 1970-01-01 00:00:00 +0000
+++ b/storage/pbxt/src/backup_xt.h 2009-11-24 10:55:06 +0000
@@ -0,0 +1,34 @@
+/* Copyright (c) 2009 PrimeBase Technologies GmbH
+ *
+ * PrimeBase XT
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * 2009-09-07 Paul McCullagh
+ *
+ * H&G2JCtL
+ */
+
+#ifndef __backup_xt_h__
+#define __backup_xt_h__
+
+#include "xt_defs.h"
+
+#ifdef MYSQL_SUPPORTS_BACKUP
+
+Backup_result_t pbxt_backup_engine(handlerton *self, Backup_engine* &be);
+
+#endif
+#endif
=== modified file 'storage/pbxt/src/cache_xt.cc'
--- a/storage/pbxt/src/cache_xt.cc 2009-10-30 18:50:56 +0000
+++ b/storage/pbxt/src/cache_xt.cc 2009-11-24 10:55:06 +0000
@@ -73,7 +73,7 @@
#define IDX_CAC_UNLOCK(i, o) xt_xsmutex_unlock(&(i)->cs_lock, (o)->t_id)
#elif defined(IDX_CAC_USE_PTHREAD_RW)
#define IDX_CAC_LOCK_TYPE xt_rwlock_type
-#define IDX_CAC_INIT_LOCK(s, i) xt_init_rwlock(s, &(i)->cs_lock)
+#define IDX_CAC_INIT_LOCK(s, i) xt_init_rwlock_with_autoname(s, &(i)->cs_lock)
#define IDX_CAC_FREE_LOCK(s, i) xt_free_rwlock(&(i)->cs_lock)
#define IDX_CAC_READ_LOCK(i, o) xt_slock_rwlock_ns(&(i)->cs_lock)
#define IDX_CAC_WRITE_LOCK(i, o) xt_xlock_rwlock_ns(&(i)->cs_lock)
@@ -94,8 +94,12 @@
#define IDX_CAC_UNLOCK(i, s) xt_spinxslock_unlock(&(i)->cs_lock, (s)->t_id)
#endif
+#ifdef XT_NO_ATOMICS
+#define ID_HANDLE_USE_PTHREAD_RW
+#else
#define ID_HANDLE_USE_SPINLOCK
//#define ID_HANDLE_USE_PTHREAD_RW
+#endif
#if defined(ID_HANDLE_USE_PTHREAD_RW)
#define ID_HANDLE_LOCK_TYPE xt_mutex_type
@@ -374,7 +378,7 @@ xtPublic void xt_ind_release_handle(XTIn
{
DcHandleSlotPtr hs;
XTIndBlockPtr block = NULL;
- u_int hash_idx = 0;
+ u_int hash_idx = 0;
DcSegmentPtr seg = NULL;
XTIndBlockPtr xblock;
@@ -1379,7 +1383,7 @@ xtPublic xtBool xt_ind_fetch(XTOpenTable
ASSERT_NS(iref->ir_xlock == 2);
#endif
if (!(block = ind_cac_fetch(ot, ind, address, &seg, TRUE)))
- return 0;
+ return FAILED;
branch_size = XT_GET_DISK_2(((XTIdxBranchDPtr) block->cb_data)->tb_size_2);
if (XT_GET_INDEX_BLOCK_LEN(branch_size) < 2 || XT_GET_INDEX_BLOCK_LEN(branch_size) > XT_INDEX_PAGE_SIZE) {
=== modified file 'storage/pbxt/src/cache_xt.h'
--- a/storage/pbxt/src/cache_xt.h 2009-08-17 11:12:36 +0000
+++ b/storage/pbxt/src/cache_xt.h 2009-11-24 10:55:06 +0000
@@ -62,7 +62,7 @@ struct XTIdxReadBuffer;
#define XT_IPAGE_UNLOCK(i, x) xt_atomicrwlock_unlock(i, x)
#elif defined(XT_IPAGE_USE_PTHREAD_RW)
#define XT_IPAGE_LOCK_TYPE xt_rwlock_type
-#define XT_IPAGE_INIT_LOCK(s, i) xt_init_rwlock(s, i)
+#define XT_IPAGE_INIT_LOCK(s, i) xt_init_rwlock_with_autoname(s, i)
#define XT_IPAGE_FREE_LOCK(s, i) xt_free_rwlock(i)
#define XT_IPAGE_READ_LOCK(i) xt_slock_rwlock_ns(i)
#define XT_IPAGE_WRITE_LOCK(i, s) xt_xlock_rwlock_ns(i)
=== modified file 'storage/pbxt/src/database_xt.cc'
--- a/storage/pbxt/src/database_xt.cc 2009-08-17 11:12:36 +0000
+++ b/storage/pbxt/src/database_xt.cc 2009-11-24 10:55:06 +0000
@@ -54,6 +54,8 @@
* GLOBALS
*/
+xtPublic XTDatabaseHPtr pbxt_database = NULL; // The global open database
+
xtPublic xtLogOffset xt_db_log_file_threshold;
xtPublic size_t xt_db_log_buffer_size;
xtPublic size_t xt_db_transaction_buffer_size;
@@ -505,6 +507,15 @@ xtPublic XTDatabaseHPtr xt_get_database(
* all index entries that are not visible have
* been removed.
*
+ * REASON WHY WE SET ROWID ON RECOVERY:
+ * The row ID is set on recovery because the
+ * change to the index may be lost after a crash.
+ * The change to the index is done by the sweeper, and
+ * there is no record of this change in the log.
+ * The sweeper will not "re-sweep" all transations
+ * that are recovered. As a result, this upadte
+ * of the index by the sweeper may be lost.
+ *
* {OPEN-DB-SWEEPER-WAIT}
* This has been moved to after the release of the open
* database lock because:
@@ -518,9 +529,12 @@ xtPublic XTDatabaseHPtr xt_get_database(
* - To open the database it needs the open database
* lock.
*/
+ /*
+ * This has been moved, see: {WAIT-FOR-SW-AFTER-RECOV}
pushr_(xt_heap_release, db);
xt_wait_for_sweeper(self, db, 0);
popr_();
+ */
return db;
}
=== modified file 'storage/pbxt/src/database_xt.h'
--- a/storage/pbxt/src/database_xt.h 2009-04-02 10:03:14 +0000
+++ b/storage/pbxt/src/database_xt.h 2009-11-24 10:55:06 +0000
@@ -105,6 +105,8 @@ typedef struct XTTablePath {
#define XT_THREAD_IDLE 1
#define XT_THREAD_INERR 2
+#define XT_XA_HASH_TAB_SIZE 223
+
typedef struct XTDatabase : public XTHeap {
char *db_name; /* The name of the database, last component of the path! */
char *db_main_path;
@@ -131,6 +133,9 @@ typedef struct XTDatabase : public XTHea
u_int db_stat_sweep_waits; /* STATISTICS: count the sweeper waits. */
XTDatabaseLogRec db_xlog; /* The transaction log for this database. */
XTXactRestartRec db_restart; /* Database recovery stuff. */
+ xt_mutex_type db_xn_xa_lock;
+ XTXactPreparePtr db_xn_xa_table[XT_XA_HASH_TAB_SIZE];
+ XTSortedListPtr db_xn_xa_list; /* The "wait-for" list, of transactions waiting for other transactions. */
XTSortedListPtr db_xn_wait_for; /* The "wait-for" list, of transactions waiting for other transactions. */
u_int db_xn_call_start; /* Start of the post wait calls. */
@@ -198,6 +203,7 @@ void xt_check_database(XTThreadPtr se
void xt_add_pbxt_file(size_t size, char *path, const char *file);
void xt_add_location_file(size_t size, char *path);
+void xt_add_pbxt_dir(size_t size, char *path);
void xt_add_system_dir(size_t size, char *path);
void xt_add_data_dir(size_t size, char *path);
@@ -244,4 +250,6 @@ inline void xt_xlog_check_long_writer(XT
}
}
+extern XTDatabaseHPtr pbxt_database; // The global open database
+
#endif
=== modified file 'storage/pbxt/src/datadic_xt.cc'
--- a/storage/pbxt/src/datadic_xt.cc 2009-08-18 07:46:53 +0000
+++ b/storage/pbxt/src/datadic_xt.cc 2009-11-24 10:55:06 +0000
@@ -35,7 +35,7 @@
#ifdef DEBUG
#ifdef DRIZZLED
-#include <drizzled/common_includes.h>
+//#include <drizzled/common_includes.h>
#else
#include "mysql_priv.h"
#endif
@@ -437,11 +437,6 @@ class XTTokenizer {
XTToken *nextToken(XTThreadPtr self, c_char *keyword, XTToken *tk);
};
-void ri_free_token(XTThreadPtr XT_UNUSED(self), XTToken *tk)
-{
- delete tk;
-}
-
XTToken *XTTokenizer::newToken(XTThreadPtr self, u_int type, char *start, char *end)
{
if (!tkn_current) {
=== modified file 'storage/pbxt/src/datalog_xt.cc'
--- a/storage/pbxt/src/datalog_xt.cc 2009-09-04 07:29:34 +0000
+++ b/storage/pbxt/src/datalog_xt.cc 2009-11-24 10:55:06 +0000
@@ -410,7 +410,7 @@ static void dl_recover_log(XTThreadPtr s
ASSERT_NS(seq_read.sl_log_eof == seq_read.sl_rec_log_offset);
data_log->dlf_log_eof = seq_read.sl_rec_log_offset;
- if ((size_t) data_log->dlf_log_eof < sizeof(XTXactLogHeaderDRec)) {
+ if (data_log->dlf_log_eof < (off_t) sizeof(XTXactLogHeaderDRec)) {
data_log->dlf_log_eof = sizeof(XTXactLogHeaderDRec);
if (!dl_create_log_header(data_log, seq_read.sl_log_file, self))
xt_throw(self);
@@ -1162,7 +1162,7 @@ xtBool XTDataLogBuffer::dlb_close_log(XT
/* When I use 'thread' instead of 'self', this means
* that I will not throw an error.
*/
-xtBool XTDataLogBuffer::dlb_get_log_offset(xtLogID *log_id, xtLogOffset *out_offset, size_t req_size, struct XTThread *thread)
+xtBool XTDataLogBuffer::dlb_get_log_offset(xtLogID *log_id, xtLogOffset *out_offset, size_t XT_UNUSED(req_size), struct XTThread *thread)
{
/* Note, I am allowing a log to grow beyond the threshold.
* The amount depends on the maximum extended record size.
@@ -1757,7 +1757,7 @@ static xtBool dl_collect_garbage(XTThrea
freer_(); // xt_unlock_mutex(&db->db_co_dlog_lock)
/* Flush the transaction log. */
- if (!xt_xlog_flush_log(self))
+ if (!xt_xlog_flush_log(db, self))
xt_throw(self);
xt_lock_mutex_ns(&db->db_datalogs.dlc_head_lock);
@@ -1891,7 +1891,7 @@ static xtBool dl_collect_garbage(XTThrea
freer_(); // xt_unlock_mutex(&db->db_co_dlog_lock)
/* Flush the transaction log. */
- if (!xt_xlog_flush_log(self))
+ if (!xt_xlog_flush_log(db, self))
xt_throw(self);
/* Save state in source log header. */
=== modified file 'storage/pbxt/src/discover_xt.cc'
--- a/storage/pbxt/src/discover_xt.cc 2009-12-03 11:19:05 +0000
+++ b/storage/pbxt/src/discover_xt.cc 2009-12-16 08:13:18 +0000
@@ -31,6 +31,9 @@
#include <drizzled/session.h>
#include <drizzled/server_includes.h>
#include <drizzled/sql_base.h>
+#include <drizzled/statement/alter_table.h>
+#include <algorithm>
+#include <sstream>
#endif
#include "strutil_xt.h"
@@ -39,18 +42,273 @@
#include "ha_xtsys.h"
#ifndef DRIZZLED
-#if MYSQL_VERSION_ID > 60005
+#if MYSQL_VERSION_ID >= 50404
#define DOT_STR(x) x.str
#else
#define DOT_STR(x) x
#endif
#endif
-#ifndef DRIZZLED
+//#ifndef DRIZZLED
#define LOCK_OPEN_HACK_REQUIRED
-#endif // DRIZZLED
+//#endif // DRIZZLED
#ifdef LOCK_OPEN_HACK_REQUIRED
+#ifdef DRIZZLED
+
+using namespace drizzled;
+using namespace std;
+
+#define mysql_create_table_no_lock hacked_mysql_create_table_no_lock
+
+namespace drizzled {
+
+int rea_create_table(Session *session, const char *path,
+ const char *db, const char *table_name,
+ message::Table *table_proto,
+ HA_CREATE_INFO *create_info,
+ List<CreateField> &create_field,
+ uint32_t key_count,KEY *key_info);
+}
+
+static uint32_t build_tmptable_filename(Session* session,
+ char *buff, size_t bufflen)
+{
+ uint32_t length;
+ ostringstream path_str, post_tmpdir_str;
+ string tmp;
+
+ path_str << drizzle_tmpdir;
+ post_tmpdir_str << "/" << TMP_FILE_PREFIX << current_pid;
+ post_tmpdir_str << session->thread_id << session->tmp_table++;
+ tmp= post_tmpdir_str.str();
+
+ transform(tmp.begin(), tmp.end(), tmp.begin(), ::tolower);
+
+ path_str << tmp;
+
+ if (bufflen < path_str.str().length())
+ length= 0;
+ else
+ length= unpack_filename(buff, path_str.str().c_str());
+
+ return length;
+}
+
+static bool mysql_create_table_no_lock(Session *session,
+ const char *db, const char *table_name,
+ HA_CREATE_INFO *create_info,
+ message::Table *table_proto,
+ AlterInfo *alter_info,
+ bool internal_tmp_table,
+ uint32_t select_field_count)
+{
+ char path[FN_REFLEN];
+ uint32_t path_length;
+ uint db_options, key_count;
+ KEY *key_info_buffer;
+ Cursor *file;
+ bool error= true;
+ /* Check for duplicate fields and check type of table to create */
+ if (!alter_info->create_list.elements)
+ {
+ my_message(ER_TABLE_MUST_HAVE_COLUMNS, ER(ER_TABLE_MUST_HAVE_COLUMNS),
+ MYF(0));
+ return true;
+ }
+ assert(strcmp(table_name,table_proto->name().c_str())==0);
+ if (check_engine(session, table_name, create_info))
+ return true;
+ db_options= create_info->table_options;
+ if (create_info->row_type == ROW_TYPE_DYNAMIC)
+ db_options|=HA_OPTION_PACK_RECORD;
+
+ /*if (!(file= create_info->db_type->getCursor((TableShare*) 0, session->mem_root)))
+ {
+ my_error(ER_OUTOFMEMORY, MYF(0), sizeof(Cursor));
+ return true;
+ }*/
+
+ /* PMC - Done to avoid getting the partition handler by mistake! */
+ if (!(file= new (session->mem_root) ha_xtsys(pbxt_hton, NULL)))
+ {
+ my_error(ER_OUTOFMEMORY, MYF(0), sizeof(Cursor));
+ return true;
+ }
+
+ set_table_default_charset(create_info, (char*) db);
+
+ if (mysql_prepare_create_table(session,
+ create_info,
+ table_proto,
+ alter_info,
+ internal_tmp_table,
+ &db_options, file,
+ &key_info_buffer, &key_count,
+ select_field_count))
+ goto err;
+
+ /* Check if table exists */
+ if (create_info->options & HA_LEX_CREATE_TMP_TABLE)
+ {
+ path_length= build_tmptable_filename(session, path, sizeof(path));
+ }
+ else
+ {
+ #ifdef FN_DEVCHAR
+ /* check if the table name contains FN_DEVCHAR when defined */
+ if (strchr(table_name, FN_DEVCHAR))
+ {
+ my_error(ER_WRONG_TABLE_NAME, MYF(0), table_name);
+ return true;
+ }
+#endif
+ path_length= build_table_filename(path, sizeof(path), db, table_name, internal_tmp_table);
+ }
+
+ /* Check if table already exists */
+ if ((create_info->options & HA_LEX_CREATE_TMP_TABLE) &&
+ session->find_temporary_table(db, table_name))
+ {
+ if (create_info->options & HA_LEX_CREATE_IF_NOT_EXISTS)
+ {
+ create_info->table_existed= 1; // Mark that table existed
+ push_warning_printf(session, DRIZZLE_ERROR::WARN_LEVEL_NOTE,
+ ER_TABLE_EXISTS_ERROR, ER(ER_TABLE_EXISTS_ERROR),
+ table_name);
+ error= 0;
+ goto err;
+ }
+ my_error(ER_TABLE_EXISTS_ERROR, MYF(0), table_name);
+ goto err;
+ }
+
+ //pthread_mutex_lock(&LOCK_open); /* CREATE TABLE (some confussion on naming, double check) */
+ if (!internal_tmp_table && !(create_info->options & HA_LEX_CREATE_TMP_TABLE))
+ {
+ if (plugin::StorageEngine::getTableDefinition(*session,
+ path,
+ db,
+ table_name,
+ internal_tmp_table) == EEXIST)
+ {
+ if (create_info->options & HA_LEX_CREATE_IF_NOT_EXISTS)
+ {
+ error= false;
+ push_warning_printf(session, DRIZZLE_ERROR::WARN_LEVEL_NOTE,
+ ER_TABLE_EXISTS_ERROR, ER(ER_TABLE_EXISTS_ERROR),
+ table_name);
+ create_info->table_existed= 1; // Mark that table existed
+ }
+ else
+ my_error(ER_TABLE_EXISTS_ERROR,MYF(0),table_name);
+
+ goto unlock_and_end;
+ }
+ /*
+ * We don't assert here, but check the result, because the table could be
+ * in the table definition cache and in the same time the .frm could be
+ * missing from the disk, in case of manual intervention which deletes
+ * the .frm file. The user has to use FLUSH TABLES; to clear the cache.
+ * Then she could create the table. This case is pretty obscure and
+ * therefore we don't introduce a new error message only for it.
+ * */
+ if (TableShare::getShare(db, table_name))
+ {
+ my_error(ER_TABLE_EXISTS_ERROR, MYF(0), table_name);
+ goto unlock_and_end;
+ }
+ }
+ /*
+ * Check that table with given name does not already
+ * exist in any storage engine. In such a case it should
+ * be discovered and the error ER_TABLE_EXISTS_ERROR be returned
+ * unless user specified CREATE TABLE IF EXISTS
+ * The LOCK_open mutex has been locked to make sure no
+ * one else is attempting to discover the table. Since
+ * it's not on disk as a frm file, no one could be using it!
+ * */
+ if (!(create_info->options & HA_LEX_CREATE_TMP_TABLE))
+ {
+ bool create_if_not_exists =
+ create_info->options & HA_LEX_CREATE_IF_NOT_EXISTS;
+
+ char table_path[FN_REFLEN];
+ uint32_t table_path_length;
+
+ table_path_length= build_table_filename(table_path, sizeof(table_path),
+ db, table_name, false);
+
+ int retcode= plugin::StorageEngine::getTableDefinition(*session,
+ table_path,
+ db,
+ table_name,
+ false);
+ switch (retcode)
+ {
+ case ENOENT:
+ /* Normal case, no table exists. we can go and create it */
+ break;
+ case EEXIST:
+ if (create_if_not_exists)
+ {
+ error= false;
+ push_warning_printf(session, DRIZZLE_ERROR::WARN_LEVEL_NOTE,
+ ER_TABLE_EXISTS_ERROR, ER(ER_TABLE_EXISTS_ERROR),
+ table_name);
+ create_info->table_existed= 1; // Mark that table existed
+ goto unlock_and_end;
+ }
+ my_error(ER_TABLE_EXISTS_ERROR,MYF(0),table_name);
+ goto unlock_and_end;
+ default:
+ my_error(retcode, MYF(0),table_name);
+ goto unlock_and_end;
+ }
+ }
+
+ session->set_proc_info("creating table");
+ create_info->table_existed= 0; // Mark that table is created
+
+ create_info->table_options=db_options;
+
+ if (rea_create_table(session, path, db, table_name,
+ table_proto,
+ create_info, alter_info->create_list,
+ key_count, key_info_buffer))
+ goto unlock_and_end;
+
+ if (create_info->options & HA_LEX_CREATE_TMP_TABLE)
+ {
+ /* Open table and put in temporary table list */
+ if (!(session->open_temporary_table(path, db, table_name, 1, OTM_OPEN)))
+ {
+ (void) session->rm_temporary_table(create_info->db_type, path);
+ goto unlock_and_end;
+ }
+ }
+
+ /*
+ * Don't write statement if:
+ * - It is an internal temporary table,
+ * - Row-based logging is used and it we are creating a temporary table, or
+ * - The binary log is not open.
+ * Otherwise, the statement shall be binlogged.
+ * */
+ if (!internal_tmp_table &&
+ ((!(create_info->options & HA_LEX_CREATE_TMP_TABLE))))
+ write_bin_log(session, session->query, session->query_length);
+ error= false;
+unlock_and_end:
+ //pthread_mutex_unlock(&LOCK_open);
+
+err:
+ session->set_proc_info("After create");
+ delete file;
+ return(error);
+}
+
+#else // MySQL case
///////////////////////////////
/*
* Unfortunately I cannot use the standard mysql_create_table_no_lock() because it will lock "LOCK_open"
@@ -1229,13 +1487,13 @@ static bool mysql_create_table_no_lock(T
if (create_info->options & HA_LEX_CREATE_TMP_TABLE)
{
/* Open table and put in temporary table list */
-#if MYSQL_VERSION_ID > 60005
+#if MYSQL_VERSION_ID >= 50404
if (!(open_temporary_table(thd, path, db, table_name, 1, OTM_OPEN)))
#else
if (!(open_temporary_table(thd, path, db, table_name, 1)))
#endif
{
-#if MYSQL_VERSION_ID > 60005
+#if MYSQL_VERSION_ID >= 50404
(void) rm_temporary_table(create_info->db_type, path, false);
#else
(void) rm_temporary_table(create_info->db_type, path);
@@ -1252,11 +1510,21 @@ static bool mysql_create_table_no_lock(T
- The binary log is not open.
Otherwise, the statement shall be binlogged.
*/
+ /* PBXT 1.0.09e
+ * Firstly we had a compile problem with MySQL 5.1.42 and
+ * the write_bin_log() call below:
+ * discover_xt.cc:1259: error: argument of type 'char* (Statement::)()' does not match 'const char*'
+ *
+ * And secondly, we should no write the BINLOG anyway because this is
+ * an internal PBXT system table.
+ *
+ * So I am just commenting out the code altogether.
if (!internal_tmp_table &&
(!thd->current_stmt_binlog_row_based ||
(thd->current_stmt_binlog_row_based &&
!(create_info->options & HA_LEX_CREATE_TMP_TABLE))))
- write_bin_log(thd, TRUE, thd->query(), thd->query_length());
+ write_bin_log(thd, TRUE, thd->query, thd->query_length);
+ */
error= FALSE;
unlock_and_end:
pthread_mutex_unlock(&LOCK_open);
@@ -1279,37 +1547,51 @@ warn:
////// END OF CUT AND PASTES FROM sql_table.cc ////////
////////////////////////////////////////////////////////
+#endif // DRIZZLED
#endif // LOCK_OPEN_HACK_REQUIRED
//------------------------------
int xt_create_table_frm(handlerton *hton, THD* thd, const char *db, const char *name, DT_FIELD_INFO *info, DT_KEY_INFO *XT_UNUSED(keys), xtBool skip_existing)
{
#ifdef DRIZZLED
- drizzled::message::Table table_proto;
+#define MYLEX_CREATE_INFO create_info
+#else
+#define MYLEX_CREATE_INFO mylex.create_info
+#endif
+
+#ifdef DRIZZLED
+ drizzled::statement::AlterTable *stmt = new drizzled::statement::AlterTable(thd);
+ HA_CREATE_INFO create_info;
+ //AlterInfo alter_info;
+ drizzled::message::Table table_proto;
static const char *ext = ".dfe";
static const int ext_len = 4;
+
+ table_proto.mutable_engine()->mutable_name()->assign("PBXT");
#else
static const char *ext = ".frm";
static const int ext_len = 4;
#endif
int err = 1;
- //HA_CREATE_INFO create_info = {0};
- //Alter_info alter_info;
char field_length_buffer[12], *field_length_ptr;
LEX *save_lex= thd->lex, mylex;
-
- memset(&mylex.create_info, 0, sizeof(HA_CREATE_INFO));
+
+ memset(&MYLEX_CREATE_INFO, 0, sizeof(HA_CREATE_INFO));
thd->lex = &mylex;
- lex_start(thd);
+ lex_start(thd);
+#ifdef DRIZZLED
+ mylex.statement = stmt;
+#endif
/* setup the create info */
- mylex.create_info.db_type = hton;
+ MYLEX_CREATE_INFO.db_type = hton;
+
#ifndef DRIZZLED
mylex.create_info.frm_only = 1;
#endif
- mylex.create_info.default_table_charset = system_charset_info;
+ MYLEX_CREATE_INFO.default_table_charset = system_charset_info;
/* setup the column info. */
while (info->field_name) {
@@ -1335,7 +1617,7 @@ int xt_create_table_frm(handlerton *hton
#else
if (add_field_to_list(thd, &field_name, info->field_type, field_length_ptr, info->field_decimal_length,
info->field_flags,
-#if MYSQL_VERSION_ID > 60005
+#if MYSQL_VERSION_ID >= 50404
HA_SM_DISK,
COLUMN_FORMAT_TYPE_FIXED,
#endif
@@ -1369,7 +1651,7 @@ int xt_create_table_frm(handlerton *hton
table_proto.set_name(name);
table_proto.set_type(drizzled::message::Table::STANDARD);
- if (mysql_create_table_no_lock(thd, db, name, &mylex.create_info, &table_proto, &mylex.alter_info, 1, 0, false))
+ if (mysql_create_table_no_lock(thd, db, name, &create_info, &table_proto, &stmt->alter_info, 1, 0))
goto error;
#else
if (mysql_create_table_no_lock(thd, db, name, &mylex.create_info, &mylex.alter_info, 1, 0))
=== modified file 'storage/pbxt/src/filesys_xt.cc'
--- a/storage/pbxt/src/filesys_xt.cc 2009-09-03 06:15:03 +0000
+++ b/storage/pbxt/src/filesys_xt.cc 2009-11-24 10:55:06 +0000
@@ -56,6 +56,8 @@
//#define DEBUG_TRACE_FILES
//#define INJECT_WRITE_REMAP_ERROR
/* This is required to make testing on the Mac faster: */
+/* It turns of full file sync. */
+#define DEBUG_FAST_MAC
#endif
#ifdef DEBUG_TRACE_FILES
@@ -63,10 +65,6 @@
#define PRINTF xt_trace
#endif
-#if defined(XT_MAC) && defined(F_FULLFSYNC)
-#undef F_FULLFSYNC
-#endif
-
#ifdef INJECT_WRITE_REMAP_ERROR
#define INJECT_REMAP_FILE_SIZE 1000000
#define INJECT_REMAP_FILE_TYPE "xtd"
@@ -883,7 +881,7 @@ xtPublic xtBool xt_flush_file(XTOpenFile
* fsync didn't really flush index pages to disk. fcntl(F_FULLFSYNC) is considered more effective
* in such case.
*/
-#ifdef F_FULLFSYNC
+#if defined(F_FULLFSYNC) && !defined(DEBUG_FAST_MAC)
if (fcntl(of->of_filedes, F_FULLFSYNC, 0) == -1) {
xt_register_ferrno(XT_REG_CONTEXT, errno, xt_file_path(of));
goto failed;
=== modified file 'storage/pbxt/src/filesys_xt.h'
--- a/storage/pbxt/src/filesys_xt.h 2009-08-17 11:12:36 +0000
+++ b/storage/pbxt/src/filesys_xt.h 2009-11-24 10:55:06 +0000
@@ -102,7 +102,7 @@ xtBool xt_fs_rename(struct XTThread *s
#define FILE_MAP_UNLOCK(i, o) xt_xsmutex_unlock(i, o)
#elif defined(FILE_MAP_USE_PTHREAD_RW)
#define FILE_MAP_LOCK_TYPE xt_rwlock_type
-#define FILE_MAP_INIT_LOCK(s, i) xt_init_rwlock(s, i)
+#define FILE_MAP_INIT_LOCK(s, i) xt_init_rwlock_with_autoname(s, i)
#define FILE_MAP_FREE_LOCK(s, i) xt_free_rwlock(i)
#define FILE_MAP_READ_LOCK(i, o) xt_slock_rwlock_ns(i)
#define FILE_MAP_WRITE_LOCK(i, o) xt_xlock_rwlock_ns(i)
=== modified file 'storage/pbxt/src/ha_pbxt.cc'
--- a/storage/pbxt/src/ha_pbxt.cc 2009-09-03 06:15:03 +0000
+++ b/storage/pbxt/src/ha_pbxt.cc 2009-11-27 15:37:02 +0000
@@ -35,6 +35,7 @@
#include <stdlib.h>
#include <time.h>
+#include <ctype.h>
#ifdef DRIZZLED
#include <drizzled/common.h>
@@ -42,14 +43,21 @@
#include <mysys/my_alloc.h>
#include <mysys/hash.h>
#include <drizzled/field.h>
-#include <drizzled/current_session.h>
+#include <drizzled/session.h>
#include <drizzled/data_home.h>
#include <drizzled/error.h>
#include <drizzled/table.h>
#include <drizzled/field/timestamp.h>
#include <drizzled/server_includes.h>
+#include <drizzled/plugin/info_schema_table.h>
extern "C" char **session_query(Session *session);
#define my_strdup(a,b) strdup(a)
+
+using drizzled::plugin::Registry;
+using drizzled::plugin::ColumnInfo;
+using drizzled::plugin::InfoSchemaTable;
+using drizzled::plugin::InfoSchemaMethods;
+
#else
#include "mysql_priv.h"
#include <mysql/plugin.h>
@@ -71,7 +79,7 @@ extern "C" char **session_query(Session
#include "tabcache_xt.h"
#include "systab_xt.h"
#include "xaction_xt.h"
-#include "restart_xt.h"
+#include "backup_xt.h"
#ifdef DEBUG
//#define XT_USE_SYS_PAR_DEBUG_SIZES
@@ -101,6 +109,10 @@ static void pbxt_drop_database(handlert
static int pbxt_close_connection(handlerton *hton, THD* thd);
static int pbxt_commit(handlerton *hton, THD *thd, bool all);
static int pbxt_rollback(handlerton *hton, THD *thd, bool all);
+static int pbxt_prepare(handlerton *hton, THD *thd, bool all);
+static int pbxt_recover(handlerton *hton, XID *xid_list, uint len);
+static int pbxt_commit_by_xid(handlerton *hton, XID *xid);
+static int pbxt_rollback_by_xid(handlerton *hton, XID *xid);
#endif
static void ha_aquire_exclusive_use(XTThreadPtr self, XTSharePtr share, ha_pbxt *mine);
static void ha_release_exclusive_use(XTThreadPtr self, XTSharePtr share);
@@ -123,7 +135,9 @@ static void ha_close_open_tables(XTThre
#ifdef PBXT_HANDLER_TRACE
#define PBXT_ALLOW_PRINTING
-#define XT_TRACE_CALL() do { XTThreadPtr s = xt_get_self(); printf("%s %s\n", s ? s->t_name : "-unknown-", __FUNC__); } while (0)
+#define XT_TRACE_CALL() ha_trace_function(__FUNC__, NULL)
+#define XT_TRACE_METHOD() ha_trace_function(__FUNC__, pb_share->sh_table_path->ps_path)
+
#ifdef PBXT_TRACE_RETURN
#define XT_RETURN(x) do { printf("%d\n", (int) (x)); return (x); } while (0)
#define XT_RETURN_VOID do { printf("out\n"); return; } while (0)
@@ -135,6 +149,7 @@ static void ha_close_open_tables(XTThre
#else
#define XT_TRACE_CALL()
+#define XT_TRACE_METHOD()
#define XT_RETURN(x) return (x)
#define XT_RETURN_VOID return
@@ -165,10 +180,10 @@ xtBool pbxt_crash_debug = TRUE;
xtBool pbxt_crash_debug = FALSE;
#endif
+
/* Variables for pbxt share methods */
static xt_mutex_type pbxt_database_mutex; // Prevent a database from being opened while it is being dropped
static XTHashTabPtr pbxt_share_tables; // Hash used to track open tables
-XTDatabaseHPtr pbxt_database = NULL; // The global open database
static char *pbxt_index_cache_size;
static char *pbxt_record_cache_size;
static char *pbxt_log_cache_size;
@@ -180,6 +195,12 @@ static char *pbxt_data_log_threshold;
static char *pbxt_data_file_grow_size;
static char *pbxt_row_file_grow_size;
static int pbxt_max_threads;
+static my_bool pbxt_support_xa;
+
+#ifndef DRIZZLED
+// drizzle complains it's not used
+static XTXactEnumXARec pbxt_xa_enum;
+#endif
#ifdef DEBUG
#define XT_SHARE_LOCK_WAIT 5000
@@ -259,6 +280,33 @@ static HAVarParamsRec vp_row_file_grow_s
//#define XT_AUTO_INCREMENT_DEF 1
#endif
+#ifdef PBXT_HANDLER_TRACE
+static void ha_trace_function(const char *function, char *table)
+{
+ char func_buf[50], *ptr;
+ XTThreadPtr thread = xt_get_self();
+
+ if ((ptr = strchr(function, '('))) {
+ ptr--;
+ while (ptr > function) {
+ if (!(isalnum(*ptr) || *ptr == '_'))
+ break;
+ ptr--;
+ }
+ ptr++;
+ xt_strcpy(50, func_buf, ptr);
+ if ((ptr = strchr(func_buf, '(')))
+ *ptr = 0;
+ }
+ else
+ xt_strcpy(50, func_buf, function);
+ if (table)
+ printf("%s %s (%s)\n", thread ? thread->t_name : "-unknown-", func_buf, table);
+ else
+ printf("%s %s\n", thread ? thread->t_name : "-unknown-", func_buf);
+}
+#endif
+
/*
* -----------------------------------------------------------------------
* SHARED TABLE DATA
@@ -584,6 +632,9 @@ xtPublic XTThreadPtr xt_ha_thd_to_self(T
/* The first bit is 1. */
static u_int ha_get_max_bit(MX_BITMAP *map)
{
+#ifdef DRIZZLED
+ return map->getFirstSet();
+#else
my_bitmap_map *data_ptr = map->bitmap;
my_bitmap_map *end_ptr = map->last_word_ptr;
my_bitmap_map b;
@@ -612,6 +663,7 @@ static u_int ha_get_max_bit(MX_BITMAP *m
cnt -= 32;
}
return 0;
+#endif
}
/*
@@ -684,9 +736,10 @@ xtPublic int xt_ha_pbxt_to_mysql_error(i
return(-1); // Unknown error
}
-xtPublic int xt_ha_pbxt_thread_error_for_mysql(THD *XT_UNUSED(thd), const XTThreadPtr self, int ignore_dup_key)
+xtPublic int xt_ha_pbxt_thread_error_for_mysql(THD *thd, const XTThreadPtr self, int ignore_dup_key)
{
- int xt_err = self->t_exception.e_xt_err;
+ int xt_err = self->t_exception.e_xt_err;
+ xtBool dup_key = FALSE;
XT_PRINT2(self, "xt_ha_pbxt_thread_error_for_mysql xt_err=%d auto commit=%d\n", (int) xt_err, (int) self->st_auto_commit);
switch (xt_err) {
@@ -725,6 +778,7 @@ xtPublic int xt_ha_pbxt_thread_error_for
/* If we are in auto-commit mode (and we are not ignoring
* duplicate keys) then rollback the transaction automatically.
*/
+ dup_key = TRUE;
if (!ignore_dup_key && self->st_auto_commit)
goto abort_transaction;
break;
@@ -790,26 +844,20 @@ xtPublic int xt_ha_pbxt_thread_error_for
/* Locks are held on tables.
* Only rollback after locks are released.
*/
- self->st_auto_commit = TRUE;
+ /* I do not think this is required, because
+ * I tell mysql to rollback below,
+ * besides it is a hack!
+ self->st_auto_commit = TRUE;
+ */
self->st_abort_trans = TRUE;
}
-#ifdef xxxx
-/* DBUG_ASSERT(thd->transaction.stmt.ha_list == NULL ||
- trans == &thd->transaction.stmt); in handler.cc now
- * fails, and I don't know if this function can be called anymore! */
- /* Cause any other DBs to do a rollback as well... */
- if (thd) {
- /*
- * GOTCHA:
- * This is a BUG in MySQL. I cannot rollback a transaction if
- * pb_mysql_thd->in_sub_stmt! But I must....?!
- */
-#ifdef MYSQL_SERVER
- if (!thd->in_sub_stmt)
- ha_rollback(thd);
-#endif
+ /* Only tell MySQL to rollback if we automatically rollback.
+ * Note: calling this with (thd, FALSE), cause sp.test to fail.
+ */
+ if (!dup_key) {
+ if (thd)
+ thd_mark_transaction_to_rollback(thd, TRUE);
}
-#endif
}
break;
}
@@ -908,7 +956,11 @@ static void pbxt_call_init(XTThreadPtr s
xt_db_data_file_grow_size = (size_t) data_file_grow_size;
xt_db_row_file_grow_size = (size_t) row_file_grow_size;
+#ifdef DRIZZLED
+ pbxt_ignore_case = TRUE;
+#else
pbxt_ignore_case = lower_case_table_names != 0;
+#endif
if (pbxt_ignore_case)
pbxt_share_tables = xt_new_hashtable(self, ha_hash_comp_ci, ha_hash_ci, ha_hash_free, TRUE, FALSE);
else
@@ -968,7 +1020,7 @@ static void pbxt_call_exit(XTThreadPtr s
*/
static void ha_exit(XTThreadPtr self)
{
- xt_xres_wait_for_recovery(self);
+ xt_xres_terminate_recovery(self);
/* Wrap things up... */
xt_unuse_database(self, self); /* Just in case the main thread has a database in use (for testing)? */
@@ -1024,7 +1076,7 @@ static bool pbxt_show_status(handlerton
cont_(a);
if (!not_ok) {
- if (stat_print(thd, "PBXT", 4, "", 0, strbuf.sb_cstring, strbuf.sb_len))
+ if (stat_print(thd, "PBXT", 4, "", 0, strbuf.sb_cstring, (uint) strbuf.sb_len))
not_ok = TRUE;
}
xt_sb_set_size(self, &strbuf, 0);
@@ -1038,14 +1090,14 @@ static bool pbxt_show_status(handlerton
* return 1 on error, else 0.
*/
#ifdef DRIZZLED
-static int pbxt_init(PluginRegistry ®istry)
+static int pbxt_init(Registry ®istry)
#else
static int pbxt_init(void *p)
#endif
{
int init_err = 0;
- XT_TRACE_CALL();
+ XT_PRINT0(NULL, "pbxt_init\n");
if (sizeof(xtWordPS) != sizeof(void *)) {
printf("PBXT: This won't work, I require that sizeof(xtWordPS) == sizeof(void *)!\n");
@@ -1076,11 +1128,27 @@ static int pbxt_init(void *p)
pbxt_hton->close_connection = pbxt_close_connection; /* close_connection, cleanup thread related data. */
pbxt_hton->commit = pbxt_commit; /* commit */
pbxt_hton->rollback = pbxt_rollback; /* rollback */
+ if (pbxt_support_xa) {
+ pbxt_hton->prepare = pbxt_prepare;
+ pbxt_hton->recover = pbxt_recover;
+ pbxt_hton->commit_by_xid = pbxt_commit_by_xid;
+ pbxt_hton->rollback_by_xid = pbxt_rollback_by_xid;
+ }
+ else {
+ pbxt_hton->prepare = NULL;
+ pbxt_hton->recover = NULL;
+ pbxt_hton->commit_by_xid = NULL;
+ pbxt_hton->rollback_by_xid = NULL;
+ }
pbxt_hton->create = pbxt_create_handler; /* Create a new handler */
pbxt_hton->drop_database = pbxt_drop_database; /* Drop a database */
pbxt_hton->panic = pbxt_panic; /* Panic call */
pbxt_hton->show_status = pbxt_show_status;
pbxt_hton->flags = HTON_NO_FLAGS; /* HTON_CAN_RECREATE - Without this flags TRUNCATE uses delete_all_rows() */
+ pbxt_hton->slot = (uint)-1; /* assign invald value, so we know when it's inited later */
+#if defined(MYSQL_SUPPORTS_BACKUP) && defined(XT_ENABLE_ONLINE_BACKUP)
+ pbxt_hton->get_backup_engine = pbxt_backup_engine;
+#endif
#endif
if (!xt_init_logging()) /* Initialize logging */
goto error_1;
@@ -1160,8 +1228,10 @@ static int pbxt_init(void *p)
* Only real problem, 2 threads try to load the same
* plugin at the same time.
*/
+#if MYSQL_VERSION_ID < 60014
myxt_mutex_unlock(&LOCK_plugin);
#endif
+#endif
/* Can't do this here yet, because I need a THD! */
try_(b) {
@@ -1195,8 +1265,10 @@ static int pbxt_init(void *p)
if (thd)
myxt_destroy_thread(thd, FALSE);
#ifndef DRIZZLED
+#if MYSQL_VERSION_ID < 60014
myxt_mutex_lock(&LOCK_plugin);
#endif
+#endif
}
#endif
}
@@ -1262,7 +1334,7 @@ static int pbxt_init(void *p)
}
#ifdef DRIZZLED
-static int pbxt_end(PluginRegistry ®istry)
+static int pbxt_end(Registry ®istry)
#else
static int pbxt_end(void *)
#endif
@@ -1378,7 +1450,7 @@ static int pbxt_commit(handlerton *hton,
* transaction (!all && !self->st_auto_commit).
*/
if (all || self->st_auto_commit) {
- XT_PRINT0(self, "xt_xn_commit\n");
+ XT_PRINT0(self, "xt_xn_commit in pbxt_commit\n");
if (!xt_xn_commit(self))
err = xt_ha_pbxt_thread_error_for_mysql(thd, self, FALSE);
@@ -1402,7 +1474,7 @@ static int pbxt_rollback(handlerton *hto
XTThreadPtr self;
if ((self = (XTThreadPtr) *thd_ha_data(thd, hton))) {
- XT_PRINT1(self, "pbxt_rollback all=%d\n", all);
+ XT_PRINT1(self, "pbxt_rollback all=%d in pbxt_commit\n", all);
if (self->st_xact_data) {
/* There are no table locks, rollback immediately in all cases
@@ -1431,7 +1503,7 @@ static int pbxt_rollback(handlerton *hto
}
#ifdef DRIZZLED
-handler *PBXTStorageEngine::create(TABLE_SHARE *table, MEM_ROOT *mem_root)
+Cursor *PBXTStorageEngine::create(TABLE_SHARE *table, MEM_ROOT *mem_root)
{
PBXTStorageEngine * const hton = this;
#else
@@ -1446,6 +1518,182 @@ static handler *pbxt_create_handler(hand
/*
* -----------------------------------------------------------------------
+ * 2-PHASE COMMIT
+ *
+ */
+
+#ifndef DRIZZLED
+
+static int pbxt_prepare(handlerton *hton, THD *thd, bool all)
+{
+ int err = 0;
+ XTThreadPtr self;
+
+ XT_TRACE_CALL();
+ if ((self = (XTThreadPtr) *thd_ha_data(thd, hton))) {
+ XT_PRINT1(self, "pbxt_commit all=%d\n", all);
+
+ if (self->st_xact_data) {
+ /* There are no table locks, commit immediately in all cases
+ * except when this is a statement commit with an explicit
+ * transaction (!all && !self->st_auto_commit).
+ */
+ if (all) {
+ XID xid;
+
+ XT_PRINT0(self, "xt_xn_prepare in pbxt_prepare\n");
+ thd_get_xid(thd, (MYSQL_XID*) &xid);
+
+ if (!xt_xn_prepare(xid.length(), (xtWord1 *) &xid, self))
+ err = xt_ha_pbxt_thread_error_for_mysql(thd, self, FALSE);
+ }
+ }
+ }
+ return err;
+}
+
+static XTThreadPtr ha_temp_open_global_database(handlerton *hton, THD **ret_thd, int *temp_thread, char *thread_name, int *err)
+{
+ THD *thd;
+ XTThreadPtr self = NULL;
+
+ *temp_thread = 0;
+ if ((thd = current_thd))
+ self = (XTThreadPtr) *thd_ha_data(thd, hton);
+ else {
+ //thd = (THD *) myxt_create_thread();
+ //*temp_thread |= 2;
+ }
+
+ if (!self) {
+ XTExceptionRec e;
+
+ if (!(self = xt_create_thread(thread_name, FALSE, TRUE, &e))) {
+ *err = xt_ha_pbxt_to_mysql_error(e.e_xt_err);
+ xt_log_exception(NULL, &e, XT_LOG_DEFAULT);
+ return NULL;
+ }
+ *temp_thread |= 1;
+ }
+
+ xt_xres_wait_for_recovery(self, XT_RECOVER_DONE);
+
+ try_(a) {
+ xt_open_database(self, mysql_real_data_home, TRUE);
+ }
+ catch_(a) {
+ *err = xt_ha_pbxt_thread_error_for_mysql(thd, self, FALSE);
+ if ((*temp_thread & 1))
+ xt_free_thread(self);
+ if (*temp_thread & 2)
+ myxt_destroy_thread(thd, FALSE);
+ self = NULL;
+ }
+ cont_(a);
+
+ *ret_thd = thd;
+ return self;
+}
+
+static void ha_temp_close_database(XTThreadPtr self, THD *thd, int temp_thread)
+{
+ xt_unuse_database(self, self);
+ if (temp_thread & 1)
+ xt_free_thread(self);
+ if (temp_thread & 2)
+ myxt_destroy_thread(thd, TRUE);
+}
+
+/* Return all prepared transactions, found during recovery.
+ * This function returns a count. If len is returned, the
+ * function will be called again.
+ */
+static int pbxt_recover(handlerton *hton, XID *xid_list, uint len)
+{
+ xtBool temp_thread;
+ XTThreadPtr self;
+ XTDatabaseHPtr db;
+ uint count = 0;
+ XTXactPreparePtr xap;
+ int err;
+ THD *thd;
+
+ if (!(self = ha_temp_open_global_database(hton, &thd, &temp_thread, "TempForRecover", &err)))
+ return 0;
+
+ db = self->st_database;
+
+ for (count=0; count<len; count++) {
+ xap = xt_xn_enum_xa_data(db, &pbxt_xa_enum);
+ if (!xap)
+ break;
+ memcpy(&xid_list[count], xap->xp_xa_data, xap->xp_data_len);
+ }
+
+ ha_temp_close_database(self, thd, temp_thread);
+ return (int) count;
+}
+
+static int pbxt_commit_by_xid(handlerton *hton, XID *xid)
+{
+ xtBool temp_thread;
+ XTThreadPtr self;
+ XTDatabaseHPtr db;
+ int err = 0;
+ XTXactPreparePtr xap;
+ THD *thd;
+
+ XT_TRACE_CALL();
+
+ if (!(self = ha_temp_open_global_database(hton, &thd, &temp_thread, "TempForCommitXA", &err)))
+ return err;
+ db = self->st_database;
+
+ if ((xap = xt_xn_find_xa_data(db, xid->length(), (xtWord1 *) xid, TRUE, self))) {
+ if ((self->st_xact_data = xt_xn_get_xact(db, xap->xp_xact_id, self))) {
+ self->st_xact_data->xd_flags &= ~XT_XN_XAC_PREPARED; // Prepared transactions cannot be swept!
+ if (!xt_xn_commit(self))
+ err = xt_ha_pbxt_thread_error_for_mysql(thd, self, FALSE);
+ }
+ xt_xn_delete_xa_data(db, xap, TRUE, self);
+ }
+
+ ha_temp_close_database(self, thd, temp_thread);
+ return 0;
+}
+
+static int pbxt_rollback_by_xid(handlerton *hton, XID *xid)
+{
+ int temp_thread;
+ XTThreadPtr self;
+ XTDatabaseHPtr db;
+ int err = 0;
+ XTXactPreparePtr xap;
+ THD *thd;
+
+ XT_TRACE_CALL();
+
+ if (!(self = ha_temp_open_global_database(hton, &thd, &temp_thread, "TempForRollbackXA", &err)))
+ return err;
+ db = self->st_database;
+
+ if ((xap = xt_xn_find_xa_data(db, xid->length(), (xtWord1 *) xid, TRUE, self))) {
+ if ((self->st_xact_data = xt_xn_get_xact(db, xap->xp_xact_id, self))) {
+ self->st_xact_data->xd_flags &= ~XT_XN_XAC_PREPARED; // Prepared transactions cannot be swept!
+ if (!xt_xn_rollback(self))
+ err = xt_ha_pbxt_thread_error_for_mysql(thd, self, FALSE);
+ }
+ xt_xn_delete_xa_data(db, xap, TRUE, self);
+ }
+
+ ha_temp_close_database(self, thd, temp_thread);
+ return 0;
+}
+
+#endif
+
+/*
+ * -----------------------------------------------------------------------
* HANDLER LOCKING FUNCTIONS
*
* These functions are used get a lock on all handles of a particular table.
@@ -1497,7 +1745,7 @@ static void ha_aquire_exclusive_use(XTTh
ha_pbxt *handler;
time_t end_time = time(NULL) + XT_SHARE_LOCK_TIMEOUT / 1000;
- XT_PRINT1(self, "ha_aquire_exclusive_use %s PBXT X lock\n", share->sh_table_path->ps_path);
+ XT_PRINT1(self, "ha_aquire_exclusive_use (%s) PBXT X lock\n", share->sh_table_path->ps_path);
/* GOTCHA: It is possible to hang here, if you hold
* onto the sh_ex_mutex lock, before we really
* have the exclusive lock (i.e. before all
@@ -1578,7 +1826,7 @@ static void ha_release_exclusive_use(XTT
static void ha_release_exclusive_use(XTThreadPtr XT_UNUSED(self), XTSharePtr share)
#endif
{
- XT_PRINT1(self, "ha_release_exclusive_use %s PBXT X UNLOCK\n", share->sh_table_path->ps_path);
+ XT_PRINT1(self, "ha_release_exclusive_use (%s) PBXT X UNLOCK\n", share->sh_table_path->ps_path);
xt_lock_mutex_ns((xt_mutex_type *) share->sh_ex_mutex);
share->sh_table_lock = FALSE;
xt_broadcast_cond_ns((xt_cond_type *) share->sh_ex_cond);
@@ -1589,7 +1837,7 @@ static xtBool ha_wait_for_shared_use(ha_
{
time_t end_time = time(NULL) + XT_SHARE_LOCK_TIMEOUT / 1000;
- XT_PRINT1(xt_get_self(), "ha_wait_for_shared_use %s share lock wait...\n", share->sh_table_path->ps_path);
+ XT_PRINT1(xt_get_self(), "ha_wait_for_shared_use (%s) share lock wait...\n", share->sh_table_path->ps_path);
mine->pb_ex_in_use = 0;
xt_lock_mutex_ns((xt_mutex_type *) share->sh_ex_mutex);
while (share->sh_table_lock) {
@@ -1667,14 +1915,38 @@ xtPublic int ha_pbxt::reopen()
*
*/
-int pbxt_statistics_fill_table(THD *thd, TABLE_LIST *tables, COND *cond)
+static int pbxt_statistics_fill_table(THD *thd, TABLE_LIST *tables, COND *cond)
{
- XTThreadPtr self;
+ XTThreadPtr self = NULL;
int err = 0;
+ if (!pbxt_hton) {
+ /* Can't do if PBXT is not loaded! */
+ XTExceptionRec e;
+
+ xt_exception_xterr(&e, XT_CONTEXT, XT_ERR_PBXT_NOT_INSTALLED);
+ xt_log_exception(NULL, &e, XT_LOG_DEFAULT);
+ /* Just return an empty set: */
+ return 0;
+ }
+
if (!(self = ha_set_current_thread(thd, &err)))
return xt_ha_pbxt_to_mysql_error(err);
+
+
try_(a) {
+ /* If the thread has no open database, and the global
+ * database is already open, then open
+ * the database. Otherwise the statement will be
+ * executed without an open database, which means
+ * that the related statistics will be missing.
+ *
+ * This includes all background threads.
+ */
+ if (!self->st_database && pbxt_database) {
+ xt_ha_open_database_of_table(self, (XTPathStrPtr) NULL);
+ }
+
err = myxt_statistics_fill_table(self, thd, tables, cond, system_charset_info);
}
catch_(a) {
@@ -1684,6 +1956,24 @@ int pbxt_statistics_fill_table(THD *thd,
return err;
}
+#ifdef DRIZZLED
+ColumnInfo pbxt_statistics_fields_info[]=
+{
+ ColumnInfo("ID", 4, MYSQL_TYPE_LONG, 0, 0, "The ID of the statistic", SKIP_OPEN_TABLE),
+ ColumnInfo("Name", 40, MYSQL_TYPE_STRING, 0, 0, "The name of the statistic", SKIP_OPEN_TABLE),
+ ColumnInfo("Value", 8, MYSQL_TYPE_LONGLONG, 0, 0, "The accumulated value", SKIP_OPEN_TABLE),
+ ColumnInfo()
+};
+
+class PBXTStatisticsMethods : public InfoSchemaMethods
+{
+public:
+ int fillTable(Session *session, TableList *tables, COND *cond)
+ {
+ return pbxt_statistics_fill_table(session, tables, cond);
+ }
+};
+#else
ST_FIELD_INFO pbxt_statistics_fields_info[]=
{
{ "ID", 4, MYSQL_TYPE_LONG, 0, 0, "The ID of the statistic", SKIP_OPEN_TABLE},
@@ -1691,24 +1981,28 @@ ST_FIELD_INFO pbxt_statistics_fields_inf
{ "Value", 8, MYSQL_TYPE_LONGLONG, 0, 0, "The accumulated value", SKIP_OPEN_TABLE},
{ 0, 0, MYSQL_TYPE_STRING, 0, 0, 0, SKIP_OPEN_TABLE}
};
+#endif
#ifdef DRIZZLED
static InfoSchemaTable *pbxt_statistics_table;
-
-int pbxt_init_statitics(PluginRegistry ®istry)
+static PBXTStatisticsMethods pbxt_statistics_methods;
+static int pbxt_init_statistics(Registry ®istry)
#else
-int pbxt_init_statitics(void *p)
+static int pbxt_init_statistics(void *p)
#endif
{
#ifdef DRIZZLED
- pbxt_statistics_table = (InfoSchemaTable *)xt_calloc_ns(sizeof(InfoSchemaTable));
- pbxt_statistics_table->table_name= "PBXT_STATISTICS";
+ //pbxt_statistics_table = (InfoSchemaTable *)xt_calloc_ns(sizeof(InfoSchemaTable));
+ //pbxt_statistics_table->table_name= "PBXT_STATISTICS";
+ pbxt_statistics_table = new InfoSchemaTable("PBXT_STATISTICS");
+ pbxt_statistics_table->setColumnInfo(pbxt_statistics_fields_info);
+ pbxt_statistics_table->setInfoSchemaMethods(&pbxt_statistics_methods);
registry.add(pbxt_statistics_table);
#else
ST_SCHEMA_TABLE *pbxt_statistics_table = (ST_SCHEMA_TABLE *) p;
-#endif
pbxt_statistics_table->fields_info = pbxt_statistics_fields_info;
pbxt_statistics_table->fill_table = pbxt_statistics_fill_table;
+#endif
#if defined(XT_WIN) && defined(XT_COREDUMP)
void register_crash_filter();
@@ -1721,14 +2015,14 @@ int pbxt_init_statitics(void *p)
}
#ifdef DRIZZLED
-int pbxt_exit_statitics(PluginRegistry ®istry)
+static int pbxt_exit_statistics(Registry ®istry)
#else
-int pbxt_exit_statitics(void *XT_UNUSED(p))
+static int pbxt_exit_statistics(void *XT_UNUSED(p))
#endif
{
#ifdef DRIZZLED
registry.remove(pbxt_statistics_table);
- xt_free_ns(pbxt_statistics_table);
+ delete pbxt_statistics_table;
#endif
return(0);
}
@@ -1758,7 +2052,11 @@ ha_pbxt::ha_pbxt(handlerton *hton, TABLE
* exist for the storage engine. This is also used by the default rename_table and
* delete_table method in handler.cc.
*/
+#ifdef DRIZZLED
+const char **PBXTStorageEngine::bas_ext() const
+#else
const char **ha_pbxt::bas_ext() const
+#endif
{
return pbxt_extensions;
}
@@ -1800,11 +2098,13 @@ MX_TABLE_TYPES_T ha_pbxt::table_flags()
* purposes!
HA_NOT_EXACT_COUNT |
*/
+#ifndef DRIZZLED
/*
* This basically means we have a file with the name of
* database table (which we do).
*/
HA_FILE_BASED |
+#endif
/*
* Not sure what this does (but MyISAM and InnoDB have it)?!
* Could it mean that we support the handler functions.
@@ -1971,7 +2271,7 @@ int ha_pbxt::open(const char *table_path
if (!(self = ha_set_current_thread(thd, &err)))
return xt_ha_pbxt_to_mysql_error(err);
- XT_PRINT1(self, "ha_pbxt::open %s\n", table_path);
+ XT_PRINT1(self, "open (%s)\n", table_path);
pb_ex_in_use = 1;
try_(a) {
@@ -2049,7 +2349,7 @@ int ha_pbxt::close(void)
}
}
- XT_PRINT1(self, "ha_pbxt::close %s\n", pb_share && pb_share->sh_table_path->ps_path ? pb_share->sh_table_path->ps_path : "unknown");
+ XT_PRINT1(self, "close (%s)\n", pb_share && pb_share->sh_table_path->ps_path ? pb_share->sh_table_path->ps_path : "unknown");
if (self) {
try_(a) {
@@ -2125,9 +2425,10 @@ void ha_pbxt::init_auto_increment(xtWord
if (!TS(table)->next_number_key_offset) {
// Autoincrement at key-start
err = index_last(table->record[1]);
- if (!err)
+ if (!err && !table->next_number_field->is_null(TS(table)->rec_buff_length)) {
/* {PRE-INC} */
nr = (xtWord8) table->next_number_field->val_int_offset(TS(table)->rec_buff_length);
+ }
}
else {
/* Do an index scan to find the largest value! */
@@ -2180,8 +2481,10 @@ void ha_pbxt::init_auto_increment(xtWord
table->next_number_field = tmp_fie;
table->in_use = tmp_thd;
- if (xn_started)
+ if (xn_started) {
+ XT_PRINT0(self, "xt_xn_commit in init_auto_increment\n");
xt_xn_commit(self);
+ }
}
xt_spinlock_unlock(&tab->tab_ainc_lock);
}
@@ -2228,13 +2531,13 @@ void ha_pbxt::get_auto_increment(MX_ULON
* insert into t1 values (-1);
* insert into t1 values (NULL);
*/
-void ha_pbxt::set_auto_increment(Field *nr)
+xtPublic void ha_set_auto_increment(XTOpenTablePtr ot, Field *nr)
{
register XTTableHPtr tab;
MX_ULONGLONG_T nr_int_val;
nr_int_val = nr->val_int();
- tab = pb_open_tab->ot_table;
+ tab = ot->ot_table;
if (nr->cmp((const unsigned char *)&tab->tab_auto_inc) > 0) {
xt_spinlock_lock(&tab->tab_ainc_lock);
@@ -2260,9 +2563,9 @@ void ha_pbxt::set_auto_increment(Field *
#else
tab->tab_dic.dic_min_auto_inc = nr_int_val + 100;
#endif
- pb_open_tab->ot_thread = xt_get_self();
- if (!xt_tab_write_min_auto_inc(pb_open_tab))
- xt_log_and_clear_exception(pb_open_tab->ot_thread);
+ ot->ot_thread = xt_get_self();
+ if (!xt_tab_write_min_auto_inc(ot))
+ xt_log_and_clear_exception(ot->ot_thread);
}
}
}
@@ -2305,7 +2608,7 @@ int ha_pbxt::write_row(byte *buf)
ASSERT_NS(pb_ex_in_use);
- XT_PRINT1(pb_open_tab->ot_thread, "ha_pbxt::write_row %s\n", pb_share->sh_table_path->ps_path);
+ XT_PRINT1(pb_open_tab->ot_thread, "write_row (%s)\n", pb_share->sh_table_path->ps_path);
XT_DISABLED_TRACE(("INSERT tx=%d val=%d\n", (int) pb_open_tab->ot_thread->st_xact_data->xd_start_xn_id, (int) XT_GET_DISK_4(&buf[1])));
//statistic_increment(ha_write_count,&LOCK_status);
#ifdef PBMS_ENABLED
@@ -2350,7 +2653,7 @@ int ha_pbxt::write_row(byte *buf)
err = update_err;
goto done;
}
- set_auto_increment(table->next_number_field);
+ ha_set_auto_increment(pb_open_tab, table->next_number_field);
}
if (!xt_tab_new_record(pb_open_tab, (xtWord1 *) buf)) {
@@ -2423,7 +2726,7 @@ int ha_pbxt::update_row(const byte * old
ASSERT_NS(pb_ex_in_use);
- XT_PRINT1(self, "ha_pbxt::update_row %s\n", pb_share->sh_table_path->ps_path);
+ XT_PRINT1(self, "update_row (%s)\n", pb_share->sh_table_path->ps_path);
XT_DISABLED_TRACE(("UPDATE tx=%d val=%d\n", (int) self->st_xact_data->xd_start_xn_id, (int) XT_GET_DISK_4(&new_data[1])));
//statistic_increment(ha_update_count,&LOCK_status);
@@ -2472,7 +2775,7 @@ int ha_pbxt::update_row(const byte * old
old_map = mx_tmp_use_all_columns(table, table->read_set);
nr = table->found_next_number_field->val_int();
- set_auto_increment(table->found_next_number_field);
+ ha_set_auto_increment(pb_open_tab, table->found_next_number_field);
mx_tmp_restore_column_map(table, old_map);
}
@@ -2504,7 +2807,7 @@ int ha_pbxt::delete_row(const byte * buf
ASSERT_NS(pb_ex_in_use);
- XT_PRINT1(pb_open_tab->ot_thread, "ha_pbxt::delete_row %s\n", pb_share->sh_table_path->ps_path);
+ XT_PRINT1(pb_open_tab->ot_thread, "delete_row (%s)\n", pb_share->sh_table_path->ps_path);
XT_DISABLED_TRACE(("DELETE tx=%d val=%d\n", (int) pb_open_tab->ot_thread->st_xact_data->xd_start_xn_id, (int) XT_GET_DISK_4(&buf[1])));
//statistic_increment(ha_delete_count,&LOCK_status);
@@ -2829,7 +3132,8 @@ int ha_pbxt::xt_index_prev_read(XTOpenTa
int ha_pbxt::index_init(uint idx, bool XT_UNUSED(sorted))
{
- XTIndexPtr ind;
+ XTIndexPtr ind;
+ XTThreadPtr thread = pb_open_tab->ot_thread;
/* select count(*) from smalltab_PBXT;
* ignores the error below, and continues to
@@ -2851,6 +3155,15 @@ int ha_pbxt::index_init(uint idx, bool X
printf("index_init %s index %d cols req=%d/%d read_bits=%X write_bits=%X index_bits=%X\n", pb_open_tab->ot_table->tab_name->ps_path, (int) idx, pb_open_tab->ot_cols_req, pb_open_tab->ot_cols_req, (int) *table->read_set->bitmap, (int) *table->write_set->bitmap, (int) *ind->mi_col_map.bitmap);
#endif
+ /* Start a statement based transaction as soon
+ * as a read is done for a modify type statement!
+ * Previously, this was done too late!
+ */
+ if (!thread->st_stat_trans) {
+ trans_register_ha(pb_mysql_thd, FALSE, pbxt_hton);
+ XT_PRINT0(thread, "ha_pbxt::update_row trans_register_ha all=FALSE\n");
+ thread->st_stat_trans = TRUE;
+ }
}
else {
pb_open_tab->ot_cols_req = ha_get_max_bit(table->read_set);
@@ -2901,7 +3214,7 @@ int ha_pbxt::index_init(uint idx, bool X
#endif
}
- xt_xlog_check_long_writer(pb_open_tab->ot_thread);
+ xt_xlog_check_long_writer(thread);
pb_open_tab->ot_thread->st_statistics.st_scan_index++;
return 0;
@@ -2911,7 +3224,7 @@ int ha_pbxt::index_end()
{
int err = 0;
- XT_TRACE_CALL();
+ XT_TRACE_METHOD();
XTThreadPtr thread = pb_open_tab->ot_thread;
@@ -2991,7 +3304,7 @@ int ha_pbxt::index_read_xt(byte * buf, u
ASSERT_NS(pb_ex_in_use);
- XT_PRINT1(pb_open_tab->ot_thread, "ha_pbxt::index_read_xt %s\n", pb_share->sh_table_path->ps_path);
+ XT_PRINT1(pb_open_tab->ot_thread, "index_read_xt (%s)\n", pb_share->sh_table_path->ps_path);
XT_DISABLED_TRACE(("search tx=%d val=%d update=%d\n", (int) pb_open_tab->ot_thread->st_xact_data->xd_start_xn_id, (int) XT_GET_DISK_4(key), pb_modified));
ind = (XTIndexPtr) pb_share->sh_dic_keys[idx];
@@ -3072,7 +3385,7 @@ int ha_pbxt::index_next(byte * buf)
int err = 0;
XTIndexPtr ind;
- XT_TRACE_CALL();
+ XT_TRACE_METHOD();
//statistic_increment(ha_read_next_count,&LOCK_status);
ASSERT_NS(pb_ex_in_use);
@@ -3114,7 +3427,7 @@ int ha_pbxt::index_next_same(byte * buf,
XTIndexPtr ind;
XTIdxSearchKeyRec search_key;
- XT_TRACE_CALL();
+ XT_TRACE_METHOD();
//statistic_increment(ha_read_next_count,&LOCK_status);
ASSERT_NS(pb_ex_in_use);
@@ -3154,7 +3467,7 @@ int ha_pbxt::index_prev(byte * buf)
int err = 0;
XTIndexPtr ind;
- XT_TRACE_CALL();
+ XT_TRACE_METHOD();
//statistic_increment(ha_read_prev_count,&LOCK_status);
ASSERT_NS(pb_ex_in_use);
@@ -3188,7 +3501,7 @@ int ha_pbxt::index_first(byte * buf)
XTIndexPtr ind;
XTIdxSearchKeyRec search_key;
- XT_TRACE_CALL();
+ XT_TRACE_METHOD();
//statistic_increment(ha_read_first_count,&LOCK_status);
ASSERT_NS(pb_ex_in_use);
@@ -3228,7 +3541,7 @@ int ha_pbxt::index_last(byte * buf)
XTIndexPtr ind;
XTIdxSearchKeyRec search_key;
- XT_TRACE_CALL();
+ XT_TRACE_METHOD();
//statistic_increment(ha_read_last_count,&LOCK_status);
ASSERT_NS(pb_ex_in_use);
@@ -3275,10 +3588,11 @@ int ha_pbxt::index_last(byte * buf)
*/
int ha_pbxt::rnd_init(bool scan)
{
- int err = 0;
+ int err = 0;
+ XTThreadPtr thread = pb_open_tab->ot_thread;
- XT_PRINT1(pb_open_tab->ot_thread, "ha_pbxt::rnd_init %s\n", pb_share->sh_table_path->ps_path);
- XT_DISABLED_TRACE(("seq scan tx=%d\n", (int) pb_open_tab->ot_thread->st_xact_data->xd_start_xn_id));
+ XT_PRINT1(thread, "rnd_init (%s)\n", pb_share->sh_table_path->ps_path);
+ XT_DISABLED_TRACE(("seq scan tx=%d\n", (int) thread->st_xact_data->xd_start_xn_id));
/* Call xt_tab_seq_exit() to make sure the resources used by the previous
* scan are freed. In particular make sure cache page ref count is decremented.
@@ -3296,8 +3610,18 @@ int ha_pbxt::rnd_init(bool scan)
xt_tab_seq_exit(pb_open_tab);
/* The number of columns required: */
- if (pb_open_tab->ot_is_modify)
+ if (pb_open_tab->ot_is_modify) {
pb_open_tab->ot_cols_req = table->read_set->MX_BIT_SIZE();
+ /* Start a statement based transaction as soon
+ * as a read is done for a modify type statement!
+ * Previously, this was done too late!
+ */
+ if (!thread->st_stat_trans) {
+ trans_register_ha(pb_mysql_thd, FALSE, pbxt_hton);
+ XT_PRINT0(thread, "ha_pbxt::update_row trans_register_ha all=FALSE\n");
+ thread->st_stat_trans = TRUE;
+ }
+ }
else {
pb_open_tab->ot_cols_req = ha_get_max_bit(table->read_set);
@@ -3322,14 +3646,14 @@ int ha_pbxt::rnd_init(bool scan)
else
xt_tab_seq_reset(pb_open_tab);
- xt_xlog_check_long_writer(pb_open_tab->ot_thread);
+ xt_xlog_check_long_writer(thread);
return err;
}
int ha_pbxt::rnd_end()
{
- XT_TRACE_CALL();
+ XT_TRACE_METHOD();
/*
* make permanent the lock for the last scanned row
@@ -3358,7 +3682,7 @@ int ha_pbxt::rnd_next(byte *buf)
int err = 0;
xtBool eof;
- XT_TRACE_CALL();
+ XT_TRACE_METHOD();
ASSERT_NS(pb_ex_in_use);
//statistic_increment(ha_read_rnd_next_count, &LOCK_status);
xt_xlog_check_long_writer(pb_open_tab->ot_thread);
@@ -3393,7 +3717,7 @@ int ha_pbxt::rnd_next(byte *buf)
*/
void ha_pbxt::position(const byte *XT_UNUSED(record))
{
- XT_TRACE_CALL();
+ XT_TRACE_METHOD();
ASSERT_NS(pb_ex_in_use);
/*
* I changed this from using little endian to big endian.
@@ -3429,10 +3753,10 @@ int ha_pbxt::rnd_pos(byte * buf, byte *p
{
int err = 0;
- XT_TRACE_CALL();
+ XT_TRACE_METHOD();
ASSERT_NS(pb_ex_in_use);
//statistic_increment(ha_read_rnd_count, &LOCK_status);
- XT_PRINT1(pb_open_tab->ot_thread, "ha_pbxt::rnd_pos %s\n", pb_share->sh_table_path->ps_path);
+ XT_PRINT1(pb_open_tab->ot_thread, "rnd_pos (%s)\n", pb_share->sh_table_path->ps_path);
pb_open_tab->ot_curr_rec_id = mi_uint4korr((xtWord1 *) pos);
switch (xt_tab_dirty_read_record(pb_open_tab, (xtWord1 *) buf)) {
@@ -3509,7 +3833,7 @@ int ha_pbxt::info(uint flag)
XTOpenTablePtr ot;
int in_use;
- XT_TRACE_CALL();
+ XT_TRACE_METHOD();
if (!(in_use = pb_ex_in_use)) {
pb_ex_in_use = 1;
@@ -3535,7 +3859,7 @@ int ha_pbxt::info(uint flag)
stats.index_file_length = xt_ind_node_to_offset(ot->ot_table, ot->ot_table->tab_ind_eof);
stats.delete_length = ot->ot_table->tab_rec_fnum * ot->ot_rec_size;
//check_time = info.check_time;
- stats.mean_rec_length = ot->ot_rec_size;
+ stats.mean_rec_length = (ulong) ot->ot_rec_size;
}
if (flag & HA_STATUS_CONST) {
@@ -3551,7 +3875,9 @@ int ha_pbxt::info(uint flag)
stats.block_size = XT_INDEX_PAGE_SIZE;
if (share->tmp_table == NO_TMP_TABLE)
-#if MYSQL_VERSION_ID > 60005
+#ifdef DRIZZLED
+#define WHICH_MUTEX mutex
+#elif MYSQL_VERSION_ID >= 50404
#define WHICH_MUTEX LOCK_ha_data
#else
#define WHICH_MUTEX mutex
@@ -3559,19 +3885,15 @@ int ha_pbxt::info(uint flag)
#ifdef SAFE_MUTEX
-#if MYSQL_VERSION_ID < 60000
+#if MYSQL_VERSION_ID < 50404
#if MYSQL_VERSION_ID < 50123
safe_mutex_lock(&share->mutex,__FILE__,__LINE__);
#else
safe_mutex_lock(&share->mutex,0,__FILE__,__LINE__);
#endif
#else
-#if MYSQL_VERSION_ID < 60004
- safe_mutex_lock(&share->mutex,__FILE__,__LINE__);
-#else
safe_mutex_lock(&share->WHICH_MUTEX,0,__FILE__,__LINE__);
#endif
-#endif
#else // SAFE_MUTEX
@@ -3675,7 +3997,7 @@ int ha_pbxt::extra(enum ha_extra_functio
{
int err = 0;
- XT_PRINT2(xt_get_self(), "ha_pbxt::extra %s operation=%d\n", pb_share->sh_table_path->ps_path, operation);
+ XT_PRINT2(xt_get_self(), "ha_pbxt::extra (%s) operation=%d\n", pb_share->sh_table_path->ps_path, operation);
switch (operation) {
case HA_EXTRA_RESET_STATE:
@@ -3771,14 +4093,14 @@ int ha_pbxt::extra(enum ha_extra_functio
*/
int ha_pbxt::reset(void)
{
- XT_TRACE_CALL();
+ XT_TRACE_METHOD();
extra(HA_EXTRA_RESET_STATE);
XT_RETURN(0);
}
void ha_pbxt::unlock_row()
{
- XT_TRACE_CALL();
+ XT_TRACE_METHOD();
if (pb_open_tab)
pb_open_tab->ot_table->tab_locks.xt_remove_temp_lock(pb_open_tab, FALSE);
}
@@ -3802,7 +4124,7 @@ int ha_pbxt::delete_all_rows()
XTDDTable *tab_def = NULL;
char path[PATH_MAX];
- XT_TRACE_CALL();
+ XT_TRACE_METHOD();
if (thd_sql_command(thd) != SQLCOM_TRUNCATE) {
/* Just like InnoDB we only handle TRUNCATE TABLE
@@ -3906,7 +4228,7 @@ int ha_pbxt::analyze(THD *thd, HA_CHECK_
xtXactID clean_xn_id = 0;
uint cnt = 10;
- XT_TRACE_CALL();
+ XT_TRACE_METHOD();
if (!pb_open_tab) {
if ((err = reopen()))
@@ -4056,7 +4378,7 @@ xtPublic int ha_pbxt::external_lock(THD
ASSERT_NS(pb_ex_in_use);
*/
- XT_PRINT1(self, "ha_pbxt::EXTERNAL_LOCK %s lock_type=UNLOCK\n", pb_share->sh_table_path->ps_path);
+ XT_PRINT1(self, "EXTERNAL_LOCK (%s) lock_type=UNLOCK\n", pb_share->sh_table_path->ps_path);
/* Make any temporary locks on this table permanent.
*
@@ -4189,10 +4511,9 @@ xtPublic int ha_pbxt::external_lock(THD
xt_broadcast_cond_ns((xt_cond_type *) pb_share->sh_ex_cond);
}
else {
- XT_PRINT2(self, "ha_pbxt::EXTERNAL_LOCK %s lock_type=%d\n", pb_share->sh_table_path->ps_path, lock_type);
+ XT_PRINT2(self, "ha_pbxt::EXTERNAL_LOCK (%s) lock_type=%d\n", pb_share->sh_table_path->ps_path, lock_type);
if (pb_lock_table) {
-
pb_ex_in_use = 1;
try_(a) {
if (!pb_table_locked)
@@ -4243,14 +4564,18 @@ xtPublic int ha_pbxt::external_lock(THD
if ((pb_open_tab->ot_for_update = (lock_type == F_WRLCK))) {
switch ((int) thd_sql_command(thd)) {
case SQLCOM_DELETE:
+#ifndef DRIZZLED
case SQLCOM_DELETE_MULTI:
+#endif
/* turn DELETE IGNORE into normal DELETE. The IGNORE option causes problems because
* when a record is deleted we add an xlog record which we cannot "rollback" later
* when we find that an FK-constraint has failed.
*/
thd->lex->ignore = false;
case SQLCOM_UPDATE:
+#ifndef DRIZZLED
case SQLCOM_UPDATE_MULTI:
+#endif
case SQLCOM_REPLACE:
case SQLCOM_REPLACE_SELECT:
case SQLCOM_INSERT:
@@ -4265,7 +4590,9 @@ xtPublic int ha_pbxt::external_lock(THD
case SQLCOM_DROP_TABLE:
case SQLCOM_DROP_INDEX:
case SQLCOM_LOAD:
+#ifndef DRIZZLED
case SQLCOM_REPAIR:
+#endif
case SQLCOM_OPTIMIZE:
self->st_stat_modify = TRUE;
break;
@@ -4316,11 +4643,16 @@ xtPublic int ha_pbxt::external_lock(THD
self->st_xact_mode = thd_tx_isolation(thd) <= ISO_READ_COMMITTED ? XT_XACT_COMMITTED_READ : XT_XACT_REPEATABLE_READ;
self->st_ignore_fkeys = (thd_test_options(thd,OPTION_NO_FOREIGN_KEY_CHECKS)) != 0;
self->st_auto_commit = (thd_test_options(thd, (OPTION_NOT_AUTOCOMMIT | OPTION_BEGIN))) == 0;
+#ifdef DRIZZLED
+ self->st_table_trans = FALSE;
+#else
self->st_table_trans = thd_sql_command(thd) == SQLCOM_LOCK_TABLES;
+#endif
self->st_abort_trans = FALSE;
self->st_stat_ended = FALSE;
self->st_stat_trans = FALSE;
XT_PRINT0(self, "xt_xn_begin\n");
+ xt_xres_wait_for_recovery(self, XT_RECOVER_SWEPT);
if (!xt_xn_begin(self)) {
err = xt_ha_pbxt_thread_error_for_mysql(thd, self, pb_ignore_dup_key);
pb_ex_in_use = 0;
@@ -4404,7 +4736,7 @@ int ha_pbxt::start_stmt(THD *thd, thr_lo
if (!(self = ha_set_current_thread(thd, &err)))
return xt_ha_pbxt_to_mysql_error(err);
- XT_PRINT2(self, "ha_pbxt::start_stmt %s lock_type=%d\n", pb_share->sh_table_path->ps_path, (int) lock_type);
+ XT_PRINT2(self, "ha_pbxt::start_stmt (%s) lock_type=%d\n", pb_share->sh_table_path->ps_path, (int) lock_type);
if (!pb_open_tab) {
if ((err = reopen()))
@@ -4430,12 +4762,12 @@ int ha_pbxt::start_stmt(THD *thd, thr_lo
/* This section handles "auto-commit"... */
if (self->st_xact_data && self->st_auto_commit && self->st_table_trans) {
if (self->st_abort_trans) {
- XT_PRINT0(self, "xt_xn_rollback\n");
+ XT_PRINT0(self, "xt_xn_rollback in start_stmt\n");
if (!xt_xn_rollback(self))
err = xt_ha_pbxt_thread_error_for_mysql(pb_mysql_thd, self, pb_ignore_dup_key);
}
else {
- XT_PRINT0(self, "xt_xn_commit\n");
+ XT_PRINT0(self, "xt_xn_commit in start_stmt\n");
if (!xt_xn_commit(self))
err = xt_ha_pbxt_thread_error_for_mysql(pb_mysql_thd, self, pb_ignore_dup_key);
}
@@ -4466,9 +4798,11 @@ int ha_pbxt::start_stmt(THD *thd, thr_lo
if (pb_open_tab->ot_for_update) {
switch ((int) thd_sql_command(thd)) {
case SQLCOM_UPDATE:
- case SQLCOM_UPDATE_MULTI:
case SQLCOM_DELETE:
+#ifndef DRIZZLED
+ case SQLCOM_UPDATE_MULTI:
case SQLCOM_DELETE_MULTI:
+#endif
case SQLCOM_REPLACE:
case SQLCOM_REPLACE_SELECT:
case SQLCOM_INSERT:
@@ -4483,14 +4817,15 @@ int ha_pbxt::start_stmt(THD *thd, thr_lo
case SQLCOM_DROP_TABLE:
case SQLCOM_DROP_INDEX:
case SQLCOM_LOAD:
+#ifndef DRIZZLED
case SQLCOM_REPAIR:
+#endif
case SQLCOM_OPTIMIZE:
self->st_stat_modify = TRUE;
break;
}
}
-
/* (***) This is required at this level!
* No matter how often it is called, it is still the start of a
* statement. We need to make sure statements that are NOT mistaken
@@ -4516,6 +4851,7 @@ int ha_pbxt::start_stmt(THD *thd, thr_lo
self->st_stat_ended = FALSE;
self->st_stat_trans = FALSE;
XT_PRINT0(self, "xt_xn_begin\n");
+ xt_xres_wait_for_recovery(self, XT_RECOVER_SWEPT);
if (!xt_xn_begin(self)) {
err = xt_ha_pbxt_thread_error_for_mysql(thd, self, pb_ignore_dup_key);
goto complete;
@@ -4673,7 +5009,9 @@ THR_LOCK_DATA **ha_pbxt::store_lock(THD
*
*/
if ((lock_type >= TL_WRITE_CONCURRENT_INSERT && lock_type <= TL_WRITE) &&
+#ifndef DRIZZLED
!(thd_in_lock_tables(thd) && thd_sql_command(thd) == SQLCOM_LOCK_TABLES) &&
+#endif
!thd_tablespace_op(thd) &&
thd_sql_command(thd) != SQLCOM_TRUNCATE &&
thd_sql_command(thd) != SQLCOM_OPTIMIZE &&
@@ -4691,22 +5029,23 @@ THR_LOCK_DATA **ha_pbxt::store_lock(THD
* Stewart: removed SQLCOM_CALL, not sure of implications.
*/
- if (lock_type == TL_READ_NO_INSERT &&
- (!thd_in_lock_tables(thd)
+ if (lock_type == TL_READ_NO_INSERT
#ifndef DRIZZLED
+ && (!thd_in_lock_tables(thd)
|| thd_sql_command(thd) == SQLCOM_CALL
+ )
#endif
- ))
+ )
{
lock_type = TL_READ;
}
- XT_PRINT3(xt_get_self(), "ha_pbxt::store_lock %s %d->%d\n", pb_share->sh_table_path->ps_path, pb_lock.type, lock_type);
+ XT_PRINT3(xt_get_self(), "store_lock (%s) %d->%d\n", pb_share->sh_table_path->ps_path, pb_lock.type, lock_type);
pb_lock.type = lock_type;
}
#ifdef PBXT_HANDLER_TRACE
else {
- XT_PRINT3(xt_get_self(), "ha_pbxt::store_lock %s %d->%d (ignore/unlock)\n", pb_share->sh_table_path->ps_path, lock_type, lock_type);
+ XT_PRINT3(xt_get_self(), "store_lock (%s) %d->%d (ignore/unlock)\n", pb_share->sh_table_path->ps_path, lock_type, lock_type);
}
#endif
*to++= &pb_lock;
@@ -4723,15 +5062,23 @@ THR_LOCK_DATA **ha_pbxt::store_lock(THD
* during create if the table_flag HA_DROP_BEFORE_CREATE was specified for
* the storage engine.
*/
+#ifdef DRIZZLED
+int PBXTStorageEngine::doDropTable(Session &, std::string table_path_str)
+#else
int ha_pbxt::delete_table(const char *table_path)
+#endif
{
THD *thd = current_thd;
int err = 0;
XTThreadPtr self = NULL;
XTSharePtr share;
+#ifdef DRIZZLED
+ const char *table_path = table_path_str.c_str();
+#endif
+
STAT_TRACE(self, *thd_query(thd));
- XT_PRINT1(self, "ha_pbxt::delete_table %s\n", table_path);
+ XT_PRINT1(self, "delete_table (%s)\n", table_path);
if (XTSystemTableShare::isSystemTable(table_path))
return delete_system_table(table_path);
@@ -4795,7 +5142,7 @@ int ha_pbxt::delete_table(const char *ta
#endif
}
catch_(a) {
- err = xt_ha_pbxt_thread_error_for_mysql(thd, self, pb_ignore_dup_key);
+ err = xt_ha_pbxt_thread_error_for_mysql(thd, self, FALSE);
#ifdef DRIZZLED
if (err == HA_ERR_NO_SUCH_TABLE)
err = ENOENT;
@@ -4819,7 +5166,11 @@ int ha_pbxt::delete_table(const char *ta
return err;
}
+#ifdef DRIZZLED
+int PBXTStorageEngine::delete_system_table(const char *table_path)
+#else
int ha_pbxt::delete_system_table(const char *table_path)
+#endif
{
THD *thd = current_thd;
XTExceptionRec e;
@@ -4857,7 +5208,13 @@ int ha_pbxt::delete_system_table(const c
* This function can be used to move a table from one database to
* another.
*/
+#ifdef DRIZZLED
+int PBXTStorageEngine::doRenameTable(Session *,
+ const char *from,
+ const char *to)
+#else
int ha_pbxt::rename_table(const char *from, const char *to)
+#endif
{
THD *thd = current_thd;
int err = 0;
@@ -4865,15 +5222,13 @@ int ha_pbxt::rename_table(const char *fr
XTSharePtr share;
XTDatabaseHPtr to_db;
- XT_TRACE_CALL();
-
if (XTSystemTableShare::isSystemTable(from))
return rename_system_table(from, to);
if (!(self = ha_set_current_thread(thd, &err)))
return xt_ha_pbxt_to_mysql_error(err);
- XT_PRINT2(self, "ha_pbxt::rename_table %s -> %s\n", from, to);
+ XT_PRINT2(self, "rename_table (%s -> %s)\n", from, to);
#ifdef PBMS_ENABLED
PBMSResultRec result;
@@ -4929,7 +5284,7 @@ int ha_pbxt::rename_table(const char *fr
#endif
}
catch_(a) {
- err = xt_ha_pbxt_thread_error_for_mysql(thd, self, pb_ignore_dup_key);
+ err = xt_ha_pbxt_thread_error_for_mysql(thd, self, FALSE);
}
cont_(a);
@@ -4940,7 +5295,11 @@ int ha_pbxt::rename_table(const char *fr
XT_RETURN(err);
}
+#ifdef DRIZZLED
+int PBXTStorageEngine::rename_system_table(const char *XT_UNUSED(from), const char *XT_UNUSED(to))
+#else
int ha_pbxt::rename_system_table(const char *XT_UNUSED(from), const char *XT_UNUSED(to))
+#endif
{
return ER_NOT_SUPPORTED_YET;
}
@@ -5029,7 +5388,15 @@ ha_rows ha_pbxt::records_in_range(uint i
* Called from handle.cc by ha_create_table().
*/
+#ifdef DRIZZLED
+int PBXTStorageEngine::doCreateTable(Session *,
+ const char *table_path,
+ Table &table_arg,
+ HA_CREATE_INFO &create_info,
+ drizzled::message::Table &XT_UNUSED(proto))
+#else
int ha_pbxt::create(const char *table_path, TABLE *table_arg, HA_CREATE_INFO *create_info)
+#endif
{
THD *thd = current_thd;
int err = 0;
@@ -5037,37 +5404,61 @@ int ha_pbxt::create(const char *table_pa
XTDDTable *tab_def = NULL;
XTDictionaryRec dic;
- memset(&dic, 0, sizeof(dic));
+ if ((strcmp(table_path, "./pbxt/location") == 0) || (strcmp(table_path, "./pbxt/statistics") == 0))
+ return 0;
- XT_TRACE_CALL();
+ memset(&dic, 0, sizeof(dic));
if (!(self = ha_set_current_thread(thd, &err)))
return xt_ha_pbxt_to_mysql_error(err);
+#ifdef DRIZZLED
+ XT_PRINT2(self, "create (%s) %s\n", table_path, (create_info.options & HA_LEX_CREATE_TMP_TABLE) ? "temporary" : "");
+#else
+ XT_PRINT2(self, "create (%s) %s\n", table_path, (create_info->options & HA_LEX_CREATE_TMP_TABLE) ? "temporary" : "");
+#endif
STAT_TRACE(self, *thd_query(thd));
- XT_PRINT1(self, "ha_pbxt::create %s\n", table_path);
try_(a) {
xt_ha_open_database_of_table(self, (XTPathStrPtr) table_path);
+#ifdef DRIZZLED
+ for (uint i=0; i<TS(&table_arg)->keys; i++) {
+ if (table_arg.key_info[i].key_length > XT_INDEX_MAX_KEY_SIZE)
+ xt_throw_sulxterr(XT_CONTEXT, XT_ERR_KEY_TOO_LARGE, table_arg.key_info[i].name, (u_long) XT_INDEX_MAX_KEY_SIZE);
+ }
+#else
for (uint i=0; i<TS(table_arg)->keys; i++) {
if (table_arg->key_info[i].key_length > XT_INDEX_MAX_KEY_SIZE)
xt_throw_sulxterr(XT_CONTEXT, XT_ERR_KEY_TOO_LARGE, table_arg->key_info[i].name, (u_long) XT_INDEX_MAX_KEY_SIZE);
}
+#endif
/* ($) auto_increment_value will be zero if
* AUTO_INCREMENT is not used. Otherwise
* Query was ALTER TABLE ... AUTO_INCREMENT = x; or
* CREATE TABLE ... AUTO_INCREMENT = x;
*/
+#ifdef DRIZZLED
+ tab_def = xt_ri_create_table(self, true, (XTPathStrPtr) table_path, *thd_query(thd), myxt_create_table_from_table(self, &table_arg));
+ tab_def->checkForeignKeys(self, create_info.options & HA_LEX_CREATE_TMP_TABLE);
+#else
tab_def = xt_ri_create_table(self, true, (XTPathStrPtr) table_path, *thd_query(thd), myxt_create_table_from_table(self, table_arg));
tab_def->checkForeignKeys(self, create_info->options & HA_LEX_CREATE_TMP_TABLE);
+#endif
dic.dic_table = tab_def;
+#ifdef DRIZZLED
+ dic.dic_my_table = &table_arg;
+ dic.dic_tab_flags = (create_info.options & HA_LEX_CREATE_TMP_TABLE) ? XT_TAB_FLAGS_TEMP_TAB : 0;
+ dic.dic_min_auto_inc = (xtWord8) create_info.auto_increment_value; /* ($) */
+ dic.dic_def_ave_row_size = table_arg.s->getAvgRowLength();
+#else
dic.dic_my_table = table_arg;
dic.dic_tab_flags = (create_info->options & HA_LEX_CREATE_TMP_TABLE) ? XT_TAB_FLAGS_TEMP_TAB : 0;
dic.dic_min_auto_inc = (xtWord8) create_info->auto_increment_value; /* ($) */
dic.dic_def_ave_row_size = (xtWord8) table_arg->s->avg_row_length;
+#endif
myxt_setup_dictionary(self, &dic);
/*
@@ -5089,7 +5480,7 @@ int ha_pbxt::create(const char *table_pa
if (tab_def)
tab_def->finalize(self);
dic.dic_table = NULL;
- err = xt_ha_pbxt_thread_error_for_mysql(thd, self, pb_ignore_dup_key);
+ err = xt_ha_pbxt_thread_error_for_mysql(thd, self, FALSE);
}
cont_(a);
@@ -5161,7 +5552,7 @@ bool ha_pbxt::get_error_message(int XT_U
if (!self->t_exception.e_xt_err)
return FALSE;
- buf->copy(self->t_exception.e_err_msg, strlen(self->t_exception.e_err_msg), system_charset_info);
+ buf->copy(self->t_exception.e_err_msg, (uint32) strlen(self->t_exception.e_err_msg), system_charset_info);
return TRUE;
}
@@ -5421,16 +5812,31 @@ static MYSQL_SYSVAR_INT(sweeper_priority
#ifdef DRIZZLED
static MYSQL_SYSVAR_INT(max_threads, pbxt_max_threads,
- PLUGIN_VAR_OPCMDARG,
+ PLUGIN_VAR_OPCMDARG | PLUGIN_VAR_READONLY,
"The maximum number of threads used by PBXT",
NULL, NULL, 500, 20, 20000, 1);
#else
static MYSQL_SYSVAR_INT(max_threads, pbxt_max_threads,
- PLUGIN_VAR_OPCMDARG,
+ PLUGIN_VAR_OPCMDARG | PLUGIN_VAR_READONLY,
"The maximum number of threads used by PBXT, 0 = set according to MySQL max_connections.",
NULL, NULL, 0, 0, 20000, 1);
#endif
+#ifndef DEBUG
+static MYSQL_SYSVAR_BOOL(support_xa, pbxt_support_xa,
+ PLUGIN_VAR_OPCMDARG,
+ "Enable PBXT support for the XA two-phase commit, default is enabled",
+ NULL, NULL, TRUE);
+#else
+static MYSQL_SYSVAR_BOOL(support_xa, pbxt_support_xa,
+ PLUGIN_VAR_OPCMDARG,
+ "Enable PBXT support for the XA two-phase commit, default is disabled (due to assertion failure in MySQL)",
+ /* The problem is, in MySQL an assertion fails in debug mode:
+ * Assertion failed: (total_ha_2pc == (ulong) opt_bin_log+1), function ha_recover, file handler.cc, line 1557.
+ */
+ NULL, NULL, FALSE);
+#endif
+
static struct st_mysql_sys_var* pbxt_system_variables[] = {
MYSQL_SYSVAR(index_cache_size),
MYSQL_SYSVAR(record_cache_size),
@@ -5448,6 +5854,7 @@ static struct st_mysql_sys_var* pbxt_sys
MYSQL_SYSVAR(offline_log_function),
MYSQL_SYSVAR(sweeper_priority),
MYSQL_SYSVAR(max_threads),
+ MYSQL_SYSVAR(support_xa),
NULL
};
#endif
@@ -5494,8 +5901,8 @@ mysql_declare_plugin(pbxt)
"Paul McCullagh, PrimeBase Technologies GmbH",
"PBXT internal system statitics",
PLUGIN_LICENSE_GPL,
- pbxt_init_statitics, /* plugin init */
- pbxt_exit_statitics, /* plugin deinit */
+ pbxt_init_statistics, /* plugin init */
+ pbxt_exit_statistics, /* plugin deinit */
#ifndef DRIZZLED
0x0005,
#endif
=== modified file 'storage/pbxt/src/ha_pbxt.h'
--- a/storage/pbxt/src/ha_pbxt.h 2009-08-17 11:12:36 +0000
+++ b/storage/pbxt/src/ha_pbxt.h 2009-11-24 10:55:06 +0000
@@ -27,9 +27,9 @@
#ifdef DRIZZLED
#include <drizzled/common.h>
-#include <drizzled/handler.h>
-#include <drizzled/plugin/storage_engine.h>
#include <mysys/thr_lock.h>
+#include <drizzled/cursor.h>
+
#else
#include "mysql_priv.h"
#endif
@@ -53,17 +53,31 @@ class ha_pbxt;
#ifdef DRIZZLED
-class PBXTStorageEngine : public StorageEngine {
+class PBXTStorageEngine : public drizzled::plugin::StorageEngine
+{
+
+ int delete_system_table(const char *table_path);
+ int rename_system_table(const char * from, const char * to);
+
public:
PBXTStorageEngine(std::string name_arg)
- : StorageEngine(name_arg, HTON_NO_FLAGS) {}
+ : drizzled::plugin::StorageEngine(name_arg, HTON_NO_FLAGS) {}
+
+ void operator delete(void *) {}
+ void operator delete[] (void *) {}
/* override */ int close_connection(Session *);
/* override */ int commit(Session *, bool);
/* override */ int rollback(Session *, bool);
- /* override */ handler *create(TABLE_SHARE *, MEM_ROOT *);
+ /* override */ Cursor *create(TABLE_SHARE *, MEM_ROOT *);
/* override */ void drop_database(char *);
/* override */ bool show_status(Session *, stat_print_fn *, enum ha_stat_type);
+ /* override */ const char **bas_ext() const;
+ /* override */ int doCreateTable(Session *session, const char *table_name,
+ Table &table_arg, HA_CREATE_INFO
+ &create_info, drizzled::message::Table &proto);
+ /* override */ int doRenameTable(Session *, const char *from, const char *to);
+ /* override */ int doDropTable(Session &session, std::string table_path);
};
typedef PBXTStorageEngine handlerton;
@@ -139,9 +153,9 @@ class ha_pbxt: public handler
* don't implement this method unless you really have indexes.
*/
const char *index_type(uint inx) { (void) inx; return "BTREE"; }
-
+#ifndef DRIZZLED
const char **bas_ext() const;
-
+#endif
MX_UINT8_T table_cache_type();
/*
@@ -241,11 +255,13 @@ class ha_pbxt: public handler
int optimize(THD* thd, HA_CHECK_OPT* check_opt);
int check(THD* thd, HA_CHECK_OPT* check_opt);
ha_rows records_in_range(uint inx, key_range *min_key, key_range *max_key);
- int delete_table(const char *from);
+#ifndef DRIZZLED
int delete_system_table(const char *table_path);
- int rename_table(const char * from, const char * to);
+ int delete_table(const char *from);
int rename_system_table(const char * from, const char * to);
+ int rename_table(const char * from, const char * to);
int create(const char *name, TABLE *form, HA_CREATE_INFO *create_info); //required
+#endif
void update_create_info(HA_CREATE_INFO *create_info);
THR_LOCK_DATA **store_lock(THD *thd, THR_LOCK_DATA **to, enum thr_lock_type lock_type); //required
@@ -277,6 +293,7 @@ struct XTThread *xt_ha_thd_to_self(THD*
int xt_ha_pbxt_to_mysql_error(int xt_err);
int xt_ha_pbxt_thread_error_for_mysql(THD *thd, const XTThreadPtr self, int ignore_dup_key);
void xt_ha_all_threads_close_database(XTThreadPtr self, XTDatabase *db);
+void ha_set_auto_increment(XTOpenTablePtr ot, Field *nr);
/*
* These hooks are suppossed to only be used by InnoDB:
=== modified file 'storage/pbxt/src/ha_xtsys.h'
--- a/storage/pbxt/src/ha_xtsys.h 2009-08-17 11:12:36 +0000
+++ b/storage/pbxt/src/ha_xtsys.h 2009-11-24 10:55:06 +0000
@@ -30,8 +30,9 @@
#ifdef DRIZZLED
#include <drizzled/common.h>
-#include <drizzled/handler.h>
+#include <drizzled/handler_structs.h>
#include <drizzled/current_session.h>
+#include <drizzled/cursor.h>
#else
#include "mysql_priv.h"
#endif
=== modified file 'storage/pbxt/src/heap_xt.cc'
--- a/storage/pbxt/src/heap_xt.cc 2009-08-17 11:12:36 +0000
+++ b/storage/pbxt/src/heap_xt.cc 2009-11-24 10:55:06 +0000
@@ -73,7 +73,7 @@ xtPublic void xt_check_heap(XTThreadPtr
}
#ifdef DEBUG_MEMORY
-xtPublic void xt_mm_heap_reference(XTThreadPtr self, XTHeapPtr hp, u_int line, c_char *file)
+xtPublic void xt_mm_heap_reference(XTThreadPtr XT_UNUSED(self), XTHeapPtr hp, u_int line, c_char *file)
#else
xtPublic void xt_heap_reference(XTThreadPtr, XTHeapPtr hp)
#endif
=== modified file 'storage/pbxt/src/index_xt.cc'
--- a/storage/pbxt/src/index_xt.cc 2009-08-18 07:46:53 +0000
+++ b/storage/pbxt/src/index_xt.cc 2009-11-24 10:55:06 +0000
@@ -829,14 +829,25 @@ static void idx_next_branch_item(XTTable
result->sr_item.i_item_offset += result->sr_item.i_item_size + result->sr_item.i_node_ref_size;
bitem = branch->tb_data + result->sr_item.i_item_offset;
- if (ind->mi_fix_key)
- ilen = result->sr_item.i_item_size;
+ if (result->sr_item.i_item_offset < result->sr_item.i_total_size) {
+ if (ind->mi_fix_key)
+ ilen = result->sr_item.i_item_size;
+ else {
+ ilen = myxt_get_key_length(ind, bitem) + XT_RECORD_REF_SIZE;
+ result->sr_item.i_item_size = ilen;
+ }
+ xt_get_res_record_ref(bitem + ilen - XT_RECORD_REF_SIZE, result); /* (Only valid if i_item_offset < i_total_size) */
+ }
else {
- ilen = myxt_get_key_length(ind, bitem) + XT_RECORD_REF_SIZE;
- result->sr_item.i_item_size = ilen;
+ result->sr_item.i_item_size = 0;
+ result->sr_rec_id = 0;
+ result->sr_row_id = 0;
}
- xt_get_res_record_ref(bitem + ilen - XT_RECORD_REF_SIZE, result); /* (Only valid if i_item_offset < i_total_size) */
- result->sr_branch = IDX_GET_NODE_REF(tab, bitem, result->sr_item.i_node_ref_size);
+ if (result->sr_item.i_node_ref_size)
+ /* IDX_GET_NODE_REF() loads the branch reference to the LEFT of the item. */
+ result->sr_branch = IDX_GET_NODE_REF(tab, bitem, result->sr_item.i_node_ref_size);
+ else
+ result->sr_branch = 0;
}
xtPublic void xt_prev_branch_item_fix(XTTableHPtr XT_UNUSED(tab), XTIndexPtr XT_UNUSED(ind), XTIdxBranchDPtr branch, register XTIdxResultRec *result)
@@ -3987,7 +3998,7 @@ xtPublic xtBool xt_flush_indices(XTOpenT
* here.
*/
if (!(tab->tab_dic.dic_tab_flags & XT_TAB_FLAGS_TEMP_TAB)) {
- if (!xt_xlog_flush_log(ot->ot_thread))
+ if (!xt_xlog_flush_log(tab->tab_db, ot->ot_thread))
goto failed_2;
if (!il->il_flush(ot))
goto failed_2;
=== modified file 'storage/pbxt/src/lock_xt.cc'
--- a/storage/pbxt/src/lock_xt.cc 2009-08-17 11:12:36 +0000
+++ b/storage/pbxt/src/lock_xt.cc 2009-11-24 10:55:06 +0000
@@ -1246,7 +1246,7 @@ xtPublic void xt_spinlock_init(XTThreadP
(void) self;
spl->spl_lock = 0;
#ifdef XT_NO_ATOMICS
- xt_init_mutex(self, &spl->spl_mutex);
+ xt_init_mutex_with_autoname(self, &spl->spl_mutex);
#endif
#ifdef DEBUG
spl->spl_locker = 0;
=== modified file 'storage/pbxt/src/locklist_xt.h'
--- a/storage/pbxt/src/locklist_xt.h 2009-08-17 11:12:36 +0000
+++ b/storage/pbxt/src/locklist_xt.h 2009-11-27 15:37:02 +0000
@@ -24,11 +24,16 @@
#ifndef __xt_locklist_h__
#define __xt_locklist_h__
+/*
+ * XT_THREAD_LOCK_INFO and DEBUG_LOCKING code must be updated to avoid calls to xt_get_self() as it can be called before hton->slot is
+ * assigned by MySQL which is used by xt_get_self()
+ */
+
#ifdef DEBUG
-#define XT_THREAD_LOCK_INFO
+//#define XT_THREAD_LOCK_INFO
#ifndef XT_WIN
/* We need DEBUG_LOCKING in order to enable pthread function wrappers */
-#define DEBUG_LOCKING
+//#define DEBUG_LOCKING
#endif
#endif
=== modified file 'storage/pbxt/src/memory_xt.cc'
--- a/storage/pbxt/src/memory_xt.cc 2009-09-03 06:15:03 +0000
+++ b/storage/pbxt/src/memory_xt.cc 2009-11-25 15:06:47 +0000
@@ -34,7 +34,7 @@
#include "trace_xt.h"
#ifdef DEBUG
-//#define RECORD_MM
+#define RECORD_MM
#endif
#ifdef DEBUG
@@ -367,9 +367,8 @@ static long mm_find_pointer(void *ptr)
return(-1);
}
-static long mm_add_pointer(void *ptr, u_int id)
+static long mm_add_pointer(void *ptr, u_int XT_UNUSED(id))
{
-#pragma unused(id)
register int i, n, guess;
if (mm_nr_in_use == mm_total_allocated) {
=== modified file 'storage/pbxt/src/myxt_xt.cc'
--- a/storage/pbxt/src/myxt_xt.cc 2009-09-03 06:15:03 +0000
+++ b/storage/pbxt/src/myxt_xt.cc 2009-12-10 11:36:05 +0000
@@ -36,7 +36,7 @@
#include <drizzled/current_session.h>
#include <drizzled/sql_lex.h>
#include <drizzled/session.h>
-extern "C" struct charset_info_st *session_charset(Session *session);
+//extern "C" struct charset_info_st *session_charset(Session *session);
extern pthread_key_t THR_Session;
#else
#include "mysql_priv.h"
@@ -171,7 +171,9 @@ xtPublic u_int myxt_create_key_from_key(
for (u_int i=0; i<ind->mi_seg_count && (int) k_length > 0; i++, old += keyseg->length, keyseg++)
{
+#ifndef DRIZZLED
enum ha_base_keytype type = (enum ha_base_keytype) keyseg->type;
+#endif
u_int length = keyseg->length < k_length ? keyseg->length : k_length;
u_int char_length;
xtWord1 *pos;
@@ -192,14 +194,18 @@ xtPublic u_int myxt_create_key_from_key(
pos = old;
if (keyseg->flag & HA_SPACE_PACK) {
uchar *end = pos + length;
+#ifndef DRIZZLED
if (type != HA_KEYTYPE_NUM) {
+#endif
while (end > pos && end[-1] == ' ')
end--;
+#ifndef DRIZZLED
}
else {
while (pos < end && pos[0] == ' ')
pos++;
}
+#endif
k_length -= length;
length = (u_int) (end-pos);
FIX_LENGTH(cs, pos, length, char_length);
@@ -276,6 +282,7 @@ xtPublic u_int myxt_create_key_from_row(
char_length= ((cs && cs->mbmaxlen > 1) ? length/cs->mbmaxlen : length);
pos = record + keyseg->start;
+#ifndef DRIZZLED
if (type == HA_KEYTYPE_BIT)
{
if (keyseg->bit_length)
@@ -289,17 +296,22 @@ xtPublic u_int myxt_create_key_from_row(
key+= length;
continue;
}
+#endif
if (keyseg->flag & HA_SPACE_PACK)
{
end = pos + length;
+#ifndef DRIZZLED
if (type != HA_KEYTYPE_NUM) {
+#endif
while (end > pos && end[-1] == ' ')
end--;
+#ifndef DRIZZLED
}
else {
while (pos < end && pos[0] == ' ')
pos++;
}
+#endif
length = (u_int) (end-pos);
FIX_LENGTH(cs, pos, length, char_length);
store_key_length_inc(key,char_length);
@@ -333,6 +345,7 @@ xtPublic u_int myxt_create_key_from_row(
if (keyseg->flag & HA_SWAP_KEY)
{ /* Numerical column */
#ifdef HAVE_ISNAN
+#ifndef DRIZZLED
if (type == HA_KEYTYPE_FLOAT)
{
float nr;
@@ -345,7 +358,9 @@ xtPublic u_int myxt_create_key_from_row(
continue;
}
}
- else if (type == HA_KEYTYPE_DOUBLE) {
+ else
+#endif
+ if (type == HA_KEYTYPE_DOUBLE) {
double nr;
float8get(nr,pos);
@@ -414,6 +429,7 @@ xtPublic u_int myxt_create_foreign_key_f
char_length= ((cs && cs->mbmaxlen > 1) ? length/cs->mbmaxlen : length);
pos = record + keyseg->start;
+#ifndef DRIZZLED
if (type == HA_KEYTYPE_BIT)
{
if (keyseg->bit_length)
@@ -427,17 +443,22 @@ xtPublic u_int myxt_create_foreign_key_f
key+= length;
continue;
}
+#endif
if (keyseg->flag & HA_SPACE_PACK)
{
end = pos + length;
+#ifndef DRIZZLED
if (type != HA_KEYTYPE_NUM) {
+#endif
while (end > pos && end[-1] == ' ')
end--;
+#ifndef DRIZZLED
}
else {
while (pos < end && pos[0] == ' ')
pos++;
}
+#endif
length = (u_int) (end-pos);
FIX_LENGTH(cs, pos, length, char_length);
store_key_length_inc(key,char_length);
@@ -471,6 +492,7 @@ xtPublic u_int myxt_create_foreign_key_f
if (keyseg->flag & HA_SWAP_KEY)
{ /* Numerical column */
#ifdef HAVE_ISNAN
+#ifndef DRIZZLED
if (type == HA_KEYTYPE_FLOAT)
{
float nr;
@@ -483,7 +505,9 @@ xtPublic u_int myxt_create_foreign_key_f
continue;
}
}
- else if (type == HA_KEYTYPE_DOUBLE) {
+ else
+#endif
+ if (type == HA_KEYTYPE_DOUBLE) {
double nr;
float8get(nr,pos);
@@ -622,7 +646,6 @@ static char *mx_get_length_and_data(Fiel
case MYSQL_TYPE_SET:
case MYSQL_TYPE_GEOMETRY:
#else
- case DRIZZLE_TYPE_TINY:
case DRIZZLE_TYPE_LONG:
case DRIZZLE_TYPE_DOUBLE:
case DRIZZLE_TYPE_NULL:
@@ -740,7 +763,6 @@ static void mx_set_length_and_data(Field
case MYSQL_TYPE_SET:
case MYSQL_TYPE_GEOMETRY:
#else
- case DRIZZLE_TYPE_TINY:
case DRIZZLE_TYPE_LONG:
case DRIZZLE_TYPE_DOUBLE:
case DRIZZLE_TYPE_NULL:
@@ -825,6 +847,7 @@ xtPublic xtBool myxt_create_row_from_key
}
record[keyseg->null_pos] &= ~keyseg->null_bit;
}
+#ifndef DRIZZLED
if (keyseg->type == HA_KEYTYPE_BIT)
{
uint length = keyseg->length;
@@ -845,6 +868,7 @@ xtPublic xtBool myxt_create_row_from_key
key+= length;
continue;
}
+#endif
if (keyseg->flag & HA_SPACE_PACK)
{
uint length;
@@ -854,16 +878,20 @@ xtPublic xtBool myxt_create_row_from_key
goto err;
#endif
pos = record+keyseg->start;
+#ifndef DRIZZLED
if (keyseg->type != (int) HA_KEYTYPE_NUM)
{
+#endif
memcpy(pos,key,(size_t) length);
bfill(pos+length,keyseg->length-length,' ');
+#ifndef DRIZZLED
}
else
{
bfill(pos,keyseg->length-length,' ');
memcpy(pos+keyseg->length-length,key,(size_t) length);
}
+#endif
key+=length;
continue;
}
@@ -945,7 +973,7 @@ xtPublic xtBool myxt_create_row_from_key
static int my_compare_bin(uchar *a, uint a_length, uchar *b, uint b_length,
my_bool part_key, my_bool skip_end_space)
{
- uint length= min(a_length,b_length);
+ uint length= a_length < b_length ? a_length : b_length;
uchar *end= a+ length;
int flag;
@@ -1023,6 +1051,7 @@ xtPublic u_int myxt_get_key_length(XTInd
get_key_pack_length(seg_len, pack_len, key_data);
key_data += seg_len;
break;
+#ifndef DRIZZLED
case HA_KEYTYPE_NUM: {
/* Numeric key */
if (keyseg->flag & HA_SPACE_PACK)
@@ -1035,15 +1064,16 @@ xtPublic u_int myxt_get_key_length(XTInd
case HA_KEYTYPE_INT8:
case HA_KEYTYPE_SHORT_INT:
case HA_KEYTYPE_USHORT_INT:
+ case HA_KEYTYPE_INT24:
+ case HA_KEYTYPE_FLOAT:
+ case HA_KEYTYPE_BIT:
+#endif
case HA_KEYTYPE_LONG_INT:
case HA_KEYTYPE_ULONG_INT:
- case HA_KEYTYPE_INT24:
case HA_KEYTYPE_UINT24:
- case HA_KEYTYPE_FLOAT:
case HA_KEYTYPE_DOUBLE:
case HA_KEYTYPE_LONGLONG:
case HA_KEYTYPE_ULONGLONG:
- case HA_KEYTYPE_BIT:
key_data += keyseg->length;
break;
case HA_KEYTYPE_END:
@@ -1190,6 +1220,7 @@ xtPublic int myxt_compare_key(XTIndexPtr
b += b_length;
break;
}
+#ifndef DRIZZLED
case HA_KEYTYPE_INT8:
{
int i_1 = (int) *((signed char *) a);
@@ -1218,6 +1249,7 @@ xtPublic int myxt_compare_key(XTIndexPtr
b += keyseg->length;
break;
}
+#endif
case HA_KEYTYPE_LONG_INT: {
int32 l_1 = sint4korr(a);
int32 l_2 = sint4korr(b);
@@ -1236,6 +1268,7 @@ xtPublic int myxt_compare_key(XTIndexPtr
b += keyseg->length;
break;
}
+#ifndef DRIZZLED
case HA_KEYTYPE_INT24: {
int32 l_1 = sint3korr(a);
int32 l_2 = sint3korr(b);
@@ -1245,6 +1278,7 @@ xtPublic int myxt_compare_key(XTIndexPtr
b += keyseg->length;
break;
}
+#endif
case HA_KEYTYPE_UINT24: {
int32 l_1 = uint3korr(a);
int32 l_2 = uint3korr(b);
@@ -1254,6 +1288,7 @@ xtPublic int myxt_compare_key(XTIndexPtr
b += keyseg->length;
break;
}
+#ifndef DRIZZLED
case HA_KEYTYPE_FLOAT: {
float f_1, f_2;
@@ -1270,6 +1305,7 @@ xtPublic int myxt_compare_key(XTIndexPtr
b += keyseg->length;
break;
}
+#endif
case HA_KEYTYPE_DOUBLE: {
double d_1, d_2;
@@ -1286,6 +1322,7 @@ xtPublic int myxt_compare_key(XTIndexPtr
b += keyseg->length;
break;
}
+#ifndef DRIZZLED
case HA_KEYTYPE_NUM: {
/* Numeric key */
if (keyseg->flag & HA_SPACE_PACK) {
@@ -1339,6 +1376,7 @@ xtPublic int myxt_compare_key(XTIndexPtr
b += b_length;
break;
}
+#endif
#ifdef HAVE_LONG_LONG
case HA_KEYTYPE_LONGLONG: {
longlong ll_a = sint8korr(a);
@@ -1359,9 +1397,11 @@ xtPublic int myxt_compare_key(XTIndexPtr
break;
}
#endif
+#ifndef DRIZZLED
case HA_KEYTYPE_BIT:
/* TODO: What here? */
break;
+#endif
case HA_KEYTYPE_END: /* Ready */
goto end;
}
@@ -1410,16 +1450,19 @@ xtPublic u_int myxt_key_seg_length(XTInd
key_length = has_null + a_length + pack_len;
break;
}
+#ifndef DRIZZLED
case HA_KEYTYPE_INT8:
case HA_KEYTYPE_SHORT_INT:
case HA_KEYTYPE_USHORT_INT:
+ case HA_KEYTYPE_INT24:
+ case HA_KEYTYPE_FLOAT:
+#endif
case HA_KEYTYPE_LONG_INT:
case HA_KEYTYPE_ULONG_INT:
- case HA_KEYTYPE_INT24:
case HA_KEYTYPE_UINT24:
- case HA_KEYTYPE_FLOAT:
case HA_KEYTYPE_DOUBLE:
break;
+#ifndef DRIZZLED
case HA_KEYTYPE_NUM: {
/* Numeric key */
if (keyseg->flag & HA_SPACE_PACK) {
@@ -1428,14 +1471,17 @@ xtPublic u_int myxt_key_seg_length(XTInd
}
break;
}
+#endif
#ifdef HAVE_LONG_LONG
case HA_KEYTYPE_LONGLONG:
case HA_KEYTYPE_ULONGLONG:
break;
#endif
+#ifndef DRIZZLED
case HA_KEYTYPE_BIT:
/* TODO: What here? */
break;
+#endif
case HA_KEYTYPE_END: /* Ready */
break;
}
@@ -1486,7 +1532,7 @@ xtPublic xtWord4 myxt_store_row_length(X
return row_size;
}
-static xtWord4 mx_store_row(XTOpenTablePtr ot, xtWord4 row_size, char *rec_buff)
+xtPublic xtWord4 myxt_store_row_data(XTOpenTablePtr ot, xtWord4 row_size, char *rec_buff)
{
TABLE *table = ot->ot_table->tab_dic.dic_my_table;
char *sdata;
@@ -1614,8 +1660,9 @@ xtPublic size_t myxt_load_row_length(XTO
}
/* Unload from PBXT variable length format to the MySQL row format. */
-xtPublic xtBool myxt_load_row(XTOpenTablePtr ot, xtWord1 *source_buf, xtWord1 *dest_buff, u_int col_cnt)
+xtPublic xtWord4 myxt_load_row_data(XTOpenTablePtr ot, xtWord1 *source_buf, xtWord1 *dest_buff, u_int col_cnt)
{
+ xtWord1 *input_buf = source_buf;
TABLE *table;
xtWord4 len;
Field *curr_field;
@@ -1624,7 +1671,7 @@ xtPublic xtBool myxt_load_row(XTOpenTabl
if (!(table = ot->ot_table->tab_dic.dic_my_table)) {
xt_register_taberr(XT_REG_CONTEXT, XT_ERR_NO_DICTIONARY, ot->ot_table->tab_name);
- return FAILED;
+ return 0;
}
/* According to the InnoDB implementation:
@@ -1657,7 +1704,7 @@ xtPublic xtBool myxt_load_row(XTOpenTabl
default: // Length byte
if (*source_buf > 240) {
xt_register_xterr(XT_REG_CONTEXT, XT_ERR_BAD_RECORD_FORMAT);
- return FAILED;
+ return 0;
}
len = *source_buf;
source_buf++;
@@ -1671,7 +1718,12 @@ xtPublic xtBool myxt_load_row(XTOpenTabl
source_buf += len;
}
- return OK;
+ return (xtWord4) (source_buf - input_buf);
+}
+
+xtPublic xtBool myxt_load_row(XTOpenTablePtr ot, xtWord1 *source_buf, xtWord1 *dest_buff, u_int col_cnt)
+{
+ return myxt_load_row_data(ot, source_buf, dest_buff, col_cnt) != 0;
}
xtPublic xtBool myxt_find_column(XTOpenTablePtr ot, u_int *col_idx, const char *col_name)
@@ -1784,7 +1836,7 @@ xtPublic xtBool myxt_store_row(XTOpenTab
else {
xtWord4 row_size;
- if (!(row_size = mx_store_row(ot, XT_REC_EXT_HEADER_SIZE, rec_buff)))
+ if (!(row_size = myxt_store_row_data(ot, XT_REC_EXT_HEADER_SIZE, rec_buff)))
return FAILED;
if (row_size - XT_REC_FIX_EXT_HEADER_DIFF <= ot->ot_rec_size) {
rec_info->ri_fix_rec_buf = (XTTabRecFixDPtr) &ot->ot_row_wbuffer[XT_REC_FIX_EXT_HEADER_DIFF];
@@ -1951,7 +2003,7 @@ static TABLE *my_open_table(XTThreadPtr
#ifdef DRIZZLED
share->init(db_name, 0, name, path);
- if ((error = open_table_def(thd, share)) ||
+ if ((error = open_table_def(*thd, share)) ||
(error = open_table_from_share(thd, share, "", 0, (uint32_t) READ_ALL, 0, table, OTM_OPEN)))
{
xt_free(self, table);
@@ -1995,7 +2047,7 @@ static TABLE *my_open_table(XTThreadPtr
return NULL;
}
-#if MYSQL_VERSION_ID >= 60003
+#if MYSQL_VERSION_ID >= 50404
if ((error = open_table_from_share(thd, share, "", 0, (uint) READ_ALL, 0, table, OTM_OPEN)))
#else
if ((error = open_table_from_share(thd, share, "", 0, (uint) READ_ALL, 0, table, FALSE)))
@@ -2145,7 +2197,10 @@ static XTIndexPtr my_create_index(XTThre
if (options & HA_OPTION_PACK_KEYS ||
(index->flags & (HA_PACK_KEY | HA_BINARY_PACK_KEY | HA_SPACE_PACK_USED)))
{
- if (key_part->length > 8 && (type == HA_KEYTYPE_TEXT || type == HA_KEYTYPE_NUM ||
+ if (key_part->length > 8 && (type == HA_KEYTYPE_TEXT ||
+#ifndef DRIZZLED
+ type == HA_KEYTYPE_NUM ||
+#endif
(type == HA_KEYTYPE_BINARY && !field->zero_pack())))
{
/* No blobs here */
@@ -2213,8 +2268,12 @@ static XTIndexPtr my_create_index(XTThre
else if (field->type() == MYSQL_TYPE_ENUM) {
switch (seg->length) {
case 2:
+#ifdef DRIZZLED
+ ASSERT_NS(FALSE);
+#else
seg->type = HA_KEYTYPE_USHORT_INT;
break;
+#endif
case 3:
seg->type = HA_KEYTYPE_UINT24;
break;
@@ -2675,7 +2734,11 @@ xtPublic xtBool myxt_load_dictionary(XTT
if (!(my_tab = my_open_table(self, db, tab_path)))
return FAILED;
dic->dic_my_table = my_tab;
+#ifdef DRIZZLED
+ dic->dic_def_ave_row_size = (xtWord8) my_tab->s->getAvgRowLength();
+#else
dic->dic_def_ave_row_size = (xtWord8) my_tab->s->avg_row_length;
+#endif
myxt_setup_dictionary(self, dic);
dic->dic_keys = (XTIndexPtr *) xt_calloc(self, sizeof(XTIndexPtr) * TS(my_tab)->keys);
for (uint i=0; i<TS(my_tab)->keys; i++)
@@ -2805,8 +2868,10 @@ static void ha_create_dd_index(XTThreadP
static char *my_type_to_string(XTThreadPtr self, Field *field, TABLE *XT_UNUSED(my_tab))
{
- char buffer[MAX_FIELD_WIDTH + 400], *ptr;
+ char buffer[MAX_FIELD_WIDTH + 400];
+ const char *ptr;
String type((char *) buffer, sizeof(buffer), system_charset_info);
+ xtWord4 len;
/* GOTCHA:
* - Above sets the string length to the same as the buffer,
@@ -2817,10 +2882,17 @@ static char *my_type_to_string(XTThreadP
*/
type.length(0);
field->sql_type(type);
- ptr = type.c_ptr();
+ ptr = type.ptr();
+ len = type.length();
+
+ if (len >= sizeof(buffer))
+ len = sizeof(buffer)-1;
+
if (ptr != buffer)
- xt_strcpy(sizeof(buffer), buffer, ptr);
+ xt_strcpy(sizeof(buffer), buffer, ptr);
+ buffer[len] = 0;
+
if (field->has_charset()) {
/* Always include the charset so that we can compare types
* for FK/PK releations.
@@ -2877,6 +2949,10 @@ xtPublic XTDDTable *myxt_create_table_fr
xtPublic void myxt_static_convert_identifier(XTThreadPtr XT_UNUSED(self), MX_CHARSET_INFO *cs, char *from, char *to, size_t to_len)
{
+#ifdef DRIZZLED
+ ((void *)cs);
+ xt_strcpy(to_len, to, from);
+#else
uint errors;
/*
@@ -2888,11 +2964,16 @@ xtPublic void myxt_static_convert_identi
xt_strcpy(to_len, to, from);
else
strconvert(cs, from, &my_charset_utf8_general_ci, to, to_len, &errors);
+#endif
}
// cs == current_thd->charset()
xtPublic char *myxt_convert_identifier(XTThreadPtr self, MX_CHARSET_INFO *cs, char *from)
{
+#ifdef DRIZZLED
+ char *to = xt_dup_string(self, from);
+ ((void *)cs);
+#else
uint errors;
u_int len;
char *to;
@@ -2904,6 +2985,7 @@ xtPublic char *myxt_convert_identifier(X
to = (char *) xt_malloc(self, len);
strconvert(cs, from, &my_charset_utf8_general_ci, to, len, &errors);
}
+#endif
return to;
}
@@ -2954,9 +3036,9 @@ xtPublic MX_CHARSET_INFO *myxt_getcharse
THD *thd = current_thd;
if (thd)
- return thd_charset(thd);
+ return (MX_CHARSET_INFO *)thd_charset(thd);
}
- return &my_charset_utf8_general_ci;
+ return (MX_CHARSET_INFO *)&my_charset_utf8_general_ci;
}
xtPublic void *myxt_create_thread()
@@ -3011,12 +3093,26 @@ xtPublic void *myxt_create_thread()
return NULL;
}
- if (!(new_thd = new THD())) {
+ if (!(new_thd = new THD)) {
my_thread_end();
xt_register_error(XT_REG_CONTEXT, XT_ERR_MYSQL_ERROR, 0, "Unable to create MySQL thread (THD)");
return NULL;
}
+ /*
+ * If PBXT is the default storage engine, then creating any THD objects will add extra
+ * references to the PBXT plugin object. because the threads are created but PBXT
+ * this creates a self reference, and the reference count does not go to zero
+ * on shutdown.
+ *
+ * The server then issues a message that it is forcing shutdown of the plugin.
+ *
+ * However, the engine reference is not required by the THDs used by PBXT, so
+ * I just remove them here.
+ */
+ plugin_unlock(NULL, new_thd->variables.table_plugin);
+ new_thd->variables.table_plugin = NULL;
+
new_thd->thread_stack = (char *) &new_thd;
new_thd->store_globals();
lex_start(new_thd);
@@ -3051,7 +3147,7 @@ xtPublic void myxt_destroy_thread(void *
#else
close_thread_tables(thd);
#endif
-
+
delete thd;
/* Remember that we don't have a THD */
@@ -3128,11 +3224,12 @@ xtPublic int myxt_statistics_fill_table(
const char *stat_name;
u_llong stat_value;
XTStatisticsRec statistics;
+ XTDatabaseHPtr db = self->st_database;
xt_gather_statistics(&statistics);
for (u_int rec_id=0; !err && rec_id<XT_STAT_CURRENT_MAX; rec_id++) {
stat_name = xt_get_stat_meta_data(rec_id)->sm_name;
- stat_value = xt_get_statistic(&statistics, self->st_database, rec_id);
+ stat_value = xt_get_statistic(&statistics, db, rec_id);
col=0;
mx_put_u_llong(table, col++, rec_id+1);
@@ -3213,19 +3310,31 @@ static void myxt_bitmap_init(XTThreadPtr
my_bitmap_map *buf;
uint size_in_bytes = (((n_bits) + 31) / 32) * 4;
- buf = (my_bitmap_map *) xt_malloc(self, size_in_bytes);
+ buf = (my_bitmap_map *) xt_malloc(self, size_in_bytes);
+
+#ifdef DRIZZLED
+ map->init(buf, n_bits);
+#else
map->bitmap= buf;
map->n_bits= n_bits;
create_last_word_mask(map);
bitmap_clear_all(map);
+#endif
}
static void myxt_bitmap_free(XTThreadPtr self, MX_BITMAP *map)
{
+#ifdef DRIZZLED
+ my_bitmap_map *buf = map->getBitmap();
+ if (buf)
+ xt_free(self, buf);
+ map->setBitmap(NULL);
+#else
if (map->bitmap) {
xt_free(self, map->bitmap);
map->bitmap = NULL;
}
+#endif
}
/*
@@ -3269,3 +3378,29 @@ XTDDColumn *XTDDColumnFactory::createFro
return col;
}
+/*
+ * -----------------------------------------------------------------------
+ * utilities
+ */
+
+/*
+ * MySQL (not sure about Drizzle) first calls hton->init and then assigns the plugin a thread slot
+ * which is used by xt_get_self(). This is a problem as pbxt_init() starts a number of daemon threads
+ * which could try to use the slot before it is assigned. This code waits till slot is inited.
+ * We cannot directly check hton->slot as in some versions of MySQL it can be 0 before init which is a
+ * valid value.
+ */
+extern ulong total_ha;
+
+xtPublic void myxt_wait_pbxt_plugin_slot_assigned(XTThread *self)
+{
+#ifdef DRIZZLED
+ static LEX_STRING plugin_name = { C_STRING_WITH_LEN("PBXT") };
+
+ while (!self->t_quit && !Registry::singleton().find(&plugin_name))
+ xt_sleep_milli_second(1);
+#else
+ while(!self->t_quit && (pbxt_hton->slot >= total_ha))
+ xt_sleep_milli_second(1);
+#endif
+}
=== modified file 'storage/pbxt/src/myxt_xt.h'
--- a/storage/pbxt/src/myxt_xt.h 2009-09-03 06:15:03 +0000
+++ b/storage/pbxt/src/myxt_xt.h 2009-11-27 15:37:02 +0000
@@ -52,8 +52,10 @@ void myxt_set_default_row_from_key(XTOp
void myxt_print_key(XTIndexPtr ind, xtWord1 *key_value);
xtWord4 myxt_store_row_length(XTOpenTablePtr ot, char *rec_buff);
+xtWord4 myxt_store_row_data(XTOpenTablePtr ot, xtWord4 row_size, char *rec_buff);
xtBool myxt_store_row(XTOpenTablePtr ot, XTTabRecInfoPtr rec_info, char *rec_buff);
size_t myxt_load_row_length(XTOpenTablePtr ot, size_t buffer_size, xtWord1 *source_buf, u_int *ret_col_cnt);
+xtWord4 myxt_load_row_data(XTOpenTablePtr ot, xtWord1 *source_buf, xtWord1 *dest_buff, u_int col_cnt);
xtBool myxt_load_row(XTOpenTablePtr ot, xtWord1 *source_buf, xtWord1 *dest_buff, u_int col_cnt);
xtBool myxt_find_column(XTOpenTablePtr ot, u_int *col_idx, const char *col_name);
void myxt_get_column_name(XTOpenTablePtr ot, u_int col_idx, u_int len, char *col_name);
@@ -93,4 +95,6 @@ public:
static XTDDColumn *createFromMySQLField(XTThread *self, STRUCT_TABLE *, Field *);
};
+void myxt_wait_pbxt_plugin_slot_assigned(XTThread *self);
+
#endif
=== modified file 'storage/pbxt/src/pbms.h'
--- a/storage/pbxt/src/pbms.h 2009-08-18 07:46:53 +0000
+++ b/storage/pbxt/src/pbms.h 2009-11-24 10:55:06 +0000
@@ -344,16 +344,16 @@ public:
int couldBeURL(char *blob_url, int size)
{
if (blob_url && (size < PBMS_BLOB_URL_SIZE)) {
- char buffer[PBMS_BLOB_URL_SIZE+1];
- u_int32_t db_id = 0;
- u_int32_t tab_id = 0;
- u_int64_t blob_id = 0;
- u_int64_t blob_ref_id = 0;
- u_int64_t blob_size = 0;
- u_int32_t auth_code = 0;
- u_int32_t server_id = 0;
- char type, junk[5];
- int scanned;
+ char buffer[PBMS_BLOB_URL_SIZE+1];
+ unsigned long db_id = 0;
+ unsigned long tab_id = 0;
+ unsigned long long blob_id = 0;
+ unsigned long long blob_ref_id = 0;
+ unsigned long long blob_size = 0;
+ unsigned long auth_code = 0;
+ unsigned long server_id = 0;
+ char type, junk[5];
+ int scanned;
junk[0] = 0;
if (blob_url[size]) { // There is no guarantee that the URL will be null terminated.
@@ -364,12 +364,12 @@ public:
scanned = sscanf(blob_url, URL_FMT"%4s", &db_id, &type, &tab_id, &blob_id, &auth_code, &server_id, &blob_ref_id, &blob_size, junk);
if (scanned != 8) {// If junk is found at the end this will also result in an invalid URL.
- printf("Bad URL \"%s\": scanned = %d, junk: %d, %d, %d, %d\n", blob_url, scanned, junk[0], junk[1], junk[2], junk[3]);
+ printf("Bad URL \"%s\": scanned = %d, junk: %d, %d, %d, %d\n", blob_url, scanned, junk[0], junk[1], junk[2], junk[3]);
return 0;
}
if (junk[0] || (type != '~' && type != '_')) {
- printf("Bad URL \"%s\": scanned = %d, junk: %d, %d, %d, %d\n", blob_url, scanned, junk[0], junk[1], junk[2], junk[3]);
+ printf("Bad URL \"%s\": scanned = %d, junk: %d, %d, %d, %d\n", blob_url, scanned, junk[0], junk[1], junk[2], junk[3]);
return 0;
}
=== modified file 'storage/pbxt/src/pbms_enabled.cc'
--- a/storage/pbxt/src/pbms_enabled.cc 2009-10-06 15:16:01 +0000
+++ b/storage/pbxt/src/pbms_enabled.cc 2009-11-24 10:55:06 +0000
@@ -29,15 +29,10 @@
*
*/
-/*
- The following two lines backported by psergey. Remove them when we merge from PBXT again.
-*/
#include "xt_config.h"
-#ifdef PBMS_ENABLED
-#define PBMS_API pbms_enabled_api
+#ifdef PBMS_ENABLED
-#include "pbms_enabled.h"
#ifdef DRIZZLED
#include <sys/stat.h>
#include <drizzled/common_includes.h>
@@ -47,11 +42,15 @@
#include <mysql/plugin.h>
#define session_alloc(sess, size) thd_alloc(sess, size);
#define current_session current_thd
-#endif
+#endif
-#define GET_BLOB_FIELD(t, i) (Field_blob *)(t->field[t->s->blob_field[i]])
-#define DB_NAME(f) (f->table->s->db.str)
-#define TAB_NAME(f) (*(f->table_name))
+#define GET_BLOB_FIELD(t, i) (Field_blob *)(t->field[t->s->blob_field[i]])
+#define DB_NAME(f) (f->table->s->db.str)
+#define TAB_NAME(f) (*(f->table_name))
+
+#define PBMS_API pbms_enabled_api
+
+#include "pbms_enabled.h"
static PBMS_API pbms_api;
@@ -242,4 +241,4 @@ void pbms_completed(TABLE *table, bool o
return ;
}
-#endif
\ No newline at end of file
+#endif // PBMS_ENABLED
=== modified file 'storage/pbxt/src/pbms_enabled.h'
--- a/storage/pbxt/src/pbms_enabled.h 2009-08-18 07:46:53 +0000
+++ b/storage/pbxt/src/pbms_enabled.h 2009-11-24 10:55:06 +0000
@@ -35,13 +35,6 @@
#include "pbms.h"
-#ifdef DRIZZLED
-#include <drizzled/server_includes.h>
-#define TABLE Table
-#else
-#include <mysql_priv.h>
-#endif
-
/*
* pbms_initialize() should be called from the engines plugIn's 'init()' function.
* The engine_name is the name of your engine, "PBXT" or "InnoDB" for example.
=== modified file 'storage/pbxt/src/pthread_xt.cc'
--- a/storage/pbxt/src/pthread_xt.cc 2009-08-18 07:46:53 +0000
+++ b/storage/pbxt/src/pthread_xt.cc 2009-11-24 10:55:06 +0000
@@ -578,8 +578,8 @@ xtPublic int xt_p_mutex_unlock(xt_mutex_
xtPublic int xt_p_mutex_destroy(xt_mutex_type *mutex)
{
- ASSERT_NS(mutex->mu_init == 12345);
- mutex->mu_init = 89898;
+ //ASSERT_NS(mutex->mu_init == 12345);
+ mutex->mu_init = 11111;
#ifdef XT_THREAD_LOCK_INFO
xt_thread_lock_info_free(&mutex->mu_lock_info);
#endif
=== modified file 'storage/pbxt/src/restart_xt.cc'
--- a/storage/pbxt/src/restart_xt.cc 2009-09-03 06:15:03 +0000
+++ b/storage/pbxt/src/restart_xt.cc 2009-11-27 15:37:02 +0000
@@ -34,12 +34,16 @@
#include "ha_pbxt.h"
+#ifdef DRIZZLED
+#include <drizzled/data_home.h>
+using drizzled::plugin::Registry;
+#endif
+
#include "xactlog_xt.h"
#include "database_xt.h"
#include "util_xt.h"
#include "strutil_xt.h"
#include "filesys_xt.h"
-#include "restart_xt.h"
#include "myxt_xt.h"
#include "trace_xt.h"
@@ -57,13 +61,27 @@
//#define PRINTF xt_ftracef
//#define PRINTF xt_trace
-void xt_print_bytes(xtWord1 *buf, u_int len)
+/*
+ * -----------------------------------------------------------------------
+ * GLOBALS
+ */
+
+xtPublic int pbxt_recovery_state;
+
+/*
+ * -----------------------------------------------------------------------
+ * UTILITIES
+ */
+
+#ifdef TRACE_RECORD_DATA
+static void xt_print_bytes(xtWord1 *buf, u_int len)
{
for (u_int i=0; i<len; i++) {
PRINTF("%02x ", (u_int) *buf);
buf++;
}
}
+#endif
void xt_print_log_record(xtLogID log, xtLogOffset offset, XTXactLogBufferDPtr record)
{
@@ -252,7 +270,7 @@ void xt_print_log_record(xtLogID log, xt
rec_type = "DELETE";
break;
case XT_LOG_ENT_DELETE_FL:
- rec_type = "DELETE-FL-BG";
+ rec_type = "DELETE-FL";
break;
case XT_LOG_ENT_UPDATE_BG:
rec_type = "UPDATE-BG";
@@ -320,6 +338,11 @@ void xt_print_log_record(xtLogID log, xt
case XT_LOG_ENT_END_OF_LOG:
rec_type = "END OF LOG";
break;
+ case XT_LOG_ENT_PREPARE:
+ rec_type = "PREPARE";
+ xn_id = XT_GET_DISK_4(record->xp.xp_xact_id_4);
+ xn_set = TRUE;
+ break;
}
if (log)
@@ -327,6 +350,8 @@ void xt_print_log_record(xtLogID log, xt
PRINTF("%s ", rec_type);
if (type)
PRINTF("op=%lu tab=%lu %s=%lu ", (u_long) op_no, (u_long) tab_id, type, (u_long) rec_id);
+ else if (tab_id)
+ PRINTF("tab=%lu ", (u_long) tab_id);
if (row_id)
PRINTF("row=%lu ", (u_long) row_id);
if (log_id)
@@ -459,6 +484,7 @@ static xtBool xres_open_table(XTThreadPt
}
return OK;
}
+
ws->ws_tab_gone = tab_id;
return FAILED;
}
@@ -629,6 +655,7 @@ static void xres_apply_change(XTThreadPt
xtWord1 *rec_data = NULL;
XTTabRecFreeDPtr free_data;
+ ASSERT(ot->ot_thread == self);
if (tab->tab_dic.dic_key_count == 0)
check_index = FALSE;
@@ -1300,7 +1327,7 @@ static void xres_apply_operations(XTThre
* These operations are applied even though operations
* in sequence are missing.
*/
-xtBool xres_sync_operations(XTThreadPtr self, XTDatabaseHPtr db, XTWriterStatePtr ws)
+static xtBool xres_sync_operations(XTThreadPtr self, XTDatabaseHPtr db, XTWriterStatePtr ws)
{
u_int edx;
XTTableEntryPtr te_ptr;
@@ -1881,15 +1908,6 @@ xtBool XTXactRestart::xres_check_checksu
void XTXactRestart::xres_recover_progress(XTThreadPtr self, XTOpenFilePtr *of, int perc)
{
#ifdef XT_USE_GLOBAL_DB
- if (!perc) {
- char file_path[PATH_MAX];
-
- xt_strcpy(PATH_MAX, file_path, xres_db->db_main_path);
- xt_add_pbxt_file(PATH_MAX, file_path, "recovery-progress");
- *of = xt_open_file(self, file_path, XT_FS_CREATE | XT_FS_MAKE_PATH);
- xt_set_eof_file(self, *of, 0);
- }
-
if (perc > 100) {
char file_path[PATH_MAX];
@@ -1905,6 +1923,15 @@ void XTXactRestart::xres_recover_progres
else {
char number[40];
+ if (!*of) {
+ char file_path[PATH_MAX];
+
+ xt_strcpy(PATH_MAX, file_path, xres_db->db_main_path);
+ xt_add_pbxt_file(PATH_MAX, file_path, "recovery-progress");
+ *of = xt_open_file(self, file_path, XT_FS_CREATE | XT_FS_MAKE_PATH);
+ xt_set_eof_file(self, *of, 0);
+ }
+
sprintf(number, "%d", perc);
if (!xt_pwrite_file(*of, 0, strlen(number), number, &self->st_statistics.st_x, self))
xt_throw(self);
@@ -1927,10 +1954,11 @@ xtBool XTXactRestart::xres_restart(XTThr
off_t bytes_to_read;
volatile xtBool print_progress = FALSE;
volatile off_t perc_size = 0, next_goal = 0;
- int perc_complete = 1;
+ int perc_complete = 1, perc_to_write = 1;
XTOpenFilePtr progress_file = NULL;
xtBool min_ram_xn_id_set = FALSE;
u_int log_count;
+ time_t start_time;
memset(&ws, 0, sizeof(ws));
@@ -1955,12 +1983,11 @@ xtBool XTXactRestart::xres_restart(XTThr
/* Don't print anything about recovering an empty database: */
if (bytes_to_read != 0)
xt_logf(XT_NT_INFO, "PBXT: Recovering from %lu-%llu, bytes to read: %llu\n", (u_long) xres_cp_log_id, (u_llong) xres_cp_log_offset, (u_llong) bytes_to_read);
- if (bytes_to_read >= 10*1024*1024) {
- print_progress = TRUE;
- perc_size = bytes_to_read / 100;
- next_goal = perc_size;
- xres_recover_progress(self, &progress_file, 0);
- }
+
+ print_progress = FALSE;
+ start_time = time(NULL);
+ perc_size = bytes_to_read / 100;
+ next_goal = perc_size;
if (!db->db_xlog.xlog_seq_start(&ws.ws_seqread, xres_cp_log_id, xres_cp_log_offset, FALSE)) {
ok = FALSE;
@@ -1983,17 +2010,28 @@ xtBool XTXactRestart::xres_restart(XTThr
#ifdef PRINT_LOG_ON_RECOVERY
xt_print_log_record(ws.ws_seqread.xseq_rec_log_id, ws.ws_seqread.xseq_rec_log_offset, record);
#endif
- if (print_progress && bytes_read > next_goal) {
- if (((perc_complete - 1) % 25) == 0)
- xt_logf(XT_NT_INFO, "PBXT: ");
- if ((perc_complete % 25) == 0)
- xt_logf(XT_NT_INFO, "%2d\n", (int) perc_complete);
- else
- xt_logf(XT_NT_INFO, "%2d ", (int) perc_complete);
- xt_log_flush(self);
- xres_recover_progress(self, &progress_file, perc_complete);
- next_goal += perc_size;
- perc_complete++;
+ if (bytes_read >= next_goal) {
+ while (bytes_read >= next_goal) {
+ next_goal += perc_size;
+ perc_complete++;
+ }
+ if (!print_progress) {
+ if (time(NULL) - start_time > 2)
+ print_progress = TRUE;
+ }
+ if (print_progress) {
+ while (perc_to_write < perc_complete) {
+ if (((perc_to_write - 1) % 25) == 0)
+ xt_logf(XT_NT_INFO, "PBXT: ");
+ if ((perc_to_write % 25) == 0)
+ xt_logf(XT_NT_INFO, "%2d\n", (int) perc_to_write);
+ else
+ xt_logf(XT_NT_INFO, "%2d ", (int) perc_to_write);
+ xt_log_flush(self);
+ xres_recover_progress(self, &progress_file, perc_to_write);
+ perc_to_write++;
+ }
+ }
}
switch (record->xl.xl_status_1) {
case XT_LOG_ENT_HEADER:
@@ -2053,8 +2091,11 @@ xtBool XTXactRestart::xres_restart(XTThr
xact->xd_end_xn_id = xn_id;
xact->xd_flags |= XT_XN_XAC_ENDED | XT_XN_XAC_SWEEP;
xact->xd_flags &= ~XT_XN_XAC_RECOVERED; // We can expect an end record on cleanup!
+ xact->xd_flags &= ~XT_XN_XAC_PREPARED; // Prepared transactions cannot be swept!
if (record->xl.xl_status_1 == XT_LOG_ENT_COMMIT)
xact->xd_flags |= XT_XN_XAC_COMMITTED;
+ if (xt_sl_get_size(db->db_xn_xa_list) > 0)
+ xt_xn_delete_xa_data_by_xact(db, xn_id, self);
}
break;
case XT_LOG_ENT_CLEANUP:
@@ -2071,6 +2112,14 @@ xtBool XTXactRestart::xres_restart(XTThr
rec_log_id = XT_GET_DISK_4(record->xl.xl_log_id_4);
xt_dl_set_to_delete(self, db, rec_log_id);
break;
+ case XT_LOG_ENT_PREPARE:
+ xn_id = XT_GET_DISK_4(record->xp.xp_xact_id_4);
+ if ((xact = xt_xn_get_xact(db, xn_id, self))) {
+ xact->xd_flags |= XT_XN_XAC_PREPARED;
+ if (!xt_xn_store_xa_data(db, xn_id, record->xp.xp_xa_len_1, record->xp.xp_xa_data, self))
+ xt_throw(self);
+ }
+ break;
default:
xt_xres_apply_in_order(self, &ws, ws.ws_seqread.xseq_rec_log_id, ws.ws_seqread.xseq_rec_log_offset, record);
break;
@@ -2534,7 +2583,8 @@ static void xres_cp_main(XTThreadPtr sel
/* This condition means we could checkpoint: */
if (!(xt_sl_get_size(db->db_datalogs.dlc_to_delete) == 0 &&
xt_sl_get_size(db->db_datalogs.dlc_deleted) == 0 &&
- xt_comp_log_pos(log_id, log_offset, db->db_restart.xres_cp_log_id, db->db_restart.xres_cp_log_offset) <= 0))
+ xt_comp_log_pos(log_id, log_offset, db->db_restart.xres_cp_log_id, db->db_restart.xres_cp_log_offset) <= 0) &&
+ xt_sl_get_size(db->db_xn_xa_list) == 0)
break;
xres_cp_wait_for_log_writer(self, db, 400);
@@ -2654,7 +2704,7 @@ xtPublic xtBool xt_begin_checkpoint(XTDa
* until they are flushed.
*/
/* This is an alternative to the above.
- if (!xt_xlog_flush_log(self))
+ if (!xt_xlog_flush_log(db, self))
xt_throw(self);
*/
xt_lock_mutex_ns(&db->db_wr_lock);
@@ -2776,6 +2826,14 @@ xtPublic xtBool xt_end_checkpoint(XTData
size_t chk_size = 0;
u_int no_of_logs = 0;
+ /* As long as we have outstanding XA transactions, we may not checkpoint! */
+ if (xt_sl_get_size(db->db_xn_xa_list) > 0) {
+#ifdef DEBUG
+ printf("Checkpoint must wait\n");
+#endif
+ return OK;
+ }
+
#ifdef NEVER_CHECKPOINT
return OK;
#endif
@@ -3183,7 +3241,7 @@ xtPublic void xt_dump_xlogs(XTDatabaseHP
* D A T A B A S E R E C O V E R Y T H R E A D
*/
-extern XTDatabaseHPtr pbxt_database;
+
static XTThreadPtr xres_recovery_thread;
static void *xn_xres_run_recovery_thread(XTThreadPtr self)
@@ -3193,18 +3251,18 @@ static void *xn_xres_run_recovery_thread
if (!(mysql_thread = (THD *) myxt_create_thread()))
xt_throw(self);
- while (!xres_recovery_thread->t_quit && !ha_resolve_by_legacy_type(mysql_thread, DB_TYPE_PBXT))
- xt_sleep_milli_second(1);
+ myxt_wait_pbxt_plugin_slot_assigned(self);
if (!xres_recovery_thread->t_quit) {
- /* {GLOBAL-DB}
- * It can happen that something will just get in before this
- * thread and open/recover the database!
- */
- if (!pbxt_database) {
- try_(a) {
+ try_(a) {
+ /* {GLOBAL-DB}
+ * It can happen that something will just get in before this
+ * thread and open/recover the database!
+ */
+ if (!pbxt_database) {
xt_open_database(self, mysql_real_data_home, TRUE);
- /* This can be done at the same time by a foreground thread,
+ /* {GLOBAL-DB}
+ * This can be done at the same time as the recovery thread,
* strictly speaking I need a lock.
*/
if (!pbxt_database) {
@@ -3212,11 +3270,22 @@ static void *xn_xres_run_recovery_thread
xt_heap_reference(self, pbxt_database);
}
}
- catch_(a) {
- xt_log_and_clear_exception(self);
- }
- cont_(a);
+ else
+ xt_use_database(self, pbxt_database, XT_FOR_USER);
+
+ pbxt_recovery_state = XT_RECOVER_DONE;
+
+ /* {WAIT-FOR-SW-AFTER-RECOV}
+ * Moved to here...
+ */
+ xt_wait_for_sweeper(self, self->st_database, 0);
+
+ pbxt_recovery_state = XT_RECOVER_SWEPT;
}
+ catch_(a) {
+ xt_log_and_clear_exception(self);
+ }
+ cont_(a);
}
/*
@@ -3261,11 +3330,12 @@ xtPublic void xt_xres_start_database_rec
sprintf(name, "DB-RECOVERY-%s", xt_last_directory_of_path(mysql_real_data_home));
xt_remove_dir_char(name);
+ pbxt_recovery_state = XT_RECOVER_PENDING;
xres_recovery_thread = xt_create_daemon(self, name);
xt_run_thread(self, xres_recovery_thread, xn_xres_run_recovery_thread);
}
-xtPublic void xt_xres_wait_for_recovery(XTThreadPtr self)
+xtPublic void xt_xres_terminate_recovery(XTThreadPtr self)
{
XTThreadPtr thr_rec;
=== modified file 'storage/pbxt/src/restart_xt.h'
--- a/storage/pbxt/src/restart_xt.h 2009-09-03 06:15:03 +0000
+++ b/storage/pbxt/src/restart_xt.h 2009-11-24 10:55:06 +0000
@@ -37,6 +37,8 @@ struct XTOpenTable;
struct XTDatabase;
struct XTTable;
+extern int pbxt_recovery_state;
+
typedef struct XTWriterState {
struct XTDatabase *ws_db;
xtBool ws_in_recover;
@@ -132,6 +134,16 @@ void xt_print_log_record(xtLogID log, of
void xt_dump_xlogs(struct XTDatabase *db, xtLogID start_log);
void xt_xres_start_database_recovery(XTThreadPtr self);
-void xt_xres_wait_for_recovery(XTThreadPtr self);
+void xt_xres_terminate_recovery(XTThreadPtr self);
+
+#define XT_RECOVER_PENDING 0
+#define XT_RECOVER_DONE 1
+#define XT_RECOVER_SWEPT 2
+
+inline void xt_xres_wait_for_recovery(XTThreadPtr XT_UNUSED(self), int state)
+{
+ while (pbxt_recovery_state < state)
+ xt_sleep_milli_second(100);
+}
#endif
=== modified file 'storage/pbxt/src/strutil_xt.cc'
--- a/storage/pbxt/src/strutil_xt.cc 2009-10-26 11:35:42 +0000
+++ b/storage/pbxt/src/strutil_xt.cc 2009-11-24 10:55:06 +0000
@@ -21,8 +21,10 @@
* H&G2JCtL
*/
-#include "mysql_priv.h"
#include "xt_config.h"
+
+#include <stdio.h>
+#include <string.h>
#include <ctype.h>
#include "strutil_xt.h"
@@ -107,13 +109,17 @@ xtPublic void xt_2nd_last_name_of_path(s
*dest = 0;
return;
}
- /* If temporary file */
- if (!is_prefix(path, mysql_data_home) &&
+
+ /* {INVALID-OLD-TABLE-FIX}
+ * I have changed the implementation of
+ * this bug fix (see {INVALID-OLD-TABLE-FIX}).
+ if (!is_prefix(path, mysql_data_home) &&
!is_prefix(path, mysql_real_data_home))
{
*dest= 0;
return;
}
+ */
ptr = path + len - 1;
while (ptr != path && !XT_IS_DIR_CHAR(*ptr))
@@ -374,7 +380,7 @@ xtPublic void xt_int8_to_byte_size(xtInt
/* Version number must also be set in configure.in! */
xtPublic c_char *xt_get_version(void)
{
- return "1.0.08d RC";
+ return "1.0.09f RC";
}
/* Copy and URL decode! */
=== modified file 'storage/pbxt/src/systab_xt.cc'
--- a/storage/pbxt/src/systab_xt.cc 2009-08-17 11:12:36 +0000
+++ b/storage/pbxt/src/systab_xt.cc 2009-11-24 10:55:06 +0000
@@ -130,7 +130,7 @@ static int pbms_discover_handler(handler
* MYSQL UTILITIES
*/
-void xt_my_set_notnull_in_record(Field *field, char *record)
+static void xt_my_set_notnull_in_record(Field *field, char *record)
{
if (field->null_ptr)
record[(uint) (field->null_ptr - (uchar *) field->table->record[0])] &= (uchar) ~field->null_bit;
@@ -518,7 +518,7 @@ bool XTStatisticsTable::seqScanRead(xtWo
* SYSTEM TABLE SHARES
*/
-void st_path_to_table_name(size_t size, char *buffer, const char *path)
+static void st_path_to_table_name(size_t size, char *buffer, const char *path)
{
char *str;
=== modified file 'storage/pbxt/src/tabcache_xt.cc'
--- a/storage/pbxt/src/tabcache_xt.cc 2009-09-03 06:15:03 +0000
+++ b/storage/pbxt/src/tabcache_xt.cc 2009-11-27 15:37:02 +0000
@@ -590,7 +590,7 @@ xtBool XTTabCache::tc_fetch(XT_ROW_REC_F
* So there could be a deadlock if I don't flush the log!
*/
if ((self = xt_get_self())) {
- if (!xt_xlog_flush_log(self))
+ if (!xt_xlog_flush_log(tci_table->tab_db, self))
goto failed;
}
@@ -1150,6 +1150,8 @@ static void *tabc_fr_run_thread(XTThread
int count;
void *mysql_thread;
+ myxt_wait_pbxt_plugin_slot_assigned(self);
+
mysql_thread = myxt_create_thread();
while (!self->t_quit) {
=== modified file 'storage/pbxt/src/tabcache_xt.h'
--- a/storage/pbxt/src/tabcache_xt.h 2009-08-17 11:12:36 +0000
+++ b/storage/pbxt/src/tabcache_xt.h 2009-11-24 10:55:06 +0000
@@ -168,7 +168,7 @@ typedef struct XTTableSeq {
#define TAB_CAC_UNLOCK(i, o) xt_xsmutex_unlock(i, o)
#elif defined(TAB_CAC_USE_PTHREAD_RW)
#define TAB_CAC_LOCK_TYPE xt_rwlock_type
-#define TAB_CAC_INIT_LOCK(s, i) xt_init_rwlock(s, i)
+#define TAB_CAC_INIT_LOCK(s, i) xt_init_rwlock_with_autoname(s, i)
#define TAB_CAC_FREE_LOCK(s, i) xt_free_rwlock(i)
#define TAB_CAC_READ_LOCK(i, o) xt_slock_rwlock_ns(i)
#define TAB_CAC_WRITE_LOCK(i, o) xt_xlock_rwlock_ns(i)
=== modified file 'storage/pbxt/src/table_xt.cc'
--- a/storage/pbxt/src/table_xt.cc 2009-08-18 07:46:53 +0000
+++ b/storage/pbxt/src/table_xt.cc 2009-11-25 15:40:51 +0000
@@ -35,7 +35,6 @@
#include <drizzled/common.h>
#include <mysys/thr_lock.h>
#include <drizzled/dtcollation.h>
-#include <drizzled/plugin/storage_engine.h>
#else
#include "mysql_priv.h"
#endif
@@ -48,7 +47,6 @@
#include "cache_xt.h"
#include "trace_xt.h"
#include "index_xt.h"
-#include "restart_xt.h"
#include "systab_xt.h"
#ifdef DEBUG
@@ -2347,7 +2345,7 @@ xtPublic void xt_flush_table(XTThreadPtr
}
-xtPublic XTOpenTablePtr tab_open_table(XTTableHPtr tab)
+static XTOpenTablePtr tab_open_table(XTTableHPtr tab)
{
volatile XTOpenTablePtr ot;
XTThreadPtr self;
@@ -2588,7 +2586,7 @@ xtPublic xtBool xt_tab_put_log_op_rec_da
return FAILED;
}
- return xt_xlog_modify_table(ot, status, op_seq, free_rec_id, rec_id, size, buffer);
+ return xt_xlog_modify_table(tab->tab_id, status, op_seq, free_rec_id, rec_id, size, buffer, ot->ot_thread);
}
xtPublic xtBool xt_tab_put_log_rec_data(XTOpenTablePtr ot, u_int status, xtRecordID free_rec_id, xtRecordID rec_id, size_t size, xtWord1 *buffer, xtOpSeqNo *op_seq)
@@ -2606,7 +2604,7 @@ xtPublic xtBool xt_tab_put_log_rec_data(
return FAILED;
}
- return xt_xlog_modify_table(ot, status, *op_seq, free_rec_id, rec_id, size, buffer);
+ return xt_xlog_modify_table(tab->tab_id, status, *op_seq, free_rec_id, rec_id, size, buffer, ot->ot_thread);
}
xtPublic xtBool xt_tab_get_rec_data(XTOpenTablePtr ot, xtRecordID rec_id, size_t size, xtWord1 *buffer)
@@ -3541,7 +3539,7 @@ xtPublic xtBool xt_tab_free_row(XTOpenTa
tab->tab_row_fnum++;
xt_unlock_mutex_ns(&tab->tab_row_lock);
- if (!xt_xlog_modify_table(ot, XT_LOG_ENT_ROW_FREED, op_seq, 0, row_id, sizeof(XTTabRowRefDRec), (xtWord1 *) &free_row))
+ if (!xt_xlog_modify_table(tab->tab_id, XT_LOG_ENT_ROW_FREED, op_seq, 0, row_id, sizeof(XTTabRowRefDRec), (xtWord1 *) &free_row, ot->ot_thread))
return FAILED;
return OK;
@@ -3791,7 +3789,7 @@ xtPublic int xt_tab_remove_record(XTOpen
xt_unlock_mutex_ns(&tab->tab_rec_lock);
free_rec->rf_rec_type_1 = old_rec_type;
- return xt_xlog_modify_table(ot, XT_LOG_ENT_REC_REMOVED_BI, op_seq, (xtRecordID) new_rec_type, rec_id, rec_size, ot->ot_row_rbuffer);
+ return xt_xlog_modify_table(tab->tab_id, XT_LOG_ENT_REC_REMOVED_BI, op_seq, (xtRecordID) new_rec_type, rec_id, rec_size, ot->ot_row_rbuffer, ot->ot_thread);
}
static xtRowID tab_new_row(XTOpenTablePtr ot, XTTableHPtr tab)
@@ -3837,7 +3835,7 @@ static xtRowID tab_new_row(XTOpenTablePt
op_seq = tab->tab_seq.ts_get_op_seq();
xt_unlock_mutex_ns(&tab->tab_row_lock);
- if (!xt_xlog_modify_table(ot, status, op_seq, next_row_id, row_id, 0, NULL))
+ if (!xt_xlog_modify_table(tab->tab_id, status, op_seq, next_row_id, row_id, 0, NULL, ot->ot_thread))
return 0;
XT_DISABLED_TRACE(("new row tx=%d row=%d\n", (int) ot->ot_thread->st_xact_data->xd_start_xn_id, (int) row_id));
@@ -3868,7 +3866,7 @@ xtPublic xtBool xt_tab_set_row(XTOpenTab
if (!tab->tab_rows.xt_tc_write(ot->ot_row_file, row_id, 0, sizeof(XTTabRowRefDRec), (xtWord1 *) &row_buf, &op_seq, TRUE, ot->ot_thread))
return FAILED;
- return xt_xlog_modify_table(ot, status, op_seq, 0, row_id, sizeof(XTTabRowRefDRec), (xtWord1 *) &row_buf);
+ return xt_xlog_modify_table(tab->tab_id, status, op_seq, 0, row_id, sizeof(XTTabRowRefDRec), (xtWord1 *) &row_buf, ot->ot_thread);
}
xtPublic xtBool xt_tab_free_record(XTOpenTablePtr ot, u_int status, xtRecordID rec_id, xtBool clean_delete)
@@ -3937,7 +3935,7 @@ xtPublic xtBool xt_tab_free_record(XTOpe
tab->tab_rec_fnum++;
xt_unlock_mutex_ns(&tab->tab_rec_lock);
- if (!xt_xlog_modify_table(ot, status, op_seq, rec_id, rec_id, sizeof(XTactFreeRecEntryDRec) - offsetof(XTactFreeRecEntryDRec, fr_stat_id_1), &free_rec.fr_stat_id_1))
+ if (!xt_xlog_modify_table(tab->tab_id, status, op_seq, rec_id, rec_id, sizeof(XTactFreeRecEntryDRec) - offsetof(XTactFreeRecEntryDRec, fr_stat_id_1), &free_rec.fr_stat_id_1, ot->ot_thread))
return FAILED;
}
return OK;
@@ -4016,7 +4014,7 @@ static xtBool tab_add_record(XTOpenTable
}
xt_unlock_mutex_ns(&tab->tab_rec_lock);
- if (!xt_xlog_modify_table(ot, status, op_seq, next_rec_id, rec_id, rec_info->ri_rec_buf_size, (xtWord1 *) rec_info->ri_fix_rec_buf))
+ if (!xt_xlog_modify_table(tab->tab_id, status, op_seq, next_rec_id, rec_id, rec_info->ri_rec_buf_size, (xtWord1 *) rec_info->ri_fix_rec_buf, ot->ot_thread))
return FAILED;
if (rec_info->ri_ext_rec) {
@@ -4932,6 +4930,12 @@ xtPublic void xt_tab_seq_exit(XTOpenTabl
#endif
#endif
+xtPublic void xt_tab_seq_repeat(XTOpenTablePtr ot)
+{
+ ot->ot_seq_rec_id--;
+ ot->ot_seq_offset -= ot->ot_table->tab_dic.dic_rec_size;
+}
+
xtPublic xtBool xt_tab_seq_next(XTOpenTablePtr ot, xtWord1 *buffer, xtBool *eof)
{
register XTTableHPtr tab = ot->ot_table;
@@ -5094,7 +5098,7 @@ static xtBool tab_exec_repair_pending(XT
return FALSE;
}
else {
- if (!xt_open_file_ns(&of, file_path, XT_FS_DEFAULT))
+ if (!xt_open_file_ns(&of, file_path, XT_FS_DEFAULT | XT_FS_MISSING_OK))
return FALSE;
}
if (!of)
@@ -5190,15 +5194,76 @@ static xtBool tab_exec_repair_pending(XT
return FALSE;
}
-xtPublic void tab_make_table_name(XTTableHPtr tab, char *table_name, size_t size)
+static void tab_make_table_name(XTTableHPtr tab, char *table_name, size_t size)
{
- char name_buf[XT_IDENTIFIER_NAME_SIZE*3+3];
+ char *nptr;
- xt_2nd_last_name_of_path(sizeof(name_buf), name_buf, tab->tab_name->ps_path);
- myxt_static_convert_file_name(name_buf, table_name, size);
- xt_strcat(size, table_name, ".");
- myxt_static_convert_file_name(xt_last_name_of_path(tab->tab_name->ps_path), name_buf, sizeof(name_buf));
- xt_strcat(size, table_name, name_buf);
+ nptr = xt_last_name_of_path(tab->tab_name->ps_path);
+ if (xt_starts_with(nptr, "#sql")) {
+ /* {INVALID-OLD-TABLE-FIX}
+ * Temporary files can have strange paths, for example
+ * ..../var/tmp/mysqld.1/#sqldaec_1_6
+ * This occurs, for example, occurs when the temp_table.test is
+ * run using the PBXT suite in MariaDB:
+ * ./mtr --suite=pbxt --do-test=temp_table
+ *
+ * Calling myxt_static_convert_file_name, with a '.', in the name
+ * causes the error:
+ * [ERROR] Invalid (old?) table or database name 'mysqld.1'
+ * To prevent this, we do not convert the temporary
+ * table names using the mysql functions.
+ *
+ * Note, this bug was found by Monty, and fixed by modifying
+ * xt_2nd_last_name_of_path(), see {INVALID-OLD-TABLE-FIX}.
+ *
+ */
+ xt_2nd_last_name_of_path(size, table_name, tab->tab_name->ps_path);
+ xt_strcat(size, table_name, ".");
+ xt_strcat(size, table_name, nptr);
+ }
+ else {
+ char name_buf[XT_TABLE_NAME_SIZE*3+3];
+ char *part_ptr;
+ size_t len;
+
+ xt_2nd_last_name_of_path(sizeof(name_buf), name_buf, tab->tab_name->ps_path);
+ myxt_static_convert_file_name(name_buf, table_name, size);
+ xt_strcat(size, table_name, ".");
+
+ /* Handle partition extensions to table names: */
+ if ((part_ptr = strstr(nptr, "#P#")))
+ xt_strncpy(sizeof(name_buf), name_buf, nptr, part_ptr - nptr);
+ else
+ xt_strcpy(sizeof(name_buf), name_buf, nptr);
+
+ len = strlen(table_name);
+ myxt_static_convert_file_name(name_buf, table_name + len, size - len);
+
+ if (part_ptr) {
+ /* Add the partition extension (which is relevant to the engine). */
+ char *sub_part_ptr;
+
+ part_ptr += 3;
+ if ((sub_part_ptr = strstr(part_ptr, "#SP#")))
+ xt_strncpy(sizeof(name_buf), name_buf, part_ptr, sub_part_ptr - part_ptr);
+ else
+ xt_strcpy(sizeof(name_buf), name_buf, part_ptr);
+
+ xt_strcat(size, table_name, " (");
+ len = strlen(table_name);
+ myxt_static_convert_file_name(name_buf, table_name + len, size - len);
+
+ if (sub_part_ptr) {
+
+ sub_part_ptr += 4;
+ xt_strcat(size, table_name, " - ");
+ len = strlen(table_name);
+ myxt_static_convert_file_name(sub_part_ptr, table_name + len, size - len);
+ }
+
+ xt_strcat(size, table_name, ")");
+ }
+ }
}
xtPublic xtBool xt_tab_is_table_repair_pending(XTTableHPtr tab)
=== modified file 'storage/pbxt/src/table_xt.h'
--- a/storage/pbxt/src/table_xt.h 2009-08-18 07:46:53 +0000
+++ b/storage/pbxt/src/table_xt.h 2009-11-24 10:55:06 +0000
@@ -127,7 +127,7 @@ struct XTTablePath;
#define XT_TAB_ROW_UNLOCK(i, s) xt_xsmutex_unlock(i, (s)->t_id)
#elif defined(XT_TAB_ROW_USE_PTHREAD_RW)
#define XT_TAB_ROW_LOCK_TYPE xt_rwlock_type
-#define XT_TAB_ROW_INIT_LOCK(s, i) xt_init_rwlock(s, i)
+#define XT_TAB_ROW_INIT_LOCK(s, i) xt_init_rwlock_with_autoname(s, i)
#define XT_TAB_ROW_FREE_LOCK(s, i) xt_free_rwlock(i)
#define XT_TAB_ROW_READ_LOCK(i, s) xt_slock_rwlock_ns(i)
#define XT_TAB_ROW_WRITE_LOCK(i, s) xt_xlock_rwlock_ns(i)
@@ -528,13 +528,14 @@ xtBool xt_table_exists(struct XTDatab
void xt_enum_tables_init(u_int *edx);
XTTableEntryPtr xt_enum_tables_next(struct XTThread *self, struct XTDatabase *db, u_int *edx);
-void xt_enum_files_of_tables_init(struct XTDatabase *db, char *tab_name, xtTableID tab_id, XTFilesOfTablePtr ft);
+void xt_enum_files_of_tables_init(XTPathStrPtr tab_name, xtTableID tab_id, XTFilesOfTablePtr ft);
xtBool xt_enum_files_of_tables_next(XTFilesOfTablePtr ft);
xtBool xt_tab_seq_init(XTOpenTablePtr ot);
void xt_tab_seq_reset(XTOpenTablePtr ot);
void xt_tab_seq_exit(XTOpenTablePtr ot);
xtBool xt_tab_seq_next(XTOpenTablePtr ot, xtWord1 *buffer, xtBool *eof);
+void xt_tab_seq_repeat(XTOpenTablePtr ot);
xtBool xt_tab_new_record(XTOpenTablePtr ot, xtWord1 *buffer);
xtBool xt_tab_delete_record(XTOpenTablePtr ot, xtWord1 *buffer);
=== modified file 'storage/pbxt/src/thread_xt.cc'
--- a/storage/pbxt/src/thread_xt.cc 2009-10-06 15:16:01 +0000
+++ b/storage/pbxt/src/thread_xt.cc 2009-11-24 10:55:06 +0000
@@ -75,6 +75,13 @@ static xt_mutex_type thr_array_lock;
/* Global accumulated statistics: */
static XTStatisticsRec thr_statistics;
+#ifdef DEBUG
+static void break_in_assertion(c_char *expr, c_char *func, c_char *file, u_int line)
+{
+ printf("%s(%s:%d) %s\n", func, file, (int) line, expr);
+}
+#endif
+
/*
* -----------------------------------------------------------------------
* Error logging
@@ -658,6 +665,9 @@ static c_char *thr_get_err_string(int xt
case XT_ERR_FK_REF_TEMP_TABLE: str = "Foreign key may not reference temporary table"; break;
case XT_ERR_MYSQL_SHUTDOWN: str = "Cannot open table, MySQL has shutdown"; break;
case XT_ERR_MYSQL_NO_THREAD: str = "Cannot create thread, MySQL has shutdown"; break;
+ case XT_ERR_BUFFER_TOO_SMALL: str = "System backup buffer too small"; break;
+ case XT_ERR_BAD_BACKUP_FORMAT: str = "Unknown or corrupt backup format, restore aborted"; break;
+ case XT_ERR_PBXT_NOT_INSTALLED: str = "PBXT plugin is not installed"; break;
default: str = "Unknown XT error"; break;
}
return str;
@@ -862,6 +872,11 @@ xtPublic xtBool xt_exception_errno(XTExc
return FAILED;
}
+xtPublic void xt_exception_xterr(XTExceptionPtr e, XTThreadPtr self, c_char *func, c_char *file, u_int line, int xt_err)
+{
+ xt_exception_error(e, self, func, file, line, xt_err, 0, thr_get_err_string(xt_err));
+}
+
/*
* -----------------------------------------------------------------------
* LOG ERRORS
@@ -887,7 +902,7 @@ xtPublic xtBool xt_assert(XTThreadPtr se
#ifdef DEBUG
//xt_set_fflush(TRUE);
//xt_dump_trace();
- printf("%s(%s:%d) %s\n", func, file, (int) line, expr);
+ break_in_assertion(expr, func, file, line);
#ifdef CRASH_ON_ASSERT
abort();
#endif
=== modified file 'storage/pbxt/src/thread_xt.h'
--- a/storage/pbxt/src/thread_xt.h 2009-08-17 11:12:36 +0000
+++ b/storage/pbxt/src/thread_xt.h 2009-11-24 10:55:06 +0000
@@ -536,6 +536,8 @@ extern struct XTThread **xt_thr_array;
* Function prototypes
*/
+extern "C" void *thr_main(void *data);
+
void xt_get_now(char *buffer, size_t len);
xtBool xt_init_logging(void);
void xt_exit_logging(void);
@@ -583,6 +585,7 @@ void xt_register_xterr(c_char *func, c
void xt_exceptionf(XTExceptionPtr e, XTThreadPtr self, c_char *func, c_char *file, u_int line, int xt_err, int sys_err, c_char *fmt, ...);
void xt_exception_error(XTExceptionPtr e, XTThreadPtr self, c_char *func, c_char *file, u_int line, int xt_err, int sys_err, c_char *msg);
xtBool xt_exception_errno(XTExceptionPtr e, XTThreadPtr self, c_char *func, c_char *file, u_int line, int err);
+void xt_exception_xterr(XTExceptionPtr e, XTThreadPtr self, c_char *func, c_char *file, u_int line, int xt_err);
void xt_log_errno(XTThreadPtr self, c_char *func, c_char *file, u_int line, int err);
@@ -610,7 +613,7 @@ void xt_critical_wait(void);
void xt_yield(void);
void xt_sleep_milli_second(u_int t);
xtBool xt_suspend(XTThreadPtr self);
-xtBool xt_unsuspend(XTThreadPtr self, XTThreadPtr target);
+xtBool xt_unsuspend(XTThreadPtr target);
void xt_lock_thread(XTThreadPtr thread);
void xt_unlock_thread(XTThreadPtr thread);
xtBool xt_wait_thread(XTThreadPtr thread);
=== modified file 'storage/pbxt/src/util_xt.cc'
--- a/storage/pbxt/src/util_xt.cc 2009-08-17 11:12:36 +0000
+++ b/storage/pbxt/src/util_xt.cc 2009-11-24 10:55:06 +0000
@@ -150,6 +150,23 @@ xtPublic xtWord1 xt_get_checksum1(xtWord
return (xtWord1) (sum ^ (sum >> 24) ^ (sum >> 16) ^ (sum >> 8));
}
+xtPublic xtWord4 xt_get_checksum4(xtWord1 *data, size_t len)
+{
+ register xtWord4 sum = 0, g;
+ xtWord1 *chk;
+
+ chk = data + len - 1;
+ while (chk > data) {
+ sum = (sum << 4) + *chk;
+ if ((g = sum & 0xF0000000)) {
+ sum = sum ^ (g >> 24);
+ sum = sum ^ g;
+ }
+ chk--;
+ }
+ return sum;
+}
+
/*
* --------------- Data Buffer ------------------
*/
=== modified file 'storage/pbxt/src/util_xt.h'
--- a/storage/pbxt/src/util_xt.h 2009-03-26 12:18:01 +0000
+++ b/storage/pbxt/src/util_xt.h 2009-11-24 10:55:06 +0000
@@ -39,6 +39,7 @@ xtWord4 xt_file_name_to_id(char *file_na
xtBool xt_time_difference(register xtWord4 now, register xtWord4 then);
xtWord2 xt_get_checksum(xtWord1 *data, size_t len, u_int interval);
xtWord1 xt_get_checksum1(xtWord1 *data, size_t len);
+xtWord4 xt_get_checksum4(xtWord1 *data, size_t len);
typedef struct XTDataBuffer {
size_t db_size;
=== modified file 'storage/pbxt/src/xaction_xt.cc'
--- a/storage/pbxt/src/xaction_xt.cc 2009-09-03 06:15:03 +0000
+++ b/storage/pbxt/src/xaction_xt.cc 2009-11-24 10:55:06 +0000
@@ -1075,6 +1075,7 @@ xtPublic void xt_xn_init_db(XTThreadPtr
#endif
xt_spinlock_init_with_autoname(self, &db->db_xn_id_lock);
xt_spinlock_init_with_autoname(self, &db->db_xn_wait_spinlock);
+ xt_init_mutex_with_autoname(self, &db->db_xn_xa_lock);
//xt_init_mutex_with_autoname(self, &db->db_xn_wait_lock);
//xt_init_cond(self, &db->db_xn_wait_cond);
xt_init_mutex_with_autoname(self, &db->db_sw_lock);
@@ -1096,6 +1097,9 @@ xtPublic void xt_xn_init_db(XTThreadPtr
}
}
+ /* Create a sorted list for XA transactions recovered: */
+ db->db_xn_xa_list = xt_new_sortedlist(self, sizeof(XTXactXARec), 100, 50, xt_xn_xa_compare, db, NULL, FALSE, FALSE);
+
/* Initialize the data logs: */
db->db_datalogs.dlc_init(self, db);
@@ -1146,6 +1150,7 @@ xtPublic void xt_xn_exit_db(XTThreadPtr
printf("=========> MAX TXs NOT CLEAN: %lu\n", not_clean_max);
printf("=========> MAX TXs IN RAM: %lu\n", in_ram_max);
#endif
+ XTXactPreparePtr xap, xap_next;
xt_stop_sweeper(self, db); // Should be done already!
xt_stop_writer(self, db); // Should be done already!
@@ -1187,6 +1192,19 @@ xtPublic void xt_xn_exit_db(XTThreadPtr
xt_free_mutex(&db->db_sw_lock);
//xt_free_cond(&db->db_xn_wait_cond);
//xt_free_mutex(&db->db_xn_wait_lock);
+ xt_free_mutex(&db->db_xn_xa_lock);
+ for (u_int i=0; i<XT_XA_HASH_TAB_SIZE; i++) {
+ xap = db->db_xn_xa_table[i];
+ while (xap) {
+ xap_next = xap->xp_next;
+ xt_free(self, xap);
+ xap = xap_next;
+ }
+ }
+ if (db->db_xn_xa_list) {
+ xt_free_sortedlist(self, db->db_xn_xa_list);
+ db->db_xn_xa_list = NULL;
+ }
xt_spinlock_free(self, &db->db_xn_wait_spinlock);
xt_spinlock_free(self, &db->db_xn_id_lock);
#ifdef DEBUG_RAM_LIST
@@ -1428,7 +1446,7 @@ static xtBool xn_end_xact(XTThreadPtr th
wait_xn_id = thread->st_prev_xact[thread->st_last_xact];
thread->st_prev_xact[thread->st_last_xact] = xn_id;
/* This works because XT_MAX_XACT_BEHIND == 2! */
- ASSERT_NS((thread->st_last_xact + 1) % XT_MAX_XACT_BEHIND == thread->st_last_xact ^ 1);
+ ASSERT_NS((thread->st_last_xact + 1) % XT_MAX_XACT_BEHIND == (thread->st_last_xact ^ 1));
thread->st_last_xact ^= 1;
while (xt_xn_is_before(db->db_xn_to_clean_id, wait_xn_id) && (db->db_sw_faster & XT_SW_TOO_FAR_BEHIND)) {
xt_critical_wait();
@@ -1548,6 +1566,149 @@ xtPublic int xt_xn_status(XTOpenTablePtr
return XT_XN_ABORTED;
}
+/* ----------------------------------------------------------------------
+ * XA Functionality
+ */
+
+xtPublic int xt_xn_xa_compare(XTThreadPtr XT_UNUSED(self), register const void *XT_UNUSED(thunk), register const void *a, register const void *b)
+{
+ xtXactID *x = (xtXactID *) a;
+ XTXactXAPtr y = (XTXactXAPtr) b;
+
+ if (*x == y->xx_xact_id)
+ return 0;
+ if (xt_xn_is_before(*x, y->xx_xact_id))
+ return -1;
+ return 1;
+}
+
+xtPublic xtBool xt_xn_prepare(int len, xtWord1 *xa_data, XTThreadPtr thread)
+{
+ XTXactDataPtr xact;
+
+ ASSERT_NS(thread->st_xact_data);
+ if ((xact = thread->st_xact_data)) {
+ xtXactID xn_id = xact->xd_start_xn_id;
+
+ /* Only makes sense if the transaction has already been logged: */
+ if ((thread->st_xact_data->xd_flags & XT_XN_XAC_LOGGED)) {
+ if (!xt_xlog_modify_table(0, XT_LOG_ENT_PREPARE, xn_id, 0, 0, len, xa_data, thread))
+ return FAILED;
+ }
+ }
+ return OK;
+}
+
+xtPublic xtBool xt_xn_store_xa_data(XTDatabaseHPtr db, xtXactID xact_id, int len, xtWord1 *xa_data, XTThreadPtr XT_UNUSED(thread))
+{
+ XTXactPreparePtr xap;
+ u_int idx;
+ XTXactXARec xx;
+
+ if (!(xap = (XTXactPreparePtr) xt_malloc_ns(offsetof(XTXactPrepareRec, xp_xa_data) + len)))
+ return FAILED;
+ xap->xp_xact_id = xact_id;
+ xap->xp_hash = xt_get_checksum4(xa_data, len);
+ xap->xp_data_len = len;
+ memcpy(xap->xp_xa_data, xa_data, len);
+ xx.xx_xact_id = xact_id;
+ xx.xx_xa_ptr = xap;
+
+ idx = xap->xp_hash % XT_XA_HASH_TAB_SIZE;
+ xt_lock_mutex_ns(&db->db_xn_xa_lock);
+ if (!xt_sl_insert(NULL, db->db_xn_xa_list, &xact_id, &xx)) {
+ xt_unlock_mutex_ns(&db->db_xn_xa_lock);
+ xt_free_ns(xap);
+ }
+ xap->xp_next = db->db_xn_xa_table[idx];
+ db->db_xn_xa_table[idx] = xap;
+ xt_unlock_mutex_ns(&db->db_xn_xa_lock);
+ return OK;
+}
+
+xtPublic void xt_xn_delete_xa_data_by_xact(XTDatabaseHPtr db, xtXactID xact_id, XTThreadPtr thread)
+{
+ XTXactXAPtr xx;
+
+ xt_lock_mutex_ns(&db->db_xn_xa_lock);
+ if (!(xx = (XTXactXAPtr) xt_sl_find(NULL, db->db_xn_xa_list, &xact_id)))
+ return;
+ xt_xn_delete_xa_data(db, xx->xx_xa_ptr, TRUE, thread);
+}
+
+xtPublic void xt_xn_delete_xa_data(XTDatabaseHPtr db, XTXactPreparePtr xap, xtBool unlock, XTThreadPtr XT_UNUSED(thread))
+{
+ u_int idx;
+ XTXactPreparePtr xap_ptr, xap_pptr = NULL;
+
+ xt_sl_delete(NULL, db->db_xn_xa_list, &xap->xp_xact_id);
+ idx = xap->xp_hash % XT_XA_HASH_TAB_SIZE;
+ xap_ptr = db->db_xn_xa_table[idx];
+ while (xap_ptr) {
+ if (xap_ptr == xap)
+ break;
+ xap_pptr = xap_ptr;
+ xap_ptr = xap_ptr->xp_next;
+ }
+ if (xap_ptr) {
+ if (xap_pptr)
+ xap_pptr->xp_next = xap_ptr->xp_next;
+ else
+ db->db_xn_xa_table[idx] = xap_ptr->xp_next;
+ xt_free_ns(xap);
+ }
+ if (unlock)
+ xt_unlock_mutex_ns(&db->db_xn_xa_lock);
+}
+
+xtPublic XTXactPreparePtr xt_xn_find_xa_data(XTDatabaseHPtr db, int len, xtWord1 *xa_data, xtBool lock, XTThreadPtr XT_UNUSED(thread))
+{
+ xtWord4 hash;
+ XTXactPreparePtr xap;
+ u_int idx;
+
+ if (lock)
+ xt_lock_mutex_ns(&db->db_xn_xa_lock);
+ hash = xt_get_checksum4(xa_data, len);
+ idx = hash % XT_XA_HASH_TAB_SIZE;
+ xap = db->db_xn_xa_table[idx];
+ while (xap) {
+ if (xap->xp_hash == hash &&
+ xap->xp_data_len == len &&
+ memcmp(xap->xp_xa_data, xa_data, len) == 0) {
+ break;
+ }
+ xap = xap->xp_next;
+ }
+
+ return xap;
+}
+
+xtPublic XTXactPreparePtr xt_xn_enum_xa_data(XTDatabaseHPtr db, XTXactEnumXAPtr exa)
+{
+ XTXactXAPtr xx;
+
+ if (!exa->exa_locked) {
+ xt_lock_mutex_ns(&db->db_xn_xa_lock);
+ exa->exa_locked = TRUE;
+ }
+
+ if ((xx = (XTXactXAPtr) xt_sl_item_at(db->db_xn_xa_list, exa->exa_index))) {
+ exa->exa_index++;
+ return xx->xx_xa_ptr;
+ }
+
+ if (exa->exa_locked) {
+ exa->exa_locked = FALSE;
+ xt_unlock_mutex_ns(&db->db_xn_xa_lock);
+ }
+ return NULL;
+}
+
+/* ----------------------------------------------------------------------
+ * S W E E P E R F U N C T I O N S
+ */
+
xtPublic xtWord8 xt_xn_bytes_to_sweep(XTDatabaseHPtr db, XTThreadPtr thread)
{
xtXactID xn_id;
@@ -2047,7 +2208,7 @@ static xtBool xn_sw_cleanup_variation(XT
if(!tab->tab_recs.xt_tc_write_cond(self, ot->ot_rec_file, rec_id, rec_head.tr_rec_type_1, &op_seq, xn_id, row_id, stat_id, rec_type))
/* this means record was not updated by xt_tc_write_bor and doesn't need to */
break;
- if (!xt_xlog_modify_table(ot, XT_LOG_ENT_REC_CLEANED_1, op_seq, 0, rec_id, 1, &rec_head.tr_rec_type_1))
+ if (!xt_xlog_modify_table(tab->tab_id, XT_LOG_ENT_REC_CLEANED_1, op_seq, 0, rec_id, 1, &rec_head.tr_rec_type_1, self))
throw_();
xn_sw_clean_indices(self, ot, rec_id, row_id, rec_buf, ss->ss_databuf.db_data);
break;
@@ -2397,8 +2558,10 @@ static void xn_sw_main(XTThreadPtr self)
if ((xact = xt_xn_get_xact(db, db->db_xn_to_clean_id, self))) {
xtXactID xn_id;
- if (!(xact->xd_flags & XT_XN_XAC_SWEEP))
- /* Transaction has not yet ending, and ready to sweep. */
+ /* The sweep flag is set when the transaction is ready for sweeping.
+ * Prepared transactions may not be swept!
+ */
+ if (!(xact->xd_flags & XT_XN_XAC_SWEEP) || (xact->xd_flags & XT_XN_XAC_PREPARED))
goto sleep;
/* Check if we can cleanup the transaction.
@@ -2493,7 +2656,7 @@ static void xn_sw_main(XTThreadPtr self)
* we flush the log.
*/
if (now >= idle_start + 2) {
- if (!xt_xlog_flush_log(self))
+ if (!xt_xlog_flush_log(db, self))
xt_throw(self);
ss->ss_flush_pending = FALSE;
}
@@ -2516,7 +2679,7 @@ static void xn_sw_main(XTThreadPtr self)
}
if (ss->ss_flush_pending) {
- xt_xlog_flush_log(self);
+ xt_xlog_flush_log(db, self);
ss->ss_flush_pending = FALSE;
}
=== modified file 'storage/pbxt/src/xaction_xt.h'
--- a/storage/pbxt/src/xaction_xt.h 2009-08-17 11:12:36 +0000
+++ b/storage/pbxt/src/xaction_xt.h 2009-11-24 10:55:06 +0000
@@ -87,6 +87,7 @@ struct XTOpenTable;
#define XT_XN_XAC_CLEANED 8 /* The transaction has been cleaned. */
#define XT_XN_XAC_RECOVERED 16 /* This transaction was detected on recovery. */
#define XT_XN_XAC_SWEEP 32 /* End ID has been set, OK to sweep. */
+#define XT_XN_XAC_PREPARED 64 /* The transaction was prepared (used only by recovery). */
#define XT_XN_VISIBLE 0 /* The transaction is committed, and the record is visible. */
#define XT_XN_NOT_VISIBLE 1 /* The transaction is committed, but not visible. */
@@ -95,6 +96,24 @@ struct XTOpenTable;
#define XT_XN_OTHER_UPDATE 4 /* The record was updated by someone else. */
#define XT_XN_REREAD 5 /* The transaction is not longer in RAM, status is unkown, retry. */
+typedef struct XTXactPrepare {
+ xtXactID xp_xact_id;
+ xtWord4 xp_hash;
+ struct XTXactPrepare *xp_next; /* Next item in hash table. */
+ int xp_data_len;
+ xtWord1 xp_xa_data[XT_MAX_XA_DATA_SIZE];
+} XTXactPrepareRec, *XTXactPreparePtr;
+
+typedef struct XTXactXA {
+ xtXactID xx_xact_id;
+ XTXactPreparePtr xx_xa_ptr;
+} XTXactXARec, *XTXactXAPtr;
+
+typedef struct XTXactEnumXA {
+ u_int exa_index;
+ xtBool exa_locked;
+} XTXactEnumXARec, *XTXactEnumXAPtr;
+
typedef struct XTXactData {
xtXactID xd_start_xn_id; /* Note: may be zero!. */
xtXactID xd_end_xn_id; /* Note: may be zero!. */
@@ -105,6 +124,7 @@ typedef struct XTXactData {
int xd_flags;
xtWord4 xd_end_time;
xtThreadID xd_thread_id;
+ xtWord4 xd_xa_hash; /* 0 if no XA transaction. */
/* A transaction may be indexed twice in the hash table.
* Once on the start sequence number, and once on the
@@ -123,7 +143,7 @@ typedef struct XTXactData {
#if defined(XT_XACT_USE_PTHREAD_RW)
#define XT_XACT_LOCK_TYPE xt_rwlock_type
-#define XT_XACT_INIT_LOCK(s, i) xt_init_rwlock(s, i)
+#define XT_XACT_INIT_LOCK(s, i) xt_init_rwlock_with_autoname(s, i)
#define XT_XACT_FREE_LOCK(s, i) xt_free_rwlock(i)
#define XT_XACT_READ_LOCK(i, s) xt_slock_rwlock_ns(i)
#define XT_XACT_WRITE_LOCK(i, s) xt_xlock_rwlock_ns(i)
@@ -183,6 +203,14 @@ void xt_xn_wakeup_thread(xtThreadID th
xtXactID xt_xn_get_curr_id(struct XTDatabase *db);
xtWord8 xt_xn_bytes_to_sweep(struct XTDatabase *db, struct XTThread *thread);
+int xt_xn_xa_compare(struct XTThread *self, register const void *thunk, register const void *a, register const void *b);
+xtBool xt_xn_prepare(int len, xtWord1 *xa_data, struct XTThread *thread);
+xtBool xt_xn_store_xa_data(struct XTDatabase *db, xtXactID xn_id, int len, xtWord1 *xa_data, struct XTThread *thread);
+void xt_xn_delete_xa_data_by_xact(struct XTDatabase *db, xtXactID xact_id, struct XTThread *thread);
+void xt_xn_delete_xa_data(struct XTDatabase *db, XTXactPreparePtr xap, xtBool unlock, struct XTThread *thread);
+XTXactPreparePtr xt_xn_find_xa_data(struct XTDatabase *db, int len, xtWord1 *xa_data, xtBool lock, struct XTThread *thread);
+XTXactPreparePtr xt_xn_enum_xa_data(struct XTDatabase *db, XTXactEnumXAPtr exa);
+
XTXactDataPtr xt_xn_add_old_xact(struct XTDatabase *db, xtXactID xn_id, struct XTThread *thread);
XTXactDataPtr xt_xn_get_xact(struct XTDatabase *db, xtXactID xn_id, struct XTThread *thread);
xtBool xt_xn_delete_xact(struct XTDatabase *db, xtXactID xn_id, struct XTThread *thread);
=== modified file 'storage/pbxt/src/xactlog_xt.cc'
--- a/storage/pbxt/src/xactlog_xt.cc 2009-09-03 06:15:03 +0000
+++ b/storage/pbxt/src/xactlog_xt.cc 2009-11-24 10:55:06 +0000
@@ -1108,6 +1108,9 @@ xtBool XTDatabaseLog::xlog_append(XTThre
if ((part_size = xl_write_buf_pos % 512)) {
part_size = 512 - part_size;
xl_write_buffer[xl_write_buf_pos] = XT_LOG_ENT_END_OF_LOG;
+#ifdef HAVE_valgrind
+ memset(xl_write_buffer + xl_write_buf_pos + 1, 0x66, part_size);
+#endif
if (!xt_pwrite_file(xl_log_file, xl_write_log_offset, xl_write_buf_pos+part_size, xl_write_buffer, &thread->st_statistics.st_xlog, thread))
goto write_failed;
}
@@ -1477,9 +1480,9 @@ void XTDatabaseLog::xlog_name(size_t siz
* T H R E A D T R A N S A C T I O N B U F F E R
*/
-xtPublic xtBool xt_xlog_flush_log(XTThreadPtr thread)
+xtPublic xtBool xt_xlog_flush_log(struct XTDatabase *db, XTThreadPtr thread)
{
- return thread->st_database->db_xlog.xlog_flush(thread);
+ return db->db_xlog.xlog_flush(thread);
}
xtPublic xtBool xt_xlog_log_data(XTThreadPtr thread, size_t size, XTXactLogBufferDPtr log_entry, xtBool commit)
@@ -1488,15 +1491,14 @@ xtPublic xtBool xt_xlog_log_data(XTThrea
}
/* Allocate a record from the free list. */
-xtPublic xtBool xt_xlog_modify_table(struct XTOpenTable *ot, u_int status, xtOpSeqNo op_seq, xtRecordID free_rec_id, xtRecordID rec_id, size_t size, xtWord1 *data)
+xtPublic xtBool xt_xlog_modify_table(xtTableID tab_id, u_int status, xtOpSeqNo op_seq, xtRecordID free_rec_id, xtRecordID rec_id, size_t size, xtWord1 *data, XTThreadPtr thread)
{
XTXactLogBufferDRec log_entry;
- XTThreadPtr thread = ot->ot_thread;
- XTTableHPtr tab = ot->ot_table;
size_t len;
xtWord4 sum = 0;
int check_size = 1;
XTXactDataPtr xact = NULL;
+ xtBool commit = FALSE;
switch (status) {
case XT_LOG_ENT_REC_MODIFIED:
@@ -1505,7 +1507,7 @@ xtPublic xtBool xt_xlog_modify_table(str
case XT_LOG_ENT_DELETE:
check_size = 2;
XT_SET_DISK_4(log_entry.xu.xu_op_seq_4, op_seq);
- XT_SET_DISK_4(log_entry.xu.xu_tab_id_4, tab->tab_id);
+ XT_SET_DISK_4(log_entry.xu.xu_tab_id_4, tab_id);
XT_SET_DISK_4(log_entry.xu.xu_rec_id_4, rec_id);
XT_SET_DISK_2(log_entry.xu.xu_size_2, size);
len = offsetof(XTactUpdateEntryDRec, xu_rec_type_1);
@@ -1521,7 +1523,7 @@ xtPublic xtBool xt_xlog_modify_table(str
case XT_LOG_ENT_DELETE_FL:
check_size = 2;
XT_SET_DISK_4(log_entry.xf.xf_op_seq_4, op_seq);
- XT_SET_DISK_4(log_entry.xf.xf_tab_id_4, tab->tab_id);
+ XT_SET_DISK_4(log_entry.xf.xf_tab_id_4, tab_id);
XT_SET_DISK_4(log_entry.xf.xf_rec_id_4, rec_id);
XT_SET_DISK_2(log_entry.xf.xf_size_2, size);
XT_SET_DISK_4(log_entry.xf.xf_free_rec_id_4, free_rec_id);
@@ -1539,14 +1541,14 @@ xtPublic xtBool xt_xlog_modify_table(str
case XT_LOG_ENT_REC_REMOVED_EXT:
ASSERT_NS(size == 1 + XT_XACT_ID_SIZE + sizeof(XTTabRecFreeDRec));
XT_SET_DISK_4(log_entry.fr.fr_op_seq_4, op_seq);
- XT_SET_DISK_4(log_entry.fr.fr_tab_id_4, tab->tab_id);
+ XT_SET_DISK_4(log_entry.fr.fr_tab_id_4, tab_id);
XT_SET_DISK_4(log_entry.fr.fr_rec_id_4, rec_id);
len = offsetof(XTactFreeRecEntryDRec, fr_stat_id_1);
break;
case XT_LOG_ENT_REC_REMOVED_BI:
check_size = 2;
XT_SET_DISK_4(log_entry.rb.rb_op_seq_4, op_seq);
- XT_SET_DISK_4(log_entry.rb.rb_tab_id_4, tab->tab_id);
+ XT_SET_DISK_4(log_entry.rb.rb_tab_id_4, tab_id);
XT_SET_DISK_4(log_entry.rb.rb_rec_id_4, rec_id);
XT_SET_DISK_2(log_entry.rb.rb_size_2, size);
log_entry.rb.rb_new_rec_type_1 = (xtWord1) free_rec_id;
@@ -1556,42 +1558,42 @@ xtPublic xtBool xt_xlog_modify_table(str
case XT_LOG_ENT_REC_MOVED:
ASSERT_NS(size == 8);
XT_SET_DISK_4(log_entry.xw.xw_op_seq_4, op_seq);
- XT_SET_DISK_4(log_entry.xw.xw_tab_id_4, tab->tab_id);
+ XT_SET_DISK_4(log_entry.xw.xw_tab_id_4, tab_id);
XT_SET_DISK_4(log_entry.xw.xw_rec_id_4, rec_id);
len = offsetof(XTactWriteRecEntryDRec, xw_rec_type_1);
break;
case XT_LOG_ENT_REC_CLEANED:
ASSERT_NS(size == offsetof(XTTabRecHeadDRec, tr_prev_rec_id_4) + XT_RECORD_ID_SIZE);
XT_SET_DISK_4(log_entry.xw.xw_op_seq_4, op_seq);
- XT_SET_DISK_4(log_entry.xw.xw_tab_id_4, tab->tab_id);
+ XT_SET_DISK_4(log_entry.xw.xw_tab_id_4, tab_id);
XT_SET_DISK_4(log_entry.xw.xw_rec_id_4, rec_id);
len = offsetof(XTactWriteRecEntryDRec, xw_rec_type_1);
break;
case XT_LOG_ENT_REC_CLEANED_1:
ASSERT_NS(size == 1);
XT_SET_DISK_4(log_entry.xw.xw_op_seq_4, op_seq);
- XT_SET_DISK_4(log_entry.xw.xw_tab_id_4, tab->tab_id);
+ XT_SET_DISK_4(log_entry.xw.xw_tab_id_4, tab_id);
XT_SET_DISK_4(log_entry.xw.xw_rec_id_4, rec_id);
len = offsetof(XTactWriteRecEntryDRec, xw_rec_type_1);
break;
case XT_LOG_ENT_REC_UNLINKED:
ASSERT_NS(size == offsetof(XTTabRecHeadDRec, tr_prev_rec_id_4) + XT_RECORD_ID_SIZE);
XT_SET_DISK_4(log_entry.xw.xw_op_seq_4, op_seq);
- XT_SET_DISK_4(log_entry.xw.xw_tab_id_4, tab->tab_id);
+ XT_SET_DISK_4(log_entry.xw.xw_tab_id_4, tab_id);
XT_SET_DISK_4(log_entry.xw.xw_rec_id_4, rec_id);
len = offsetof(XTactWriteRecEntryDRec, xw_rec_type_1);
break;
case XT_LOG_ENT_ROW_NEW:
ASSERT_NS(size == 0);
XT_SET_DISK_4(log_entry.xa.xa_op_seq_4, op_seq);
- XT_SET_DISK_4(log_entry.xa.xa_tab_id_4, tab->tab_id);
+ XT_SET_DISK_4(log_entry.xa.xa_tab_id_4, tab_id);
XT_SET_DISK_4(log_entry.xa.xa_row_id_4, rec_id);
len = offsetof(XTactRowAddedEntryDRec, xa_row_id_4) + XT_ROW_ID_SIZE;
break;
case XT_LOG_ENT_ROW_NEW_FL:
ASSERT_NS(size == 0);
XT_SET_DISK_4(log_entry.xa.xa_op_seq_4, op_seq);
- XT_SET_DISK_4(log_entry.xa.xa_tab_id_4, tab->tab_id);
+ XT_SET_DISK_4(log_entry.xa.xa_tab_id_4, tab_id);
XT_SET_DISK_4(log_entry.xa.xa_row_id_4, rec_id);
XT_SET_DISK_4(log_entry.xa.xa_free_list_4, free_rec_id);
sum ^= XT_CHECKSUM4_REC(free_rec_id);
@@ -1602,10 +1604,17 @@ xtPublic xtBool xt_xlog_modify_table(str
case XT_LOG_ENT_ROW_FREED:
ASSERT_NS(size == sizeof(XTTabRowRefDRec));
XT_SET_DISK_4(log_entry.wr.wr_op_seq_4, op_seq);
- XT_SET_DISK_4(log_entry.wr.wr_tab_id_4, tab->tab_id);
+ XT_SET_DISK_4(log_entry.wr.wr_tab_id_4, tab_id);
XT_SET_DISK_4(log_entry.wr.wr_row_id_4, rec_id);
len = offsetof(XTactWriteRowEntryDRec, wr_ref_id_4);
break;
+ case XT_LOG_ENT_PREPARE:
+ check_size = 2;
+ XT_SET_DISK_4(log_entry.xp.xp_xact_id_4, op_seq);
+ log_entry.xp.xp_xa_len_1 = (xtWord1) size;
+ len = offsetof(XTXactPrepareEntryDRec, xp_xa_data);
+ commit = TRUE;
+ break;
default:
ASSERT_NS(FALSE);
len = 0;
@@ -1615,7 +1624,7 @@ xtPublic xtBool xt_xlog_modify_table(str
xtWord1 *dptr = data;
xtWord4 g;
- sum ^= op_seq ^ (tab->tab_id << 8) ^ XT_CHECKSUM4_REC(rec_id);
+ sum ^= op_seq ^ (tab_id << 8) ^ XT_CHECKSUM4_REC(rec_id);
if ((g = sum & 0xF0000000)) {
sum = sum ^ (g >> 24);
sum = sum ^ g;
@@ -1643,9 +1652,9 @@ xtPublic xtBool xt_xlog_modify_table(str
xt_print_log_record(0, 0, &log_entry);
#endif
if (xact)
- return thread->st_database->db_xlog.xlog_append(thread, len, (xtWord1 *) &log_entry, size, data, FALSE, &xact->xd_begin_log, &xact->xd_begin_offset);
+ return thread->st_database->db_xlog.xlog_append(thread, len, (xtWord1 *) &log_entry, size, data, commit, &xact->xd_begin_log, &xact->xd_begin_offset);
- return thread->st_database->db_xlog.xlog_append(thread, len, (xtWord1 *) &log_entry, size, data, FALSE, NULL, NULL);
+ return thread->st_database->db_xlog.xlog_append(thread, len, (xtWord1 *) &log_entry, size, data, commit, NULL, NULL);
}
/*
@@ -1905,6 +1914,7 @@ xtBool XTDatabaseLog::xlog_verify(XTXact
xtRecordID rec_id, free_rec_id;
int check_size = 1;
xtWord1 *dptr;
+ xtWord4 g;
switch (record->xh.xh_status_1) {
case XT_LOG_ENT_HEADER:
@@ -2019,13 +2029,19 @@ xtBool XTDatabaseLog::xlog_verify(XTXact
return record->xe.xe_checksum_1 == (XT_CHECKSUM_1(sum) ^ XT_CHECKSUM_1(log_id));
case XT_LOG_ENT_END_OF_LOG:
return FALSE;
+ case XT_LOG_ENT_PREPARE:
+ check_size = 2;
+ op_seq = XT_GET_DISK_4(record->xp.xp_xact_id_4);
+ tab_id = 0;
+ rec_id = 0;
+ dptr = record->xp.xp_xa_data;
+ rec_size -= offsetof(XTXactPrepareEntryDRec, xp_xa_data);
+ break;
default:
ASSERT_NS(FALSE);
return FALSE;
}
- xtWord4 g;
-
sum ^= (xtWord4) op_seq ^ ((xtWord4) tab_id << 8) ^ XT_CHECKSUM4_REC(rec_id);
if ((g = sum & 0xF0000000)) {
@@ -2193,6 +2209,14 @@ xtBool XTDatabaseLog::xlog_seq_next(XTXa
}
goto return_empty;
}
+ case XT_LOG_ENT_PREPARE:
+ check_size = 2;
+ len = offsetof(XTXactPrepareEntryDRec, xp_xa_data);
+ if (len > max_rec_len)
+ /* The size is not in the buffer: */
+ goto read_more;
+ len += (size_t) record->xp.xp_xa_len_1;
+ break;
default:
/* It is possible to land here after a crash, if the
* log was not completely written.
@@ -2231,7 +2255,7 @@ xtBool XTDatabaseLog::xlog_seq_next(XTXa
goto return_empty;
}
- /* The record is not completely in the buffer: */
+ /* The record is now completely in the buffer: */
seq->xseq_record_len = len;
*ret_entry = (XTXactLogBufferDPtr) seq->xseq_buffer;
return OK;
@@ -2428,7 +2452,7 @@ static void xlog_wr_wait_for_log_flush(X
if (reason == XT_LOG_CACHE_FULL || reason == XT_TIME_TO_WRITE || reason == XT_CHECKPOINT_REQ) {
/* Make sure that we have something to write: */
if (db->db_xlog.xlog_bytes_to_write() < 2 * 1204 * 1024)
- xt_xlog_flush_log(self);
+ xt_xlog_flush_log(db, self);
}
#ifdef TRACE_WRITER_ACTIVITY
@@ -2529,6 +2553,7 @@ static void xlog_wr_main(XTThreadPtr sel
case XT_LOG_ENT_ABORT:
case XT_LOG_ENT_CLEANUP:
case XT_LOG_ENT_OP_SYNC:
+ case XT_LOG_ENT_PREPARE:
break;
case XT_LOG_ENT_DEL_LOG:
xtLogID log_id;
=== modified file 'storage/pbxt/src/xactlog_xt.h'
--- a/storage/pbxt/src/xactlog_xt.h 2009-08-17 11:12:36 +0000
+++ b/storage/pbxt/src/xactlog_xt.h 2009-11-24 10:55:06 +0000
@@ -160,6 +160,7 @@ typedef struct XTXLogCache {
#define XT_LOG_ENT_END_OF_LOG 37 /* This is a record that indicates the end of the log, and
* fills to the end of a 512 byte block.
*/
+#define XT_LOG_ENT_PREPARE 39 /* XA prepare log entry. */
#define XT_LOG_FILE_MAGIC 0xAE88FE12
#define XT_LOG_VERSION_NO 1
@@ -201,6 +202,14 @@ typedef struct XTXactEndEntry {
XTDiskValue4 xe_not_used_4; /* Was the end sequence number (no longer used - v1.0.04+), set to zero). */
} XTXactEndEntryDRec, *XTXactEndEntryDPtr;
+typedef struct XTXactPrepareEntry {
+ xtWord1 xp_status_1; /* XT_LOG_ENT_PREPARE */
+ XTDiskValue2 xp_checksum_2;
+ XTDiskValue4 xp_xact_id_4; /* The transaction. */
+ xtWord1 xp_xa_len_1; /* The length of the XA data. */
+ xtWord1 xp_xa_data[XT_MAX_XA_DATA_SIZE];
+} XTXactPrepareEntryDRec, *XTXactPrepareEntryDPtr;
+
typedef struct XTXactCleanupEntry {
xtWord1 xc_status_1; /* XT_LOG_ENT_CLEANUP */
xtWord1 xc_checksum_1;
@@ -344,6 +353,7 @@ typedef union XTXactLogBuffer {
XTactOpSyncEntryDRec os;
XTactExtRecEntryDRec er;
XTactNoOpEntryDRec no;
+ XTXactPrepareEntryDRec xp;
} XTXactLogBufferDRec, *XTXactLogBufferDPtr;
/* ---------------------------------------- */
@@ -453,9 +463,9 @@ private:
xtBool xlog_open_log(xtLogID log_id, off_t curr_eof, struct XTThread *thread);
} XTDatabaseLogRec, *XTDatabaseLogPtr;
-xtBool xt_xlog_flush_log(struct XTThread *thread);
+xtBool xt_xlog_flush_log(struct XTDatabase *db, struct XTThread *thread);
xtBool xt_xlog_log_data(struct XTThread *thread, size_t len, XTXactLogBufferDPtr log_entry, xtBool commit);
-xtBool xt_xlog_modify_table(struct XTOpenTable *ot, u_int status, xtOpSeqNo op_seq, xtRecordID free_list, xtRecordID address, size_t size, xtWord1 *data);
+xtBool xt_xlog_modify_table(xtTableID tab_id, u_int status, xtOpSeqNo op_seq, xtRecordID free_list, xtRecordID address, size_t size, xtWord1 *data, struct XTThread *thread);
void xt_xlog_init(struct XTThread *self, size_t cache_size);
void xt_xlog_exit(struct XTThread *self);
=== modified file 'storage/pbxt/src/xt_config.h'
--- a/storage/pbxt/src/xt_config.h 2009-08-31 11:07:44 +0000
+++ b/storage/pbxt/src/xt_config.h 2009-11-24 10:55:06 +0000
@@ -50,7 +50,9 @@ const int max_connections = 500;
/*
* Make sure we use the thread safe version of the library.
*/
+#ifndef _THREAD_SAFE // Seems to be defined by some Drizzle header
#define _THREAD_SAFE
+#endif
/*
* This causes things to be defined like stuff in inttypes.h
@@ -72,12 +74,12 @@ const int max_connections = 500;
#define XT_MAC
#endif
-#if defined(MSDOS) || defined(__WIN__)
+#if defined(MSDOS) || defined(__WIN__) || defined(_WIN64)
#define XT_WIN
#endif
#ifdef XT_WIN
-#ifdef _DEBUG
+#if defined(_DEBUG) && !defined(DEBUG)
#define DEBUG
#endif // _DEBUG
#else
@@ -101,8 +103,13 @@ const int max_connections = 500;
* Definition of which atomic operations to use:
*/
#ifdef XT_WIN
+#ifdef _WIN64
+/* 64-bit Windows atomic ops are not yet supported: */
+#define XT_NO_ATOMICS
+#else
/* MS Studio style embedded assembler for x86 */
#define XT_ATOMIC_WIN32_X86
+#endif
#elif defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Use GNU style embedded assembler for x86 */
#define XT_ATOMIC_GNUC_X86
@@ -115,4 +122,10 @@ const int max_connections = 500;
#define XT_NO_ATOMICS
#endif
+#ifndef DRIZZLED
+#if MYSQL_VERSION_ID >= 50404
+#define MYSQL_SUPPORTS_BACKUP
+#endif
+#endif
+
#endif
=== modified file 'storage/pbxt/src/xt_defs.h'
--- a/storage/pbxt/src/xt_defs.h 2009-08-17 11:12:36 +0000
+++ b/storage/pbxt/src/xt_defs.h 2009-11-24 10:55:06 +0000
@@ -379,6 +379,19 @@ typedef struct XTPathStr {
*/
//#define XT_IMPLEMENT_NO_ACTION
+/* Define this value if online-backup should be supported.
+ * Note that, online backup is currently only supported
+ * by MySQL 6.0.9 or later
+ */
+#define XT_ENABLE_ONLINE_BACKUP
+
+/* Define this switch if you don't want to use atomic
+ * synchronisation.
+ */
+#ifndef XT_NO_ATOMICS
+//#define XT_NO_ATOMICS
+#endif
+
/* ----------------------------------------------------------------------
* GLOBAL CONSTANTS
*/
@@ -406,6 +419,8 @@ typedef struct XTPathStr {
#define XT_ADD_PTR(p, l) ((void *) ((char *) (p) + (l)))
+#define XT_MAX_XA_DATA_SIZE (3*4 + 128) /* Corresponds to the maximum size of struct xid_t in handler.h. */
+
/* ----------------------------------------------------------------------
* DEFINES DEPENDENT ON CONSTANTS
*/
@@ -744,6 +759,7 @@ extern xtBool pbxt_crash_debug;
#define MYSQL_PLUGIN_VAR_HEADER DRIZZLE_PLUGIN_VAR_HEADER
#define MYSQL_SYSVAR_STR DRIZZLE_SYSVAR_STR
#define MYSQL_SYSVAR_INT DRIZZLE_SYSVAR_INT
+#define MYSQL_SYSVAR_BOOL DRIZZLE_SYSVAR_BOOL
#define MYSQL_SYSVAR DRIZZLE_SYSVAR
#define MYSQL_STORAGE_ENGINE_PLUGIN DRIZZLE_STORAGE_ENGINE_PLUGIN
#define MYSQL_INFORMATION_SCHEMA_PLUGIN DRIZZLE_INFORMATION_SCHEMA_PLUGIN
@@ -752,9 +768,8 @@ extern xtBool pbxt_crash_debug;
#define mx_tmp_use_all_columns(x, y) (x)->use_all_columns(y)
#define mx_tmp_restore_column_map(x, y) (x)->restore_column_map(y)
-#define MX_BIT_FAST_TEST_AND_SET(x, y) bitmap_test_and_set(x, y)
-#define MX_TABLE_TYPES_T handler::Table_flags
+#define MX_TABLE_TYPES_T Cursor::Table_flags
#define MX_UINT8_T uint8_t
#define MX_ULONG_T uint32_t
#define MX_ULONGLONG_T uint64_t
@@ -762,6 +777,10 @@ extern xtBool pbxt_crash_debug;
#define MX_CHARSET_INFO struct charset_info_st
#define MX_CONST_CHARSET_INFO const struct charset_info_st
#define MX_CONST const
+#define MX_BITMAP MyBitmap
+#define MX_BIT_SIZE() numOfBitsInMap()
+#define MX_BIT_SET(x, y) (x)->setBit(y)
+#define MX_BIT_FAST_TEST_AND_SET(x, y) (x)->testAndSet(y)
#define my_bool bool
#define int16 int16_t
@@ -771,6 +790,7 @@ extern xtBool pbxt_crash_debug;
#define uchar unsigned char
#define longlong int64_t
#define ulonglong uint64_t
+#define handler Cursor
#define HAVE_LONG_LONG
@@ -823,10 +843,13 @@ extern xtBool pbxt_crash_debug;
class PBXTStorageEngine;
typedef PBXTStorageEngine handlerton;
+class Session;
+
+extern "C" void session_mark_transaction_to_rollback(Session *session, bool all);
#else // DRIZZLED
/* The MySQL case: */
-#if MYSQL_VERSION_ID >= 60008
+#if MYSQL_VERSION_ID >= 50404
#define STRUCT_TABLE struct TABLE
#else
#define STRUCT_TABLE struct st_table
@@ -844,13 +867,13 @@ typedef PBXTStorageEngine handlerton;
#define MX_CHARSET_INFO CHARSET_INFO
#define MX_CONST_CHARSET_INFO struct charset_info_st
#define MX_CONST
+#define MX_BITMAP MY_BITMAP
+#define MX_BIT_SIZE() n_bits
+#define MX_BIT_SET(x, y) bitmap_set_bit(x, y)
#endif // DRIZZLED
-#define MX_BITMAP MY_BITMAP
-#define MX_BIT_SIZE() n_bits
#define MX_BIT_IS_SUBSET(x, y) bitmap_is_subset(x, y)
-#define MX_BIT_SET(x, y) bitmap_set_bit(x, y)
#ifndef XT_SCAN_CORE_DEFINED
#define XT_SCAN_CORE_DEFINED
=== modified file 'storage/pbxt/src/xt_errno.h'
--- a/storage/pbxt/src/xt_errno.h 2009-09-03 06:15:03 +0000
+++ b/storage/pbxt/src/xt_errno.h 2009-11-24 10:55:06 +0000
@@ -119,6 +119,9 @@
#define XT_ERR_FK_REF_TEMP_TABLE -95
#define XT_ERR_MYSQL_SHUTDOWN -98
#define XT_ERR_MYSQL_NO_THREAD -99
+#define XT_ERR_BUFFER_TOO_SMALL -100
+#define XT_ERR_BAD_BACKUP_FORMAT -101
+#define XT_ERR_PBXT_NOT_INSTALLED -102
#ifdef XT_WIN
#define XT_ENOMEM ERROR_NOT_ENOUGH_MEMORY
1
0
[Maria-developers] bzr commit into Mariadb 5.2, with Maria 2.0:maria/5.2 branch (igor:2739) WL#2771
by Igor Babaev 21 Dec '09
by Igor Babaev 21 Dec '09
21 Dec '09
#At lp:maria/5.2 based on revid:psergey@askmonty.org-20091216092851-9ehou85ydmgziniy
2739 Igor Babaev 2009-12-20
Backport into MariaDB-5.2 the following:
WL#2771 "Block Nested Loop Join and Batched Key Access Join"
added:
mysql-test/r/join_cache.result
mysql-test/r/join_nested_jcl6.result
mysql-test/r/join_outer_jcl6.result
mysql-test/r/select_jcl6.result
mysql-test/t/join_cache.test
mysql-test/t/join_nested_jcl6.test
mysql-test/t/join_outer_jcl6.test
mysql-test/t/select_jcl6.test
sql/sql_join_cache.cc
modified:
libmysqld/CMakeLists.txt
libmysqld/Makefile.am
mysql-test/r/index_merge_myisam.result
mysql-test/r/join_outer.result
mysql-test/r/table_elim.result
sql/CMakeLists.txt
sql/Makefile.am
sql/ds_mrr.cc
sql/field.cc
sql/item.cc
sql/item.h
sql/mysqld.cc
sql/set_var.cc
sql/sql_class.h
sql/sql_select.cc
sql/sql_select.h
storage/myisam/ha_myisam.cc
=== modified file 'libmysqld/CMakeLists.txt'
--- a/libmysqld/CMakeLists.txt 2009-12-15 22:37:39 +0000
+++ b/libmysqld/CMakeLists.txt 2009-12-21 02:26:15 +0000
@@ -123,7 +123,8 @@ SET(LIBMYSQLD_SOURCES emb_qcache.cc libm
../sql/sql_class.cc ../sql/sql_crypt.cc ../sql/sql_cursor.cc
../sql/sql_db.cc ../sql/sql_delete.cc ../sql/sql_derived.cc
../sql/sql_do.cc ../sql/sql_error.cc ../sql/sql_handler.cc
- ../sql/sql_help.cc ../sql/sql_insert.cc ../sql/sql_lex.cc
+ ../sql/sql_help.cc ../sql/sql_insert.cc ../sql/sql_join_cache.cc
+ ../sql/sql_lex.cc
../sql/sql_list.cc ../sql/sql_load.cc ../sql/sql_locale.cc
../sql/sql_binlog.cc ../sql/sql_manager.cc ../sql/sql_map.cc
../sql/sql_parse.cc ../sql/sql_partition.cc ../sql/sql_plugin.cc
=== modified file 'libmysqld/Makefile.am'
--- a/libmysqld/Makefile.am 2009-12-15 07:16:46 +0000
+++ b/libmysqld/Makefile.am 2009-12-21 02:26:15 +0000
@@ -63,6 +63,7 @@ sqlsources = ds_mrr.cc derror.cc field.c
sql_profile.cc \
sql_analyse.cc sql_base.cc sql_cache.cc sql_class.cc \
sql_crypt.cc sql_db.cc sql_delete.cc sql_error.cc sql_insert.cc \
+ sql_join_cache.cc \
sql_lex.cc sql_list.cc sql_manager.cc sql_map.cc \
scheduler.cc sql_connect.cc sql_parse.cc \
sql_prepare.cc sql_derived.cc sql_rename.cc \
=== modified file 'mysql-test/r/index_merge_myisam.result'
--- a/mysql-test/r/index_merge_myisam.result 2009-12-15 07:16:46 +0000
+++ b/mysql-test/r/index_merge_myisam.result 2009-12-21 02:26:15 +0000
@@ -342,8 +342,6 @@ create table t4 (a int);
insert into t4 values (1),(4),(3);
set @save_join_buffer_size=@@join_buffer_size;
set join_buffer_size= 4000;
-Warnings:
-Warning 1292 Truncated incorrect join_buffer_size value: '4000'
explain select max(A.key1 + B.key1 + A.key2 + B.key2 + A.key3 + B.key3 + A.key4 + B.key4 + A.key5 + B.key5)
from t0 as A force index(i1,i2), t0 as B force index (i1,i2)
where (A.key1 < 500000 or A.key2 < 3)
=== added file 'mysql-test/r/join_cache.result'
--- a/mysql-test/r/join_cache.result 1970-01-01 00:00:00 +0000
+++ b/mysql-test/r/join_cache.result 2009-12-21 02:26:15 +0000
@@ -0,0 +1,4144 @@
+DROP TABLE IF EXISTS t1,t2,t3,t4,t5,t6,t7,t8,t9,t10,t11;
+DROP DATABASE IF EXISTS world;
+set names utf8;
+CREATE DATABASE world;
+use world;
+CREATE TABLE Country (
+Code char(3) NOT NULL default '',
+Name char(52) NOT NULL default '',
+SurfaceArea float(10,2) NOT NULL default '0.00',
+Population int(11) NOT NULL default '0',
+Capital int(11) default NULL
+);
+CREATE TABLE City (
+ID int(11) NOT NULL,
+Name char(35) NOT NULL default '',
+Country char(3) NOT NULL default '',
+Population int(11) NOT NULL default '0'
+);
+CREATE TABLE CountryLanguage (
+Country char(3) NOT NULL default '',
+Language char(30) NOT NULL default '',
+Percentage float(3,1) NOT NULL default '0.0'
+);
+SELECT COUNT(*) FROM Country;
+COUNT(*)
+239
+SELECT COUNT(*) FROM City;
+COUNT(*)
+4079
+SELECT COUNT(*) FROM CountryLanguage;
+COUNT(*)
+984
+show variables like 'join_buffer_size';
+Variable_name Value
+join_buffer_size 131072
+show variables like 'join_cache_level';
+Variable_name Value
+join_cache_level 1
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE Country ALL NULL NULL NULL NULL 239 Using where
+1 SIMPLE City ALL NULL NULL NULL NULL 4079 Using where; Using join buffer
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+Name Name
+Vientiane Laos
+Riga Latvia
+Daugavpils Latvia
+Maseru Lesotho
+Beirut Lebanon
+Tripoli Lebanon
+Monrovia Liberia
+Tripoli Libyan Arab Jamahiriya
+Bengasi Libyan Arab Jamahiriya
+Misrata Libyan Arab Jamahiriya
+Vilnius Lithuania
+Kaunas Lithuania
+Klaipeda Lithuania
+?iauliai Lithuania
+Panevezys Lithuania
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE Country ALL NULL NULL NULL NULL 239 Using where
+1 SIMPLE CountryLanguage ALL NULL NULL NULL NULL 984 Using where; Using join buffer
+1 SIMPLE City ALL NULL NULL NULL NULL 4079 Using where; Using join buffer
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+Name Name Language
+Leiden Netherlands Dutch
+La Matanza Argentina Spanish
+Lomas de Zamora Argentina Spanish
+La Plata Argentina Spanish
+Lanús Argentina Spanish
+Las Heras Argentina Spanish
+La Rioja Argentina Spanish
+Liège Belgium Dutch
+La Paz Bolivia Spanish
+Londrina Brazil Portuguese
+Limeira Brazil Portuguese
+Lages Brazil Portuguese
+Luziânia Brazil Portuguese
+Lauro de Freitas Brazil Portuguese
+Linhares Brazil Portuguese
+London United Kingdom English
+Liverpool United Kingdom English
+Leeds United Kingdom English
+Leicester United Kingdom English
+Luton United Kingdom English
+Los Angeles Chile Spanish
+La Serena Chile Spanish
+La Romana Dominican Republic Spanish
+Loja Ecuador Spanish
+Luxor Egypt Arabic
+Las Palmas de Gran Canaria Spain Spanish
+L´Hospitalet de Llobregat Spain Spanish
+Leganés Spain Spanish
+León Spain Spanish
+Logroño Spain Spanish
+Lleida (Lérida) Spain Spanish
+Le-Cap-Haïtien Haiti Haiti Creole
+La Ceiba Honduras Spanish
+Livorno Italy Italian
+Latina Italy Italian
+Lecce Italy Italian
+La Spezia Italy Italian
+Linz Austria German
+London Canada English
+Laval Canada English
+Longueuil Canada English
+Lanzhou China Chinese
+Luoyang China Chinese
+Liuzhou China Chinese
+Liaoyang China Chinese
+Liupanshui China Chinese
+Liaoyuan China Chinese
+Lianyungang China Chinese
+Leshan China Chinese
+Linyi China Chinese
+Luzhou China Chinese
+Laiwu China Chinese
+Liaocheng China Chinese
+Laizhou China Chinese
+Linfen China Chinese
+Liangcheng China Chinese
+Longkou China Chinese
+Langfang China Chinese
+Liu´an China Chinese
+Longjing China Chinese
+Lengshuijiang China Chinese
+Laiyang China Chinese
+Longyan China Chinese
+Linhe China Chinese
+Leiyang China Chinese
+Loudi China Chinese
+Luohe China Chinese
+Linqing China Chinese
+Laohekou China Chinese
+Linchuan China Chinese
+Lhasa China Chinese
+Lianyuan China Chinese
+Liyang China Chinese
+Liling China Chinese
+Linhai China Chinese
+Larisa Greece Greek
+La Habana Cuba Spanish
+Lilongwe Malawi Chichewa
+León Mexico Spanish
+La Paz Mexico Spanish
+La Paz Mexico Spanish
+Lázaro Cárdenas Mexico Spanish
+Lagos de Moreno Mexico Spanish
+Lerdo Mexico Spanish
+Los Cabos Mexico Spanish
+Lerma Mexico Spanish
+Las Margaritas Mexico Spanish
+Lashio (Lasho) Myanmar Burmese
+Lalitapur Nepal Nepali
+León Nicaragua Spanish
+Lambaré Paraguay Spanish
+Lima Peru Spanish
+Lisboa Portugal Portuguese
+Lódz Poland Polish
+Lublin Poland Polish
+Legnica Poland Polish
+Lyon France French
+Le Havre France French
+Lille France French
+Le Mans France French
+Limoges France French
+Linköping Sweden Swedish
+Lund Sweden Swedish
+Leipzig Germany German
+Lübeck Germany German
+Ludwigshafen am Rhein Germany German
+Leverkusen Germany German
+Lünen Germany German
+Lahti Finland Finnish
+Lausanne Switzerland German
+Latakia Syria Arabic
+Luchou Taiwan Min
+Lungtan Taiwan Min
+Liberec Czech Republic Czech
+Lviv Ukraine Ukrainian
+Lugansk Ukraine Ukrainian
+Lutsk Ukraine Ukrainian
+Lysyt?ansk Ukraine Ukrainian
+Lower Hutt New Zealand English
+Lida Belarus Belorussian
+Los Teques Venezuela Spanish
+Lipetsk Russian Federation Russian
+Ljubertsy Russian Federation Russian
+Leninsk-Kuznetski Russian Federation Russian
+Long Xuyen Vietnam Vietnamese
+Los Angeles United States English
+Las Vegas United States English
+Long Beach United States English
+Lexington-Fayette United States English
+Louisville United States English
+Lincoln United States English
+Lubbock United States English
+Little Rock United States English
+Laredo United States English
+Lakewood United States English
+Lansing United States English
+Lancaster United States English
+Lafayette United States English
+Lowell United States English
+Livonia United States English
+set join_cache_level=2;
+show variables like 'join_cache_level';
+Variable_name Value
+join_cache_level 2
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE Country ALL NULL NULL NULL NULL 239 Using where
+1 SIMPLE City ALL NULL NULL NULL NULL 4079 Using where; Using join buffer
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+Name Name
+Vientiane Laos
+Riga Latvia
+Daugavpils Latvia
+Maseru Lesotho
+Beirut Lebanon
+Tripoli Lebanon
+Monrovia Liberia
+Tripoli Libyan Arab Jamahiriya
+Bengasi Libyan Arab Jamahiriya
+Misrata Libyan Arab Jamahiriya
+Vilnius Lithuania
+Kaunas Lithuania
+Klaipeda Lithuania
+?iauliai Lithuania
+Panevezys Lithuania
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE Country ALL NULL NULL NULL NULL 239 Using where
+1 SIMPLE CountryLanguage ALL NULL NULL NULL NULL 984 Using where; Using join buffer
+1 SIMPLE City ALL NULL NULL NULL NULL 4079 Using where; Using join buffer
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+Name Name Language
+Leiden Netherlands Dutch
+La Matanza Argentina Spanish
+Lomas de Zamora Argentina Spanish
+La Plata Argentina Spanish
+Lanús Argentina Spanish
+Las Heras Argentina Spanish
+La Rioja Argentina Spanish
+Liège Belgium Dutch
+La Paz Bolivia Spanish
+Londrina Brazil Portuguese
+Limeira Brazil Portuguese
+Lages Brazil Portuguese
+Luziânia Brazil Portuguese
+Lauro de Freitas Brazil Portuguese
+Linhares Brazil Portuguese
+London United Kingdom English
+Liverpool United Kingdom English
+Leeds United Kingdom English
+Leicester United Kingdom English
+Luton United Kingdom English
+Los Angeles Chile Spanish
+La Serena Chile Spanish
+La Romana Dominican Republic Spanish
+Loja Ecuador Spanish
+Luxor Egypt Arabic
+Las Palmas de Gran Canaria Spain Spanish
+L´Hospitalet de Llobregat Spain Spanish
+Leganés Spain Spanish
+León Spain Spanish
+Logroño Spain Spanish
+Lleida (Lérida) Spain Spanish
+Le-Cap-Haïtien Haiti Haiti Creole
+La Ceiba Honduras Spanish
+Livorno Italy Italian
+Latina Italy Italian
+Lecce Italy Italian
+La Spezia Italy Italian
+Linz Austria German
+London Canada English
+Laval Canada English
+Longueuil Canada English
+Lanzhou China Chinese
+Luoyang China Chinese
+Liuzhou China Chinese
+Liaoyang China Chinese
+Liupanshui China Chinese
+Liaoyuan China Chinese
+Lianyungang China Chinese
+Leshan China Chinese
+Linyi China Chinese
+Luzhou China Chinese
+Laiwu China Chinese
+Liaocheng China Chinese
+Laizhou China Chinese
+Linfen China Chinese
+Liangcheng China Chinese
+Longkou China Chinese
+Langfang China Chinese
+Liu´an China Chinese
+Longjing China Chinese
+Lengshuijiang China Chinese
+Laiyang China Chinese
+Longyan China Chinese
+Linhe China Chinese
+Leiyang China Chinese
+Loudi China Chinese
+Luohe China Chinese
+Linqing China Chinese
+Laohekou China Chinese
+Linchuan China Chinese
+Lhasa China Chinese
+Lianyuan China Chinese
+Liyang China Chinese
+Liling China Chinese
+Linhai China Chinese
+Larisa Greece Greek
+La Habana Cuba Spanish
+Lilongwe Malawi Chichewa
+León Mexico Spanish
+La Paz Mexico Spanish
+La Paz Mexico Spanish
+Lázaro Cárdenas Mexico Spanish
+Lagos de Moreno Mexico Spanish
+Lerdo Mexico Spanish
+Los Cabos Mexico Spanish
+Lerma Mexico Spanish
+Las Margaritas Mexico Spanish
+Lashio (Lasho) Myanmar Burmese
+Lalitapur Nepal Nepali
+León Nicaragua Spanish
+Lambaré Paraguay Spanish
+Lima Peru Spanish
+Lisboa Portugal Portuguese
+Lódz Poland Polish
+Lublin Poland Polish
+Legnica Poland Polish
+Lyon France French
+Le Havre France French
+Lille France French
+Le Mans France French
+Limoges France French
+Linköping Sweden Swedish
+Lund Sweden Swedish
+Leipzig Germany German
+Lübeck Germany German
+Ludwigshafen am Rhein Germany German
+Leverkusen Germany German
+Lünen Germany German
+Lahti Finland Finnish
+Lausanne Switzerland German
+Latakia Syria Arabic
+Luchou Taiwan Min
+Lungtan Taiwan Min
+Liberec Czech Republic Czech
+Lviv Ukraine Ukrainian
+Lugansk Ukraine Ukrainian
+Lutsk Ukraine Ukrainian
+Lysyt?ansk Ukraine Ukrainian
+Lower Hutt New Zealand English
+Lida Belarus Belorussian
+Los Teques Venezuela Spanish
+Lipetsk Russian Federation Russian
+Ljubertsy Russian Federation Russian
+Leninsk-Kuznetski Russian Federation Russian
+Long Xuyen Vietnam Vietnamese
+Los Angeles United States English
+Las Vegas United States English
+Long Beach United States English
+Lexington-Fayette United States English
+Louisville United States English
+Lincoln United States English
+Lubbock United States English
+Little Rock United States English
+Laredo United States English
+Lakewood United States English
+Lansing United States English
+Lancaster United States English
+Lafayette United States English
+Lowell United States English
+Livonia United States English
+set join_cache_level=default;
+set join_buffer_size=256;
+show variables like 'join_buffer_size';
+Variable_name Value
+join_buffer_size 256
+show variables like 'join_cache_level';
+Variable_name Value
+join_cache_level 1
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE Country ALL NULL NULL NULL NULL 239 Using where
+1 SIMPLE City ALL NULL NULL NULL NULL 4079 Using where; Using join buffer
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+Name Name
+Vientiane Laos
+Riga Latvia
+Daugavpils Latvia
+Maseru Lesotho
+Beirut Lebanon
+Tripoli Lebanon
+Monrovia Liberia
+Tripoli Libyan Arab Jamahiriya
+Bengasi Libyan Arab Jamahiriya
+Misrata Libyan Arab Jamahiriya
+Vilnius Lithuania
+Kaunas Lithuania
+Klaipeda Lithuania
+?iauliai Lithuania
+Panevezys Lithuania
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE Country ALL NULL NULL NULL NULL 239 Using where
+1 SIMPLE CountryLanguage ALL NULL NULL NULL NULL 984 Using where; Using join buffer
+1 SIMPLE City ALL NULL NULL NULL NULL 4079 Using where; Using join buffer
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+Name Name Language
+Leiden Netherlands Dutch
+La Matanza Argentina Spanish
+Lomas de Zamora Argentina Spanish
+La Plata Argentina Spanish
+Lanús Argentina Spanish
+Las Heras Argentina Spanish
+La Rioja Argentina Spanish
+Liège Belgium Dutch
+La Paz Bolivia Spanish
+Londrina Brazil Portuguese
+Limeira Brazil Portuguese
+Lages Brazil Portuguese
+Luziânia Brazil Portuguese
+Lauro de Freitas Brazil Portuguese
+Linhares Brazil Portuguese
+London United Kingdom English
+Liverpool United Kingdom English
+Leeds United Kingdom English
+Leicester United Kingdom English
+Luton United Kingdom English
+Los Angeles Chile Spanish
+La Serena Chile Spanish
+La Romana Dominican Republic Spanish
+Loja Ecuador Spanish
+Luxor Egypt Arabic
+Las Palmas de Gran Canaria Spain Spanish
+L´Hospitalet de Llobregat Spain Spanish
+Leganés Spain Spanish
+León Spain Spanish
+Logroño Spain Spanish
+Lleida (Lérida) Spain Spanish
+Le-Cap-Haïtien Haiti Haiti Creole
+La Ceiba Honduras Spanish
+Livorno Italy Italian
+Latina Italy Italian
+Lecce Italy Italian
+La Spezia Italy Italian
+Linz Austria German
+London Canada English
+Laval Canada English
+Longueuil Canada English
+Lanzhou China Chinese
+Luoyang China Chinese
+Liuzhou China Chinese
+Liaoyang China Chinese
+Liupanshui China Chinese
+Liaoyuan China Chinese
+Lianyungang China Chinese
+Leshan China Chinese
+Linyi China Chinese
+Luzhou China Chinese
+Laiwu China Chinese
+Liaocheng China Chinese
+Laizhou China Chinese
+Linfen China Chinese
+Liangcheng China Chinese
+Longkou China Chinese
+Langfang China Chinese
+Liu´an China Chinese
+Longjing China Chinese
+Lengshuijiang China Chinese
+Laiyang China Chinese
+Longyan China Chinese
+Linhe China Chinese
+Leiyang China Chinese
+Loudi China Chinese
+Luohe China Chinese
+Linqing China Chinese
+Laohekou China Chinese
+Linchuan China Chinese
+Lhasa China Chinese
+Lianyuan China Chinese
+Liyang China Chinese
+Liling China Chinese
+Linhai China Chinese
+Larisa Greece Greek
+La Habana Cuba Spanish
+Lilongwe Malawi Chichewa
+León Mexico Spanish
+La Paz Mexico Spanish
+La Paz Mexico Spanish
+Lázaro Cárdenas Mexico Spanish
+Lagos de Moreno Mexico Spanish
+Lerdo Mexico Spanish
+Los Cabos Mexico Spanish
+Lerma Mexico Spanish
+Las Margaritas Mexico Spanish
+Lashio (Lasho) Myanmar Burmese
+Lalitapur Nepal Nepali
+León Nicaragua Spanish
+Lambaré Paraguay Spanish
+Lima Peru Spanish
+Lisboa Portugal Portuguese
+Lódz Poland Polish
+Lublin Poland Polish
+Legnica Poland Polish
+Lyon France French
+Le Havre France French
+Lille France French
+Le Mans France French
+Limoges France French
+Linköping Sweden Swedish
+Lund Sweden Swedish
+Leipzig Germany German
+Lübeck Germany German
+Ludwigshafen am Rhein Germany German
+Leverkusen Germany German
+Lünen Germany German
+Lahti Finland Finnish
+Lausanne Switzerland German
+Latakia Syria Arabic
+Luchou Taiwan Min
+Lungtan Taiwan Min
+Liberec Czech Republic Czech
+Lviv Ukraine Ukrainian
+Lugansk Ukraine Ukrainian
+Lutsk Ukraine Ukrainian
+Lysyt?ansk Ukraine Ukrainian
+Lower Hutt New Zealand English
+Lida Belarus Belorussian
+Los Teques Venezuela Spanish
+Lipetsk Russian Federation Russian
+Ljubertsy Russian Federation Russian
+Leninsk-Kuznetski Russian Federation Russian
+Long Xuyen Vietnam Vietnamese
+Los Angeles United States English
+Las Vegas United States English
+Long Beach United States English
+Lexington-Fayette United States English
+Louisville United States English
+Lincoln United States English
+Lubbock United States English
+Little Rock United States English
+Laredo United States English
+Lakewood United States English
+Lansing United States English
+Lancaster United States English
+Lafayette United States English
+Lowell United States English
+Livonia United States English
+set join_cache_level=2;
+show variables like 'join_cache_level';
+Variable_name Value
+join_cache_level 2
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE Country ALL NULL NULL NULL NULL 239 Using where
+1 SIMPLE City ALL NULL NULL NULL NULL 4079 Using where; Using join buffer
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+Name Name
+Vientiane Laos
+Riga Latvia
+Daugavpils Latvia
+Maseru Lesotho
+Beirut Lebanon
+Tripoli Lebanon
+Monrovia Liberia
+Tripoli Libyan Arab Jamahiriya
+Bengasi Libyan Arab Jamahiriya
+Misrata Libyan Arab Jamahiriya
+Vilnius Lithuania
+Kaunas Lithuania
+Klaipeda Lithuania
+?iauliai Lithuania
+Panevezys Lithuania
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE Country ALL NULL NULL NULL NULL 239 Using where
+1 SIMPLE CountryLanguage ALL NULL NULL NULL NULL 984 Using where; Using join buffer
+1 SIMPLE City ALL NULL NULL NULL NULL 4079 Using where; Using join buffer
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+Name Name Language
+Leiden Netherlands Dutch
+La Matanza Argentina Spanish
+Lomas de Zamora Argentina Spanish
+La Plata Argentina Spanish
+Lanús Argentina Spanish
+Las Heras Argentina Spanish
+La Rioja Argentina Spanish
+Liège Belgium Dutch
+La Paz Bolivia Spanish
+Londrina Brazil Portuguese
+Limeira Brazil Portuguese
+Lages Brazil Portuguese
+Luziânia Brazil Portuguese
+Lauro de Freitas Brazil Portuguese
+Linhares Brazil Portuguese
+London United Kingdom English
+Liverpool United Kingdom English
+Leeds United Kingdom English
+Leicester United Kingdom English
+Luton United Kingdom English
+Los Angeles Chile Spanish
+La Serena Chile Spanish
+La Romana Dominican Republic Spanish
+Loja Ecuador Spanish
+Luxor Egypt Arabic
+Las Palmas de Gran Canaria Spain Spanish
+L´Hospitalet de Llobregat Spain Spanish
+Leganés Spain Spanish
+León Spain Spanish
+Logroño Spain Spanish
+Lleida (Lérida) Spain Spanish
+Le-Cap-Haïtien Haiti Haiti Creole
+La Ceiba Honduras Spanish
+Livorno Italy Italian
+Latina Italy Italian
+Lecce Italy Italian
+La Spezia Italy Italian
+Linz Austria German
+London Canada English
+Laval Canada English
+Longueuil Canada English
+Lanzhou China Chinese
+Luoyang China Chinese
+Liuzhou China Chinese
+Liaoyang China Chinese
+Liupanshui China Chinese
+Liaoyuan China Chinese
+Lianyungang China Chinese
+Leshan China Chinese
+Linyi China Chinese
+Luzhou China Chinese
+Laiwu China Chinese
+Liaocheng China Chinese
+Laizhou China Chinese
+Linfen China Chinese
+Liangcheng China Chinese
+Longkou China Chinese
+Langfang China Chinese
+Liu´an China Chinese
+Longjing China Chinese
+Lengshuijiang China Chinese
+Laiyang China Chinese
+Longyan China Chinese
+Linhe China Chinese
+Leiyang China Chinese
+Loudi China Chinese
+Luohe China Chinese
+Linqing China Chinese
+Laohekou China Chinese
+Linchuan China Chinese
+Lhasa China Chinese
+Lianyuan China Chinese
+Liyang China Chinese
+Liling China Chinese
+Linhai China Chinese
+Larisa Greece Greek
+La Habana Cuba Spanish
+Lilongwe Malawi Chichewa
+León Mexico Spanish
+La Paz Mexico Spanish
+La Paz Mexico Spanish
+Lázaro Cárdenas Mexico Spanish
+Lagos de Moreno Mexico Spanish
+Lerdo Mexico Spanish
+Los Cabos Mexico Spanish
+Lerma Mexico Spanish
+Las Margaritas Mexico Spanish
+Lashio (Lasho) Myanmar Burmese
+Lalitapur Nepal Nepali
+León Nicaragua Spanish
+Lambaré Paraguay Spanish
+Lima Peru Spanish
+Lisboa Portugal Portuguese
+Lódz Poland Polish
+Lublin Poland Polish
+Legnica Poland Polish
+Lyon France French
+Le Havre France French
+Lille France French
+Le Mans France French
+Limoges France French
+Linköping Sweden Swedish
+Lund Sweden Swedish
+Leipzig Germany German
+Lübeck Germany German
+Ludwigshafen am Rhein Germany German
+Leverkusen Germany German
+Lünen Germany German
+Lahti Finland Finnish
+Lausanne Switzerland German
+Latakia Syria Arabic
+Luchou Taiwan Min
+Lungtan Taiwan Min
+Liberec Czech Republic Czech
+Lviv Ukraine Ukrainian
+Lugansk Ukraine Ukrainian
+Lutsk Ukraine Ukrainian
+Lysyt?ansk Ukraine Ukrainian
+Lower Hutt New Zealand English
+Lida Belarus Belorussian
+Los Teques Venezuela Spanish
+Lipetsk Russian Federation Russian
+Ljubertsy Russian Federation Russian
+Leninsk-Kuznetski Russian Federation Russian
+Long Xuyen Vietnam Vietnamese
+Los Angeles United States English
+Las Vegas United States English
+Long Beach United States English
+Lexington-Fayette United States English
+Louisville United States English
+Lincoln United States English
+Lubbock United States English
+Little Rock United States English
+Laredo United States English
+Lakewood United States English
+Lansing United States English
+Lancaster United States English
+Lafayette United States English
+Lowell United States English
+Livonia United States English
+set join_cache_level=default;
+set join_buffer_size=default;
+show variables like 'join_buffer_size';
+Variable_name Value
+join_buffer_size 131072
+show variables like 'join_cache_level';
+Variable_name Value
+join_cache_level 1
+DROP DATABASE world;
+CREATE DATABASE world;
+use world;
+CREATE TABLE Country (
+Code char(3) NOT NULL default '',
+Name char(52) NOT NULL default '',
+SurfaceArea float(10,2) NOT NULL default '0.00',
+Population int(11) NOT NULL default '0',
+Capital int(11) default NULL,
+PRIMARY KEY (Code),
+UNIQUE INDEX (Name)
+);
+CREATE TABLE City (
+ID int(11) NOT NULL auto_increment,
+Name char(35) NOT NULL default '',
+Country char(3) NOT NULL default '',
+Population int(11) NOT NULL default '0',
+PRIMARY KEY (ID),
+INDEX (Population),
+INDEX (Country)
+);
+CREATE TABLE CountryLanguage (
+Country char(3) NOT NULL default '',
+Language char(30) NOT NULL default '',
+Percentage float(3,1) NOT NULL default '0.0',
+PRIMARY KEY (Country, Language),
+INDEX (Percentage)
+);
+show variables like 'join_buffer_size';
+Variable_name Value
+join_buffer_size 131072
+set join_cache_level=5;
+show variables like 'join_cache_level';
+Variable_name Value
+join_cache_level 5
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+1 SIMPLE City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+Name Name
+Vientiane Laos
+Riga Latvia
+Daugavpils Latvia
+Maseru Lesotho
+Beirut Lebanon
+Tripoli Lebanon
+Monrovia Liberia
+Tripoli Libyan Arab Jamahiriya
+Bengasi Libyan Arab Jamahiriya
+Misrata Libyan Arab Jamahiriya
+Vilnius Lithuania
+Kaunas Lithuania
+Klaipeda Lithuania
+?iauliai Lithuania
+Panevezys Lithuania
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE CountryLanguage ALL PRIMARY,Percentage NULL NULL NULL 984 Using where
+1 SIMPLE Country eq_ref PRIMARY PRIMARY 3 world.CountryLanguage.Country 1 Using where; Using join buffer
+1 SIMPLE City ref Country Country 3 world.Country.Code 18 Using index condition(BKA); Using where; Using join buffer
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+Name Name Language
+Leiden Netherlands Dutch
+La Matanza Argentina Spanish
+Lomas de Zamora Argentina Spanish
+La Plata Argentina Spanish
+Lanús Argentina Spanish
+Las Heras Argentina Spanish
+La Rioja Argentina Spanish
+Liège Belgium Dutch
+La Paz Bolivia Spanish
+Londrina Brazil Portuguese
+Limeira Brazil Portuguese
+Lages Brazil Portuguese
+Luziânia Brazil Portuguese
+Lauro de Freitas Brazil Portuguese
+Linhares Brazil Portuguese
+London United Kingdom English
+Liverpool United Kingdom English
+Leeds United Kingdom English
+Leicester United Kingdom English
+Luton United Kingdom English
+Los Angeles Chile Spanish
+La Serena Chile Spanish
+La Romana Dominican Republic Spanish
+Loja Ecuador Spanish
+Luxor Egypt Arabic
+Las Palmas de Gran Canaria Spain Spanish
+L´Hospitalet de Llobregat Spain Spanish
+Leganés Spain Spanish
+León Spain Spanish
+Logroño Spain Spanish
+Lleida (Lérida) Spain Spanish
+Le-Cap-Haïtien Haiti Haiti Creole
+La Ceiba Honduras Spanish
+Livorno Italy Italian
+Latina Italy Italian
+Lecce Italy Italian
+La Spezia Italy Italian
+Linz Austria German
+London Canada English
+Laval Canada English
+Longueuil Canada English
+Lanzhou China Chinese
+Luoyang China Chinese
+Liuzhou China Chinese
+Liaoyang China Chinese
+Liupanshui China Chinese
+Liaoyuan China Chinese
+Lianyungang China Chinese
+Leshan China Chinese
+Linyi China Chinese
+Luzhou China Chinese
+Laiwu China Chinese
+Liaocheng China Chinese
+Laizhou China Chinese
+Linfen China Chinese
+Liangcheng China Chinese
+Longkou China Chinese
+Langfang China Chinese
+Liu´an China Chinese
+Longjing China Chinese
+Lengshuijiang China Chinese
+Laiyang China Chinese
+Longyan China Chinese
+Linhe China Chinese
+Leiyang China Chinese
+Loudi China Chinese
+Luohe China Chinese
+Linqing China Chinese
+Laohekou China Chinese
+Linchuan China Chinese
+Lhasa China Chinese
+Lianyuan China Chinese
+Liyang China Chinese
+Liling China Chinese
+Linhai China Chinese
+Larisa Greece Greek
+La Habana Cuba Spanish
+Lilongwe Malawi Chichewa
+León Mexico Spanish
+La Paz Mexico Spanish
+La Paz Mexico Spanish
+Lázaro Cárdenas Mexico Spanish
+Lagos de Moreno Mexico Spanish
+Lerdo Mexico Spanish
+Los Cabos Mexico Spanish
+Lerma Mexico Spanish
+Las Margaritas Mexico Spanish
+Lashio (Lasho) Myanmar Burmese
+Lalitapur Nepal Nepali
+León Nicaragua Spanish
+Lambaré Paraguay Spanish
+Lima Peru Spanish
+Lisboa Portugal Portuguese
+Lódz Poland Polish
+Lublin Poland Polish
+Legnica Poland Polish
+Lyon France French
+Le Havre France French
+Lille France French
+Le Mans France French
+Limoges France French
+Linköping Sweden Swedish
+Lund Sweden Swedish
+Leipzig Germany German
+Lübeck Germany German
+Ludwigshafen am Rhein Germany German
+Leverkusen Germany German
+Lünen Germany German
+Lahti Finland Finnish
+Lausanne Switzerland German
+Latakia Syria Arabic
+Luchou Taiwan Min
+Lungtan Taiwan Min
+Liberec Czech Republic Czech
+Lviv Ukraine Ukrainian
+Lugansk Ukraine Ukrainian
+Lutsk Ukraine Ukrainian
+Lysyt?ansk Ukraine Ukrainian
+Lower Hutt New Zealand English
+Lida Belarus Belorussian
+Los Teques Venezuela Spanish
+Lipetsk Russian Federation Russian
+Ljubertsy Russian Federation Russian
+Leninsk-Kuznetski Russian Federation Russian
+Long Xuyen Vietnam Vietnamese
+Los Angeles United States English
+Las Vegas United States English
+Long Beach United States English
+Lexington-Fayette United States English
+Louisville United States English
+Lincoln United States English
+Lubbock United States English
+Little Rock United States English
+Laredo United States English
+Lakewood United States English
+Lansing United States English
+Lancaster United States English
+Lafayette United States English
+Lowell United States English
+Livonia United States English
+# !!!NB igor: after backporting the SJ code the following should return
+# EXPLAIN
+# SELECT Name FROM City
+# WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+# City.Population > 100000;
+# id select_type table type possible_keys key key_len ref rows Extra
+# 1 PRIMARY Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+# 1 PRIMARY City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+EXPLAIN
+SELECT Name FROM City
+WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+City.Population > 100000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 PRIMARY City ALL Population NULL NULL NULL 4079 Using where
+2 DEPENDENT SUBQUERY Country unique_subquery PRIMARY,Name PRIMARY 3 func 1 Using where
+SELECT Name FROM City
+WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+City.Population > 100000;
+Name
+Vientiane
+Riga
+Daugavpils
+Maseru
+Beirut
+Tripoli
+Monrovia
+Tripoli
+Bengasi
+Misrata
+Vilnius
+Kaunas
+Klaipeda
+?iauliai
+Panevezys
+EXPLAIN
+SELECT Country.Name, IF(ISNULL(CountryLanguage.Country), NULL, CountryLanguage.Percentage)
+FROM Country LEFT JOIN CountryLanguage ON
+(CountryLanguage.Country=Country.Code AND Language='English')
+WHERE
+Country.Population > 10000000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE Country ALL NULL NULL NULL NULL 239 Using where
+1 SIMPLE CountryLanguage eq_ref PRIMARY PRIMARY 33 world.Country.Code,const 1 Using where; Using join buffer
+SELECT Country.Name, IF(ISNULL(CountryLanguage.Country), NULL, CountryLanguage.Percentage)
+FROM Country LEFT JOIN CountryLanguage ON
+(CountryLanguage.Country=Country.Code AND Language='English')
+WHERE
+Country.Population > 10000000;
+Name IF(ISNULL(CountryLanguage.Country), NULL, CountryLanguage.Percentage)
+Australia 81.2
+United Kingdom 97.3
+Canada 60.4
+United States 86.2
+Zimbabwe 2.2
+Japan 0.1
+South Africa 8.5
+Malaysia 1.6
+Afghanistan NULL
+Netherlands NULL
+Algeria NULL
+Angola NULL
+Argentina NULL
+Bangladesh NULL
+Belgium NULL
+Brazil NULL
+Burkina Faso NULL
+Chile NULL
+Ecuador NULL
+Egypt NULL
+Spain NULL
+Ethiopia NULL
+Philippines NULL
+Ghana NULL
+Guatemala NULL
+Indonesia NULL
+India NULL
+Iraq NULL
+Iran NULL
+Italy NULL
+Yemen NULL
+Yugoslavia NULL
+Cambodia NULL
+Cameroon NULL
+Kazakstan NULL
+Kenya NULL
+China NULL
+Colombia NULL
+Congo, The Democratic Republic of the NULL
+North Korea NULL
+South Korea NULL
+Greece NULL
+Cuba NULL
+Madagascar NULL
+Malawi NULL
+Mali NULL
+Morocco NULL
+Mexico NULL
+Mozambique NULL
+Myanmar NULL
+Nepal NULL
+Niger NULL
+Nigeria NULL
+Côte d?Ivoire NULL
+Pakistan NULL
+Peru NULL
+Poland NULL
+France NULL
+Romania NULL
+Germany NULL
+Saudi Arabia NULL
+Somalia NULL
+Sri Lanka NULL
+Sudan NULL
+Syria NULL
+Taiwan NULL
+Tanzania NULL
+Thailand NULL
+Czech Republic NULL
+Turkey NULL
+Uganda NULL
+Ukraine NULL
+Hungary NULL
+Uzbekistan NULL
+Belarus NULL
+Venezuela NULL
+Russian Federation NULL
+Vietnam NULL
+set join_cache_level=6;
+show variables like 'join_cache_level';
+Variable_name Value
+join_cache_level 6
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+1 SIMPLE City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+Name Name
+Vientiane Laos
+Riga Latvia
+Daugavpils Latvia
+Maseru Lesotho
+Beirut Lebanon
+Tripoli Lebanon
+Monrovia Liberia
+Tripoli Libyan Arab Jamahiriya
+Bengasi Libyan Arab Jamahiriya
+Misrata Libyan Arab Jamahiriya
+Vilnius Lithuania
+Kaunas Lithuania
+Klaipeda Lithuania
+?iauliai Lithuania
+Panevezys Lithuania
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE CountryLanguage ALL PRIMARY,Percentage NULL NULL NULL 984 Using where
+1 SIMPLE Country eq_ref PRIMARY PRIMARY 3 world.CountryLanguage.Country 1 Using where; Using join buffer
+1 SIMPLE City ref Country Country 3 world.Country.Code 18 Using index condition(BKA); Using where; Using join buffer
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+Name Name Language
+Leiden Netherlands Dutch
+La Matanza Argentina Spanish
+Lomas de Zamora Argentina Spanish
+La Plata Argentina Spanish
+Lanús Argentina Spanish
+Las Heras Argentina Spanish
+La Rioja Argentina Spanish
+Liège Belgium Dutch
+La Paz Bolivia Spanish
+Londrina Brazil Portuguese
+Limeira Brazil Portuguese
+Lages Brazil Portuguese
+Luziânia Brazil Portuguese
+Lauro de Freitas Brazil Portuguese
+Linhares Brazil Portuguese
+London United Kingdom English
+Liverpool United Kingdom English
+Leeds United Kingdom English
+Leicester United Kingdom English
+Luton United Kingdom English
+Los Angeles Chile Spanish
+La Serena Chile Spanish
+La Romana Dominican Republic Spanish
+Loja Ecuador Spanish
+Luxor Egypt Arabic
+Las Palmas de Gran Canaria Spain Spanish
+L´Hospitalet de Llobregat Spain Spanish
+Leganés Spain Spanish
+León Spain Spanish
+Logroño Spain Spanish
+Lleida (Lérida) Spain Spanish
+Le-Cap-Haïtien Haiti Haiti Creole
+La Ceiba Honduras Spanish
+Livorno Italy Italian
+Latina Italy Italian
+Lecce Italy Italian
+La Spezia Italy Italian
+Linz Austria German
+London Canada English
+Laval Canada English
+Longueuil Canada English
+Lanzhou China Chinese
+Luoyang China Chinese
+Liuzhou China Chinese
+Liaoyang China Chinese
+Liupanshui China Chinese
+Liaoyuan China Chinese
+Lianyungang China Chinese
+Leshan China Chinese
+Linyi China Chinese
+Luzhou China Chinese
+Laiwu China Chinese
+Liaocheng China Chinese
+Laizhou China Chinese
+Linfen China Chinese
+Liangcheng China Chinese
+Longkou China Chinese
+Langfang China Chinese
+Liu´an China Chinese
+Longjing China Chinese
+Lengshuijiang China Chinese
+Laiyang China Chinese
+Longyan China Chinese
+Linhe China Chinese
+Leiyang China Chinese
+Loudi China Chinese
+Luohe China Chinese
+Linqing China Chinese
+Laohekou China Chinese
+Linchuan China Chinese
+Lhasa China Chinese
+Lianyuan China Chinese
+Liyang China Chinese
+Liling China Chinese
+Linhai China Chinese
+Larisa Greece Greek
+La Habana Cuba Spanish
+Lilongwe Malawi Chichewa
+León Mexico Spanish
+La Paz Mexico Spanish
+La Paz Mexico Spanish
+Lázaro Cárdenas Mexico Spanish
+Lagos de Moreno Mexico Spanish
+Lerdo Mexico Spanish
+Los Cabos Mexico Spanish
+Lerma Mexico Spanish
+Las Margaritas Mexico Spanish
+Lashio (Lasho) Myanmar Burmese
+Lalitapur Nepal Nepali
+León Nicaragua Spanish
+Lambaré Paraguay Spanish
+Lima Peru Spanish
+Lisboa Portugal Portuguese
+Lódz Poland Polish
+Lublin Poland Polish
+Legnica Poland Polish
+Lyon France French
+Le Havre France French
+Lille France French
+Le Mans France French
+Limoges France French
+Linköping Sweden Swedish
+Lund Sweden Swedish
+Leipzig Germany German
+Lübeck Germany German
+Ludwigshafen am Rhein Germany German
+Leverkusen Germany German
+Lünen Germany German
+Lahti Finland Finnish
+Lausanne Switzerland German
+Latakia Syria Arabic
+Luchou Taiwan Min
+Lungtan Taiwan Min
+Liberec Czech Republic Czech
+Lviv Ukraine Ukrainian
+Lugansk Ukraine Ukrainian
+Lutsk Ukraine Ukrainian
+Lysyt?ansk Ukraine Ukrainian
+Lower Hutt New Zealand English
+Lida Belarus Belorussian
+Los Teques Venezuela Spanish
+Lipetsk Russian Federation Russian
+Ljubertsy Russian Federation Russian
+Leninsk-Kuznetski Russian Federation Russian
+Long Xuyen Vietnam Vietnamese
+Los Angeles United States English
+Las Vegas United States English
+Long Beach United States English
+Lexington-Fayette United States English
+Louisville United States English
+Lincoln United States English
+Lubbock United States English
+Little Rock United States English
+Laredo United States English
+Lakewood United States English
+Lansing United States English
+Lancaster United States English
+Lafayette United States English
+Lowell United States English
+Livonia United States English
+# !!!NB igor: after backporting the SJ code the following should return
+# EXPLAIN
+# SELECT Name FROM City
+# WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+# City.Population > 100000;
+# id select_type table type possible_keys key key_len ref rows Extra
+# 1 PRIMARY Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+# 1 PRIMARY City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+EXPLAIN
+SELECT Name FROM City
+WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+City.Population > 100000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 PRIMARY City ALL Population NULL NULL NULL 4079 Using where
+2 DEPENDENT SUBQUERY Country unique_subquery PRIMARY,Name PRIMARY 3 func 1 Using where
+SELECT Name FROM City
+WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+City.Population > 100000;
+Name
+Vientiane
+Riga
+Daugavpils
+Maseru
+Beirut
+Tripoli
+Monrovia
+Tripoli
+Bengasi
+Misrata
+Vilnius
+Kaunas
+Klaipeda
+?iauliai
+Panevezys
+EXPLAIN
+SELECT Country.Name, IF(ISNULL(CountryLanguage.Country), NULL, CountryLanguage.Percentage)
+FROM Country LEFT JOIN CountryLanguage ON
+(CountryLanguage.Country=Country.Code AND Language='English')
+WHERE
+Country.Population > 10000000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE Country ALL NULL NULL NULL NULL 239 Using where
+1 SIMPLE CountryLanguage eq_ref PRIMARY PRIMARY 33 world.Country.Code,const 1 Using where; Using join buffer
+SELECT Country.Name, IF(ISNULL(CountryLanguage.Country), NULL, CountryLanguage.Percentage)
+FROM Country LEFT JOIN CountryLanguage ON
+(CountryLanguage.Country=Country.Code AND Language='English')
+WHERE
+Country.Population > 10000000;
+Name IF(ISNULL(CountryLanguage.Country), NULL, CountryLanguage.Percentage)
+Australia 81.2
+United Kingdom 97.3
+Canada 60.4
+United States 86.2
+Zimbabwe 2.2
+Japan 0.1
+South Africa 8.5
+Malaysia 1.6
+Afghanistan NULL
+Netherlands NULL
+Algeria NULL
+Angola NULL
+Argentina NULL
+Bangladesh NULL
+Belgium NULL
+Brazil NULL
+Burkina Faso NULL
+Chile NULL
+Ecuador NULL
+Egypt NULL
+Spain NULL
+Ethiopia NULL
+Philippines NULL
+Ghana NULL
+Guatemala NULL
+Indonesia NULL
+India NULL
+Iraq NULL
+Iran NULL
+Italy NULL
+Yemen NULL
+Yugoslavia NULL
+Cambodia NULL
+Cameroon NULL
+Kazakstan NULL
+Kenya NULL
+China NULL
+Colombia NULL
+Congo, The Democratic Republic of the NULL
+North Korea NULL
+South Korea NULL
+Greece NULL
+Cuba NULL
+Madagascar NULL
+Malawi NULL
+Mali NULL
+Morocco NULL
+Mexico NULL
+Mozambique NULL
+Myanmar NULL
+Nepal NULL
+Niger NULL
+Nigeria NULL
+Côte d?Ivoire NULL
+Pakistan NULL
+Peru NULL
+Poland NULL
+France NULL
+Romania NULL
+Germany NULL
+Saudi Arabia NULL
+Somalia NULL
+Sri Lanka NULL
+Sudan NULL
+Syria NULL
+Taiwan NULL
+Tanzania NULL
+Thailand NULL
+Czech Republic NULL
+Turkey NULL
+Uganda NULL
+Ukraine NULL
+Hungary NULL
+Uzbekistan NULL
+Belarus NULL
+Venezuela NULL
+Russian Federation NULL
+Vietnam NULL
+set join_cache_level=7;
+show variables like 'join_cache_level';
+Variable_name Value
+join_cache_level 7
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+1 SIMPLE City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+Name Name
+Vientiane Laos
+Riga Latvia
+Daugavpils Latvia
+Maseru Lesotho
+Beirut Lebanon
+Tripoli Lebanon
+Monrovia Liberia
+Tripoli Libyan Arab Jamahiriya
+Bengasi Libyan Arab Jamahiriya
+Misrata Libyan Arab Jamahiriya
+Vilnius Lithuania
+Kaunas Lithuania
+Klaipeda Lithuania
+?iauliai Lithuania
+Panevezys Lithuania
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE CountryLanguage ALL PRIMARY,Percentage NULL NULL NULL 984 Using where
+1 SIMPLE Country eq_ref PRIMARY PRIMARY 3 world.CountryLanguage.Country 1 Using where; Using join buffer
+1 SIMPLE City ref Country Country 3 world.Country.Code 18 Using index condition(BKA); Using where; Using join buffer
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+Name Name Language
+Leiden Netherlands Dutch
+La Matanza Argentina Spanish
+Lomas de Zamora Argentina Spanish
+La Plata Argentina Spanish
+Lanús Argentina Spanish
+Las Heras Argentina Spanish
+La Rioja Argentina Spanish
+Liège Belgium Dutch
+La Paz Bolivia Spanish
+Londrina Brazil Portuguese
+Limeira Brazil Portuguese
+Lages Brazil Portuguese
+Luziânia Brazil Portuguese
+Lauro de Freitas Brazil Portuguese
+Linhares Brazil Portuguese
+London United Kingdom English
+Liverpool United Kingdom English
+Leeds United Kingdom English
+Leicester United Kingdom English
+Luton United Kingdom English
+Los Angeles Chile Spanish
+La Serena Chile Spanish
+La Romana Dominican Republic Spanish
+Loja Ecuador Spanish
+Luxor Egypt Arabic
+Las Palmas de Gran Canaria Spain Spanish
+L´Hospitalet de Llobregat Spain Spanish
+Leganés Spain Spanish
+León Spain Spanish
+Logroño Spain Spanish
+Lleida (Lérida) Spain Spanish
+Le-Cap-Haïtien Haiti Haiti Creole
+La Ceiba Honduras Spanish
+Livorno Italy Italian
+Latina Italy Italian
+Lecce Italy Italian
+La Spezia Italy Italian
+Linz Austria German
+London Canada English
+Laval Canada English
+Longueuil Canada English
+Lanzhou China Chinese
+Luoyang China Chinese
+Liuzhou China Chinese
+Liaoyang China Chinese
+Liupanshui China Chinese
+Liaoyuan China Chinese
+Lianyungang China Chinese
+Leshan China Chinese
+Linyi China Chinese
+Luzhou China Chinese
+Laiwu China Chinese
+Liaocheng China Chinese
+Laizhou China Chinese
+Linfen China Chinese
+Liangcheng China Chinese
+Longkou China Chinese
+Langfang China Chinese
+Liu´an China Chinese
+Longjing China Chinese
+Lengshuijiang China Chinese
+Laiyang China Chinese
+Longyan China Chinese
+Linhe China Chinese
+Leiyang China Chinese
+Loudi China Chinese
+Luohe China Chinese
+Linqing China Chinese
+Laohekou China Chinese
+Linchuan China Chinese
+Lhasa China Chinese
+Lianyuan China Chinese
+Liyang China Chinese
+Liling China Chinese
+Linhai China Chinese
+Larisa Greece Greek
+La Habana Cuba Spanish
+Lilongwe Malawi Chichewa
+León Mexico Spanish
+La Paz Mexico Spanish
+La Paz Mexico Spanish
+Lázaro Cárdenas Mexico Spanish
+Lagos de Moreno Mexico Spanish
+Lerdo Mexico Spanish
+Los Cabos Mexico Spanish
+Lerma Mexico Spanish
+Las Margaritas Mexico Spanish
+Lashio (Lasho) Myanmar Burmese
+Lalitapur Nepal Nepali
+León Nicaragua Spanish
+Lambaré Paraguay Spanish
+Lima Peru Spanish
+Lisboa Portugal Portuguese
+Lódz Poland Polish
+Lublin Poland Polish
+Legnica Poland Polish
+Lyon France French
+Le Havre France French
+Lille France French
+Le Mans France French
+Limoges France French
+Linköping Sweden Swedish
+Lund Sweden Swedish
+Leipzig Germany German
+Lübeck Germany German
+Ludwigshafen am Rhein Germany German
+Leverkusen Germany German
+Lünen Germany German
+Lahti Finland Finnish
+Lausanne Switzerland German
+Latakia Syria Arabic
+Luchou Taiwan Min
+Lungtan Taiwan Min
+Liberec Czech Republic Czech
+Lviv Ukraine Ukrainian
+Lugansk Ukraine Ukrainian
+Lutsk Ukraine Ukrainian
+Lysyt?ansk Ukraine Ukrainian
+Lower Hutt New Zealand English
+Lida Belarus Belorussian
+Los Teques Venezuela Spanish
+Lipetsk Russian Federation Russian
+Ljubertsy Russian Federation Russian
+Leninsk-Kuznetski Russian Federation Russian
+Long Xuyen Vietnam Vietnamese
+Los Angeles United States English
+Las Vegas United States English
+Long Beach United States English
+Lexington-Fayette United States English
+Louisville United States English
+Lincoln United States English
+Lubbock United States English
+Little Rock United States English
+Laredo United States English
+Lakewood United States English
+Lansing United States English
+Lancaster United States English
+Lafayette United States English
+Lowell United States English
+Livonia United States English
+# !!!NB igor: after backporting the SJ code the following should return
+# EXPLAIN
+# SELECT Name FROM City
+# WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+# City.Population > 100000;
+# id select_type table type possible_keys key key_len ref rows Extra
+# 1 PRIMARY Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+# 1 PRIMARY City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+EXPLAIN
+SELECT Name FROM City
+WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+City.Population > 100000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 PRIMARY City ALL Population NULL NULL NULL 4079 Using where
+2 DEPENDENT SUBQUERY Country unique_subquery PRIMARY,Name PRIMARY 3 func 1 Using where
+SELECT Name FROM City
+WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+City.Population > 100000;
+Name
+Vientiane
+Riga
+Daugavpils
+Maseru
+Beirut
+Tripoli
+Monrovia
+Tripoli
+Bengasi
+Misrata
+Vilnius
+Kaunas
+Klaipeda
+?iauliai
+Panevezys
+EXPLAIN
+SELECT Country.Name, IF(ISNULL(CountryLanguage.Country), NULL, CountryLanguage.Percentage)
+FROM Country LEFT JOIN CountryLanguage ON
+(CountryLanguage.Country=Country.Code AND Language='English')
+WHERE
+Country.Population > 10000000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE Country ALL NULL NULL NULL NULL 239 Using where
+1 SIMPLE CountryLanguage eq_ref PRIMARY PRIMARY 33 world.Country.Code,const 1 Using where; Using join buffer
+SELECT Country.Name, IF(ISNULL(CountryLanguage.Country), NULL, CountryLanguage.Percentage)
+FROM Country LEFT JOIN CountryLanguage ON
+(CountryLanguage.Country=Country.Code AND Language='English')
+WHERE
+Country.Population > 10000000;
+Name IF(ISNULL(CountryLanguage.Country), NULL, CountryLanguage.Percentage)
+Australia 81.2
+United Kingdom 97.3
+Canada 60.4
+United States 86.2
+Zimbabwe 2.2
+Japan 0.1
+South Africa 8.5
+Malaysia 1.6
+Afghanistan NULL
+Netherlands NULL
+Algeria NULL
+Angola NULL
+Argentina NULL
+Bangladesh NULL
+Belgium NULL
+Brazil NULL
+Burkina Faso NULL
+Chile NULL
+Ecuador NULL
+Egypt NULL
+Spain NULL
+Ethiopia NULL
+Philippines NULL
+Ghana NULL
+Guatemala NULL
+Indonesia NULL
+India NULL
+Iraq NULL
+Iran NULL
+Italy NULL
+Yemen NULL
+Yugoslavia NULL
+Cambodia NULL
+Cameroon NULL
+Kazakstan NULL
+Kenya NULL
+China NULL
+Colombia NULL
+Congo, The Democratic Republic of the NULL
+North Korea NULL
+South Korea NULL
+Greece NULL
+Cuba NULL
+Madagascar NULL
+Malawi NULL
+Mali NULL
+Morocco NULL
+Mexico NULL
+Mozambique NULL
+Myanmar NULL
+Nepal NULL
+Niger NULL
+Nigeria NULL
+Côte d?Ivoire NULL
+Pakistan NULL
+Peru NULL
+Poland NULL
+France NULL
+Romania NULL
+Germany NULL
+Saudi Arabia NULL
+Somalia NULL
+Sri Lanka NULL
+Sudan NULL
+Syria NULL
+Taiwan NULL
+Tanzania NULL
+Thailand NULL
+Czech Republic NULL
+Turkey NULL
+Uganda NULL
+Ukraine NULL
+Hungary NULL
+Uzbekistan NULL
+Belarus NULL
+Venezuela NULL
+Russian Federation NULL
+Vietnam NULL
+set join_cache_level=8;
+show variables like 'join_cache_level';
+Variable_name Value
+join_cache_level 8
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+1 SIMPLE City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+Name Name
+Vientiane Laos
+Riga Latvia
+Daugavpils Latvia
+Maseru Lesotho
+Beirut Lebanon
+Tripoli Lebanon
+Monrovia Liberia
+Tripoli Libyan Arab Jamahiriya
+Bengasi Libyan Arab Jamahiriya
+Misrata Libyan Arab Jamahiriya
+Vilnius Lithuania
+Kaunas Lithuania
+Klaipeda Lithuania
+?iauliai Lithuania
+Panevezys Lithuania
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE CountryLanguage ALL PRIMARY,Percentage NULL NULL NULL 984 Using where
+1 SIMPLE Country eq_ref PRIMARY PRIMARY 3 world.CountryLanguage.Country 1 Using where; Using join buffer
+1 SIMPLE City ref Country Country 3 world.Country.Code 18 Using index condition(BKA); Using where; Using join buffer
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+Name Name Language
+Leiden Netherlands Dutch
+La Matanza Argentina Spanish
+Lomas de Zamora Argentina Spanish
+La Plata Argentina Spanish
+Lanús Argentina Spanish
+Las Heras Argentina Spanish
+La Rioja Argentina Spanish
+Liège Belgium Dutch
+La Paz Bolivia Spanish
+Londrina Brazil Portuguese
+Limeira Brazil Portuguese
+Lages Brazil Portuguese
+Luziânia Brazil Portuguese
+Lauro de Freitas Brazil Portuguese
+Linhares Brazil Portuguese
+London United Kingdom English
+Liverpool United Kingdom English
+Leeds United Kingdom English
+Leicester United Kingdom English
+Luton United Kingdom English
+Los Angeles Chile Spanish
+La Serena Chile Spanish
+La Romana Dominican Republic Spanish
+Loja Ecuador Spanish
+Luxor Egypt Arabic
+Las Palmas de Gran Canaria Spain Spanish
+L´Hospitalet de Llobregat Spain Spanish
+Leganés Spain Spanish
+León Spain Spanish
+Logroño Spain Spanish
+Lleida (Lérida) Spain Spanish
+Le-Cap-Haïtien Haiti Haiti Creole
+La Ceiba Honduras Spanish
+Livorno Italy Italian
+Latina Italy Italian
+Lecce Italy Italian
+La Spezia Italy Italian
+Linz Austria German
+London Canada English
+Laval Canada English
+Longueuil Canada English
+Lanzhou China Chinese
+Luoyang China Chinese
+Liuzhou China Chinese
+Liaoyang China Chinese
+Liupanshui China Chinese
+Liaoyuan China Chinese
+Lianyungang China Chinese
+Leshan China Chinese
+Linyi China Chinese
+Luzhou China Chinese
+Laiwu China Chinese
+Liaocheng China Chinese
+Laizhou China Chinese
+Linfen China Chinese
+Liangcheng China Chinese
+Longkou China Chinese
+Langfang China Chinese
+Liu´an China Chinese
+Longjing China Chinese
+Lengshuijiang China Chinese
+Laiyang China Chinese
+Longyan China Chinese
+Linhe China Chinese
+Leiyang China Chinese
+Loudi China Chinese
+Luohe China Chinese
+Linqing China Chinese
+Laohekou China Chinese
+Linchuan China Chinese
+Lhasa China Chinese
+Lianyuan China Chinese
+Liyang China Chinese
+Liling China Chinese
+Linhai China Chinese
+Larisa Greece Greek
+La Habana Cuba Spanish
+Lilongwe Malawi Chichewa
+León Mexico Spanish
+La Paz Mexico Spanish
+La Paz Mexico Spanish
+Lázaro Cárdenas Mexico Spanish
+Lagos de Moreno Mexico Spanish
+Lerdo Mexico Spanish
+Los Cabos Mexico Spanish
+Lerma Mexico Spanish
+Las Margaritas Mexico Spanish
+Lashio (Lasho) Myanmar Burmese
+Lalitapur Nepal Nepali
+León Nicaragua Spanish
+Lambaré Paraguay Spanish
+Lima Peru Spanish
+Lisboa Portugal Portuguese
+Lódz Poland Polish
+Lublin Poland Polish
+Legnica Poland Polish
+Lyon France French
+Le Havre France French
+Lille France French
+Le Mans France French
+Limoges France French
+Linköping Sweden Swedish
+Lund Sweden Swedish
+Leipzig Germany German
+Lübeck Germany German
+Ludwigshafen am Rhein Germany German
+Leverkusen Germany German
+Lünen Germany German
+Lahti Finland Finnish
+Lausanne Switzerland German
+Latakia Syria Arabic
+Luchou Taiwan Min
+Lungtan Taiwan Min
+Liberec Czech Republic Czech
+Lviv Ukraine Ukrainian
+Lugansk Ukraine Ukrainian
+Lutsk Ukraine Ukrainian
+Lysyt?ansk Ukraine Ukrainian
+Lower Hutt New Zealand English
+Lida Belarus Belorussian
+Los Teques Venezuela Spanish
+Lipetsk Russian Federation Russian
+Ljubertsy Russian Federation Russian
+Leninsk-Kuznetski Russian Federation Russian
+Long Xuyen Vietnam Vietnamese
+Los Angeles United States English
+Las Vegas United States English
+Long Beach United States English
+Lexington-Fayette United States English
+Louisville United States English
+Lincoln United States English
+Lubbock United States English
+Little Rock United States English
+Laredo United States English
+Lakewood United States English
+Lansing United States English
+Lancaster United States English
+Lafayette United States English
+Lowell United States English
+Livonia United States English
+# !!!NB igor: after backporting the SJ code the following should return
+# EXPLAIN
+# SELECT Name FROM City
+# WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+# City.Population > 100000;
+# id select_type table type possible_keys key key_len ref rows Extra
+# 1 PRIMARY Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+# 1 PRIMARY City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+EXPLAIN
+SELECT Name FROM City
+WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+City.Population > 100000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 PRIMARY City ALL Population NULL NULL NULL 4079 Using where
+2 DEPENDENT SUBQUERY Country unique_subquery PRIMARY,Name PRIMARY 3 func 1 Using where
+SELECT Name FROM City
+WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+City.Population > 100000;
+Name
+Vientiane
+Riga
+Daugavpils
+Maseru
+Beirut
+Tripoli
+Monrovia
+Tripoli
+Bengasi
+Misrata
+Vilnius
+Kaunas
+Klaipeda
+?iauliai
+Panevezys
+EXPLAIN
+SELECT Country.Name, IF(ISNULL(CountryLanguage.Country), NULL, CountryLanguage.Percentage)
+FROM Country LEFT JOIN CountryLanguage ON
+(CountryLanguage.Country=Country.Code AND Language='English')
+WHERE
+Country.Population > 10000000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE Country ALL NULL NULL NULL NULL 239 Using where
+1 SIMPLE CountryLanguage eq_ref PRIMARY PRIMARY 33 world.Country.Code,const 1 Using where; Using join buffer
+SELECT Country.Name, IF(ISNULL(CountryLanguage.Country), NULL, CountryLanguage.Percentage)
+FROM Country LEFT JOIN CountryLanguage ON
+(CountryLanguage.Country=Country.Code AND Language='English')
+WHERE
+Country.Population > 10000000;
+Name IF(ISNULL(CountryLanguage.Country), NULL, CountryLanguage.Percentage)
+Australia 81.2
+United Kingdom 97.3
+Canada 60.4
+United States 86.2
+Zimbabwe 2.2
+Japan 0.1
+South Africa 8.5
+Malaysia 1.6
+Afghanistan NULL
+Netherlands NULL
+Algeria NULL
+Angola NULL
+Argentina NULL
+Bangladesh NULL
+Belgium NULL
+Brazil NULL
+Burkina Faso NULL
+Chile NULL
+Ecuador NULL
+Egypt NULL
+Spain NULL
+Ethiopia NULL
+Philippines NULL
+Ghana NULL
+Guatemala NULL
+Indonesia NULL
+India NULL
+Iraq NULL
+Iran NULL
+Italy NULL
+Yemen NULL
+Yugoslavia NULL
+Cambodia NULL
+Cameroon NULL
+Kazakstan NULL
+Kenya NULL
+China NULL
+Colombia NULL
+Congo, The Democratic Republic of the NULL
+North Korea NULL
+South Korea NULL
+Greece NULL
+Cuba NULL
+Madagascar NULL
+Malawi NULL
+Mali NULL
+Morocco NULL
+Mexico NULL
+Mozambique NULL
+Myanmar NULL
+Nepal NULL
+Niger NULL
+Nigeria NULL
+Côte d?Ivoire NULL
+Pakistan NULL
+Peru NULL
+Poland NULL
+France NULL
+Romania NULL
+Germany NULL
+Saudi Arabia NULL
+Somalia NULL
+Sri Lanka NULL
+Sudan NULL
+Syria NULL
+Taiwan NULL
+Tanzania NULL
+Thailand NULL
+Czech Republic NULL
+Turkey NULL
+Uganda NULL
+Ukraine NULL
+Hungary NULL
+Uzbekistan NULL
+Belarus NULL
+Venezuela NULL
+Russian Federation NULL
+Vietnam NULL
+set join_buffer_size=256;
+show variables like 'join_buffer_size';
+Variable_name Value
+join_buffer_size 256
+set join_cache_level=5;
+show variables like 'join_cache_level';
+Variable_name Value
+join_cache_level 5
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+1 SIMPLE City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+Name Name
+Vientiane Laos
+Riga Latvia
+Daugavpils Latvia
+Maseru Lesotho
+Beirut Lebanon
+Tripoli Lebanon
+Monrovia Liberia
+Tripoli Libyan Arab Jamahiriya
+Bengasi Libyan Arab Jamahiriya
+Misrata Libyan Arab Jamahiriya
+Vilnius Lithuania
+Kaunas Lithuania
+Klaipeda Lithuania
+?iauliai Lithuania
+Panevezys Lithuania
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE CountryLanguage ALL PRIMARY,Percentage NULL NULL NULL 984 Using where
+1 SIMPLE Country eq_ref PRIMARY PRIMARY 3 world.CountryLanguage.Country 1 Using where; Using join buffer
+1 SIMPLE City ref Country Country 3 world.Country.Code 18 Using index condition(BKA); Using where; Using join buffer
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+Name Name Language
+Leiden Netherlands Dutch
+La Matanza Argentina Spanish
+Lomas de Zamora Argentina Spanish
+La Plata Argentina Spanish
+Lanús Argentina Spanish
+Las Heras Argentina Spanish
+La Rioja Argentina Spanish
+Liège Belgium Dutch
+La Paz Bolivia Spanish
+Londrina Brazil Portuguese
+Limeira Brazil Portuguese
+Lages Brazil Portuguese
+Luziânia Brazil Portuguese
+Lauro de Freitas Brazil Portuguese
+Linhares Brazil Portuguese
+London United Kingdom English
+Liverpool United Kingdom English
+Leeds United Kingdom English
+Leicester United Kingdom English
+Luton United Kingdom English
+Los Angeles Chile Spanish
+La Serena Chile Spanish
+La Romana Dominican Republic Spanish
+Loja Ecuador Spanish
+Luxor Egypt Arabic
+Las Palmas de Gran Canaria Spain Spanish
+L´Hospitalet de Llobregat Spain Spanish
+Leganés Spain Spanish
+León Spain Spanish
+Logroño Spain Spanish
+Lleida (Lérida) Spain Spanish
+Le-Cap-Haïtien Haiti Haiti Creole
+La Ceiba Honduras Spanish
+Livorno Italy Italian
+Latina Italy Italian
+Lecce Italy Italian
+La Spezia Italy Italian
+Linz Austria German
+London Canada English
+Laval Canada English
+Longueuil Canada English
+Lanzhou China Chinese
+Luoyang China Chinese
+Liuzhou China Chinese
+Liaoyang China Chinese
+Liupanshui China Chinese
+Liaoyuan China Chinese
+Lianyungang China Chinese
+Leshan China Chinese
+Linyi China Chinese
+Luzhou China Chinese
+Laiwu China Chinese
+Liaocheng China Chinese
+Laizhou China Chinese
+Linfen China Chinese
+Liangcheng China Chinese
+Longkou China Chinese
+Langfang China Chinese
+Liu´an China Chinese
+Longjing China Chinese
+Lengshuijiang China Chinese
+Laiyang China Chinese
+Longyan China Chinese
+Linhe China Chinese
+Leiyang China Chinese
+Loudi China Chinese
+Luohe China Chinese
+Linqing China Chinese
+Laohekou China Chinese
+Linchuan China Chinese
+Lhasa China Chinese
+Lianyuan China Chinese
+Liyang China Chinese
+Liling China Chinese
+Linhai China Chinese
+Larisa Greece Greek
+La Habana Cuba Spanish
+Lilongwe Malawi Chichewa
+León Mexico Spanish
+La Paz Mexico Spanish
+La Paz Mexico Spanish
+Lázaro Cárdenas Mexico Spanish
+Lagos de Moreno Mexico Spanish
+Lerdo Mexico Spanish
+Los Cabos Mexico Spanish
+Lerma Mexico Spanish
+Las Margaritas Mexico Spanish
+Lashio (Lasho) Myanmar Burmese
+Lalitapur Nepal Nepali
+León Nicaragua Spanish
+Lambaré Paraguay Spanish
+Lima Peru Spanish
+Lisboa Portugal Portuguese
+Lódz Poland Polish
+Lublin Poland Polish
+Legnica Poland Polish
+Lyon France French
+Le Havre France French
+Lille France French
+Le Mans France French
+Limoges France French
+Linköping Sweden Swedish
+Lund Sweden Swedish
+Leipzig Germany German
+Lübeck Germany German
+Ludwigshafen am Rhein Germany German
+Leverkusen Germany German
+Lünen Germany German
+Lahti Finland Finnish
+Lausanne Switzerland German
+Latakia Syria Arabic
+Luchou Taiwan Min
+Lungtan Taiwan Min
+Liberec Czech Republic Czech
+Lviv Ukraine Ukrainian
+Lugansk Ukraine Ukrainian
+Lutsk Ukraine Ukrainian
+Lysyt?ansk Ukraine Ukrainian
+Lower Hutt New Zealand English
+Lida Belarus Belorussian
+Los Teques Venezuela Spanish
+Lipetsk Russian Federation Russian
+Ljubertsy Russian Federation Russian
+Leninsk-Kuznetski Russian Federation Russian
+Long Xuyen Vietnam Vietnamese
+Los Angeles United States English
+Las Vegas United States English
+Long Beach United States English
+Lexington-Fayette United States English
+Louisville United States English
+Lincoln United States English
+Lubbock United States English
+Little Rock United States English
+Laredo United States English
+Lakewood United States English
+Lansing United States English
+Lancaster United States English
+Lafayette United States English
+Lowell United States English
+Livonia United States English
+# !!!NB igor: after backporting the SJ code the following should return
+# EXPLAIN
+# SELECT Name FROM City
+# WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+# City.Population > 100000;
+# id select_type table type possible_keys key key_len ref rows Extra
+# 1 PRIMARY Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+# 1 PRIMARY City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+EXPLAIN
+SELECT Name FROM City
+WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+City.Population > 100000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 PRIMARY City ALL Population NULL NULL NULL 4079 Using where
+2 DEPENDENT SUBQUERY Country unique_subquery PRIMARY,Name PRIMARY 3 func 1 Using where
+SELECT Name FROM City
+WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+City.Population > 100000;
+Name
+Vientiane
+Riga
+Daugavpils
+Maseru
+Beirut
+Tripoli
+Monrovia
+Tripoli
+Bengasi
+Misrata
+Vilnius
+Kaunas
+Klaipeda
+?iauliai
+Panevezys
+set join_cache_level=6;
+show variables like 'join_cache_level';
+Variable_name Value
+join_cache_level 6
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+1 SIMPLE City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+Name Name
+Vientiane Laos
+Riga Latvia
+Daugavpils Latvia
+Maseru Lesotho
+Beirut Lebanon
+Tripoli Lebanon
+Monrovia Liberia
+Tripoli Libyan Arab Jamahiriya
+Bengasi Libyan Arab Jamahiriya
+Misrata Libyan Arab Jamahiriya
+Vilnius Lithuania
+Kaunas Lithuania
+Klaipeda Lithuania
+?iauliai Lithuania
+Panevezys Lithuania
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE CountryLanguage ALL PRIMARY,Percentage NULL NULL NULL 984 Using where
+1 SIMPLE Country eq_ref PRIMARY PRIMARY 3 world.CountryLanguage.Country 1 Using where; Using join buffer
+1 SIMPLE City ref Country Country 3 world.Country.Code 18 Using index condition(BKA); Using where; Using join buffer
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+Name Name Language
+Leiden Netherlands Dutch
+La Matanza Argentina Spanish
+Lomas de Zamora Argentina Spanish
+La Plata Argentina Spanish
+Lanús Argentina Spanish
+Las Heras Argentina Spanish
+La Rioja Argentina Spanish
+Liège Belgium Dutch
+La Paz Bolivia Spanish
+Londrina Brazil Portuguese
+Limeira Brazil Portuguese
+Lages Brazil Portuguese
+Luziânia Brazil Portuguese
+Lauro de Freitas Brazil Portuguese
+Linhares Brazil Portuguese
+London United Kingdom English
+Liverpool United Kingdom English
+Leeds United Kingdom English
+Leicester United Kingdom English
+Luton United Kingdom English
+Los Angeles Chile Spanish
+La Serena Chile Spanish
+La Romana Dominican Republic Spanish
+Loja Ecuador Spanish
+Luxor Egypt Arabic
+Las Palmas de Gran Canaria Spain Spanish
+L´Hospitalet de Llobregat Spain Spanish
+Leganés Spain Spanish
+León Spain Spanish
+Logroño Spain Spanish
+Lleida (Lérida) Spain Spanish
+Le-Cap-Haïtien Haiti Haiti Creole
+La Ceiba Honduras Spanish
+Livorno Italy Italian
+Latina Italy Italian
+Lecce Italy Italian
+La Spezia Italy Italian
+Linz Austria German
+London Canada English
+Laval Canada English
+Longueuil Canada English
+Lanzhou China Chinese
+Luoyang China Chinese
+Liuzhou China Chinese
+Liaoyang China Chinese
+Liupanshui China Chinese
+Liaoyuan China Chinese
+Lianyungang China Chinese
+Leshan China Chinese
+Linyi China Chinese
+Luzhou China Chinese
+Laiwu China Chinese
+Liaocheng China Chinese
+Laizhou China Chinese
+Linfen China Chinese
+Liangcheng China Chinese
+Longkou China Chinese
+Langfang China Chinese
+Liu´an China Chinese
+Longjing China Chinese
+Lengshuijiang China Chinese
+Laiyang China Chinese
+Longyan China Chinese
+Linhe China Chinese
+Leiyang China Chinese
+Loudi China Chinese
+Luohe China Chinese
+Linqing China Chinese
+Laohekou China Chinese
+Linchuan China Chinese
+Lhasa China Chinese
+Lianyuan China Chinese
+Liyang China Chinese
+Liling China Chinese
+Linhai China Chinese
+Larisa Greece Greek
+La Habana Cuba Spanish
+Lilongwe Malawi Chichewa
+León Mexico Spanish
+La Paz Mexico Spanish
+La Paz Mexico Spanish
+Lázaro Cárdenas Mexico Spanish
+Lagos de Moreno Mexico Spanish
+Lerdo Mexico Spanish
+Los Cabos Mexico Spanish
+Lerma Mexico Spanish
+Las Margaritas Mexico Spanish
+Lashio (Lasho) Myanmar Burmese
+Lalitapur Nepal Nepali
+León Nicaragua Spanish
+Lambaré Paraguay Spanish
+Lima Peru Spanish
+Lisboa Portugal Portuguese
+Lódz Poland Polish
+Lublin Poland Polish
+Legnica Poland Polish
+Lyon France French
+Le Havre France French
+Lille France French
+Le Mans France French
+Limoges France French
+Linköping Sweden Swedish
+Lund Sweden Swedish
+Leipzig Germany German
+Lübeck Germany German
+Ludwigshafen am Rhein Germany German
+Leverkusen Germany German
+Lünen Germany German
+Lahti Finland Finnish
+Lausanne Switzerland German
+Latakia Syria Arabic
+Luchou Taiwan Min
+Lungtan Taiwan Min
+Liberec Czech Republic Czech
+Lviv Ukraine Ukrainian
+Lugansk Ukraine Ukrainian
+Lutsk Ukraine Ukrainian
+Lysyt?ansk Ukraine Ukrainian
+Lower Hutt New Zealand English
+Lida Belarus Belorussian
+Los Teques Venezuela Spanish
+Lipetsk Russian Federation Russian
+Ljubertsy Russian Federation Russian
+Leninsk-Kuznetski Russian Federation Russian
+Long Xuyen Vietnam Vietnamese
+Los Angeles United States English
+Las Vegas United States English
+Long Beach United States English
+Lexington-Fayette United States English
+Louisville United States English
+Lincoln United States English
+Lubbock United States English
+Little Rock United States English
+Laredo United States English
+Lakewood United States English
+Lansing United States English
+Lancaster United States English
+Lafayette United States English
+Lowell United States English
+Livonia United States English
+# !!!NB igor: after backporting the SJ code the following should return
+# EXPLAIN
+# SELECT Name FROM City
+# WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+# City.Population > 100000;
+# id select_type table type possible_keys key key_len ref rows Extra
+# 1 PRIMARY Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+# 1 PRIMARY City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+EXPLAIN
+SELECT Name FROM City
+WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+City.Population > 100000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 PRIMARY City ALL Population NULL NULL NULL 4079 Using where
+2 DEPENDENT SUBQUERY Country unique_subquery PRIMARY,Name PRIMARY 3 func 1 Using where
+SELECT Name FROM City
+WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+City.Population > 100000;
+Name
+Vientiane
+Riga
+Daugavpils
+Maseru
+Beirut
+Tripoli
+Monrovia
+Tripoli
+Bengasi
+Misrata
+Vilnius
+Kaunas
+Klaipeda
+?iauliai
+Panevezys
+set join_cache_level=7;
+show variables like 'join_cache_level';
+Variable_name Value
+join_cache_level 7
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+1 SIMPLE City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+Name Name
+Vientiane Laos
+Riga Latvia
+Daugavpils Latvia
+Maseru Lesotho
+Beirut Lebanon
+Tripoli Lebanon
+Monrovia Liberia
+Tripoli Libyan Arab Jamahiriya
+Bengasi Libyan Arab Jamahiriya
+Misrata Libyan Arab Jamahiriya
+Vilnius Lithuania
+Kaunas Lithuania
+Klaipeda Lithuania
+?iauliai Lithuania
+Panevezys Lithuania
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE CountryLanguage ALL PRIMARY,Percentage NULL NULL NULL 984 Using where
+1 SIMPLE Country eq_ref PRIMARY PRIMARY 3 world.CountryLanguage.Country 1 Using where; Using join buffer
+1 SIMPLE City ref Country Country 3 world.Country.Code 18 Using index condition(BKA); Using where; Using join buffer
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+Name Name Language
+Leiden Netherlands Dutch
+La Matanza Argentina Spanish
+Lomas de Zamora Argentina Spanish
+La Plata Argentina Spanish
+Lanús Argentina Spanish
+Las Heras Argentina Spanish
+La Rioja Argentina Spanish
+Liège Belgium Dutch
+La Paz Bolivia Spanish
+Londrina Brazil Portuguese
+Limeira Brazil Portuguese
+Lages Brazil Portuguese
+Luziânia Brazil Portuguese
+Lauro de Freitas Brazil Portuguese
+Linhares Brazil Portuguese
+London United Kingdom English
+Liverpool United Kingdom English
+Leeds United Kingdom English
+Leicester United Kingdom English
+Luton United Kingdom English
+Los Angeles Chile Spanish
+La Serena Chile Spanish
+La Romana Dominican Republic Spanish
+Loja Ecuador Spanish
+Luxor Egypt Arabic
+Las Palmas de Gran Canaria Spain Spanish
+L´Hospitalet de Llobregat Spain Spanish
+Leganés Spain Spanish
+León Spain Spanish
+Logroño Spain Spanish
+Lleida (Lérida) Spain Spanish
+Le-Cap-Haïtien Haiti Haiti Creole
+La Ceiba Honduras Spanish
+Livorno Italy Italian
+Latina Italy Italian
+Lecce Italy Italian
+La Spezia Italy Italian
+Linz Austria German
+London Canada English
+Laval Canada English
+Longueuil Canada English
+Lanzhou China Chinese
+Luoyang China Chinese
+Liuzhou China Chinese
+Liaoyang China Chinese
+Liupanshui China Chinese
+Liaoyuan China Chinese
+Lianyungang China Chinese
+Leshan China Chinese
+Linyi China Chinese
+Luzhou China Chinese
+Laiwu China Chinese
+Liaocheng China Chinese
+Laizhou China Chinese
+Linfen China Chinese
+Liangcheng China Chinese
+Longkou China Chinese
+Langfang China Chinese
+Liu´an China Chinese
+Longjing China Chinese
+Lengshuijiang China Chinese
+Laiyang China Chinese
+Longyan China Chinese
+Linhe China Chinese
+Leiyang China Chinese
+Loudi China Chinese
+Luohe China Chinese
+Linqing China Chinese
+Laohekou China Chinese
+Linchuan China Chinese
+Lhasa China Chinese
+Lianyuan China Chinese
+Liyang China Chinese
+Liling China Chinese
+Linhai China Chinese
+Larisa Greece Greek
+La Habana Cuba Spanish
+Lilongwe Malawi Chichewa
+León Mexico Spanish
+La Paz Mexico Spanish
+La Paz Mexico Spanish
+Lázaro Cárdenas Mexico Spanish
+Lagos de Moreno Mexico Spanish
+Lerdo Mexico Spanish
+Los Cabos Mexico Spanish
+Lerma Mexico Spanish
+Las Margaritas Mexico Spanish
+Lashio (Lasho) Myanmar Burmese
+Lalitapur Nepal Nepali
+León Nicaragua Spanish
+Lambaré Paraguay Spanish
+Lima Peru Spanish
+Lisboa Portugal Portuguese
+Lódz Poland Polish
+Lublin Poland Polish
+Legnica Poland Polish
+Lyon France French
+Le Havre France French
+Lille France French
+Le Mans France French
+Limoges France French
+Linköping Sweden Swedish
+Lund Sweden Swedish
+Leipzig Germany German
+Lübeck Germany German
+Ludwigshafen am Rhein Germany German
+Leverkusen Germany German
+Lünen Germany German
+Lahti Finland Finnish
+Lausanne Switzerland German
+Latakia Syria Arabic
+Luchou Taiwan Min
+Lungtan Taiwan Min
+Liberec Czech Republic Czech
+Lviv Ukraine Ukrainian
+Lugansk Ukraine Ukrainian
+Lutsk Ukraine Ukrainian
+Lysyt?ansk Ukraine Ukrainian
+Lower Hutt New Zealand English
+Lida Belarus Belorussian
+Los Teques Venezuela Spanish
+Lipetsk Russian Federation Russian
+Ljubertsy Russian Federation Russian
+Leninsk-Kuznetski Russian Federation Russian
+Long Xuyen Vietnam Vietnamese
+Los Angeles United States English
+Las Vegas United States English
+Long Beach United States English
+Lexington-Fayette United States English
+Louisville United States English
+Lincoln United States English
+Lubbock United States English
+Little Rock United States English
+Laredo United States English
+Lakewood United States English
+Lansing United States English
+Lancaster United States English
+Lafayette United States English
+Lowell United States English
+Livonia United States English
+# !!!NB igor: after backporting the SJ code the following should return
+# EXPLAIN
+# SELECT Name FROM City
+# WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+# City.Population > 100000;
+# id select_type table type possible_keys key key_len ref rows Extra
+# 1 PRIMARY Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+# 1 PRIMARY City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+EXPLAIN
+SELECT Name FROM City
+WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+City.Population > 100000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 PRIMARY City ALL Population NULL NULL NULL 4079 Using where
+2 DEPENDENT SUBQUERY Country unique_subquery PRIMARY,Name PRIMARY 3 func 1 Using where
+SELECT Name FROM City
+WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+City.Population > 100000;
+Name
+Vientiane
+Riga
+Daugavpils
+Maseru
+Beirut
+Tripoli
+Monrovia
+Tripoli
+Bengasi
+Misrata
+Vilnius
+Kaunas
+Klaipeda
+?iauliai
+Panevezys
+set join_cache_level=8;
+show variables like 'join_cache_level';
+Variable_name Value
+join_cache_level 8
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+1 SIMPLE City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+Name Name
+Vientiane Laos
+Riga Latvia
+Daugavpils Latvia
+Maseru Lesotho
+Beirut Lebanon
+Tripoli Lebanon
+Monrovia Liberia
+Tripoli Libyan Arab Jamahiriya
+Bengasi Libyan Arab Jamahiriya
+Misrata Libyan Arab Jamahiriya
+Vilnius Lithuania
+Kaunas Lithuania
+Klaipeda Lithuania
+?iauliai Lithuania
+Panevezys Lithuania
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE CountryLanguage ALL PRIMARY,Percentage NULL NULL NULL 984 Using where
+1 SIMPLE Country eq_ref PRIMARY PRIMARY 3 world.CountryLanguage.Country 1 Using where; Using join buffer
+1 SIMPLE City ref Country Country 3 world.Country.Code 18 Using index condition(BKA); Using where; Using join buffer
+SELECT City.Name, Country.Name, CountryLanguage.Language
+FROM City,Country,CountryLanguage
+WHERE City.Country=Country.Code AND
+CountryLanguage.Country=Country.Code AND
+City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+CountryLanguage.Percentage > 50;
+Name Name Language
+Leiden Netherlands Dutch
+La Matanza Argentina Spanish
+Lomas de Zamora Argentina Spanish
+La Plata Argentina Spanish
+Lanús Argentina Spanish
+Las Heras Argentina Spanish
+La Rioja Argentina Spanish
+Liège Belgium Dutch
+La Paz Bolivia Spanish
+Londrina Brazil Portuguese
+Limeira Brazil Portuguese
+Lages Brazil Portuguese
+Luziânia Brazil Portuguese
+Lauro de Freitas Brazil Portuguese
+Linhares Brazil Portuguese
+London United Kingdom English
+Liverpool United Kingdom English
+Leeds United Kingdom English
+Leicester United Kingdom English
+Luton United Kingdom English
+Los Angeles Chile Spanish
+La Serena Chile Spanish
+La Romana Dominican Republic Spanish
+Loja Ecuador Spanish
+Luxor Egypt Arabic
+Las Palmas de Gran Canaria Spain Spanish
+L´Hospitalet de Llobregat Spain Spanish
+Leganés Spain Spanish
+León Spain Spanish
+Logroño Spain Spanish
+Lleida (Lérida) Spain Spanish
+Le-Cap-Haïtien Haiti Haiti Creole
+La Ceiba Honduras Spanish
+Livorno Italy Italian
+Latina Italy Italian
+Lecce Italy Italian
+La Spezia Italy Italian
+Linz Austria German
+London Canada English
+Laval Canada English
+Longueuil Canada English
+Lanzhou China Chinese
+Luoyang China Chinese
+Liuzhou China Chinese
+Liaoyang China Chinese
+Liupanshui China Chinese
+Liaoyuan China Chinese
+Lianyungang China Chinese
+Leshan China Chinese
+Linyi China Chinese
+Luzhou China Chinese
+Laiwu China Chinese
+Liaocheng China Chinese
+Laizhou China Chinese
+Linfen China Chinese
+Liangcheng China Chinese
+Longkou China Chinese
+Langfang China Chinese
+Liu´an China Chinese
+Longjing China Chinese
+Lengshuijiang China Chinese
+Laiyang China Chinese
+Longyan China Chinese
+Linhe China Chinese
+Leiyang China Chinese
+Loudi China Chinese
+Luohe China Chinese
+Linqing China Chinese
+Laohekou China Chinese
+Linchuan China Chinese
+Lhasa China Chinese
+Lianyuan China Chinese
+Liyang China Chinese
+Liling China Chinese
+Linhai China Chinese
+Larisa Greece Greek
+La Habana Cuba Spanish
+Lilongwe Malawi Chichewa
+León Mexico Spanish
+La Paz Mexico Spanish
+La Paz Mexico Spanish
+Lázaro Cárdenas Mexico Spanish
+Lagos de Moreno Mexico Spanish
+Lerdo Mexico Spanish
+Los Cabos Mexico Spanish
+Lerma Mexico Spanish
+Las Margaritas Mexico Spanish
+Lashio (Lasho) Myanmar Burmese
+Lalitapur Nepal Nepali
+León Nicaragua Spanish
+Lambaré Paraguay Spanish
+Lima Peru Spanish
+Lisboa Portugal Portuguese
+Lódz Poland Polish
+Lublin Poland Polish
+Legnica Poland Polish
+Lyon France French
+Le Havre France French
+Lille France French
+Le Mans France French
+Limoges France French
+Linköping Sweden Swedish
+Lund Sweden Swedish
+Leipzig Germany German
+Lübeck Germany German
+Ludwigshafen am Rhein Germany German
+Leverkusen Germany German
+Lünen Germany German
+Lahti Finland Finnish
+Lausanne Switzerland German
+Latakia Syria Arabic
+Luchou Taiwan Min
+Lungtan Taiwan Min
+Liberec Czech Republic Czech
+Lviv Ukraine Ukrainian
+Lugansk Ukraine Ukrainian
+Lutsk Ukraine Ukrainian
+Lysyt?ansk Ukraine Ukrainian
+Lower Hutt New Zealand English
+Lida Belarus Belorussian
+Los Teques Venezuela Spanish
+Lipetsk Russian Federation Russian
+Ljubertsy Russian Federation Russian
+Leninsk-Kuznetski Russian Federation Russian
+Long Xuyen Vietnam Vietnamese
+Los Angeles United States English
+Las Vegas United States English
+Long Beach United States English
+Lexington-Fayette United States English
+Louisville United States English
+Lincoln United States English
+Lubbock United States English
+Little Rock United States English
+Laredo United States English
+Lakewood United States English
+Lansing United States English
+Lancaster United States English
+Lafayette United States English
+Lowell United States English
+Livonia United States English
+# !!!NB igor: after backporting the SJ code the following should return
+# EXPLAIN
+# SELECT Name FROM City
+# WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+# City.Population > 100000;
+# id select_type table type possible_keys key key_len ref rows Extra
+# 1 PRIMARY Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+# 1 PRIMARY City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+EXPLAIN
+SELECT Name FROM City
+WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+City.Population > 100000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 PRIMARY City ALL Population NULL NULL NULL 4079 Using where
+2 DEPENDENT SUBQUERY Country unique_subquery PRIMARY,Name PRIMARY 3 func 1 Using where
+SELECT Name FROM City
+WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+City.Population > 100000;
+Name
+Vientiane
+Riga
+Daugavpils
+Maseru
+Beirut
+Tripoli
+Monrovia
+Tripoli
+Bengasi
+Misrata
+Vilnius
+Kaunas
+Klaipeda
+?iauliai
+Panevezys
+set join_cache_level=default;
+set join_buffer_size=default;
+show variables like 'join_buffer_size';
+Variable_name Value
+join_buffer_size 131072
+show variables like 'join_cache_level';
+Variable_name Value
+join_cache_level 1
+set join_cache_level=1;
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND City.Population > 3000000;
+Name Name
+Sydney Australia
+Dhaka Bangladesh
+São Paulo Brazil
+Rio de Janeiro Brazil
+London United Kingdom
+Santiago de Chile Chile
+Cairo Egypt
+Alexandria Egypt
+Jakarta Indonesia
+Mumbai (Bombay) India
+Delhi India
+Calcutta [Kolkata] India
+Chennai (Madras) India
+Baghdad Iraq
+Teheran Iran
+Tokyo Japan
+Jokohama [Yokohama] Japan
+Shanghai China
+Peking China
+Chongqing China
+Tianjin China
+Wuhan China
+Harbin China
+Shenyang China
+Kanton [Guangzhou] China
+Chengdu China
+Santafé de Bogotá Colombia
+Kinshasa Congo, The Democratic Republic of the
+Seoul South Korea
+Pusan South Korea
+Ciudad de México Mexico
+Rangoon (Yangon) Myanmar
+Karachi Pakistan
+Lahore Pakistan
+Lima Peru
+Berlin Germany
+Riyadh Saudi Arabia
+Singapore Singapore
+Bangkok Thailand
+Istanbul Turkey
+Ankara Turkey
+Moscow Russian Federation
+St Petersburg Russian Federation
+Ho Chi Minh City Vietnam
+New York United States
+Los Angeles United States
+set join_cache_level=8;
+set join_buffer_size=256;
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND City.Population > 3000000;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE City range Population,Country Population 4 NULL # Using index condition; Using MRR
+1 SIMPLE Country eq_ref PRIMARY PRIMARY 3 world.City.Country # Using join buffer
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND City.Population > 3000000;
+Name Name
+Sydney Australia
+Dhaka Bangladesh
+Rio de Janeiro Brazil
+São Paulo Brazil
+London United Kingdom
+Santiago de Chile Chile
+Alexandria Egypt
+Cairo Egypt
+Jakarta Indonesia
+Delhi India
+Calcutta [Kolkata] India
+Mumbai (Bombay) India
+Chennai (Madras) India
+Baghdad Iraq
+Teheran Iran
+Tokyo Japan
+Jokohama [Yokohama] Japan
+Peking China
+Chongqing China
+Shanghai China
+Wuhan China
+Harbin China
+Shenyang China
+Kanton [Guangzhou] China
+Tianjin China
+Chengdu China
+Santafé de Bogotá Colombia
+Kinshasa Congo, The Democratic Republic of the
+Seoul South Korea
+Pusan South Korea
+Ciudad de México Mexico
+Rangoon (Yangon) Myanmar
+Karachi Pakistan
+Lahore Pakistan
+Lima Peru
+Berlin Germany
+Riyadh Saudi Arabia
+Singapore Singapore
+Bangkok Thailand
+Ankara Turkey
+Istanbul Turkey
+St Petersburg Russian Federation
+Moscow Russian Federation
+Ho Chi Minh City Vietnam
+Los Angeles United States
+New York United States
+set join_buffer_size=default;
+set join_cache_level=6;
+ALTER TABLE Country MODIFY Name varchar(52) NOT NULL default '';
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+Name Name
+Vientiane Laos
+Riga Latvia
+Daugavpils Latvia
+Maseru Lesotho
+Beirut Lebanon
+Tripoli Lebanon
+Monrovia Liberia
+Tripoli Libyan Arab Jamahiriya
+Bengasi Libyan Arab Jamahiriya
+Misrata Libyan Arab Jamahiriya
+Vilnius Lithuania
+Kaunas Lithuania
+Klaipeda Lithuania
+?iauliai Lithuania
+Panevezys Lithuania
+ALTER TABLE Country MODIFY Name varchar(300) NOT NULL default '';
+SELECT City.Name, Country.Name FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+Name Name
+Vientiane Laos
+Riga Latvia
+Daugavpils Latvia
+Maseru Lesotho
+Beirut Lebanon
+Tripoli Lebanon
+Monrovia Liberia
+Tripoli Libyan Arab Jamahiriya
+Bengasi Libyan Arab Jamahiriya
+Misrata Libyan Arab Jamahiriya
+Vilnius Lithuania
+Kaunas Lithuania
+Klaipeda Lithuania
+?iauliai Lithuania
+Panevezys Lithuania
+ALTER TABLE Country ADD COLUMN PopulationBar text;
+UPDATE Country
+SET PopulationBar=REPEAT('x', CAST(Population/100000 AS unsigned int));
+SELECT City.Name, Country.Name, Country.PopulationBar FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+Name Name PopulationBar
+Vientiane Laos xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+Riga Latvia xxxxxxxxxxxxxxxxxxxxxxxx
+Daugavpils Latvia xxxxxxxxxxxxxxxxxxxxxxxx
+Maseru Lesotho xxxxxxxxxxxxxxxxxxxxxx
+Beirut Lebanon xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+Tripoli Lebanon xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+Monrovia Liberia xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+Tripoli Libyan Arab Jamahiriya xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+Bengasi Libyan Arab Jamahiriya xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+Misrata Libyan Arab Jamahiriya xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+Vilnius Lithuania xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+Kaunas Lithuania xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+Klaipeda Lithuania xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+?iauliai Lithuania xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+Panevezys Lithuania xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+set join_buffer_size=256;
+SELECT City.Name, Country.Name, Country.PopulationBar FROM City,Country
+WHERE City.Country=Country.Code AND
+Country.Name LIKE 'L%' AND City.Population > 100000;
+Name Name PopulationBar
+Vientiane Laos xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+Riga Latvia xxxxxxxxxxxxxxxxxxxxxxxx
+Daugavpils Latvia xxxxxxxxxxxxxxxxxxxxxxxx
+Maseru Lesotho xxxxxxxxxxxxxxxxxxxxxx
+Beirut Lebanon xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+Tripoli Lebanon xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+Monrovia Liberia xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+Tripoli Libyan Arab Jamahiriya xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+Bengasi Libyan Arab Jamahiriya xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+Misrata Libyan Arab Jamahiriya xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+Vilnius Lithuania xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+Kaunas Lithuania xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+Klaipeda Lithuania xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+?iauliai Lithuania xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+Panevezys Lithuania xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+set join_cache_level=default;
+set join_buffer_size=default;
+DROP DATABASE world;
+use test;
+CREATE TABLE t1(
+affiliatetometaid int NOT NULL default '0',
+uniquekey int NOT NULL default '0',
+metaid int NOT NULL default '0',
+affiliateid int NOT NULL default '0',
+xml text,
+isactive char(1) NOT NULL default 'Y',
+PRIMARY KEY (affiliatetometaid)
+);
+CREATE UNIQUE INDEX t1_uniquekey ON t1(uniquekey);
+CREATE INDEX t1_affiliateid ON t1(affiliateid);
+CREATE INDEX t1_metaid on t1 (metaid);
+INSERT INTO t1 VALUES
+(1616, 1571693233, 1391, 2, NULL, 'Y'), (1943, 1993216749, 1726, 2, NULL, 'Y');
+CREATE TABLE t2(
+metaid int NOT NULL default '0',
+name varchar(80) NOT NULL default '',
+dateadded timestamp NOT NULL ,
+xml text,
+status int default NULL,
+origin int default NULL,
+gid int NOT NULL default '1',
+formattypeid int default NULL,
+PRIMARY KEY (metaid)
+);
+CREATE INDEX t2_status ON t2(status);
+CREATE INDEX t2_gid ON t2(gid);
+CREATE INDEX t2_formattypeid ON t2(formattypeid);
+INSERT INTO t2 VALUES
+(1391, "I Just Died", "2003-10-02 10:07:37", "", 1, NULL, 3, NULL),
+(1726, "Me, Myself & I", "2003-12-05 11:24:36", " ", 1, NULL, 3, NULL);
+CREATE TABLE t3(
+mediaid int NOT NULL ,
+metaid int NOT NULL default '0',
+formatid int NOT NULL default '0',
+status int default NULL,
+path varchar(100) NOT NULL default '',
+datemodified timestamp NOT NULL ,
+resourcetype int NOT NULL default '1',
+parameters text,
+signature int default NULL,
+quality int NOT NULL default '255',
+PRIMARY KEY (mediaid)
+);
+CREATE INDEX t3_metaid ON t3(metaid);
+CREATE INDEX t3_formatid ON t3(formatid);
+CREATE INDEX t3_status ON t3(status);
+CREATE INDEX t3_metaidformatid ON t3(metaid,formatid);
+CREATE INDEX t3_signature ON t3(signature);
+CREATE INDEX t3_quality ON t3(quality);
+INSERT INTO t3 VALUES
+(6, 4, 8, 0, "010101_anastacia_spmidi.mid", "2004-03-16 13:40:00", 1, NULL, NULL, 255),
+(3343, 3, 8, 1, "010102_4VN4bsPwnxRQUJW5Zp1RhG2IL9vvl_8.mid", "2004-03-16 13:40:00", 1, NULL, NULL, 255);
+CREATE TABLE t4(
+formatid int NOT NULL ,
+name varchar(60) NOT NULL default '',
+formatclassid int NOT NULL default '0',
+mime varchar(60) default NULL,
+extension varchar(10) default NULL,
+priority int NOT NULL default '0',
+canaddtocapability char(1) NOT NULL default 'Y',
+PRIMARY KEY (formatid)
+);
+CREATE INDEX t4_formatclassid ON t4(formatclassid);
+CREATE INDEX t4_formats_idx ON t4(canaddtocapability);
+INSERT INTO t4 VALUES
+(19, "XHTML", 11, "text/html", "xhtml", 10, 'Y'),
+(54, "AMR (wide band)", 13, "audio/amr-wb", "awb", 0, 'Y');
+CREATE TABLE t5(
+formatclassid int NOT NULL ,
+name varchar(60) NOT NULL default '',
+priority int NOT NULL default '0',
+formattypeid int NOT NULL default '0',
+PRIMARY KEY (formatclassid)
+);
+CREATE INDEX t5_formattypeid on t5(formattypeid);
+INSERT INTO t5 VALUES
+(11, "Info", 0, 4), (13, "Digital Audio", 0, 2);
+CREATE TABLE t6(
+formattypeid int NOT NULL ,
+name varchar(60) NOT NULL default '',
+priority int default NULL,
+PRIMARY KEY (formattypeid)
+);
+INSERT INTO t6 VALUES
+(2, "Ringtones", 0);
+CREATE TABLE t7(
+metaid int NOT NULL default '0',
+artistid int NOT NULL default '0',
+PRIMARY KEY (metaid,artistid)
+);
+INSERT INTO t7 VALUES
+(4, 5), (3, 4);
+CREATE TABLE t8(
+artistid int NOT NULL ,
+name varchar(80) NOT NULL default '',
+PRIMARY KEY (artistid)
+);
+INSERT INTO t8 VALUES
+(5, "Anastacia"), (4, "John Mayer");
+CREATE TABLE t9(
+subgenreid int NOT NULL default '0',
+metaid int NOT NULL default '0',
+PRIMARY KEY (subgenreid,metaid)
+) ;
+CREATE INDEX t9_subgenreid ON t9(subgenreid);
+CREATE INDEX t9_metaid ON t9(metaid);
+INSERT INTO t9 VALUES
+(138, 4), (31, 3);
+CREATE TABLE t10(
+subgenreid int NOT NULL ,
+genreid int NOT NULL default '0',
+name varchar(80) NOT NULL default '',
+PRIMARY KEY (subgenreid)
+) ;
+CREATE INDEX t10_genreid ON t10(genreid);
+INSERT INTO t10 VALUES
+(138, 19, ''), (31, 3, '');
+CREATE TABLE t11(
+genreid int NOT NULL default '0',
+name char(80) NOT NULL default '',
+priority int NOT NULL default '0',
+masterclip char(1) default NULL,
+PRIMARY KEY (genreid)
+) ;
+CREATE INDEX t11_masterclip ON t11( masterclip);
+INSERT INTO t11 VALUES
+(19, "Pop & Dance", 95, 'Y'), (3, "Rock & Alternative", 100, 'Y');
+set join_cache_level=6;
+EXPLAIN
+SELECT t1.uniquekey, t1.xml AS affiliateXml,
+t8.name AS artistName, t8.artistid,
+t11.name AS genreName, t11.genreid, t11.priority AS genrePriority,
+t10.subgenreid, t10.name AS subgenreName,
+t2.name AS metaName, t2.metaid, t2.xml AS metaXml,
+t4.priority + t5.priority + t6.priority AS overallPriority,
+t3.path AS path, t3.mediaid,
+t4.formatid, t4.name AS formatName,
+t5.formatclassid, t5.name AS formatclassName,
+t6.formattypeid, t6.name AS formattypeName
+FROM t1, t2, t3, t4, t5, t6, t7, t8, t9, t10, t11
+WHERE t7.metaid = t2.metaid AND t7.artistid = t8.artistid AND
+t9.metaid = t2.metaid AND t9.subgenreid = t10.subgenreid AND
+t10.genreid = t11.genreid AND t3.metaid = t2.metaid AND
+t3.formatid = t4.formatid AND t4.formatclassid = t5.formatclassid AND
+t4.canaddtocapability = 'Y' AND t5.formattypeid = t6.formattypeid AND
+t6.formattypeid IN (2) AND (t3.formatid IN (31, 8, 76)) AND
+t1.metaid = t2.metaid AND t1.affiliateid = '2';
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t6 system PRIMARY NULL NULL NULL 1
+1 SIMPLE t1 ref t1_affiliateid,t1_metaid t1_affiliateid 4 const 1
+1 SIMPLE t4 ref PRIMARY,t4_formatclassid,t4_formats_idx t4_formats_idx 1 const 1 Using index condition; Using where; Using join buffer
+1 SIMPLE t5 eq_ref PRIMARY,t5_formattypeid PRIMARY 4 test.t4.formatclassid 1 Using where; Using join buffer
+1 SIMPLE t2 eq_ref PRIMARY PRIMARY 4 test.t1.metaid 1 Using join buffer
+1 SIMPLE t7 ref PRIMARY PRIMARY 4 test.t1.metaid 1 Using index
+1 SIMPLE t3 ref t3_metaid,t3_formatid,t3_metaidformatid t3_metaid 4 test.t1.metaid 2 Using where; Using join buffer
+1 SIMPLE t8 eq_ref PRIMARY PRIMARY 4 test.t7.artistid 1 Using join buffer
+1 SIMPLE t9 index PRIMARY,t9_subgenreid,t9_metaid PRIMARY 8 NULL 2 Using where; Using index; Using join buffer
+1 SIMPLE t10 eq_ref PRIMARY,t10_genreid PRIMARY 4 test.t9.subgenreid 1 Using join buffer
+1 SIMPLE t11 eq_ref PRIMARY PRIMARY 4 test.t10.genreid 1 Using join buffer
+SELECT t1.uniquekey, t1.xml AS affiliateXml,
+t8.name AS artistName, t8.artistid,
+t11.name AS genreName, t11.genreid, t11.priority AS genrePriority,
+t10.subgenreid, t10.name AS subgenreName,
+t2.name AS metaName, t2.metaid, t2.xml AS metaXml,
+t4.priority + t5.priority + t6.priority AS overallPriority,
+t3.path AS path, t3.mediaid,
+t4.formatid, t4.name AS formatName,
+t5.formatclassid, t5.name AS formatclassName,
+t6.formattypeid, t6.name AS formattypeName
+FROM t1, t2, t3, t4, t5, t6, t7, t8, t9, t10, t11
+WHERE t7.metaid = t2.metaid AND t7.artistid = t8.artistid AND
+t9.metaid = t2.metaid AND t9.subgenreid = t10.subgenreid AND
+t10.genreid = t11.genreid AND t3.metaid = t2.metaid AND
+t3.formatid = t4.formatid AND t4.formatclassid = t5.formatclassid AND
+t4.canaddtocapability = 'Y' AND t5.formattypeid = t6.formattypeid AND
+t6.formattypeid IN (2) AND (t3.formatid IN (31, 8, 76)) AND
+t1.metaid = t2.metaid AND t1.affiliateid = '2';
+uniquekey affiliateXml artistName artistid genreName genreid genrePriority subgenreid subgenreName metaName metaid metaXml overallPriority path mediaid formatid formatName formatclassid formatclassName formattypeid formattypeName
+DROP TABLE t1,t2,t3,t4,t5,t6,t7,t8,t9,t10,t11;
+CREATE TABLE t1 (a1 int, filler1 char(64) default ' ' );
+CREATE TABLE t2 (
+a2 int, b2 int, filler2 char(64) default ' ',
+PRIMARY KEY idx(a2,b2,filler2)
+) ;
+CREATE TABLE t3 (b3 int, c3 int, INDEX idx(b3));
+INSERT INTO t1(a1) VALUES
+(4), (7), (1), (9), (8), (5), (3), (6), (2);
+INSERT INTO t2(a2,b2) VALUES
+(1,30), (3,40), (2,61), (6,73), (8,92), (9,27), (4,18), (5,84), (7,56),
+(4,14), (6,76), (8,98), (7,55), (1,39), (2,68), (3,45), (9,21), (5,81),
+(5,88), (2,65), (6,74), (9,23), (1,37), (3,44), (4,17), (8,99), (7,51),
+(9,28), (7,52), (1,33), (4,13), (5,87), (3,43), (8,91), (2,62), (6,79),
+(3,49), (8,93), (7,34), (5,82), (6,78), (2,63), (1,32), (9,22), (4,11);
+INSERT INTO t3 VALUES
+(30,302), (92,923), (18,187), (45,459), (30,309),
+(39,393), (68,685), (45,458), (21,210), (81,817),
+(40,405), (61,618), (73,738), (92,929), (27,275),
+(18,188), (84,846), (56,564), (14,144), (76,763),
+(98,982), (55,551), (17,174), (99,998), (51,513),
+(28,282), (52,527), (33,336), (13,138), (87,878),
+(43,431), (91,916), (62,624), (79,797), (49,494),
+(93,933), (34,347), (82,829), (78,780), (63,634),
+(32,329), (22,228), (11,114), (74,749), (23,236);
+set join_cache_level=1;
+EXPLAIN
+SELECT a1<>a2, a1, a2, b2, b3, c3,
+SUBSTR(filler1,1,1) AS s1, SUBSTR(filler2,1,1) AS s2
+FROM t1,t2,t3 WHERE a1=a2 AND b2=b3 AND MOD(c3,10)>7;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 9
+1 SIMPLE t2 ref PRIMARY PRIMARY 4 test.t1.a1 1 Using index
+1 SIMPLE t3 ref idx idx 5 test.t2.b2 5 Using where
+SELECT a1<>a2, a1, a2, b2, b3, c3,
+SUBSTR(filler1,1,1) AS s1, SUBSTR(filler2,1,1) AS s2
+FROM t1,t2,t3 WHERE a1=a2 AND b2=b3 AND MOD(c3,10)>7;
+a1<>a2 a1 a2 b2 b3 c3 s1 s2
+0 4 4 13 13 138
+0 4 4 18 18 188
+0 1 1 30 30 309
+0 1 1 32 32 329
+0 9 9 22 22 228
+0 8 8 92 92 929
+0 8 8 99 99 998
+0 5 5 82 82 829
+0 5 5 87 87 878
+0 3 3 45 45 459
+0 3 3 45 45 458
+0 6 6 73 73 738
+0 6 6 74 74 749
+0 2 2 61 61 618
+set join_cache_level=5;
+set join_buffer_size=512;
+EXPLAIN
+SELECT a1<>a2, a1, a2, b2, b3, c3,
+SUBSTR(filler1,1,1) AS s1, SUBSTR(filler2,1,1) AS s2
+FROM t1,t2,t3 WHERE a1=a2 AND b2=b3 AND MOD(c3,10)>7;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 9
+1 SIMPLE t2 ref PRIMARY PRIMARY 4 test.t1.a1 1 Using index
+1 SIMPLE t3 ref idx idx 5 test.t2.b2 5 Using where; Using join buffer
+SELECT a1<>a2, a1, a2, b2, b3, c3,
+SUBSTR(filler1,1,1) AS s1, SUBSTR(filler2,1,1) AS s2
+FROM t1,t2,t3 WHERE a1=a2 AND b2=b3 AND MOD(c3,10)>7;
+a1<>a2 a1 a2 b2 b3 c3 s1 s2
+0 4 4 18 18 188
+0 4 4 13 13 138
+0 1 1 30 30 309
+0 1 1 32 32 329
+0 9 9 22 22 228
+0 8 8 92 92 929
+0 8 8 99 99 998
+0 5 5 82 82 829
+0 3 3 45 45 459
+0 3 3 45 45 458
+0 5 5 87 87 878
+0 2 2 61 61 618
+0 6 6 73 73 738
+0 6 6 74 74 749
+DROP TABLE t1,t2,t3;
+CREATE TABLE t1 (a int, b int, INDEX idx(b));
+CREATE TABLE t2 (a int, b int, INDEX idx(a));
+INSERT INTO t1 VALUES (5,30), (3,20), (7,40), (2,10), (8,30), (1,10), (4,20);
+INSERT INTO t2 VALUES (7,10), (1,20), (2,20), (8,20), (8,10), (1,20);
+INSERT INTO t2 VALUES (1,10), (4,20), (3,20), (7,20), (7,10), (1,20);
+set join_buffer_size=32;
+Warnings:
+Warning 1292 Truncated incorrect join_buffer_size value: '32'
+set join_cache_level=8;
+EXPLAIN SELECT * FROM t1,t2 WHERE t1.a=t2.a AND t1.b >= 30;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL idx NULL NULL NULL 7 Using where
+1 SIMPLE t2 ref idx idx 5 test.t1.a 2 Using join buffer
+SELECT * FROM t1,t2 WHERE t1.a=t2.a AND t1.b >= 30;
+a b a b
+7 40 7 10
+7 40 7 10
+7 40 7 20
+8 30 8 10
+8 30 8 20
+DROP TABLE t1,t2;
+#
+# Bug #40134: outer join with not exists optimization and join buffer
+#
+set join_cache_level=default;
+set join_buffer_size=default;
+CREATE TABLE t1 (a int NOT NULL);
+INSERT INTO t1 VALUES (2), (4), (3), (5), (1);
+CREATE TABLE t2 (a int NOT NULL, b int NOT NULL, INDEX i_a(a));
+INSERT INTO t2 VALUES (4,10), (2,10), (2,30), (2,20), (4,20);
+EXPLAIN
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a=t2.a WHERE t2.b IS NULL;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 5
+1 SIMPLE t2 ref i_a i_a 4 test.t1.a 2 Using where; Not exists
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a=t2.a WHERE t2.b IS NULL;
+a a b
+3 NULL NULL
+5 NULL NULL
+1 NULL NULL
+SET join_cache_level=6;
+EXPLAIN
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a=t2.a WHERE t2.b IS NULL;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 5
+1 SIMPLE t2 ref i_a i_a 4 test.t1.a 2 Using where; Not exists; Using join buffer
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a=t2.a WHERE t2.b IS NULL;
+a a b
+3 NULL NULL
+5 NULL NULL
+1 NULL NULL
+DROP TABLE t1, t2;
+set join_cache_level=default;
+set join_buffer_size=default;
+#
+# BUG#40136: Group by is ignored when join buffer is used for an outer join
+#
+create table t1(a int PRIMARY KEY, b int);
+insert into t1 values
+(5, 10), (2, 70), (7, 80), (6, 20), (1, 50), (9, 40), (8, 30), (3, 60);
+create table t2 (p int, a int, INDEX i_a(a));
+insert into t2 values
+(103, 7), (109, 3), (102, 3), (108, 1), (106, 3),
+(107, 7), (105, 1), (101, 3), (100, 7), (110, 1);
+set @save_join_cache_level=@@join_cache_level;
+set join_cache_level=6;
+The following must not show "using join cache":
+explain
+select t1.a, count(t2.p) as count
+from t1 left join t2 on t1.a=t2.a and t2.p % 2 = 1 group by t1.a;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 8 Using temporary; Using filesort
+1 SIMPLE t2 ref i_a i_a 5 test.t1.a 2 Using where; Using join buffer
+select t1.a, count(t2.p) as count
+from t1 left join t2 on t1.a=t2.a and t2.p % 2 = 1 group by t1.a;
+a count
+1 1
+2 0
+3 2
+5 0
+6 0
+7 2
+8 0
+9 0
+set join_cache_level=@save_join_cache_level;
+drop table t1, t2;
+#
+# BUG#40268: Nested outer join with not null-rejecting where condition
+# over an inner table which is not the last in the nest
+#
+CREATE TABLE t2 (a int, b int, c int);
+CREATE TABLE t3 (a int, b int, c int);
+CREATE TABLE t4 (a int, b int, c int);
+INSERT INTO t2 VALUES (3,3,0), (4,2,0), (5,3,0);
+INSERT INTO t3 VALUES (1,2,0), (2,2,0);
+INSERT INTO t4 VALUES (3,2,0), (4,2,0);
+set join_cache_level=6;
+SELECT t2.a,t2.b,t3.a,t3.b,t4.a,t4.b
+FROM t2 LEFT JOIN (t3, t4) ON t2.b=t4.b
+WHERE t3.a+2<t2.a OR t3.c IS NULL;
+a b a b a b
+4 2 1 2 3 2
+4 2 1 2 4 2
+3 3 NULL NULL NULL NULL
+5 3 NULL NULL NULL NULL
+set join_cache_level=default;
+DROP TABLE t2, t3, t4;
+#
+# Bug #40192: outer join with where clause when using BNL
+#
+create table t1 (a int, b int);
+insert into t1 values (2, 20), (3, 30), (1, 10);
+create table t2 (a int, c int);
+insert into t2 values (1, 101), (3, 102), (1, 100);
+set join_cache_level=6;
+select * from t1 left join t2 on t1.a=t2.a;
+a b a c
+1 10 1 101
+3 30 3 102
+1 10 1 100
+2 20 NULL NULL
+explain select * from t1 left join t2 on t1.a=t2.a where t2.c=102 or t2.c is null;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3
+1 SIMPLE t2 ALL NULL NULL NULL NULL 3 Using where; Using join buffer
+select * from t1 left join t2 on t1.a=t2.a where t2.c=102 or t2.c is null;
+a b a c
+3 30 3 102
+2 20 NULL NULL
+set join_cache_level=default;
+drop table t1, t2;
+#
+# Bug #40317: outer join with with constant on expression equal to FALSE
+#
+create table t1 (a int);
+insert into t1 values (30), (40), (20);
+create table t2 (b int);
+insert into t2 values (200), (100);
+set join_cache_level=6;
+select * from t1 left join t2 on (1=0);
+a b
+30 NULL
+40 NULL
+20 NULL
+explain select * from t1 left join t2 on (1=0) where a=40;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3 Using where
+1 SIMPLE t2 ALL NULL NULL NULL NULL 2 Using where; Using join buffer
+select * from t1 left join t2 on (1=0) where a=40;
+a b
+40 NULL
+set join_cache_level=1;
+explain select * from t1 left join t2 on (1=0);
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3
+1 SIMPLE t2 ALL NULL NULL NULL NULL 2 Using where
+set join_cache_level=default;
+drop table t1, t2;
+#
+# Bug #41204: small buffer with big rec_per_key for ref access
+#
+CREATE TABLE t1 (a int);
+INSERT INTO t1 VALUES (0);
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1 VALUES (20000), (10000);
+CREATE TABLE t2 (pk int AUTO_INCREMENT PRIMARY KEY, b int, c int, INDEX idx(b));
+INSERT INTO t2(b,c) VALUES (10000, 3), (20000, 7), (20000, 1), (10000, 9), (20000, 5);
+INSERT INTO t2(b,c) SELECT b,c FROM t2;
+INSERT INTO t2(b,c) SELECT b,c FROM t2;
+INSERT INTO t2(b,c) SELECT b,c FROM t2;
+INSERT INTO t2(b,c) SELECT b,c FROM t2;
+INSERT INTO t2(b,c) SELECT b,c FROM t2;
+INSERT INTO t2(b,c) SELECT b,c FROM t2;
+INSERT INTO t2(b,c) SELECT b,c FROM t2;
+INSERT INTO t2(b,c) SELECT b,c FROM t2;
+ANALYZE TABLE t1,t2;
+set join_cache_level=6;
+set join_buffer_size=1024;
+EXPLAIN SELECT AVG(c) FROM t1,t2 WHERE t1.a=t2.b;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 2050
+1 SIMPLE t2 ref idx idx 5 test.t1.a 640 Using join buffer
+SELECT AVG(c) FROM t1,t2 WHERE t1.a=t2.b;
+AVG(c)
+5.0000
+set join_buffer_size=default;
+set join_cache_level=default;
+DROP TABLE t1, t2;
+#
+# Bug #41894: big join buffer of level 7 used to join records
+# with null values in place of varchar strings
+#
+CREATE TABLE t1 (a int NOT NULL AUTO_INCREMENT PRIMARY KEY,
+b varchar(127) DEFAULT NULL);
+INSERT INTO t1(a) VALUES (1);
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+CREATE TABLE t2 (a int NOT NULL PRIMARY KEY, b varchar(127) DEFAULT NULL);
+INSERT INTO t2 SELECT * FROM t1;
+CREATE TABLE t3 (a int NOT NULL PRIMARY KEY, b varchar(127) DEFAULT NULL);
+INSERT INTO t3 SELECT * FROM t1;
+set join_cache_level=7;
+set join_buffer_size=1024*1024;
+EXPLAIN
+SELECT COUNT(*) FROM t1,t2,t3
+WHERE t1.a=t2.a AND t2.a=t3.a AND
+t1.b IS NULL AND t2.b IS NULL AND t3.b IS NULL;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL PRIMARY NULL NULL NULL 16384 Using where
+1 SIMPLE t2 eq_ref PRIMARY PRIMARY 4 test.t1.a 1 Using where; Using join buffer
+1 SIMPLE t3 eq_ref PRIMARY PRIMARY 4 test.t1.a 1 Using where; Using join buffer
+SELECT COUNT(*) FROM t1,t2,t3
+WHERE t1.a=t2.a AND t2.a=t3.a AND
+t1.b IS NULL AND t2.b IS NULL AND t3.b IS NULL;
+COUNT(*)
+16384
+set join_buffer_size=default;
+set join_cache_level=default;
+DROP TABLE t1,t2,t3;
+#
+# Bug #42020: join buffer is used for outer join with fields of
+# several outer tables in join buffer
+#
+CREATE TABLE t1 (
+a bigint NOT NULL,
+PRIMARY KEY (a)
+);
+INSERT INTO t1 VALUES
+(2), (1);
+CREATE TABLE t2 (
+a bigint NOT NULL,
+b bigint NOT NULL,
+PRIMARY KEY (a,b)
+);
+INSERT INTO t2 VALUES
+(2,30), (2,40), (2,50), (2,60), (2,70), (2,80),
+(1,10), (1, 20), (1,30), (1,40), (1,50);
+CREATE TABLE t3 (
+pk bigint NOT NULL AUTO_INCREMENT,
+a bigint NOT NULL,
+b bigint NOT NULL,
+val bigint DEFAULT '0',
+PRIMARY KEY (pk),
+KEY idx (a,b)
+);
+INSERT INTO t3(a,b) VALUES
+(2,30), (2,40), (2,50), (2,60), (2,70), (2,80),
+(4,30), (4,40), (4,50), (4,60), (4,70), (4,80),
+(5,30), (5,40), (5,50), (5,60), (5,70), (5,80),
+(7,30), (7,40), (7,50), (7,60), (7,70), (7,80);
+SELECT t1.a, t2.a, t3.a, t2.b, t3.b, t3.val
+FROM (t1,t2) LEFT JOIN t3 ON (t1.a=t3.a AND t2.b=t3.b)
+WHERE t1.a=t2.a;
+a a a b b val
+1 1 NULL 10 NULL NULL
+1 1 NULL 20 NULL NULL
+1 1 NULL 30 NULL NULL
+1 1 NULL 40 NULL NULL
+1 1 NULL 50 NULL NULL
+2 2 2 30 30 0
+2 2 2 40 40 0
+2 2 2 50 50 0
+2 2 2 60 60 0
+2 2 2 70 70 0
+2 2 2 80 80 0
+set join_cache_level=6;
+set join_buffer_size=256;
+EXPLAIN
+SELECT t1.a, t2.a, t3.a, t2.b, t3.b, t3.val
+FROM (t1,t2) LEFT JOIN t3 ON (t1.a=t3.a AND t2.b=t3.b)
+WHERE t1.a=t2.a;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 index PRIMARY PRIMARY 8 NULL 2 Using index
+1 SIMPLE t2 ref PRIMARY PRIMARY 8 test.t1.a 1 Using index
+1 SIMPLE t3 ref idx idx 16 test.t1.a,test.t2.b 2 Using join buffer
+SELECT t1.a, t2.a, t3.a, t2.b, t3.b, t3.val
+FROM (t1,t2) LEFT JOIN t3 ON (t1.a=t3.a AND t2.b=t3.b)
+WHERE t1.a=t2.a;
+a a a b b val
+1 1 NULL 10 NULL NULL
+1 1 NULL 20 NULL NULL
+1 1 NULL 30 NULL NULL
+1 1 NULL 40 NULL NULL
+1 1 NULL 50 NULL NULL
+2 2 2 30 30 0
+2 2 2 40 40 0
+2 2 2 50 50 0
+2 2 2 60 60 0
+2 2 2 70 70 0
+2 2 2 80 80 0
+DROP INDEX idx ON t3;
+set join_cache_level=4;
+EXPLAIN
+SELECT t1.a, t2.a, t3.a, t2.b, t3.b, t3.val
+FROM (t1,t2) LEFT JOIN t3 ON (t1.a=t3.a AND t2.b=t3.b)
+WHERE t1.a=t2.a;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 index PRIMARY PRIMARY 8 NULL 2 Using index
+1 SIMPLE t2 ref PRIMARY PRIMARY 8 test.t1.a 1 Using index
+1 SIMPLE t3 ALL NULL NULL NULL NULL 24 Using where; Using join buffer
+SELECT t1.a, t2.a, t3.a, t2.b, t3.b, t3.val
+FROM (t1,t2) LEFT JOIN t3 ON (t1.a=t3.a AND t2.b=t3.b)
+WHERE t1.a=t2.a;
+a a a b b val
+1 1 NULL 10 NULL NULL
+1 1 NULL 20 NULL NULL
+1 1 NULL 30 NULL NULL
+1 1 NULL 40 NULL NULL
+1 1 NULL 50 NULL NULL
+2 2 2 30 30 0
+2 2 2 40 40 0
+2 2 2 50 50 0
+2 2 2 60 60 0
+2 2 2 70 70 0
+2 2 2 80 80 0
+set join_buffer_size=default;
+set join_cache_level=default;
+DROP TABLE t1,t2,t3;
+create table t1(f1 int, f2 int);
+insert into t1 values (1,1),(2,2),(3,3);
+create table t2(f1 int not null, f2 int not null, f3 char(200), key(f1,f2));
+insert into t2 values (1,1, 'qwerty'),(1,2, 'qwerty'),(1,3, 'qwerty');
+insert into t2 values (2,1, 'qwerty'),(2,2, 'qwerty'),(2,3, 'qwerty'),
+(2,4, 'qwerty'),(2,5, 'qwerty');
+insert into t2 values (3,1, 'qwerty'),(3,4, 'qwerty');
+insert into t2 values (4,1, 'qwerty'),(4,2, 'qwerty'),(4,3, 'qwerty'),
+(4,4, 'qwerty');
+insert into t2 values (1,1, 'qwerty'),(1,2, 'qwerty'),(1,3, 'qwerty');
+insert into t2 values (2,1, 'qwerty'),(2,2, 'qwerty'),(2,3, 'qwerty'),
+(2,4, 'qwerty'),(2,5, 'qwerty');
+insert into t2 values (3,1, 'qwerty'),(3,4, 'qwerty');
+insert into t2 values (4,1, 'qwerty'),(4,2, 'qwerty'),(4,3, 'qwerty'),
+(4,4, 'qwerty');
+set join_cache_level=5;
+select t2.f1, t2.f2, t2.f3 from t1,t2
+where t1.f1=t2.f1 and t2.f2 between t1.f1 and t1.f2 and t2.f2 + 1 >= t1.f1 + 1;
+f1 f2 f3
+1 1 qwerty
+2 2 qwerty
+1 1 qwerty
+2 2 qwerty
+explain select t2.f1, t2.f2, t2.f3 from t1,t2
+where t1.f1=t2.f1 and t2.f2 between t1.f1 and t2.f2;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3
+1 SIMPLE t2 ref f1 f1 4 test.t1.f1 3 Using index condition(BKA); Using join buffer
+set join_cache_level=6;
+select t2.f1, t2.f2, t2.f3 from t1,t2
+where t1.f1=t2.f1 and t2.f2 between t1.f1 and t1.f2 and t2.f2 + 1 >= t1.f1 + 1;
+f1 f2 f3
+1 1 qwerty
+2 2 qwerty
+1 1 qwerty
+2 2 qwerty
+explain select t2.f1, t2.f2, t2.f3 from t1,t2
+where t1.f1=t2.f1 and t2.f2 between t1.f1 and t2.f2;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3
+1 SIMPLE t2 ref f1 f1 4 test.t1.f1 3 Using index condition(BKA); Using join buffer
+set join_cache_level=7;
+select t2.f1, t2.f2, t2.f3 from t1,t2
+where t1.f1=t2.f1 and t2.f2 between t1.f1 and t1.f2 and t2.f2 + 1 >= t1.f1 + 1;
+f1 f2 f3
+1 1 qwerty
+2 2 qwerty
+1 1 qwerty
+2 2 qwerty
+explain select t2.f1, t2.f2, t2.f3 from t1,t2
+where t1.f1=t2.f1 and t2.f2 between t1.f1 and t2.f2;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3
+1 SIMPLE t2 ref f1 f1 4 test.t1.f1 3 Using index condition(BKA); Using join buffer
+set join_cache_level=8;
+select t2.f1, t2.f2, t2.f3 from t1,t2
+where t1.f1=t2.f1 and t2.f2 between t1.f1 and t1.f2 and t2.f2 + 1 >= t1.f1 + 1;
+f1 f2 f3
+1 1 qwerty
+2 2 qwerty
+1 1 qwerty
+2 2 qwerty
+explain select t2.f1, t2.f2, t2.f3 from t1,t2
+where t1.f1=t2.f1 and t2.f2 between t1.f1 and t2.f2;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3
+1 SIMPLE t2 ref f1 f1 4 test.t1.f1 3 Using index condition(BKA); Using join buffer
+drop table t1,t2;
+set join_cache_level=default;
+#
+# Bug #42955: join with GROUP BY/ORDER BY and when BKA is enabled
+#
+create table t1 (d int, id1 int, index idx1 (d, id1));
+insert into t1 values
+(3, 20), (2, 40), (3, 10), (1, 10), (3, 20), (1, 40), (2, 30), (3, 30);
+create table t2 (id1 int, id2 int, index idx2 (id1));
+insert into t2 values
+(20, 100), (30, 400), (20, 400), (30, 200), (10, 300), (10, 200), (40, 100),
+(40, 200), (30, 300), (10, 400), (20, 200), (20, 300);
+set join_cache_level=6;
+explain
+select t1.id1, sum(t2.id2) from t1 join t2 on t1.id1=t2.id1
+where t1.d=3 group by t1.id1;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ref idx1 idx1 5 const 4 Using index; Using temporary; Using filesort
+1 SIMPLE t2 ref idx2 idx2 5 test.t1.id1 2 Using join buffer
+select t1.id1, sum(t2.id2) from t1 join t2 on t1.id1=t2.id1
+where t1.d=3 group by t1.id1;
+id1 sum(t2.id2)
+10 900
+20 2000
+30 900
+explain
+select t1.id1 from t1 join t2 on t1.id1=t2.id1
+where t1.d=3 and t2.id2 > 200 order by t1.id1;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ref idx1 idx1 5 const 4 Using index; Using temporary; Using filesort
+1 SIMPLE t2 ref idx2 idx2 5 test.t1.id1 2 Using where; Using join buffer
+select t1.id1 from t1 join t2 on t1.id1=t2.id1
+where t1.d=3 and t2.id2 > 200 order by t1.id1;
+id1
+10
+10
+20
+20
+20
+20
+30
+30
+set join_cache_level=default;
+drop table t1,t2;
+#
+# Bug #44019: star-like multi-join query executed join_cache_level=6
+#
+create table t1 (a int, b int, c int, d int);
+create table t2 (b int, e varchar(16), index idx(b));
+create table t3 (d int, f varchar(16), index idx(d));
+create table t4 (c int, g varchar(16), index idx(c));
+insert into t1 values
+(5, 50, 500, 5000), (3, 30, 300, 3000), (9, 90, 900, 9000),
+(2, 20, 200, 2000), (4, 40, 400, 4000), (8, 80, 800, 800),
+(7, 70, 700, 7000);
+insert into t2 values
+(30, 'bbb'), (10, 'b'), (70, 'bbbbbbb'), (60, 'bbbbbb'),
+(31, 'bbb'), (11, 'b'), (71, 'bbbbbbb'), (61, 'bbbbbb'),
+(32, 'bbb'), (12, 'b'), (72, 'bbbbbbb'), (62, 'bbbbbb');
+insert into t3 values
+(4000, 'dddd'), (3000, 'ddd'), (1000, 'd'), (8000, 'dddddddd'),
+(4001, 'dddd'), (3001, 'ddd'), (1001, 'd'), (8001, 'dddddddd'),
+(4002, 'dddd'), (3002, 'ddd'), (1002, 'd'), (8002, 'dddddddd');
+insert into t4 values
+(200, 'cc'), (600, 'cccccc'), (300, 'ccc'), (500, 'ccccc'),
+(201, 'cc'), (601, 'cccccc'), (301, 'ccc'), (501, 'ccccc'),
+(202, 'cc'), (602, 'cccccc'), (302, 'ccc'), (502, 'ccccc');
+analyze table t2,t3,t4;
+set join_cache_level=1;
+explain
+select t1.a, t1.b, t1.c, t1.d, t2.e, t3.f, t4.g from t1,t2,t3,t4
+where t2.b=t1.b and t3.d=t1.d and t4.c=t1.c;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 7
+1 SIMPLE t2 ref idx idx 5 test.t1.b 1
+1 SIMPLE t3 ref idx idx 5 test.t1.d 1
+1 SIMPLE t4 ref idx idx 5 test.t1.c 1
+select t1.a, t1.b, t1.c, t1.d, t2.e, t3.f, t4.g from t1,t2,t3,t4
+where t2.b=t1.b and t3.d=t1.d and t4.c=t1.c;
+a b c d e f g
+3 30 300 3000 bbb ddd ccc
+set join_cache_level=6;
+explain
+select t1.a, t1.b, t1.c, t1.d, t2.e, t3.f, t4.g from t1,t2,t3,t4
+where t2.b=t1.b and t3.d=t1.d and t4.c=t1.c;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 7
+1 SIMPLE t2 ref idx idx 5 test.t1.b 1 Using join buffer
+1 SIMPLE t3 ref idx idx 5 test.t1.d 1 Using join buffer
+1 SIMPLE t4 ref idx idx 5 test.t1.c 1 Using join buffer
+select t1.a, t1.b, t1.c, t1.d, t2.e, t3.f, t4.g from t1,t2,t3,t4
+where t2.b=t1.b and t3.d=t1.d and t4.c=t1.c;
+a b c d e f g
+3 30 300 3000 bbb ddd ccc
+set join_cache_level=default;
+drop table t1,t2,t3,t4;
+#
+# Bug #44250: Corruption of linked join buffers when using BKA
+#
+CREATE TABLE t1 (
+id1 bigint(20) DEFAULT NULL,
+id2 bigint(20) DEFAULT NULL,
+id3 bigint(20) DEFAULT NULL,
+num1 bigint(20) DEFAULT NULL,
+num2 int(11) DEFAULT NULL,
+num3 bigint(20) DEFAULT NULL
+);
+CREATE TABLE t2 (
+id3 bigint(20) NOT NULL DEFAULT '0',
+id4 bigint(20) DEFAULT NULL,
+enum1 enum('Enabled','Disabled','Paused') DEFAULT NULL,
+PRIMARY KEY (id3)
+);
+CREATE TABLE t3 (
+id4 bigint(20) NOT NULL DEFAULT '0',
+text1 text,
+PRIMARY KEY (id4)
+);
+CREATE TABLE t4 (
+id2 bigint(20) NOT NULL DEFAULT '0',
+dummy int(11) DEFAULT '0',
+PRIMARY KEY (id2)
+);
+CREATE TABLE t5 (
+id1 bigint(20) NOT NULL DEFAULT '0',
+id2 bigint(20) NOT NULL DEFAULT '0',
+enum2 enum('Active','Deleted','Paused') DEFAULT NULL,
+PRIMARY KEY (id1,id2)
+);
+set join_cache_level=8;
+set join_buffer_size=2048;
+EXPLAIN
+SELECT STRAIGHT_JOIN t1.id1, t1.num3, t3.text1, t3.id4, t2.id3, t4.dummy
+FROM t1 JOIN t2 JOIN t3 JOIN t4 JOIN t5
+WHERE t1.id1=t5.id1 AND t1.id2=t5.id2 and t4.id2=t1.id2 AND
+t5.enum2='Active' AND t3.id4=t2.id4 AND t2.id3=t1.id3 AND t3.text1<'D';
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 349
+1 SIMPLE t2 eq_ref PRIMARY PRIMARY 8 test.t1.id3 1 Using join buffer
+1 SIMPLE t3 eq_ref PRIMARY PRIMARY 8 test.t2.id4 1 Using where; Using join buffer
+1 SIMPLE t4 eq_ref PRIMARY PRIMARY 8 test.t1.id2 1 Using join buffer
+1 SIMPLE t5 eq_ref PRIMARY PRIMARY 16 test.t1.id1,test.t1.id2 1 Using where; Using join buffer
+SELECT STRAIGHT_JOIN t1.id1, t1.num3, t3.text1, t3.id4, t2.id3, t4.dummy
+FROM t1 JOIN t2 JOIN t3 JOIN t4 JOIN t5
+WHERE t1.id1=t5.id1 AND t1.id2=t5.id2 and t4.id2=t1.id2 AND
+t5.enum2='Active' AND t3.id4=t2.id4 AND t2.id3=t1.id3 AND t3.text1<'D';
+id1 num3 text1 id4 id3 dummy
+228172702 14 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 2567095402 2667134182 0
+228172702 15 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 2567095402 2667134182 0
+228172702 3 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 2567095402 2667134182 0
+228172702 134 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 2567095402 2667134182 0
+228808822 61 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 826928662 935693782 0
+228808822 13 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 826928662 935693782 0
+228808822 60 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 826928662 935693782 0
+228808822 13 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 826928662 935693782 0
+228808822 3 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 826928662 935693782 0
+228808822 4 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 826928662 935693782 0
+228808822 6 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 826928662 935693782 0
+228808822 17 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 826928662 935693782 0
+228808822 50 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 826928662 935693782 0
+228808822 18 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 826928662 935693782 0
+228808822 1 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 826928662 935693782 0
+228808822 3 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 826928662 935693782 0
+228808822 4 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 826928662 935693782 0
+228808822 89 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 2381969632 2482416112 0
+228808822 19 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 2381969632 2482416112 0
+228808822 84 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 2381969632 2482416112 0
+228808822 14 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 2381969632 2482416112 0
+228808822 9 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 2381969632 2482416112 0
+228808822 1 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 2381969632 2482416112 0
+228808822 10 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 2381969632 2482416112 0
+228808822 26 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 2381969632 2482416112 0
+228808822 4 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 2381969632 2482416112 0
+228808822 3 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 2381969632 2482416112 0
+228808822 1 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 2381969632 2482416112 0
+228808822 3 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 2381969632 2482416112 0
+228808822 28 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 2381969632 2482416112 0
+228808822 62 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 2381969632 2482416112 0
+set join_buffer_size=default;
+set join_cache_level=default;
+DROP TABLE t1,t2,t3,t4,t5;
+#
+# Bug#45267: Incomplete check caused wrong result.
+#
+CREATE TABLE t1 (
+`pk` int(11) NOT NULL AUTO_INCREMENT PRIMARY KEY
+);
+CREATE TABLE t3 (
+`pk` int(11) NOT NULL AUTO_INCREMENT PRIMARY KEY
+);
+INSERT INTO t3 VALUES
+(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),
+(16),(17),(18),(19),(20);
+CREATE TABLE t2 (
+`pk` int(11) NOT NULL AUTO_INCREMENT,
+`int_nokey` int(11) NOT NULL,
+`time_key` time NOT NULL,
+PRIMARY KEY (`pk`),
+KEY `time_key` (`time_key`)
+);
+INSERT INTO t2 VALUES (10,9,'22:36:46'),(11,0,'08:46:46');
+SELECT DISTINCT t1.`pk`
+FROM t1 RIGHT JOIN t2 STRAIGHT_JOIN t3 ON t2.`int_nokey` ON t2.`time_key`
+GROUP BY 1;
+pk
+NULL
+DROP TABLE IF EXISTS t1, t2, t3;
+#
+# Bug #46328: Use of aggregate function without GROUP BY clause
+# returns many rows (vs. one )
+#
+CREATE TABLE t1 (
+int_key int(11) NOT NULL,
+KEY int_key (int_key)
+);
+INSERT INTO t1 VALUES
+(0),(2),(2),(2),(3),(4),(5),(5),(6),(6),(8),(8),(9),(9);
+CREATE TABLE t2 (
+int_key int(11) NOT NULL,
+KEY int_key (int_key)
+);
+INSERT INTO t2 VALUES (2),(3);
+
+# The query shall return 1 record with a max value 9 and one of the
+# int_key values inserted above (undefined which one). A changed
+# execution plan may change the value in the second column
+SELECT MAX(t1.int_key), t1.int_key
+FROM t1 STRAIGHT_JOIN t2
+ORDER BY t1.int_key;
+MAX(t1.int_key) int_key
+9 0
+
+explain
+SELECT MAX(t1.int_key), t1.int_key
+FROM t1 STRAIGHT_JOIN t2
+ORDER BY t1.int_key;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 index NULL int_key 4 NULL 14 Using index
+1 SIMPLE t2 index NULL int_key 4 NULL 2 Using index; Using join buffer
+
+DROP TABLE t1,t2;
+SET join_cache_level=default;
+#
+# Regression test for
+# Bug#46733 - NULL value not returned for aggregate on empty result
+# set w/ semijoin on
+#
+CREATE TABLE t1 (
+i int(11) NOT NULL,
+v varchar(1) DEFAULT NULL,
+PRIMARY KEY (i)
+);
+INSERT INTO t1 VALUES (10,'a'),(11,'b'),(12,'c'),(13,'d');
+CREATE TABLE t2 (
+i int(11) NOT NULL,
+v varchar(1) DEFAULT NULL,
+PRIMARY KEY (i)
+);
+INSERT INTO t2 VALUES (1,'x'),(2,'y');
+
+SELECT MAX(t1.i)
+FROM t1 JOIN t2 ON t2.v
+ORDER BY t2.v;
+MAX(t1.i)
+NULL
+
+EXPLAIN
+SELECT MAX(t1.i)
+FROM t1 JOIN t2 ON t2.v
+ORDER BY t2.v;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ALL NULL NULL NULL NULL 2 Using where
+1 SIMPLE t1 index NULL PRIMARY 4 NULL 4 Using index; Using join buffer
+
+DROP TABLE t1,t2;
+#
+# Bug #45092: join buffer contains two blob columns one of which is
+# used in the key employed to access the joined table
+#
+CREATE TABLE t1 (c1 int, c2 int, key (c2));
+INSERT INTO t1 VALUES (1,1);
+INSERT INTO t1 VALUES (2,2);
+CREATE TABLE t2 (c1 text, c2 text);
+INSERT INTO t2 VALUES('tt', 'uu');
+INSERT INTO t2 VALUES('zzzz', 'xxxxxxxxx');
+ANALYZE TABLE t1,t2;
+set join_cache_level=6;
+SELECT t1.*, t2.*, LENGTH(t2.c1), LENGTH(t2.c2) FROM t1,t2
+WHERE t1.c2=LENGTH(t2.c2) and t1.c1=LENGTH(t2.c1);
+c1 c2 c1 c2 LENGTH(t2.c1) LENGTH(t2.c2)
+2 2 tt uu 2 2
+set join_cache_level=default;
+DROP TABLE t1,t2;
=== added file 'mysql-test/r/join_nested_jcl6.result'
--- a/mysql-test/r/join_nested_jcl6.result 1970-01-01 00:00:00 +0000
+++ b/mysql-test/r/join_nested_jcl6.result 2009-12-21 02:26:15 +0000
@@ -0,0 +1,1856 @@
+set join_cache_level=6;
+show variables like 'join_cache_level';
+Variable_name Value
+join_cache_level 6
+DROP TABLE IF EXISTS t0,t1,t2,t3,t4,t5,t6,t7,t8,t9;
+CREATE TABLE t0 (a int, b int, c int);
+CREATE TABLE t1 (a int, b int, c int);
+CREATE TABLE t2 (a int, b int, c int);
+CREATE TABLE t3 (a int, b int, c int);
+CREATE TABLE t4 (a int, b int, c int);
+CREATE TABLE t5 (a int, b int, c int);
+CREATE TABLE t6 (a int, b int, c int);
+CREATE TABLE t7 (a int, b int, c int);
+CREATE TABLE t8 (a int, b int, c int);
+CREATE TABLE t9 (a int, b int, c int);
+INSERT INTO t0 VALUES (1,1,0), (1,2,0), (2,2,0);
+INSERT INTO t1 VALUES (1,3,0), (2,2,0), (3,2,0);
+INSERT INTO t2 VALUES (3,3,0), (4,2,0), (5,3,0);
+INSERT INTO t3 VALUES (1,2,0), (2,2,0);
+INSERT INTO t4 VALUES (3,2,0), (4,2,0);
+INSERT INTO t5 VALUES (3,1,0), (2,2,0), (3,3,0);
+INSERT INTO t6 VALUES (3,2,0), (6,2,0), (6,1,0);
+INSERT INTO t7 VALUES (1,1,0), (2,2,0);
+INSERT INTO t8 VALUES (0,2,0), (1,2,0);
+INSERT INTO t9 VALUES (1,1,0), (1,2,0), (3,3,0);
+SELECT t2.a,t2.b
+FROM t2;
+a b
+3 3
+4 2
+5 3
+SELECT t3.a,t3.b
+FROM t3;
+a b
+1 2
+2 2
+SELECT t4.a,t4.b
+FROM t4;
+a b
+3 2
+4 2
+SELECT t3.a,t3.b,t4.a,t4.b
+FROM t3,t4;
+a b a b
+1 2 3 2
+2 2 3 2
+1 2 4 2
+2 2 4 2
+SELECT t2.a,t2.b,t3.a,t3.b,t4.a,t4.b
+FROM t2
+LEFT JOIN
+(t3, t4)
+ON t2.b=t4.b;
+a b a b a b
+4 2 1 2 3 2
+4 2 2 2 3 2
+4 2 1 2 4 2
+4 2 2 2 4 2
+3 3 NULL NULL NULL NULL
+5 3 NULL NULL NULL NULL
+SELECT t2.a,t2.b,t3.a,t3.b,t4.a,t4.b
+FROM t2
+LEFT JOIN
+(t3, t4)
+ON t3.a=1 AND t2.b=t4.b;
+a b a b a b
+4 2 1 2 3 2
+4 2 1 2 4 2
+3 3 NULL NULL NULL NULL
+5 3 NULL NULL NULL NULL
+EXPLAIN EXTENDED
+SELECT t2.a,t2.b,t3.a,t3.b,t4.a,t4.b
+FROM t2
+LEFT JOIN
+(t3, t4)
+ON t2.b=t4.b
+WHERE t3.a=1 OR t3.c IS NULL;
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 SIMPLE t2 ALL NULL NULL NULL NULL 3 100.00
+1 SIMPLE t3 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t4 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+Warnings:
+Note 1003 select `test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS `b`,`test`.`t3`.`a` AS `a`,`test`.`t3`.`b` AS `b`,`test`.`t4`.`a` AS `a`,`test`.`t4`.`b` AS `b` from `test`.`t2` left join (`test`.`t3` join `test`.`t4`) on((`test`.`t4`.`b` = `test`.`t2`.`b`)) where ((`test`.`t3`.`a` = 1) or isnull(`test`.`t3`.`c`))
+SELECT t2.a,t2.b,t3.a,t3.b,t4.a,t4.b
+FROM t2
+LEFT JOIN
+(t3, t4)
+ON t2.b=t4.b
+WHERE t3.a=1 OR t3.c IS NULL;
+a b a b a b
+4 2 1 2 3 2
+4 2 1 2 4 2
+3 3 NULL NULL NULL NULL
+5 3 NULL NULL NULL NULL
+SELECT t2.a,t2.b,t3.a,t3.b,t4.a,t4.b
+FROM t2
+LEFT JOIN
+(t3, t4)
+ON t2.b=t4.b
+WHERE t3.a>1 OR t3.c IS NULL;
+a b a b a b
+4 2 2 2 3 2
+4 2 2 2 4 2
+3 3 NULL NULL NULL NULL
+5 3 NULL NULL NULL NULL
+SELECT t5.a,t5.b
+FROM t5;
+a b
+3 1
+2 2
+3 3
+SELECT t3.a,t3.b,t4.a,t4.b,t5.a,t5.b
+FROM t3,t4,t5;
+a b a b a b
+1 2 3 2 3 1
+2 2 3 2 3 1
+1 2 4 2 3 1
+2 2 4 2 3 1
+1 2 3 2 2 2
+2 2 3 2 2 2
+1 2 4 2 2 2
+2 2 4 2 2 2
+1 2 3 2 3 3
+2 2 3 2 3 3
+1 2 4 2 3 3
+2 2 4 2 3 3
+SELECT t2.a,t2.b,t3.a,t3.b,t4.a,t4.b,t5.a,t5.b
+FROM t2
+LEFT JOIN
+(t3, t4, t5)
+ON t2.b=t4.b;
+a b a b a b a b
+4 2 1 2 3 2 3 1
+4 2 2 2 3 2 3 1
+4 2 1 2 4 2 3 1
+4 2 2 2 4 2 3 1
+4 2 1 2 3 2 2 2
+4 2 2 2 3 2 2 2
+4 2 1 2 4 2 2 2
+4 2 2 2 4 2 2 2
+4 2 1 2 3 2 3 3
+4 2 2 2 3 2 3 3
+4 2 1 2 4 2 3 3
+4 2 2 2 4 2 3 3
+3 3 NULL NULL NULL NULL NULL NULL
+5 3 NULL NULL NULL NULL NULL NULL
+EXPLAIN EXTENDED
+SELECT t2.a,t2.b,t3.a,t3.b,t4.a,t4.b,t5.a,t5.b
+FROM t2
+LEFT JOIN
+(t3, t4, t5)
+ON t2.b=t4.b
+WHERE t3.a>1 OR t3.c IS NULL;
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 SIMPLE t2 ALL NULL NULL NULL NULL 3 100.00
+1 SIMPLE t3 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t4 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t5 ALL NULL NULL NULL NULL 3 100.00 Using join buffer
+Warnings:
+Note 1003 select `test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS `b`,`test`.`t3`.`a` AS `a`,`test`.`t3`.`b` AS `b`,`test`.`t4`.`a` AS `a`,`test`.`t4`.`b` AS `b`,`test`.`t5`.`a` AS `a`,`test`.`t5`.`b` AS `b` from `test`.`t2` left join (`test`.`t3` join `test`.`t4` join `test`.`t5`) on((`test`.`t4`.`b` = `test`.`t2`.`b`)) where ((`test`.`t3`.`a` > 1) or isnull(`test`.`t3`.`c`))
+SELECT t2.a,t2.b,t3.a,t3.b,t4.a,t4.b,t5.a,t5.b
+FROM t2
+LEFT JOIN
+(t3, t4, t5)
+ON t2.b=t4.b
+WHERE t3.a>1 OR t3.c IS NULL;
+a b a b a b a b
+4 2 2 2 3 2 3 1
+4 2 2 2 4 2 3 1
+4 2 2 2 3 2 2 2
+4 2 2 2 4 2 2 2
+4 2 2 2 3 2 3 3
+4 2 2 2 4 2 3 3
+3 3 NULL NULL NULL NULL NULL NULL
+5 3 NULL NULL NULL NULL NULL NULL
+EXPLAIN EXTENDED
+SELECT t2.a,t2.b,t3.a,t3.b,t4.a,t4.b,t5.a,t5.b
+FROM t2
+LEFT JOIN
+(t3, t4, t5)
+ON t2.b=t4.b
+WHERE (t3.a>1 OR t3.c IS NULL) AND
+(t5.a<3 OR t5.c IS NULL);
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 SIMPLE t2 ALL NULL NULL NULL NULL 3 100.00
+1 SIMPLE t3 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t4 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t5 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+Warnings:
+Note 1003 select `test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS `b`,`test`.`t3`.`a` AS `a`,`test`.`t3`.`b` AS `b`,`test`.`t4`.`a` AS `a`,`test`.`t4`.`b` AS `b`,`test`.`t5`.`a` AS `a`,`test`.`t5`.`b` AS `b` from `test`.`t2` left join (`test`.`t3` join `test`.`t4` join `test`.`t5`) on((`test`.`t4`.`b` = `test`.`t2`.`b`)) where (((`test`.`t3`.`a` > 1) or isnull(`test`.`t3`.`c`)) and ((`test`.`t5`.`a` < 3) or isnull(`test`.`t5`.`c`)))
+SELECT t2.a,t2.b,t3.a,t3.b,t4.a,t4.b,t5.a,t5.b
+FROM t2
+LEFT JOIN
+(t3, t4, t5)
+ON t2.b=t4.b
+WHERE (t3.a>1 OR t3.c IS NULL) AND
+(t5.a<3 OR t5.c IS NULL);
+a b a b a b a b
+4 2 2 2 3 2 2 2
+4 2 2 2 4 2 2 2
+3 3 NULL NULL NULL NULL NULL NULL
+5 3 NULL NULL NULL NULL NULL NULL
+SELECT t6.a,t6.b
+FROM t6;
+a b
+3 2
+6 2
+6 1
+SELECT t7.a,t7.b
+FROM t7;
+a b
+1 1
+2 2
+SELECT t6.a,t6.b,t7.a,t7.b
+FROM t6,t7;
+a b a b
+3 2 1 1
+3 2 2 2
+6 2 1 1
+6 2 2 2
+6 1 1 1
+6 1 2 2
+SELECT t8.a,t8.b
+FROM t8;
+a b
+0 2
+1 2
+EXPLAIN EXTENDED
+SELECT t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+FROM (t6, t7)
+LEFT JOIN
+t8
+ON t7.b=t8.b AND t6.b < 10;
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 SIMPLE t7 ALL NULL NULL NULL NULL 2 100.00
+1 SIMPLE t6 ALL NULL NULL NULL NULL 3 100.00 Using join buffer
+1 SIMPLE t8 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+Warnings:
+Note 1003 select `test`.`t6`.`a` AS `a`,`test`.`t6`.`b` AS `b`,`test`.`t7`.`a` AS `a`,`test`.`t7`.`b` AS `b`,`test`.`t8`.`a` AS `a`,`test`.`t8`.`b` AS `b` from `test`.`t6` join `test`.`t7` left join `test`.`t8` on(((`test`.`t7`.`b` = `test`.`t8`.`b`) and (`test`.`t6`.`b` < 10))) where 1
+SELECT t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+FROM (t6, t7)
+LEFT JOIN
+t8
+ON t7.b=t8.b AND t6.b < 10;
+a b a b a b
+3 2 2 2 0 2
+6 2 2 2 0 2
+6 1 2 2 0 2
+3 2 2 2 1 2
+6 2 2 2 1 2
+6 1 2 2 1 2
+3 2 1 1 NULL NULL
+6 2 1 1 NULL NULL
+6 1 1 1 NULL NULL
+SELECT t5.a,t5.b
+FROM t5;
+a b
+3 1
+2 2
+3 3
+SELECT t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+FROM t5
+LEFT JOIN
+(
+(t6, t7)
+LEFT JOIN
+t8
+ON t7.b=t8.b AND t6.b < 10
+)
+ON t6.b >= 2 AND t5.b=t7.b;
+a b a b a b a b
+2 2 3 2 2 2 0 2
+2 2 6 2 2 2 0 2
+2 2 3 2 2 2 1 2
+2 2 6 2 2 2 1 2
+3 1 3 2 1 1 NULL NULL
+3 1 6 2 1 1 NULL NULL
+3 3 NULL NULL NULL NULL NULL NULL
+SELECT t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+FROM t5
+LEFT JOIN
+(
+(t6, t7)
+LEFT JOIN
+t8
+ON t7.b=t8.b AND t6.b < 10
+)
+ON t6.b >= 2 AND t5.b=t7.b AND
+(t8.a < 1 OR t8.c IS NULL);
+a b a b a b a b
+2 2 3 2 2 2 0 2
+2 2 6 2 2 2 0 2
+3 1 3 2 1 1 NULL NULL
+3 1 6 2 1 1 NULL NULL
+3 3 NULL NULL NULL NULL NULL NULL
+SELECT t2.a,t2.b,t3.a,t3.b,t4.a,t4.b
+FROM t2
+LEFT JOIN
+(t3, t4)
+ON t3.a=1 AND t2.b=t4.b;
+a b a b a b
+4 2 1 2 3 2
+4 2 1 2 4 2
+3 3 NULL NULL NULL NULL
+5 3 NULL NULL NULL NULL
+SELECT t2.a,t2.b,t3.a,t3.b,t4.a,t4.b,
+t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+FROM t2
+LEFT JOIN
+(t3, t4)
+ON t3.a=1 AND t2.b=t4.b,
+t5
+LEFT JOIN
+(
+(t6, t7)
+LEFT JOIN
+t8
+ON t7.b=t8.b AND t6.b < 10
+)
+ON t6.b >= 2 AND t5.b=t7.b;
+a b a b a b a b a b a b a b
+4 2 1 2 3 2 2 2 3 2 2 2 0 2
+4 2 1 2 4 2 2 2 3 2 2 2 0 2
+4 2 1 2 3 2 2 2 6 2 2 2 0 2
+4 2 1 2 4 2 2 2 6 2 2 2 0 2
+4 2 1 2 3 2 2 2 3 2 2 2 1 2
+4 2 1 2 4 2 2 2 3 2 2 2 1 2
+4 2 1 2 3 2 2 2 6 2 2 2 1 2
+4 2 1 2 4 2 2 2 6 2 2 2 1 2
+4 2 1 2 3 2 3 1 3 2 1 1 NULL NULL
+4 2 1 2 4 2 3 1 3 2 1 1 NULL NULL
+4 2 1 2 3 2 3 1 6 2 1 1 NULL NULL
+4 2 1 2 4 2 3 1 6 2 1 1 NULL NULL
+4 2 1 2 3 2 3 3 NULL NULL NULL NULL NULL NULL
+4 2 1 2 4 2 3 3 NULL NULL NULL NULL NULL NULL
+3 3 NULL NULL NULL NULL 2 2 3 2 2 2 0 2
+5 3 NULL NULL NULL NULL 2 2 3 2 2 2 0 2
+3 3 NULL NULL NULL NULL 2 2 6 2 2 2 0 2
+5 3 NULL NULL NULL NULL 2 2 6 2 2 2 0 2
+3 3 NULL NULL NULL NULL 2 2 3 2 2 2 1 2
+5 3 NULL NULL NULL NULL 2 2 3 2 2 2 1 2
+3 3 NULL NULL NULL NULL 2 2 6 2 2 2 1 2
+5 3 NULL NULL NULL NULL 2 2 6 2 2 2 1 2
+3 3 NULL NULL NULL NULL 3 1 3 2 1 1 NULL NULL
+5 3 NULL NULL NULL NULL 3 1 3 2 1 1 NULL NULL
+3 3 NULL NULL NULL NULL 3 1 6 2 1 1 NULL NULL
+5 3 NULL NULL NULL NULL 3 1 6 2 1 1 NULL NULL
+3 3 NULL NULL NULL NULL 3 3 NULL NULL NULL NULL NULL NULL
+5 3 NULL NULL NULL NULL 3 3 NULL NULL NULL NULL NULL NULL
+SELECT t2.a,t2.b,t3.a,t3.b,t4.a,t4.b,
+t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+FROM t2
+LEFT JOIN
+(t3, t4)
+ON t3.a=1 AND t2.b=t4.b,
+t5
+LEFT JOIN
+(
+(t6, t7)
+LEFT JOIN
+t8
+ON t7.b=t8.b AND t6.b < 10
+)
+ON t6.b >= 2 AND t5.b=t7.b
+WHERE t2.a > 3 AND
+(t6.a < 6 OR t6.c IS NULL);
+a b a b a b a b a b a b a b
+4 2 1 2 3 2 2 2 3 2 2 2 0 2
+4 2 1 2 4 2 2 2 3 2 2 2 0 2
+4 2 1 2 3 2 2 2 3 2 2 2 1 2
+4 2 1 2 4 2 2 2 3 2 2 2 1 2
+4 2 1 2 3 2 3 1 3 2 1 1 NULL NULL
+4 2 1 2 4 2 3 1 3 2 1 1 NULL NULL
+4 2 1 2 3 2 3 3 NULL NULL NULL NULL NULL NULL
+4 2 1 2 4 2 3 3 NULL NULL NULL NULL NULL NULL
+5 3 NULL NULL NULL NULL 2 2 3 2 2 2 0 2
+5 3 NULL NULL NULL NULL 2 2 3 2 2 2 1 2
+5 3 NULL NULL NULL NULL 3 1 3 2 1 1 NULL NULL
+5 3 NULL NULL NULL NULL 3 3 NULL NULL NULL NULL NULL NULL
+SELECT t1.a,t1.b
+FROM t1;
+a b
+1 3
+2 2
+3 2
+SELECT t1.a,t1.b,t2.a,t2.b,t3.a,t3.b,t4.a,t4.b,
+t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+FROM t1
+LEFT JOIN
+(
+t2
+LEFT JOIN
+(t3, t4)
+ON t3.a=1 AND t2.b=t4.b,
+t5
+LEFT JOIN
+(
+(t6, t7)
+LEFT JOIN
+t8
+ON t7.b=t8.b AND t6.b < 10
+)
+ON t6.b >= 2 AND t5.b=t7.b
+)
+ON (t3.b=2 OR t3.c IS NULL) AND (t6.b=2 OR t6.c IS NULL) AND
+(t1.b=t5.b OR t3.c IS NULL OR t6.c IS NULL or t8.c IS NULL) AND
+(t1.a != 2);
+a b a b a b a b a b a b a b a b
+3 2 4 2 1 2 3 2 2 2 3 2 2 2 0 2
+3 2 4 2 1 2 4 2 2 2 3 2 2 2 0 2
+3 2 4 2 1 2 3 2 2 2 6 2 2 2 0 2
+3 2 4 2 1 2 4 2 2 2 6 2 2 2 0 2
+3 2 4 2 1 2 3 2 2 2 3 2 2 2 1 2
+3 2 4 2 1 2 4 2 2 2 3 2 2 2 1 2
+3 2 4 2 1 2 3 2 2 2 6 2 2 2 1 2
+3 2 4 2 1 2 4 2 2 2 6 2 2 2 1 2
+1 3 4 2 1 2 3 2 3 1 3 2 1 1 NULL NULL
+3 2 4 2 1 2 3 2 3 1 3 2 1 1 NULL NULL
+1 3 4 2 1 2 4 2 3 1 3 2 1 1 NULL NULL
+3 2 4 2 1 2 4 2 3 1 3 2 1 1 NULL NULL
+1 3 4 2 1 2 3 2 3 1 6 2 1 1 NULL NULL
+3 2 4 2 1 2 3 2 3 1 6 2 1 1 NULL NULL
+1 3 4 2 1 2 4 2 3 1 6 2 1 1 NULL NULL
+3 2 4 2 1 2 4 2 3 1 6 2 1 1 NULL NULL
+1 3 4 2 1 2 3 2 3 3 NULL NULL NULL NULL NULL NULL
+3 2 4 2 1 2 3 2 3 3 NULL NULL NULL NULL NULL NULL
+1 3 4 2 1 2 4 2 3 3 NULL NULL NULL NULL NULL NULL
+3 2 4 2 1 2 4 2 3 3 NULL NULL NULL NULL NULL NULL
+1 3 3 3 NULL NULL NULL NULL 2 2 3 2 2 2 0 2
+3 2 3 3 NULL NULL NULL NULL 2 2 3 2 2 2 0 2
+1 3 5 3 NULL NULL NULL NULL 2 2 3 2 2 2 0 2
+3 2 5 3 NULL NULL NULL NULL 2 2 3 2 2 2 0 2
+1 3 3 3 NULL NULL NULL NULL 2 2 6 2 2 2 0 2
+3 2 3 3 NULL NULL NULL NULL 2 2 6 2 2 2 0 2
+1 3 5 3 NULL NULL NULL NULL 2 2 6 2 2 2 0 2
+3 2 5 3 NULL NULL NULL NULL 2 2 6 2 2 2 0 2
+1 3 3 3 NULL NULL NULL NULL 2 2 3 2 2 2 1 2
+3 2 3 3 NULL NULL NULL NULL 2 2 3 2 2 2 1 2
+1 3 5 3 NULL NULL NULL NULL 2 2 3 2 2 2 1 2
+3 2 5 3 NULL NULL NULL NULL 2 2 3 2 2 2 1 2
+1 3 3 3 NULL NULL NULL NULL 2 2 6 2 2 2 1 2
+3 2 3 3 NULL NULL NULL NULL 2 2 6 2 2 2 1 2
+1 3 5 3 NULL NULL NULL NULL 2 2 6 2 2 2 1 2
+3 2 5 3 NULL NULL NULL NULL 2 2 6 2 2 2 1 2
+1 3 3 3 NULL NULL NULL NULL 3 1 3 2 1 1 NULL NULL
+3 2 3 3 NULL NULL NULL NULL 3 1 3 2 1 1 NULL NULL
+1 3 5 3 NULL NULL NULL NULL 3 1 3 2 1 1 NULL NULL
+3 2 5 3 NULL NULL NULL NULL 3 1 3 2 1 1 NULL NULL
+1 3 3 3 NULL NULL NULL NULL 3 1 6 2 1 1 NULL NULL
+3 2 3 3 NULL NULL NULL NULL 3 1 6 2 1 1 NULL NULL
+1 3 5 3 NULL NULL NULL NULL 3 1 6 2 1 1 NULL NULL
+3 2 5 3 NULL NULL NULL NULL 3 1 6 2 1 1 NULL NULL
+1 3 3 3 NULL NULL NULL NULL 3 3 NULL NULL NULL NULL NULL NULL
+3 2 3 3 NULL NULL NULL NULL 3 3 NULL NULL NULL NULL NULL NULL
+1 3 5 3 NULL NULL NULL NULL 3 3 NULL NULL NULL NULL NULL NULL
+3 2 5 3 NULL NULL NULL NULL 3 3 NULL NULL NULL NULL NULL NULL
+2 2 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
+SELECT t1.a,t1.b,t2.a,t2.b,t3.a,t3.b,t4.a,t4.b,
+t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+FROM t1
+LEFT JOIN
+(
+t2
+LEFT JOIN
+(t3, t4)
+ON t3.a=1 AND t2.b=t4.b,
+t5
+LEFT JOIN
+(
+(t6, t7)
+LEFT JOIN
+t8
+ON t7.b=t8.b AND t6.b < 10
+)
+ON t6.b >= 2 AND t5.b=t7.b
+)
+ON (t3.b=2 OR t3.c IS NULL) AND (t6.b=2 OR t6.c IS NULL) AND
+(t1.b=t5.b OR t3.c IS NULL OR t6.c IS NULL or t8.c IS NULL) AND
+(t1.a != 2)
+WHERE (t2.a >= 4 OR t2.c IS NULL);
+a b a b a b a b a b a b a b a b
+3 2 4 2 1 2 3 2 2 2 3 2 2 2 0 2
+3 2 4 2 1 2 4 2 2 2 3 2 2 2 0 2
+3 2 4 2 1 2 3 2 2 2 6 2 2 2 0 2
+3 2 4 2 1 2 4 2 2 2 6 2 2 2 0 2
+3 2 4 2 1 2 3 2 2 2 3 2 2 2 1 2
+3 2 4 2 1 2 4 2 2 2 3 2 2 2 1 2
+3 2 4 2 1 2 3 2 2 2 6 2 2 2 1 2
+3 2 4 2 1 2 4 2 2 2 6 2 2 2 1 2
+1 3 4 2 1 2 3 2 3 1 3 2 1 1 NULL NULL
+3 2 4 2 1 2 3 2 3 1 3 2 1 1 NULL NULL
+1 3 4 2 1 2 4 2 3 1 3 2 1 1 NULL NULL
+3 2 4 2 1 2 4 2 3 1 3 2 1 1 NULL NULL
+1 3 4 2 1 2 3 2 3 1 6 2 1 1 NULL NULL
+3 2 4 2 1 2 3 2 3 1 6 2 1 1 NULL NULL
+1 3 4 2 1 2 4 2 3 1 6 2 1 1 NULL NULL
+3 2 4 2 1 2 4 2 3 1 6 2 1 1 NULL NULL
+1 3 4 2 1 2 3 2 3 3 NULL NULL NULL NULL NULL NULL
+3 2 4 2 1 2 3 2 3 3 NULL NULL NULL NULL NULL NULL
+1 3 4 2 1 2 4 2 3 3 NULL NULL NULL NULL NULL NULL
+3 2 4 2 1 2 4 2 3 3 NULL NULL NULL NULL NULL NULL
+1 3 5 3 NULL NULL NULL NULL 2 2 3 2 2 2 0 2
+3 2 5 3 NULL NULL NULL NULL 2 2 3 2 2 2 0 2
+1 3 5 3 NULL NULL NULL NULL 2 2 6 2 2 2 0 2
+3 2 5 3 NULL NULL NULL NULL 2 2 6 2 2 2 0 2
+1 3 5 3 NULL NULL NULL NULL 2 2 3 2 2 2 1 2
+3 2 5 3 NULL NULL NULL NULL 2 2 3 2 2 2 1 2
+1 3 5 3 NULL NULL NULL NULL 2 2 6 2 2 2 1 2
+3 2 5 3 NULL NULL NULL NULL 2 2 6 2 2 2 1 2
+1 3 5 3 NULL NULL NULL NULL 3 1 3 2 1 1 NULL NULL
+3 2 5 3 NULL NULL NULL NULL 3 1 3 2 1 1 NULL NULL
+1 3 5 3 NULL NULL NULL NULL 3 1 6 2 1 1 NULL NULL
+3 2 5 3 NULL NULL NULL NULL 3 1 6 2 1 1 NULL NULL
+1 3 5 3 NULL NULL NULL NULL 3 3 NULL NULL NULL NULL NULL NULL
+3 2 5 3 NULL NULL NULL NULL 3 3 NULL NULL NULL NULL NULL NULL
+2 2 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
+SELECT t0.a,t0.b
+FROM t0;
+a b
+1 1
+1 2
+2 2
+EXPLAIN EXTENDED
+SELECT t0.a,t0.b,t1.a,t1.b,t2.a,t2.b,t3.a,t3.b,t4.a,t4.b,
+t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+FROM t0,t1
+LEFT JOIN
+(
+t2
+LEFT JOIN
+(t3, t4)
+ON t3.a=1 AND t2.b=t4.b,
+t5
+LEFT JOIN
+(
+(t6, t7)
+LEFT JOIN
+t8
+ON t7.b=t8.b AND t6.b < 10
+)
+ON t6.b >= 2 AND t5.b=t7.b
+)
+ON (t3.b=2 OR t3.c IS NULL) AND (t6.b=2 OR t6.c IS NULL) AND
+(t1.b=t5.b OR t3.c IS NULL OR t6.c IS NULL or t8.c IS NULL) AND
+(t1.a != 2)
+WHERE t0.a=1 AND
+t0.b=t1.b AND
+(t2.a >= 4 OR t2.c IS NULL);
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 SIMPLE t0 ALL NULL NULL NULL NULL 3 100.00 Using where
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t2 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t3 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t4 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t5 ALL NULL NULL NULL NULL 3 100.00 Using join buffer
+1 SIMPLE t7 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t6 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t8 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+Warnings:
+Note 1003 select `test`.`t0`.`a` AS `a`,`test`.`t0`.`b` AS `b`,`test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS `b`,`test`.`t3`.`a` AS `a`,`test`.`t3`.`b` AS `b`,`test`.`t4`.`a` AS `a`,`test`.`t4`.`b` AS `b`,`test`.`t5`.`a` AS `a`,`test`.`t5`.`b` AS `b`,`test`.`t6`.`a` AS `a`,`test`.`t6`.`b` AS `b`,`test`.`t7`.`a` AS `a`,`test`.`t7`.`b` AS `b`,`test`.`t8`.`a` AS `a`,`test`.`t8`.`b` AS `b` from `test`.`t0` join `test`.`t1` left join (`test`.`t2` left join (`test`.`t3` join `test`.`t4`) on(((`test`.`t4`.`b` = `test`.`t2`.`b`) and (`test`.`t3`.`a` = 1))) join `test`.`t5` left join (`test`.`t6` join `test`.`t7` left join `test`.`t8` on(((`test`.`t8`.`b` = `test`.`t5`.`b`) and (`test`.`t6`.`b` < 10)))) on(((`test`.`t7`.`b` = `test`.`t5`.`b`) and (`test`.`t6`.`b` >= 2)))) on((((`test`.`t3`.`b` = 2) or isnull(`test`.`t3`.`c`)) and ((`test`.`t6`.`b` = 2) or isnull(`test`.`t6`.`c`)) and ((`test`.`t5`.`b` = `test`.`t0`.`b`) or isnull(`test`.`t3`.`c`) or isnull(`test`.`t6`.`c`) or isnull(`test`.`t8`.`c`)) and (`test`.`t1`.`a` <> 2))) where ((`test`.`t1`.`b` = `test`.`t0`.`b`) and (`test`.`t0`.`a` = 1) and ((`test`.`t2`.`a` >= 4) or isnull(`test`.`t2`.`c`)))
+SELECT t0.a,t0.b,t1.a,t1.b,t2.a,t2.b,t3.a,t3.b,t4.a,t4.b,
+t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+FROM t0,t1
+LEFT JOIN
+(
+t2
+LEFT JOIN
+(t3, t4)
+ON t3.a=1 AND t2.b=t4.b,
+t5
+LEFT JOIN
+(
+(t6, t7)
+LEFT JOIN
+t8
+ON t7.b=t8.b AND t6.b < 10
+)
+ON t6.b >= 2 AND t5.b=t7.b
+)
+ON (t3.b=2 OR t3.c IS NULL) AND (t6.b=2 OR t6.c IS NULL) AND
+(t1.b=t5.b OR t3.c IS NULL OR t6.c IS NULL or t8.c IS NULL) AND
+(t1.a != 2)
+WHERE t0.a=1 AND
+t0.b=t1.b AND
+(t2.a >= 4 OR t2.c IS NULL);
+a b a b a b a b a b a b a b a b a b
+1 2 3 2 4 2 1 2 3 2 2 2 3 2 2 2 0 2
+1 2 3 2 4 2 1 2 4 2 2 2 3 2 2 2 0 2
+1 2 3 2 4 2 1 2 3 2 2 2 6 2 2 2 0 2
+1 2 3 2 4 2 1 2 4 2 2 2 6 2 2 2 0 2
+1 2 3 2 4 2 1 2 3 2 2 2 3 2 2 2 1 2
+1 2 3 2 4 2 1 2 4 2 2 2 3 2 2 2 1 2
+1 2 3 2 4 2 1 2 3 2 2 2 6 2 2 2 1 2
+1 2 3 2 4 2 1 2 4 2 2 2 6 2 2 2 1 2
+1 2 3 2 4 2 1 2 3 2 3 1 3 2 1 1 NULL NULL
+1 2 3 2 4 2 1 2 4 2 3 1 3 2 1 1 NULL NULL
+1 2 3 2 4 2 1 2 3 2 3 1 6 2 1 1 NULL NULL
+1 2 3 2 4 2 1 2 4 2 3 1 6 2 1 1 NULL NULL
+1 2 3 2 4 2 1 2 3 2 3 3 NULL NULL NULL NULL NULL NULL
+1 2 3 2 4 2 1 2 4 2 3 3 NULL NULL NULL NULL NULL NULL
+1 2 3 2 5 3 NULL NULL NULL NULL 2 2 3 2 2 2 0 2
+1 2 3 2 5 3 NULL NULL NULL NULL 2 2 6 2 2 2 0 2
+1 2 3 2 5 3 NULL NULL NULL NULL 2 2 3 2 2 2 1 2
+1 2 3 2 5 3 NULL NULL NULL NULL 2 2 6 2 2 2 1 2
+1 2 3 2 5 3 NULL NULL NULL NULL 3 1 3 2 1 1 NULL NULL
+1 2 3 2 5 3 NULL NULL NULL NULL 3 1 6 2 1 1 NULL NULL
+1 2 3 2 5 3 NULL NULL NULL NULL 3 3 NULL NULL NULL NULL NULL NULL
+1 2 2 2 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
+EXPLAIN EXTENDED
+SELECT t0.a,t0.b,t1.a,t1.b,t2.a,t2.b,t3.a,t3.b,t4.a,t4.b,
+t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b,t9.a,t9.b
+FROM t0,t1
+LEFT JOIN
+(
+t2
+LEFT JOIN
+(t3, t4)
+ON t3.a=1 AND t2.b=t4.b,
+t5
+LEFT JOIN
+(
+(t6, t7)
+LEFT JOIN
+t8
+ON t7.b=t8.b AND t6.b < 10
+)
+ON t6.b >= 2 AND t5.b=t7.b
+)
+ON (t3.b=2 OR t3.c IS NULL) AND (t6.b=2 OR t6.c IS NULL) AND
+(t1.b=t5.b OR t3.c IS NULL OR t6.c IS NULL or t8.c IS NULL) AND
+(t1.a != 2),
+t9
+WHERE t0.a=1 AND
+t0.b=t1.b AND
+(t2.a >= 4 OR t2.c IS NULL) AND
+(t3.a < 5 OR t3.c IS NULL) AND
+(t3.b=t4.b OR t3.c IS NULL OR t4.c IS NULL) AND
+(t5.a >=2 OR t5.c IS NULL) AND
+(t6.a >=4 OR t6.c IS NULL) AND
+(t7.a <= 2 OR t7.c IS NULL) AND
+(t8.a < 1 OR t8.c IS NULL) AND
+(t8.b=t9.b OR t8.c IS NULL) AND
+(t9.a=1);
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 SIMPLE t0 ALL NULL NULL NULL NULL 3 100.00 Using where
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t2 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t3 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t4 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t5 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t7 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t6 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t8 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t9 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+Warnings:
+Note 1003 select `test`.`t0`.`a` AS `a`,`test`.`t0`.`b` AS `b`,`test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS `b`,`test`.`t3`.`a` AS `a`,`test`.`t3`.`b` AS `b`,`test`.`t4`.`a` AS `a`,`test`.`t4`.`b` AS `b`,`test`.`t5`.`a` AS `a`,`test`.`t5`.`b` AS `b`,`test`.`t6`.`a` AS `a`,`test`.`t6`.`b` AS `b`,`test`.`t7`.`a` AS `a`,`test`.`t7`.`b` AS `b`,`test`.`t8`.`a` AS `a`,`test`.`t8`.`b` AS `b`,`test`.`t9`.`a` AS `a`,`test`.`t9`.`b` AS `b` from `test`.`t0` join `test`.`t1` left join (`test`.`t2` left join (`test`.`t3` join `test`.`t4`) on(((`test`.`t4`.`b` = `test`.`t2`.`b`) and (`test`.`t3`.`a` = 1))) join `test`.`t5` left join (`test`.`t6` join `test`.`t7` left join `test`.`t8` on(((`test`.`t8`.`b` = `test`.`t5`.`b`) and (`test`.`t6`.`b` < 10)))) on(((`test`.`t7`.`b` = `test`.`t5`.`b`) and (`test`.`t6`.`b` >= 2)))) on((((`test`.`t3`.`b` = 2) or isnull(`test`.`t3`.`c`)) and ((`test`.`t6`.`b` = 2) or isnull(`test`.`t6`.`c`)) and ((`test`.`t5`.`b` = `test`.`t0`.`b`) or isnull(`test`.`t3`.`c`) or isnull(`test`.`t6`.`c`) or isnull(`test`.`t8`.`c`)) and (`test`.`t1`.`a` <> 2))) join `test`.`t9` where ((`test`.`t9`.`a` = 1) and (`test`.`t1`.`b` = `test`.`t0`.`b`) and (`test`.`t0`.`a` = 1) and ((`test`.`t2`.`a` >= 4) or isnull(`test`.`t2`.`c`)) and ((`test`.`t3`.`a` < 5) or isnull(`test`.`t3`.`c`)) and ((`test`.`t4`.`b` = `test`.`t3`.`b`) or isnull(`test`.`t3`.`c`) or isnull(`test`.`t4`.`c`)) and ((`test`.`t5`.`a` >= 2) or isnull(`test`.`t5`.`c`)) and ((`test`.`t6`.`a` >= 4) or isnull(`test`.`t6`.`c`)) and ((`test`.`t7`.`a` <= 2) or isnull(`test`.`t7`.`c`)) and ((`test`.`t8`.`a` < 1) or isnull(`test`.`t8`.`c`)) and ((`test`.`t9`.`b` = `test`.`t8`.`b`) or isnull(`test`.`t8`.`c`)))
+SELECT t9.a,t9.b
+FROM t9;
+a b
+1 1
+1 2
+3 3
+SELECT t0.a,t0.b,t1.a,t1.b,t2.a,t2.b,t3.a,t3.b,t4.a,t4.b,
+t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b,t9.a,t9.b
+FROM t0,t1
+LEFT JOIN
+(
+t2
+LEFT JOIN
+(t3, t4)
+ON t3.a=1 AND t2.b=t4.b,
+t5
+LEFT JOIN
+(
+(t6, t7)
+LEFT JOIN
+t8
+ON t7.b=t8.b AND t6.b < 10
+)
+ON t6.b >= 2 AND t5.b=t7.b
+)
+ON (t3.b=2 OR t3.c IS NULL) AND (t6.b=2 OR t6.c IS NULL) AND
+(t1.b=t5.b OR t3.c IS NULL OR t6.c IS NULL or t8.c IS NULL) AND
+(t1.a != 2),
+t9
+WHERE t0.a=1 AND
+t0.b=t1.b AND
+(t2.a >= 4 OR t2.c IS NULL) AND
+(t3.a < 5 OR t3.c IS NULL) AND
+(t3.b=t4.b OR t3.c IS NULL OR t4.c IS NULL) AND
+(t5.a >=2 OR t5.c IS NULL) AND
+(t6.a >=4 OR t6.c IS NULL) AND
+(t7.a <= 2 OR t7.c IS NULL) AND
+(t8.a < 1 OR t8.c IS NULL) AND
+(t8.b=t9.b OR t8.c IS NULL) AND
+(t9.a=1);
+a b a b a b a b a b a b a b a b a b a b
+1 2 3 2 4 2 1 2 3 2 2 2 6 2 2 2 0 2 1 2
+1 2 3 2 4 2 1 2 4 2 2 2 6 2 2 2 0 2 1 2
+1 2 3 2 4 2 1 2 3 2 3 1 6 2 1 1 NULL NULL 1 1
+1 2 3 2 4 2 1 2 4 2 3 1 6 2 1 1 NULL NULL 1 1
+1 2 3 2 4 2 1 2 3 2 3 1 6 2 1 1 NULL NULL 1 2
+1 2 3 2 4 2 1 2 4 2 3 1 6 2 1 1 NULL NULL 1 2
+1 2 3 2 4 2 1 2 3 2 3 3 NULL NULL NULL NULL NULL NULL 1 1
+1 2 3 2 4 2 1 2 4 2 3 3 NULL NULL NULL NULL NULL NULL 1 1
+1 2 3 2 4 2 1 2 3 2 3 3 NULL NULL NULL NULL NULL NULL 1 2
+1 2 3 2 4 2 1 2 4 2 3 3 NULL NULL NULL NULL NULL NULL 1 2
+1 2 3 2 5 3 NULL NULL NULL NULL 2 2 6 2 2 2 0 2 1 2
+1 2 3 2 5 3 NULL NULL NULL NULL 3 1 6 2 1 1 NULL NULL 1 1
+1 2 3 2 5 3 NULL NULL NULL NULL 3 1 6 2 1 1 NULL NULL 1 2
+1 2 3 2 5 3 NULL NULL NULL NULL 3 3 NULL NULL NULL NULL NULL NULL 1 1
+1 2 3 2 5 3 NULL NULL NULL NULL 3 3 NULL NULL NULL NULL NULL NULL 1 2
+1 2 2 2 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL 1 1
+1 2 2 2 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL 1 2
+SELECT t1.a,t1.b
+FROM t1;
+a b
+1 3
+2 2
+3 2
+SELECT t2.a,t2.b
+FROM t2;
+a b
+3 3
+4 2
+5 3
+SELECT t3.a,t3.b
+FROM t3;
+a b
+1 2
+2 2
+SELECT t2.a,t2.b,t3.a,t3.b
+FROM t2
+LEFT JOIN
+t3
+ON t2.b=t3.b;
+a b a b
+4 2 1 2
+4 2 2 2
+3 3 NULL NULL
+5 3 NULL NULL
+SELECT t1.a,t1.b,t2.a,t2.b,t3.a,t3.b
+FROM t1, t2
+LEFT JOIN
+t3
+ON t2.b=t3.b
+WHERE t1.a <= 2;
+a b a b a b
+1 3 4 2 1 2
+2 2 4 2 1 2
+1 3 4 2 2 2
+2 2 4 2 2 2
+1 3 3 3 NULL NULL
+2 2 3 3 NULL NULL
+1 3 5 3 NULL NULL
+2 2 5 3 NULL NULL
+SELECT t1.a,t1.b,t2.a,t2.b,t3.a,t3.b
+FROM t1, t3
+RIGHT JOIN
+t2
+ON t2.b=t3.b
+WHERE t1.a <= 2;
+a b a b a b
+1 3 4 2 1 2
+2 2 4 2 1 2
+1 3 4 2 2 2
+2 2 4 2 2 2
+1 3 3 3 NULL NULL
+2 2 3 3 NULL NULL
+1 3 5 3 NULL NULL
+2 2 5 3 NULL NULL
+SELECT t3.a,t3.b,t4.a,t4.b
+FROM t3,t4;
+a b a b
+1 2 3 2
+2 2 3 2
+1 2 4 2
+2 2 4 2
+SELECT t2.a,t2.b,t3.a,t3.b,t4.a,t4.b
+FROM t2
+LEFT JOIN
+(t3, t4)
+ON t3.a=1 AND t2.b=t4.b;
+a b a b a b
+4 2 1 2 3 2
+4 2 1 2 4 2
+3 3 NULL NULL NULL NULL
+5 3 NULL NULL NULL NULL
+SELECT t1.a,t1.b,t2.a,t2.b,t3.a,t3.b,t4.a,t4.b
+FROM t1, t2
+LEFT JOIN
+(t3, t4)
+ON t3.a=1 AND t2.b=t4.b
+WHERE t1.a <= 2;
+a b a b a b a b
+1 3 4 2 1 2 3 2
+2 2 4 2 1 2 3 2
+1 3 4 2 1 2 4 2
+2 2 4 2 1 2 4 2
+1 3 3 3 NULL NULL NULL NULL
+2 2 3 3 NULL NULL NULL NULL
+1 3 5 3 NULL NULL NULL NULL
+2 2 5 3 NULL NULL NULL NULL
+SELECT t1.a,t1.b,t2.a,t2.b,t3.a,t3.b,t4.a,t4.b
+FROM t1, (t3, t4)
+RIGHT JOIN
+t2
+ON t3.a=1 AND t2.b=t4.b
+WHERE t1.a <= 2;
+a b a b a b a b
+1 3 4 2 1 2 3 2
+2 2 4 2 1 2 3 2
+1 3 4 2 1 2 4 2
+2 2 4 2 1 2 4 2
+1 3 3 3 NULL NULL NULL NULL
+2 2 3 3 NULL NULL NULL NULL
+1 3 5 3 NULL NULL NULL NULL
+2 2 5 3 NULL NULL NULL NULL
+SELECT t1.a,t1.b,t2.a,t2.b,t3.a,t3.b,t4.a,t4.b
+FROM t1, (t3, t4)
+RIGHT JOIN
+t2
+ON t3.a=1 AND t2.b=t4.b
+WHERE t1.a <= 2;
+a b a b a b a b
+1 3 4 2 1 2 3 2
+2 2 4 2 1 2 3 2
+1 3 4 2 1 2 4 2
+2 2 4 2 1 2 4 2
+1 3 3 3 NULL NULL NULL NULL
+2 2 3 3 NULL NULL NULL NULL
+1 3 5 3 NULL NULL NULL NULL
+2 2 5 3 NULL NULL NULL NULL
+EXPLAIN EXTENDED
+SELECT t1.a,t1.b,t2.a,t2.b,t3.a,t3.b,t4.a,t4.b
+FROM t1, (t3, t4)
+RIGHT JOIN
+t2
+ON t3.a=1 AND t2.b=t4.b
+WHERE t1.a <= 2;
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3 100.00 Using where
+1 SIMPLE t2 ALL NULL NULL NULL NULL 3 100.00 Using join buffer
+1 SIMPLE t3 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t4 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+Warnings:
+Note 1003 select `test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS `b`,`test`.`t3`.`a` AS `a`,`test`.`t3`.`b` AS `b`,`test`.`t4`.`a` AS `a`,`test`.`t4`.`b` AS `b` from `test`.`t1` join `test`.`t2` left join (`test`.`t3` join `test`.`t4`) on(((`test`.`t4`.`b` = `test`.`t2`.`b`) and (`test`.`t3`.`a` = 1))) where (`test`.`t1`.`a` <= 2)
+CREATE INDEX idx_b ON t2(b);
+EXPLAIN EXTENDED
+SELECT t2.a,t2.b,t3.a,t3.b,t4.a,t4.b
+FROM (t3,t4)
+LEFT JOIN
+(t1,t2)
+ON t3.a=1 AND t3.b=t2.b AND t2.b=t4.b;
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 SIMPLE t3 ALL NULL NULL NULL NULL 2 100.00
+1 SIMPLE t4 ALL NULL NULL NULL NULL 2 100.00 Using join buffer
+1 SIMPLE t2 ref idx_b idx_b 5 test.t3.b 2 100.00 Using where; Using join buffer
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3 100.00 Using join buffer
+Warnings:
+Note 1003 select `test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS `b`,`test`.`t3`.`a` AS `a`,`test`.`t3`.`b` AS `b`,`test`.`t4`.`a` AS `a`,`test`.`t4`.`b` AS `b` from `test`.`t3` join `test`.`t4` left join (`test`.`t1` join `test`.`t2`) on(((`test`.`t3`.`a` = 1) and (`test`.`t3`.`b` = `test`.`t2`.`b`) and (`test`.`t2`.`b` = `test`.`t4`.`b`))) where 1
+SELECT t2.a,t2.b,t3.a,t3.b,t4.a,t4.b
+FROM (t3,t4)
+LEFT JOIN
+(t1,t2)
+ON t3.a=1 AND t3.b=t2.b AND t2.b=t4.b;
+a b a b a b
+4 2 1 2 3 2
+4 2 1 2 4 2
+4 2 1 2 3 2
+4 2 1 2 4 2
+4 2 1 2 3 2
+4 2 1 2 4 2
+NULL NULL 2 2 3 2
+NULL NULL 2 2 4 2
+EXPLAIN EXTENDED
+SELECT t0.a,t0.b,t1.a,t1.b,t2.a,t2.b,t3.a,t3.b,t4.a,t4.b,
+t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b,t9.a,t9.b
+FROM t0,t1
+LEFT JOIN
+(
+t2
+LEFT JOIN
+(t3, t4)
+ON t3.a=1 AND t2.b=t4.b,
+t5
+LEFT JOIN
+(
+(t6, t7)
+LEFT JOIN
+t8
+ON t7.b=t8.b AND t6.b < 10
+)
+ON t6.b >= 2 AND t5.b=t7.b
+)
+ON (t3.b=2 OR t3.c IS NULL) AND (t6.b=2 OR t6.c IS NULL) AND
+(t1.b=t5.b OR t3.c IS NULL OR t6.c IS NULL or t8.c IS NULL) AND
+(t1.a != 2),
+t9
+WHERE t0.a=1 AND
+t0.b=t1.b AND
+(t2.a >= 4 OR t2.c IS NULL) AND
+(t3.a < 5 OR t3.c IS NULL) AND
+(t3.b=t4.b OR t3.c IS NULL OR t4.c IS NULL) AND
+(t5.a >=2 OR t5.c IS NULL) AND
+(t6.a >=4 OR t6.c IS NULL) AND
+(t7.a <= 2 OR t7.c IS NULL) AND
+(t8.a < 1 OR t8.c IS NULL) AND
+(t8.b=t9.b OR t8.c IS NULL) AND
+(t9.a=1);
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 SIMPLE t0 ALL NULL NULL NULL NULL 3 100.00 Using where
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t2 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t3 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t4 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t5 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t7 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t6 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t8 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t9 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+Warnings:
+Note 1003 select `test`.`t0`.`a` AS `a`,`test`.`t0`.`b` AS `b`,`test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS `b`,`test`.`t3`.`a` AS `a`,`test`.`t3`.`b` AS `b`,`test`.`t4`.`a` AS `a`,`test`.`t4`.`b` AS `b`,`test`.`t5`.`a` AS `a`,`test`.`t5`.`b` AS `b`,`test`.`t6`.`a` AS `a`,`test`.`t6`.`b` AS `b`,`test`.`t7`.`a` AS `a`,`test`.`t7`.`b` AS `b`,`test`.`t8`.`a` AS `a`,`test`.`t8`.`b` AS `b`,`test`.`t9`.`a` AS `a`,`test`.`t9`.`b` AS `b` from `test`.`t0` join `test`.`t1` left join (`test`.`t2` left join (`test`.`t3` join `test`.`t4`) on(((`test`.`t4`.`b` = `test`.`t2`.`b`) and (`test`.`t3`.`a` = 1))) join `test`.`t5` left join (`test`.`t6` join `test`.`t7` left join `test`.`t8` on(((`test`.`t8`.`b` = `test`.`t5`.`b`) and (`test`.`t6`.`b` < 10)))) on(((`test`.`t7`.`b` = `test`.`t5`.`b`) and (`test`.`t6`.`b` >= 2)))) on((((`test`.`t3`.`b` = 2) or isnull(`test`.`t3`.`c`)) and ((`test`.`t6`.`b` = 2) or isnull(`test`.`t6`.`c`)) and ((`test`.`t5`.`b` = `test`.`t0`.`b`) or isnull(`test`.`t3`.`c`) or isnull(`test`.`t6`.`c`) or isnull(`test`.`t8`.`c`)) and (`test`.`t1`.`a` <> 2))) join `test`.`t9` where ((`test`.`t9`.`a` = 1) and (`test`.`t1`.`b` = `test`.`t0`.`b`) and (`test`.`t0`.`a` = 1) and ((`test`.`t2`.`a` >= 4) or isnull(`test`.`t2`.`c`)) and ((`test`.`t3`.`a` < 5) or isnull(`test`.`t3`.`c`)) and ((`test`.`t4`.`b` = `test`.`t3`.`b`) or isnull(`test`.`t3`.`c`) or isnull(`test`.`t4`.`c`)) and ((`test`.`t5`.`a` >= 2) or isnull(`test`.`t5`.`c`)) and ((`test`.`t6`.`a` >= 4) or isnull(`test`.`t6`.`c`)) and ((`test`.`t7`.`a` <= 2) or isnull(`test`.`t7`.`c`)) and ((`test`.`t8`.`a` < 1) or isnull(`test`.`t8`.`c`)) and ((`test`.`t9`.`b` = `test`.`t8`.`b`) or isnull(`test`.`t8`.`c`)))
+CREATE INDEX idx_b ON t4(b);
+CREATE INDEX idx_b ON t5(b);
+EXPLAIN EXTENDED
+SELECT t0.a,t0.b,t1.a,t1.b,t2.a,t2.b,t3.a,t3.b,t4.a,t4.b,
+t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b,t9.a,t9.b
+FROM t0,t1
+LEFT JOIN
+(
+t2
+LEFT JOIN
+(t3, t4)
+ON t3.a=1 AND t2.b=t4.b,
+t5
+LEFT JOIN
+(
+(t6, t7)
+LEFT JOIN
+t8
+ON t7.b=t8.b AND t6.b < 10
+)
+ON t6.b >= 2 AND t5.b=t7.b
+)
+ON (t3.b=2 OR t3.c IS NULL) AND (t6.b=2 OR t6.c IS NULL) AND
+(t1.b=t5.b OR t3.c IS NULL OR t6.c IS NULL or t8.c IS NULL) AND
+(t1.a != 2),
+t9
+WHERE t0.a=1 AND
+t0.b=t1.b AND
+(t2.a >= 4 OR t2.c IS NULL) AND
+(t3.a < 5 OR t3.c IS NULL) AND
+(t3.b=t4.b OR t3.c IS NULL OR t4.c IS NULL) AND
+(t5.a >=2 OR t5.c IS NULL) AND
+(t6.a >=4 OR t6.c IS NULL) AND
+(t7.a <= 2 OR t7.c IS NULL) AND
+(t8.a < 1 OR t8.c IS NULL) AND
+(t8.b=t9.b OR t8.c IS NULL) AND
+(t9.a=1);
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 SIMPLE t0 ALL NULL NULL NULL NULL 3 100.00 Using where
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t2 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t3 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t4 ref idx_b idx_b 5 test.t2.b 2 100.00 Using where; Using join buffer
+1 SIMPLE t5 ALL idx_b NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t7 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t6 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t8 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t9 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+Warnings:
+Note 1003 select `test`.`t0`.`a` AS `a`,`test`.`t0`.`b` AS `b`,`test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS `b`,`test`.`t3`.`a` AS `a`,`test`.`t3`.`b` AS `b`,`test`.`t4`.`a` AS `a`,`test`.`t4`.`b` AS `b`,`test`.`t5`.`a` AS `a`,`test`.`t5`.`b` AS `b`,`test`.`t6`.`a` AS `a`,`test`.`t6`.`b` AS `b`,`test`.`t7`.`a` AS `a`,`test`.`t7`.`b` AS `b`,`test`.`t8`.`a` AS `a`,`test`.`t8`.`b` AS `b`,`test`.`t9`.`a` AS `a`,`test`.`t9`.`b` AS `b` from `test`.`t0` join `test`.`t1` left join (`test`.`t2` left join (`test`.`t3` join `test`.`t4`) on(((`test`.`t4`.`b` = `test`.`t2`.`b`) and (`test`.`t3`.`a` = 1))) join `test`.`t5` left join (`test`.`t6` join `test`.`t7` left join `test`.`t8` on(((`test`.`t8`.`b` = `test`.`t5`.`b`) and (`test`.`t6`.`b` < 10)))) on(((`test`.`t7`.`b` = `test`.`t5`.`b`) and (`test`.`t6`.`b` >= 2)))) on((((`test`.`t3`.`b` = 2) or isnull(`test`.`t3`.`c`)) and ((`test`.`t6`.`b` = 2) or isnull(`test`.`t6`.`c`)) and ((`test`.`t5`.`b` = `test`.`t0`.`b`) or isnull(`test`.`t3`.`c`) or isnull(`test`.`t6`.`c`) or isnull(`test`.`t8`.`c`)) and (`test`.`t1`.`a` <> 2))) join `test`.`t9` where ((`test`.`t9`.`a` = 1) and (`test`.`t1`.`b` = `test`.`t0`.`b`) and (`test`.`t0`.`a` = 1) and ((`test`.`t2`.`a` >= 4) or isnull(`test`.`t2`.`c`)) and ((`test`.`t3`.`a` < 5) or isnull(`test`.`t3`.`c`)) and ((`test`.`t4`.`b` = `test`.`t3`.`b`) or isnull(`test`.`t3`.`c`) or isnull(`test`.`t4`.`c`)) and ((`test`.`t5`.`a` >= 2) or isnull(`test`.`t5`.`c`)) and ((`test`.`t6`.`a` >= 4) or isnull(`test`.`t6`.`c`)) and ((`test`.`t7`.`a` <= 2) or isnull(`test`.`t7`.`c`)) and ((`test`.`t8`.`a` < 1) or isnull(`test`.`t8`.`c`)) and ((`test`.`t9`.`b` = `test`.`t8`.`b`) or isnull(`test`.`t8`.`c`)))
+CREATE INDEX idx_b ON t8(b);
+EXPLAIN EXTENDED
+SELECT t0.a,t0.b,t1.a,t1.b,t2.a,t2.b,t3.a,t3.b,t4.a,t4.b,
+t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b,t9.a,t9.b
+FROM t0,t1
+LEFT JOIN
+(
+t2
+LEFT JOIN
+(t3, t4)
+ON t3.a=1 AND t2.b=t4.b,
+t5
+LEFT JOIN
+(
+(t6, t7)
+LEFT JOIN
+t8
+ON t7.b=t8.b AND t6.b < 10
+)
+ON t6.b >= 2 AND t5.b=t7.b
+)
+ON (t3.b=2 OR t3.c IS NULL) AND (t6.b=2 OR t6.c IS NULL) AND
+(t1.b=t5.b OR t3.c IS NULL OR t6.c IS NULL or t8.c IS NULL) AND
+(t1.a != 2),
+t9
+WHERE t0.a=1 AND
+t0.b=t1.b AND
+(t2.a >= 4 OR t2.c IS NULL) AND
+(t3.a < 5 OR t3.c IS NULL) AND
+(t3.b=t4.b OR t3.c IS NULL OR t4.c IS NULL) AND
+(t5.a >=2 OR t5.c IS NULL) AND
+(t6.a >=4 OR t6.c IS NULL) AND
+(t7.a <= 2 OR t7.c IS NULL) AND
+(t8.a < 1 OR t8.c IS NULL) AND
+(t8.b=t9.b OR t8.c IS NULL) AND
+(t9.a=1);
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 SIMPLE t0 ALL NULL NULL NULL NULL 3 100.00 Using where
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t2 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t3 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t4 ref idx_b idx_b 5 test.t2.b 2 100.00 Using where; Using join buffer
+1 SIMPLE t5 ALL idx_b NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t7 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t6 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t8 ref idx_b idx_b 5 test.t5.b 2 100.00 Using where; Using join buffer
+1 SIMPLE t9 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+Warnings:
+Note 1003 select `test`.`t0`.`a` AS `a`,`test`.`t0`.`b` AS `b`,`test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS `b`,`test`.`t3`.`a` AS `a`,`test`.`t3`.`b` AS `b`,`test`.`t4`.`a` AS `a`,`test`.`t4`.`b` AS `b`,`test`.`t5`.`a` AS `a`,`test`.`t5`.`b` AS `b`,`test`.`t6`.`a` AS `a`,`test`.`t6`.`b` AS `b`,`test`.`t7`.`a` AS `a`,`test`.`t7`.`b` AS `b`,`test`.`t8`.`a` AS `a`,`test`.`t8`.`b` AS `b`,`test`.`t9`.`a` AS `a`,`test`.`t9`.`b` AS `b` from `test`.`t0` join `test`.`t1` left join (`test`.`t2` left join (`test`.`t3` join `test`.`t4`) on(((`test`.`t4`.`b` = `test`.`t2`.`b`) and (`test`.`t3`.`a` = 1))) join `test`.`t5` left join (`test`.`t6` join `test`.`t7` left join `test`.`t8` on(((`test`.`t8`.`b` = `test`.`t5`.`b`) and (`test`.`t6`.`b` < 10)))) on(((`test`.`t7`.`b` = `test`.`t5`.`b`) and (`test`.`t6`.`b` >= 2)))) on((((`test`.`t3`.`b` = 2) or isnull(`test`.`t3`.`c`)) and ((`test`.`t6`.`b` = 2) or isnull(`test`.`t6`.`c`)) and ((`test`.`t5`.`b` = `test`.`t0`.`b`) or isnull(`test`.`t3`.`c`) or isnull(`test`.`t6`.`c`) or isnull(`test`.`t8`.`c`)) and (`test`.`t1`.`a` <> 2))) join `test`.`t9` where ((`test`.`t9`.`a` = 1) and (`test`.`t1`.`b` = `test`.`t0`.`b`) and (`test`.`t0`.`a` = 1) and ((`test`.`t2`.`a` >= 4) or isnull(`test`.`t2`.`c`)) and ((`test`.`t3`.`a` < 5) or isnull(`test`.`t3`.`c`)) and ((`test`.`t4`.`b` = `test`.`t3`.`b`) or isnull(`test`.`t3`.`c`) or isnull(`test`.`t4`.`c`)) and ((`test`.`t5`.`a` >= 2) or isnull(`test`.`t5`.`c`)) and ((`test`.`t6`.`a` >= 4) or isnull(`test`.`t6`.`c`)) and ((`test`.`t7`.`a` <= 2) or isnull(`test`.`t7`.`c`)) and ((`test`.`t8`.`a` < 1) or isnull(`test`.`t8`.`c`)) and ((`test`.`t9`.`b` = `test`.`t8`.`b`) or isnull(`test`.`t8`.`c`)))
+CREATE INDEX idx_b ON t1(b);
+CREATE INDEX idx_a ON t0(a);
+EXPLAIN EXTENDED
+SELECT t0.a,t0.b,t1.a,t1.b,t2.a,t2.b,t3.a,t3.b,t4.a,t4.b,
+t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b,t9.a,t9.b
+FROM t0,t1
+LEFT JOIN
+(
+t2
+LEFT JOIN
+(t3, t4)
+ON t3.a=1 AND t2.b=t4.b,
+t5
+LEFT JOIN
+(
+(t6, t7)
+LEFT JOIN
+t8
+ON t7.b=t8.b AND t6.b < 10
+)
+ON t6.b >= 2 AND t5.b=t7.b
+)
+ON (t3.b=2 OR t3.c IS NULL) AND (t6.b=2 OR t6.c IS NULL) AND
+(t1.b=t5.b OR t3.c IS NULL OR t6.c IS NULL or t8.c IS NULL) AND
+(t1.a != 2),
+t9
+WHERE t0.a=1 AND
+t0.b=t1.b AND
+(t2.a >= 4 OR t2.c IS NULL) AND
+(t3.a < 5 OR t3.c IS NULL) AND
+(t3.b=t4.b OR t3.c IS NULL OR t4.c IS NULL) AND
+(t5.a >=2 OR t5.c IS NULL) AND
+(t6.a >=4 OR t6.c IS NULL) AND
+(t7.a <= 2 OR t7.c IS NULL) AND
+(t8.a < 1 OR t8.c IS NULL) AND
+(t8.b=t9.b OR t8.c IS NULL) AND
+(t9.a=1);
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 SIMPLE t0 ref idx_a idx_a 5 const 1 100.00
+1 SIMPLE t1 ref idx_b idx_b 5 test.t0.b 2 100.00 Using join buffer
+1 SIMPLE t2 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t3 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t4 ref idx_b idx_b 5 test.t2.b 2 100.00 Using where; Using join buffer
+1 SIMPLE t5 ALL idx_b NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t7 ALL NULL NULL NULL NULL 2 100.00 Using where; Using join buffer
+1 SIMPLE t6 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+1 SIMPLE t8 ref idx_b idx_b 5 test.t5.b 2 100.00 Using where; Using join buffer
+1 SIMPLE t9 ALL NULL NULL NULL NULL 3 100.00 Using where; Using join buffer
+Warnings:
+Note 1003 select `test`.`t0`.`a` AS `a`,`test`.`t0`.`b` AS `b`,`test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS `b`,`test`.`t3`.`a` AS `a`,`test`.`t3`.`b` AS `b`,`test`.`t4`.`a` AS `a`,`test`.`t4`.`b` AS `b`,`test`.`t5`.`a` AS `a`,`test`.`t5`.`b` AS `b`,`test`.`t6`.`a` AS `a`,`test`.`t6`.`b` AS `b`,`test`.`t7`.`a` AS `a`,`test`.`t7`.`b` AS `b`,`test`.`t8`.`a` AS `a`,`test`.`t8`.`b` AS `b`,`test`.`t9`.`a` AS `a`,`test`.`t9`.`b` AS `b` from `test`.`t0` join `test`.`t1` left join (`test`.`t2` left join (`test`.`t3` join `test`.`t4`) on(((`test`.`t4`.`b` = `test`.`t2`.`b`) and (`test`.`t3`.`a` = 1))) join `test`.`t5` left join (`test`.`t6` join `test`.`t7` left join `test`.`t8` on(((`test`.`t8`.`b` = `test`.`t5`.`b`) and (`test`.`t6`.`b` < 10)))) on(((`test`.`t7`.`b` = `test`.`t5`.`b`) and (`test`.`t6`.`b` >= 2)))) on((((`test`.`t3`.`b` = 2) or isnull(`test`.`t3`.`c`)) and ((`test`.`t6`.`b` = 2) or isnull(`test`.`t6`.`c`)) and ((`test`.`t5`.`b` = `test`.`t0`.`b`) or isnull(`test`.`t3`.`c`) or isnull(`test`.`t6`.`c`) or isnull(`test`.`t8`.`c`)) and (`test`.`t1`.`a` <> 2))) join `test`.`t9` where ((`test`.`t9`.`a` = 1) and (`test`.`t1`.`b` = `test`.`t0`.`b`) and (`test`.`t0`.`a` = 1) and ((`test`.`t2`.`a` >= 4) or isnull(`test`.`t2`.`c`)) and ((`test`.`t3`.`a` < 5) or isnull(`test`.`t3`.`c`)) and ((`test`.`t4`.`b` = `test`.`t3`.`b`) or isnull(`test`.`t3`.`c`) or isnull(`test`.`t4`.`c`)) and ((`test`.`t5`.`a` >= 2) or isnull(`test`.`t5`.`c`)) and ((`test`.`t6`.`a` >= 4) or isnull(`test`.`t6`.`c`)) and ((`test`.`t7`.`a` <= 2) or isnull(`test`.`t7`.`c`)) and ((`test`.`t8`.`a` < 1) or isnull(`test`.`t8`.`c`)) and ((`test`.`t9`.`b` = `test`.`t8`.`b`) or isnull(`test`.`t8`.`c`)))
+SELECT t0.a,t0.b,t1.a,t1.b,t2.a,t2.b,t3.a,t3.b,t4.a,t4.b,
+t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b,t9.a,t9.b
+FROM t0,t1
+LEFT JOIN
+(
+t2
+LEFT JOIN
+(t3, t4)
+ON t3.a=1 AND t2.b=t4.b,
+t5
+LEFT JOIN
+(
+(t6, t7)
+LEFT JOIN
+t8
+ON t7.b=t8.b AND t6.b < 10
+)
+ON t6.b >= 2 AND t5.b=t7.b
+)
+ON (t3.b=2 OR t3.c IS NULL) AND (t6.b=2 OR t6.c IS NULL) AND
+(t1.b=t5.b OR t3.c IS NULL OR t6.c IS NULL or t8.c IS NULL) AND
+(t1.a != 2),
+t9
+WHERE t0.a=1 AND
+t0.b=t1.b AND
+(t2.a >= 4 OR t2.c IS NULL) AND
+(t3.a < 5 OR t3.c IS NULL) AND
+(t3.b=t4.b OR t3.c IS NULL OR t4.c IS NULL) AND
+(t5.a >=2 OR t5.c IS NULL) AND
+(t6.a >=4 OR t6.c IS NULL) AND
+(t7.a <= 2 OR t7.c IS NULL) AND
+(t8.a < 1 OR t8.c IS NULL) AND
+(t8.b=t9.b OR t8.c IS NULL) AND
+(t9.a=1);
+a b a b a b a b a b a b a b a b a b a b
+1 2 3 2 4 2 1 2 3 2 2 2 6 2 2 2 0 2 1 2
+1 2 3 2 4 2 1 2 4 2 2 2 6 2 2 2 0 2 1 2
+1 2 3 2 4 2 1 2 3 2 3 1 6 2 1 1 NULL NULL 1 1
+1 2 3 2 4 2 1 2 4 2 3 1 6 2 1 1 NULL NULL 1 1
+1 2 3 2 4 2 1 2 3 2 3 1 6 2 1 1 NULL NULL 1 2
+1 2 3 2 4 2 1 2 4 2 3 1 6 2 1 1 NULL NULL 1 2
+1 2 3 2 4 2 1 2 3 2 3 3 NULL NULL NULL NULL NULL NULL 1 1
+1 2 3 2 4 2 1 2 4 2 3 3 NULL NULL NULL NULL NULL NULL 1 1
+1 2 3 2 4 2 1 2 3 2 3 3 NULL NULL NULL NULL NULL NULL 1 2
+1 2 3 2 4 2 1 2 4 2 3 3 NULL NULL NULL NULL NULL NULL 1 2
+1 2 3 2 5 3 NULL NULL NULL NULL 2 2 6 2 2 2 0 2 1 2
+1 2 3 2 5 3 NULL NULL NULL NULL 3 1 6 2 1 1 NULL NULL 1 1
+1 2 3 2 5 3 NULL NULL NULL NULL 3 1 6 2 1 1 NULL NULL 1 2
+1 2 3 2 5 3 NULL NULL NULL NULL 3 3 NULL NULL NULL NULL NULL NULL 1 1
+1 2 3 2 5 3 NULL NULL NULL NULL 3 3 NULL NULL NULL NULL NULL NULL 1 2
+1 2 2 2 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL 1 1
+1 2 2 2 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL 1 2
+SELECT t2.a,t2.b
+FROM t2;
+a b
+3 3
+4 2
+5 3
+SELECT t3.a,t3.b
+FROM t3;
+a b
+1 2
+2 2
+SELECT t2.a,t2.b,t3.a,t3.b
+FROM t2 LEFT JOIN t3 ON t2.b=t3.b
+WHERE t2.a = 4 OR (t2.a > 4 AND t3.a IS NULL);
+a b a b
+4 2 1 2
+4 2 2 2
+5 3 NULL NULL
+SELECT t2.a,t2.b,t3.a,t3.b
+FROM t2 LEFT JOIN (t3) ON t2.b=t3.b
+WHERE t2.a = 4 OR (t2.a > 4 AND t3.a IS NULL);
+a b a b
+4 2 1 2
+4 2 2 2
+5 3 NULL NULL
+ALTER TABLE t3
+CHANGE COLUMN a a1 int,
+CHANGE COLUMN c c1 int;
+SELECT t2.a,t2.b,t3.a1,t3.b
+FROM t2 LEFT JOIN t3 ON t2.b=t3.b
+WHERE t2.a = 4 OR (t2.a > 4 AND t3.a1 IS NULL);
+a b a1 b
+4 2 1 2
+4 2 2 2
+5 3 NULL NULL
+SELECT t2.a,t2.b,t3.a1,t3.b
+FROM t2 NATURAL LEFT JOIN t3
+WHERE t2.a = 4 OR (t2.a > 4 AND t3.a1 IS NULL);
+a b a1 b
+4 2 1 2
+4 2 2 2
+5 3 NULL NULL
+DROP TABLE t0,t1,t2,t3,t4,t5,t6,t7,t8,t9;
+CREATE TABLE t1 (a int);
+CREATE TABLE t2 (a int);
+CREATE TABLE t3 (a int);
+INSERT INTO t1 VALUES (1);
+INSERT INTO t2 VALUES (2);
+INSERT INTO t3 VALUES (2);
+INSERT INTO t1 VALUES (2);
+SELECT * FROM t1 LEFT JOIN (t2 LEFT JOIN t3 ON t2.a=t3.a) ON t1.a=t3.a;
+a a a
+2 2 2
+1 NULL NULL
+SELECT * FROM t1 LEFT JOIN t2 LEFT JOIN t3 ON t2.a=t3.a ON t1.a=t3.a;
+a a a
+2 2 2
+1 NULL NULL
+DELETE FROM t1 WHERE a=2;
+SELECT * FROM t1 LEFT JOIN t2 LEFT JOIN t3 ON t2.a=t3.a ON t1.a=t3.a;
+a a a
+1 NULL NULL
+DELETE FROM t2;
+SELECT * FROM t1 LEFT JOIN t2 LEFT JOIN t3 ON t2.a=t3.a ON t1.a=t3.a;
+a a a
+1 NULL NULL
+DROP TABLE t1,t2,t3;
+CREATE TABLE t1(a int, key (a));
+CREATE TABLE t2(b int, key (b));
+CREATE TABLE t3(c int, key (c));
+INSERT INTO t1 VALUES (NULL), (0), (1), (2), (3), (4), (5), (6), (7), (8), (9),
+(10), (11), (12), (13), (14), (15), (16), (17), (18), (19);
+INSERT INTO t2 VALUES (NULL), (0), (1), (2), (3), (4), (5), (6), (7), (8), (9),
+(10), (11), (12), (13), (14), (15), (16), (17), (18), (19);
+INSERT INTO t3 VALUES (0), (1), (2), (3), (4), (5);
+EXPLAIN SELECT a, b, c FROM t1 LEFT JOIN (t2, t3) ON c < 3 and b = c;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 index NULL a 5 NULL 21 Using index
+1 SIMPLE t3 index c c 5 NULL 6 Using where; Using index
+1 SIMPLE t2 ref b b 5 test.t3.c 2 Using index
+EXPLAIN SELECT a, b, c FROM t1 LEFT JOIN (t2, t3) ON b < 3 and b = c;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 index NULL a 5 NULL 21 Using index
+1 SIMPLE t3 index c c 5 NULL 6 Using index
+1 SIMPLE t2 ref b b 5 test.t3.c 2 Using where; Using index
+SELECT a, b, c FROM t1 LEFT JOIN (t2, t3) ON b < 3 and b = c;
+a b c
+NULL 0 0
+NULL 1 1
+NULL 2 2
+0 0 0
+0 1 1
+0 2 2
+1 0 0
+1 1 1
+1 2 2
+2 0 0
+2 1 1
+2 2 2
+3 0 0
+3 1 1
+3 2 2
+4 0 0
+4 1 1
+4 2 2
+5 0 0
+5 1 1
+5 2 2
+6 0 0
+6 1 1
+6 2 2
+7 0 0
+7 1 1
+7 2 2
+8 0 0
+8 1 1
+8 2 2
+9 0 0
+9 1 1
+9 2 2
+10 0 0
+10 1 1
+10 2 2
+11 0 0
+11 1 1
+11 2 2
+12 0 0
+12 1 1
+12 2 2
+13 0 0
+13 1 1
+13 2 2
+14 0 0
+14 1 1
+14 2 2
+15 0 0
+15 1 1
+15 2 2
+16 0 0
+16 1 1
+16 2 2
+17 0 0
+17 1 1
+17 2 2
+18 0 0
+18 1 1
+18 2 2
+19 0 0
+19 1 1
+19 2 2
+DELETE FROM t3;
+EXPLAIN SELECT a, b, c FROM t1 LEFT JOIN (t2, t3) ON b < 3 and b = c;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 index NULL a 5 NULL 21 Using index
+1 SIMPLE t3 index c c 5 NULL 0 Using index
+1 SIMPLE t2 ref b b 5 test.t3.c 2 Using where; Using index
+SELECT a, b, c FROM t1 LEFT JOIN (t2, t3) ON b < 3 and b = c;
+a b c
+NULL NULL NULL
+0 NULL NULL
+1 NULL NULL
+2 NULL NULL
+3 NULL NULL
+4 NULL NULL
+5 NULL NULL
+6 NULL NULL
+7 NULL NULL
+8 NULL NULL
+9 NULL NULL
+10 NULL NULL
+11 NULL NULL
+12 NULL NULL
+13 NULL NULL
+14 NULL NULL
+15 NULL NULL
+16 NULL NULL
+17 NULL NULL
+18 NULL NULL
+19 NULL NULL
+DROP TABLE t1,t2,t3;
+CREATE TABLE t1 (c11 int);
+CREATE TABLE t2 (c21 int);
+CREATE TABLE t3 (c31 int);
+INSERT INTO t1 VALUES (4), (5);
+SELECT * FROM t1 LEFT JOIN t2 ON c11=c21;
+c11 c21
+4 NULL
+5 NULL
+EXPLAIN SELECT * FROM t1 LEFT JOIN t2 ON c11=c21;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 system NULL NULL NULL NULL 0 const row not found
+1 SIMPLE t1 ALL NULL NULL NULL NULL 2
+SELECT * FROM t1 LEFT JOIN (t2 LEFT JOIN t3 ON c21=c31) ON c11=c21;
+c11 c21 c31
+4 NULL NULL
+5 NULL NULL
+EXPLAIN SELECT * FROM t1 LEFT JOIN (t2 LEFT JOIN t3 ON c21=c31) ON c11=c21;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 2
+1 SIMPLE t2 ALL NULL NULL NULL NULL 0 Using where; Using join buffer
+1 SIMPLE t3 ALL NULL NULL NULL NULL 0 Using where; Using join buffer
+DROP TABLE t1,t2,t3;
+CREATE TABLE t1 (goods int(12) NOT NULL, price varchar(128) NOT NULL);
+INSERT INTO t1 VALUES (23, 2340), (26, 9900);
+CREATE TABLE t2 (goods int(12), name varchar(50), shop char(2));
+INSERT INTO t2 VALUES (23, 'as300', 'fr'), (26, 'as600', 'fr');
+create table t3 (groupid int(12) NOT NULL, goodsid int(12) NOT NULL);
+INSERT INTO t3 VALUES (3,23), (6,26);
+CREATE TABLE t4 (groupid int(12));
+INSERT INTO t4 VALUES (1), (2), (3), (4), (5), (6);
+SELECT * FROM
+(SELECT DISTINCT gl.groupid, gp.price
+FROM t4 gl
+LEFT JOIN
+(t3 g INNER JOIN t2 p ON g.goodsid = p.goods
+INNER JOIN t1 gp ON p.goods = gp.goods)
+ON gl.groupid = g.groupid and p.shop = 'fr') t;
+groupid price
+3 2340
+6 9900
+1 NULL
+2 NULL
+4 NULL
+5 NULL
+CREATE VIEW v1 AS
+SELECT g.groupid groupid, p.goods goods,
+p.name name, p.shop shop,
+gp.price price
+FROM t3 g INNER JOIN t2 p ON g.goodsid = p.goods
+INNER JOIN t1 gp on p.goods = gp.goods;
+CREATE VIEW v2 AS
+SELECT DISTINCT g.groupid, fr.price
+FROM t4 g
+LEFT JOIN
+v1 fr on g.groupid = fr.groupid and fr.shop = 'fr';
+SELECT * FROM v2;
+groupid price
+3 2340
+6 9900
+1 NULL
+2 NULL
+4 NULL
+5 NULL
+SELECT * FROM
+(SELECT DISTINCT g.groupid, fr.price
+FROM t4 g
+LEFT JOIN
+v1 fr on g.groupid = fr.groupid and fr.shop = 'fr') t;
+groupid price
+3 2340
+6 9900
+1 NULL
+2 NULL
+4 NULL
+5 NULL
+DROP VIEW v1,v2;
+DROP TABLE t1,t2,t3,t4;
+CREATE TABLE t1(a int);
+CREATE TABLE t2(b int);
+CREATE TABLE t3(c int, d int);
+CREATE TABLE t4(d int);
+CREATE TABLE t5(e int, f int);
+CREATE TABLE t6(f int);
+CREATE VIEW v1 AS
+SELECT e FROM t5 JOIN t6 ON t5.e=t6.f;
+CREATE VIEW v2 AS
+SELECT e FROM t5 NATURAL JOIN t6;
+SELECT t1.a FROM t1 JOIN t2 ON a=b JOIN t3 ON a=c JOIN t4 USING(d);
+a
+SELECT t1.x FROM t1 JOIN t2 ON a=b JOIN t3 ON a=c JOIN t4 USING(d);
+ERROR 42S22: Unknown column 't1.x' in 'field list'
+SELECT t1.a FROM t1 JOIN t2 ON a=b JOIN t3 ON a=c NATURAL JOIN t4;
+a
+SELECT t1.x FROM t1 JOIN t2 ON a=b JOIN t3 ON a=c NATURAL JOIN t4;
+ERROR 42S22: Unknown column 't1.x' in 'field list'
+SELECT v1.e FROM v1 JOIN t2 ON e=b JOIN t3 ON e=c JOIN t4 USING(d);
+e
+SELECT v1.x FROM v1 JOIN t2 ON e=b JOIN t3 ON e=c JOIN t4 USING(d);
+ERROR 42S22: Unknown column 'v1.x' in 'field list'
+SELECT v2.e FROM v2 JOIN t2 ON e=b JOIN t3 ON e=c JOIN t4 USING(d);
+e
+SELECT v2.x FROM v2 JOIN t2 ON e=b JOIN t3 ON e=c JOIN t4 USING(d);
+ERROR 42S22: Unknown column 'v2.x' in 'field list'
+DROP VIEW v1, v2;
+DROP TABLE t1, t2, t3, t4, t5, t6;
+create table t1 (id1 int(11) not null);
+insert into t1 values (1),(2);
+create table t2 (id2 int(11) not null);
+insert into t2 values (1),(2),(3),(4);
+create table t3 (id3 char(16) not null);
+insert into t3 values ('100');
+create table t4 (id2 int(11) not null, id3 char(16));
+create table t5 (id1 int(11) not null, key (id1));
+insert into t5 values (1),(2),(1);
+create view v1 as
+select t4.id3 from t4 join t2 on t4.id2 = t2.id2;
+select t1.id1 from t1 inner join (t3 left join v1 on t3.id3 = v1.id3);
+id1
+1
+2
+drop view v1;
+drop table t1, t2, t3, t4, t5;
+create table t0 (a int);
+insert into t0 values (0),(1),(2),(3);
+create table t1(a int);
+insert into t1 select A.a + 10*(B.a) from t0 A, t0 B;
+create table t2 (a int, b int);
+insert into t2 values (1,1), (2,2), (3,3);
+create table t3(a int, b int, filler char(200), key(a));
+insert into t3 select a,a,'filler' from t1;
+insert into t3 select a,a,'filler' from t1;
+create table t4 like t3;
+insert into t4 select * from t3;
+insert into t4 select * from t3;
+create table t5 like t4;
+insert into t5 select * from t4;
+insert into t5 select * from t4;
+create table t6 like t5;
+insert into t6 select * from t5;
+insert into t6 select * from t5;
+create table t7 like t6;
+insert into t7 select * from t6;
+insert into t7 select * from t6;
+explain select * from t4 join
+t2 left join (t3 join t5 on t5.a=t3.b) on t3.a=t2.b where t4.a<=>t3.b;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ALL NULL NULL NULL NULL X
+1 SIMPLE t3 ref a a 5 test.t2.b X Using join buffer
+1 SIMPLE t5 ref a a 5 test.t3.b X Using join buffer
+1 SIMPLE t4 ref a a 5 test.t3.b X Using index condition(BKA); Using join buffer
+explain select * from (t4 join t6 on t6.a=t4.b) right join t3 on t4.a=t3.b
+join t2 left join (t5 join t7 on t7.a=t5.b) on t5.a=t2.b where t3.a<=>t2.b;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ALL NULL NULL NULL NULL X
+1 SIMPLE t3 ref a a 5 test.t2.b X Using index condition(BKA); Using join buffer
+1 SIMPLE t4 ref a a 5 test.t3.b X Using join buffer
+1 SIMPLE t6 ref a a 5 test.t4.b X Using join buffer
+1 SIMPLE t5 ref a a 5 test.t2.b X Using join buffer
+1 SIMPLE t7 ref a a 5 test.t5.b X Using join buffer
+explain select * from t2 left join
+(t3 left join (t4 join t6 on t6.a=t4.b) on t4.a=t3.b
+join t5 on t5.a=t3.b) on t3.a=t2.b;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ALL NULL NULL NULL NULL X
+1 SIMPLE t3 ref a a 5 test.t2.b X Using join buffer
+1 SIMPLE t4 ref a a 5 test.t3.b X Using join buffer
+1 SIMPLE t6 ref a a 5 test.t4.b X Using join buffer
+1 SIMPLE t5 ref a a 5 test.t3.b X Using join buffer
+drop table t0, t1, t2, t3, t4, t5, t6, t7;
+create table t1 (a int);
+insert into t1 values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
+create table t2 (a int, filler char(100), key(a));
+insert into t2 select A.a + 10*B.a, '' from t1 A, t1 B;
+create table t3 like t2;
+insert into t3 select * from t2;
+explain select * from t1 left join
+(t2 left join t3 on (t2.a = t3.a))
+on (t1.a = t2.a);
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 10
+1 SIMPLE t2 ref a a 5 test.t1.a 1 Using join buffer
+1 SIMPLE t3 ref a a 5 test.t2.a 1 Using join buffer
+drop table t1, t2, t3;
+CREATE TABLE t1 (id int NOT NULL PRIMARY KEY, type varchar(10));
+CREATE TABLE t2 (pid int NOT NULL PRIMARY KEY, type varchar(10));
+CREATE TABLE t3 (cid int NOT NULL PRIMARY KEY,
+id int NOT NULL,
+pid int NOT NULL);
+INSERT INTO t1 VALUES (1, 'A'), (3, 'C');
+INSERT INTO t2 VALUES (1, 'A'), (3, 'C');
+INSERT INTO t3 VALUES (1, 1, 1), (3, 3, 3);
+SELECT * FROM t1 p LEFT JOIN (t3 JOIN t1)
+ON (t1.id=t3.id AND t1.type='B' AND p.id=t3.id)
+LEFT JOIN t2 ON (t3.pid=t2.pid)
+WHERE p.id=1;
+id type cid id pid id type pid type
+1 A NULL NULL NULL NULL NULL NULL NULL
+CREATE VIEW v1 AS
+SELECT t3.* FROM t3 JOIN t1 ON t1.id=t3.id AND t1.type='B';
+SELECT * FROM t1 p LEFT JOIN v1 ON p.id=v1.id
+LEFT JOIN t2 ON v1.pid=t2.pid
+WHERE p.id=1;
+id type cid id pid pid type
+1 A NULL NULL NULL NULL NULL
+DROP VIEW v1;
+DROP TABLE t1,t2,t3;
+CREATE TABLE t1 (id1 int PRIMARY KEY, id2 int);
+CREATE TABLE t2 (id1 int PRIMARY KEY, id2 int);
+CREATE TABLE t3 (id1 int PRIMARY KEY, id2 int);
+CREATE TABLE t4 (id1 int PRIMARY KEY, id2 int);
+CREATE TABLE t5 (id1 int PRIMARY KEY, id2 int);
+SELECT t1.id1 AS id, t5.id1 AS ngroupbynsa
+FROM t1 INNER JOIN t2 ON t2.id2 = t1.id1
+LEFT OUTER JOIN
+(t3 INNER JOIN t4 ON t4.id1 = t3.id2 INNER JOIN t5 ON t4.id2 = t5.id1)
+ON t3.id2 IS NOT NULL
+WHERE t1.id1=2;
+id ngroupbynsa
+PREPARE stmt FROM
+"SELECT t1.id1 AS id, t5.id1 AS ngroupbynsa
+ FROM t1 INNER JOIN t2 ON t2.id2 = t1.id1
+ LEFT OUTER JOIN
+ (t3 INNER JOIN t4 ON t4.id1 = t3.id2 INNER JOIN t5 ON t4.id2 = t5.id1)
+ ON t3.id2 IS NOT NULL
+ WHERE t1.id1=2";
+EXECUTE stmt;
+id ngroupbynsa
+EXECUTE stmt;
+id ngroupbynsa
+EXECUTE stmt;
+id ngroupbynsa
+EXECUTE stmt;
+id ngroupbynsa
+INSERT INTO t1 VALUES (1,1), (2,1), (3,2);
+INSERT INTO t2 VALUES (2,1), (3,2), (4,3);
+INSERT INTO t3 VALUES (1,1), (3,2), (2,NULL);
+INSERT INTO t4 VALUES (1,1), (2,1), (3,3);
+INSERT INTO t5 VALUES (1,1), (2,2), (3,3), (4,3);
+EXECUTE stmt;
+id ngroupbynsa
+2 1
+2 1
+EXECUTE stmt;
+id ngroupbynsa
+2 1
+2 1
+EXECUTE stmt;
+id ngroupbynsa
+2 1
+2 1
+EXECUTE stmt;
+id ngroupbynsa
+2 1
+2 1
+SELECT t1.id1 AS id, t5.id1 AS ngroupbynsa
+FROM t1 INNER JOIN t2 ON t2.id2 = t1.id1
+LEFT OUTER JOIN
+(t3 INNER JOIN t4 ON t4.id1 = t3.id2 INNER JOIN t5 ON t4.id2 = t5.id1)
+ON t3.id2 IS NOT NULL
+WHERE t1.id1=2;
+id ngroupbynsa
+2 1
+2 1
+DROP TABLE t1,t2,t3,t4,t5;
+CREATE TABLE t1 (
+id int NOT NULL PRIMARY KEY,
+ct int DEFAULT NULL,
+pc int DEFAULT NULL,
+INDEX idx_ct (ct),
+INDEX idx_pc (pc)
+);
+INSERT INTO t1 VALUES
+(1,NULL,NULL),(2,NULL,NULL),(3,NULL,NULL),(4,NULL,NULL),(5,NULL,NULL);
+CREATE TABLE t2 (
+id int NOT NULL PRIMARY KEY,
+sr int NOT NULL,
+nm varchar(255) NOT NULL,
+INDEX idx_sr (sr)
+);
+INSERT INTO t2 VALUES
+(2441905,4308,'LesAbymes'),(2441906,4308,'Anse-Bertrand');
+CREATE TABLE t3 (
+id int NOT NULL PRIMARY KEY,
+ct int NOT NULL,
+ln int NOT NULL,
+INDEX idx_ct (ct),
+INDEX idx_ln (ln)
+);
+CREATE TABLE t4 (
+id int NOT NULL PRIMARY KEY,
+nm varchar(255) NOT NULL
+);
+INSERT INTO t4 VALUES (4308,'Guadeloupe'),(4309,'Martinique');
+SELECT t1.*
+FROM t1 LEFT JOIN
+(t2 LEFT JOIN t3 ON t3.ct=t2.id AND t3.ln='5') ON t1.ct=t2.id
+WHERE t1.id='5';
+id ct pc
+5 NULL NULL
+SELECT t1.*, t4.nm
+FROM t1 LEFT JOIN
+(t2 LEFT JOIN t3 ON t3.ct=t2.id AND t3.ln='5') ON t1.ct=t2.id
+LEFT JOIN t4 ON t2.sr=t4.id
+WHERE t1.id='5';
+id ct pc nm
+5 NULL NULL NULL
+DROP TABLE t1,t2,t3,t4;
+CREATE TABLE t1 (a INT, b INT);
+CREATE TABLE t2 (a INT);
+CREATE TABLE t3 (a INT, c INT);
+CREATE TABLE t4 (a INT, c INT);
+CREATE TABLE t5 (a INT, c INT);
+SELECT b FROM t1 JOIN (t2 LEFT JOIN t3 USING (a) LEFT JOIN t4 USING (a)
+LEFT JOIN t5 USING (a)) USING (a);
+b
+SELECT c FROM t1 JOIN (t2 LEFT JOIN t3 USING (a) LEFT JOIN t4 USING (a)
+LEFT JOIN t5 USING (a)) USING (a);
+ERROR 23000: Column 'c' in field list is ambiguous
+SELECT b FROM t1 JOIN (t2 JOIN t3 USING (a) JOIN t4 USING (a)
+JOIN t5 USING (a)) USING (a);
+b
+SELECT c FROM t1 JOIN (t2 JOIN t3 USING (a) JOIN t4 USING (a)
+JOIN t5 USING (a)) USING (a);
+ERROR 23000: Column 'c' in field list is ambiguous
+DROP TABLE t1,t2,t3,t4,t5;
+CREATE TABLE t1 (a INT, b INT);
+CREATE TABLE t2 (a INT, b INT);
+CREATE TABLE t3 (a INT, b INT);
+INSERT INTO t1 VALUES (1,1);
+INSERT INTO t2 VALUES (1,1);
+INSERT INTO t3 VALUES (1,1);
+SELECT * FROM t1 JOIN (t2 JOIN t3 USING (b)) USING (a);
+ERROR 23000: Column 'a' in from clause is ambiguous
+DROP TABLE t1,t2,t3;
+CREATE TABLE t1 (
+carrier char(2) default NULL,
+id int NOT NULL auto_increment PRIMARY KEY
+);
+INSERT INTO t1 VALUES
+('CO',235371754),('CO',235376554),('CO',235376884),('CO',235377874),
+('CO',231060394),('CO',231059224),('CO',231059314),('CO',231060484),
+('CO',231060274),('CO',231060124),('CO',231060244),('CO',231058594),
+('CO',231058924),('CO',231058504),('CO',231059344),('CO',231060424),
+('CO',231059554),('CO',231060304),('CO',231059644),('CO',231059464),
+('CO',231059764),('CO',231058294),('CO',231058624),('CO',231058864),
+('CO',231059374),('CO',231059584),('CO',231059734),('CO',231059014),
+('CO',231059854),('CO',231059494),('CO',231059794),('CO',231058534),
+('CO',231058324),('CO',231058684),('CO',231059524),('CO',231059974);
+CREATE TABLE t2 (
+scan_date date default NULL,
+package_id int default NULL,
+INDEX scan_date(scan_date),
+INDEX package_id(package_id)
+);
+INSERT INTO t2 VALUES
+('2008-12-29',231062944),('2008-12-29',231065764),('2008-12-29',231066124),
+('2008-12-29',231060094),('2008-12-29',231061054),('2008-12-29',231065644),
+('2008-12-29',231064384),('2008-12-29',231064444),('2008-12-29',231073774),
+('2008-12-29',231058594),('2008-12-29',231059374),('2008-12-29',231066004),
+('2008-12-29',231068494),('2008-12-29',231070174),('2008-12-29',231071884),
+('2008-12-29',231063274),('2008-12-29',231063754),('2008-12-29',231064144),
+('2008-12-29',231069424),('2008-12-29',231073714),('2008-12-29',231058414),
+('2008-12-29',231060994),('2008-12-29',231069154),('2008-12-29',231068614),
+('2008-12-29',231071464),('2008-12-29',231074014),('2008-12-29',231059614),
+('2008-12-29',231059074),('2008-12-29',231059464),('2008-12-29',231069094),
+('2008-12-29',231067294),('2008-12-29',231070144),('2008-12-29',231073804),
+('2008-12-29',231072634),('2008-12-29',231058294),('2008-12-29',231065344),
+('2008-12-29',231066094),('2008-12-29',231069034),('2008-12-29',231058594),
+('2008-12-29',231059854),('2008-12-29',231059884),('2008-12-29',231059914),
+('2008-12-29',231063664),('2008-12-29',231063814),('2008-12-29',231063904);
+CREATE TABLE t3 (
+package_id int default NULL,
+INDEX package_id(package_id)
+);
+INSERT INTO t3 VALUES
+(231058294),(231058324),(231058354),(231058384),(231058414),(231058444),
+(231058474),(231058504),(231058534),(231058564),(231058594),(231058624),
+(231058684),(231058744),(231058804),(231058864),(231058924),(231058954),
+(231059014),(231059074),(231059104),(231059134),(231059164),(231059194),
+(231059224),(231059254),(231059284),(231059314),(231059344),(231059374),
+(231059404),(231059434),(231059464),(231059494),(231059524),(231059554),
+(231059584),(231059614),(231059644),(231059674),(231059704),(231059734),
+(231059764),(231059794),(231059824),(231059854),(231059884),(231059914),
+(231059944),(231059974),(231060004),(231060034),(231060064),(231060094),
+(231060124),(231060154),(231060184),(231060214),(231060244),(231060274),
+(231060304),(231060334),(231060364),(231060394),(231060424),(231060454),
+(231060484),(231060514),(231060544),(231060574),(231060604),(231060634),
+(231060664),(231060694),(231060724),(231060754),(231060784),(231060814),
+(231060844),(231060874),(231060904),(231060934),(231060964),(231060994),
+(231061024),(231061054),(231061084),(231061144),(231061174),(231061204),
+(231061234),(231061294),(231061354),(231061384),(231061414),(231061474),
+(231061564),(231061594),(231061624),(231061684),(231061714),(231061774),
+(231061804),(231061894),(231061984),(231062074),(231062134),(231062224),
+(231062254),(231062314),(231062374),(231062434),(231062494),(231062554),
+(231062584),(231062614),(231062644),(231062704),(231062734),(231062794),
+(231062854),(231062884),(231062944),(231063004),(231063034),(231063064),
+(231063124),(231063154),(231063184),(231063214),(231063274),(231063334),
+(231063394),(231063424),(231063454),(231063514),(231063574),(231063664);
+CREATE TABLE t4 (
+carrier char(2) NOT NULL default '' PRIMARY KEY,
+id int(11) default NULL,
+INDEX id(id)
+);
+INSERT INTO t4 VALUES
+('99',6),('SK',456),('UA',486),('AI',1081),('OS',1111),('VS',1510);
+CREATE TABLE t5 (
+carrier_id int default NULL,
+INDEX carrier_id(carrier_id)
+);
+INSERT INTO t5 VALUES
+(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),
+(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),
+(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),
+(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),
+(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),
+(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(6),(456),(456),(456),
+(456),(456),(456),(456),(456),(456),(456),(456),(456),(456),(456),(456),
+(456),(486),(1081),(1111),(1111),(1111),(1111),(1510);
+SELECT COUNT(*)
+FROM((t2 JOIN t1 ON t2.package_id = t1.id)
+JOIN t3 ON t3.package_id = t1.id);
+COUNT(*)
+6
+EXPLAIN
+SELECT COUNT(*)
+FROM ((t2 JOIN t1 ON t2.package_id = t1.id)
+JOIN t3 ON t3.package_id = t1.id)
+LEFT JOIN
+(t5 JOIN t4 ON t5.carrier_id = t4.id)
+ON t4.carrier = t1.carrier;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 index package_id package_id 5 NULL 45 Using index
+1 SIMPLE t1 eq_ref PRIMARY PRIMARY 4 test.t2.package_id 1 Using join buffer
+1 SIMPLE t4 eq_ref PRIMARY,id PRIMARY 2 test.t1.carrier 1
+1 SIMPLE t5 ref carrier_id carrier_id 5 test.t4.id 22 Using index
+1 SIMPLE t3 ref package_id package_id 5 test.t1.id 1 Using where; Using index
+SELECT COUNT(*)
+FROM ((t2 JOIN t1 ON t2.package_id = t1.id)
+JOIN t3 ON t3.package_id = t1.id)
+LEFT JOIN
+(t5 JOIN t4 ON t5.carrier_id = t4.id)
+ON t4.carrier = t1.carrier;
+COUNT(*)
+6
+DROP TABLE t1,t2,t3,t4,t5;
+End of 5.0 tests
+CREATE TABLE t5 (a int, b int, c int, PRIMARY KEY(a), KEY b_i (b));
+CREATE TABLE t6 (a int, b int, c int, PRIMARY KEY(a), KEY b_i (b));
+CREATE TABLE t7 (a int, b int, c int, PRIMARY KEY(a), KEY b_i (b));
+CREATE TABLE t8 (a int, b int, c int, PRIMARY KEY(a), KEY b_i (b));
+INSERT INTO t5 VALUES (1,1,0), (2,2,0), (3,3,0);
+INSERT INTO t6 VALUES (1,2,0), (3,2,0), (6,1,0);
+INSERT INTO t7 VALUES (1,1,0), (2,2,0);
+INSERT INTO t8 VALUES (0,2,0), (1,2,0);
+EXPLAIN
+SELECT t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+FROM t5
+LEFT JOIN
+(
+(t6, t7)
+LEFT JOIN
+t8
+ON t7.b=t8.b AND t6.b < 10
+)
+ON t6.b >= 2 AND t5.b=t7.b AND
+(t8.a > 0 OR t8.c IS NULL);
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t5 ALL NULL NULL NULL NULL 3
+1 SIMPLE t7 ref b_i b_i 5 test.t5.b 2 Using join buffer
+1 SIMPLE t6 ALL b_i NULL NULL NULL 3 Using where; Using join buffer
+1 SIMPLE t8 ref b_i b_i 5 test.t7.b 2 Using where; Using join buffer
+SELECT t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+FROM t5
+LEFT JOIN
+(
+(t6, t7)
+LEFT JOIN
+t8
+ON t7.b=t8.b AND t6.b < 10
+)
+ON t6.b >= 2 AND t5.b=t7.b AND
+(t8.a > 0 OR t8.c IS NULL);
+a b a b a b a b
+2 2 1 2 2 2 1 2
+2 2 3 2 2 2 1 2
+1 1 1 2 1 1 NULL NULL
+1 1 3 2 1 1 NULL NULL
+3 3 NULL NULL NULL NULL NULL NULL
+DELETE FROM t5;
+DELETE FROM t6;
+DELETE FROM t7;
+DELETE FROM t8;
+INSERT INTO t5 VALUES (1,3,0), (3,2,0);
+INSERT INTO t6 VALUES (3,3,0);
+INSERT INTO t7 VALUES (1,2,0);
+INSERT INTO t8 VALUES (1,1,0);
+EXPLAIN
+SELECT t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+FROM t5 LEFT JOIN
+(t6 LEFT JOIN t7 ON t7.a=1, t8)
+ON (t5.b=t8.b);
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t5 ALL NULL NULL NULL NULL 2
+1 SIMPLE t6 ALL NULL NULL NULL NULL 1 Using join buffer
+1 SIMPLE t7 const PRIMARY PRIMARY 4 const 1 Using join buffer
+1 SIMPLE t8 ALL b_i NULL NULL NULL 1 Using where; Using join buffer
+SELECT t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+FROM t5 LEFT JOIN
+(t6 LEFT JOIN t7 ON t7.a=1, t8)
+ON (t5.b=t8.b);
+a b a b a b a b
+1 3 NULL NULL NULL NULL NULL NULL
+3 2 NULL NULL NULL NULL NULL NULL
+EXPLAIN
+SELECT t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+FROM t5 LEFT JOIN
+(t6 LEFT JOIN t7 ON t7.b=2, t8)
+ON (t5.b=t8.b);
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t5 ALL NULL NULL NULL NULL 2
+1 SIMPLE t6 ALL NULL NULL NULL NULL 1 Using join buffer
+1 SIMPLE t7 ref b_i b_i 5 const 0 Using join buffer
+1 SIMPLE t8 ALL b_i NULL NULL NULL 1 Using where; Using join buffer
+SELECT t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+FROM t5 LEFT JOIN
+(t6 LEFT JOIN t7 ON t7.b=2, t8)
+ON (t5.b=t8.b);
+a b a b a b a b
+1 3 NULL NULL NULL NULL NULL NULL
+3 2 NULL NULL NULL NULL NULL NULL
+EXPLAIN
+SELECT t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+FROM t5 LEFT JOIN
+(t8, t6 LEFT JOIN t7 ON t7.a=1)
+ON (t5.b=t8.b);
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t5 ALL NULL NULL NULL NULL 2
+1 SIMPLE t8 ALL b_i NULL NULL NULL 1 Using where; Using join buffer
+1 SIMPLE t6 ALL NULL NULL NULL NULL 1 Using join buffer
+1 SIMPLE t7 const PRIMARY PRIMARY 4 const 1 Using join buffer
+SELECT t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+FROM t5 LEFT JOIN
+(t8, t6 LEFT JOIN t7 ON t7.a=1)
+ON (t5.b=t8.b);
+a b a b a b a b
+1 3 NULL NULL NULL NULL NULL NULL
+3 2 NULL NULL NULL NULL NULL NULL
+DROP TABLE t5,t6,t7,t8;
+set join_cache_level=default;
+show variables like 'join_cache_level';
+Variable_name Value
+join_cache_level 1
=== modified file 'mysql-test/r/join_outer.result'
--- a/mysql-test/r/join_outer.result 2009-12-15 07:16:46 +0000
+++ b/mysql-test/r/join_outer.result 2009-12-21 02:26:15 +0000
@@ -859,14 +859,14 @@ a1 a2
EXPLAIN SELECT * FROM t1 LEFT JOIN t2 ON a1=0;
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE t1 system NULL NULL NULL NULL 1
-1 SIMPLE t2 ALL NULL NULL NULL NULL 2
+1 SIMPLE t2 ALL NULL NULL NULL NULL 2 Using where
SELECT * FROM t1 LEFT JOIN (t2,t3) ON a1=0;
a1 a2 a3
1 NULL NULL
EXPLAIN SELECT * FROM t1 LEFT JOIN (t2,t3) ON a1=0;
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE t1 system NULL NULL NULL NULL 1
-1 SIMPLE t2 ALL NULL NULL NULL NULL 2
+1 SIMPLE t2 ALL NULL NULL NULL NULL 2 Using where
1 SIMPLE t3 ALL NULL NULL NULL NULL 2
SELECT * FROM t0, t1 LEFT JOIN (t2,t3) ON a1=0 WHERE a0=a1;
a0 a1 a2 a3
@@ -875,7 +875,7 @@ EXPLAIN SELECT * FROM t0, t1 LEFT JOIN (
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE t0 system PRIMARY NULL NULL NULL 1
1 SIMPLE t1 system PRIMARY NULL NULL NULL 1
-1 SIMPLE t2 ALL NULL NULL NULL NULL 2
+1 SIMPLE t2 ALL NULL NULL NULL NULL 2 Using where
1 SIMPLE t3 ALL NULL NULL NULL NULL 2
INSERT INTO t0 VALUES (0);
INSERT INTO t1 VALUES (0);
@@ -886,7 +886,7 @@ EXPLAIN SELECT * FROM t0, t1 LEFT JOIN (
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE t0 const PRIMARY PRIMARY 4 const 1 Using index
1 SIMPLE t1 const PRIMARY PRIMARY 4 const 1 Using index
-1 SIMPLE t2 ALL NULL NULL NULL NULL 2
+1 SIMPLE t2 ALL NULL NULL NULL NULL 2 Using where
1 SIMPLE t3 ALL NULL NULL NULL NULL 2
drop table t1,t2;
create table t1 (a int, b int);
=== added file 'mysql-test/r/join_outer_jcl6.result'
--- a/mysql-test/r/join_outer_jcl6.result 1970-01-01 00:00:00 +0000
+++ b/mysql-test/r/join_outer_jcl6.result 2009-12-21 02:26:15 +0000
@@ -0,0 +1,1264 @@
+set join_cache_level=6;
+show variables like 'join_cache_level';
+Variable_name Value
+join_cache_level 6
+drop table if exists t0,t1,t2,t3,t4,t5;
+CREATE TABLE t1 (
+grp int(11) default NULL,
+a bigint(20) unsigned default NULL,
+c char(10) NOT NULL default ''
+) ENGINE=MyISAM;
+INSERT INTO t1 VALUES (1,1,'a'),(2,2,'b'),(2,3,'c'),(3,4,'E'),(3,5,'C'),(3,6,'D'),(NULL,NULL,'');
+create table t2 (id int, a bigint unsigned not null, c char(10), d int, primary key (a));
+insert into t2 values (1,1,"a",1),(3,4,"A",4),(3,5,"B",5),(3,6,"C",6),(4,7,"D",7);
+select t1.*,t2.* from t1 JOIN t2 where t1.a=t2.a;
+grp a c id a c d
+1 1 a 1 1 a 1
+3 4 E 3 4 A 4
+3 5 C 3 5 B 5
+3 6 D 3 6 C 6
+select t1.*,t2.* from t1 left join t2 on (t1.a=t2.a) order by t1.grp,t1.a,t2.c;
+grp a c id a c d
+NULL NULL NULL NULL NULL NULL
+1 1 a 1 1 a 1
+2 2 b NULL NULL NULL NULL
+2 3 c NULL NULL NULL NULL
+3 4 E 3 4 A 4
+3 5 C 3 5 B 5
+3 6 D 3 6 C 6
+select t1.*,t2.* from { oj t2 left outer join t1 on (t1.a=t2.a) };
+grp a c id a c d
+1 1 a 1 1 a 1
+3 4 E 3 4 A 4
+3 5 C 3 5 B 5
+3 6 D 3 6 C 6
+NULL NULL NULL 4 7 D 7
+select t1.*,t2.* from t1 as t0,{ oj t2 left outer join t1 on (t1.a=t2.a) } WHERE t0.a=2;
+grp a c id a c d
+1 1 a 1 1 a 1
+3 4 E 3 4 A 4
+3 5 C 3 5 B 5
+3 6 D 3 6 C 6
+NULL NULL NULL 4 7 D 7
+select t1.*,t2.* from t1 left join t2 using (a);
+grp a c id a c d
+1 1 a 1 1 a 1
+3 4 E 3 4 A 4
+3 5 C 3 5 B 5
+3 6 D 3 6 C 6
+2 2 b NULL NULL NULL NULL
+2 3 c NULL NULL NULL NULL
+NULL NULL NULL NULL NULL NULL
+select t1.*,t2.* from t1 left join t2 using (a) where t1.a=t2.a;
+grp a c id a c d
+1 1 a 1 1 a 1
+3 4 E 3 4 A 4
+3 5 C 3 5 B 5
+3 6 D 3 6 C 6
+select t1.*,t2.* from t1 left join t2 using (a,c);
+grp a c id a c d
+1 1 a 1 1 a 1
+2 2 b NULL NULL NULL NULL
+2 3 c NULL NULL NULL NULL
+3 4 E NULL NULL NULL NULL
+3 5 C NULL NULL NULL NULL
+3 6 D NULL NULL NULL NULL
+NULL NULL NULL NULL NULL NULL
+select t1.*,t2.* from t1 left join t2 using (c);
+grp a c id a c d
+1 1 a 1 1 a 1
+1 1 a 3 4 A 4
+2 2 b 3 5 B 5
+2 3 c 3 6 C 6
+3 5 C 3 6 C 6
+3 6 D 4 7 D 7
+3 4 E NULL NULL NULL NULL
+NULL NULL NULL NULL NULL NULL
+select t1.*,t2.* from t1 natural left outer join t2;
+grp a c id a c d
+1 1 a 1 1 a 1
+2 2 b NULL NULL NULL NULL
+2 3 c NULL NULL NULL NULL
+3 4 E NULL NULL NULL NULL
+3 5 C NULL NULL NULL NULL
+3 6 D NULL NULL NULL NULL
+NULL NULL NULL NULL NULL NULL
+select t1.*,t2.* from t1 left join t2 on (t1.a=t2.a) where t2.id=3;
+grp a c id a c d
+3 4 E 3 4 A 4
+3 5 C 3 5 B 5
+3 6 D 3 6 C 6
+select t1.*,t2.* from t1 left join t2 on (t1.a=t2.a) where t2.id is null;
+grp a c id a c d
+2 2 b NULL NULL NULL NULL
+2 3 c NULL NULL NULL NULL
+NULL NULL NULL NULL NULL NULL
+explain select t1.*,t2.* from t1,t2 where t1.a=t2.a and isnull(t2.a)=1;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE NULL NULL NULL NULL NULL NULL NULL Impossible WHERE
+explain select t1.*,t2.* from t1 left join t2 on t1.a=t2.a where isnull(t2.a)=1;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 7
+1 SIMPLE t2 eq_ref PRIMARY PRIMARY 8 test.t1.a 1 Using where; Using join buffer
+select t1.*,t2.*,t3.a from t1 left join t2 on (t1.a=t2.a) left join t1 as t3 on (t2.a=t3.a);
+grp a c id a c d a
+1 1 a 1 1 a 1 1
+3 4 E 3 4 A 4 4
+3 5 C 3 5 B 5 5
+3 6 D 3 6 C 6 6
+2 2 b NULL NULL NULL NULL NULL
+2 3 c NULL NULL NULL NULL NULL
+NULL NULL NULL NULL NULL NULL NULL
+explain select t1.*,t2.*,t3.a from t1 left join t2 on (t3.a=t2.a) left join t1 as t3 on (t1.a=t3.a);
+ERROR 42S22: Unknown column 't3.a' in 'on clause'
+select t1.*,t2.*,t3.a from t1 left join t2 on (t3.a=t2.a) left join t1 as t3 on (t1.a=t3.a);
+ERROR 42S22: Unknown column 't3.a' in 'on clause'
+select t1.*,t2.*,t3.a from t1 left join t2 on (t3.a=t2.a) left join t1 as t3 on (t2.a=t3.a);
+ERROR 42S22: Unknown column 't3.a' in 'on clause'
+select t1.*,t2.* from t1 inner join t2 using (a);
+grp a c id a c d
+1 1 a 1 1 a 1
+3 4 E 3 4 A 4
+3 5 C 3 5 B 5
+3 6 D 3 6 C 6
+select t1.*,t2.* from t1 inner join t2 on (t1.a=t2.a);
+grp a c id a c d
+1 1 a 1 1 a 1
+3 4 E 3 4 A 4
+3 5 C 3 5 B 5
+3 6 D 3 6 C 6
+select t1.*,t2.* from t1 natural join t2;
+grp a c id a c d
+1 1 a 1 1 a 1
+drop table t1,t2;
+CREATE TABLE t1 (
+usr_id INT unsigned NOT NULL,
+uniq_id INT unsigned NOT NULL AUTO_INCREMENT,
+start_num INT unsigned NOT NULL DEFAULT 1,
+increment INT unsigned NOT NULL DEFAULT 1,
+PRIMARY KEY (uniq_id),
+INDEX usr_uniq_idx (usr_id, uniq_id),
+INDEX uniq_usr_idx (uniq_id, usr_id)
+);
+CREATE TABLE t2 (
+id INT unsigned NOT NULL DEFAULT 0,
+usr2_id INT unsigned NOT NULL DEFAULT 0,
+max INT unsigned NOT NULL DEFAULT 0,
+c_amount INT unsigned NOT NULL DEFAULT 0,
+d_max INT unsigned NOT NULL DEFAULT 0,
+d_num INT unsigned NOT NULL DEFAULT 0,
+orig_time INT unsigned NOT NULL DEFAULT 0,
+c_time INT unsigned NOT NULL DEFAULT 0,
+active ENUM ("no","yes") NOT NULL,
+PRIMARY KEY (id,usr2_id),
+INDEX id_idx (id),
+INDEX usr2_idx (usr2_id)
+);
+INSERT INTO t1 VALUES (3,NULL,0,50),(3,NULL,0,200),(3,NULL,0,25),(3,NULL,0,84676),(3,NULL,0,235),(3,NULL,0,10),(3,NULL,0,3098),(3,NULL,0,2947),(3,NULL,0,8987),(3,NULL,0,8347654),(3,NULL,0,20398),(3,NULL,0,8976),(3,NULL,0,500),(3,NULL,0,198);
+SELECT t1.usr_id,t1.uniq_id,t1.increment,
+t2.usr2_id,t2.c_amount,t2.max
+FROM t1
+LEFT JOIN t2 ON t2.id = t1.uniq_id
+WHERE t1.uniq_id = 4
+ORDER BY t2.c_amount;
+usr_id uniq_id increment usr2_id c_amount max
+3 4 84676 NULL NULL NULL
+SELECT t1.usr_id,t1.uniq_id,t1.increment,
+t2.usr2_id,t2.c_amount,t2.max
+FROM t2
+RIGHT JOIN t1 ON t2.id = t1.uniq_id
+WHERE t1.uniq_id = 4
+ORDER BY t2.c_amount;
+usr_id uniq_id increment usr2_id c_amount max
+3 4 84676 NULL NULL NULL
+INSERT INTO t2 VALUES (2,3,3000,6000,0,0,746584,837484,'yes');
+INSERT INTO t2 VALUES (2,3,3000,6000,0,0,746584,837484,'yes');
+ERROR 23000: Duplicate entry '2-3' for key 'PRIMARY'
+INSERT INTO t2 VALUES (7,3,1000,2000,0,0,746294,937484,'yes');
+SELECT t1.usr_id,t1.uniq_id,t1.increment,t2.usr2_id,t2.c_amount,t2.max FROM t1 LEFT JOIN t2 ON t2.id = t1.uniq_id WHERE t1.uniq_id = 4 ORDER BY t2.c_amount;
+usr_id uniq_id increment usr2_id c_amount max
+3 4 84676 NULL NULL NULL
+SELECT t1.usr_id,t1.uniq_id,t1.increment,t2.usr2_id,t2.c_amount,t2.max FROM t1 LEFT JOIN t2 ON t2.id = t1.uniq_id WHERE t1.uniq_id = 4 GROUP BY t2.c_amount;
+usr_id uniq_id increment usr2_id c_amount max
+3 4 84676 NULL NULL NULL
+SELECT t1.usr_id,t1.uniq_id,t1.increment,t2.usr2_id,t2.c_amount,t2.max FROM t1 LEFT JOIN t2 ON t2.id = t1.uniq_id WHERE t1.uniq_id = 4;
+usr_id uniq_id increment usr2_id c_amount max
+3 4 84676 NULL NULL NULL
+drop table t1,t2;
+CREATE TABLE t1 (
+cod_asig int(11) DEFAULT '0' NOT NULL,
+desc_larga_cat varchar(80) DEFAULT '' NOT NULL,
+desc_larga_cas varchar(80) DEFAULT '' NOT NULL,
+desc_corta_cat varchar(40) DEFAULT '' NOT NULL,
+desc_corta_cas varchar(40) DEFAULT '' NOT NULL,
+cred_total double(3,1) DEFAULT '0.0' NOT NULL,
+pre_requisit int(11),
+co_requisit int(11),
+preco_requisit int(11),
+PRIMARY KEY (cod_asig)
+);
+INSERT INTO t1 VALUES (10360,'asdfggfg','Introduccion a los Ordenadores I','asdfggfg','Introduccio Ordinadors I',6.0,NULL,NULL,NULL);
+INSERT INTO t1 VALUES (10361,'Components i Circuits Electronics I','Componentes y Circuitos Electronicos I','Components i Circuits Electronics I','Comp. i Circ. Electr. I',6.0,NULL,NULL,NULL);
+INSERT INTO t1 VALUES (10362,'Laboratori d`Ordinadors','Laboratorio de Ordenadores','Laboratori d`Ordinadors','Laboratori Ordinadors',4.5,NULL,NULL,NULL);
+INSERT INTO t1 VALUES (10363,'Tecniques de Comunicacio Oral i Escrita','Tecnicas de Comunicacion Oral y Escrita','Tecniques de Comunicacio Oral i Escrita','Tec. Com. Oral i Escrita',4.5,NULL,NULL,NULL);
+INSERT INTO t1 VALUES (11403,'Projecte Fi de Carrera','Proyecto Fin de Carrera','Projecte Fi de Carrera','PFC',9.0,NULL,NULL,NULL);
+INSERT INTO t1 VALUES (11404,'+lgebra lineal','Algebra lineal','+lgebra lineal','+lgebra lineal',15.0,NULL,NULL,NULL);
+INSERT INTO t1 VALUES (11405,'+lgebra lineal','Algebra lineal','+lgebra lineal','+lgebra lineal',18.0,NULL,NULL,NULL);
+INSERT INTO t1 VALUES (11406,'Calcul Infinitesimal','Cßlculo Infinitesimal','Calcul Infinitesimal','Calcul Infinitesimal',15.0,NULL,NULL,NULL);
+CREATE TABLE t2 (
+idAssignatura int(11) DEFAULT '0' NOT NULL,
+Grup int(11) DEFAULT '0' NOT NULL,
+Places smallint(6) DEFAULT '0' NOT NULL,
+PlacesOcupades int(11) DEFAULT '0',
+PRIMARY KEY (idAssignatura,Grup)
+);
+INSERT INTO t2 VALUES (10360,12,333,0);
+INSERT INTO t2 VALUES (10361,30,2,0);
+INSERT INTO t2 VALUES (10361,40,3,0);
+INSERT INTO t2 VALUES (10360,45,10,0);
+INSERT INTO t2 VALUES (10362,10,12,0);
+INSERT INTO t2 VALUES (10360,55,2,0);
+INSERT INTO t2 VALUES (10360,70,0,0);
+INSERT INTO t2 VALUES (10360,565656,0,0);
+INSERT INTO t2 VALUES (10360,32767,7,0);
+INSERT INTO t2 VALUES (10360,33,8,0);
+INSERT INTO t2 VALUES (10360,7887,85,0);
+INSERT INTO t2 VALUES (11405,88,8,0);
+INSERT INTO t2 VALUES (10360,0,55,0);
+INSERT INTO t2 VALUES (10360,99,0,0);
+INSERT INTO t2 VALUES (11411,30,10,0);
+INSERT INTO t2 VALUES (11404,0,0,0);
+INSERT INTO t2 VALUES (10362,11,111,0);
+INSERT INTO t2 VALUES (10363,33,333,0);
+INSERT INTO t2 VALUES (11412,55,0,0);
+INSERT INTO t2 VALUES (50003,66,6,0);
+INSERT INTO t2 VALUES (11403,5,0,0);
+INSERT INTO t2 VALUES (11406,11,11,0);
+INSERT INTO t2 VALUES (11410,11410,131,0);
+INSERT INTO t2 VALUES (11416,11416,32767,0);
+INSERT INTO t2 VALUES (11409,0,0,0);
+CREATE TABLE t3 (
+id int(11) NOT NULL auto_increment,
+dni_pasaporte char(16) DEFAULT '' NOT NULL,
+idPla int(11) DEFAULT '0' NOT NULL,
+cod_asig int(11) DEFAULT '0' NOT NULL,
+any smallint(6) DEFAULT '0' NOT NULL,
+quatrimestre smallint(6) DEFAULT '0' NOT NULL,
+estat char(1) DEFAULT 'M' NOT NULL,
+PRIMARY KEY (id),
+UNIQUE dni_pasaporte (dni_pasaporte,idPla),
+UNIQUE dni_pasaporte_2 (dni_pasaporte,idPla,cod_asig,any,quatrimestre)
+);
+INSERT INTO t3 VALUES (1,'11111111',1,10362,98,1,'M');
+CREATE TABLE t4 (
+id int(11) NOT NULL auto_increment,
+papa int(11) DEFAULT '0' NOT NULL,
+fill int(11) DEFAULT '0' NOT NULL,
+idPla int(11) DEFAULT '0' NOT NULL,
+PRIMARY KEY (id),
+KEY papa (idPla,papa),
+UNIQUE papa_2 (idPla,papa,fill)
+);
+INSERT INTO t4 VALUES (1,-1,10360,1);
+INSERT INTO t4 VALUES (2,-1,10361,1);
+INSERT INTO t4 VALUES (3,-1,10362,1);
+SELECT DISTINCT fill,desc_larga_cat,cred_total,Grup,Places,PlacesOcupades FROM t4 LEFT JOIN t3 ON t3.cod_asig=fill AND estat='S' AND dni_pasaporte='11111111' AND t3.idPla=1 , t2,t1 WHERE fill=t1.cod_asig AND Places>PlacesOcupades AND fill=idAssignatura AND t4.idPla=1 AND papa=-1;
+fill desc_larga_cat cred_total Grup Places PlacesOcupades
+10360 asdfggfg 6.0 12 333 0
+10361 Components i Circuits Electronics I 6.0 30 2 0
+10361 Components i Circuits Electronics I 6.0 40 3 0
+10360 asdfggfg 6.0 45 10 0
+10362 Laboratori d`Ordinadors 4.5 10 12 0
+10360 asdfggfg 6.0 55 2 0
+10360 asdfggfg 6.0 32767 7 0
+10360 asdfggfg 6.0 33 8 0
+10360 asdfggfg 6.0 7887 85 0
+10360 asdfggfg 6.0 0 55 0
+10362 Laboratori d`Ordinadors 4.5 11 111 0
+SELECT DISTINCT fill,t3.idPla FROM t4 LEFT JOIN t3 ON t3.cod_asig=t4.fill AND t3.estat='S' AND t3.dni_pasaporte='1234' AND t3.idPla=1 ;
+fill idPla
+10360 NULL
+10361 NULL
+10362 NULL
+INSERT INTO t3 VALUES (3,'1234',1,10360,98,1,'S');
+SELECT DISTINCT fill,t3.idPla FROM t4 LEFT JOIN t3 ON t3.cod_asig=t4.fill AND t3.estat='S' AND t3.dni_pasaporte='1234' AND t3.idPla=1 ;
+fill idPla
+10360 1
+10361 NULL
+10362 NULL
+drop table t1,t2,t3,test.t4;
+CREATE TABLE t1 (
+id smallint(5) unsigned NOT NULL auto_increment,
+name char(60) DEFAULT '' NOT NULL,
+PRIMARY KEY (id)
+);
+INSERT INTO t1 VALUES (1,'Antonio Paz');
+INSERT INTO t1 VALUES (2,'Lilliana Angelovska');
+INSERT INTO t1 VALUES (3,'Thimble Smith');
+CREATE TABLE t2 (
+id smallint(5) unsigned NOT NULL auto_increment,
+owner smallint(5) unsigned DEFAULT '0' NOT NULL,
+name char(60),
+PRIMARY KEY (id)
+);
+INSERT INTO t2 VALUES (1,1,'El Gato');
+INSERT INTO t2 VALUES (2,1,'Perrito');
+INSERT INTO t2 VALUES (3,3,'Happy');
+select t1.name, t2.name, t2.id from t1 left join t2 on (t1.id = t2.owner);
+name name id
+Antonio Paz El Gato 1
+Antonio Paz Perrito 2
+Thimble Smith Happy 3
+Lilliana Angelovska NULL NULL
+select t1.name, t2.name, t2.id from t1 left join t2 on (t1.id = t2.owner) where t2.id is null;
+name name id
+Lilliana Angelovska NULL NULL
+explain select t1.name, t2.name, t2.id from t1 left join t2 on (t1.id = t2.owner) where t2.id is null;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3
+1 SIMPLE t2 ALL NULL NULL NULL NULL 3 Using where; Not exists; Using join buffer
+explain select t1.name, t2.name, t2.id from t1 left join t2 on (t1.id = t2.owner) where t2.name is null;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3
+1 SIMPLE t2 ALL NULL NULL NULL NULL 3 Using where; Using join buffer
+select count(*) from t1 left join t2 on (t1.id = t2.owner);
+count(*)
+4
+select t1.name, t2.name, t2.id from t2 right join t1 on (t1.id = t2.owner);
+name name id
+Antonio Paz El Gato 1
+Antonio Paz Perrito 2
+Thimble Smith Happy 3
+Lilliana Angelovska NULL NULL
+select t1.name, t2.name, t2.id from t2 right join t1 on (t1.id = t2.owner) where t2.id is null;
+name name id
+Lilliana Angelovska NULL NULL
+explain select t1.name, t2.name, t2.id from t2 right join t1 on (t1.id = t2.owner) where t2.id is null;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3
+1 SIMPLE t2 ALL NULL NULL NULL NULL 3 Using where; Not exists; Using join buffer
+explain select t1.name, t2.name, t2.id from t2 right join t1 on (t1.id = t2.owner) where t2.name is null;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3
+1 SIMPLE t2 ALL NULL NULL NULL NULL 3 Using where; Using join buffer
+select count(*) from t2 right join t1 on (t1.id = t2.owner);
+count(*)
+4
+select t1.name, t2.name, t2.id,t3.id from t2 right join t1 on (t1.id = t2.owner) left join t1 as t3 on t3.id=t2.owner;
+name name id id
+Antonio Paz El Gato 1 1
+Antonio Paz Perrito 2 1
+Thimble Smith Happy 3 3
+Lilliana Angelovska NULL NULL NULL
+select t1.name, t2.name, t2.id,t3.id from t1 right join t2 on (t1.id = t2.owner) right join t1 as t3 on t3.id=t2.owner;
+name name id id
+Antonio Paz El Gato 1 1
+Antonio Paz Perrito 2 1
+Thimble Smith Happy 3 3
+NULL NULL NULL 2
+select t1.name, t2.name, t2.id, t2.owner, t3.id from t1 left join t2 on (t1.id = t2.owner) right join t1 as t3 on t3.id=t2.owner;
+name name id owner id
+Antonio Paz El Gato 1 1 1
+Antonio Paz Perrito 2 1 1
+Thimble Smith Happy 3 3 3
+NULL NULL NULL NULL 2
+drop table t1,t2;
+create table t1 (id int not null, str char(10), index(str));
+insert into t1 values (1, null), (2, null), (3, "foo"), (4, "bar");
+select * from t1 where str is not null order by id;
+id str
+3 foo
+4 bar
+select * from t1 where str is null;
+id str
+1 NULL
+2 NULL
+drop table t1;
+CREATE TABLE t1 (
+t1_id bigint(21) NOT NULL auto_increment,
+PRIMARY KEY (t1_id)
+);
+CREATE TABLE t2 (
+t2_id bigint(21) NOT NULL auto_increment,
+PRIMARY KEY (t2_id)
+);
+CREATE TABLE t3 (
+t3_id bigint(21) NOT NULL auto_increment,
+PRIMARY KEY (t3_id)
+);
+CREATE TABLE t4 (
+seq_0_id bigint(21) DEFAULT '0' NOT NULL,
+seq_1_id bigint(21) DEFAULT '0' NOT NULL,
+KEY seq_0_id (seq_0_id),
+KEY seq_1_id (seq_1_id)
+);
+CREATE TABLE t5 (
+seq_0_id bigint(21) DEFAULT '0' NOT NULL,
+seq_1_id bigint(21) DEFAULT '0' NOT NULL,
+KEY seq_1_id (seq_1_id),
+KEY seq_0_id (seq_0_id)
+);
+insert into t1 values (1);
+insert into t2 values (1);
+insert into t3 values (1);
+insert into t4 values (1,1);
+insert into t5 values (1,1);
+explain select * from t3 left join t4 on t4.seq_1_id = t2.t2_id left join t1 on t1.t1_id = t4.seq_0_id left join t5 on t5.seq_0_id = t1.t1_id left join t2 on t2.t2_id = t5.seq_1_id where t3.t3_id = 23;
+ERROR 42S22: Unknown column 't2.t2_id' in 'on clause'
+drop table t1,t2,t3,t4,t5;
+create table t1 (n int, m int, o int, key(n));
+create table t2 (n int not null, m int, o int, primary key(n));
+insert into t1 values (1, 2, 11), (1, 2, 7), (2, 2, 8), (1,2,9),(1,3,9);
+insert into t2 values (1, 2, 3),(2, 2, 8), (4,3,9),(3,2,10);
+select t1.*, t2.* from t1 left join t2 on t1.n = t2.n and
+t1.m = t2.m where t1.n = 1;
+n m o n m o
+1 2 11 1 2 3
+1 2 7 1 2 3
+1 2 9 1 2 3
+1 3 9 NULL NULL NULL
+select t1.*, t2.* from t1 left join t2 on t1.n = t2.n and
+t1.m = t2.m where t1.n = 1 order by t1.o;
+n m o n m o
+1 2 7 1 2 3
+1 2 9 1 2 3
+1 3 9 NULL NULL NULL
+1 2 11 1 2 3
+drop table t1,t2;
+CREATE TABLE t1 (id1 INT NOT NULL PRIMARY KEY, dat1 CHAR(1), id2 INT);
+INSERT INTO t1 VALUES (1,'a',1);
+INSERT INTO t1 VALUES (2,'b',1);
+INSERT INTO t1 VALUES (3,'c',2);
+CREATE TABLE t2 (id2 INT NOT NULL PRIMARY KEY, dat2 CHAR(1));
+INSERT INTO t2 VALUES (1,'x');
+INSERT INTO t2 VALUES (2,'y');
+INSERT INTO t2 VALUES (3,'z');
+SELECT t2.id2 FROM t2 LEFT OUTER JOIN t1 ON t1.id2 = t2.id2 WHERE id1 IS NULL;
+id2
+3
+SELECT t2.id2 FROM t2 NATURAL LEFT OUTER JOIN t1 WHERE id1 IS NULL;
+id2
+3
+drop table t1,t2;
+create table t1 ( color varchar(20), name varchar(20) );
+insert into t1 values ( 'red', 'apple' );
+insert into t1 values ( 'yellow', 'banana' );
+insert into t1 values ( 'green', 'lime' );
+insert into t1 values ( 'black', 'grape' );
+insert into t1 values ( 'blue', 'blueberry' );
+create table t2 ( count int, color varchar(20) );
+insert into t2 values (10, 'green');
+insert into t2 values (5, 'black');
+insert into t2 values (15, 'white');
+insert into t2 values (7, 'green');
+select * from t1;
+color name
+red apple
+yellow banana
+green lime
+black grape
+blue blueberry
+select * from t2;
+count color
+10 green
+5 black
+15 white
+7 green
+select * from t2 natural join t1;
+color count name
+green 10 lime
+green 7 lime
+black 5 grape
+select t2.count, t1.name from t2 natural join t1;
+count name
+10 lime
+7 lime
+5 grape
+select t2.count, t1.name from t2 inner join t1 using (color);
+count name
+10 lime
+7 lime
+5 grape
+drop table t1;
+drop table t2;
+CREATE TABLE t1 (
+pcode varchar(8) DEFAULT '' NOT NULL
+);
+INSERT INTO t1 VALUES ('kvw2000'),('kvw2001'),('kvw3000'),('kvw3001'),('kvw3002'),('kvw3500'),('kvw3501'),('kvw3502'),('kvw3800'),('kvw3801'),('kvw3802'),('kvw3900'),('kvw3901'),('kvw3902'),('kvw4000'),('kvw4001'),('kvw4002'),('kvw4200'),('kvw4500'),('kvw5000'),('kvw5001'),('kvw5500'),('kvw5510'),('kvw5600'),('kvw5601'),('kvw6000'),('klw1000'),('klw1020'),('klw1500'),('klw2000'),('klw2001'),('klw2002'),('kld2000'),('klw2500'),('kmw1000'),('kmw1500'),('kmw2000'),('kmw2001'),('kmw2100'),('kmw3000'),('kmw3200');
+CREATE TABLE t2 (
+pcode varchar(8) DEFAULT '' NOT NULL,
+KEY pcode (pcode)
+);
+INSERT INTO t2 VALUES ('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw2000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3000'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw3500'),('kvw6000'),('kvw6000'),('kld2000');
+SELECT t1.pcode, IF(ISNULL(t2.pcode), 0, COUNT(*)) AS count FROM t1
+LEFT JOIN t2 ON t1.pcode = t2.pcode GROUP BY t1.pcode;
+pcode count
+kld2000 1
+klw1000 0
+klw1020 0
+klw1500 0
+klw2000 0
+klw2001 0
+klw2002 0
+klw2500 0
+kmw1000 0
+kmw1500 0
+kmw2000 0
+kmw2001 0
+kmw2100 0
+kmw3000 0
+kmw3200 0
+kvw2000 26
+kvw2001 0
+kvw3000 36
+kvw3001 0
+kvw3002 0
+kvw3500 26
+kvw3501 0
+kvw3502 0
+kvw3800 0
+kvw3801 0
+kvw3802 0
+kvw3900 0
+kvw3901 0
+kvw3902 0
+kvw4000 0
+kvw4001 0
+kvw4002 0
+kvw4200 0
+kvw4500 0
+kvw5000 0
+kvw5001 0
+kvw5500 0
+kvw5510 0
+kvw5600 0
+kvw5601 0
+kvw6000 2
+SELECT SQL_BIG_RESULT t1.pcode, IF(ISNULL(t2.pcode), 0, COUNT(*)) AS count FROM t1 LEFT JOIN t2 ON t1.pcode = t2.pcode GROUP BY t1.pcode;
+pcode count
+kld2000 1
+klw1000 0
+klw1020 0
+klw1500 0
+klw2000 0
+klw2001 0
+klw2002 0
+klw2500 0
+kmw1000 0
+kmw1500 0
+kmw2000 0
+kmw2001 0
+kmw2100 0
+kmw3000 0
+kmw3200 0
+kvw2000 26
+kvw2001 0
+kvw3000 36
+kvw3001 0
+kvw3002 0
+kvw3500 26
+kvw3501 0
+kvw3502 0
+kvw3800 0
+kvw3801 0
+kvw3802 0
+kvw3900 0
+kvw3901 0
+kvw3902 0
+kvw4000 0
+kvw4001 0
+kvw4002 0
+kvw4200 0
+kvw4500 0
+kvw5000 0
+kvw5001 0
+kvw5500 0
+kvw5510 0
+kvw5600 0
+kvw5601 0
+kvw6000 2
+drop table t1,t2;
+CREATE TABLE t1 (
+id int(11),
+pid int(11),
+rep_del tinyint(4),
+KEY id (id),
+KEY pid (pid)
+);
+INSERT INTO t1 VALUES (1,NULL,NULL);
+INSERT INTO t1 VALUES (2,1,NULL);
+select * from t1 LEFT JOIN t1 t2 ON (t1.id=t2.pid) AND t2.rep_del IS NULL;
+id pid rep_del id pid rep_del
+1 NULL NULL 2 1 NULL
+2 1 NULL NULL NULL NULL
+create index rep_del ON t1(rep_del);
+select * from t1 LEFT JOIN t1 t2 ON (t1.id=t2.pid) AND t2.rep_del IS NULL;
+id pid rep_del id pid rep_del
+1 NULL NULL 2 1 NULL
+2 1 NULL NULL NULL NULL
+drop table t1;
+CREATE TABLE t1 (
+id int(11) DEFAULT '0' NOT NULL,
+name tinytext DEFAULT '' NOT NULL,
+UNIQUE id (id)
+);
+Warnings:
+Warning 1101 BLOB/TEXT column 'name' can't have a default value
+INSERT INTO t1 VALUES (1,'yes'),(2,'no');
+CREATE TABLE t2 (
+id int(11) DEFAULT '0' NOT NULL,
+idx int(11) DEFAULT '0' NOT NULL,
+UNIQUE id (id,idx)
+);
+INSERT INTO t2 VALUES (1,1);
+explain SELECT * from t1 left join t2 on t1.id=t2.id where t2.id IS NULL;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 2
+1 SIMPLE t2 ref id id 4 test.t1.id 1 Using where; Using index; Not exists
+SELECT * from t1 left join t2 on t1.id=t2.id where t2.id IS NULL;
+id name id idx
+2 no NULL NULL
+drop table t1,t2;
+create table t1 (bug_id mediumint, reporter mediumint);
+create table t2 (bug_id mediumint, who mediumint, index(who));
+insert into t2 values (1,1),(1,2);
+insert into t1 values (1,1),(2,1);
+SELECT * FROM t1 LEFT JOIN t2 ON (t1.bug_id = t2.bug_id AND t2.who = 2) WHERE (t1.reporter = 2 OR t2.who = 2);
+bug_id reporter bug_id who
+1 1 1 2
+drop table t1,t2;
+create table t1 (fooID smallint unsigned auto_increment, primary key (fooID));
+create table t2 (fooID smallint unsigned not null, barID smallint unsigned not null, primary key (fooID,barID));
+insert into t1 (fooID) values (10),(20),(30);
+insert into t2 values (10,1),(20,2),(30,3);
+explain select * from t2 left join t1 on t1.fooID = t2.fooID and t1.fooID = 30;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 index NULL PRIMARY 4 NULL 3 Using index
+1 SIMPLE t1 const PRIMARY PRIMARY 2 const 1 Using where; Using index
+select * from t2 left join t1 on t1.fooID = t2.fooID and t1.fooID = 30;
+fooID barID fooID
+10 1 NULL
+20 2 NULL
+30 3 30
+select * from t2 left join t1 ignore index(primary) on t1.fooID = t2.fooID and t1.fooID = 30;
+fooID barID fooID
+30 3 30
+10 1 NULL
+20 2 NULL
+drop table t1,t2;
+create table t1 (i int);
+create table t2 (i int);
+create table t3 (i int);
+insert into t1 values(1),(2);
+insert into t2 values(2),(3);
+insert into t3 values(2),(4);
+select * from t1 natural left join t2 natural left join t3;
+i
+2
+1
+select * from t1 natural left join t2 where (t2.i is not null)=0;
+i
+1
+select * from t1 natural left join t2 where (t2.i is not null) is not null;
+i
+2
+1
+select * from t1 natural left join t2 where (i is not null)=0;
+i
+select * from t1 natural left join t2 where (i is not null) is not null;
+i
+2
+1
+drop table t1,t2,t3;
+create table t1 (f1 integer,f2 integer,f3 integer);
+create table t2 (f2 integer,f4 integer);
+create table t3 (f3 integer,f5 integer);
+select * from t1
+left outer join t2 using (f2)
+left outer join t3 using (f3);
+f3 f2 f1 f4 f5
+drop table t1,t2,t3;
+create table t1 (a1 int, a2 int);
+create table t2 (b1 int not null, b2 int);
+create table t3 (c1 int, c2 int);
+insert into t1 values (1,2), (2,2), (3,2);
+insert into t2 values (1,3), (2,3);
+insert into t3 values (2,4), (3,4);
+select * from t1 left join t2 on b1 = a1 left join t3 on c1 = a1 and b1 is null;
+a1 a2 b1 b2 c1 c2
+1 2 1 3 NULL NULL
+2 2 2 3 NULL NULL
+3 2 NULL NULL 3 4
+explain select * from t1 left join t2 on b1 = a1 left join t3 on c1 = a1 and b1 is null;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3
+1 SIMPLE t2 ALL NULL NULL NULL NULL 2 Using where; Using join buffer
+1 SIMPLE t3 ALL NULL NULL NULL NULL 2 Using where; Using join buffer
+drop table t1, t2, t3;
+create table t1 (
+a int(11),
+b char(10),
+key (a)
+);
+insert into t1 (a) values (1),(2),(3),(4);
+create table t2 (a int);
+select * from t1 left join t2 on t1.a=t2.a where not (t2.a <=> t1.a);
+a b a
+1 NULL NULL
+2 NULL NULL
+3 NULL NULL
+4 NULL NULL
+select * from t1 left join t2 on t1.a=t2.a having not (t2.a <=> t1.a);
+a b a
+1 NULL NULL
+2 NULL NULL
+3 NULL NULL
+4 NULL NULL
+drop table t1,t2;
+create table t1 (
+match_id tinyint(3) unsigned not null auto_increment,
+home tinyint(3) unsigned default '0',
+unique key match_id (match_id),
+key match_id_2 (match_id)
+);
+insert into t1 values("1", "2");
+create table t2 (
+player_id tinyint(3) unsigned default '0',
+match_1_h tinyint(3) unsigned default '0',
+key player_id (player_id)
+);
+insert into t2 values("1", "5");
+insert into t2 values("2", "9");
+insert into t2 values("3", "3");
+insert into t2 values("4", "7");
+insert into t2 values("5", "6");
+insert into t2 values("6", "8");
+insert into t2 values("7", "4");
+insert into t2 values("8", "12");
+insert into t2 values("9", "11");
+insert into t2 values("10", "10");
+explain select s.*, '*', m.*, (s.match_1_h - m.home) UUX from
+(t2 s left join t1 m on m.match_id = 1)
+order by m.match_id desc;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE s ALL NULL NULL NULL NULL 10 Using temporary; Using filesort
+1 SIMPLE m const match_id,match_id_2 match_id 1 const 1 Using join buffer
+explain select s.*, '*', m.*, (s.match_1_h - m.home) UUX from
+(t2 s left join t1 m on m.match_id = 1)
+order by UUX desc;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE s ALL NULL NULL NULL NULL 10 Using temporary; Using filesort
+1 SIMPLE m const match_id,match_id_2 match_id 1 const 1 Using join buffer
+select s.*, '*', m.*, (s.match_1_h - m.home) UUX from
+(t2 s left join t1 m on m.match_id = 1)
+order by UUX desc;
+player_id match_1_h * match_id home UUX
+8 12 * 1 2 10
+9 11 * 1 2 9
+10 10 * 1 2 8
+2 9 * 1 2 7
+6 8 * 1 2 6
+4 7 * 1 2 5
+5 6 * 1 2 4
+1 5 * 1 2 3
+7 4 * 1 2 2
+3 3 * 1 2 1
+explain select s.*, '*', m.*, (s.match_1_h - m.home) UUX from
+t2 s straight_join t1 m where m.match_id = 1
+order by UUX desc;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE s ALL NULL NULL NULL NULL 10 Using temporary; Using filesort
+1 SIMPLE m const match_id,match_id_2 match_id 1 const 1 Using join buffer
+select s.*, '*', m.*, (s.match_1_h - m.home) UUX from
+t2 s straight_join t1 m where m.match_id = 1
+order by UUX desc;
+player_id match_1_h * match_id home UUX
+8 12 * 1 2 10
+9 11 * 1 2 9
+10 10 * 1 2 8
+2 9 * 1 2 7
+6 8 * 1 2 6
+4 7 * 1 2 5
+5 6 * 1 2 4
+1 5 * 1 2 3
+7 4 * 1 2 2
+3 3 * 1 2 1
+drop table t1, t2;
+create table t1 (a int, b int, unique index idx (a, b));
+create table t2 (a int, b int, c int, unique index idx (a, b));
+insert into t1 values (1, 10), (1,11), (2,10), (2,11);
+insert into t2 values (1,10,3);
+select t1.a, t1.b, t2.c from t1 left join t2
+on t1.a=t2.a and t1.b=t2.b and t2.c=3
+where t1.a=1 and t2.c is null;
+a b c
+1 11 NULL
+drop table t1, t2;
+CREATE TABLE t1 (
+ts_id bigint(20) default NULL,
+inst_id tinyint(4) default NULL,
+flag_name varchar(64) default NULL,
+flag_value text,
+UNIQUE KEY ts_id (ts_id,inst_id,flag_name)
+) ENGINE=MyISAM DEFAULT CHARSET=utf8;
+CREATE TABLE t2 (
+ts_id bigint(20) default NULL,
+inst_id tinyint(4) default NULL,
+flag_name varchar(64) default NULL,
+flag_value text,
+UNIQUE KEY ts_id (ts_id,inst_id,flag_name)
+) ENGINE=MyISAM DEFAULT CHARSET=utf8;
+INSERT INTO t1 VALUES
+(111056548820001, 0, 'flag1', NULL),
+(111056548820001, 0, 'flag2', NULL),
+(2, 0, 'other_flag', NULL);
+INSERT INTO t2 VALUES
+(111056548820001, 3, 'flag1', 'sss');
+SELECT t1.flag_name,t2.flag_value
+FROM t1 LEFT JOIN t2
+ON (t1.ts_id = t2.ts_id AND t1.flag_name = t2.flag_name AND
+t2.inst_id = 3)
+WHERE t1.inst_id = 0 AND t1.ts_id=111056548820001 AND
+t2.flag_value IS NULL;
+flag_name flag_value
+flag2 NULL
+DROP TABLE t1,t2;
+CREATE TABLE t1 (
+id int(11) unsigned NOT NULL auto_increment,
+text_id int(10) unsigned default NULL,
+PRIMARY KEY (id)
+);
+INSERT INTO t1 VALUES("1", "0");
+INSERT INTO t1 VALUES("2", "10");
+CREATE TABLE t2 (
+text_id char(3) NOT NULL default '',
+language_id char(3) NOT NULL default '',
+text_data text,
+PRIMARY KEY (text_id,language_id)
+);
+INSERT INTO t2 VALUES("0", "EN", "0-EN");
+INSERT INTO t2 VALUES("0", "SV", "0-SV");
+INSERT INTO t2 VALUES("10", "EN", "10-EN");
+INSERT INTO t2 VALUES("10", "SV", "10-SV");
+SELECT t1.id, t1.text_id, t2.text_data
+FROM t1 LEFT JOIN t2
+ON t1.text_id = t2.text_id
+AND t2.language_id = 'SV'
+ WHERE (t1.id LIKE '%' OR t2.text_data LIKE '%');
+id text_id text_data
+1 0 0-SV
+2 10 10-SV
+DROP TABLE t1, t2;
+CREATE TABLE t0 (a0 int PRIMARY KEY);
+CREATE TABLE t1 (a1 int PRIMARY KEY);
+CREATE TABLE t2 (a2 int);
+CREATE TABLE t3 (a3 int);
+INSERT INTO t0 VALUES (1);
+INSERT INTO t1 VALUES (1);
+INSERT INTO t2 VALUES (1), (2);
+INSERT INTO t3 VALUES (1), (2);
+SELECT * FROM t1 LEFT JOIN t2 ON a1=0;
+a1 a2
+1 NULL
+EXPLAIN SELECT * FROM t1 LEFT JOIN t2 ON a1=0;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 system NULL NULL NULL NULL 1
+1 SIMPLE t2 ALL NULL NULL NULL NULL 2 Using where
+SELECT * FROM t1 LEFT JOIN (t2,t3) ON a1=0;
+a1 a2 a3
+1 NULL NULL
+EXPLAIN SELECT * FROM t1 LEFT JOIN (t2,t3) ON a1=0;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 system NULL NULL NULL NULL 1
+1 SIMPLE t2 ALL NULL NULL NULL NULL 2 Using where
+1 SIMPLE t3 ALL NULL NULL NULL NULL 2
+SELECT * FROM t0, t1 LEFT JOIN (t2,t3) ON a1=0 WHERE a0=a1;
+a0 a1 a2 a3
+1 1 NULL NULL
+EXPLAIN SELECT * FROM t0, t1 LEFT JOIN (t2,t3) ON a1=0 WHERE a0=a1;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t0 system PRIMARY NULL NULL NULL 1
+1 SIMPLE t1 system PRIMARY NULL NULL NULL 1
+1 SIMPLE t2 ALL NULL NULL NULL NULL 2 Using where
+1 SIMPLE t3 ALL NULL NULL NULL NULL 2
+INSERT INTO t0 VALUES (0);
+INSERT INTO t1 VALUES (0);
+SELECT * FROM t0, t1 LEFT JOIN (t2,t3) ON a1=5 WHERE a0=a1 AND a0=1;
+a0 a1 a2 a3
+1 1 NULL NULL
+EXPLAIN SELECT * FROM t0, t1 LEFT JOIN (t2,t3) ON a1=5 WHERE a0=a1 AND a0=1;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t0 const PRIMARY PRIMARY 4 const 1 Using index
+1 SIMPLE t1 const PRIMARY PRIMARY 4 const 1 Using index
+1 SIMPLE t2 ALL NULL NULL NULL NULL 2 Using where
+1 SIMPLE t3 ALL NULL NULL NULL NULL 2
+drop table t1,t2;
+create table t1 (a int, b int);
+insert into t1 values (1,1),(2,2),(3,3);
+create table t2 (a int, b int);
+insert into t2 values (1,1), (2,2);
+select * from t2 right join t1 on t2.a=t1.a;
+a b a b
+1 1 1 1
+2 2 2 2
+NULL NULL 3 3
+select straight_join * from t2 right join t1 on t2.a=t1.a;
+a b a b
+1 1 1 1
+2 2 2 2
+NULL NULL 3 3
+DROP TABLE t0,t1,t2,t3;
+CREATE TABLE t1 (a int PRIMARY KEY, b int);
+CREATE TABLE t2 (a int PRIMARY KEY, b int);
+INSERT INTO t1 VALUES (1,1), (2,1), (3,1), (4,2);
+INSERT INTO t2 VALUES (1,2), (2,2);
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a=t2.a;
+a b a b
+1 1 1 2
+2 1 2 2
+3 1 NULL NULL
+4 2 NULL NULL
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a=t2.a WHERE t1.b=1;
+a b a b
+1 1 1 2
+2 1 2 2
+3 1 NULL NULL
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a=t2.a
+WHERE t1.b=1 XOR (NOT ISNULL(t2.a) AND t2.b=1);
+a b a b
+1 1 1 2
+2 1 2 2
+3 1 NULL NULL
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a=t2.a WHERE not(0+(t1.a=30 and t2.b=1));
+a b a b
+1 1 1 2
+2 1 2 2
+3 1 NULL NULL
+4 2 NULL NULL
+DROP TABLE t1,t2;
+set group_concat_max_len=5;
+create table t1 (a int, b varchar(20));
+create table t2 (a int, c varchar(20));
+insert into t1 values (1,"aaaaaaaaaa"),(2,"bbbbbbbbbb");
+insert into t2 values (1,"cccccccccc"),(2,"dddddddddd");
+select group_concat(t1.b,t2.c) from t1 left join t2 using(a) group by t1.a;
+group_concat(t1.b,t2.c)
+aaaaa
+bbbbb
+Warnings:
+Warning 1260 2 line(s) were cut by GROUP_CONCAT()
+select group_concat(t1.b,t2.c) from t1 inner join t2 using(a) group by t1.a;
+group_concat(t1.b,t2.c)
+aaaaa
+bbbbb
+Warnings:
+Warning 1260 2 line(s) were cut by GROUP_CONCAT()
+select group_concat(t1.b,t2.c) from t1 left join t2 using(a) group by a;
+group_concat(t1.b,t2.c)
+aaaaa
+bbbbb
+Warnings:
+Warning 1260 2 line(s) were cut by GROUP_CONCAT()
+select group_concat(t1.b,t2.c) from t1 inner join t2 using(a) group by a;
+group_concat(t1.b,t2.c)
+aaaaa
+bbbbb
+Warnings:
+Warning 1260 2 line(s) were cut by GROUP_CONCAT()
+drop table t1, t2;
+set group_concat_max_len=default;
+create table t1 (gid smallint(5) unsigned not null, x int(11) not null, y int(11) not null, art int(11) not null, primary key (gid,x,y));
+insert t1 values (1, -5, -8, 2), (1, 2, 2, 1), (1, 1, 1, 1);
+create table t2 (gid smallint(5) unsigned not null, x int(11) not null, y int(11) not null, id int(11) not null, primary key (gid,id,x,y), key id (id));
+insert t2 values (1, -5, -8, 1), (1, 1, 1, 1), (1, 2, 2, 1);
+create table t3 ( set_id smallint(5) unsigned not null, id tinyint(4) unsigned not null, name char(12) not null, primary key (id,set_id));
+insert t3 values (0, 1, 'a'), (1, 1, 'b'), (0, 2, 'c'), (1, 2, 'd'), (1, 3, 'e'), (1, 4, 'f'), (1, 5, 'g'), (1, 6, 'h');
+explain select name from t1 left join t2 on t1.x = t2.x and t1.y = t2.y
+left join t3 on t1.art = t3.id where t2.id =1 and t2.x = -5 and t2.y =-8
+and t1.gid =1 and t2.gid =1 and t3.set_id =1;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 const PRIMARY PRIMARY 10 const,const,const 1
+1 SIMPLE t2 const PRIMARY,id PRIMARY 14 const,const,const,const 1 Using index
+1 SIMPLE t3 const PRIMARY PRIMARY 3 const,const 1
+drop tables t1,t2,t3;
+CREATE TABLE t1 (EMPNUM INT, GRP INT);
+INSERT INTO t1 VALUES (0, 10);
+INSERT INTO t1 VALUES (2, 30);
+CREATE TABLE t2 (EMPNUM INT, NAME CHAR(5));
+INSERT INTO t2 VALUES (0, 'KERI');
+INSERT INTO t2 VALUES (9, 'BARRY');
+CREATE VIEW v1 AS
+SELECT COALESCE(t2.EMPNUM,t1.EMPNUM) AS EMPNUM, NAME, GRP
+FROM t2 LEFT OUTER JOIN t1 ON t2.EMPNUM=t1.EMPNUM;
+SELECT * FROM v1;
+EMPNUM NAME GRP
+0 KERI 10
+9 BARRY NULL
+SELECT * FROM v1 WHERE EMPNUM < 10;
+EMPNUM NAME GRP
+0 KERI 10
+9 BARRY NULL
+DROP VIEW v1;
+DROP TABLE t1,t2;
+CREATE TABLE t1 (c11 int);
+CREATE TABLE t2 (c21 int);
+INSERT INTO t1 VALUES (30), (40), (50);
+INSERT INTO t2 VALUES (300), (400), (500);
+SELECT * FROM t1 LEFT JOIN t2 ON (c11=c21 AND c21=30) WHERE c11=40;
+c11 c21
+40 NULL
+DROP TABLE t1, t2;
+CREATE TABLE t1 (a int PRIMARY KEY, b int);
+CREATE TABLE t2 (a int PRIMARY KEY, b int);
+INSERT INTO t1 VALUES (1,2), (2,1), (3,2), (4,3), (5,6), (6,5), (7,8), (8,7), (9,10);
+INSERT INTO t2 VALUES (3,0), (4,1), (6,4), (7,5);
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE t2.b <= t1.a AND t1.a <= t1.b;
+a b a b
+7 8 7 5
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE t1.a BETWEEN t2.b AND t1.b;
+a b a b
+7 8 7 5
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE NOT(t1.a NOT BETWEEN t2.b AND t1.b);
+a b a b
+7 8 7 5
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE t2.b > t1.a OR t1.a > t1.b;
+a b a b
+3 2 3 0
+4 3 4 1
+6 5 6 4
+2 1 NULL NULL
+8 7 NULL NULL
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE t1.a NOT BETWEEN t2.b AND t1.b;
+a b a b
+3 2 3 0
+4 3 4 1
+6 5 6 4
+2 1 NULL NULL
+8 7 NULL NULL
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE NOT(t1.a BETWEEN t2.b AND t1.b);
+a b a b
+3 2 3 0
+4 3 4 1
+6 5 6 4
+2 1 NULL NULL
+8 7 NULL NULL
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE t1.a = t2.a OR t2.b > t1.a OR t1.a > t1.b;
+a b a b
+3 2 3 0
+4 3 4 1
+6 5 6 4
+7 8 7 5
+2 1 NULL NULL
+8 7 NULL NULL
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE NOT(t1.a != t2.a AND t1.a BETWEEN t2.b AND t1.b);
+a b a b
+3 2 3 0
+4 3 4 1
+6 5 6 4
+7 8 7 5
+2 1 NULL NULL
+8 7 NULL NULL
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE t1.a = t2.a AND (t2.b > t1.a OR t1.a > t1.b);
+a b a b
+3 2 3 0
+4 3 4 1
+6 5 6 4
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE NOT(t1.a != t2.a OR t1.a BETWEEN t2.b AND t1.b);
+a b a b
+3 2 3 0
+4 3 4 1
+6 5 6 4
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE t1.a = t2.a OR t1.a = t2.b;
+a b a b
+3 2 3 0
+4 3 4 1
+6 5 6 4
+7 8 7 5
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE t1.a IN(t2.a, t2.b);
+a b a b
+3 2 3 0
+4 3 4 1
+6 5 6 4
+7 8 7 5
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE NOT(t1.a NOT IN(t2.a, t2.b));
+a b a b
+3 2 3 0
+4 3 4 1
+6 5 6 4
+7 8 7 5
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE t1.a != t1.b AND t1.a != t2.b;
+a b a b
+3 2 3 0
+4 3 4 1
+6 5 6 4
+7 8 7 5
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE t1.a NOT IN(t1.b, t2.b);
+a b a b
+3 2 3 0
+4 3 4 1
+6 5 6 4
+7 8 7 5
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE NOT(t1.a IN(t1.b, t2.b));
+a b a b
+3 2 3 0
+4 3 4 1
+6 5 6 4
+7 8 7 5
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE t2.a != t2.b OR (t1.a != t2.a AND t1.a != t2.b);
+a b a b
+3 2 3 0
+4 3 4 1
+6 5 6 4
+7 8 7 5
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE NOT(t2.a = t2.b AND t1.a IN(t2.a, t2.b));
+a b a b
+3 2 3 0
+4 3 4 1
+6 5 6 4
+7 8 7 5
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE t2.a != t2.b AND t1.a != t1.b AND t1.a != t2.b;
+a b a b
+3 2 3 0
+4 3 4 1
+6 5 6 4
+7 8 7 5
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE NOT(t2.a = t2.b OR t1.a IN(t1.b, t2.b));
+a b a b
+3 2 3 0
+4 3 4 1
+6 5 6 4
+7 8 7 5
+EXPLAIN SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE t1.a = t2.a OR t1.a = t2.b;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ALL PRIMARY NULL NULL NULL 4
+1 SIMPLE t1 eq_ref PRIMARY PRIMARY 4 test.t2.a 1 Using join buffer
+EXPLAIN SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE t1.a IN(t2.a, t2.b);
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ALL PRIMARY NULL NULL NULL 4 Using where
+1 SIMPLE t1 eq_ref PRIMARY PRIMARY 4 test.t2.a 1 Using join buffer
+EXPLAIN SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.a WHERE t1.a > IF(t1.a = t2.b-2, t2.b, t2.b-1);
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ALL PRIMARY NULL NULL NULL 4 Using where
+1 SIMPLE t1 eq_ref PRIMARY PRIMARY 4 test.t2.a 1 Using join buffer
+DROP TABLE t1,t2;
+DROP VIEW IF EXISTS v1,v2;
+DROP TABLE IF EXISTS t1,t2;
+CREATE TABLE t1 (a int);
+CREATE table t2 (b int);
+INSERT INTO t1 VALUES (1), (2), (3), (4), (1), (1), (3);
+INSERT INTO t2 VALUES (2), (3);
+CREATE VIEW v1 AS SELECT a FROM t1 JOIN t2 ON t1.a=t2.b;
+CREATE VIEW v2 AS SELECT b FROM t2 JOIN t1 ON t2.b=t1.a;
+SELECT v1.a, v2. b
+FROM v1 LEFT OUTER JOIN v2 ON (v1.a=v2.b) AND (v1.a >= 3)
+GROUP BY v1.a;
+a b
+2 NULL
+3 3
+SELECT v1.a, v2. b
+FROM { OJ v1 LEFT OUTER JOIN v2 ON (v1.a=v2.b) AND (v1.a >= 3) }
+GROUP BY v1.a;
+a b
+2 NULL
+3 3
+DROP VIEW v1,v2;
+DROP TABLE t1,t2;
+CREATE TABLE t1 (a int);
+CREATE TABLE t2 (b int);
+INSERT INTO t1 VALUES (1), (2), (3), (4);
+INSERT INTO t2 VALUES (2), (3);
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.b WHERE (1=1);
+a b
+2 2
+3 3
+1 NULL
+4 NULL
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.b WHERE (1 OR 1);
+a b
+2 2
+3 3
+1 NULL
+4 NULL
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.b WHERE (0 OR 1);
+a b
+2 2
+3 3
+1 NULL
+4 NULL
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.b WHERE (1=1 OR 2=2);
+a b
+2 2
+3 3
+1 NULL
+4 NULL
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.b WHERE (1=1 OR 1=0);
+a b
+2 2
+3 3
+1 NULL
+4 NULL
+DROP TABLE t1,t2;
+CREATE TABLE t1 (
+f1 varchar(16) collate latin1_swedish_ci PRIMARY KEY,
+f2 varchar(16) collate latin1_swedish_ci
+);
+CREATE TABLE t2 (
+f1 varchar(16) collate latin1_swedish_ci PRIMARY KEY,
+f3 varchar(16) collate latin1_swedish_ci
+);
+INSERT INTO t1 VALUES ('bla','blah');
+INSERT INTO t2 VALUES ('bla','sheep');
+SELECT * FROM t1 JOIN t2 USING(f1) WHERE f1='Bla';
+f1 f2 f3
+bla blah sheep
+SELECT * FROM t1 LEFT JOIN t2 USING(f1) WHERE f1='bla';
+f1 f2 f3
+bla blah sheep
+SELECT * FROM t1 LEFT JOIN t2 USING(f1) WHERE f1='Bla';
+f1 f2 f3
+bla blah sheep
+DROP TABLE t1,t2;
+CREATE TABLE t1 (id int PRIMARY KEY, a varchar(8));
+CREATE TABLE t2 (id int NOT NULL, b int NOT NULL, INDEX idx(id));
+INSERT INTO t1 VALUES
+(1,'aaaaaaa'), (5,'eeeeeee'), (4,'ddddddd'), (2,'bbbbbbb'), (3,'ccccccc');
+INSERT INTO t2 VALUES
+(3,10), (2,20), (5,30), (3,20), (5,10), (3,40), (3,30), (2,10), (2,40);
+EXPLAIN
+SELECT t1.id, a FROM t1 LEFT JOIN t2 ON t1.id=t2.id WHERE t2.b IS NULL;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 5
+1 SIMPLE t2 ref idx idx 4 test.t1.id 2 Using where; Not exists; Using join buffer
+flush status;
+SELECT t1.id, a FROM t1 LEFT JOIN t2 ON t1.id=t2.id WHERE t2.b IS NULL;
+id a
+1 aaaaaaa
+4 ddddddd
+show status like 'Handler_read%';
+Variable_name Value
+Handler_read_first 0
+Handler_read_key 5
+Handler_read_next 9
+Handler_read_prev 0
+Handler_read_rnd 3
+Handler_read_rnd_next 6
+DROP TABLE t1,t2;
+CREATE TABLE t1 (c int PRIMARY KEY, e int NOT NULL);
+INSERT INTO t1 VALUES (1,0), (2,1);
+CREATE TABLE t2 (d int PRIMARY KEY);
+INSERT INTO t2 VALUES (1), (2), (3);
+EXPLAIN SELECT * FROM t1 LEFT JOIN t2 ON e<>0 WHERE c=1 AND d IS NULL;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 const PRIMARY PRIMARY 4 const 1
+1 SIMPLE t2 index NULL PRIMARY 4 NULL 3 Using where; Using index; Not exists
+SELECT * FROM t1 LEFT JOIN t2 ON e<>0 WHERE c=1 AND d IS NULL;
+c e d
+1 0 NULL
+SELECT * FROM t1 LEFT JOIN t2 ON e<>0 WHERE c=1 AND d<=>NULL;
+c e d
+1 0 NULL
+DROP TABLE t1,t2;
+set join_cache_level=default;
+show variables like 'join_cache_level';
+Variable_name Value
+join_cache_level 1
=== added file 'mysql-test/r/select_jcl6.result'
--- a/mysql-test/r/select_jcl6.result 1970-01-01 00:00:00 +0000
+++ b/mysql-test/r/select_jcl6.result 2009-12-21 02:26:15 +0000
@@ -0,0 +1,4546 @@
+set join_cache_level=6;
+show variables like 'join_cache_level';
+Variable_name Value
+join_cache_level 6
+drop table if exists t1,t2,t3,t4,t11;
+drop table if exists t1_1,t1_2,t9_1,t9_2,t1aa,t2aa;
+drop view if exists v1;
+CREATE TABLE t1 (
+Period smallint(4) unsigned zerofill DEFAULT '0000' NOT NULL,
+Varor_period smallint(4) unsigned DEFAULT '0' NOT NULL
+);
+INSERT INTO t1 VALUES (9410,9412);
+select period from t1;
+period
+9410
+select * from t1;
+Period Varor_period
+9410 9412
+select t1.* from t1;
+Period Varor_period
+9410 9412
+CREATE TABLE t2 (
+auto int not null auto_increment,
+fld1 int(6) unsigned zerofill DEFAULT '000000' NOT NULL,
+companynr tinyint(2) unsigned zerofill DEFAULT '00' NOT NULL,
+fld3 char(30) DEFAULT '' NOT NULL,
+fld4 char(35) DEFAULT '' NOT NULL,
+fld5 char(35) DEFAULT '' NOT NULL,
+fld6 char(4) DEFAULT '' NOT NULL,
+UNIQUE fld1 (fld1),
+KEY fld3 (fld3),
+PRIMARY KEY (auto)
+);
+select t2.fld3 from t2 where companynr = 58 and fld3 like "%imaginable%";
+fld3
+imaginable
+select fld3 from t2 where fld3 like "%cultivation" ;
+fld3
+cultivation
+select t2.fld3,companynr from t2 where companynr = 57+1 order by fld3;
+fld3 companynr
+concoct 58
+druggists 58
+engrossing 58
+Eurydice 58
+exclaimers 58
+ferociousness 58
+hopelessness 58
+Huey 58
+imaginable 58
+judges 58
+merging 58
+ostrich 58
+peering 58
+Phelps 58
+presumes 58
+Ruth 58
+sentences 58
+Shylock 58
+straggled 58
+synergy 58
+thanking 58
+tying 58
+unlocks 58
+select fld3,companynr from t2 where companynr = 58 order by fld3;
+fld3 companynr
+concoct 58
+druggists 58
+engrossing 58
+Eurydice 58
+exclaimers 58
+ferociousness 58
+hopelessness 58
+Huey 58
+imaginable 58
+judges 58
+merging 58
+ostrich 58
+peering 58
+Phelps 58
+presumes 58
+Ruth 58
+sentences 58
+Shylock 58
+straggled 58
+synergy 58
+thanking 58
+tying 58
+unlocks 58
+select fld3 from t2 order by fld3 desc limit 10;
+fld3
+youthfulness
+yelped
+Wotan
+workers
+Witt
+witchcraft
+Winsett
+Willy
+willed
+wildcats
+select fld3 from t2 order by fld3 desc limit 5;
+fld3
+youthfulness
+yelped
+Wotan
+workers
+Witt
+select fld3 from t2 order by fld3 desc limit 5,5;
+fld3
+witchcraft
+Winsett
+Willy
+willed
+wildcats
+select t2.fld3 from t2 where fld3 = 'honeysuckle';
+fld3
+honeysuckle
+select t2.fld3 from t2 where fld3 LIKE 'honeysuckl_';
+fld3
+honeysuckle
+select t2.fld3 from t2 where fld3 LIKE 'hon_ysuckl_';
+fld3
+honeysuckle
+select t2.fld3 from t2 where fld3 LIKE 'honeysuckle%';
+fld3
+honeysuckle
+select t2.fld3 from t2 where fld3 LIKE 'h%le';
+fld3
+honeysuckle
+select t2.fld3 from t2 where fld3 LIKE 'honeysuckle_';
+fld3
+select t2.fld3 from t2 where fld3 LIKE 'don_t_find_me_please%';
+fld3
+explain select t2.fld3 from t2 where fld3 = 'honeysuckle';
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ref fld3 fld3 30 const 1 Using where; Using index
+explain select fld3 from t2 ignore index (fld3) where fld3 = 'honeysuckle';
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ALL NULL NULL NULL NULL 1199 Using where
+explain select fld3 from t2 use index (fld1) where fld3 = 'honeysuckle';
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ALL NULL NULL NULL NULL 1199 Using where
+explain select fld3 from t2 use index (fld3) where fld3 = 'honeysuckle';
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ref fld3 fld3 30 const 1 Using where; Using index
+explain select fld3 from t2 use index (fld1,fld3) where fld3 = 'honeysuckle';
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ref fld3 fld3 30 const 1 Using where; Using index
+explain select fld3 from t2 ignore index (fld3,not_used);
+ERROR 42000: Key 'not_used' doesn't exist in table 't2'
+explain select fld3 from t2 use index (not_used);
+ERROR 42000: Key 'not_used' doesn't exist in table 't2'
+select t2.fld3 from t2 where fld3 >= 'honeysuckle' and fld3 <= 'honoring' order by fld3;
+fld3
+honeysuckle
+honoring
+explain select t2.fld3 from t2 where fld3 >= 'honeysuckle' and fld3 <= 'honoring' order by fld3;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 range fld3 fld3 30 NULL 2 Using where; Using index
+select fld1,fld3 from t2 where fld3="Colombo" or fld3 = "nondecreasing" order by fld3;
+fld1 fld3
+148504 Colombo
+068305 Colombo
+000000 nondecreasing
+select fld1,fld3 from t2 where companynr = 37 and fld3 = 'appendixes';
+fld1 fld3
+232605 appendixes
+1232605 appendixes
+1232606 appendixes
+1232607 appendixes
+1232608 appendixes
+1232609 appendixes
+select fld1 from t2 where fld1=250501 or fld1="250502";
+fld1
+250501
+250502
+explain select fld1 from t2 where fld1=250501 or fld1="250502";
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 range fld1 fld1 4 NULL 2 Using where; Using index
+select fld1 from t2 where fld1=250501 or fld1=250502 or fld1 >= 250505 and fld1 <= 250601 or fld1 between 250501 and 250502;
+fld1
+250501
+250502
+250505
+250601
+explain select fld1 from t2 where fld1=250501 or fld1=250502 or fld1 >= 250505 and fld1 <= 250601 or fld1 between 250501 and 250502;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 range fld1 fld1 4 NULL 4 Using where; Using index
+select fld1,fld3 from t2 where companynr = 37 and fld3 like 'f%';
+fld1 fld3
+012001 flanking
+013602 foldout
+013606 fingerings
+018007 fanatic
+018017 featherweight
+018054 fetters
+018103 flint
+018104 flopping
+036002 funereal
+038017 fetched
+038205 firearm
+058004 Fenton
+088303 feminine
+186002 freakish
+188007 flurried
+188505 fitting
+198006 furthermore
+202301 Fitzpatrick
+208101 fiftieth
+208113 freest
+218008 finishers
+218022 feed
+218401 faithful
+226205 foothill
+226209 furnishings
+228306 forthcoming
+228311 fated
+231315 freezes
+232102 forgivably
+238007 filial
+238008 fixedly
+select fld3 from t2 where fld3 like "L%" and fld3 = "ok";
+fld3
+select fld3 from t2 where (fld3 like "C%" and fld3 = "Chantilly");
+fld3
+Chantilly
+select fld1,fld3 from t2 where fld1 like "25050%";
+fld1 fld3
+250501 poisoning
+250502 Iraqis
+250503 heaving
+250504 population
+250505 bomb
+select fld1,fld3 from t2 where fld1 like "25050_";
+fld1 fld3
+250501 poisoning
+250502 Iraqis
+250503 heaving
+250504 population
+250505 bomb
+select distinct companynr from t2;
+companynr
+00
+37
+36
+50
+58
+29
+40
+53
+65
+41
+34
+68
+select distinct companynr from t2 order by companynr;
+companynr
+00
+29
+34
+36
+37
+40
+41
+50
+53
+58
+65
+68
+select distinct companynr from t2 order by companynr desc;
+companynr
+68
+65
+58
+53
+50
+41
+40
+37
+36
+34
+29
+00
+select distinct t2.fld3,period from t2,t1 where companynr=37 and fld3 like "O%";
+fld3 period
+obliterates 9410
+offload 9410
+opaquely 9410
+organizer 9410
+overestimating 9410
+overlay 9410
+select distinct fld3 from t2 where companynr = 34 order by fld3;
+fld3
+absentee
+accessed
+ahead
+alphabetic
+Asiaticizations
+attitude
+aye
+bankruptcies
+belays
+Blythe
+bomb
+boulevard
+bulldozes
+cannot
+caressing
+charcoal
+checksumming
+chess
+clubroom
+colorful
+cosy
+creator
+crying
+Darius
+diffusing
+duality
+Eiffel
+Epiphany
+Ernestine
+explorers
+exterminated
+famine
+forked
+Gershwins
+heaving
+Hodges
+Iraqis
+Italianization
+Lagos
+landslide
+libretto
+Majorca
+mastering
+narrowed
+occurred
+offerers
+Palestine
+Peruvianizes
+pharmaceutic
+poisoning
+population
+Pygmalion
+rats
+realest
+recording
+regimented
+retransmitting
+reviver
+rouses
+scars
+sicker
+sleepwalk
+stopped
+sugars
+translatable
+uncles
+unexpected
+uprisings
+versatility
+vest
+select distinct fld3 from t2 limit 10;
+fld3
+abates
+abiding
+Abraham
+abrogating
+absentee
+abut
+accessed
+accruing
+accumulating
+accuracies
+select distinct fld3 from t2 having fld3 like "A%" limit 10;
+fld3
+abates
+abiding
+Abraham
+abrogating
+absentee
+abut
+accessed
+accruing
+accumulating
+accuracies
+select distinct substring(fld3,1,3) from t2 where fld3 like "A%";
+substring(fld3,1,3)
+aba
+abi
+Abr
+abs
+abu
+acc
+acq
+acu
+Ade
+adj
+Adl
+adm
+Ado
+ads
+adv
+aer
+aff
+afi
+afl
+afo
+agi
+ahe
+aim
+air
+Ald
+alg
+ali
+all
+alp
+alr
+ama
+ame
+amm
+ana
+and
+ane
+Ang
+ani
+Ann
+Ant
+api
+app
+aqu
+Ara
+arc
+Arm
+arr
+Art
+Asi
+ask
+asp
+ass
+ast
+att
+aud
+Aug
+aut
+ave
+avo
+awe
+aye
+Azt
+select distinct substring(fld3,1,3) as a from t2 having a like "A%" order by a limit 10;
+a
+aba
+abi
+Abr
+abs
+abu
+acc
+acq
+acu
+Ade
+adj
+select distinct substring(fld3,1,3) from t2 where fld3 like "A%" limit 10;
+substring(fld3,1,3)
+aba
+abi
+Abr
+abs
+abu
+acc
+acq
+acu
+Ade
+adj
+select distinct substring(fld3,1,3) as a from t2 having a like "A%" limit 10;
+a
+aba
+abi
+Abr
+abs
+abu
+acc
+acq
+acu
+Ade
+adj
+create table t3 (
+period int not null,
+name char(32) not null,
+companynr int not null,
+price double(11,0),
+price2 double(11,0),
+key (period),
+key (name)
+);
+create temporary table tmp engine = myisam select * from t3;
+insert into t3 select * from tmp;
+insert into tmp select * from t3;
+insert into t3 select * from tmp;
+insert into tmp select * from t3;
+insert into t3 select * from tmp;
+insert into tmp select * from t3;
+insert into t3 select * from tmp;
+insert into tmp select * from t3;
+insert into t3 select * from tmp;
+insert into tmp select * from t3;
+insert into t3 select * from tmp;
+insert into tmp select * from t3;
+insert into t3 select * from tmp;
+insert into tmp select * from t3;
+insert into t3 select * from tmp;
+insert into tmp select * from t3;
+insert into t3 select * from tmp;
+alter table t3 add t2nr int not null auto_increment primary key first;
+drop table tmp;
+SET SQL_BIG_TABLES=1;
+select distinct concat(fld3," ",fld3) as namn from t2,t3 where t2.fld1=t3.t2nr order by namn limit 10;
+namn
+Abraham Abraham
+abrogating abrogating
+admonishing admonishing
+Adolph Adolph
+afield afield
+aging aging
+ammonium ammonium
+analyzable analyzable
+animals animals
+animized animized
+SET SQL_BIG_TABLES=0;
+select distinct concat(fld3," ",fld3) from t2,t3 where t2.fld1=t3.t2nr order by fld3 limit 10;
+concat(fld3," ",fld3)
+Abraham Abraham
+abrogating abrogating
+admonishing admonishing
+Adolph Adolph
+afield afield
+aging aging
+ammonium ammonium
+analyzable analyzable
+animals animals
+animized animized
+select distinct fld5 from t2 limit 10;
+fld5
+neat
+Steinberg
+jarring
+tinily
+balled
+persist
+attainments
+fanatic
+measures
+rightfulness
+select distinct fld3,count(*) from t2 group by companynr,fld3 limit 10;
+fld3 count(*)
+affixed 1
+and 1
+annoyers 1
+Anthony 1
+assayed 1
+assurers 1
+attendants 1
+bedlam 1
+bedpost 1
+boasted 1
+SET SQL_BIG_TABLES=1;
+select distinct fld3,count(*) from t2 group by companynr,fld3 limit 10;
+fld3 count(*)
+affixed 1
+and 1
+annoyers 1
+Anthony 1
+assayed 1
+assurers 1
+attendants 1
+bedlam 1
+bedpost 1
+boasted 1
+SET SQL_BIG_TABLES=0;
+select distinct fld3,repeat("a",length(fld3)),count(*) from t2 group by companynr,fld3 limit 100,10;
+fld3 repeat("a",length(fld3)) count(*)
+circus aaaaaa 1
+cited aaaaa 1
+Colombo aaaaaaa 1
+congresswoman aaaaaaaaaaaaa 1
+contrition aaaaaaaaaa 1
+corny aaaaa 1
+cultivation aaaaaaaaaaa 1
+definiteness aaaaaaaaaaaa 1
+demultiplex aaaaaaaaaaa 1
+disappointing aaaaaaaaaaaaa 1
+select distinct companynr,rtrim(space(512+companynr)) from t3 order by 1,2;
+companynr rtrim(space(512+companynr))
+37
+78
+101
+154
+311
+447
+512
+select distinct fld3 from t2,t3 where t2.companynr = 34 and t2.fld1=t3.t2nr order by fld3;
+fld3
+explain select t3.t2nr,fld3 from t2,t3 where t2.companynr = 34 and t2.fld1=t3.t2nr order by t3.t2nr,fld3;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ALL fld1 NULL NULL NULL 1199 Using where; Using temporary; Using filesort
+1 SIMPLE t3 eq_ref PRIMARY PRIMARY 4 test.t2.fld1 1 Using where; Using index
+explain select * from t3 as t1,t3 where t1.period=t3.period order by t3.period;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL period NULL NULL NULL 41810 Using temporary; Using filesort
+1 SIMPLE t3 ref period period 4 test.t1.period 4181 Using join buffer
+explain select * from t3 as t1,t3 where t1.period=t3.period order by t3.period limit 10;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t3 ALL period NULL NULL NULL 41810 Using temporary; Using filesort
+1 SIMPLE t1 ref period period 4 test.t3.period 4181 Using join buffer
+explain select * from t3 as t1,t3 where t1.period=t3.period order by t1.period limit 10;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL period NULL NULL NULL 41810 Using temporary; Using filesort
+1 SIMPLE t3 ref period period 4 test.t1.period 4181 Using join buffer
+select period from t1;
+period
+9410
+select period from t1 where period=1900;
+period
+select fld3,period from t1,t2 where fld1 = 011401 order by period;
+fld3 period
+breaking 9410
+select fld3,period from t2,t3 where t2.fld1 = 011401 and t2.fld1=t3.t2nr and t3.period=1001;
+fld3 period
+breaking 1001
+explain select fld3,period from t2,t3 where t2.fld1 = 011401 and t3.t2nr=t2.fld1 and 1001 = t3.period;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 const fld1 fld1 4 const 1
+1 SIMPLE t3 const PRIMARY,period PRIMARY 4 const 1
+select fld3,period from t2,t1 where companynr*10 = 37*10;
+fld3 period
+breaking 9410
+Romans 9410
+intercepted 9410
+bewilderingly 9410
+astound 9410
+admonishing 9410
+sumac 9410
+flanking 9410
+combed 9410
+subjective 9410
+scatterbrain 9410
+Eulerian 9410
+Kane 9410
+overlay 9410
+perturb 9410
+goblins 9410
+annihilates 9410
+Wotan 9410
+snatching 9410
+concludes 9410
+laterally 9410
+yelped 9410
+grazing 9410
+Baird 9410
+celery 9410
+misunderstander 9410
+handgun 9410
+foldout 9410
+mystic 9410
+succumbed 9410
+Nabisco 9410
+fingerings 9410
+aging 9410
+afield 9410
+ammonium 9410
+boat 9410
+intelligibility 9410
+Augustine 9410
+teethe 9410
+dreaded 9410
+scholastics 9410
+audiology 9410
+wallet 9410
+parters 9410
+eschew 9410
+quitter 9410
+neat 9410
+Steinberg 9410
+jarring 9410
+tinily 9410
+balled 9410
+persist 9410
+attainments 9410
+fanatic 9410
+measures 9410
+rightfulness 9410
+capably 9410
+impulsive 9410
+starlet 9410
+terminators 9410
+untying 9410
+announces 9410
+featherweight 9410
+pessimist 9410
+daughter 9410
+decliner 9410
+lawgiver 9410
+stated 9410
+readable 9410
+attrition 9410
+cascade 9410
+motors 9410
+interrogate 9410
+pests 9410
+stairway 9410
+dopers 9410
+testicle 9410
+Parsifal 9410
+leavings 9410
+postulation 9410
+squeaking 9410
+contrasted 9410
+leftover 9410
+whiteners 9410
+erases 9410
+Punjab 9410
+Merritt 9410
+Quixotism 9410
+sweetish 9410
+dogging 9410
+scornfully 9410
+bellow 9410
+bills 9410
+cupboard 9410
+sureties 9410
+puddings 9410
+fetters 9410
+bivalves 9410
+incurring 9410
+Adolph 9410
+pithed 9410
+Miles 9410
+trimmings 9410
+tragedies 9410
+skulking 9410
+flint 9410
+flopping 9410
+relaxing 9410
+offload 9410
+suites 9410
+lists 9410
+animized 9410
+multilayer 9410
+standardizes 9410
+Judas 9410
+vacuuming 9410
+dentally 9410
+humanness 9410
+inch 9410
+Weissmuller 9410
+irresponsibly 9410
+luckily 9410
+culled 9410
+medical 9410
+bloodbath 9410
+subschema 9410
+animals 9410
+Micronesia 9410
+repetitions 9410
+Antares 9410
+ventilate 9410
+pityingly 9410
+interdependent 9410
+Graves 9410
+neonatal 9410
+chafe 9410
+honoring 9410
+realtor 9410
+elite 9410
+funereal 9410
+abrogating 9410
+sorters 9410
+Conley 9410
+lectured 9410
+Abraham 9410
+Hawaii 9410
+cage 9410
+hushes 9410
+Simla 9410
+reporters 9410
+Dutchman 9410
+descendants 9410
+groupings 9410
+dissociate 9410
+coexist 9410
+Beebe 9410
+Taoism 9410
+Connally 9410
+fetched 9410
+checkpoints 9410
+rusting 9410
+galling 9410
+obliterates 9410
+traitor 9410
+resumes 9410
+analyzable 9410
+terminator 9410
+gritty 9410
+firearm 9410
+minima 9410
+Selfridge 9410
+disable 9410
+witchcraft 9410
+betroth 9410
+Manhattanize 9410
+imprint 9410
+peeked 9410
+swelling 9410
+interrelationships 9410
+riser 9410
+Gandhian 9410
+peacock 9410
+bee 9410
+kanji 9410
+dental 9410
+scarf 9410
+chasm 9410
+insolence 9410
+syndicate 9410
+alike 9410
+imperial 9410
+convulsion 9410
+railway 9410
+validate 9410
+normalizes 9410
+comprehensive 9410
+chewing 9410
+denizen 9410
+schemer 9410
+chronicle 9410
+Kline 9410
+Anatole 9410
+partridges 9410
+brunch 9410
+recruited 9410
+dimensions 9410
+Chicana 9410
+announced 9410
+praised 9410
+employing 9410
+linear 9410
+quagmire 9410
+western 9410
+relishing 9410
+serving 9410
+scheduling 9410
+lore 9410
+eventful 9410
+arteriole 9410
+disentangle 9410
+cured 9410
+Fenton 9410
+avoidable 9410
+drains 9410
+detectably 9410
+husky 9410
+impelling 9410
+undoes 9410
+evened 9410
+squeezes 9410
+destroyer 9410
+rudeness 9410
+beaner 9410
+boorish 9410
+Everhart 9410
+encompass 9410
+mushrooms 9410
+Alison 9410
+externally 9410
+pellagra 9410
+cult 9410
+creek 9410
+Huffman 9410
+Majorca 9410
+governing 9410
+gadfly 9410
+reassigned 9410
+intentness 9410
+craziness 9410
+psychic 9410
+squabbled 9410
+burlesque 9410
+capped 9410
+extracted 9410
+DiMaggio 9410
+exclamation 9410
+subdirectory 9410
+Gothicism 9410
+feminine 9410
+metaphysically 9410
+sanding 9410
+Miltonism 9410
+freakish 9410
+index 9410
+straight 9410
+flurried 9410
+denotative 9410
+coming 9410
+commencements 9410
+gentleman 9410
+gifted 9410
+Shanghais 9410
+sportswriting 9410
+sloping 9410
+navies 9410
+leaflet 9410
+shooter 9410
+Joplin 9410
+babies 9410
+assails 9410
+admiring 9410
+swaying 9410
+Goldstine 9410
+fitting 9410
+Norwalk 9410
+analogy 9410
+deludes 9410
+cokes 9410
+Clayton 9410
+exhausts 9410
+causality 9410
+sating 9410
+icon 9410
+throttles 9410
+communicants 9410
+dehydrate 9410
+priceless 9410
+publicly 9410
+incidentals 9410
+commonplace 9410
+mumbles 9410
+furthermore 9410
+cautioned 9410
+parametrized 9410
+registration 9410
+sadly 9410
+positioning 9410
+babysitting 9410
+eternal 9410
+hoarder 9410
+congregates 9410
+rains 9410
+workers 9410
+sags 9410
+unplug 9410
+garage 9410
+boulder 9410
+specifics 9410
+Teresa 9410
+Winsett 9410
+convenient 9410
+buckboards 9410
+amenities 9410
+resplendent 9410
+sews 9410
+participated 9410
+Simon 9410
+certificates 9410
+Fitzpatrick 9410
+Evanston 9410
+misted 9410
+textures 9410
+save 9410
+count 9410
+rightful 9410
+chaperone 9410
+Lizzy 9410
+clenched 9410
+effortlessly 9410
+accessed 9410
+beaters 9410
+Hornblower 9410
+vests 9410
+indulgences 9410
+infallibly 9410
+unwilling 9410
+excrete 9410
+spools 9410
+crunches 9410
+overestimating 9410
+ineffective 9410
+humiliation 9410
+sophomore 9410
+star 9410
+rifles 9410
+dialysis 9410
+arriving 9410
+indulge 9410
+clockers 9410
+languages 9410
+Antarctica 9410
+percentage 9410
+ceiling 9410
+specification 9410
+regimented 9410
+ciphers 9410
+pictures 9410
+serpents 9410
+allot 9410
+realized 9410
+mayoral 9410
+opaquely 9410
+hostess 9410
+fiftieth 9410
+incorrectly 9410
+decomposition 9410
+stranglings 9410
+mixture 9410
+electroencephalography 9410
+similarities 9410
+charges 9410
+freest 9410
+Greenberg 9410
+tinting 9410
+expelled 9410
+warm 9410
+smoothed 9410
+deductions 9410
+Romano 9410
+bitterroot 9410
+corset 9410
+securing 9410
+environing 9410
+cute 9410
+Crays 9410
+heiress 9410
+inform 9410
+avenge 9410
+universals 9410
+Kinsey 9410
+ravines 9410
+bestseller 9410
+equilibrium 9410
+extents 9410
+relatively 9410
+pressure 9410
+critiques 9410
+befouled 9410
+rightfully 9410
+mechanizing 9410
+Latinizes 9410
+timesharing 9410
+Aden 9410
+embassies 9410
+males 9410
+shapelessly 9410
+mastering 9410
+Newtonian 9410
+finishers 9410
+abates 9410
+teem 9410
+kiting 9410
+stodgy 9410
+feed 9410
+guitars 9410
+airships 9410
+store 9410
+denounces 9410
+Pyle 9410
+Saxony 9410
+serializations 9410
+Peruvian 9410
+taxonomically 9410
+kingdom 9410
+stint 9410
+Sault 9410
+faithful 9410
+Ganymede 9410
+tidiness 9410
+gainful 9410
+contrary 9410
+Tipperary 9410
+tropics 9410
+theorizers 9410
+renew 9410
+already 9410
+terminal 9410
+Hegelian 9410
+hypothesizer 9410
+warningly 9410
+journalizing 9410
+nested 9410
+Lars 9410
+saplings 9410
+foothill 9410
+labeled 9410
+imperiously 9410
+reporters 9410
+furnishings 9410
+precipitable 9410
+discounts 9410
+excises 9410
+Stalin 9410
+despot 9410
+ripeness 9410
+Arabia 9410
+unruly 9410
+mournfulness 9410
+boom 9410
+slaughter 9410
+Sabine 9410
+handy 9410
+rural 9410
+organizer 9410
+shipyard 9410
+civics 9410
+inaccuracy 9410
+rules 9410
+juveniles 9410
+comprised 9410
+investigations 9410
+stabilizes 9410
+seminaries 9410
+Hunter 9410
+sporty 9410
+test 9410
+weasels 9410
+CERN 9410
+tempering 9410
+afore 9410
+Galatean 9410
+techniques 9410
+error 9410
+veranda 9410
+severely 9410
+Cassites 9410
+forthcoming 9410
+guides 9410
+vanish 9410
+lied 9410
+sawtooth 9410
+fated 9410
+gradually 9410
+widens 9410
+preclude 9410
+evenhandedly 9410
+percentage 9410
+disobedience 9410
+humility 9410
+gleaning 9410
+petted 9410
+bloater 9410
+minion 9410
+marginal 9410
+apiary 9410
+measures 9410
+precaution 9410
+repelled 9410
+primary 9410
+coverings 9410
+Artemia 9410
+navigate 9410
+spatial 9410
+Gurkha 9410
+meanwhile 9410
+Melinda 9410
+Butterfield 9410
+Aldrich 9410
+previewing 9410
+glut 9410
+unaffected 9410
+inmate 9410
+mineral 9410
+impending 9410
+meditation 9410
+ideas 9410
+miniaturizes 9410
+lewdly 9410
+title 9410
+youthfulness 9410
+creak 9410
+Chippewa 9410
+clamored 9410
+freezes 9410
+forgivably 9410
+reduce 9410
+McGovern 9410
+Nazis 9410
+epistle 9410
+socializes 9410
+conceptions 9410
+Kevin 9410
+uncovering 9410
+chews 9410
+appendixes 9410
+appendixes 9410
+appendixes 9410
+appendixes 9410
+appendixes 9410
+appendixes 9410
+raining 9410
+infest 9410
+compartment 9410
+minting 9410
+ducks 9410
+roped 9410
+waltz 9410
+Lillian 9410
+repressions 9410
+chillingly 9410
+noncritical 9410
+lithograph 9410
+spongers 9410
+parenthood 9410
+posed 9410
+instruments 9410
+filial 9410
+fixedly 9410
+relives 9410
+Pandora 9410
+watering 9410
+ungrateful 9410
+secures 9410
+poison 9410
+dusted 9410
+encompasses 9410
+presentation 9410
+Kantian 9410
+select fld3,period,price,price2 from t2,t3 where t2.fld1=t3.t2nr and period >= 1001 and period <= 1002 and t2.companynr = 37 order by fld3,period, price;
+fld3 period price price2
+admonishing 1002 28357832 8723648
+analyzable 1002 28357832 8723648
+annihilates 1001 5987435 234724
+Antares 1002 28357832 8723648
+astound 1001 5987435 234724
+audiology 1001 5987435 234724
+Augustine 1002 28357832 8723648
+Baird 1002 28357832 8723648
+bewilderingly 1001 5987435 234724
+breaking 1001 5987435 234724
+Conley 1001 5987435 234724
+dentally 1002 28357832 8723648
+dissociate 1002 28357832 8723648
+elite 1001 5987435 234724
+eschew 1001 5987435 234724
+Eulerian 1001 5987435 234724
+flanking 1001 5987435 234724
+foldout 1002 28357832 8723648
+funereal 1002 28357832 8723648
+galling 1002 28357832 8723648
+Graves 1001 5987435 234724
+grazing 1001 5987435 234724
+groupings 1001 5987435 234724
+handgun 1001 5987435 234724
+humility 1002 28357832 8723648
+impulsive 1002 28357832 8723648
+inch 1001 5987435 234724
+intelligibility 1001 5987435 234724
+jarring 1001 5987435 234724
+lawgiver 1001 5987435 234724
+lectured 1002 28357832 8723648
+Merritt 1002 28357832 8723648
+neonatal 1001 5987435 234724
+offload 1002 28357832 8723648
+parters 1002 28357832 8723648
+pityingly 1002 28357832 8723648
+puddings 1002 28357832 8723648
+Punjab 1001 5987435 234724
+quitter 1002 28357832 8723648
+realtor 1001 5987435 234724
+relaxing 1001 5987435 234724
+repetitions 1001 5987435 234724
+resumes 1001 5987435 234724
+Romans 1002 28357832 8723648
+rusting 1001 5987435 234724
+scholastics 1001 5987435 234724
+skulking 1002 28357832 8723648
+stated 1002 28357832 8723648
+suites 1002 28357832 8723648
+sureties 1001 5987435 234724
+testicle 1002 28357832 8723648
+tinily 1002 28357832 8723648
+tragedies 1001 5987435 234724
+trimmings 1001 5987435 234724
+vacuuming 1001 5987435 234724
+ventilate 1001 5987435 234724
+wallet 1001 5987435 234724
+Weissmuller 1002 28357832 8723648
+Wotan 1002 28357832 8723648
+select t2.fld1,fld3,period,price,price2 from t2,t3 where t2.fld1>= 18201 and t2.fld1 <= 18811 and t2.fld1=t3.t2nr and period = 1001 and t2.companynr = 37;
+fld1 fld3 period price price2
+018201 relaxing 1001 5987435 234724
+018601 vacuuming 1001 5987435 234724
+018801 inch 1001 5987435 234724
+018811 repetitions 1001 5987435 234724
+create table t4 (
+companynr tinyint(2) unsigned zerofill NOT NULL default '00',
+companyname char(30) NOT NULL default '',
+PRIMARY KEY (companynr),
+UNIQUE KEY companyname(companyname)
+) ENGINE=MyISAM MAX_ROWS=50 PACK_KEYS=1 COMMENT='companynames';
+select STRAIGHT_JOIN t2.companynr,companyname from t4,t2 where t2.companynr=t4.companynr group by t2.companynr;
+companynr companyname
+00 Unknown
+29 company 1
+34 company 2
+36 company 3
+37 company 4
+40 company 5
+41 company 6
+50 company 11
+53 company 7
+58 company 8
+65 company 9
+68 company 10
+select SQL_SMALL_RESULT t2.companynr,companyname from t4,t2 where t2.companynr=t4.companynr group by t2.companynr;
+companynr companyname
+00 Unknown
+29 company 1
+34 company 2
+36 company 3
+37 company 4
+40 company 5
+41 company 6
+50 company 11
+53 company 7
+58 company 8
+65 company 9
+68 company 10
+select * from t1,t1 t12;
+Period Varor_period Period Varor_period
+9410 9412 9410 9412
+select t2.fld1,t22.fld1 from t2,t2 t22 where t2.fld1 >= 250501 and t2.fld1 <= 250505 and t22.fld1 >= 250501 and t22.fld1 <= 250505;
+fld1 fld1
+250501 250501
+250502 250501
+250503 250501
+250504 250501
+250505 250501
+250501 250502
+250502 250502
+250503 250502
+250504 250502
+250505 250502
+250501 250503
+250502 250503
+250503 250503
+250504 250503
+250505 250503
+250501 250504
+250502 250504
+250503 250504
+250504 250504
+250505 250504
+250501 250505
+250502 250505
+250503 250505
+250504 250505
+250505 250505
+insert into t2 (fld1, companynr) values (999999,99);
+select t2.companynr,companyname from t2 left join t4 using (companynr) where t4.companynr is null;
+companynr companyname
+99 NULL
+select count(*) from t2 left join t4 using (companynr) where t4.companynr is not null;
+count(*)
+1199
+explain select t2.companynr,companyname from t2 left join t4 using (companynr) where t4.companynr is null;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ALL NULL NULL NULL NULL 1200
+1 SIMPLE t4 eq_ref PRIMARY PRIMARY 1 test.t2.companynr 1 Using where; Not exists; Using join buffer
+explain select t2.companynr,companyname from t4 left join t2 using (companynr) where t2.companynr is null;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t4 ALL NULL NULL NULL NULL 12
+1 SIMPLE t2 ALL NULL NULL NULL NULL 1200 Using where; Not exists; Using join buffer
+select companynr,companyname from t2 left join t4 using (companynr) where companynr is null;
+companynr companyname
+select count(*) from t2 left join t4 using (companynr) where companynr is not null;
+count(*)
+1200
+explain select companynr,companyname from t2 left join t4 using (companynr) where companynr is null;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE NULL NULL NULL NULL NULL NULL NULL Impossible WHERE
+explain select companynr,companyname from t4 left join t2 using (companynr) where companynr is null;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE NULL NULL NULL NULL NULL NULL NULL Impossible WHERE
+delete from t2 where fld1=999999;
+explain select t2.companynr,companyname from t4 left join t2 using (companynr) where t2.companynr > 0;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ALL NULL NULL NULL NULL 1199 Using where
+1 SIMPLE t4 eq_ref PRIMARY PRIMARY 1 test.t2.companynr 1 Using join buffer
+explain select t2.companynr,companyname from t4 left join t2 using (companynr) where t2.companynr > 0 or t2.companynr < 0;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ALL NULL NULL NULL NULL 1199 Using where
+1 SIMPLE t4 eq_ref PRIMARY PRIMARY 1 test.t2.companynr 1 Using join buffer
+explain select t2.companynr,companyname from t4 left join t2 using (companynr) where t2.companynr > 0 and t4.companynr > 0;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ALL NULL NULL NULL NULL 1199 Using where
+1 SIMPLE t4 eq_ref PRIMARY PRIMARY 1 test.t2.companynr 1 Using join buffer
+explain select companynr,companyname from t4 left join t2 using (companynr) where companynr > 0;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t4 ALL PRIMARY NULL NULL NULL 12 Using where
+1 SIMPLE t2 ALL NULL NULL NULL NULL 1199 Using where; Using join buffer
+explain select companynr,companyname from t4 left join t2 using (companynr) where companynr > 0 or companynr < 0;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t4 ALL PRIMARY NULL NULL NULL 12 Using where
+1 SIMPLE t2 ALL NULL NULL NULL NULL 1199 Using where; Using join buffer
+explain select companynr,companyname from t4 left join t2 using (companynr) where companynr > 0 and companynr > 0;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t4 ALL PRIMARY NULL NULL NULL 12 Using where
+1 SIMPLE t2 ALL NULL NULL NULL NULL 1199 Using where; Using join buffer
+explain select t2.companynr,companyname from t4 left join t2 using (companynr) where t2.companynr > 0 or t2.companynr is null;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t4 ALL NULL NULL NULL NULL 12
+1 SIMPLE t2 ALL NULL NULL NULL NULL 1199 Using where; Using join buffer
+explain select t2.companynr,companyname from t4 left join t2 using (companynr) where t2.companynr > 0 or t2.companynr < 0 or t4.companynr > 0;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t4 ALL PRIMARY NULL NULL NULL 12
+1 SIMPLE t2 ALL NULL NULL NULL NULL 1199 Using where; Using join buffer
+explain select t2.companynr,companyname from t4 left join t2 using (companynr) where ifnull(t2.companynr,1)>0;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t4 ALL NULL NULL NULL NULL 12
+1 SIMPLE t2 ALL NULL NULL NULL NULL 1199 Using where; Using join buffer
+explain select companynr,companyname from t4 left join t2 using (companynr) where companynr > 0 or companynr is null;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t4 ALL PRIMARY NULL NULL NULL 12 Using where
+1 SIMPLE t2 ALL NULL NULL NULL NULL 1199 Using where; Using join buffer
+explain select companynr,companyname from t4 left join t2 using (companynr) where companynr > 0 or companynr < 0 or companynr > 0;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t4 ALL PRIMARY NULL NULL NULL 12 Using where
+1 SIMPLE t2 ALL NULL NULL NULL NULL 1199 Using where; Using join buffer
+explain select companynr,companyname from t4 left join t2 using (companynr) where ifnull(companynr,1)>0;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t4 ALL NULL NULL NULL NULL 12 Using where
+1 SIMPLE t2 ALL NULL NULL NULL NULL 1199 Using where; Using join buffer
+select distinct t2.companynr,t4.companynr from t2,t4 where t2.companynr=t4.companynr+1;
+companynr companynr
+37 36
+41 40
+explain select distinct t2.companynr,t4.companynr from t2,t4 where t2.companynr=t4.companynr+1;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t4 index NULL PRIMARY 1 NULL 12 Using index; Using temporary
+1 SIMPLE t2 ALL NULL NULL NULL NULL 1199 Using where; Using join buffer
+select t2.fld1,t2.companynr,fld3,period from t3,t2 where t2.fld1 = 38208 and t2.fld1=t3.t2nr and period = 1008 or t2.fld1 = 38008 and t2.fld1 =t3.t2nr and period = 1008;
+fld1 companynr fld3 period
+038008 37 reporters 1008
+038208 37 Selfridge 1008
+select t2.fld1,t2.companynr,fld3,period from t3,t2 where (t2.fld1 = 38208 or t2.fld1 = 38008) and t2.fld1=t3.t2nr and period>=1008 and period<=1009;
+fld1 companynr fld3 period
+038008 37 reporters 1008
+038208 37 Selfridge 1008
+select t2.fld1,t2.companynr,fld3,period from t3,t2 where (t3.t2nr = 38208 or t3.t2nr = 38008) and t2.fld1=t3.t2nr and period>=1008 and period<=1009;
+fld1 companynr fld3 period
+038008 37 reporters 1008
+038208 37 Selfridge 1008
+select period from t1 where (((period > 0) or period < 10000 or (period = 1900)) and (period=1900 and period <= 1901) or (period=1903 and (period=1903)) and period>=1902) or ((period=1904 or period=1905) or (period=1906 or period>1907)) or (period=1908 and period = 1909);
+period
+9410
+select period from t1 where ((period > 0 and period < 1) or (((period > 0 and period < 100) and (period > 10)) or (period > 10)) or (period > 0 and (period > 5 or period > 6)));
+period
+9410
+select a.fld1 from t2 as a,t2 b where ((a.fld1 = 250501 and a.fld1=b.fld1) or a.fld1=250502 or a.fld1=250503 or (a.fld1=250505 and a.fld1<=b.fld1 and b.fld1>=a.fld1)) and a.fld1=b.fld1;
+fld1
+250501
+250502
+250503
+250505
+select fld1 from t2 where fld1 in (250502,98005,98006,250503,250605,250606) and fld1 >=250502 and fld1 not in (250605,250606);
+fld1
+250502
+250503
+select fld1 from t2 where fld1 between 250502 and 250504;
+fld1
+250502
+250503
+250504
+select fld3 from t2 where (((fld3 like "_%L%" ) or (fld3 like "%ok%")) and ( fld3 like "L%" or fld3 like "G%")) and fld3 like "L%" ;
+fld3
+label
+labeled
+labeled
+landslide
+laterally
+leaflet
+lewdly
+Lillian
+luckily
+select count(*) from t1;
+count(*)
+1
+select companynr,count(*),sum(fld1) from t2 group by companynr;
+companynr count(*) sum(fld1)
+00 82 10355753
+29 95 14473298
+34 70 17788966
+36 215 22786296
+37 588 83602098
+40 37 6618386
+41 52 12816335
+50 11 1595438
+53 4 793210
+58 23 2254293
+65 10 2284055
+68 12 3097288
+select companynr,count(*) from t2 group by companynr order by companynr desc limit 5;
+companynr count(*)
+68 12
+65 10
+58 23
+53 4
+50 11
+select count(*),min(fld4),max(fld4),sum(fld1),avg(fld1),std(fld1),variance(fld1) from t2 where companynr = 34 and fld4<>"";
+count(*) min(fld4) max(fld4) sum(fld1) avg(fld1) std(fld1) variance(fld1)
+70 absentee vest 17788966 254128.0857 3272.5940 10709871.3069
+explain extended select count(*),min(fld4),max(fld4),sum(fld1),avg(fld1),std(fld1),variance(fld1) from t2 where companynr = 34 and fld4<>"";
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 SIMPLE t2 ALL NULL NULL NULL NULL 1199 100.00 Using where
+Warnings:
+Note 1003 select count(0) AS `count(*)`,min(`test`.`t2`.`fld4`) AS `min(fld4)`,max(`test`.`t2`.`fld4`) AS `max(fld4)`,sum(`test`.`t2`.`fld1`) AS `sum(fld1)`,avg(`test`.`t2`.`fld1`) AS `avg(fld1)`,std(`test`.`t2`.`fld1`) AS `std(fld1)`,variance(`test`.`t2`.`fld1`) AS `variance(fld1)` from `test`.`t2` where ((`test`.`t2`.`companynr` = 34) and (`test`.`t2`.`fld4` <> ''))
+select companynr,count(*),min(fld4),max(fld4),sum(fld1),avg(fld1),std(fld1),variance(fld1) from t2 group by companynr limit 3;
+companynr count(*) min(fld4) max(fld4) sum(fld1) avg(fld1) std(fld1) variance(fld1)
+00 82 Anthony windmills 10355753 126289.6707 115550.9757 13352027981.7087
+29 95 abut wetness 14473298 152350.5053 8368.5480 70032594.9026
+34 70 absentee vest 17788966 254128.0857 3272.5940 10709871.3069
+select companynr,t2nr,count(price),sum(price),min(price),max(price),avg(price) from t3 where companynr = 37 group by companynr,t2nr limit 10;
+companynr t2nr count(price) sum(price) min(price) max(price) avg(price)
+37 1 1 5987435 5987435 5987435 5987435.0000
+37 2 1 28357832 28357832 28357832 28357832.0000
+37 3 1 39654943 39654943 39654943 39654943.0000
+37 11 1 5987435 5987435 5987435 5987435.0000
+37 12 1 28357832 28357832 28357832 28357832.0000
+37 13 1 39654943 39654943 39654943 39654943.0000
+37 21 1 5987435 5987435 5987435 5987435.0000
+37 22 1 28357832 28357832 28357832 28357832.0000
+37 23 1 39654943 39654943 39654943 39654943.0000
+37 31 1 5987435 5987435 5987435 5987435.0000
+select /*! SQL_SMALL_RESULT */ companynr,t2nr,count(price),sum(price),min(price),max(price),avg(price) from t3 where companynr = 37 group by companynr,t2nr limit 10;
+companynr t2nr count(price) sum(price) min(price) max(price) avg(price)
+37 1 1 5987435 5987435 5987435 5987435.0000
+37 2 1 28357832 28357832 28357832 28357832.0000
+37 3 1 39654943 39654943 39654943 39654943.0000
+37 11 1 5987435 5987435 5987435 5987435.0000
+37 12 1 28357832 28357832 28357832 28357832.0000
+37 13 1 39654943 39654943 39654943 39654943.0000
+37 21 1 5987435 5987435 5987435 5987435.0000
+37 22 1 28357832 28357832 28357832 28357832.0000
+37 23 1 39654943 39654943 39654943 39654943.0000
+37 31 1 5987435 5987435 5987435 5987435.0000
+select companynr,count(price),sum(price),min(price),max(price),avg(price) from t3 group by companynr ;
+companynr count(price) sum(price) min(price) max(price) avg(price)
+37 12543 309394878010 5987435 39654943 24666736.6667
+78 8362 414611089292 726498 98439034 49582766.0000
+101 4181 3489454238 834598 834598 834598.0000
+154 4181 4112197254950 983543950 983543950 983543950.0000
+311 4181 979599938 234298 234298 234298.0000
+447 4181 9929180954 2374834 2374834 2374834.0000
+512 4181 3288532102 786542 786542 786542.0000
+select distinct mod(companynr,10) from t4 group by companynr;
+mod(companynr,10)
+0
+9
+4
+6
+7
+1
+3
+8
+5
+select distinct 1 from t4 group by companynr;
+1
+1
+select count(distinct fld1) from t2;
+count(distinct fld1)
+1199
+select companynr,count(distinct fld1) from t2 group by companynr;
+companynr count(distinct fld1)
+00 82
+29 95
+34 70
+36 215
+37 588
+40 37
+41 52
+50 11
+53 4
+58 23
+65 10
+68 12
+select companynr,count(*) from t2 group by companynr;
+companynr count(*)
+00 82
+29 95
+34 70
+36 215
+37 588
+40 37
+41 52
+50 11
+53 4
+58 23
+65 10
+68 12
+select companynr,count(distinct concat(fld1,repeat(65,1000))) from t2 group by companynr;
+companynr count(distinct concat(fld1,repeat(65,1000)))
+00 82
+29 95
+34 70
+36 215
+37 588
+40 37
+41 52
+50 11
+53 4
+58 23
+65 10
+68 12
+select companynr,count(distinct concat(fld1,repeat(65,200))) from t2 group by companynr;
+companynr count(distinct concat(fld1,repeat(65,200)))
+00 82
+29 95
+34 70
+36 215
+37 588
+40 37
+41 52
+50 11
+53 4
+58 23
+65 10
+68 12
+select companynr,count(distinct floor(fld1/100)) from t2 group by companynr;
+companynr count(distinct floor(fld1/100))
+00 47
+29 35
+34 14
+36 69
+37 108
+40 16
+41 11
+50 9
+53 1
+58 1
+65 1
+68 1
+select companynr,count(distinct concat(repeat(65,1000),floor(fld1/100))) from t2 group by companynr;
+companynr count(distinct concat(repeat(65,1000),floor(fld1/100)))
+00 47
+29 35
+34 14
+36 69
+37 108
+40 16
+41 11
+50 9
+53 1
+58 1
+65 1
+68 1
+select sum(fld1),fld3 from t2 where fld3="Romans" group by fld1 limit 10;
+sum(fld1) fld3
+11402 Romans
+select name,count(*) from t3 where name='cloakroom' group by name;
+name count(*)
+cloakroom 4181
+select name,count(*) from t3 where name='cloakroom' and price>10 group by name;
+name count(*)
+cloakroom 4181
+select count(*) from t3 where name='cloakroom' and price2=823742;
+count(*)
+4181
+select name,count(*) from t3 where name='cloakroom' and price2=823742 group by name;
+name count(*)
+cloakroom 4181
+select name,count(*) from t3 where name >= "extramarital" and price <= 39654943 group by name;
+name count(*)
+extramarital 4181
+gazer 4181
+gems 4181
+Iranizes 4181
+spates 4181
+tucked 4181
+violinist 4181
+select t2.fld3,count(*) from t2,t3 where t2.fld1=158402 and t3.name=t2.fld3 group by t3.name;
+fld3 count(*)
+spates 4181
+select companynr|0,companyname from t4 group by 1;
+companynr|0 companyname
+0 Unknown
+29 company 1
+34 company 2
+36 company 3
+37 company 4
+40 company 5
+41 company 6
+50 company 11
+53 company 7
+58 company 8
+65 company 9
+68 company 10
+select t2.companynr,companyname,count(*) from t2,t4 where t2.companynr=t4.companynr group by t2.companynr order by companyname;
+companynr companyname count(*)
+29 company 1 95
+68 company 10 12
+50 company 11 11
+34 company 2 70
+36 company 3 215
+37 company 4 588
+40 company 5 37
+41 company 6 52
+53 company 7 4
+58 company 8 23
+65 company 9 10
+00 Unknown 82
+select t2.fld1,count(*) from t2,t3 where t2.fld1=158402 and t3.name=t2.fld3 group by t3.name;
+fld1 count(*)
+158402 4181
+select sum(Period)/count(*) from t1;
+sum(Period)/count(*)
+9410.0000
+select companynr,count(price) as "count",sum(price) as "sum" ,abs(sum(price)/count(price)-avg(price)) as "diff",(0+count(price))*companynr as func from t3 group by companynr;
+companynr count sum diff func
+37 12543 309394878010 0.0000 464091
+78 8362 414611089292 0.0000 652236
+101 4181 3489454238 0.0000 422281
+154 4181 4112197254950 0.0000 643874
+311 4181 979599938 0.0000 1300291
+447 4181 9929180954 0.0000 1868907
+512 4181 3288532102 0.0000 2140672
+select companynr,sum(price)/count(price) as avg from t3 group by companynr having avg > 70000000 order by avg;
+companynr avg
+154 983543950.0000
+select companynr,count(*) from t2 group by companynr order by 2 desc;
+companynr count(*)
+37 588
+36 215
+29 95
+00 82
+34 70
+41 52
+40 37
+58 23
+68 12
+50 11
+65 10
+53 4
+select companynr,count(*) from t2 where companynr > 40 group by companynr order by 2 desc;
+companynr count(*)
+41 52
+58 23
+68 12
+50 11
+65 10
+53 4
+select t2.fld4,t2.fld1,count(price),sum(price),min(price),max(price),avg(price) from t3,t2 where t3.companynr = 37 and t2.fld1 = t3.t2nr group by fld1,t2.fld4;
+fld4 fld1 count(price) sum(price) min(price) max(price) avg(price)
+teethe 000001 1 5987435 5987435 5987435 5987435.0000
+dreaded 011401 1 5987435 5987435 5987435 5987435.0000
+scholastics 011402 1 28357832 28357832 28357832 28357832.0000
+audiology 011403 1 39654943 39654943 39654943 39654943.0000
+wallet 011501 1 5987435 5987435 5987435 5987435.0000
+parters 011701 1 5987435 5987435 5987435 5987435.0000
+eschew 011702 1 28357832 28357832 28357832 28357832.0000
+quitter 011703 1 39654943 39654943 39654943 39654943.0000
+neat 012001 1 5987435 5987435 5987435 5987435.0000
+Steinberg 012003 1 39654943 39654943 39654943 39654943.0000
+balled 012301 1 5987435 5987435 5987435 5987435.0000
+persist 012302 1 28357832 28357832 28357832 28357832.0000
+attainments 012303 1 39654943 39654943 39654943 39654943.0000
+capably 012501 1 5987435 5987435 5987435 5987435.0000
+impulsive 012602 1 28357832 28357832 28357832 28357832.0000
+starlet 012603 1 39654943 39654943 39654943 39654943.0000
+featherweight 012701 1 5987435 5987435 5987435 5987435.0000
+pessimist 012702 1 28357832 28357832 28357832 28357832.0000
+daughter 012703 1 39654943 39654943 39654943 39654943.0000
+lawgiver 013601 1 5987435 5987435 5987435 5987435.0000
+stated 013602 1 28357832 28357832 28357832 28357832.0000
+readable 013603 1 39654943 39654943 39654943 39654943.0000
+testicle 013801 1 5987435 5987435 5987435 5987435.0000
+Parsifal 013802 1 28357832 28357832 28357832 28357832.0000
+leavings 013803 1 39654943 39654943 39654943 39654943.0000
+squeaking 013901 1 5987435 5987435 5987435 5987435.0000
+contrasted 016001 1 5987435 5987435 5987435 5987435.0000
+leftover 016201 1 5987435 5987435 5987435 5987435.0000
+whiteners 016202 1 28357832 28357832 28357832 28357832.0000
+erases 016301 1 5987435 5987435 5987435 5987435.0000
+Punjab 016302 1 28357832 28357832 28357832 28357832.0000
+Merritt 016303 1 39654943 39654943 39654943 39654943.0000
+sweetish 018001 1 5987435 5987435 5987435 5987435.0000
+dogging 018002 1 28357832 28357832 28357832 28357832.0000
+scornfully 018003 1 39654943 39654943 39654943 39654943.0000
+fetters 018012 1 28357832 28357832 28357832 28357832.0000
+bivalves 018013 1 39654943 39654943 39654943 39654943.0000
+skulking 018021 1 5987435 5987435 5987435 5987435.0000
+flint 018022 1 28357832 28357832 28357832 28357832.0000
+flopping 018023 1 39654943 39654943 39654943 39654943.0000
+Judas 018032 1 28357832 28357832 28357832 28357832.0000
+vacuuming 018033 1 39654943 39654943 39654943 39654943.0000
+medical 018041 1 5987435 5987435 5987435 5987435.0000
+bloodbath 018042 1 28357832 28357832 28357832 28357832.0000
+subschema 018043 1 39654943 39654943 39654943 39654943.0000
+interdependent 018051 1 5987435 5987435 5987435 5987435.0000
+Graves 018052 1 28357832 28357832 28357832 28357832.0000
+neonatal 018053 1 39654943 39654943 39654943 39654943.0000
+sorters 018061 1 5987435 5987435 5987435 5987435.0000
+epistle 018062 1 28357832 28357832 28357832 28357832.0000
+Conley 018101 1 5987435 5987435 5987435 5987435.0000
+lectured 018102 1 28357832 28357832 28357832 28357832.0000
+Abraham 018103 1 39654943 39654943 39654943 39654943.0000
+cage 018201 1 5987435 5987435 5987435 5987435.0000
+hushes 018202 1 28357832 28357832 28357832 28357832.0000
+Simla 018402 1 28357832 28357832 28357832 28357832.0000
+reporters 018403 1 39654943 39654943 39654943 39654943.0000
+coexist 018601 1 5987435 5987435 5987435 5987435.0000
+Beebe 018602 1 28357832 28357832 28357832 28357832.0000
+Taoism 018603 1 39654943 39654943 39654943 39654943.0000
+Connally 018801 1 5987435 5987435 5987435 5987435.0000
+fetched 018802 1 28357832 28357832 28357832 28357832.0000
+checkpoints 018803 1 39654943 39654943 39654943 39654943.0000
+gritty 018811 1 5987435 5987435 5987435 5987435.0000
+firearm 018812 1 28357832 28357832 28357832 28357832.0000
+minima 019101 1 5987435 5987435 5987435 5987435.0000
+Selfridge 019102 1 28357832 28357832 28357832 28357832.0000
+disable 019103 1 39654943 39654943 39654943 39654943.0000
+witchcraft 019201 1 5987435 5987435 5987435 5987435.0000
+betroth 030501 1 5987435 5987435 5987435 5987435.0000
+Manhattanize 030502 1 28357832 28357832 28357832 28357832.0000
+imprint 030503 1 39654943 39654943 39654943 39654943.0000
+swelling 031901 1 5987435 5987435 5987435 5987435.0000
+interrelationships 036001 1 5987435 5987435 5987435 5987435.0000
+riser 036002 1 28357832 28357832 28357832 28357832.0000
+bee 038001 1 5987435 5987435 5987435 5987435.0000
+kanji 038002 1 28357832 28357832 28357832 28357832.0000
+dental 038003 1 39654943 39654943 39654943 39654943.0000
+railway 038011 1 5987435 5987435 5987435 5987435.0000
+validate 038012 1 28357832 28357832 28357832 28357832.0000
+normalizes 038013 1 39654943 39654943 39654943 39654943.0000
+Kline 038101 1 5987435 5987435 5987435 5987435.0000
+Anatole 038102 1 28357832 28357832 28357832 28357832.0000
+partridges 038103 1 39654943 39654943 39654943 39654943.0000
+recruited 038201 1 5987435 5987435 5987435 5987435.0000
+dimensions 038202 1 28357832 28357832 28357832 28357832.0000
+Chicana 038203 1 39654943 39654943 39654943 39654943.0000
+select t3.companynr,fld3,sum(price) from t3,t2 where t2.fld1 = t3.t2nr and t3.companynr = 512 group by companynr,fld3;
+companynr fld3 sum(price)
+512 boat 786542
+512 capably 786542
+512 cupboard 786542
+512 decliner 786542
+512 descendants 786542
+512 dopers 786542
+512 erases 786542
+512 Micronesia 786542
+512 Miles 786542
+512 skies 786542
+select t2.companynr,count(*),min(fld3),max(fld3),sum(price),avg(price) from t2,t3 where t3.companynr >= 30 and t3.companynr <= 58 and t3.t2nr = t2.fld1 and 1+1=2 group by t2.companynr;
+companynr count(*) min(fld3) max(fld3) sum(price) avg(price)
+00 1 Omaha Omaha 5987435 5987435.0000
+36 1 dubbed dubbed 28357832 28357832.0000
+37 83 Abraham Wotan 1908978016 22999735.1325
+50 2 scribbled tapestry 68012775 34006387.5000
+select t3.companynr+0,t3.t2nr,fld3,sum(price) from t3,t2 where t2.fld1 = t3.t2nr and t3.companynr = 37 group by 1,t3.t2nr,fld3,fld3,fld3,fld3,fld3 order by fld1;
+t3.companynr+0 t2nr fld3 sum(price)
+37 1 Omaha 5987435
+37 11401 breaking 5987435
+37 11402 Romans 28357832
+37 11403 intercepted 39654943
+37 11501 bewilderingly 5987435
+37 11701 astound 5987435
+37 11702 admonishing 28357832
+37 11703 sumac 39654943
+37 12001 flanking 5987435
+37 12003 combed 39654943
+37 12301 Eulerian 5987435
+37 12302 dubbed 28357832
+37 12303 Kane 39654943
+37 12501 annihilates 5987435
+37 12602 Wotan 28357832
+37 12603 snatching 39654943
+37 12701 grazing 5987435
+37 12702 Baird 28357832
+37 12703 celery 39654943
+37 13601 handgun 5987435
+37 13602 foldout 28357832
+37 13603 mystic 39654943
+37 13801 intelligibility 5987435
+37 13802 Augustine 28357832
+37 13803 teethe 39654943
+37 13901 scholastics 5987435
+37 16001 audiology 5987435
+37 16201 wallet 5987435
+37 16202 parters 28357832
+37 16301 eschew 5987435
+37 16302 quitter 28357832
+37 16303 neat 39654943
+37 18001 jarring 5987435
+37 18002 tinily 28357832
+37 18003 balled 39654943
+37 18012 impulsive 28357832
+37 18013 starlet 39654943
+37 18021 lawgiver 5987435
+37 18022 stated 28357832
+37 18023 readable 39654943
+37 18032 testicle 28357832
+37 18033 Parsifal 39654943
+37 18041 Punjab 5987435
+37 18042 Merritt 28357832
+37 18043 Quixotism 39654943
+37 18051 sureties 5987435
+37 18052 puddings 28357832
+37 18053 tapestry 39654943
+37 18061 trimmings 5987435
+37 18062 humility 28357832
+37 18101 tragedies 5987435
+37 18102 skulking 28357832
+37 18103 flint 39654943
+37 18201 relaxing 5987435
+37 18202 offload 28357832
+37 18402 suites 28357832
+37 18403 lists 39654943
+37 18601 vacuuming 5987435
+37 18602 dentally 28357832
+37 18603 humanness 39654943
+37 18801 inch 5987435
+37 18802 Weissmuller 28357832
+37 18803 irresponsibly 39654943
+37 18811 repetitions 5987435
+37 18812 Antares 28357832
+37 19101 ventilate 5987435
+37 19102 pityingly 28357832
+37 19103 interdependent 39654943
+37 19201 Graves 5987435
+37 30501 neonatal 5987435
+37 30502 scribbled 28357832
+37 30503 chafe 39654943
+37 31901 realtor 5987435
+37 36001 elite 5987435
+37 36002 funereal 28357832
+37 38001 Conley 5987435
+37 38002 lectured 28357832
+37 38003 Abraham 39654943
+37 38011 groupings 5987435
+37 38012 dissociate 28357832
+37 38013 coexist 39654943
+37 38101 rusting 5987435
+37 38102 galling 28357832
+37 38103 obliterates 39654943
+37 38201 resumes 5987435
+37 38202 analyzable 28357832
+37 38203 terminator 39654943
+select sum(price) from t3,t2 where t2.fld1 = t3.t2nr and t3.companynr = 512 and t3.t2nr = 38008 and t2.fld1 = 38008 or t2.fld1= t3.t2nr and t3.t2nr = 38008 and t2.fld1 = 38008;
+sum(price)
+234298
+select t2.fld1,sum(price) from t3,t2 where t2.fld1 = t3.t2nr and t3.companynr = 512 and t3.t2nr = 38008 and t2.fld1 = 38008 or t2.fld1 = t3.t2nr and t3.t2nr = 38008 and t2.fld1 = 38008 or t3.t2nr = t2.fld1 and t2.fld1 = 38008 group by t2.fld1;
+fld1 sum(price)
+038008 234298
+explain select fld3 from t2 where 1>2 or 2>3;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE NULL NULL NULL NULL NULL NULL NULL Impossible WHERE
+explain select fld3 from t2 where fld1=fld1;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ALL NULL NULL NULL NULL 1199
+select companynr,fld1 from t2 HAVING fld1=250501 or fld1=250502;
+companynr fld1
+34 250501
+34 250502
+select companynr,fld1 from t2 WHERE fld1>=250501 HAVING fld1<=250502;
+companynr fld1
+34 250501
+34 250502
+select companynr,count(*) as count,sum(fld1) as sum from t2 group by companynr having count > 40 and sum/count >= 120000;
+companynr count sum
+00 82 10355753
+29 95 14473298
+34 70 17788966
+37 588 83602098
+41 52 12816335
+select companynr from t2 group by companynr having count(*) > 40 and sum(fld1)/count(*) >= 120000 ;
+companynr
+00
+29
+34
+37
+41
+select t2.companynr,companyname,count(*) from t2,t4 where t2.companynr=t4.companynr group by companyname having t2.companynr >= 40;
+companynr companyname count(*)
+68 company 10 12
+50 company 11 11
+40 company 5 37
+41 company 6 52
+53 company 7 4
+58 company 8 23
+65 company 9 10
+select count(*) from t2;
+count(*)
+1199
+select count(*) from t2 where fld1 < 098024;
+count(*)
+387
+select min(fld1) from t2 where fld1>= 098024;
+min(fld1)
+98024
+select max(fld1) from t2 where fld1>= 098024;
+max(fld1)
+1232609
+select count(*) from t3 where price2=76234234;
+count(*)
+4181
+select count(*) from t3 where companynr=512 and price2=76234234;
+count(*)
+4181
+explain select min(fld1),max(fld1),count(*) from t2;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE NULL NULL NULL NULL NULL NULL NULL Select tables optimized away
+select min(fld1),max(fld1),count(*) from t2;
+min(fld1) max(fld1) count(*)
+0 1232609 1199
+select min(t2nr),max(t2nr) from t3 where t2nr=2115 and price2=823742;
+min(t2nr) max(t2nr)
+2115 2115
+select count(*),min(t2nr),max(t2nr) from t3 where name='spates' and companynr=78;
+count(*) min(t2nr) max(t2nr)
+4181 4 41804
+select t2nr,count(*) from t3 where name='gems' group by t2nr limit 20;
+t2nr count(*)
+9 1
+19 1
+29 1
+39 1
+49 1
+59 1
+69 1
+79 1
+89 1
+99 1
+109 1
+119 1
+129 1
+139 1
+149 1
+159 1
+169 1
+179 1
+189 1
+199 1
+select max(t2nr) from t3 where price=983543950;
+max(t2nr)
+41807
+select t1.period from t3 = t1 limit 1;
+period
+1001
+select t1.period from t1 as t1 limit 1;
+period
+9410
+select t1.period as "Nuvarande period" from t1 as t1 limit 1;
+Nuvarande period
+9410
+select period as ok_period from t1 limit 1;
+ok_period
+9410
+select period as ok_period from t1 group by ok_period limit 1;
+ok_period
+9410
+select 1+1 as summa from t1 group by summa limit 1;
+summa
+2
+select period as "Nuvarande period" from t1 group by "Nuvarande period" limit 1;
+Nuvarande period
+9410
+show tables;
+Tables_in_test
+t1
+t2
+t3
+t4
+show tables from test like "s%";
+Tables_in_test (s%)
+show tables from test like "t?";
+Tables_in_test (t?)
+show full columns from t2;
+Field Type Collation Null Key Default Extra Privileges Comment
+auto int(11) NULL NO PRI NULL auto_increment #
+fld1 int(6) unsigned zerofill NULL NO UNI 000000 #
+companynr tinyint(2) unsigned zerofill NULL NO 00 #
+fld3 char(30) latin1_swedish_ci NO MUL #
+fld4 char(35) latin1_swedish_ci NO #
+fld5 char(35) latin1_swedish_ci NO #
+fld6 char(4) latin1_swedish_ci NO #
+show full columns from t2 from test like 'f%';
+Field Type Collation Null Key Default Extra Privileges Comment
+fld1 int(6) unsigned zerofill NULL NO UNI 000000 #
+fld3 char(30) latin1_swedish_ci NO MUL #
+fld4 char(35) latin1_swedish_ci NO #
+fld5 char(35) latin1_swedish_ci NO #
+fld6 char(4) latin1_swedish_ci NO #
+show full columns from t2 from test like 's%';
+Field Type Collation Null Key Default Extra Privileges Comment
+show keys from t2;
+Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment
+t2 0 PRIMARY 1 auto A 1199 NULL NULL BTREE
+t2 0 fld1 1 fld1 A 1199 NULL NULL BTREE
+t2 1 fld3 1 fld3 A NULL NULL NULL BTREE
+drop table t4, t3, t2, t1;
+DO 1;
+DO benchmark(100,1+1),1,1;
+do default;
+ERROR 42000: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1
+do foobar;
+ERROR 42S22: Unknown column 'foobar' in 'field list'
+CREATE TABLE t1 (
+id mediumint(8) unsigned NOT NULL auto_increment,
+pseudo varchar(35) NOT NULL default '',
+PRIMARY KEY (id),
+UNIQUE KEY pseudo (pseudo)
+);
+INSERT INTO t1 (pseudo) VALUES ('test');
+INSERT INTO t1 (pseudo) VALUES ('test1');
+SELECT 1 as rnd1 from t1 where rand() > 2;
+rnd1
+DROP TABLE t1;
+CREATE TABLE t1 (gvid int(10) unsigned default NULL, hmid int(10) unsigned default NULL, volid int(10) unsigned default NULL, mmid int(10) unsigned default NULL, hdid int(10) unsigned default NULL, fsid int(10) unsigned default NULL, ctid int(10) unsigned default NULL, dtid int(10) unsigned default NULL, cost int(10) unsigned default NULL, performance int(10) unsigned default NULL, serialnumber bigint(20) unsigned default NULL, monitored tinyint(3) unsigned default '1', removed tinyint(3) unsigned default '0', target tinyint(3) unsigned default '0', dt_modified timestamp NOT NULL, name varchar(255) binary default NULL, description varchar(255) default NULL, UNIQUE KEY hmid (hmid,volid)) ENGINE=MyISAM;
+INSERT INTO t1 VALUES (200001,2,1,1,100,1,1,1,0,0,0,1,0,1,20020425060057,'\\\\ARKIVIO-TESTPDC\\E$',''),(200002,2,2,1,101,1,1,1,0,0,0,1,0,1,20020425060057,'\\\\ARKIVIO-TESTPDC\\C$',''),(200003,1,3,2,NULL,NULL,NULL,NULL,NULL,NULL,NULL,1,0,1,20020425060427,'c:',NULL);
+CREATE TABLE t2 ( hmid int(10) unsigned default NULL, volid int(10) unsigned default NULL, sampletid smallint(5) unsigned default NULL, sampletime datetime default NULL, samplevalue bigint(20) unsigned default NULL, KEY idx1 (hmid,volid,sampletid,sampletime)) ENGINE=MyISAM;
+INSERT INTO t2 VALUES (1,3,10,'2002-06-01 08:00:00',35),(1,3,1010,'2002-06-01 12:00:01',35);
+SELECT a.gvid, (SUM(CASE b.sampletid WHEN 140 THEN b.samplevalue ELSE 0 END)) as the_success,(SUM(CASE b.sampletid WHEN 141 THEN b.samplevalue ELSE 0 END)) as the_fail,(SUM(CASE b.sampletid WHEN 142 THEN b.samplevalue ELSE 0 END)) as the_size,(SUM(CASE b.sampletid WHEN 143 THEN b.samplevalue ELSE 0 END)) as the_time FROM t1 a, t2 b WHERE a.hmid = b.hmid AND a.volid = b.volid AND b.sampletime >= 'wrong-date-value' AND b.sampletime < 'wrong-date-value' AND b.sampletid IN (140, 141, 142, 143) GROUP BY a.gvid;
+gvid the_success the_fail the_size the_time
+Warnings:
+Warning 1292 Incorrect datetime value: 'wrong-date-value' for column 'sampletime' at row 1
+Warning 1292 Incorrect datetime value: 'wrong-date-value' for column 'sampletime' at row 1
+SELECT a.gvid, (SUM(CASE b.sampletid WHEN 140 THEN b.samplevalue ELSE 0 END)) as the_success,(SUM(CASE b.sampletid WHEN 141 THEN b.samplevalue ELSE 0 END)) as the_fail,(SUM(CASE b.sampletid WHEN 142 THEN b.samplevalue ELSE 0 END)) as the_size,(SUM(CASE b.sampletid WHEN 143 THEN b.samplevalue ELSE 0 END)) as the_time FROM t1 a, t2 b WHERE a.hmid = b.hmid AND a.volid = b.volid AND b.sampletime >= NULL AND b.sampletime < NULL AND b.sampletid IN (140, 141, 142, 143) GROUP BY a.gvid;
+gvid the_success the_fail the_size the_time
+DROP TABLE t1,t2;
+create table t1 ( A_Id bigint(20) NOT NULL default '0', A_UpdateBy char(10) NOT NULL default '', A_UpdateDate bigint(20) NOT NULL default '0', A_UpdateSerial int(11) NOT NULL default '0', other_types bigint(20) NOT NULL default '0', wss_type bigint(20) NOT NULL default '0');
+INSERT INTO t1 VALUES (102935998719055004,'brade',1029359987,2,102935229116544068,102935229216544093);
+select wss_type from t1 where wss_type ='102935229216544106';
+wss_type
+select wss_type from t1 where wss_type ='102935229216544105';
+wss_type
+select wss_type from t1 where wss_type ='102935229216544104';
+wss_type
+select wss_type from t1 where wss_type ='102935229216544093';
+wss_type
+102935229216544093
+select wss_type from t1 where wss_type =102935229216544093;
+wss_type
+102935229216544093
+drop table t1;
+select 1+2,"aaaa",3.13*2.0 into @a,@b,@c;
+select @a;
+@a
+3
+select @b;
+@b
+aaaa
+select @c;
+@c
+6.260
+create table t1 (a int not null auto_increment primary key);
+insert into t1 values ();
+insert into t1 values ();
+insert into t1 values ();
+select * from (t1 as t2 left join t1 as t3 using (a)), t1;
+a a
+1 1
+2 1
+3 1
+1 2
+2 2
+3 2
+1 3
+2 3
+3 3
+select * from t1, (t1 as t2 left join t1 as t3 using (a));
+a a
+1 1
+2 1
+3 1
+1 2
+2 2
+3 2
+1 3
+2 3
+3 3
+select * from (t1 as t2 left join t1 as t3 using (a)) straight_join t1;
+a a
+1 1
+2 1
+3 1
+1 2
+2 2
+3 2
+1 3
+2 3
+3 3
+select * from t1 straight_join (t1 as t2 left join t1 as t3 using (a));
+a a
+1 1
+2 1
+3 1
+1 2
+2 2
+3 2
+1 3
+2 3
+3 3
+select * from (t1 as t2 left join t1 as t3 using (a)) inner join t1 on t1.a>1;
+a a
+1 2
+2 2
+3 2
+1 3
+2 3
+3 3
+select * from t1 inner join (t1 as t2 left join t1 as t3 using (a)) on t1.a>1;
+a a
+2 1
+3 1
+2 2
+3 2
+2 3
+3 3
+select * from (t1 as t2 left join t1 as t3 using (a)) inner join t1 using ( a );
+a
+1
+2
+3
+select * from t1 inner join (t1 as t2 left join t1 as t3 using (a)) using ( a );
+a
+1
+2
+3
+select * from (t1 as t2 left join t1 as t3 using (a)) left outer join t1 on t1.a>1;
+a a
+1 2
+2 2
+3 2
+1 3
+2 3
+3 3
+select * from t1 left outer join (t1 as t2 left join t1 as t3 using (a)) on t1.a>1;
+a a
+2 1
+3 1
+2 2
+3 2
+2 3
+3 3
+1 NULL
+select * from (t1 as t2 left join t1 as t3 using (a)) left join t1 using ( a );
+a
+1
+2
+3
+select * from t1 left join (t1 as t2 left join t1 as t3 using (a)) using ( a );
+a
+1
+2
+3
+select * from (t1 as t2 left join t1 as t3 using (a)) natural left join t1;
+a
+1
+2
+3
+select * from t1 natural left join (t1 as t2 left join t1 as t3 using (a));
+a
+1
+2
+3
+select * from (t1 as t2 left join t1 as t3 using (a)) right join t1 on t1.a>1;
+a a
+1 2
+1 3
+2 2
+2 3
+3 2
+3 3
+NULL 1
+select * from t1 right join (t1 as t2 left join t1 as t3 using (a)) on t1.a>1;
+a a
+2 1
+2 2
+2 3
+3 1
+3 2
+3 3
+select * from (t1 as t2 left join t1 as t3 using (a)) right outer join t1 using ( a );
+a
+1
+2
+3
+select * from t1 right outer join (t1 as t2 left join t1 as t3 using (a)) using ( a );
+a
+1
+2
+3
+select * from (t1 as t2 left join t1 as t3 using (a)) natural right join t1;
+a
+1
+2
+3
+select * from t1 natural right join (t1 as t2 left join t1 as t3 using (a));
+a
+1
+2
+3
+select * from t1 natural join (t1 as t2 left join t1 as t3 using (a));
+a
+1
+2
+3
+select * from (t1 as t2 left join t1 as t3 using (a)) natural join t1;
+a
+1
+2
+3
+drop table t1;
+CREATE TABLE t1 ( aa char(2), id int(11) NOT NULL auto_increment, t2_id int(11) NOT NULL default '0', PRIMARY KEY (id), KEY replace_id (t2_id)) ENGINE=MyISAM;
+INSERT INTO t1 VALUES ("1",8264,2506),("2",8299,2517),("3",8301,2518),("4",8302,2519),("5",8303,2520),("6",8304,2521),("7",8305,2522);
+CREATE TABLE t2 ( id int(11) NOT NULL auto_increment, PRIMARY KEY (id)) ENGINE=MyISAM;
+INSERT INTO t2 VALUES (2517), (2518), (2519), (2520), (2521), (2522);
+select * from t1, t2 WHERE t1.t2_id = t2.id and t1.t2_id > 0 order by t1.id LIMIT 0, 5;
+aa id t2_id id
+2 8299 2517 2517
+3 8301 2518 2518
+4 8302 2519 2519
+5 8303 2520 2520
+6 8304 2521 2521
+drop table t1,t2;
+create table t1 (id1 int NOT NULL);
+create table t2 (id2 int NOT NULL);
+create table t3 (id3 int NOT NULL);
+create table t4 (id4 int NOT NULL, id44 int NOT NULL, KEY (id4));
+insert into t1 values (1);
+insert into t1 values (2);
+insert into t2 values (1);
+insert into t4 values (1,1);
+explain select * from t1 left join t2 on id1 = id2 left join t3 on id1 = id3
+left join t4 on id3 = id4 where id2 = 1 or id4 = 1;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t3 system NULL NULL NULL NULL 0 const row not found
+1 SIMPLE t4 const id4 NULL NULL NULL 1
+1 SIMPLE t1 ALL NULL NULL NULL NULL 2
+1 SIMPLE t2 ALL NULL NULL NULL NULL 1 Using where; Using join buffer
+select * from t1 left join t2 on id1 = id2 left join t3 on id1 = id3
+left join t4 on id3 = id4 where id2 = 1 or id4 = 1;
+id1 id2 id3 id4 id44
+1 1 NULL NULL NULL
+drop table t1,t2,t3,t4;
+create table t1(s varchar(10) not null);
+create table t2(s varchar(10) not null primary key);
+create table t3(s varchar(10) not null primary key);
+insert into t1 values ('one\t'), ('two\t');
+insert into t2 values ('one\r'), ('two\t');
+insert into t3 values ('one '), ('two\t');
+select * from t1 where s = 'one';
+s
+select * from t2 where s = 'one';
+s
+select * from t3 where s = 'one';
+s
+one
+select * from t1,t2 where t1.s = t2.s;
+s s
+two two
+select * from t2,t3 where t2.s = t3.s;
+s s
+two two
+drop table t1, t2, t3;
+create table t1 (a integer, b integer, index(a), index(b));
+create table t2 (c integer, d integer, index(c), index(d));
+insert into t1 values (1,2), (2,2), (3,2), (4,2);
+insert into t2 values (1,3), (2,3), (3,4), (4,4);
+explain select * from t1 left join t2 on a=c where d in (4);
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ref c,d d 5 const 2
+1 SIMPLE t1 ALL a NULL NULL NULL 4 Using where; Using join buffer
+select * from t1 left join t2 on a=c where d in (4);
+a b c d
+3 2 3 4
+4 2 4 4
+explain select * from t1 left join t2 on a=c where d = 4;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ref c,d d 5 const 2
+1 SIMPLE t1 ALL a NULL NULL NULL 4 Using where; Using join buffer
+select * from t1 left join t2 on a=c where d = 4;
+a b c d
+3 2 3 4
+4 2 4 4
+drop table t1, t2;
+CREATE TABLE t1 (
+i int(11) NOT NULL default '0',
+c char(10) NOT NULL default '',
+PRIMARY KEY (i),
+UNIQUE KEY c (c)
+) ENGINE=MyISAM;
+INSERT INTO t1 VALUES (1,'a');
+INSERT INTO t1 VALUES (2,'b');
+INSERT INTO t1 VALUES (3,'c');
+EXPLAIN SELECT i FROM t1 WHERE i=1;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 const PRIMARY PRIMARY 4 const 1 Using index
+DROP TABLE t1;
+CREATE TABLE t1 ( a BLOB, INDEX (a(20)) );
+CREATE TABLE t2 ( a BLOB, INDEX (a(20)) );
+INSERT INTO t1 VALUES ('one'),('two'),('three'),('four'),('five');
+INSERT INTO t2 VALUES ('one'),('two'),('three'),('four'),('five');
+EXPLAIN SELECT * FROM t1 LEFT JOIN t2 USE INDEX (a) ON t1.a=t2.a;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 5
+1 SIMPLE t2 ref a a 23 test.t1.a 2 Using where
+EXPLAIN SELECT * FROM t1 LEFT JOIN t2 FORCE INDEX (a) ON t1.a=t2.a;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 5
+1 SIMPLE t2 ref a a 23 test.t1.a 2 Using where
+DROP TABLE t1, t2;
+CREATE TABLE t1 ( city char(30) );
+INSERT INTO t1 VALUES ('London');
+INSERT INTO t1 VALUES ('Paris');
+SELECT * FROM t1 WHERE city='London';
+city
+London
+SELECT * FROM t1 WHERE city='london';
+city
+London
+EXPLAIN SELECT * FROM t1 WHERE city='London' AND city='london';
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 2 Using where
+SELECT * FROM t1 WHERE city='London' AND city='london';
+city
+London
+EXPLAIN SELECT * FROM t1 WHERE city LIKE '%london%' AND city='London';
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 2 Using where
+SELECT * FROM t1 WHERE city LIKE '%london%' AND city='London';
+city
+London
+DROP TABLE t1;
+create table t1 (a int(11) unsigned, b int(11) unsigned);
+insert into t1 values (1,0), (1,1), (1,2);
+select a-b from t1 order by 1;
+a-b
+0
+1
+18446744073709551615
+select a-b , (a-b < 0) from t1 order by 1;
+a-b (a-b < 0)
+0 0
+1 0
+18446744073709551615 0
+select a-b as d, (a-b >= 0), b from t1 group by b having d >= 0;
+d (a-b >= 0) b
+1 1 0
+0 1 1
+18446744073709551615 1 2
+select cast((a - b) as unsigned) from t1 order by 1;
+cast((a - b) as unsigned)
+0
+1
+18446744073709551615
+drop table t1;
+create table t1 (a int(11));
+select all all * from t1;
+a
+select distinct distinct * from t1;
+a
+select all distinct * from t1;
+ERROR HY000: Incorrect usage of ALL and DISTINCT
+select distinct all * from t1;
+ERROR HY000: Incorrect usage of ALL and DISTINCT
+drop table t1;
+CREATE TABLE t1 (
+kunde_intern_id int(10) unsigned NOT NULL default '0',
+kunde_id int(10) unsigned NOT NULL default '0',
+FK_firma_id int(10) unsigned NOT NULL default '0',
+aktuell enum('Ja','Nein') NOT NULL default 'Ja',
+vorname varchar(128) NOT NULL default '',
+nachname varchar(128) NOT NULL default '',
+geloescht enum('Ja','Nein') NOT NULL default 'Nein',
+firma varchar(128) NOT NULL default ''
+);
+INSERT INTO t1 VALUES
+(3964,3051,1,'Ja','Vorname1','1Nachname','Nein','Print Schau XXXX'),
+(3965,3051111,1,'Ja','Vorname1111','1111Nachname','Nein','Print Schau XXXX');
+SELECT kunde_id ,FK_firma_id ,aktuell, vorname, nachname, geloescht FROM t1
+WHERE
+(
+(
+( '' != '' AND firma LIKE CONCAT('%', '', '%'))
+OR
+(vorname LIKE CONCAT('%', 'Vorname1', '%') AND
+nachname LIKE CONCAT('%', '1Nachname', '%') AND
+'Vorname1' != '' AND 'xxxx' != '')
+)
+AND
+(
+aktuell = 'Ja' AND geloescht = 'Nein' AND FK_firma_id = 2
+)
+)
+;
+kunde_id FK_firma_id aktuell vorname nachname geloescht
+SELECT kunde_id ,FK_firma_id ,aktuell, vorname, nachname,
+geloescht FROM t1
+WHERE
+(
+(
+aktuell = 'Ja' AND geloescht = 'Nein' AND FK_firma_id = 2
+)
+AND
+(
+( '' != '' AND firma LIKE CONCAT('%', '', '%') )
+OR
+( vorname LIKE CONCAT('%', 'Vorname1', '%') AND
+nachname LIKE CONCAT('%', '1Nachname', '%') AND 'Vorname1' != '' AND
+'xxxx' != '')
+)
+)
+;
+kunde_id FK_firma_id aktuell vorname nachname geloescht
+SELECT COUNT(*) FROM t1 WHERE
+( 0 OR (vorname LIKE '%Vorname1%' AND nachname LIKE '%1Nachname%' AND 1))
+AND FK_firma_id = 2;
+COUNT(*)
+0
+drop table t1;
+CREATE TABLE t1 (b BIGINT(20) UNSIGNED NOT NULL, PRIMARY KEY (b));
+INSERT INTO t1 VALUES (0x8000000000000000);
+SELECT b FROM t1 WHERE b=0x8000000000000000;
+b
+9223372036854775808
+DROP TABLE t1;
+CREATE TABLE `t1` ( `gid` int(11) default NULL, `uid` int(11) default NULL);
+CREATE TABLE `t2` ( `ident` int(11) default NULL, `level` char(16) default NULL);
+INSERT INTO `t2` VALUES (0,'READ');
+CREATE TABLE `t3` ( `id` int(11) default NULL, `name` char(16) default NULL);
+INSERT INTO `t3` VALUES (1,'fs');
+select * from t3 left join t1 on t3.id = t1.uid, t2 where t2.ident in (0, t1.gid, t3.id, 0);
+id name gid uid ident level
+1 fs NULL NULL 0 READ
+drop table t1,t2,t3;
+CREATE TABLE t1 (
+acct_id int(11) NOT NULL default '0',
+profile_id smallint(6) default NULL,
+UNIQUE KEY t1$acct_id (acct_id),
+KEY t1$profile_id (profile_id)
+);
+INSERT INTO t1 VALUES (132,17),(133,18);
+CREATE TABLE t2 (
+profile_id smallint(6) default NULL,
+queue_id int(11) default NULL,
+seq int(11) default NULL,
+KEY t2$queue_id (queue_id)
+);
+INSERT INTO t2 VALUES (17,31,4),(17,30,3),(17,36,2),(17,37,1);
+CREATE TABLE t3 (
+id int(11) NOT NULL default '0',
+qtype int(11) default NULL,
+seq int(11) default NULL,
+warn_lvl int(11) default NULL,
+crit_lvl int(11) default NULL,
+rr1 tinyint(4) NOT NULL default '0',
+rr2 int(11) default NULL,
+default_queue tinyint(4) NOT NULL default '0',
+KEY t3$qtype (qtype),
+KEY t3$id (id)
+);
+INSERT INTO t3 VALUES (30,1,29,NULL,NULL,0,NULL,0),(31,1,28,NULL,NULL,0,NULL,0),
+(36,1,34,NULL,NULL,0,NULL,0),(37,1,35,NULL,NULL,0,121,0);
+SELECT COUNT(*) FROM t1 a STRAIGHT_JOIN t2 pq STRAIGHT_JOIN t3 q
+WHERE
+(pq.profile_id = a.profile_id) AND (a.acct_id = 132) AND
+(pq.queue_id = q.id) AND (q.rr1 <> 1);
+COUNT(*)
+4
+drop table t1,t2,t3;
+create table t1 (f1 int);
+insert into t1 values (1),(NULL);
+create table t2 (f2 int, f3 int, f4 int);
+create index idx1 on t2 (f4);
+insert into t2 values (1,2,3),(2,4,6);
+select A.f2 from t1 left join t2 A on A.f2 = f1 where A.f3=(select min(f3)
+from t2 C where A.f4 = C.f4) or A.f3 IS NULL;
+f2
+1
+NULL
+drop table t1,t2;
+create table t2 (a tinyint unsigned);
+create index t2i on t2(a);
+insert into t2 values (0), (254), (255);
+explain select * from t2 where a > -1;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 index t2i t2i 2 NULL 3 Using where; Using index
+select * from t2 where a > -1;
+a
+0
+254
+255
+drop table t2;
+CREATE TABLE t1 (a int, b int, c int);
+INSERT INTO t1
+SELECT 50, 3, 3 FROM DUAL
+WHERE NOT EXISTS
+(SELECT * FROM t1 WHERE a = 50 AND b = 3);
+SELECT * FROM t1;
+a b c
+50 3 3
+INSERT INTO t1
+SELECT 50, 3, 3 FROM DUAL
+WHERE NOT EXISTS
+(SELECT * FROM t1 WHERE a = 50 AND b = 3);
+select found_rows();
+found_rows()
+0
+SELECT * FROM t1;
+a b c
+50 3 3
+select count(*) from t1;
+count(*)
+1
+select found_rows();
+found_rows()
+1
+select count(*) from t1 limit 2,3;
+count(*)
+select found_rows();
+found_rows()
+0
+select SQL_CALC_FOUND_ROWS count(*) from t1 limit 2,3;
+count(*)
+select found_rows();
+found_rows()
+1
+DROP TABLE t1;
+CREATE TABLE t1 (a INT, b INT);
+(SELECT a, b AS c FROM t1) ORDER BY c+1;
+a c
+(SELECT a, b AS c FROM t1) ORDER BY b+1;
+a c
+SELECT a, b AS c FROM t1 ORDER BY c+1;
+a c
+SELECT a, b AS c FROM t1 ORDER BY b+1;
+a c
+drop table t1;
+create table t1(f1 int, f2 int);
+create table t2(f3 int);
+select f1 from t1,t2 where f1=f2 and (f1,f2) = ((1,1));
+f1
+select f1 from t1,t2 where f1=f2 and (f1,NULL) = ((1,1));
+f1
+select f1 from t1,t2 where f1=f2 and (f1,f2) = ((1,NULL));
+f1
+insert into t1 values(1,1),(2,null);
+insert into t2 values(2);
+select * from t1,t2 where f1=f3 and (f1,f2) = (2,null);
+f1 f2 f3
+select * from t1,t2 where f1=f3 and (f1,f2) <=> (2,null);
+f1 f2 f3
+2 NULL 2
+drop table t1,t2;
+create table t1 (f1 int not null auto_increment primary key, f2 varchar(10));
+create table t11 like t1;
+insert into t1 values(1,""),(2,"");
+show table status like 't1%';
+Name Engine Version Row_format Rows Avg_row_length Data_length Max_data_length Index_length Data_free Auto_increment Create_time Update_time Check_time Collation Checksum Create_options Comment
+t1 MyISAM 10 Dynamic 2 20 X X X X X X X X latin1_swedish_ci NULL
+t11 MyISAM 10 Dynamic 0 0 X X X X X X X X latin1_swedish_ci NULL
+select 123 as a from t1 where f1 is null;
+a
+drop table t1,t11;
+CREATE TABLE t1 ( a INT NOT NULL, b INT NOT NULL, UNIQUE idx (a,b) );
+INSERT INTO t1 VALUES (1,1),(1,2),(1,3),(1,4);
+CREATE TABLE t2 ( a INT NOT NULL, b INT NOT NULL, e INT );
+INSERT INTO t2 VALUES ( 1,10,1), (1,10,2), (1,11,1), (1,11,2), (1,2,1), (1,2,2),(1,2,3);
+SELECT t2.a, t2.b, IF(t1.b IS NULL,'',e) AS c, COUNT(*) AS d FROM t2 LEFT JOIN
+t1 ON t2.a = t1.a AND t2.b = t1.b GROUP BY a, b, c;
+a b c d
+1 2 1 1
+1 2 2 1
+1 2 3 1
+1 10 2
+1 11 2
+SELECT t2.a, t2.b, IF(t1.b IS NULL,'',e) AS c, COUNT(*) AS d FROM t2 LEFT JOIN
+t1 ON t2.a = t1.a AND t2.b = t1.b GROUP BY t1.a, t1.b, c;
+a b c d
+1 10 4
+1 2 1 1
+1 2 2 1
+1 2 3 1
+SELECT t2.a, t2.b, IF(t1.b IS NULL,'',e) AS c, COUNT(*) AS d FROM t2 LEFT JOIN
+t1 ON t2.a = t1.a AND t2.b = t1.b GROUP BY t2.a, t2.b, c;
+a b c d
+1 2 1 1
+1 2 2 1
+1 2 3 1
+1 10 2
+1 11 2
+SELECT t2.a, t2.b, IF(t1.b IS NULL,'',e) AS c, COUNT(*) AS d FROM t2,t1
+WHERE t2.a = t1.a AND t2.b = t1.b GROUP BY a, b, c;
+a b c d
+1 2 1 1
+1 2 2 1
+1 2 3 1
+DROP TABLE IF EXISTS t1, t2;
+create table t1 (f1 int primary key, f2 int);
+create table t2 (f3 int, f4 int, primary key(f3,f4));
+insert into t1 values (1,1);
+insert into t2 values (1,1),(1,2);
+select distinct count(f2) >0 from t1 left join t2 on f1=f3 group by f1;
+count(f2) >0
+1
+drop table t1,t2;
+create table t1 (f1 int,f2 int);
+insert into t1 values(1,1);
+create table t2 (f3 int, f4 int, primary key(f3,f4));
+insert into t2 values(1,1);
+select * from t1 where f1 in (select f3 from t2 where (f3,f4)= (select f3,f4 from t2));
+f1 f2
+1 1
+drop table t1,t2;
+CREATE TABLE t1(a int, b int, c int, KEY b(b), KEY c(c));
+insert into t1 values (1,0,0),(2,0,0);
+CREATE TABLE t2 (a int, b varchar(2), c varchar(2), PRIMARY KEY(a));
+insert into t2 values (1,'',''), (2,'','');
+CREATE TABLE t3 (a int, b int, PRIMARY KEY (a,b), KEY a (a), KEY b (b));
+insert into t3 values (1,1),(1,2);
+explain select straight_join DISTINCT t2.a,t2.b, t1.c from t1, t3, t2
+where (t1.c=t2.a or (t1.c=t3.a and t2.a=t3.b)) and t1.b=556476786 and
+t2.b like '%%' order by t2.b limit 0,1;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ref b,c b 5 const 1 Using temporary; Using filesort
+1 SIMPLE t3 index PRIMARY,a,b PRIMARY 8 NULL 2 Using index; Using join buffer
+1 SIMPLE t2 ALL PRIMARY NULL NULL NULL 2 Range checked for each record (index map: 0x1)
+DROP TABLE t1,t2,t3;
+CREATE TABLE t1 (a int, INDEX idx(a));
+INSERT INTO t1 VALUES (2), (3), (1);
+EXPLAIN SELECT * FROM t1 IGNORE INDEX (idx);
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3
+EXPLAIN SELECT * FROM t1 IGNORE INDEX (a);
+ERROR 42000: Key 'a' doesn't exist in table 't1'
+EXPLAIN SELECT * FROM t1 FORCE INDEX (a);
+ERROR 42000: Key 'a' doesn't exist in table 't1'
+DROP TABLE t1;
+CREATE TABLE t1 (a int, b int);
+INSERT INTO t1 VALUES (1,1), (2,1), (4,10);
+CREATE TABLE t2 (a int PRIMARY KEY, b int, KEY b (b));
+INSERT INTO t2 VALUES (1,NULL), (2,10);
+ALTER TABLE t1 ENABLE KEYS;
+EXPLAIN SELECT STRAIGHT_JOIN SQL_NO_CACHE COUNT(*) FROM t2, t1 WHERE t1.b = t2.b OR t2.b IS NULL;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 index b b 5 NULL 2 Using index
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3 Using where; Using join buffer
+SELECT STRAIGHT_JOIN SQL_NO_CACHE * FROM t2, t1 WHERE t1.b = t2.b OR t2.b IS NULL;
+a b a b
+1 NULL 1 1
+1 NULL 2 1
+1 NULL 4 10
+2 10 4 10
+EXPLAIN SELECT STRAIGHT_JOIN SQL_NO_CACHE COUNT(*) FROM t2, t1 WHERE t1.b = t2.b OR t2.b IS NULL;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 index b b 5 NULL 2 Using index
+1 SIMPLE t1 ALL NULL NULL NULL NULL 3 Using where; Using join buffer
+SELECT STRAIGHT_JOIN SQL_NO_CACHE * FROM t2, t1 WHERE t1.b = t2.b OR t2.b IS NULL;
+a b a b
+1 NULL 1 1
+1 NULL 2 1
+1 NULL 4 10
+2 10 4 10
+DROP TABLE IF EXISTS t1,t2;
+CREATE TABLE t1 (key1 float default NULL, UNIQUE KEY key1 (key1));
+CREATE TABLE t2 (key2 float default NULL, UNIQUE KEY key2 (key2));
+INSERT INTO t1 VALUES (0.3762),(0.3845),(0.6158),(0.7941);
+INSERT INTO t2 VALUES (1.3762),(1.3845),(1.6158),(1.7941);
+explain select max(key1) from t1 where key1 <= 0.6158;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE NULL NULL NULL NULL NULL NULL NULL Select tables optimized away
+explain select max(key2) from t2 where key2 <= 1.6158;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE NULL NULL NULL NULL NULL NULL NULL Select tables optimized away
+explain select min(key1) from t1 where key1 >= 0.3762;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE NULL NULL NULL NULL NULL NULL NULL Select tables optimized away
+explain select min(key2) from t2 where key2 >= 1.3762;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE NULL NULL NULL NULL NULL NULL NULL Select tables optimized away
+explain select max(key1), min(key2) from t1, t2
+where key1 <= 0.6158 and key2 >= 1.3762;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE NULL NULL NULL NULL NULL NULL NULL Select tables optimized away
+explain select max(key1) from t1 where key1 <= 0.6158 and rand() + 0.5 >= 0.5;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE NULL NULL NULL NULL NULL NULL NULL Select tables optimized away
+explain select min(key1) from t1 where key1 >= 0.3762 and rand() + 0.5 >= 0.5;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE NULL NULL NULL NULL NULL NULL NULL Select tables optimized away
+select max(key1) from t1 where key1 <= 0.6158;
+max(key1)
+0.615800023078918
+select max(key2) from t2 where key2 <= 1.6158;
+max(key2)
+1.61580002307892
+select min(key1) from t1 where key1 >= 0.3762;
+min(key1)
+0.376199990510941
+select min(key2) from t2 where key2 >= 1.3762;
+min(key2)
+1.37619996070862
+select max(key1), min(key2) from t1, t2
+where key1 <= 0.6158 and key2 >= 1.3762;
+max(key1) min(key2)
+0.615800023078918 1.37619996070862
+select max(key1) from t1 where key1 <= 0.6158 and rand() + 0.5 >= 0.5;
+max(key1)
+0.615800023078918
+select min(key1) from t1 where key1 >= 0.3762 and rand() + 0.5 >= 0.5;
+min(key1)
+0.376199990510941
+DROP TABLE t1,t2;
+CREATE TABLE t1 (i BIGINT UNSIGNED NOT NULL);
+INSERT INTO t1 VALUES (10);
+SELECT i='1e+01',i=1e+01, i in (1e+01,1e+01), i in ('1e+01','1e+01') FROM t1;
+i='1e+01' i=1e+01 i in (1e+01,1e+01) i in ('1e+01','1e+01')
+1 1 1 1
+DROP TABLE t1;
+create table t1(a bigint unsigned, b bigint);
+insert into t1 values (0xfffffffffffffffff, 0xfffffffffffffffff),
+(0x10000000000000000, 0x10000000000000000),
+(0x8fffffffffffffff, 0x8fffffffffffffff);
+Warnings:
+Warning 1264 Out of range value for column 'a' at row 1
+Warning 1264 Out of range value for column 'b' at row 1
+Warning 1264 Out of range value for column 'a' at row 2
+Warning 1264 Out of range value for column 'b' at row 2
+Warning 1264 Out of range value for column 'b' at row 3
+select hex(a), hex(b) from t1;
+hex(a) hex(b)
+FFFFFFFFFFFFFFFF 7FFFFFFFFFFFFFFF
+FFFFFFFFFFFFFFFF 7FFFFFFFFFFFFFFF
+8FFFFFFFFFFFFFFF 7FFFFFFFFFFFFFFF
+drop table t1;
+CREATE TABLE t1 (c0 int);
+CREATE TABLE t2 (c0 int);
+INSERT INTO t1 VALUES(@@connect_timeout);
+INSERT INTO t2 VALUES(@@connect_timeout);
+SELECT * FROM t1 JOIN t2 ON t1.c0 = t2.c0 WHERE (t1.c0 <=> @@connect_timeout);
+c0 c0
+X X
+DROP TABLE t1, t2;
+End of 4.1 tests
+CREATE TABLE t1 (
+K2C4 varchar(4) character set latin1 collate latin1_bin NOT NULL default '',
+K4N4 varchar(4) character set latin1 collate latin1_bin NOT NULL default '0000',
+F2I4 int(11) NOT NULL default '0'
+) ENGINE=MyISAM DEFAULT CHARSET=latin1;
+INSERT INTO t1 VALUES
+('W%RT', '0100', 1),
+('W-RT', '0100', 1),
+('WART', '0100', 1),
+('WART', '0200', 1),
+('WERT', '0100', 2),
+('WORT','0200', 2),
+('WT', '0100', 2),
+('W_RT', '0100', 2),
+('WaRT', '0100', 3),
+('WART', '0300', 3),
+('WRT' , '0400', 3),
+('WURM', '0500', 3),
+('W%T', '0600', 4),
+('WA%T', '0700', 4),
+('WA_T', '0800', 4);
+SELECT K2C4, K4N4, F2I4 FROM t1
+WHERE K2C4 = 'WART' AND
+(F2I4 = 2 AND K2C4 = 'WART' OR (F2I4 = 2 OR K4N4 = '0200'));
+K2C4 K4N4 F2I4
+WART 0200 1
+SELECT K2C4, K4N4, F2I4 FROM t1
+WHERE K2C4 = 'WART' AND (K2C4 = 'WART' OR K4N4 = '0200');
+K2C4 K4N4 F2I4
+WART 0100 1
+WART 0200 1
+WART 0300 3
+DROP TABLE t1;
+create table t1 (a int, b int);
+create table t2 like t1;
+select t1.a from (t1 inner join t2 on t1.a=t2.a) where t2.a=1;
+a
+select t1.a from ((t1 inner join t2 on t1.a=t2.a)) where t2.a=1;
+a
+select x.a, y.a, z.a from ( (t1 x inner join t2 y on x.a=y.a) inner join t2 z on y.a=z.a) WHERE x.a=1;
+a a a
+drop table t1,t2;
+create table t1 (s1 varchar(5));
+insert into t1 values ('Wall');
+select min(s1) from t1 group by s1 with rollup;
+min(s1)
+Wall
+Wall
+drop table t1;
+create table t1 (s1 int) engine=myisam;
+insert into t1 values (0);
+select avg(distinct s1) from t1 group by s1 with rollup;
+avg(distinct s1)
+0.0000
+0.0000
+drop table t1;
+create table t1 (s1 int);
+insert into t1 values (null),(1);
+select distinct avg(s1) as x from t1 group by s1 with rollup;
+x
+NULL
+1.0000
+drop table t1;
+CREATE TABLE t1 (a int);
+CREATE TABLE t2 (a int);
+INSERT INTO t1 VALUES (1), (2), (3), (4), (5);
+INSERT INTO t2 VALUES (2), (4), (6);
+SELECT t1.a FROM t1 STRAIGHT_JOIN t2 ON t1.a=t2.a;
+a
+2
+4
+EXPLAIN SELECT t1.a FROM t1 STRAIGHT_JOIN t2 ON t1.a=t2.a;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 5
+1 SIMPLE t2 ALL NULL NULL NULL NULL 3 Using where; Using join buffer
+EXPLAIN SELECT t1.a FROM t1 INNER JOIN t2 ON t1.a=t2.a;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ALL NULL NULL NULL NULL 3
+1 SIMPLE t1 ALL NULL NULL NULL NULL 5 Using where; Using join buffer
+DROP TABLE t1,t2;
+select x'10' + 0, X'10' + 0, b'10' + 0, B'10' + 0;
+x'10' + 0 X'10' + 0 b'10' + 0 B'10' + 0
+16 16 2 2
+create table t1 (f1 varchar(6) default NULL, f2 int(6) primary key not null);
+create table t2 (f3 varchar(5) not null, f4 varchar(5) not null, UNIQUE KEY UKEY (f3,f4));
+insert into t1 values (" 2", 2);
+insert into t2 values (" 2", " one "),(" 2", " two ");
+select * from t1 left join t2 on f1 = f3;
+f1 f2 f3 f4
+ 2 2 2 one
+ 2 2 2 two
+drop table t1,t2;
+create table t1 (empnum smallint, grp int);
+create table t2 (empnum int, name char(5));
+insert into t1 values(1,1);
+insert into t2 values(1,'bob');
+create view v1 as select * from t2 inner join t1 using (empnum);
+select * from v1;
+empnum name grp
+1 bob 1
+drop table t1,t2;
+drop view v1;
+create table t1 (pk int primary key, b int);
+create table t2 (pk int primary key, c int);
+select pk from t1 inner join t2 using (pk);
+pk
+drop table t1,t2;
+create table t1 (s1 int, s2 char(5), s3 decimal(10));
+create view v1 as select s1, s2, 'x' as s3 from t1;
+select * from t1 natural join v1;
+s1 s2 s3
+insert into t1 values (1,'x',5);
+select * from t1 natural join v1;
+s1 s2 s3
+Warnings:
+Warning 1292 Truncated incorrect DOUBLE value: 'x'
+drop table t1;
+drop view v1;
+create table t1(a1 int);
+create table t2(a2 int);
+insert into t1 values(1),(2);
+insert into t2 values(1),(2);
+create view v2 (c) as select a1 from t1;
+select * from t1 natural left join t2;
+a1 a2
+1 1
+2 1
+1 2
+2 2
+select * from t1 natural right join t2;
+a2 a1
+1 1
+2 1
+1 2
+2 2
+select * from v2 natural left join t2;
+c a2
+1 1
+2 1
+1 2
+2 2
+select * from v2 natural right join t2;
+a2 c
+1 1
+2 1
+1 2
+2 2
+drop table t1, t2;
+drop view v2;
+create table t1 (a int(10), t1_val int(10));
+create table t2 (b int(10), t2_val int(10));
+create table t3 (a int(10), b int(10));
+insert into t1 values (1,1),(2,2);
+insert into t2 values (1,1),(2,2),(3,3);
+insert into t3 values (1,1),(2,1),(3,1),(4,1);
+select * from t1 natural join t2 natural join t3;
+a b t1_val t2_val
+1 1 1 1
+2 1 2 1
+select * from t1 natural join t3 natural join t2;
+b a t1_val t2_val
+1 1 1 1
+1 2 2 1
+drop table t1, t2, t3;
+DO IFNULL(NULL, NULL);
+SELECT CAST(IFNULL(NULL, NULL) AS DECIMAL);
+CAST(IFNULL(NULL, NULL) AS DECIMAL)
+NULL
+SELECT ABS(IFNULL(NULL, NULL));
+ABS(IFNULL(NULL, NULL))
+NULL
+SELECT IFNULL(NULL, NULL);
+IFNULL(NULL, NULL)
+NULL
+SET @OLD_SQL_MODE12595=@@SQL_MODE, @@SQL_MODE='';
+SHOW LOCAL VARIABLES LIKE 'SQL_MODE';
+Variable_name Value
+sql_mode
+CREATE TABLE BUG_12595(a varchar(100));
+INSERT INTO BUG_12595 VALUES ('hakan%'), ('hakank'), ("ha%an");
+SELECT * FROM BUG_12595 WHERE a LIKE 'hakan\%';
+a
+hakan%
+SELECT * FROM BUG_12595 WHERE a LIKE 'hakan*%' ESCAPE '*';
+a
+hakan%
+SELECT * FROM BUG_12595 WHERE a LIKE 'hakan**%' ESCAPE '**';
+ERROR HY000: Incorrect arguments to ESCAPE
+SELECT * FROM BUG_12595 WHERE a LIKE 'hakan%' ESCAPE '';
+a
+hakan%
+hakank
+SELECT * FROM BUG_12595 WHERE a LIKE 'hakan\%' ESCAPE '';
+a
+SELECT * FROM BUG_12595 WHERE a LIKE 'ha\%an' ESCAPE 0x5c;
+a
+ha%an
+SELECT * FROM BUG_12595 WHERE a LIKE 'ha%%an' ESCAPE '%';
+a
+ha%an
+SELECT * FROM BUG_12595 WHERE a LIKE 'ha\%an' ESCAPE '\\';
+a
+ha%an
+SELECT * FROM BUG_12595 WHERE a LIKE 'ha|%an' ESCAPE '|';
+a
+ha%an
+SET @@SQL_MODE='NO_BACKSLASH_ESCAPES';
+SHOW LOCAL VARIABLES LIKE 'SQL_MODE';
+Variable_name Value
+sql_mode NO_BACKSLASH_ESCAPES
+SELECT * FROM BUG_12595 WHERE a LIKE 'hakan\%';
+a
+SELECT * FROM BUG_12595 WHERE a LIKE 'hakan*%' ESCAPE '*';
+a
+hakan%
+SELECT * FROM BUG_12595 WHERE a LIKE 'hakan**%' ESCAPE '**';
+ERROR HY000: Incorrect arguments to ESCAPE
+SELECT * FROM BUG_12595 WHERE a LIKE 'hakan\%' ESCAPE '\\';
+ERROR HY000: Incorrect arguments to ESCAPE
+SELECT * FROM BUG_12595 WHERE a LIKE 'hakan%' ESCAPE '';
+ERROR HY000: Incorrect arguments to ESCAPE
+SELECT * FROM BUG_12595 WHERE a LIKE 'ha\%an' ESCAPE 0x5c;
+a
+ha%an
+SELECT * FROM BUG_12595 WHERE a LIKE 'ha|%an' ESCAPE '|';
+a
+ha%an
+SELECT * FROM BUG_12595 WHERE a LIKE 'hakan\n%' ESCAPE '\n';
+ERROR HY000: Incorrect arguments to ESCAPE
+SET @@SQL_MODE=@OLD_SQL_MODE12595;
+DROP TABLE BUG_12595;
+create table t1 (a char(1));
+create table t2 (a char(1));
+insert into t1 values ('a'),('b'),('c');
+insert into t2 values ('b'),('c'),('d');
+select a from t1 natural join t2;
+a
+b
+c
+select * from t1 natural join t2 where a = 'b';
+a
+b
+drop table t1, t2;
+CREATE TABLE t1 (`id` TINYINT);
+CREATE TABLE t2 (`id` TINYINT);
+CREATE TABLE t3 (`id` TINYINT);
+INSERT INTO t1 VALUES (1),(2),(3);
+INSERT INTO t2 VALUES (2);
+INSERT INTO t3 VALUES (3);
+SELECT t1.id,t3.id FROM t1 JOIN t2 ON (t2.id=t1.id) LEFT JOIN t3 USING (id);
+ERROR 23000: Column 'id' in from clause is ambiguous
+SELECT t1.id,t3.id FROM t1 JOIN t2 ON (t2.notacolumn=t1.id) LEFT JOIN t3 USING (id);
+ERROR 23000: Column 'id' in from clause is ambiguous
+SELECT id,t3.id FROM t1 JOIN t2 ON (t2.id=t1.id) LEFT JOIN t3 USING (id);
+ERROR 23000: Column 'id' in from clause is ambiguous
+SELECT id,t3.id FROM (t1 JOIN t2 ON (t2.id=t1.id)) LEFT JOIN t3 USING (id);
+ERROR 23000: Column 'id' in from clause is ambiguous
+drop table t1, t2, t3;
+create table t1 (a int(10),b int(10));
+create table t2 (a int(10),b int(10));
+insert into t1 values (1,10),(2,20),(3,30);
+insert into t2 values (1,10);
+select * from t1 inner join t2 using (A);
+a b b
+1 10 10
+select * from t1 inner join t2 using (a);
+a b b
+1 10 10
+drop table t1, t2;
+create table t1 (a int, c int);
+create table t2 (b int);
+create table t3 (b int, a int);
+create table t4 (c int);
+insert into t1 values (1,1);
+insert into t2 values (1);
+insert into t3 values (1,1);
+insert into t4 values (1);
+select * from t1 join t2 join t3 on (t2.b = t3.b and t1.a = t3.a);
+a c b b a
+1 1 1 1 1
+select * from t1, t2 join t3 on (t2.b = t3.b and t1.a = t3.a);
+ERROR 42S22: Unknown column 't1.a' in 'on clause'
+select * from t1 join t2 join t3 join t4 on (t1.a = t4.c and t2.b = t4.c);
+a c b b a c
+1 1 1 1 1 1
+select * from t1 join t2 join t4 using (c);
+c a b
+1 1 1
+drop table t1, t2, t3, t4;
+create table t1(x int, y int);
+create table t2(x int, y int);
+create table t3(x int, primary key(x));
+insert into t1 values (1, 1), (2, 1), (3, 1), (4, 3), (5, 6), (6, 6);
+insert into t2 values (1, 1), (2, 1), (3, 3), (4, 6), (5, 6);
+insert into t3 values (1), (2), (3), (4), (5);
+select t1.x, t3.x from t1, t2, t3 where t1.x = t2.x and t3.x >= t1.y and t3.x <= t2.y;
+x x
+1 1
+2 1
+3 1
+3 2
+3 3
+4 3
+4 4
+4 5
+drop table t1,t2,t3;
+create table t1 (id char(16) not null default '', primary key (id));
+insert into t1 values ('100'),('101'),('102');
+create table t2 (id char(16) default null);
+insert into t2 values (1);
+create view v1 as select t1.id from t1;
+create view v2 as select t2.id from t2;
+create view v3 as select (t1.id+2) as id from t1 natural left join t2;
+select t1.id from t1 left join v2 using (id);
+id
+100
+101
+102
+select t1.id from v2 right join t1 using (id);
+id
+100
+101
+102
+select t1.id from t1 left join v3 using (id);
+id
+102
+100
+101
+select * from t1 left join v2 using (id);
+id
+100
+101
+102
+select * from v2 right join t1 using (id);
+id
+100
+101
+102
+select * from t1 left join v3 using (id);
+id
+102
+100
+101
+select v1.id from v1 left join v2 using (id);
+id
+100
+101
+102
+select v1.id from v2 right join v1 using (id);
+id
+100
+101
+102
+select v1.id from v1 left join v3 using (id);
+id
+102
+100
+101
+select * from v1 left join v2 using (id);
+id
+100
+101
+102
+select * from v2 right join v1 using (id);
+id
+100
+101
+102
+select * from v1 left join v3 using (id);
+id
+102
+100
+101
+drop table t1, t2;
+drop view v1, v2, v3;
+create table t1 (id int(11) not null default '0');
+insert into t1 values (123),(191),(192);
+create table t2 (id char(16) character set utf8 not null);
+insert into t2 values ('58013'),('58014'),('58015'),('58016');
+create table t3 (a_id int(11) not null, b_id char(16) character set utf8);
+insert into t3 values (123,null),(123,null),(123,null),(123,null),(123,null),(123,'58013');
+select count(*)
+from t1 inner join (t3 left join t2 on t2.id = t3.b_id) on t1.id = t3.a_id;
+count(*)
+6
+select count(*)
+from t1 inner join (t2 right join t3 on t2.id = t3.b_id) on t1.id = t3.a_id;
+count(*)
+6
+drop table t1,t2,t3;
+create table t1 (a int);
+create table t2 (b int);
+create table t3 (c int);
+select * from t1 join t2 join t3 on (t1.a=t3.c);
+a b c
+select * from t1 join t2 left join t3 on (t1.a=t3.c);
+a b c
+select * from t1 join t2 right join t3 on (t1.a=t3.c);
+a b c
+select * from t1 join t2 straight_join t3 on (t1.a=t3.c);
+a b c
+drop table t1, t2 ,t3;
+create table t1(f1 int, f2 date);
+insert into t1 values(1,'2005-01-01'),(2,'2005-09-01'),(3,'2005-09-30'),
+(4,'2005-10-01'),(5,'2005-12-30');
+select * from t1 where f2 >= 0 order by f2;
+f1 f2
+1 2005-01-01
+2 2005-09-01
+3 2005-09-30
+4 2005-10-01
+5 2005-12-30
+select * from t1 where f2 >= '0000-00-00' order by f2;
+f1 f2
+1 2005-01-01
+2 2005-09-01
+3 2005-09-30
+4 2005-10-01
+5 2005-12-30
+select * from t1 where f2 >= '2005-09-31' order by f2;
+f1 f2
+4 2005-10-01
+5 2005-12-30
+select * from t1 where f2 >= '2005-09-3a' order by f2;
+f1 f2
+3 2005-09-30
+4 2005-10-01
+5 2005-12-30
+Warnings:
+Warning 1292 Incorrect date value: '2005-09-3a' for column 'f2' at row 1
+select * from t1 where f2 <= '2005-09-31' order by f2;
+f1 f2
+1 2005-01-01
+2 2005-09-01
+3 2005-09-30
+select * from t1 where f2 <= '2005-09-3a' order by f2;
+f1 f2
+1 2005-01-01
+2 2005-09-01
+Warnings:
+Warning 1292 Incorrect date value: '2005-09-3a' for column 'f2' at row 1
+drop table t1;
+create table t1 (f1 int, f2 int);
+insert into t1 values (1, 30), (2, 20), (3, 10);
+create algorithm=merge view v1 as select f1, f2 from t1;
+create algorithm=merge view v2 (f2, f1) as select f1, f2 from t1;
+create algorithm=merge view v3 as select t1.f1 as f2, t1.f2 as f1 from t1;
+select t1.f1 as x1, f1 from t1 order by t1.f1;
+x1 f1
+1 1
+2 2
+3 3
+select v1.f1 as x1, f1 from v1 order by v1.f1;
+x1 f1
+1 1
+2 2
+3 3
+select v2.f1 as x1, f1 from v2 order by v2.f1;
+x1 f1
+10 10
+20 20
+30 30
+select v3.f1 as x1, f1 from v3 order by v3.f1;
+x1 f1
+10 10
+20 20
+30 30
+select f1, f2, v1.f1 as x1 from v1 order by v1.f1;
+f1 f2 x1
+1 30 1
+2 20 2
+3 10 3
+select f1, f2, v2.f1 as x1 from v2 order by v2.f1;
+f1 f2 x1
+10 3 10
+20 2 20
+30 1 30
+select f1, f2, v3.f1 as x1 from v3 order by v3.f1;
+f1 f2 x1
+10 3 10
+20 2 20
+30 1 30
+drop table t1;
+drop view v1, v2, v3;
+CREATE TABLE t1(key_a int4 NOT NULL, optimus varchar(32), PRIMARY KEY(key_a));
+CREATE TABLE t2(key_a int4 NOT NULL, prime varchar(32), PRIMARY KEY(key_a));
+CREATE table t3(key_a int4 NOT NULL, key_b int4 NOT NULL, foo varchar(32),
+PRIMARY KEY(key_a,key_b));
+INSERT INTO t1 VALUES (0,'');
+INSERT INTO t1 VALUES (1,'i');
+INSERT INTO t1 VALUES (2,'j');
+INSERT INTO t1 VALUES (3,'k');
+INSERT INTO t2 VALUES (1,'r');
+INSERT INTO t2 VALUES (2,'s');
+INSERT INTO t2 VALUES (3,'t');
+INSERT INTO t3 VALUES (1,5,'x');
+INSERT INTO t3 VALUES (1,6,'y');
+INSERT INTO t3 VALUES (2,5,'xx');
+INSERT INTO t3 VALUES (2,6,'yy');
+INSERT INTO t3 VALUES (2,7,'zz');
+INSERT INTO t3 VALUES (3,5,'xxx');
+SELECT t2.key_a,foo
+FROM t1 INNER JOIN t2 ON t1.key_a = t2.key_a
+INNER JOIN t3 ON t1.key_a = t3.key_a
+WHERE t2.key_a=2 and key_b=5;
+key_a foo
+2 xx
+EXPLAIN SELECT t2.key_a,foo
+FROM t1 INNER JOIN t2 ON t1.key_a = t2.key_a
+INNER JOIN t3 ON t1.key_a = t3.key_a
+WHERE t2.key_a=2 and key_b=5;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 const PRIMARY PRIMARY 4 const 1 Using index
+1 SIMPLE t2 const PRIMARY PRIMARY 4 const 1 Using index
+1 SIMPLE t3 const PRIMARY PRIMARY 8 const,const 1
+SELECT t2.key_a,foo
+FROM t1 INNER JOIN t2 ON t2.key_a = t1.key_a
+INNER JOIN t3 ON t1.key_a = t3.key_a
+WHERE t2.key_a=2 and key_b=5;
+key_a foo
+2 xx
+EXPLAIN SELECT t2.key_a,foo
+FROM t1 INNER JOIN t2 ON t2.key_a = t1.key_a
+INNER JOIN t3 ON t1.key_a = t3.key_a
+WHERE t2.key_a=2 and key_b=5;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 const PRIMARY PRIMARY 4 const 1 Using index
+1 SIMPLE t2 const PRIMARY PRIMARY 4 const 1 Using index
+1 SIMPLE t3 const PRIMARY PRIMARY 8 const,const 1
+DROP TABLE t1,t2,t3;
+create table t1 (f1 int);
+insert into t1 values(1),(2);
+create table t2 (f2 int, f3 int, key(f2));
+insert into t2 values(1,1),(2,2);
+create table t3 (f4 int not null);
+insert into t3 values (2),(2),(2);
+select f1,(select count(*) from t2,t3 where f2=f1 and f3=f4) as count from t1;
+f1 count
+1 0
+2 3
+drop table t1,t2,t3;
+create table t1 (f1 int unique);
+create table t2 (f2 int unique);
+create table t3 (f3 int unique);
+insert into t1 values(1),(2);
+insert into t2 values(1),(2);
+insert into t3 values(1),(NULL);
+select * from t3 where f3 is null;
+f3
+NULL
+select t2.f2 from t1 left join t2 on f1=f2 join t3 on f1=f3 where f1=1;
+f2
+1
+drop table t1,t2,t3;
+create table t1(f1 char, f2 char not null);
+insert into t1 values(null,'a');
+create table t2 (f2 char not null);
+insert into t2 values('b');
+select * from t1 left join t2 on f1=t2.f2 where t1.f2='a';
+f1 f2 f2
+NULL a NULL
+drop table t1,t2;
+select * from (select * left join t on f1=f2) tt;
+ERROR 42000: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'on f1=f2) tt' at line 1
+CREATE TABLE t1 (sku int PRIMARY KEY, pr int);
+CREATE TABLE t2 (sku int PRIMARY KEY, sppr int, name varchar(255));
+INSERT INTO t1 VALUES
+(10, 10), (20, 10), (30, 20), (40, 30), (50, 10), (60, 10);
+INSERT INTO t2 VALUES
+(10, 10, 'aaa'), (20, 10, 'bbb'), (30, 10, 'ccc'), (40, 20, 'ddd'),
+(50, 10, 'eee'), (60, 20, 'fff'), (70, 20, 'ggg'), (80, 30, 'hhh');
+SELECT t2.sku, t2.sppr, t2.name, t1.sku, t1.pr
+FROM t2, t1 WHERE t2.sku=20 AND (t2.sku=t1.sku OR t2.sppr=t1.sku);
+sku sppr name sku pr
+20 10 bbb 10 10
+20 10 bbb 20 10
+EXPLAIN
+SELECT t2.sku, t2.sppr, t2.name, t1.sku, t1.pr
+FROM t2, t1 WHERE t2.sku=20 AND (t2.sku=t1.sku OR t2.sppr=t1.sku);
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 const PRIMARY PRIMARY 4 const 1
+1 SIMPLE t1 range PRIMARY PRIMARY 4 NULL 2 Using index condition; Using MRR
+DROP TABLE t1,t2;
+CREATE TABLE t1 (i TINYINT UNSIGNED NOT NULL);
+INSERT t1 SET i = 0;
+UPDATE t1 SET i = -1;
+Warnings:
+Warning 1264 Out of range value for column 'i' at row 1
+SELECT * FROM t1;
+i
+0
+UPDATE t1 SET i = CAST(i - 1 AS SIGNED);
+Warnings:
+Warning 1264 Out of range value for column 'i' at row 1
+SELECT * FROM t1;
+i
+0
+UPDATE t1 SET i = i - 1;
+Warnings:
+Warning 1264 Out of range value for column 'i' at row 1
+SELECT * FROM t1;
+i
+255
+DROP TABLE t1;
+create table t1 (a int);
+insert into t1 values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
+create table t2 (a int, b int, c int, e int, primary key(a,b,c));
+insert into t2 select A.a, B.a, C.a, C.a from t1 A, t1 B, t1 C;
+analyze table t2;
+Table Op Msg_type Msg_text
+test.t2 analyze status OK
+select 'In next EXPLAIN, B.rows must be exactly 10:' Z;
+Z
+In next EXPLAIN, B.rows must be exactly 10:
+explain select * from t2 A, t2 B where A.a=5 and A.b=5 and A.C<5
+and B.a=5 and B.b=A.e and (B.b =1 or B.b = 3 or B.b=5);
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE A range PRIMARY PRIMARY 12 NULL 4 Using index condition; Using where; Using MRR
+1 SIMPLE B ref PRIMARY PRIMARY 8 const,test.A.e 10 Using join buffer
+drop table t1, t2;
+CREATE TABLE t1 (a int PRIMARY KEY, b int, INDEX(b));
+INSERT INTO t1 VALUES (1, 3), (9,4), (7,5), (4,5), (6,2),
+(3,1), (5,1), (8,9), (2,2), (0,9);
+CREATE TABLE t2 (c int, d int, f int, INDEX(c,f));
+INSERT INTO t2 VALUES
+(1,0,0), (1,0,1), (2,0,0), (2,0,1), (3,0,0), (4,0,1),
+(5,0,0), (5,0,1), (6,0,0), (0,0,1), (7,0,0), (7,0,1),
+(0,0,0), (0,0,1), (8,0,0), (8,0,1), (9,0,0), (9,0,1);
+EXPLAIN
+SELECT a, c, d, f FROM t1,t2 WHERE a=c AND b BETWEEN 4 AND 6;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 range PRIMARY,b b 5 NULL 3 Using index condition; Using MRR
+1 SIMPLE t2 ref c c 5 test.t1.a 2 Using join buffer
+EXPLAIN
+SELECT a, c, d, f FROM t1,t2 WHERE a=c AND b BETWEEN 4 AND 6 AND a > 0;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 range PRIMARY,b b 5 NULL 3 Using index condition; Using where; Using MRR
+1 SIMPLE t2 ref c c 5 test.t1.a 2 Using join buffer
+DROP TABLE t1, t2;
+create table t1 (
+a int unsigned not null auto_increment primary key,
+b bit not null,
+c bit not null
+);
+create table t2 (
+a int unsigned not null auto_increment primary key,
+b bit not null,
+c int unsigned not null,
+d varchar(50)
+);
+insert into t1 (b,c) values (0,1), (0,1);
+insert into t2 (b,c) values (0,1);
+select t1.a, t1.b + 0, t1.c + 0, t2.a, t2.b + 0, t2.c, t2.d
+from t1 left outer join t2 on t1.a = t2.c and t2.b <> 1
+where t1.b <> 1 order by t1.a;
+a t1.b + 0 t1.c + 0 a t2.b + 0 c d
+1 0 1 1 0 1 NULL
+2 0 1 NULL NULL NULL NULL
+drop table t1,t2;
+SELECT 0.9888889889 * 1.011111411911;
+0.9888889889 * 1.011111411911
+0.9998769417899202067879
+prepare stmt from 'select 1 as " a "';
+Warnings:
+Warning 1466 Leading spaces are removed from name ' a '
+execute stmt;
+a
+1
+CREATE TABLE t1 (a int NOT NULL PRIMARY KEY, b int NOT NULL);
+INSERT INTO t1 VALUES (1,1), (2,2), (3,3), (4,4);
+CREATE TABLE t2 (c int NOT NULL, INDEX idx(c));
+INSERT INTO t2 VALUES
+(1), (1), (1), (1), (1), (1), (1), (1),
+(2), (2), (2), (2),
+(3), (3),
+(4);
+EXPLAIN SELECT b FROM t1, t2 WHERE b=c AND a=1;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 const PRIMARY PRIMARY 4 const 1
+1 SIMPLE t2 ref idx idx 4 const 7 Using index
+EXPLAIN SELECT b FROM t1, t2 WHERE b=c AND a=4;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 const PRIMARY PRIMARY 4 const 1
+1 SIMPLE t2 ref idx idx 4 const 1 Using index
+DROP TABLE t1, t2;
+CREATE TABLE t1 (id int NOT NULL PRIMARY KEY, a int);
+INSERT INTO t1 VALUES (1,2), (2,NULL), (3,2);
+CREATE TABLE t2 (b int, c INT, INDEX idx1(b));
+INSERT INTO t2 VALUES (2,1), (3,2);
+CREATE TABLE t3 (d int, e int, INDEX idx1(d));
+INSERT INTO t3 VALUES (2,10), (2,20), (1,30), (2,40), (2,50);
+EXPLAIN
+SELECT * FROM t1 LEFT JOIN t2 ON t2.b=t1.a INNER JOIN t3 ON t3.d=t1.id
+WHERE t1.id=2;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 const PRIMARY PRIMARY 4 const 1
+1 SIMPLE t2 const idx1 NULL NULL NULL 1
+1 SIMPLE t3 ref idx1 idx1 5 const 3
+SELECT * FROM t1 LEFT JOIN t2 ON t2.b=t1.a INNER JOIN t3 ON t3.d=t1.id
+WHERE t1.id=2;
+id a b c d e
+2 NULL NULL NULL 2 10
+2 NULL NULL NULL 2 20
+2 NULL NULL NULL 2 40
+2 NULL NULL NULL 2 50
+DROP TABLE t1,t2,t3;
+create table t1 (c1 varchar(1), c2 int, c3 int, c4 int, c5 int, c6 int,
+c7 int, c8 int, c9 int, fulltext key (`c1`));
+select distinct match (`c1`) against ('z') , c2, c3, c4,c5, c6,c7, c8
+from t1 where c9=1 order by c2, c2;
+match (`c1`) against ('z') c2 c3 c4 c5 c6 c7 c8
+drop table t1;
+CREATE TABLE t1 (pk varchar(10) PRIMARY KEY, fk varchar(16));
+CREATE TABLE t2 (pk varchar(16) PRIMARY KEY, fk varchar(10));
+INSERT INTO t1 VALUES
+('d','dddd'), ('i','iii'), ('a','aa'), ('b','bb'), ('g','gg'),
+('e','eee'), ('c','cccc'), ('h','hhh'), ('j','jjj'), ('f','fff');
+INSERT INTO t2 VALUES
+('jjj', 'j'), ('cc','c'), ('ccc','c'), ('aaa', 'a'), ('jjjj','j'),
+('hhh','h'), ('gg','g'), ('fff','f'), ('ee','e'), ('ffff','f'),
+('bbb','b'), ('ff','f'), ('cccc','c'), ('dddd','d'), ('jj','j'),
+('aaaa','a'), ('bb','b'), ('eeee','e'), ('aa','a'), ('hh','h');
+EXPLAIN SELECT t2.*
+FROM t1 JOIN t2 ON t2.fk=t1.pk
+WHERE t2.fk < 'c' AND t2.pk=t1.fk;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 range PRIMARY PRIMARY 12 NULL 3 Using index condition; Using MRR
+1 SIMPLE t2 eq_ref PRIMARY PRIMARY 18 test.t1.fk 1 Using where; Using join buffer
+EXPLAIN SELECT t2.*
+FROM t1 JOIN t2 ON t2.fk=t1.pk
+WHERE t2.fk BETWEEN 'a' AND 'b' AND t2.pk=t1.fk;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 range PRIMARY PRIMARY 12 NULL 2 Using index condition; Using MRR
+1 SIMPLE t2 eq_ref PRIMARY PRIMARY 18 test.t1.fk 1 Using where; Using join buffer
+EXPLAIN SELECT t2.*
+FROM t1 JOIN t2 ON t2.fk=t1.pk
+WHERE t2.fk IN ('a','b') AND t2.pk=t1.fk;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 range PRIMARY PRIMARY 12 NULL 2 Using index condition; Using MRR
+1 SIMPLE t2 eq_ref PRIMARY PRIMARY 18 test.t1.fk 1 Using where; Using join buffer
+DROP TABLE t1,t2;
+CREATE TABLE t1 (a int, b varchar(20) NOT NULL, PRIMARY KEY(a));
+CREATE TABLE t2 (a int, b varchar(20) NOT NULL,
+PRIMARY KEY (a), UNIQUE KEY (b));
+INSERT INTO t1 VALUES (1,'a'),(2,'b'),(3,'c');
+INSERT INTO t2 VALUES (1,'a'),(2,'b'),(3,'c');
+EXPLAIN SELECT t1.a FROM t1 LEFT JOIN t2 ON t2.b=t1.b WHERE t1.a=3;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 const PRIMARY PRIMARY 4 const 1
+DROP TABLE t1,t2;
+CREATE TABLE t1(id int PRIMARY KEY, b int, e int);
+CREATE TABLE t2(i int, a int, INDEX si(i), INDEX ai(a));
+CREATE TABLE t3(a int PRIMARY KEY, c char(4), INDEX ci(c));
+INSERT INTO t1 VALUES
+(1,10,19), (2,20,22), (4,41,42), (9,93,95), (7, 77,79),
+(6,63,67), (5,55,58), (3,38,39), (8,81,89);
+INSERT INTO t2 VALUES
+(21,210), (41,410), (82,820), (83,830), (84,840),
+(65,650), (51,510), (37,370), (94,940), (76,760),
+(22,220), (33,330), (40,400), (95,950), (38,380),
+(67,670), (88,880), (57,570), (96,960), (97,970);
+INSERT INTO t3 VALUES
+(210,'bb'), (950,'ii'), (400,'ab'), (500,'ee'), (220,'gg'),
+(440,'gg'), (310,'eg'), (380,'ee'), (840,'bb'), (830,'ff'),
+(230,'aa'), (960,'ii'), (410,'aa'), (510,'ee'), (290,'bb'),
+(450,'gg'), (320,'dd'), (390,'hh'), (850,'jj'), (860,'ff');
+EXPLAIN
+SELECT t3.a FROM t1,t2 FORCE INDEX (si),t3
+WHERE t1.id = 8 AND t2.i BETWEEN t1.b AND t1.e AND
+t3.a=t2.a AND t3.c IN ('bb','ee');
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 const PRIMARY PRIMARY 4 const 1
+1 SIMPLE t2 range si si 5 NULL 4 Using index condition; Using MRR
+1 SIMPLE t3 eq_ref PRIMARY,ci PRIMARY 4 test.t2.a 1 Using where; Using join buffer
+EXPLAIN
+SELECT t3.a FROM t1,t2,t3
+WHERE t1.id = 8 AND t2.i BETWEEN t1.b AND t1.e AND
+t3.a=t2.a AND t3.c IN ('bb','ee') ;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 const PRIMARY PRIMARY 4 const 1
+1 SIMPLE t2 range si,ai si 5 NULL 4 Using index condition; Using MRR
+1 SIMPLE t3 eq_ref PRIMARY,ci PRIMARY 4 test.t2.a 1 Using where; Using join buffer
+EXPLAIN
+SELECT t3.a FROM t1,t2 FORCE INDEX (si),t3
+WHERE t1.id = 8 AND (t2.i=t1.b OR t2.i=t1.e) AND t3.a=t2.a AND
+t3.c IN ('bb','ee');
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 const PRIMARY PRIMARY 4 const 1
+1 SIMPLE t2 range si si 5 NULL 2 Using index condition; Using MRR
+1 SIMPLE t3 eq_ref PRIMARY,ci PRIMARY 4 test.t2.a 1 Using where; Using join buffer
+EXPLAIN
+SELECT t3.a FROM t1,t2,t3
+WHERE t1.id = 8 AND (t2.i=t1.b OR t2.i=t1.e) AND t3.a=t2.a AND
+t3.c IN ('bb','ee');
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 const PRIMARY PRIMARY 4 const 1
+1 SIMPLE t2 range si,ai si 5 NULL 2 Using index condition; Using MRR
+1 SIMPLE t3 eq_ref PRIMARY,ci PRIMARY 4 test.t2.a 1 Using where; Using join buffer
+DROP TABLE t1,t2,t3;
+CREATE TABLE t1 ( f1 int primary key, f2 int, f3 int, f4 int, f5 int, f6 int, checked_out int);
+CREATE TABLE t2 ( f11 int PRIMARY KEY );
+INSERT INTO t1 VALUES (1,1,1,0,0,0,0),(2,1,1,3,8,1,0),(3,1,1,4,12,1,0);
+INSERT INTO t2 VALUES (62);
+SELECT * FROM t1 LEFT JOIN t2 ON f11 = t1.checked_out GROUP BY f1 ORDER BY f2, f3, f4, f5 LIMIT 0, 1;
+f1 f2 f3 f4 f5 f6 checked_out f11
+1 1 1 0 0 0 0 NULL
+DROP TABLE t1, t2;
+DROP TABLE IF EXISTS t1;
+CREATE TABLE t1(a int);
+INSERT into t1 values (1), (2), (3);
+SELECT * FROM t1 LIMIT 2, -1;
+ERROR 42000: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '-1' at line 1
+DROP TABLE t1;
+CREATE TABLE t1 (
+ID_with_null int NULL,
+ID_better int NOT NULL,
+INDEX idx1 (ID_with_null),
+INDEX idx2 (ID_better)
+);
+INSERT INTO t1 VALUES (1,1), (2,1), (null,3), (null,3), (null,3), (null,3);
+INSERT INTO t1 SELECT * FROM t1 WHERE ID_with_null IS NULL;
+INSERT INTO t1 SELECT * FROM t1 WHERE ID_with_null IS NULL;
+INSERT INTO t1 SELECT * FROM t1 WHERE ID_with_null IS NULL;
+INSERT INTO t1 SELECT * FROM t1 WHERE ID_with_null IS NULL;
+INSERT INTO t1 SELECT * FROM t1 WHERE ID_with_null IS NULL;
+SELECT COUNT(*) FROM t1 WHERE ID_with_null IS NULL;
+COUNT(*)
+128
+SELECT COUNT(*) FROM t1 WHERE ID_better=1;
+COUNT(*)
+2
+EXPLAIN SELECT * FROM t1 WHERE ID_better=1 AND ID_with_null IS NULL;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ref idx1,idx2 idx2 4 const 1 Using where
+DROP INDEX idx1 ON t1;
+CREATE UNIQUE INDEX idx1 ON t1(ID_with_null);
+EXPLAIN SELECT * FROM t1 WHERE ID_better=1 AND ID_with_null IS NULL;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ref idx1,idx2 idx2 4 const 1 Using where
+DROP TABLE t1;
+CREATE TABLE t1 (
+ID1_with_null int NULL,
+ID2_with_null int NULL,
+ID_better int NOT NULL,
+INDEX idx1 (ID1_with_null, ID2_with_null),
+INDEX idx2 (ID_better)
+);
+INSERT INTO t1 VALUES (1,1,1), (2,2,1), (3,null,3), (null,3,3), (null,null,3),
+(3,null,3), (null,3,3), (null,null,3), (3,null,3), (null,3,3), (null,null,3);
+INSERT INTO t1 SELECT * FROM t1 WHERE ID1_with_null IS NULL;
+INSERT INTO t1 SELECT * FROM t1 WHERE ID2_with_null IS NULL;
+INSERT INTO t1 SELECT * FROM t1 WHERE ID1_with_null IS NULL;
+INSERT INTO t1 SELECT * FROM t1 WHERE ID2_with_null IS NULL;
+INSERT INTO t1 SELECT * FROM t1 WHERE ID1_with_null IS NULL;
+INSERT INTO t1 SELECT * FROM t1 WHERE ID2_with_null IS NULL;
+SELECT COUNT(*) FROM t1 WHERE ID1_with_null IS NULL AND ID2_with_null=3;
+COUNT(*)
+24
+SELECT COUNT(*) FROM t1 WHERE ID1_with_null=3 AND ID2_with_null IS NULL;
+COUNT(*)
+24
+SELECT COUNT(*) FROM t1 WHERE ID1_with_null IS NULL AND ID2_with_null IS NULL;
+COUNT(*)
+192
+SELECT COUNT(*) FROM t1 WHERE ID_better=1;
+COUNT(*)
+2
+EXPLAIN SELECT * FROM t1
+WHERE ID_better=1 AND ID1_with_null IS NULL AND ID2_with_null=3 ;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ref idx1,idx2 idx2 4 const 1 Using where
+EXPLAIN SELECT * FROM t1
+WHERE ID_better=1 AND ID1_with_null=3 AND ID2_with_null=3 IS NULL ;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ref idx1,idx2 idx2 4 const 1 Using where
+EXPLAIN SELECT * FROM t1
+WHERE ID_better=1 AND ID1_with_null IS NULL AND ID2_with_null IS NULL;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ref idx1,idx2 idx2 4 const 1 Using where
+DROP INDEX idx1 ON t1;
+CREATE UNIQUE INDEX idx1 ON t1(ID1_with_null,ID2_with_null);
+EXPLAIN SELECT * FROM t1
+WHERE ID_better=1 AND ID1_with_null IS NULL AND ID2_with_null=3 ;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ref idx1,idx2 idx2 4 const 1 Using where
+EXPLAIN SELECT * FROM t1
+WHERE ID_better=1 AND ID1_with_null=3 AND ID2_with_null IS NULL ;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ref idx1,idx2 idx2 4 const 1 Using where
+EXPLAIN SELECT * FROM t1
+WHERE ID_better=1 AND ID1_with_null IS NULL AND ID2_with_null IS NULL;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ref idx1,idx2 idx2 4 const 1 Using where
+EXPLAIN SELECT * FROM t1
+WHERE ID_better=1 AND ID1_with_null IS NULL AND
+(ID2_with_null=1 OR ID2_with_null=2);
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ref idx1,idx2 idx2 4 const 1 Using where
+DROP TABLE t1;
+CREATE TABLE t1 (a INT, ts TIMESTAMP, KEY ts(ts));
+INSERT INTO t1 VALUES (30,"2006-01-03 23:00:00"), (31,"2006-01-03 23:00:00");
+ANALYZE TABLE t1;
+Table Op Msg_type Msg_text
+test.t1 analyze status OK
+CREATE TABLE t2 (a INT, dt1 DATETIME, dt2 DATETIME, PRIMARY KEY (a));
+INSERT INTO t2 VALUES (30, "2006-01-01 00:00:00", "2999-12-31 00:00:00");
+INSERT INTO t2 SELECT a+1,dt1,dt2 FROM t2;
+ANALYZE TABLE t2;
+Table Op Msg_type Msg_text
+test.t2 analyze status OK
+EXPLAIN
+SELECT * FROM t1 LEFT JOIN t2 ON (t1.a=t2.a) WHERE t1.a=30
+AND t1.ts BETWEEN t2.dt1 AND t2.dt2
+AND t1.ts BETWEEN "2006-01-01" AND "2006-12-31";
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 const PRIMARY PRIMARY 4 const 1
+1 SIMPLE t1 range ts ts 4 NULL 1 Using index condition; Using where; Using MRR
+Warnings:
+Warning 1292 Incorrect datetime value: '2999-12-31 00:00:00' for column 'ts' at row 1
+SELECT * FROM t1 LEFT JOIN t2 ON (t1.a=t2.a) WHERE t1.a=30
+AND t1.ts BETWEEN t2.dt1 AND t2.dt2
+AND t1.ts BETWEEN "2006-01-01" AND "2006-12-31";
+a ts a dt1 dt2
+30 2006-01-03 23:00:00 30 2006-01-01 00:00:00 2999-12-31 00:00:00
+Warnings:
+Warning 1292 Incorrect datetime value: '2999-12-31 00:00:00' for column 'ts' at row 1
+DROP TABLE t1,t2;
+create table t1 (a bigint unsigned);
+insert into t1 values
+(if(1, 9223372036854775808, 1)),
+(case when 1 then 9223372036854775808 else 1 end),
+(coalesce(9223372036854775808, 1));
+select * from t1;
+a
+9223372036854775808
+9223372036854775808
+9223372036854775808
+drop table t1;
+create table t1 select
+if(1, 9223372036854775808, 1) i,
+case when 1 then 9223372036854775808 else 1 end c,
+coalesce(9223372036854775808, 1) co;
+show create table t1;
+Table Create Table
+t1 CREATE TABLE `t1` (
+ `i` decimal(19,0) NOT NULL DEFAULT '0',
+ `c` decimal(19,0) NOT NULL DEFAULT '0',
+ `co` decimal(19,0) NOT NULL DEFAULT '0'
+) ENGINE=MyISAM DEFAULT CHARSET=latin1
+drop table t1;
+select
+if(1, cast(1111111111111111111 as unsigned), 1) i,
+case when 1 then cast(1111111111111111111 as unsigned) else 1 end c,
+coalesce(cast(1111111111111111111 as unsigned), 1) co;
+i c co
+1111111111111111111 1111111111111111111 1111111111111111111
+CREATE TABLE t1 (name varchar(255));
+CREATE TABLE t2 (name varchar(255), n int, KEY (name(3)));
+INSERT INTO t1 VALUES ('ccc'), ('bb'), ('cc '), ('aa '), ('aa');
+INSERT INTO t2 VALUES ('bb',1), ('aa',2), ('cc ',3);
+INSERT INTO t2 VALUES (concat('cc ', 0x06), 4);
+INSERT INTO t2 VALUES ('cc',5), ('bb ',6), ('cc ',7);
+SELECT * FROM t2;
+name n
+bb 1
+aa 2
+cc 3
+cc 4
+cc 5
+bb 6
+cc 7
+SELECT * FROM t2 ORDER BY name;
+name n
+aa 2
+bb 1
+bb 6
+cc 4
+cc 3
+cc 5
+cc 7
+SELECT name, LENGTH(name), n FROM t2 ORDER BY name;
+name LENGTH(name) n
+aa 2 2
+bb 2 1
+bb 3 6
+cc 4 4
+cc 5 3
+cc 2 5
+cc 3 7
+EXPLAIN SELECT name, LENGTH(name), n FROM t2 WHERE name='cc ';
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ref name name 6 const 3 Using where
+SELECT name, LENGTH(name), n FROM t2 WHERE name='cc ';
+name LENGTH(name) n
+cc 5 3
+cc 2 5
+cc 3 7
+EXPLAIN SELECT name , LENGTH(name), n FROM t2 WHERE name LIKE 'cc%';
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 range name name 6 NULL 3 Using where
+SELECT name , LENGTH(name), n FROM t2 WHERE name LIKE 'cc%';
+name LENGTH(name) n
+cc 5 3
+cc 4 4
+cc 2 5
+cc 3 7
+EXPLAIN SELECT name , LENGTH(name), n FROM t2 WHERE name LIKE 'cc%' ORDER BY name;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 range name name 6 NULL 3 Using where; Using filesort
+SELECT name , LENGTH(name), n FROM t2 WHERE name LIKE 'cc%' ORDER BY name;
+name LENGTH(name) n
+cc 4 4
+cc 5 3
+cc 2 5
+cc 3 7
+EXPLAIN SELECT * FROM t1 LEFT JOIN t2 ON t1.name=t2.name;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 5
+1 SIMPLE t2 ref name name 6 test.t1.name 2 Using where
+SELECT * FROM t1 LEFT JOIN t2 ON t1.name=t2.name;
+name name n
+ccc NULL NULL
+bb bb 1
+bb bb 6
+cc cc 3
+cc cc 5
+cc cc 7
+aa aa 2
+aa aa 2
+DROP TABLE t1,t2;
+CREATE TABLE t1 (name text);
+CREATE TABLE t2 (name text, n int, KEY (name(3)));
+INSERT INTO t1 VALUES ('ccc'), ('bb'), ('cc '), ('aa '), ('aa');
+INSERT INTO t2 VALUES ('bb',1), ('aa',2), ('cc ',3);
+INSERT INTO t2 VALUES (concat('cc ', 0x06), 4);
+INSERT INTO t2 VALUES ('cc',5), ('bb ',6), ('cc ',7);
+SELECT * FROM t2;
+name n
+bb 1
+aa 2
+cc 3
+cc 4
+cc 5
+bb 6
+cc 7
+SELECT * FROM t2 ORDER BY name;
+name n
+aa 2
+bb 1
+bb 6
+cc 4
+cc 3
+cc 5
+cc 7
+SELECT name, LENGTH(name), n FROM t2 ORDER BY name;
+name LENGTH(name) n
+aa 2 2
+bb 2 1
+bb 3 6
+cc 4 4
+cc 5 3
+cc 2 5
+cc 3 7
+EXPLAIN SELECT name, LENGTH(name), n FROM t2 WHERE name='cc ';
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 ref name name 6 const 3 Using where
+SELECT name, LENGTH(name), n FROM t2 WHERE name='cc ';
+name LENGTH(name) n
+cc 5 3
+cc 2 5
+cc 3 7
+EXPLAIN SELECT name , LENGTH(name), n FROM t2 WHERE name LIKE 'cc%';
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 range name name 6 NULL 3 Using where
+SELECT name , LENGTH(name), n FROM t2 WHERE name LIKE 'cc%';
+name LENGTH(name) n
+cc 5 3
+cc 4 4
+cc 2 5
+cc 3 7
+EXPLAIN SELECT name , LENGTH(name), n FROM t2 WHERE name LIKE 'cc%' ORDER BY name;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t2 range name name 6 NULL 3 Using where; Using filesort
+SELECT name , LENGTH(name), n FROM t2 WHERE name LIKE 'cc%' ORDER BY name;
+name LENGTH(name) n
+cc 4 4
+cc 5 3
+cc 2 5
+cc 3 7
+EXPLAIN SELECT * FROM t1 LEFT JOIN t2 ON t1.name=t2.name;
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE t1 ALL NULL NULL NULL NULL 5
+1 SIMPLE t2 ref name name 6 test.t1.name 2 Using where
+SELECT * FROM t1 LEFT JOIN t2 ON t1.name=t2.name;
+name name n
+ccc NULL NULL
+bb bb 1
+bb bb 6
+cc cc 3
+cc cc 5
+cc cc 7
+aa aa 2
+aa aa 2
+DROP TABLE t1,t2;
+CREATE TABLE t1 (
+access_id int NOT NULL default '0',
+name varchar(20) default NULL,
+rank int NOT NULL default '0',
+KEY idx (access_id)
+);
+CREATE TABLE t2 (
+faq_group_id int NOT NULL default '0',
+faq_id int NOT NULL default '0',
+access_id int default NULL,
+UNIQUE KEY idx1 (faq_id),
+KEY idx2 (faq_group_id,faq_id)
+);
+INSERT INTO t1 VALUES
+(1,'Everyone',2),(2,'Help',3),(3,'Technical Support',1),(4,'Chat User',4);
+INSERT INTO t2 VALUES
+(261,265,1),(490,494,1);
+SELECT t2.faq_id
+FROM t1 INNER JOIN t2 IGNORE INDEX (idx1)
+ON (t1.access_id = t2.access_id)
+LEFT JOIN t2 t
+ON (t.faq_group_id = t2.faq_group_id AND
+find_in_set(t.access_id, '1,4') < find_in_set(t2.access_id, '1,4'))
+WHERE
+t2.access_id IN (1,4) AND t.access_id IS NULL AND t2.faq_id in (265);
+faq_id
+265
+SELECT t2.faq_id
+FROM t1 INNER JOIN t2
+ON (t1.access_id = t2.access_id)
+LEFT JOIN t2 t
+ON (t.faq_group_id = t2.faq_group_id AND
+find_in_set(t.access_id, '1,4') < find_in_set(t2.access_id, '1,4'))
+WHERE
+t2.access_id IN (1,4) AND t.access_id IS NULL AND t2.faq_id in (265);
+faq_id
+265
+DROP TABLE t1,t2;
+CREATE TABLE t1 (a INT, b INT, KEY inx (b,a));
+INSERT INTO t1 VALUES (1,1), (1,2), (1,3), (1,4), (1,5), (1, 6), (1,7);
+EXPLAIN SELECT COUNT(*) FROM t1 f1 INNER JOIN t1 f2
+ON ( f1.b=f2.b AND f1.a<f2.a )
+WHERE 1 AND f1.b NOT IN (100,2232,3343,51111);
+id select_type table type possible_keys key key_len ref rows Extra
+1 SIMPLE f1 index inx inx 10 NULL 7 Using where; Using index
+1 SIMPLE f2 ref inx inx 5 test.f1.b 1 Using where; Using index
+DROP TABLE t1;
+CREATE TABLE t1 (c1 INT, c2 INT);
+INSERT INTO t1 VALUES (1,11), (2,22), (2,22);
+EXPLAIN SELECT c1 FROM t1 WHERE (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT COUNT(c2)))))))))))))))))))))))))))))))) > 0;
+id select_type table type possible_keys key key_len ref rows Extra
+1 PRIMARY t1 ALL NULL NULL NULL NULL 3 Using where
+31 DEPENDENT SUBQUERY NULL NULL NULL NULL NULL NULL NULL No tables used
+32 DEPENDENT SUBQUERY NULL NULL NULL NULL NULL NULL NULL No tables used
+EXPLAIN SELECT c1 FROM t1 WHERE (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT (SELECT COUNT(c2))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))) > 0;
+ERROR HY000: Too high level of nesting for select
+DROP TABLE t1;
+CREATE TABLE t1 (
+c1 int(11) NOT NULL AUTO_INCREMENT,
+c2 varchar(1000) DEFAULT NULL,
+c3 bigint(20) DEFAULT NULL,
+c4 bigint(20) DEFAULT NULL,
+PRIMARY KEY (c1)
+);
+EXPLAIN EXTENDED
+SELECT join_2.c1
+FROM
+t1 AS join_0,
+t1 AS join_1,
+t1 AS join_2,
+t1 AS join_3,
+t1 AS join_4,
+t1 AS join_5,
+t1 AS join_6,
+t1 AS join_7
+WHERE
+join_0.c1=join_1.c1 AND
+join_1.c1=join_2.c1 AND
+join_2.c1=join_3.c1 AND
+join_3.c1=join_4.c1 AND
+join_4.c1=join_5.c1 AND
+join_5.c1=join_6.c1 AND
+join_6.c1=join_7.c1
+OR
+join_0.c2 < '?' AND
+join_1.c2 < '?' AND
+join_2.c2 > '?' AND
+join_2.c2 < '!' AND
+join_3.c2 > '?' AND
+join_4.c2 = '?' AND
+join_5.c2 <> '?' AND
+join_6.c2 <> '?' AND
+join_7.c2 >= '?' AND
+join_0.c1=join_1.c1 AND
+join_1.c1=join_2.c1 AND
+join_2.c1=join_3.c1 AND
+join_3.c1=join_4.c1 AND
+join_4.c1=join_5.c1 AND
+join_5.c1=join_6.c1 AND
+join_6.c1=join_7.c1
+GROUP BY
+join_3.c1,
+join_2.c1,
+join_7.c1,
+join_1.c1,
+join_0.c1;
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 SIMPLE NULL NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables
+Warnings:
+Note 1003 select '0' AS `c1` from `test`.`t1` `join_0` join `test`.`t1` `join_1` join `test`.`t1` `join_2` join `test`.`t1` `join_3` join `test`.`t1` `join_4` join `test`.`t1` `join_5` join `test`.`t1` `join_6` join `test`.`t1` `join_7` where 0 group by '0','0','0','0','0'
+SHOW WARNINGS;
+Level Code Message
+Note 1003 select '0' AS `c1` from `test`.`t1` `join_0` join `test`.`t1` `join_1` join `test`.`t1` `join_2` join `test`.`t1` `join_3` join `test`.`t1` `join_4` join `test`.`t1` `join_5` join `test`.`t1` `join_6` join `test`.`t1` `join_7` where 0 group by '0','0','0','0','0'
+DROP TABLE t1;
+SELECT 1 AS ` `;
+
+1
+Warnings:
+Warning 1474 Name ' ' has become ''
+SELECT 1 AS ` `;
+
+1
+Warnings:
+Warning 1474 Name ' ' has become ''
+SELECT 1 AS ` x`;
+x
+1
+Warnings:
+Warning 1466 Leading spaces are removed from name ' x'
+CREATE VIEW v1 AS SELECT 1 AS ``;
+ERROR 42000: Incorrect column name ''
+CREATE VIEW v1 AS SELECT 1 AS ` `;
+ERROR 42000: Incorrect column name ' '
+CREATE VIEW v1 AS SELECT 1 AS ` `;
+ERROR 42000: Incorrect column name ' '
+CREATE VIEW v1 AS SELECT (SELECT 1 AS ` `);
+ERROR 42000: Incorrect column name ' '
+CREATE VIEW v1 AS SELECT 1 AS ` x`;
+Warnings:
+Warning 1466 Leading spaces are removed from name ' x'
+SELECT `x` FROM v1;
+x
+1
+ALTER VIEW v1 AS SELECT 1 AS ` `;
+ERROR 42000: Incorrect column name ' '
+DROP VIEW v1;
+select str_to_date('2007-10-09','%Y-%m-%d') between '2007/10/01 00:00:00 GMT'
+ and '2007/10/20 00:00:00 GMT';
+str_to_date('2007-10-09','%Y-%m-%d') between '2007/10/01 00:00:00 GMT'
+ and '2007/10/20 00:00:00 GMT'
+1
+Warnings:
+Warning 1292 Truncated incorrect datetime value: '2007/10/01 00:00:00 GMT'
+Warning 1292 Truncated incorrect datetime value: '2007/10/20 00:00:00 GMT'
+select str_to_date('2007-10-09','%Y-%m-%d') > '2007/10/01 00:00:00 GMT-6';
+str_to_date('2007-10-09','%Y-%m-%d') > '2007/10/01 00:00:00 GMT-6'
+1
+Warnings:
+Warning 1292 Truncated incorrect date value: '2007/10/01 00:00:00 GMT-6'
+select str_to_date('2007-10-09','%Y-%m-%d') <= '2007/10/2000:00:00 GMT-6';
+str_to_date('2007-10-09','%Y-%m-%d') <= '2007/10/2000:00:00 GMT-6'
+1
+Warnings:
+Warning 1292 Truncated incorrect date value: '2007/10/2000:00:00 GMT-6'
+select str_to_date('2007-10-01','%Y-%m-%d') = '2007-10-1 00:00:00 GMT-6';
+str_to_date('2007-10-01','%Y-%m-%d') = '2007-10-1 00:00:00 GMT-6'
+1
+Warnings:
+Warning 1292 Truncated incorrect date value: '2007-10-1 00:00:00 GMT-6'
+select str_to_date('2007-10-01','%Y-%m-%d') = '2007-10-01 x00:00:00 GMT-6';
+str_to_date('2007-10-01','%Y-%m-%d') = '2007-10-01 x00:00:00 GMT-6'
+1
+Warnings:
+Warning 1292 Truncated incorrect date value: '2007-10-01 x00:00:00 GMT-6'
+select str_to_date('2007-10-01','%Y-%m-%d %H:%i:%s') = '2007-10-01 00:00:00 GMT-6';
+str_to_date('2007-10-01','%Y-%m-%d %H:%i:%s') = '2007-10-01 00:00:00 GMT-6'
+1
+Warnings:
+Warning 1292 Truncated incorrect datetime value: '2007-10-01 00:00:00 GMT-6'
+select str_to_date('2007-10-01','%Y-%m-%d %H:%i:%s') = '2007-10-01 00:x00:00 GMT-6';
+str_to_date('2007-10-01','%Y-%m-%d %H:%i:%s') = '2007-10-01 00:x00:00 GMT-6'
+1
+Warnings:
+Warning 1292 Truncated incorrect datetime value: '2007-10-01 00:x00:00 GMT-6'
+select str_to_date('2007-10-01','%Y-%m-%d %H:%i:%s') = '2007-10-01 x12:34:56 GMT-6';
+str_to_date('2007-10-01','%Y-%m-%d %H:%i:%s') = '2007-10-01 x12:34:56 GMT-6'
+1
+Warnings:
+Warning 1292 Truncated incorrect datetime value: '2007-10-01 x12:34:56 GMT-6'
+select str_to_date('2007-10-01 12:34:00','%Y-%m-%d %H:%i:%s') = '2007-10-01 12:34x:56 GMT-6';
+str_to_date('2007-10-01 12:34:00','%Y-%m-%d %H:%i:%s') = '2007-10-01 12:34x:56 GMT-6'
+1
+Warnings:
+Warning 1292 Truncated incorrect datetime value: '2007-10-01 12:34x:56 GMT-6'
+select str_to_date('2007-10-01 12:34:56','%Y-%m-%d %H:%i:%s') = '2007-10-01 12:34x:56 GMT-6';
+str_to_date('2007-10-01 12:34:56','%Y-%m-%d %H:%i:%s') = '2007-10-01 12:34x:56 GMT-6'
+0
+Warnings:
+Warning 1292 Truncated incorrect datetime value: '2007-10-01 12:34x:56 GMT-6'
+select str_to_date('2007-10-01 12:34:56','%Y-%m-%d %H:%i:%s') = '2007-10-01 12:34:56';
+str_to_date('2007-10-01 12:34:56','%Y-%m-%d %H:%i:%s') = '2007-10-01 12:34:56'
+1
+select str_to_date('2007-10-01','%Y-%m-%d') = '2007-10-01 12:00:00';
+str_to_date('2007-10-01','%Y-%m-%d') = '2007-10-01 12:00:00'
+0
+select str_to_date('2007-10-01 12','%Y-%m-%d %H') = '2007-10-01 12:00:00';
+str_to_date('2007-10-01 12','%Y-%m-%d %H') = '2007-10-01 12:00:00'
+1
+select str_to_date('2007-10-01 12:34','%Y-%m-%d %H') = '2007-10-01 12:00:00';
+str_to_date('2007-10-01 12:34','%Y-%m-%d %H') = '2007-10-01 12:00:00'
+1
+Warnings:
+Warning 1292 Truncated incorrect datetime value: '2007-10-01 12:34'
+select str_to_date('2007-02-30 12:34','%Y-%m-%d %H:%i') = '2007-02-30 12:34';
+str_to_date('2007-02-30 12:34','%Y-%m-%d %H:%i') = '2007-02-30 12:34'
+1
+select str_to_date('2007-10-00 12:34','%Y-%m-%d %H:%i') = '2007-10-00 12:34';
+str_to_date('2007-10-00 12:34','%Y-%m-%d %H:%i') = '2007-10-00 12:34'
+1
+select str_to_date('2007-10-00','%Y-%m-%d') between '2007/09/01 00:00:00'
+ and '2007/10/20 00:00:00';
+str_to_date('2007-10-00','%Y-%m-%d') between '2007/09/01 00:00:00'
+ and '2007/10/20 00:00:00'
+1
+set SQL_MODE=TRADITIONAL;
+select str_to_date('2007-10-00 12:34','%Y-%m-%d %H:%i') = '2007-10-00 12:34';
+str_to_date('2007-10-00 12:34','%Y-%m-%d %H:%i') = '2007-10-00 12:34'
+0
+Warnings:
+Warning 1292 Truncated incorrect datetime value: '2007-10-00 12:34'
+select str_to_date('2007-10-01 12:34','%Y-%m-%d %H:%i') = '2007-10-00 12:34';
+str_to_date('2007-10-01 12:34','%Y-%m-%d %H:%i') = '2007-10-00 12:34'
+0
+Warnings:
+Warning 1292 Truncated incorrect datetime value: '2007-10-00 12:34'
+select str_to_date('2007-10-00 12:34','%Y-%m-%d %H:%i') = '2007-10-01 12:34';
+str_to_date('2007-10-00 12:34','%Y-%m-%d %H:%i') = '2007-10-01 12:34'
+0
+Warnings:
+Warning 1292 Truncated incorrect datetime value: '2007-10-00 12:34:00'
+select str_to_date('2007-10-00','%Y-%m-%d') between '2007/09/01'
+ and '2007/10/20';
+str_to_date('2007-10-00','%Y-%m-%d') between '2007/09/01'
+ and '2007/10/20'
+0
+Warnings:
+Warning 1292 Incorrect datetime value: '2007-10-00' for column '2007/09/01' at row 1
+Warning 1292 Incorrect datetime value: '2007-10-00' for column '2007/10/20' at row 1
+set SQL_MODE=DEFAULT;
+select str_to_date('2007-10-00','%Y-%m-%d') between '' and '2007/10/20';
+str_to_date('2007-10-00','%Y-%m-%d') between '' and '2007/10/20'
+1
+Warnings:
+Warning 1292 Truncated incorrect datetime value: ''
+select str_to_date('','%Y-%m-%d') between '2007/10/01' and '2007/10/20';
+str_to_date('','%Y-%m-%d') between '2007/10/01' and '2007/10/20'
+0
+select str_to_date('','%Y-%m-%d %H:%i') = '2007-10-01 12:34';
+str_to_date('','%Y-%m-%d %H:%i') = '2007-10-01 12:34'
+0
+select str_to_date(NULL,'%Y-%m-%d %H:%i') = '2007-10-01 12:34';
+str_to_date(NULL,'%Y-%m-%d %H:%i') = '2007-10-01 12:34'
+NULL
+select str_to_date('2007-10-00 12:34','%Y-%m-%d %H:%i') = '';
+str_to_date('2007-10-00 12:34','%Y-%m-%d %H:%i') = ''
+0
+Warnings:
+Warning 1292 Truncated incorrect datetime value: ''
+select str_to_date('1','%Y-%m-%d') = '1';
+str_to_date('1','%Y-%m-%d') = '1'
+0
+Warnings:
+Warning 1292 Truncated incorrect date value: '1'
+select str_to_date('1','%Y-%m-%d') = '1';
+str_to_date('1','%Y-%m-%d') = '1'
+0
+Warnings:
+Warning 1292 Truncated incorrect date value: '1'
+select str_to_date('','%Y-%m-%d') = '';
+str_to_date('','%Y-%m-%d') = ''
+0
+Warnings:
+Warning 1292 Truncated incorrect date value: ''
+select str_to_date('1000-01-01','%Y-%m-%d') between '0000-00-00' and NULL;
+str_to_date('1000-01-01','%Y-%m-%d') between '0000-00-00' and NULL
+0
+select str_to_date('1000-01-01','%Y-%m-%d') between NULL and '2000-00-00';
+str_to_date('1000-01-01','%Y-%m-%d') between NULL and '2000-00-00'
+0
+select str_to_date('1000-01-01','%Y-%m-%d') between NULL and NULL;
+str_to_date('1000-01-01','%Y-%m-%d') between NULL and NULL
+0
+CREATE TABLE t1 (c11 INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY);
+CREATE TABLE t2 (c21 INT UNSIGNED NOT NULL,
+c22 INT DEFAULT NULL,
+KEY(c21, c22));
+CREATE TABLE t3 (c31 INT UNSIGNED NOT NULL DEFAULT 0,
+c32 INT DEFAULT NULL,
+c33 INT NOT NULL,
+c34 INT UNSIGNED DEFAULT 0,
+KEY (c33, c34, c32));
+INSERT INTO t1 values (),(),(),(),();
+INSERT INTO t2 SELECT a.c11, b.c11 FROM t1 a, t1 b;
+INSERT INTO t3 VALUES (1, 1, 1, 0),
+(2, 2, 0, 0),
+(3, 3, 1, 0),
+(4, 4, 0, 0),
+(5, 5, 1, 0);
+SELECT c32 FROM t1, t2, t3 WHERE t1.c11 IN (1, 3, 5) AND
+t3.c31 = t1.c11 AND t2.c21 = t1.c11 AND
+t3.c33 = 1 AND t2.c22 in (1, 3)
+ORDER BY c32;
+c32
+1
+1
+3
+3
+5
+5
+SELECT c32 FROM t1, t2, t3 WHERE t1.c11 IN (1, 3, 5) AND
+t3.c31 = t1.c11 AND t2.c21 = t1.c11 AND
+t3.c33 = 1 AND t2.c22 in (1, 3)
+ORDER BY c32 DESC;
+c32
+5
+5
+3
+3
+1
+1
+DROP TABLE t1, t2, t3;
+
+#
+# Bug#30736: Row Size Too Large Error Creating a Table and
+# Inserting Data.
+#
+DROP TABLE IF EXISTS t1;
+DROP TABLE IF EXISTS t2;
+
+CREATE TABLE t1(
+c1 DECIMAL(10, 2),
+c2 FLOAT);
+
+INSERT INTO t1 VALUES (0, 1), (2, 3), (4, 5);
+
+CREATE TABLE t2(
+c3 DECIMAL(10, 2))
+SELECT
+c1 * c2 AS c3
+FROM t1;
+
+SELECT * FROM t1;
+c1 c2
+0.00 1
+2.00 3
+4.00 5
+
+SELECT * FROM t2;
+c3
+0.00
+6.00
+20.00
+
+DROP TABLE t1;
+DROP TABLE t2;
+
+CREATE TABLE t1 (c1 BIGINT NOT NULL);
+INSERT INTO t1 (c1) VALUES (1);
+SELECT * FROM t1 WHERE c1 > NULL + 1;
+c1
+DROP TABLE t1;
+
+CREATE TABLE t1 (a VARCHAR(10) NOT NULL PRIMARY KEY);
+INSERT INTO t1 (a) VALUES ('foo0'), ('bar0'), ('baz0');
+SELECT * FROM t1 WHERE a IN (CONCAT('foo', 0), 'bar');
+a
+foo0
+DROP TABLE t1;
+CREATE TABLE t1 (a INT, b INT);
+CREATE TABLE t2 (a INT, c INT, KEY(a));
+INSERT INTO t1 VALUES (1, 1), (2, 2);
+INSERT INTO t2 VALUES (1, 1), (1, 2), (1, 3), (1, 4), (1, 5),
+(2, 1), (2, 2), (2, 3), (2, 4), (2, 5),
+(3, 1), (3, 2), (3, 3), (3, 4), (3, 5),
+(4, 1), (4, 2), (4, 3), (4, 4), (4, 5);
+FLUSH STATUS;
+SELECT DISTINCT b FROM t1 LEFT JOIN t2 USING(a) WHERE c <= 3;
+b
+1
+2
+SHOW STATUS LIKE 'Handler_read%';
+Variable_name Value
+Handler_read_first 0
+Handler_read_key 2
+Handler_read_next 10
+Handler_read_prev 0
+Handler_read_rnd 10
+Handler_read_rnd_next 7
+DROP TABLE t1, t2;
+CREATE TABLE t1 (f1 bigint(20) NOT NULL default '0',
+f2 int(11) NOT NULL default '0',
+f3 bigint(20) NOT NULL default '0',
+f4 varchar(255) NOT NULL default '',
+PRIMARY KEY (f1),
+KEY key1 (f4),
+KEY key2 (f2));
+CREATE TABLE t2 (f1 int(11) NOT NULL default '0',
+f2 enum('A1','A2','A3') NOT NULL default 'A1',
+f3 int(11) NOT NULL default '0',
+PRIMARY KEY (f1),
+KEY key1 (f3));
+CREATE TABLE t3 (f1 bigint(20) NOT NULL default '0',
+f2 datetime NOT NULL default '1980-01-01 00:00:00',
+PRIMARY KEY (f1));
+insert into t1 values (1, 1, 1, 'abc');
+insert into t1 values (2, 1, 2, 'def');
+insert into t1 values (3, 1, 2, 'def');
+insert into t2 values (1, 'A1', 1);
+insert into t3 values (1, '1980-01-01');
+SELECT a.f3, cr.f4, count(*) count
+FROM t2 a
+STRAIGHT_JOIN t1 cr ON cr.f2 = a.f1
+LEFT JOIN
+(t1 cr2
+JOIN t3 ae2 ON cr2.f3 = ae2.f1
+) ON a.f1 = cr2.f2 AND ae2.f2 < now() - INTERVAL 7 DAY AND
+cr.f4 = cr2.f4
+GROUP BY a.f3, cr.f4;
+f3 f4 count
+1 abc 1
+1 def 2
+drop table t1, t2, t3;
+CREATE TABLE t1 (a INT KEY, b INT);
+INSERT INTO t1 VALUES (1,1), (2,2), (3,3), (4,4);
+EXPLAIN EXTENDED SELECT a, b FROM t1 WHERE a > 1 AND a = b LIMIT 2;
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 SIMPLE t1 range PRIMARY PRIMARY 4 NULL 3 100.00 Using index condition; Using where; Using MRR
+Warnings:
+Note 1003 select `test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b` from `test`.`t1` where ((`test`.`t1`.`b` = `test`.`t1`.`a`) and (`test`.`t1`.`a` > 1)) limit 2
+EXPLAIN EXTENDED SELECT a, b FROM t1 WHERE a > 1 AND b = a LIMIT 2;
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 SIMPLE t1 range PRIMARY PRIMARY 4 NULL 3 100.00 Using index condition; Using where; Using MRR
+Warnings:
+Note 1003 select `test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b` from `test`.`t1` where ((`test`.`t1`.`a` = `test`.`t1`.`b`) and (`test`.`t1`.`a` > 1)) limit 2
+DROP TABLE t1;
+End of 5.0 tests
+create table t1(a INT, KEY (a));
+INSERT INTO t1 VALUES (1),(2),(3),(4),(5);
+SELECT a FROM t1 ORDER BY a LIMIT 2;
+a
+1
+2
+SELECT a FROM t1 ORDER BY a LIMIT 2,4294967296;
+a
+3
+4
+5
+SELECT a FROM t1 ORDER BY a LIMIT 2,4294967297;
+a
+3
+4
+5
+DROP TABLE t1;
+CREATE TABLE A (date_key date);
+CREATE TABLE C (
+pk int,
+int_nokey int,
+int_key int,
+date_key date NOT NULL,
+date_nokey date,
+varchar_key varchar(1)
+);
+INSERT INTO C VALUES
+(1,1,1,'0000-00-00',NULL,NULL),
+(1,1,1,'0000-00-00',NULL,NULL);
+SELECT 1 FROM C WHERE pk > ANY (SELECT 1 FROM C);
+1
+SELECT COUNT(DISTINCT 1) FROM C
+WHERE date_key = (SELECT 1 FROM A WHERE C.date_key IS NULL) GROUP BY pk;
+COUNT(DISTINCT 1)
+SELECT date_nokey FROM C
+WHERE int_key IN (SELECT 1 FROM A)
+HAVING date_nokey = '10:41:7'
+ORDER BY date_key;
+date_nokey
+Warnings:
+Warning 1292 Incorrect date value: '10:41:7' for column 'date_nokey' at row 1
+DROP TABLE A,C;
+CREATE TABLE t1 (a INT NOT NULL, b INT);
+INSERT INTO t1 VALUES (1, 1);
+EXPLAIN EXTENDED SELECT * FROM t1 WHERE (a=a AND a=a) OR b > 2;
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 SIMPLE t1 system NULL NULL NULL NULL 1 100.00
+Warnings:
+Note 1003 select '1' AS `a`,'1' AS `b` from `test`.`t1` where 1
+SELECT * FROM t1 WHERE (a=a AND a=a) OR b > 2;
+a b
+1 1
+DROP TABLE t1;
+CREATE TABLE t1 (a INT NOT NULL, b INT NOT NULL, c INT NOT NULL);
+EXPLAIN EXTENDED SELECT * FROM t1 WHERE (a=a AND b=b AND c=c) OR b > 20;
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 SIMPLE t1 system NULL NULL NULL NULL 0 0.00 const row not found
+Warnings:
+Note 1003 select '0' AS `a`,'0' AS `b`,'0' AS `c` from `test`.`t1` where 1
+EXPLAIN EXTENDED SELECT * FROM t1 WHERE (a=a AND a=a AND b=b) OR b > 20;
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 SIMPLE t1 system NULL NULL NULL NULL 0 0.00 const row not found
+Warnings:
+Note 1003 select '0' AS `a`,'0' AS `b`,'0' AS `c` from `test`.`t1` where 1
+EXPLAIN EXTENDED SELECT * FROM t1 WHERE (a=a AND b=b AND a=a) OR b > 20;
+id select_type table type possible_keys key key_len ref rows filtered Extra
+1 SIMPLE t1 system NULL NULL NULL NULL 0 0.00 const row not found
+Warnings:
+Note 1003 select '0' AS `a`,'0' AS `b`,'0' AS `c` from `test`.`t1` where 1
+DROP TABLE t1;
+#
+# Bug#45266: Uninitialized variable lead to an empty result.
+#
+drop table if exists A,AA,B,BB;
+CREATE TABLE `A` (
+`pk` int(11) NOT NULL AUTO_INCREMENT,
+`date_key` date NOT NULL,
+`date_nokey` date NOT NULL,
+`datetime_key` datetime NOT NULL,
+`int_nokey` int(11) NOT NULL,
+`time_key` time NOT NULL,
+`time_nokey` time NOT NULL,
+PRIMARY KEY (`pk`),
+KEY `date_key` (`date_key`),
+KEY `time_key` (`time_key`),
+KEY `datetime_key` (`datetime_key`)
+);
+CREATE TABLE `AA` (
+`pk` int(11) NOT NULL AUTO_INCREMENT,
+`int_nokey` int(11) NOT NULL,
+`time_key` time NOT NULL,
+KEY `time_key` (`time_key`),
+PRIMARY KEY (`pk`)
+);
+CREATE TABLE `B` (
+`date_nokey` date NOT NULL,
+`date_key` date NOT NULL,
+`time_key` time NOT NULL,
+`datetime_nokey` datetime NOT NULL,
+`varchar_key` varchar(1) NOT NULL,
+KEY `date_key` (`date_key`),
+KEY `time_key` (`time_key`),
+KEY `varchar_key` (`varchar_key`)
+);
+INSERT INTO `B` VALUES ('2003-07-28','2003-07-28','15:13:38','0000-00-00 00:00:00','f'),('0000-00-00','0000-00-00','00:05:48','2004-07-02 14:34:13','x');
+CREATE TABLE `BB` (
+`pk` int(11) NOT NULL AUTO_INCREMENT,
+`int_nokey` int(11) NOT NULL,
+`date_key` date NOT NULL,
+`varchar_nokey` varchar(1) NOT NULL,
+`date_nokey` date NOT NULL,
+PRIMARY KEY (`pk`),
+KEY `date_key` (`date_key`)
+);
+INSERT INTO `BB` VALUES (10,8,'0000-00-00','i','0000-00-00'),(11,0,'2005-08-18','','2005-08-18');
+SELECT table1 . `pk` AS field1
+FROM
+(BB AS table1 INNER JOIN
+(AA AS table2 STRAIGHT_JOIN A AS table3
+ON ( table3 . `date_key` = table2 . `pk` ))
+ON ( table3 . `datetime_key` = table2 . `int_nokey` ))
+WHERE ( table3 . `date_key` <= 4 AND table2 . `pk` = table1 . `varchar_nokey`)
+GROUP BY field1 ;
+field1
+SELECT table3 .`date_key` field1
+FROM
+B table1 LEFT JOIN B table3 JOIN
+(BB table6 JOIN A table7 ON table6 .`varchar_nokey`)
+ON table6 .`int_nokey` ON table6 .`date_key`
+ WHERE NOT ( table1 .`varchar_key` AND table7 .`pk`) GROUP BY field1;
+field1
+NULL
+SELECT table4 . `time_nokey` AS field1 FROM
+(AA AS table1 CROSS JOIN
+(AA AS table2 STRAIGHT_JOIN
+(B AS table3 STRAIGHT_JOIN A AS table4
+ON ( table4 . `date_key` = table3 . `time_key` ))
+ON ( table4 . `pk` = table3 . `date_nokey` ))
+ON ( table4 . `time_key` = table3 . `datetime_nokey` ))
+WHERE ( table4 . `time_key` < table1 . `time_key` AND
+table1 . `int_nokey` != 'f')
+GROUP BY field1 ORDER BY field1 , field1;
+field1
+SELECT table1 .`time_key` field2 FROM B table1 LEFT JOIN BB JOIN A table5 ON table5 .`date_nokey` ON table5 .`int_nokey` GROUP BY field2;
+field2
+00:05:48
+15:13:38
+drop table A,AA,B,BB;
+#end of test for bug#45266
+End of 5.1 tests
+set join_cache_level=default;
+show variables like 'join_cache_level';
+Variable_name Value
+join_cache_level 1
=== modified file 'mysql-test/r/table_elim.result'
--- a/mysql-test/r/table_elim.result 2009-12-15 07:16:46 +0000
+++ b/mysql-test/r/table_elim.result 2009-12-21 02:26:15 +0000
@@ -347,7 +347,7 @@ id select_type table type possible_keys
explain select t1.a from t1 left join t2 on TRUE;
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE t1 ALL NULL NULL NULL NULL 4
-1 SIMPLE t2 index NULL PRIMARY 4 NULL 2 Using index
+1 SIMPLE t2 index NULL PRIMARY 4 NULL 2 Using where; Using index
explain select t1.a from t1 left join t3 on t3.pk1=t1.a and t3.pk2 IS NULL;
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE t1 ALL NULL NULL NULL NULL 4
=== added file 'mysql-test/t/join_cache.test'
--- a/mysql-test/t/join_cache.test 1970-01-01 00:00:00 +0000
+++ b/mysql-test/t/join_cache.test 2009-12-21 02:26:15 +0000
@@ -0,0 +1,1825 @@
+--disable_warnings
+DROP TABLE IF EXISTS t1,t2,t3,t4,t5,t6,t7,t8,t9,t10,t11;
+DROP DATABASE IF EXISTS world;
+--enable_warnings
+
+set names utf8;
+
+CREATE DATABASE world;
+
+use world;
+
+--source include/world_schema1.inc
+
+--disable_query_log
+--disable_result_log
+--disable_warnings
+--source include/world.inc
+--enable_warnings
+--enable_result_log
+--enable_query_log
+
+SELECT COUNT(*) FROM Country;
+SELECT COUNT(*) FROM City;
+SELECT COUNT(*) FROM CountryLanguage;
+
+show variables like 'join_buffer_size';
+
+show variables like 'join_cache_level';
+
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+set join_cache_level=2;
+show variables like 'join_cache_level';
+
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+set join_cache_level=default;
+
+set join_buffer_size=256;
+show variables like 'join_buffer_size';
+
+show variables like 'join_cache_level';
+
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+set join_cache_level=2;
+show variables like 'join_cache_level';
+
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+set join_cache_level=default;
+set join_buffer_size=default;
+
+show variables like 'join_buffer_size';
+show variables like 'join_cache_level';
+
+DROP DATABASE world;
+
+
+CREATE DATABASE world;
+
+use world;
+
+--source include/world_schema.inc
+
+--disable_query_log
+--disable_result_log
+--disable_warnings
+--source include/world.inc
+--enable_warnings
+--enable_result_log
+--enable_query_log
+
+show variables like 'join_buffer_size';
+set join_cache_level=5;
+show variables like 'join_cache_level';
+
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+--echo # !!!NB igor: after backporting the SJ code the following should return
+--echo # EXPLAIN
+--echo # SELECT Name FROM City
+--echo # WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+--echo # City.Population > 100000;
+--echo # id select_type table type possible_keys key key_len ref rows Extra
+--echo # 1 PRIMARY Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+--echo # 1 PRIMARY City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+
+EXPLAIN
+SELECT Name FROM City
+ WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+ City.Population > 100000;
+
+SELECT Name FROM City
+ WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+ City.Population > 100000;
+
+EXPLAIN
+SELECT Country.Name, IF(ISNULL(CountryLanguage.Country), NULL, CountryLanguage.Percentage)
+ FROM Country LEFT JOIN CountryLanguage ON
+ (CountryLanguage.Country=Country.Code AND Language='English')
+ WHERE
+ Country.Population > 10000000;
+
+SELECT Country.Name, IF(ISNULL(CountryLanguage.Country), NULL, CountryLanguage.Percentage)
+ FROM Country LEFT JOIN CountryLanguage ON
+ (CountryLanguage.Country=Country.Code AND Language='English')
+ WHERE
+ Country.Population > 10000000;
+
+set join_cache_level=6;
+show variables like 'join_cache_level';
+
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+--echo # !!!NB igor: after backporting the SJ code the following should return
+--echo # EXPLAIN
+--echo # SELECT Name FROM City
+--echo # WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+--echo # City.Population > 100000;
+--echo # id select_type table type possible_keys key key_len ref rows Extra
+--echo # 1 PRIMARY Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+--echo # 1 PRIMARY City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+
+EXPLAIN
+SELECT Name FROM City
+ WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+ City.Population > 100000;
+
+SELECT Name FROM City
+ WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+ City.Population > 100000;
+
+EXPLAIN
+SELECT Country.Name, IF(ISNULL(CountryLanguage.Country), NULL, CountryLanguage.Percentage)
+ FROM Country LEFT JOIN CountryLanguage ON
+ (CountryLanguage.Country=Country.Code AND Language='English')
+ WHERE
+ Country.Population > 10000000;
+
+SELECT Country.Name, IF(ISNULL(CountryLanguage.Country), NULL, CountryLanguage.Percentage)
+ FROM Country LEFT JOIN CountryLanguage ON
+ (CountryLanguage.Country=Country.Code AND Language='English')
+ WHERE
+ Country.Population > 10000000;
+
+set join_cache_level=7;
+show variables like 'join_cache_level';
+
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+--echo # !!!NB igor: after backporting the SJ code the following should return
+--echo # EXPLAIN
+--echo # SELECT Name FROM City
+--echo # WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+--echo # City.Population > 100000;
+--echo # id select_type table type possible_keys key key_len ref rows Extra
+--echo # 1 PRIMARY Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+--echo # 1 PRIMARY City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+
+EXPLAIN
+SELECT Name FROM City
+ WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+ City.Population > 100000;
+
+SELECT Name FROM City
+ WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+ City.Population > 100000;
+
+EXPLAIN
+SELECT Country.Name, IF(ISNULL(CountryLanguage.Country), NULL, CountryLanguage.Percentage)
+ FROM Country LEFT JOIN CountryLanguage ON
+ (CountryLanguage.Country=Country.Code AND Language='English')
+ WHERE
+ Country.Population > 10000000;
+
+SELECT Country.Name, IF(ISNULL(CountryLanguage.Country), NULL, CountryLanguage.Percentage)
+ FROM Country LEFT JOIN CountryLanguage ON
+ (CountryLanguage.Country=Country.Code AND Language='English')
+ WHERE
+ Country.Population > 10000000;
+
+set join_cache_level=8;
+show variables like 'join_cache_level';
+
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+--echo # !!!NB igor: after backporting the SJ code the following should return
+--echo # EXPLAIN
+--echo # SELECT Name FROM City
+--echo # WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+--echo # City.Population > 100000;
+--echo # id select_type table type possible_keys key key_len ref rows Extra
+--echo # 1 PRIMARY Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+--echo # 1 PRIMARY City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+
+EXPLAIN
+SELECT Name FROM City
+ WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+ City.Population > 100000;
+
+SELECT Name FROM City
+ WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+ City.Population > 100000;
+
+EXPLAIN
+SELECT Country.Name, IF(ISNULL(CountryLanguage.Country), NULL, CountryLanguage.Percentage)
+ FROM Country LEFT JOIN CountryLanguage ON
+ (CountryLanguage.Country=Country.Code AND Language='English')
+ WHERE
+ Country.Population > 10000000;
+
+SELECT Country.Name, IF(ISNULL(CountryLanguage.Country), NULL, CountryLanguage.Percentage)
+ FROM Country LEFT JOIN CountryLanguage ON
+ (CountryLanguage.Country=Country.Code AND Language='English')
+ WHERE
+ Country.Population > 10000000;
+
+set join_buffer_size=256;
+show variables like 'join_buffer_size';
+
+set join_cache_level=5;
+show variables like 'join_cache_level';
+
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+--echo # !!!NB igor: after backporting the SJ code the following should return
+--echo # EXPLAIN
+--echo # SELECT Name FROM City
+--echo # WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+--echo # City.Population > 100000;
+--echo # id select_type table type possible_keys key key_len ref rows Extra
+--echo # 1 PRIMARY Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+--echo # 1 PRIMARY City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+
+EXPLAIN
+SELECT Name FROM City
+ WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+ City.Population > 100000;
+
+SELECT Name FROM City
+ WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+ City.Population > 100000;
+
+set join_cache_level=6;
+show variables like 'join_cache_level';
+
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+--echo # !!!NB igor: after backporting the SJ code the following should return
+--echo # EXPLAIN
+--echo # SELECT Name FROM City
+--echo # WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+--echo # City.Population > 100000;
+--echo # id select_type table type possible_keys key key_len ref rows Extra
+--echo # 1 PRIMARY Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+--echo # 1 PRIMARY City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+
+EXPLAIN
+SELECT Name FROM City
+ WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+ City.Population > 100000;
+
+SELECT Name FROM City
+ WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+ City.Population > 100000;
+
+set join_cache_level=7;
+show variables like 'join_cache_level';
+
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+--echo # !!!NB igor: after backporting the SJ code the following should return
+--echo # EXPLAIN
+--echo # SELECT Name FROM City
+--echo # WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+--echo # City.Population > 100000;
+--echo # id select_type table type possible_keys key key_len ref rows Extra
+--echo # 1 PRIMARY Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+--echo # 1 PRIMARY City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+
+EXPLAIN
+SELECT Name FROM City
+ WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+ City.Population > 100000;
+
+SELECT Name FROM City
+ WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+ City.Population > 100000;
+
+set join_cache_level=8;
+show variables like 'join_cache_level';
+
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+EXPLAIN
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+SELECT City.Name, Country.Name, CountryLanguage.Language
+ FROM City,Country,CountryLanguage
+ WHERE City.Country=Country.Code AND
+ CountryLanguage.Country=Country.Code AND
+ City.Name LIKE 'L%' AND Country.Population > 3000000 AND
+ CountryLanguage.Percentage > 50;
+
+--echo # !!!NB igor: after backporting the SJ code the following should return
+--echo # EXPLAIN
+--echo # SELECT Name FROM City
+--echo # WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+--echo # City.Population > 100000;
+--echo # id select_type table type possible_keys key key_len ref rows Extra
+--echo # 1 PRIMARY Country range PRIMARY,Name Name 52 NULL 10 Using index condition; Using MRR
+--echo # 1 PRIMARY City ref Population,Country Country 3 world.Country.Code 18 Using where; Using join buffer
+
+EXPLAIN
+SELECT Name FROM City
+ WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+ City.Population > 100000;
+
+SELECT Name FROM City
+ WHERE City.Country IN (SELECT Code FROM Country WHERE Country.Name LIKE 'L%') AND
+ City.Population > 100000;
+
+set join_cache_level=default;
+set join_buffer_size=default;
+
+show variables like 'join_buffer_size';
+show variables like 'join_cache_level';
+
+set join_cache_level=1;
+
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND City.Population > 3000000;
+
+set join_cache_level=8;
+set join_buffer_size=256;
+
+--replace_column 9 #
+EXPLAIN
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND City.Population > 3000000;
+
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND City.Population > 3000000;
+
+set join_buffer_size=default;
+
+set join_cache_level=6;
+
+ALTER TABLE Country MODIFY Name varchar(52) NOT NULL default '';
+
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+ALTER TABLE Country MODIFY Name varchar(300) NOT NULL default '';
+
+SELECT City.Name, Country.Name FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+ALTER TABLE Country ADD COLUMN PopulationBar text;
+UPDATE Country
+ SET PopulationBar=REPEAT('x', CAST(Population/100000 AS unsigned int));
+
+SELECT City.Name, Country.Name, Country.PopulationBar FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+set join_buffer_size=256;
+
+SELECT City.Name, Country.Name, Country.PopulationBar FROM City,Country
+ WHERE City.Country=Country.Code AND
+ Country.Name LIKE 'L%' AND City.Population > 100000;
+
+set join_cache_level=default;
+set join_buffer_size=default;
+
+DROP DATABASE world;
+
+use test;
+
+#
+# Bug #35685: assertion abort when initializing a BKA cache
+#
+
+CREATE TABLE t1(
+ affiliatetometaid int NOT NULL default '0',
+ uniquekey int NOT NULL default '0',
+ metaid int NOT NULL default '0',
+ affiliateid int NOT NULL default '0',
+ xml text,
+ isactive char(1) NOT NULL default 'Y',
+ PRIMARY KEY (affiliatetometaid)
+);
+CREATE UNIQUE INDEX t1_uniquekey ON t1(uniquekey);
+CREATE INDEX t1_affiliateid ON t1(affiliateid);
+CREATE INDEX t1_metaid on t1 (metaid);
+INSERT INTO t1 VALUES
+ (1616, 1571693233, 1391, 2, NULL, 'Y'), (1943, 1993216749, 1726, 2, NULL, 'Y');
+
+CREATE TABLE t2(
+ metaid int NOT NULL default '0',
+ name varchar(80) NOT NULL default '',
+ dateadded timestamp NOT NULL ,
+ xml text,
+ status int default NULL,
+ origin int default NULL,
+ gid int NOT NULL default '1',
+ formattypeid int default NULL,
+ PRIMARY KEY (metaid)
+);
+CREATE INDEX t2_status ON t2(status);
+CREATE INDEX t2_gid ON t2(gid);
+CREATE INDEX t2_formattypeid ON t2(formattypeid);
+INSERT INTO t2 VALUES
+ (1391, "I Just Died", "2003-10-02 10:07:37", "", 1, NULL, 3, NULL),
+ (1726, "Me, Myself & I", "2003-12-05 11:24:36", " ", 1, NULL, 3, NULL);
+
+CREATE TABLE t3(
+ mediaid int NOT NULL ,
+ metaid int NOT NULL default '0',
+ formatid int NOT NULL default '0',
+ status int default NULL,
+ path varchar(100) NOT NULL default '',
+ datemodified timestamp NOT NULL ,
+ resourcetype int NOT NULL default '1',
+ parameters text,
+ signature int default NULL,
+ quality int NOT NULL default '255',
+ PRIMARY KEY (mediaid)
+);
+CREATE INDEX t3_metaid ON t3(metaid);
+CREATE INDEX t3_formatid ON t3(formatid);
+CREATE INDEX t3_status ON t3(status);
+CREATE INDEX t3_metaidformatid ON t3(metaid,formatid);
+CREATE INDEX t3_signature ON t3(signature);
+CREATE INDEX t3_quality ON t3(quality);
+INSERT INTO t3 VALUES
+ (6, 4, 8, 0, "010101_anastacia_spmidi.mid", "2004-03-16 13:40:00", 1, NULL, NULL, 255),
+ (3343, 3, 8, 1, "010102_4VN4bsPwnxRQUJW5Zp1RhG2IL9vvl_8.mid", "2004-03-16 13:40:00", 1, NULL, NULL, 255);
+
+CREATE TABLE t4(
+ formatid int NOT NULL ,
+ name varchar(60) NOT NULL default '',
+ formatclassid int NOT NULL default '0',
+ mime varchar(60) default NULL,
+ extension varchar(10) default NULL,
+ priority int NOT NULL default '0',
+ canaddtocapability char(1) NOT NULL default 'Y',
+ PRIMARY KEY (formatid)
+);
+CREATE INDEX t4_formatclassid ON t4(formatclassid);
+CREATE INDEX t4_formats_idx ON t4(canaddtocapability);
+INSERT INTO t4 VALUES
+ (19, "XHTML", 11, "text/html", "xhtml", 10, 'Y'),
+ (54, "AMR (wide band)", 13, "audio/amr-wb", "awb", 0, 'Y');
+
+CREATE TABLE t5(
+ formatclassid int NOT NULL ,
+ name varchar(60) NOT NULL default '',
+ priority int NOT NULL default '0',
+ formattypeid int NOT NULL default '0',
+ PRIMARY KEY (formatclassid)
+);
+CREATE INDEX t5_formattypeid on t5(formattypeid);
+INSERT INTO t5 VALUES
+ (11, "Info", 0, 4), (13, "Digital Audio", 0, 2);
+
+CREATE TABLE t6(
+ formattypeid int NOT NULL ,
+ name varchar(60) NOT NULL default '',
+ priority int default NULL,
+ PRIMARY KEY (formattypeid)
+);
+INSERT INTO t6 VALUES
+ (2, "Ringtones", 0);
+
+CREATE TABLE t7(
+ metaid int NOT NULL default '0',
+ artistid int NOT NULL default '0',
+ PRIMARY KEY (metaid,artistid)
+);
+INSERT INTO t7 VALUES
+ (4, 5), (3, 4);
+
+CREATE TABLE t8(
+ artistid int NOT NULL ,
+ name varchar(80) NOT NULL default '',
+ PRIMARY KEY (artistid)
+);
+INSERT INTO t8 VALUES
+ (5, "Anastacia"), (4, "John Mayer");
+
+CREATE TABLE t9(
+ subgenreid int NOT NULL default '0',
+ metaid int NOT NULL default '0',
+ PRIMARY KEY (subgenreid,metaid)
+) ;
+CREATE INDEX t9_subgenreid ON t9(subgenreid);
+CREATE INDEX t9_metaid ON t9(metaid);
+INSERT INTO t9 VALUES
+ (138, 4), (31, 3);
+
+CREATE TABLE t10(
+ subgenreid int NOT NULL ,
+ genreid int NOT NULL default '0',
+ name varchar(80) NOT NULL default '',
+ PRIMARY KEY (subgenreid)
+) ;
+CREATE INDEX t10_genreid ON t10(genreid);
+INSERT INTO t10 VALUES
+ (138, 19, ''), (31, 3, '');
+
+CREATE TABLE t11(
+ genreid int NOT NULL default '0',
+ name char(80) NOT NULL default '',
+ priority int NOT NULL default '0',
+ masterclip char(1) default NULL,
+ PRIMARY KEY (genreid)
+) ;
+CREATE INDEX t11_masterclip ON t11( masterclip);
+INSERT INTO t11 VALUES
+ (19, "Pop & Dance", 95, 'Y'), (3, "Rock & Alternative", 100, 'Y');
+
+set join_cache_level=6;
+
+EXPLAIN
+SELECT t1.uniquekey, t1.xml AS affiliateXml,
+ t8.name AS artistName, t8.artistid,
+ t11.name AS genreName, t11.genreid, t11.priority AS genrePriority,
+ t10.subgenreid, t10.name AS subgenreName,
+ t2.name AS metaName, t2.metaid, t2.xml AS metaXml,
+ t4.priority + t5.priority + t6.priority AS overallPriority,
+ t3.path AS path, t3.mediaid,
+ t4.formatid, t4.name AS formatName,
+ t5.formatclassid, t5.name AS formatclassName,
+ t6.formattypeid, t6.name AS formattypeName
+FROM t1, t2, t3, t4, t5, t6, t7, t8, t9, t10, t11
+WHERE t7.metaid = t2.metaid AND t7.artistid = t8.artistid AND
+ t9.metaid = t2.metaid AND t9.subgenreid = t10.subgenreid AND
+ t10.genreid = t11.genreid AND t3.metaid = t2.metaid AND
+ t3.formatid = t4.formatid AND t4.formatclassid = t5.formatclassid AND
+ t4.canaddtocapability = 'Y' AND t5.formattypeid = t6.formattypeid AND
+ t6.formattypeid IN (2) AND (t3.formatid IN (31, 8, 76)) AND
+ t1.metaid = t2.metaid AND t1.affiliateid = '2';
+
+SELECT t1.uniquekey, t1.xml AS affiliateXml,
+ t8.name AS artistName, t8.artistid,
+ t11.name AS genreName, t11.genreid, t11.priority AS genrePriority,
+ t10.subgenreid, t10.name AS subgenreName,
+ t2.name AS metaName, t2.metaid, t2.xml AS metaXml,
+ t4.priority + t5.priority + t6.priority AS overallPriority,
+ t3.path AS path, t3.mediaid,
+ t4.formatid, t4.name AS formatName,
+ t5.formatclassid, t5.name AS formatclassName,
+ t6.formattypeid, t6.name AS formattypeName
+FROM t1, t2, t3, t4, t5, t6, t7, t8, t9, t10, t11
+WHERE t7.metaid = t2.metaid AND t7.artistid = t8.artistid AND
+ t9.metaid = t2.metaid AND t9.subgenreid = t10.subgenreid AND
+ t10.genreid = t11.genreid AND t3.metaid = t2.metaid AND
+ t3.formatid = t4.formatid AND t4.formatclassid = t5.formatclassid AND
+ t4.canaddtocapability = 'Y' AND t5.formattypeid = t6.formattypeid AND
+ t6.formattypeid IN (2) AND (t3.formatid IN (31, 8, 76)) AND
+ t1.metaid = t2.metaid AND t1.affiliateid = '2';
+
+DROP TABLE t1,t2,t3,t4,t5,t6,t7,t8,t9,t10,t11;
+
+#
+# Bug #37131: 3-way join query with BKA used with a small buffer and
+# only for the third table
+#
+
+CREATE TABLE t1 (a1 int, filler1 char(64) default ' ' );
+CREATE TABLE t2 (
+ a2 int, b2 int, filler2 char(64) default ' ',
+ PRIMARY KEY idx(a2,b2,filler2)
+) ;
+CREATE TABLE t3 (b3 int, c3 int, INDEX idx(b3));
+
+INSERT INTO t1(a1) VALUES
+ (4), (7), (1), (9), (8), (5), (3), (6), (2);
+INSERT INTO t2(a2,b2) VALUES
+ (1,30), (3,40), (2,61), (6,73), (8,92), (9,27), (4,18), (5,84), (7,56),
+ (4,14), (6,76), (8,98), (7,55), (1,39), (2,68), (3,45), (9,21), (5,81),
+ (5,88), (2,65), (6,74), (9,23), (1,37), (3,44), (4,17), (8,99), (7,51),
+ (9,28), (7,52), (1,33), (4,13), (5,87), (3,43), (8,91), (2,62), (6,79),
+ (3,49), (8,93), (7,34), (5,82), (6,78), (2,63), (1,32), (9,22), (4,11);
+INSERT INTO t3 VALUES
+ (30,302), (92,923), (18,187), (45,459), (30,309),
+ (39,393), (68,685), (45,458), (21,210), (81,817),
+ (40,405), (61,618), (73,738), (92,929), (27,275),
+ (18,188), (84,846), (56,564), (14,144), (76,763),
+ (98,982), (55,551), (17,174), (99,998), (51,513),
+ (28,282), (52,527), (33,336), (13,138), (87,878),
+ (43,431), (91,916), (62,624), (79,797), (49,494),
+ (93,933), (34,347), (82,829), (78,780), (63,634),
+ (32,329), (22,228), (11,114), (74,749), (23,236);
+
+set join_cache_level=1;
+
+EXPLAIN
+SELECT a1<>a2, a1, a2, b2, b3, c3,
+ SUBSTR(filler1,1,1) AS s1, SUBSTR(filler2,1,1) AS s2
+FROM t1,t2,t3 WHERE a1=a2 AND b2=b3 AND MOD(c3,10)>7;
+
+SELECT a1<>a2, a1, a2, b2, b3, c3,
+ SUBSTR(filler1,1,1) AS s1, SUBSTR(filler2,1,1) AS s2
+FROM t1,t2,t3 WHERE a1=a2 AND b2=b3 AND MOD(c3,10)>7;
+
+set join_cache_level=5;
+set join_buffer_size=512;
+
+EXPLAIN
+SELECT a1<>a2, a1, a2, b2, b3, c3,
+ SUBSTR(filler1,1,1) AS s1, SUBSTR(filler2,1,1) AS s2
+FROM t1,t2,t3 WHERE a1=a2 AND b2=b3 AND MOD(c3,10)>7;
+
+SELECT a1<>a2, a1, a2, b2, b3, c3,
+ SUBSTR(filler1,1,1) AS s1, SUBSTR(filler2,1,1) AS s2
+FROM t1,t2,t3 WHERE a1=a2 AND b2=b3 AND MOD(c3,10)>7;
+
+DROP TABLE t1,t2,t3;
+
+#
+# Bug #37690: crash with a tiny buffer when using BKA_JOIN_CACHE_UNIQUE
+#
+
+CREATE TABLE t1 (a int, b int, INDEX idx(b));
+CREATE TABLE t2 (a int, b int, INDEX idx(a));
+INSERT INTO t1 VALUES (5,30), (3,20), (7,40), (2,10), (8,30), (1,10), (4,20);
+INSERT INTO t2 VALUES (7,10), (1,20), (2,20), (8,20), (8,10), (1,20);
+INSERT INTO t2 VALUES (1,10), (4,20), (3,20), (7,20), (7,10), (1,20);
+
+set join_buffer_size=32;
+set join_cache_level=8;
+
+EXPLAIN SELECT * FROM t1,t2 WHERE t1.a=t2.a AND t1.b >= 30;
+--sorted_result
+SELECT * FROM t1,t2 WHERE t1.a=t2.a AND t1.b >= 30;
+
+DROP TABLE t1,t2;
+
+--echo #
+--echo # Bug #40134: outer join with not exists optimization and join buffer
+--echo #
+
+set join_cache_level=default;
+set join_buffer_size=default;
+
+CREATE TABLE t1 (a int NOT NULL);
+INSERT INTO t1 VALUES (2), (4), (3), (5), (1);
+CREATE TABLE t2 (a int NOT NULL, b int NOT NULL, INDEX i_a(a));
+INSERT INTO t2 VALUES (4,10), (2,10), (2,30), (2,20), (4,20);
+
+EXPLAIN
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a=t2.a WHERE t2.b IS NULL;
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a=t2.a WHERE t2.b IS NULL;
+
+SET join_cache_level=6;
+EXPLAIN
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a=t2.a WHERE t2.b IS NULL;
+SELECT * FROM t1 LEFT JOIN t2 ON t1.a=t2.a WHERE t2.b IS NULL;
+
+DROP TABLE t1, t2;
+
+set join_cache_level=default;
+set join_buffer_size=default;
+
+--echo #
+--echo # BUG#40136: Group by is ignored when join buffer is used for an outer join
+--echo #
+create table t1(a int PRIMARY KEY, b int);
+insert into t1 values
+ (5, 10), (2, 70), (7, 80), (6, 20), (1, 50), (9, 40), (8, 30), (3, 60);
+create table t2 (p int, a int, INDEX i_a(a));
+insert into t2 values
+ (103, 7), (109, 3), (102, 3), (108, 1), (106, 3),
+ (107, 7), (105, 1), (101, 3), (100, 7), (110, 1);
+set @save_join_cache_level=@@join_cache_level;
+set join_cache_level=6;
+--echo The following must not show "using join cache":
+explain
+select t1.a, count(t2.p) as count
+ from t1 left join t2 on t1.a=t2.a and t2.p % 2 = 1 group by t1.a;
+select t1.a, count(t2.p) as count
+ from t1 left join t2 on t1.a=t2.a and t2.p % 2 = 1 group by t1.a;
+set join_cache_level=@save_join_cache_level;
+drop table t1, t2;
+
+--echo #
+--echo # BUG#40268: Nested outer join with not null-rejecting where condition
+--echo # over an inner table which is not the last in the nest
+--echo #
+
+CREATE TABLE t2 (a int, b int, c int);
+CREATE TABLE t3 (a int, b int, c int);
+CREATE TABLE t4 (a int, b int, c int);
+
+INSERT INTO t2 VALUES (3,3,0), (4,2,0), (5,3,0);
+INSERT INTO t3 VALUES (1,2,0), (2,2,0);
+INSERT INTO t4 VALUES (3,2,0), (4,2,0);
+
+set join_cache_level=6;
+
+SELECT t2.a,t2.b,t3.a,t3.b,t4.a,t4.b
+ FROM t2 LEFT JOIN (t3, t4) ON t2.b=t4.b
+ WHERE t3.a+2<t2.a OR t3.c IS NULL;
+
+set join_cache_level=default;
+DROP TABLE t2, t3, t4;
+
+--echo #
+--echo # Bug #40192: outer join with where clause when using BNL
+--echo #
+
+create table t1 (a int, b int);
+insert into t1 values (2, 20), (3, 30), (1, 10);
+create table t2 (a int, c int);
+insert into t2 values (1, 101), (3, 102), (1, 100);
+
+set join_cache_level=6;
+
+select * from t1 left join t2 on t1.a=t2.a;
+explain select * from t1 left join t2 on t1.a=t2.a where t2.c=102 or t2.c is null;
+select * from t1 left join t2 on t1.a=t2.a where t2.c=102 or t2.c is null;
+
+set join_cache_level=default;
+drop table t1, t2;
+
+--echo #
+--echo # Bug #40317: outer join with with constant on expression equal to FALSE
+--echo #
+
+create table t1 (a int);
+insert into t1 values (30), (40), (20);
+create table t2 (b int);
+insert into t2 values (200), (100);
+
+set join_cache_level=6;
+
+select * from t1 left join t2 on (1=0);
+explain select * from t1 left join t2 on (1=0) where a=40;
+select * from t1 left join t2 on (1=0) where a=40;
+
+set join_cache_level=1;
+explain select * from t1 left join t2 on (1=0);
+
+set join_cache_level=default;
+drop table t1, t2;
+
+--echo #
+--echo # Bug #41204: small buffer with big rec_per_key for ref access
+--echo #
+
+CREATE TABLE t1 (a int);
+
+INSERT INTO t1 VALUES (0);
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1(a) SELECT a FROM t1;
+INSERT INTO t1 VALUES (20000), (10000);
+
+CREATE TABLE t2 (pk int AUTO_INCREMENT PRIMARY KEY, b int, c int, INDEX idx(b));
+INSERT INTO t2(b,c) VALUES (10000, 3), (20000, 7), (20000, 1), (10000, 9), (20000, 5);
+INSERT INTO t2(b,c) SELECT b,c FROM t2;
+INSERT INTO t2(b,c) SELECT b,c FROM t2;
+INSERT INTO t2(b,c) SELECT b,c FROM t2;
+INSERT INTO t2(b,c) SELECT b,c FROM t2;
+INSERT INTO t2(b,c) SELECT b,c FROM t2;
+INSERT INTO t2(b,c) SELECT b,c FROM t2;
+INSERT INTO t2(b,c) SELECT b,c FROM t2;
+INSERT INTO t2(b,c) SELECT b,c FROM t2;
+
+--disable_result_log
+ANALYZE TABLE t1,t2;
+--enable_result_log
+
+set join_cache_level=6;
+set join_buffer_size=1024;
+
+EXPLAIN SELECT AVG(c) FROM t1,t2 WHERE t1.a=t2.b;
+SELECT AVG(c) FROM t1,t2 WHERE t1.a=t2.b;
+
+set join_buffer_size=default;
+set join_cache_level=default;
+
+DROP TABLE t1, t2;
+
+--echo #
+--echo # Bug #41894: big join buffer of level 7 used to join records
+--echo # with null values in place of varchar strings
+--echo #
+
+CREATE TABLE t1 (a int NOT NULL AUTO_INCREMENT PRIMARY KEY,
+ b varchar(127) DEFAULT NULL);
+
+INSERT INTO t1(a) VALUES (1);
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+
+CREATE TABLE t2 (a int NOT NULL PRIMARY KEY, b varchar(127) DEFAULT NULL);
+INSERT INTO t2 SELECT * FROM t1;
+
+CREATE TABLE t3 (a int NOT NULL PRIMARY KEY, b varchar(127) DEFAULT NULL);
+INSERT INTO t3 SELECT * FROM t1;
+
+set join_cache_level=7;
+set join_buffer_size=1024*1024;
+
+EXPLAIN
+SELECT COUNT(*) FROM t1,t2,t3
+ WHERE t1.a=t2.a AND t2.a=t3.a AND
+ t1.b IS NULL AND t2.b IS NULL AND t3.b IS NULL;
+
+SELECT COUNT(*) FROM t1,t2,t3
+ WHERE t1.a=t2.a AND t2.a=t3.a AND
+ t1.b IS NULL AND t2.b IS NULL AND t3.b IS NULL;
+
+set join_buffer_size=default;
+set join_cache_level=default;
+
+DROP TABLE t1,t2,t3;
+
+--echo #
+--echo # Bug #42020: join buffer is used for outer join with fields of
+--echo # several outer tables in join buffer
+--echo #
+
+CREATE TABLE t1 (
+ a bigint NOT NULL,
+ PRIMARY KEY (a)
+);
+INSERT INTO t1 VALUES
+ (2), (1);
+
+CREATE TABLE t2 (
+ a bigint NOT NULL,
+ b bigint NOT NULL,
+ PRIMARY KEY (a,b)
+);
+INSERT INTO t2 VALUES
+ (2,30), (2,40), (2,50), (2,60), (2,70), (2,80),
+ (1,10), (1, 20), (1,30), (1,40), (1,50);
+
+CREATE TABLE t3 (
+ pk bigint NOT NULL AUTO_INCREMENT,
+ a bigint NOT NULL,
+ b bigint NOT NULL,
+ val bigint DEFAULT '0',
+ PRIMARY KEY (pk),
+ KEY idx (a,b)
+);
+INSERT INTO t3(a,b) VALUES
+ (2,30), (2,40), (2,50), (2,60), (2,70), (2,80),
+ (4,30), (4,40), (4,50), (4,60), (4,70), (4,80),
+ (5,30), (5,40), (5,50), (5,60), (5,70), (5,80),
+ (7,30), (7,40), (7,50), (7,60), (7,70), (7,80);
+
+SELECT t1.a, t2.a, t3.a, t2.b, t3.b, t3.val
+ FROM (t1,t2) LEFT JOIN t3 ON (t1.a=t3.a AND t2.b=t3.b)
+ WHERE t1.a=t2.a;
+
+set join_cache_level=6;
+set join_buffer_size=256;
+
+EXPLAIN
+SELECT t1.a, t2.a, t3.a, t2.b, t3.b, t3.val
+ FROM (t1,t2) LEFT JOIN t3 ON (t1.a=t3.a AND t2.b=t3.b)
+ WHERE t1.a=t2.a;
+--sorted_result
+SELECT t1.a, t2.a, t3.a, t2.b, t3.b, t3.val
+ FROM (t1,t2) LEFT JOIN t3 ON (t1.a=t3.a AND t2.b=t3.b)
+ WHERE t1.a=t2.a;
+
+DROP INDEX idx ON t3;
+set join_cache_level=4;
+
+EXPLAIN
+SELECT t1.a, t2.a, t3.a, t2.b, t3.b, t3.val
+ FROM (t1,t2) LEFT JOIN t3 ON (t1.a=t3.a AND t2.b=t3.b)
+ WHERE t1.a=t2.a;
+
+--sorted_result
+SELECT t1.a, t2.a, t3.a, t2.b, t3.b, t3.val
+ FROM (t1,t2) LEFT JOIN t3 ON (t1.a=t3.a AND t2.b=t3.b)
+ WHERE t1.a=t2.a;
+
+set join_buffer_size=default;
+set join_cache_level=default;
+
+DROP TABLE t1,t2,t3;
+
+#
+# WL#4424 Full index condition pushdown with batched key access join
+#
+create table t1(f1 int, f2 int);
+insert into t1 values (1,1),(2,2),(3,3);
+create table t2(f1 int not null, f2 int not null, f3 char(200), key(f1,f2));
+insert into t2 values (1,1, 'qwerty'),(1,2, 'qwerty'),(1,3, 'qwerty');
+insert into t2 values (2,1, 'qwerty'),(2,2, 'qwerty'),(2,3, 'qwerty'),
+ (2,4, 'qwerty'),(2,5, 'qwerty');
+insert into t2 values (3,1, 'qwerty'),(3,4, 'qwerty');
+insert into t2 values (4,1, 'qwerty'),(4,2, 'qwerty'),(4,3, 'qwerty'),
+ (4,4, 'qwerty');
+insert into t2 values (1,1, 'qwerty'),(1,2, 'qwerty'),(1,3, 'qwerty');
+insert into t2 values (2,1, 'qwerty'),(2,2, 'qwerty'),(2,3, 'qwerty'),
+ (2,4, 'qwerty'),(2,5, 'qwerty');
+insert into t2 values (3,1, 'qwerty'),(3,4, 'qwerty');
+insert into t2 values (4,1, 'qwerty'),(4,2, 'qwerty'),(4,3, 'qwerty'),
+ (4,4, 'qwerty');
+
+set join_cache_level=5;
+select t2.f1, t2.f2, t2.f3 from t1,t2
+where t1.f1=t2.f1 and t2.f2 between t1.f1 and t1.f2 and t2.f2 + 1 >= t1.f1 + 1;
+
+explain select t2.f1, t2.f2, t2.f3 from t1,t2
+where t1.f1=t2.f1 and t2.f2 between t1.f1 and t2.f2;
+
+set join_cache_level=6;
+select t2.f1, t2.f2, t2.f3 from t1,t2
+where t1.f1=t2.f1 and t2.f2 between t1.f1 and t1.f2 and t2.f2 + 1 >= t1.f1 + 1;
+
+explain select t2.f1, t2.f2, t2.f3 from t1,t2
+where t1.f1=t2.f1 and t2.f2 between t1.f1 and t2.f2;
+
+set join_cache_level=7;
+select t2.f1, t2.f2, t2.f3 from t1,t2
+where t1.f1=t2.f1 and t2.f2 between t1.f1 and t1.f2 and t2.f2 + 1 >= t1.f1 + 1;
+
+explain select t2.f1, t2.f2, t2.f3 from t1,t2
+where t1.f1=t2.f1 and t2.f2 between t1.f1 and t2.f2;
+
+set join_cache_level=8;
+select t2.f1, t2.f2, t2.f3 from t1,t2
+where t1.f1=t2.f1 and t2.f2 between t1.f1 and t1.f2 and t2.f2 + 1 >= t1.f1 + 1;
+
+explain select t2.f1, t2.f2, t2.f3 from t1,t2
+where t1.f1=t2.f1 and t2.f2 between t1.f1 and t2.f2;
+
+drop table t1,t2;
+set join_cache_level=default;
+
+--echo #
+--echo # Bug #42955: join with GROUP BY/ORDER BY and when BKA is enabled
+--echo #
+
+create table t1 (d int, id1 int, index idx1 (d, id1));
+insert into t1 values
+ (3, 20), (2, 40), (3, 10), (1, 10), (3, 20), (1, 40), (2, 30), (3, 30);
+
+create table t2 (id1 int, id2 int, index idx2 (id1));
+insert into t2 values
+ (20, 100), (30, 400), (20, 400), (30, 200), (10, 300), (10, 200), (40, 100),
+ (40, 200), (30, 300), (10, 400), (20, 200), (20, 300);
+
+set join_cache_level=6;
+
+explain
+select t1.id1, sum(t2.id2) from t1 join t2 on t1.id1=t2.id1
+ where t1.d=3 group by t1.id1;
+
+select t1.id1, sum(t2.id2) from t1 join t2 on t1.id1=t2.id1
+ where t1.d=3 group by t1.id1;
+
+explain
+select t1.id1 from t1 join t2 on t1.id1=t2.id1
+ where t1.d=3 and t2.id2 > 200 order by t1.id1;
+
+select t1.id1 from t1 join t2 on t1.id1=t2.id1
+ where t1.d=3 and t2.id2 > 200 order by t1.id1;
+
+set join_cache_level=default;
+
+drop table t1,t2;
+
+--echo #
+--echo # Bug #44019: star-like multi-join query executed join_cache_level=6
+--echo #
+
+create table t1 (a int, b int, c int, d int);
+create table t2 (b int, e varchar(16), index idx(b));
+create table t3 (d int, f varchar(16), index idx(d));
+create table t4 (c int, g varchar(16), index idx(c));
+
+insert into t1 values
+ (5, 50, 500, 5000), (3, 30, 300, 3000), (9, 90, 900, 9000),
+ (2, 20, 200, 2000), (4, 40, 400, 4000), (8, 80, 800, 800),
+ (7, 70, 700, 7000);
+insert into t2 values
+ (30, 'bbb'), (10, 'b'), (70, 'bbbbbbb'), (60, 'bbbbbb'),
+ (31, 'bbb'), (11, 'b'), (71, 'bbbbbbb'), (61, 'bbbbbb'),
+ (32, 'bbb'), (12, 'b'), (72, 'bbbbbbb'), (62, 'bbbbbb');
+insert into t3 values
+ (4000, 'dddd'), (3000, 'ddd'), (1000, 'd'), (8000, 'dddddddd'),
+ (4001, 'dddd'), (3001, 'ddd'), (1001, 'd'), (8001, 'dddddddd'),
+ (4002, 'dddd'), (3002, 'ddd'), (1002, 'd'), (8002, 'dddddddd');
+insert into t4 values
+ (200, 'cc'), (600, 'cccccc'), (300, 'ccc'), (500, 'ccccc'),
+ (201, 'cc'), (601, 'cccccc'), (301, 'ccc'), (501, 'ccccc'),
+ (202, 'cc'), (602, 'cccccc'), (302, 'ccc'), (502, 'ccccc');
+
+--disable_result_log
+--disable_warnings
+analyze table t2,t3,t4;
+--enable_warnings
+--enable_result_log
+
+set join_cache_level=1;
+explain
+select t1.a, t1.b, t1.c, t1.d, t2.e, t3.f, t4.g from t1,t2,t3,t4
+ where t2.b=t1.b and t3.d=t1.d and t4.c=t1.c;
+
+select t1.a, t1.b, t1.c, t1.d, t2.e, t3.f, t4.g from t1,t2,t3,t4
+ where t2.b=t1.b and t3.d=t1.d and t4.c=t1.c;
+
+set join_cache_level=6;
+explain
+select t1.a, t1.b, t1.c, t1.d, t2.e, t3.f, t4.g from t1,t2,t3,t4
+ where t2.b=t1.b and t3.d=t1.d and t4.c=t1.c;
+
+select t1.a, t1.b, t1.c, t1.d, t2.e, t3.f, t4.g from t1,t2,t3,t4
+ where t2.b=t1.b and t3.d=t1.d and t4.c=t1.c;
+
+set join_cache_level=default;
+
+drop table t1,t2,t3,t4;
+
+--echo #
+--echo # Bug #44250: Corruption of linked join buffers when using BKA
+--echo #
+
+CREATE TABLE t1 (
+ id1 bigint(20) DEFAULT NULL,
+ id2 bigint(20) DEFAULT NULL,
+ id3 bigint(20) DEFAULT NULL,
+ num1 bigint(20) DEFAULT NULL,
+ num2 int(11) DEFAULT NULL,
+ num3 bigint(20) DEFAULT NULL
+);
+
+CREATE TABLE t2 (
+ id3 bigint(20) NOT NULL DEFAULT '0',
+ id4 bigint(20) DEFAULT NULL,
+ enum1 enum('Enabled','Disabled','Paused') DEFAULT NULL,
+ PRIMARY KEY (id3)
+);
+
+CREATE TABLE t3 (
+ id4 bigint(20) NOT NULL DEFAULT '0',
+ text1 text,
+ PRIMARY KEY (id4)
+);
+
+CREATE TABLE t4 (
+ id2 bigint(20) NOT NULL DEFAULT '0',
+ dummy int(11) DEFAULT '0',
+ PRIMARY KEY (id2)
+);
+
+CREATE TABLE t5 (
+ id1 bigint(20) NOT NULL DEFAULT '0',
+ id2 bigint(20) NOT NULL DEFAULT '0',
+ enum2 enum('Active','Deleted','Paused') DEFAULT NULL,
+ PRIMARY KEY (id1,id2)
+);
+
+--disable_query_log
+--disable_result_log
+--disable_warnings
+
+INSERT INTO t1 VALUES
+(228172702,72485641,2667134182,10,1,14),(228172702,94266195,2667134182,134,0,134),
+(228172702,94266195,2667134182,15,0,15),(228172702,94266195,2667134182,2,0,3),
+(228172702,818095880,2667134182,1,1,1),(228172702,1004959639,2667134182,3,0,3),
+(228172702,1297484422,2667134182,1,2,1),(228172702,1730911800,2667134182,11,0,28),
+(228172702,1730911800,2667134182,4,0,4),(228172702,2182755982,2667134182,5,0,15),
+(228172702,2182755982,2667134182,1,0,1),(228172702,2968841184,2667134182,1,0,1),
+(228172702,4765525626,2667134182,2,0,3),(228172702,4765525626,2667134182,29,0,38),
+(228172702,4765525626,2667134182,7,0,7),(228172702,4765525626,2667134182,7,0,8),
+(228172702,5330573302,2667134182,1,0,1),(228512602,191149872,935692942,3,0,17),
+(228512602,259118753,935692942,13,7,13),(228512602,259118753,935692942,83,33,83),
+(228512602,585705465,935692942,1,0,1),(228512602,585716775,935692942,1,0,1),
+(228512602,585716775,935692942,6,6,6),(228512602,585716775,935692942,1,1,1),
+(228512602,1105371172,935692942,2,0,3),(228512602,1105371172,935692942,7,2,7),
+(228512602,1314223462,935692942,1,0,1),(228512602,1314223642,935692942,1,1,1),
+(228512602,1411060522,935692942,1,0,1),(228512602,1467398182,935692942,1,0,1),
+(228512602,1467398182,935692942,3,0,4),(228512602,1467398242,935692942,10,0,41),
+(228512602,1467398242,935692942,28,0,40),(228512602,1467398242,935692942,0,0,0),
+(228512602,1467398242,935692942,29,2,33),(228512602,1734178942,935692942,1,0,1),
+(228512602,1734179122,935692942,1,0,4),(228512602,1734179122,935692942,3,0,6),
+(228512602,1953612870,935692942,1,0,1),(228512602,2271510562,935692942,1,1,1),
+(228512602,2271525022,935692942,0,0,0),(228512602,3058831402,935692942,1,1,1),
+(228512602,3723638842,935692942,1,1,1),(228512602,3723638842,935692942,4,3,4),
+(228512602,3723836602,935692942,1,1,1),(228512602,3723836842,935692942,1,1,1),
+(228512602,3723836962,935692942,1,1,1),(228512602,3723988102,935692942,11,4,11),
+(228512602,3723989182,935692942,8,3,8),(228512602,5920283002,935692942,1,0,1),
+(228512602,5920314232,935692942,1,0,1),(228512602,191149872,1241589892,0,0,0),
+(228512602,191149872,1241589892,2,0,4),(228512602,191149872,1241589892,0,0,0),
+(228512602,259118753,1241589892,8,4,8),(228512602,259118753,1241589892,70,33,70),
+(228512602,259118753,1241589892,1,1,1),(228512602,585716775,1241589892,8,7,8),
+(228512602,1105371172,1241589892,1,0,1),(228512602,1105371172,1241589892,9,0,9),
+(228512602,1314223462,1241589892,1,0,1),(228512602,1411060522,1241589892,1,1,1),
+(228512602,1467398182,1241589892,1,0,1),(228512602,1467398182,1241589892,4,1,4),
+(228512602,1467398182,1241589892,1,0,1),(228512602,1467398242,1241589892,10,0,28),
+(228512602,1467398242,1241589892,37,1,78),(228512602,1467398242,1241589892,28,9,30),
+(228512602,1467398242,1241589892,5,0,6),(228512602,1734179122,1241589892,3,1,18),
+(228512602,1734179122,1241589892,1,1,1),(228512602,1734179122,1241589892,2,0,3),
+(228512602,1953611430,1241589892,1,1,1),(228512602,1953611430,1241589892,1,1,1),
+(228512602,1953612870,1241589892,1,0,1),(228512602,2026844250,1241589892,1,0,1),
+(228512602,2271510562,1241589892,1,1,1),(228512602,2271525022,1241589892,1,0,1),
+(228512602,2941612417,1241589892,1,0,1),(228512602,3723988102,1241589892,1,0,1);
+INSERT INTO t1 VALUES
+(228512602,3723988102,1241589892,11,4,11),(228512602,3723989002,1241589892,1,0,1),
+(228512602,3752960902,1241589892,2,2,4),(228808822,17304242,935693782,6,0,17),
+(228808822,17304242,935693782,28,1,50),(228808822,17304242,935693782,29,3,61),
+(228808822,17304242,935693782,6,0,13),(228808822,30931012,935693782,21,0,60),
+(228808822,30931012,935693782,5,0,13),(228808822,37254452,935693782,3,0,3),
+(228808822,42726891,935693782,1,0,4),(228808822,42726891,935693782,3,0,6),
+(228808822,76261151,935693782,8,0,18),(228808822,88240139,935693782,1,0,1),
+(228808822,88240139,935693782,3,0,3),(228808822,94730895,935693782,2,0,4),
+(228808822,179737402,935693782,10,0,13),(228808822,179737402,935693782,7,0,8),
+(228808822,179737402,935693782,3,0,4),(228808822,271288782,935693782,1,0,6),
+(228808822,304690943,935693782,5,2,10),(228808822,304691183,935693782,4,0,16),
+(228808822,568994960,935693782,1,0,1),(228808822,631705925,935693782,1,0,1),
+(228808822,631745165,935693782,1,0,1),(228808822,631749605,935693782,1,0,4),
+(228808822,1057787002,935693782,1,0,1),(228808822,1057787002,935693782,2,1,4),
+(228808822,1057787002,935693782,12,1,20),(228808822,1057788022,935693782,2,0,40),
+(228808822,1057788022,935693782,2,1,3),(228808822,1057788022,935693782,9,2,16),
+(228808822,1335646822,935693782,3,1,6),(228808822,1335646882,935693782,1,0,3),
+(228808822,1335646882,935693782,1,0,3),(228808822,1335646942,935693782,7,2,15),
+(228808822,5510586183,935693782,1,1,1),(228808822,17304242,2482416112,11,0,28),
+(228808822,17304242,2482416112,34,0,62),(228808822,17304242,2482416112,43,2,89),
+(228808822,17304242,2482416112,9,0,19),(228808822,30931012,2482416112,32,2,84),
+(228808822,30931012,2482416112,6,0,14),(228808822,30931012,2482416112,2,0,9),
+(228808822,37254452,2482416112,1,1,1),(228808822,42726891,2482416112,2,0,10),
+(228808822,76261151,2482416112,11,0,26),(228808822,88240139,2482416112,3,0,3),
+(228808822,88240139,2482416112,1,0,1),(228808822,88240139,2482416112,3,0,4),
+(228808822,94730895,2482416112,1,0,3),(228808822,125469602,2482416112,0,0,0),
+(228808822,179737402,2482416112,4,0,10),(228808822,179737402,2482416112,8,1,9),
+(228808822,179737402,2482416112,7,1,9),(228808822,179737402,2482416112,1,0,1),
+(228808822,271288782,2482416112,2,0,14),(228808822,304690943,2482416112,3,0,6),
+(228808822,304691183,2482416112,1,0,4),(228808822,555689643,2482416112,2,1,8),
+(228808822,555689643,2482416112,1,0,4),(228808822,631705925,2482416112,1,0,1),
+(228808822,631712555,2482416112,1,0,1),(228808822,631745165,2482416112,1,0,1),
+(228808822,710348755,2482416112,1,0,1),(228808822,753718113,2482416112,1,0,1),
+(228808822,1057787002,2482416112,1,0,4),(228808822,1057787002,2482416112,1,0,1),
+(228808822,1057787002,2482416112,4,1,7),(228808822,1057788022,2482416112,7,0,12),
+(228808822,1057788022,2482416112,3,0,37),(228808822,1057788022,2482416112,0,0,0),
+(228808822,1057788022,2482416112,12,0,15),(228808822,1335646822,2482416112,14,1,28),
+(228808822,1335646882,2482416112,1,1,3),(228808822,1335646942,2482416112,5,1,9),
+(228808822,1335646942,2482416112,1,0,1),(230941762,16069490,2691187582,0,0,0),
+(230941762,16705991,2691187582,16,0,30),(230941762,16705991,2691187582,12,3,12);
+INSERT INTO t1 VALUES
+(230941762,16705991,2691187582,1,0,1),(230941762,27714032,2691187582,6,0,16),
+(230941762,27714032,2691187582,1,0,1),(230941762,27714032,2691187582,9,0,14),
+(230941762,28676710,2691187582,3,1,4),(230941762,370319272,2691187582,7,0,7),
+(230941762,1409814802,2691187582,1,0,3),(230941762,1409814982,2691187582,1,0,1),
+(230941762,1409814982,2691187582,1,1,1),(230941762,2069703256,2691187582,1,0,3),
+(230941762,16705991,2691187672,8,1,20),(230941762,16705991,2691187672,11,6,11),
+(230941762,16705991,2691187672,1,0,1),(230941762,27714032,2691187672,5,0,20),
+(230941762,27714032,2691187672,1,0,10),(230941762,27714032,2691187672,12,2,17),
+(230941762,28676710,2691187672,1,0,1),(230941762,142889951,2691187672,2,0,10),
+(230941762,172526592,2691187672,1,1,1),(230941762,293109282,2691187672,1,0,1),
+(230941762,370319272,2691187672,10,0,10),(230941762,1409814802,2691187672,1,0,3),
+(230941762,1409814922,2691187672,1,0,1),(230941762,1409814982,2691187672,1,0,1),
+(230941762,16069490,2694472582,1,1,1),(230941762,16069490,2694472582,1,1,1),
+(230941762,16705991,2694472582,15,0,45),(230941762,16705991,2694472582,13,2,15),
+(230941762,27714032,2694472582,9,0,34),(230941762,27714032,2694472582,2,0,4),
+(230941762,27714032,2694472582,10,2,14),(230941762,28676710,2694472582,4,0,12),
+(230941762,28676710,2694472582,1,0,1),(230941762,172526592,2694472582,1,0,4),
+(230941762,293109282,2694472582,1,0,1),(230941762,370319272,2694472582,6,0,6),
+(230941762,1409814802,2694472582,1,0,3),(230941762,1409814862,2694472582,1,0,4),
+(230941762,1409814982,2694472582,1,0,1),(230941762,2680867980,2694472582,1,0,3),
+(230942122,25451690,935695702,1,0,9),(230942122,31549341,935695702,2,0,18),
+(230942122,31549341,935695702,2,0,4),(230942122,38900150,935695702,4,0,29),
+(230942122,38900150,935695702,4,1,13),(230942122,906919252,935695702,39,0,271),
+(230942122,906919252,935695702,20,0,83),(230942122,906919252,935695702,2,1,9),
+(230942122,1409816782,935695702,3,0,18),(230942122,1409816842,935695702,1,0,7),
+(230942122,1409816842,935695702,1,0,3),(230942122,1409816902,935695702,1,0,6),
+(230942122,2145075862,935695702,4,1,4),(230942122,25451690,935695822,2,0,16),
+(230942122,38900150,935695822,3,0,26),(230942122,38900150,935695822,1,0,3),
+(230942122,906919252,935695822,24,0,176),(230942122,906919252,935695822,20,0,74),
+(230942122,906919252,935695822,1,0,3),(230942122,1409816782,935695822,2,0,21),
+(230942122,1409816782,935695822,2,0,21),(230942122,1409816842,935695822,1,0,3),
+(230942122,1409816902,935695822,1,0,7),(231112162,1413675742,935696902,1,0,1),
+(231112162,1413675742,935696962,0,0,0),(231112162,1413675742,935696962,4,2,4),
+(231112162,1413675922,935696962,1,0,1),(231112162,1413675922,935696962,1,0,1),
+(231112162,1413675742,1248588922,1,0,1),(231112162,1413675922,1248588922,3,0,3),
+(233937022,12641121,935697562,2,0,13),(233937022,12653871,935697562,1,0,1),
+(233937022,12693551,935697562,1,0,1),(233937022,12910461,935697562,2,0,6),
+(233937022,12910461,935697562,26,0,65),(233937022,12910461,935697562,44,8,45),
+(233937022,12910481,935697562,12,0,19),(233937022,12910481,935697562,7,2,9),
+(233937022,12910481,935697562,1,0,1),(233937022,12910511,935697562,8,0,8);
+INSERT INTO t1 VALUES
+(233937022,12910511,935697562,20,6,22),(233937022,30879781,935697562,34,0,34),
+(233937022,30879781,935697562,3,0,4),(233937022,30879781,935697562,1,0,1),
+(233937022,45631730,935697562,8,0,39),(233937022,54079090,935697562,12,0,12),
+(233937022,54079090,935697562,7,0,11),(233937022,54079090,935697562,14,0,16),
+(233937022,94431735,935697562,6,0,31),(233937022,96876131,935697562,3,0,4),
+(233937022,105436492,935697562,4,0,4),(233937022,128981555,935697562,3,0,3),
+(233937022,145211004,935697562,1,0,1),(233937022,146382622,935697562,1,0,1),
+(233937022,175678702,935697562,1,0,4),(233937022,298998998,935697562,1,0,1),
+(233937022,335995773,935697562,3,0,3),(233937022,335995773,935697562,2,0,3),
+(233937022,347447636,935697562,0,0,0),(233937022,459295955,935697562,3,0,3),
+(233937022,459376625,935697562,1,0,1),(233937022,495877773,935697562,1,0,1),
+(233937022,497008702,935697562,1,0,3),(233937022,561944105,935697562,1,0,1),
+(233937022,561944105,935697562,1,0,1),(233937022,586535965,935697562,3,0,3),
+(233937022,631549775,935697562,1,0,7),(233937022,647138479,935697562,1,0,1),
+(233937022,655870453,935697562,4,0,7),(233937022,694832725,935697562,1,0,1),
+(233937022,864475057,935697562,1,0,1),(233937022,1010757503,935697562,1,0,4),
+(233937022,1010847736,935697562,2,0,9),(233937022,1287437116,935697562,2,0,4),
+(233937022,1337693056,935697562,1,0,1),(233937022,1569279742,935697562,1,1,1),
+(233937022,1569280102,935697562,2,0,7),(233937022,1569280882,935697562,2,1,3),
+(233937022,1569281062,935697562,1,0,1),(233937022,1569281962,935697562,1,0,3),
+(233937022,2823580588,935697562,2,0,8),(233937022,2823580588,935697562,3,1,10),
+(233937022,2842066134,935697562,1,0,1),(233937022,2904542181,935697562,1,0,1),
+(233937022,3058483627,935697562,1,0,1),(233937022,4507287318,935697562,1,0,1),
+(233937022,5283489892,935697562,1,0,1),(233937022,11890554322,935697562,16,0,16),
+(233937022,11890756102,935697562,3,1,3),(233937022,12641121,953996482,1,0,7),
+(233937022,12641851,953996482,1,0,1),(233937022,12641851,953996482,1,0,1),
+(233937022,12910461,953996482,4,0,14),(233937022,12910461,953996482,20,2,23),
+(233937022,12910461,953996482,43,5,43),(233937022,12910461,953996482,1,0,1),
+(233937022,12910481,953996482,17,2,30),(233937022,12910511,953996482,7,1,8),
+(233937022,12910511,953996482,23,5,23),(233937022,14913951,953996482,2,0,3),
+(233937022,21835210,953996482,1,1,1),(233937022,26481052,953996482,1,1,1),
+(233937022,26481052,953996482,1,0,1),(233937022,30879781,953996482,2,0,3),
+(233937022,30879781,953996482,22,0,22),(233937022,35617681,953996482,1,0,1),
+(233937022,45631730,953996482,3,0,11),(233937022,54079090,953996482,13,0,13),
+(233937022,54079090,953996482,11,0,16),(233937022,54079090,953996482,29,0,34),
+(233937022,94431735,953996482,3,0,9),(233937022,96876131,953996482,3,0,4),
+(233937022,105436492,953996482,1,0,1),(233937022,105437952,953996482,3,1,3),
+(233937022,123639716,953996482,1,0,6),(233937022,145211004,953996482,2,0,3),
+(233937022,145211004,953996482,2,1,3),(233937022,146382622,953996482,1,0,1),
+(233937022,146382622,953996482,1,0,1),(233937022,155454324,953996482,1,0,1);
+INSERT INTO t1 VALUES
+(233937022,298998998,953996482,1,1,1),(233937022,335995773,953996482,1,0,1),
+(233937022,335995773,953996482,7,2,9),(233937022,459295955,953996482,2,0,4),
+(233937022,561944105,953996482,1,0,1),(233937022,655870453,953996482,5,0,9),
+(233937022,694832725,953996482,1,0,1),(233937022,694832725,953996482,1,0,1),
+(233937022,864475057,953996482,4,1,4),(233937022,897886118,953996482,1,0,1),
+(233937022,897886118,953996482,1,0,3),(233937022,1005147016,953996482,1,0,1),
+(233937022,1010757503,953996482,1,0,1),(233937022,1082217873,953996482,1,0,1),
+(233937022,1286925326,953996482,1,0,1),(233937022,1337693056,953996482,4,0,4),
+(233937022,1407236408,953996482,2,0,3),(233937022,1569280102,953996482,1,0,6),
+(233937022,1569280222,953996482,1,0,1),(233937022,1569281062,953996482,1,0,1),
+(233937022,1569284362,953996482,1,0,3),(233937022,2823580588,953996482,1,0,3),
+(233937022,2904542181,953996482,3,0,7),(233937022,4371581485,953996482,1,0,1),
+(233937022,5283491332,953996482,1,0,1),(233937022,7300486013,953996482,1,1,1),
+(233937022,11890554322,953996482,16,0,16),(233937022,11890754392,953996482,1,0,1),
+(233937022,11890754392,953996482,0,0,0);
+
+INSERT INTO t2 VALUES
+(2667134182,2567095402,'Enabled'),(935692942,826927822,'Enabled'),
+(1241589892,1130891152,'Enabled'),(935693782,826928662,'Enabled'),
+(2482416112,2381969632,'Enabled'),(2691187582,2591198842,'Enabled'),
+(2691187672,2591198932,'Enabled'),(2694472582,2594492212,'Paused'),
+(935695702,826930582,'Enabled'),(935695822,826930702,'Enabled'),
+(935696902,826931782,'Enabled'),(935696962,826931842,'Enabled'),
+(1248588922,1137805582,'Enabled'),(935697562,826932442,'Paused'),
+(953996482,845181202,'Enabled'),(2702549092,2602579882,'Enabled'),
+(2702549182,2602579972,'Enabled'),(2702550712,2602581502,'Enabled'),
+(1125312412,1015179502,'Enabled'),(2708245462,2608290202,'Enabled'),
+(2708247262,2608292002,'Enabled'),(935699242,826934122,'Enabled'),
+(1125312502,1015179592,'Enabled'),(1125312592,1015179682,'Enabled'),
+(2711450452,2611502302,'Enabled'),(2711452252,2611504102,'Enabled'),
+(935699902,826934782,'Enabled'),(935700262,826935142,'Enabled'),
+(1215381442,1104677032,'Enabled'),(2503848082,2403457762,'Enabled'),
+(935701762,826936642,'Enabled'),(935701822,826936702,'Enabled'),
+(1468810282,1355227402,'Enabled'),(935702842,826937722,'Enabled'),
+(1125312682,1015179772,'Enabled'),(2713816102,2613869392,'Enabled'),
+(2688452032,2588455012,'Enabled'),(2688452212,2588455192,'Enabled'),
+(2701527412,2601556942,'Enabled'),(1623918712,1510242412,'Enabled'),
+(2701521922,2601551452,'Enabled'),(2701527772,2601557302,'Enabled');
+
+INSERT INTO `t3` VALUES
+(2567095402,'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'),
+(826927822,'BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB'),(1130891152,'BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB'),
+(826928662,'CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC'),
+(2381969632,'CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC'),
+(2591198842,'DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD'),
+(2591198932,'EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE'),
+(2594492212,'FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF'),
+(826930582,'GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG'),
+(826930702,'GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG'),
+(826931782,'BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB'),
+(826931842,'BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB'),
+(1137805582,'BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB');
+
+INSERT INTO t4 VALUES
+(12618121,0),(12641121,0),(12641851,0),(12653871,0),(12665801,0),(12666811,0),
+(12693551,0),(12910461,0),(12910481,0),(12910511,0),(14787251,0),(14913941,0),
+(14913951,0),(16069490,0),(16705901,0),(16705991,0),(17291062,0),(17304242,0),
+(20737411,0),(21524370,0),(21835210,0),(25300361,0),(25451690,0),(25728842,0),
+(26481052,0),(27714032,0),(28676710,0),(30879781,0),(30931012,0),(31549341,0),
+(35617681,0),(37254452,0),(38619430,0),(38895490,0),(38900150,0),(39798990,0),
+(42726891,0),(42867050,0),(43439030,0),(45631730,0),(47171711,0),(49539832,0),
+(54079090,0),(60442241,0),(65320501,0),(72485641,0),(76261151,0),(87949714,0),
+(88240139,0),(94266195,0),(94431735,0),(94730895,0),(96876131,0);
+
+INSERT INTO t5 VALUES
+(228172702,72485641,'Active'),(228172702,94266195,'Active'),
+(228172702,818095880,'Active'),(228172702,1004959639,'Active'),
+(228172702,1297484242,'Active'),(228172702,1297484422,'Active'),
+(228172702,1730911800,'Active'),(228172702,1808277389,'Active'),
+(228172702,2182755982,'Active'),(228172702,2968841184,'Active'),
+(228172702,3015116542,'Active'),(228172702,3752383170,'Active'),
+(228172702,4765525626,'Active'),(228172702,5330573302,'Active'),
+(228512602,191149872,'Active'),(228512602,259118753,'Active'),
+(228512602,585705465,'Active'),(228512602,585716775,'Active'),
+(228512602,1105371172,'Active'),(228512602,1314223462,'Active'),
+(228512602,1314223642,'Active'),(228512602,1411060522,'Active'),
+(228512602,1467398182,'Active'),(228512602,1467398242,'Active'),
+(228512602,1734178942,'Active'),(228512602,1734179122,'Active'),
+(228512602,1953612870,'Active'),(228512602,2271510562,'Active'),
+(228512602,2271525022,'Active'),(228512602,2941612417,'Active'),
+(228512602,3058831402,'Active'),(228512602,3723638842,'Active'),
+(228512602,3723836602,'Active'),(228512602,3723836842,'Active'),
+(228512602,3723836962,'Active'),(228512602,3723988102,'Active'),
+(228512602,3723989182,'Active'),(228512602,5920283002,'Active'),
+(228512602,5920314232,'Active'),(228512602,585717615,'Active'),
+(228512602,1953611430,'Active'),(228512602,2026844250,'Active'),
+(228512602,3058831462,'Active'),(228512602,3723836902,'Active'),
+(228512602,3723989002,'Active'),(228512602,3752960902,'Active'),
+(228808822,17304242,'Active'),(228808822,30931012,'Active'),
+(228808822,37254452,'Active'),(228808822,42726891,'Active'),
+(228808822,76261151,'Active'),(228808822,88240139,'Active'),
+(228808822,94730895,'Active'),(228808822,125469622,'Active'),
+(228808822,179737402,'Active'),(228808822,271288782,'Active'),
+(228808822,304690943,'Active'),(228808822,304691183,'Active'),
+(228808822,496123368,'Active'),(228808822,555689643,'Active'),
+(228808822,568994960,'Active'),(228808822,631705925,'Active'),
+(228808822,631745165,'Active'),(228808822,631749605,'Active'),
+(228808822,1057787002,'Active'),(228808822,1057788022,'Active'),
+(228808822,1335646822,'Active'),(228808822,1335646882,'Active'),
+(228808822,1335646942,'Active'),(228808822,1612792238,'Active'),
+(228808822,5510586183,'Active'),(228808822,47171711,'Active'),
+(228808822,125469602,'Active'),(228808822,631712555,'Active'),
+(228808822,710348755,'Active'),(228808822,753718113,'Active'),
+(230941762,16069490,'Active'),(230941762,16705991,'Active'),
+(230941762,27714032,'Active'),(230941762,28676710,'Active');
+INSERT INTO t5 VALUES
+(230941762,370319272,'Active'),(230941762,1409814802,'Active'),
+(230941762,1409814982,'Active'),(230941762,2069703256,'Active'),
+(230941762,142889951,'Active'),(230941762,172526592,'Active'),
+(230941762,293109282,'Active'),(230941762,1409814922,'Active'),
+(230941762,1409814862,'Active'),(230941762,2680867980,'Active'),
+(230942122,25451690,'Active'),(230942122,31549341,'Active'),
+(230942122,38900150,'Active'),(230942122,464554745,'Active'),
+(230942122,906919252,'Active'),(230942122,1409816782,'Active'),
+(230942122,1409816842,'Active'),(230942122,1409816902,'Active'),
+(230942122,2145075862,'Active'),(231112162,1413675742,'Active'),
+(231112162,1413675922,'Active'),(231112162,1413675562,'Active'),
+(231112162,1413675802,'Active'),(233937022,12641121,'Active'),
+(233937022,12653871,'Active'),(233937022,12693551,'Active'),
+(233937022,12910461,'Active'),(233937022,12910481,'Active'),
+(233937022,12910511,'Active'),(233937022,14913941,'Active'),
+(233937022,30879781,'Active'),(233937022,45631730,'Active'),
+(233937022,54079090,'Active'),(233937022,65320501,'Active'),
+(233937022,94431735,'Active'),(233937022,96876131,'Active'),
+(233937022,105436492,'Active'),(233937022,105437952,'Active'),
+(233937022,128981555,'Active'),(233937022,145211004,'Active'),
+(233937022,146382622,'Active'),(233937022,148832422,'Active'),
+(233937022,175678702,'Active'),(233937022,260507673,'Active'),
+(233937022,298998998,'Active'),(233937022,335995773,'Active'),
+(233937022,347447636,'Active'),(233937022,459295955,'Active'),
+(233937022,459376625,'Active'),(233937022,495877773,'Active'),
+(233937022,497008702,'Active'),(233937022,561944105,'Active'),
+(233937022,586535965,'Active'),(233937022,631549775,'Active'),
+(233937022,647138479,'Active'),(233937022,655870453,'Active'),
+(233937022,694832725,'Active'),(233937022,835712045,'Active'),
+(233937022,864475057,'Active'),(233937022,864484777,'Active'),
+(233937022,1010757503,'Active'),(233937022,1010847736,'Active'),
+(233937022,1091554836,'Active'),(233937022,1287437116,'Active'),
+(233937022,1337693056,'Active'),(233937022,1569279742,'Active'),
+(233937022,1569280102,'Active'),(233937022,1569280222,'Active'),
+(233937022,1569280582,'Active'),(233937022,1569280882,'Active'),
+(233937022,1569281062,'Active'),(233937022,1569281962,'Active'),
+(233937022,1569284362,'Active'),(233937022,1743317015,'Active'),
+(233937022,2698799002,'Active'),(233937022,2698800742,'Active'),
+(233937022,2823580588,'Active'),(233937022,2842066134,'Active'),
+(233937022,2904542181,'Active'),(233937022,3058483627,'Active');
+INSERT INTO t5 VALUES
+(233937022,4507287318,'Active'),(233937022,5283489892,'Active'),
+(233937022,11890554322,'Active'),(233937022,11890756102,'Active'),
+(233937022,12641851,'Active'),(233937022,14913951,'Active'),
+(233937022,21835210,'Active'),(233937022,26481052,'Active'),
+(233937022,35617681,'Active'),(233937022,123639716,'Active'),
+(233937022,155454324,'Active'),(233937022,299001668,'Active'),
+(233937022,897886118,'Active'),(233937022,1005147016,'Active'),
+(233937022,1082217873,'Active'),(233937022,1286925326,'Active'),
+(233937022,1407236408,'Active'),(233937022,4371581485,'Active'),
+(233937022,5283491332,'Active'),(233937022,7300486013,'Active'),
+(233937022,11890754392,'Active');
+
+--enable_warnings
+--enable_result_log
+--enable_query_log
+
+set join_cache_level=8;
+set join_buffer_size=2048;
+
+EXPLAIN
+SELECT STRAIGHT_JOIN t1.id1, t1.num3, t3.text1, t3.id4, t2.id3, t4.dummy
+ FROM t1 JOIN t2 JOIN t3 JOIN t4 JOIN t5
+ WHERE t1.id1=t5.id1 AND t1.id2=t5.id2 and t4.id2=t1.id2 AND
+ t5.enum2='Active' AND t3.id4=t2.id4 AND t2.id3=t1.id3 AND t3.text1<'D';
+
+SELECT STRAIGHT_JOIN t1.id1, t1.num3, t3.text1, t3.id4, t2.id3, t4.dummy
+ FROM t1 JOIN t2 JOIN t3 JOIN t4 JOIN t5
+ WHERE t1.id1=t5.id1 AND t1.id2=t5.id2 and t4.id2=t1.id2 AND
+ t5.enum2='Active' AND t3.id4=t2.id4 AND t2.id3=t1.id3 AND t3.text1<'D';
+
+set join_buffer_size=default;
+set join_cache_level=default;
+
+DROP TABLE t1,t2,t3,t4,t5;
+
+--echo #
+--echo # Bug#45267: Incomplete check caused wrong result.
+--echo #
+CREATE TABLE t1 (
+ `pk` int(11) NOT NULL AUTO_INCREMENT PRIMARY KEY
+);
+CREATE TABLE t3 (
+ `pk` int(11) NOT NULL AUTO_INCREMENT PRIMARY KEY
+);
+INSERT INTO t3 VALUES
+(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),
+(16),(17),(18),(19),(20);
+CREATE TABLE t2 (
+ `pk` int(11) NOT NULL AUTO_INCREMENT,
+ `int_nokey` int(11) NOT NULL,
+ `time_key` time NOT NULL,
+ PRIMARY KEY (`pk`),
+ KEY `time_key` (`time_key`)
+);
+INSERT INTO t2 VALUES (10,9,'22:36:46'),(11,0,'08:46:46');
+
+SELECT DISTINCT t1.`pk`
+FROM t1 RIGHT JOIN t2 STRAIGHT_JOIN t3 ON t2.`int_nokey` ON t2.`time_key`
+GROUP BY 1;
+
+DROP TABLE IF EXISTS t1, t2, t3;
+
+--echo #
+--echo # Bug #46328: Use of aggregate function without GROUP BY clause
+--echo # returns many rows (vs. one )
+--echo #
+
+CREATE TABLE t1 (
+ int_key int(11) NOT NULL,
+ KEY int_key (int_key)
+);
+
+INSERT INTO t1 VALUES
+(0),(2),(2),(2),(3),(4),(5),(5),(6),(6),(8),(8),(9),(9);
+
+CREATE TABLE t2 (
+ int_key int(11) NOT NULL,
+ KEY int_key (int_key)
+);
+
+INSERT INTO t2 VALUES (2),(3);
+
+--echo
+
+--echo # The query shall return 1 record with a max value 9 and one of the
+--echo # int_key values inserted above (undefined which one). A changed
+--echo # execution plan may change the value in the second column
+SELECT MAX(t1.int_key), t1.int_key
+FROM t1 STRAIGHT_JOIN t2
+ORDER BY t1.int_key;
+
+--echo
+
+explain
+SELECT MAX(t1.int_key), t1.int_key
+FROM t1 STRAIGHT_JOIN t2
+ORDER BY t1.int_key;
+
+--echo
+
+DROP TABLE t1,t2;
+
+SET join_cache_level=default;
+
+--echo #
+--echo # Regression test for
+--echo # Bug#46733 - NULL value not returned for aggregate on empty result
+--echo # set w/ semijoin on
+--echo #
+
+CREATE TABLE t1 (
+ i int(11) NOT NULL,
+ v varchar(1) DEFAULT NULL,
+ PRIMARY KEY (i)
+);
+
+INSERT INTO t1 VALUES (10,'a'),(11,'b'),(12,'c'),(13,'d');
+
+CREATE TABLE t2 (
+ i int(11) NOT NULL,
+ v varchar(1) DEFAULT NULL,
+ PRIMARY KEY (i)
+);
+
+INSERT INTO t2 VALUES (1,'x'),(2,'y');
+
+--echo
+
+SELECT MAX(t1.i)
+FROM t1 JOIN t2 ON t2.v
+ORDER BY t2.v;
+
+--echo
+
+EXPLAIN
+SELECT MAX(t1.i)
+FROM t1 JOIN t2 ON t2.v
+ORDER BY t2.v;
+
+--echo
+
+DROP TABLE t1,t2;
+
+--echo #
+--echo # Bug #45092: join buffer contains two blob columns one of which is
+--echo # used in the key employed to access the joined table
+--echo #
+
+CREATE TABLE t1 (c1 int, c2 int, key (c2));
+INSERT INTO t1 VALUES (1,1);
+INSERT INTO t1 VALUES (2,2);
+
+CREATE TABLE t2 (c1 text, c2 text);
+INSERT INTO t2 VALUES('tt', 'uu');
+INSERT INTO t2 VALUES('zzzz', 'xxxxxxxxx');
+
+--disable_result_log
+ANALYZE TABLE t1,t2;
+--enable_result_log
+
+set join_cache_level=6;
+
+SELECT t1.*, t2.*, LENGTH(t2.c1), LENGTH(t2.c2) FROM t1,t2
+ WHERE t1.c2=LENGTH(t2.c2) and t1.c1=LENGTH(t2.c1);
+
+set join_cache_level=default;
+
+DROP TABLE t1,t2;
=== added file 'mysql-test/t/join_nested_jcl6.test'
--- a/mysql-test/t/join_nested_jcl6.test 1970-01-01 00:00:00 +0000
+++ b/mysql-test/t/join_nested_jcl6.test 2009-12-21 02:26:15 +0000
@@ -0,0 +1,95 @@
+#
+# Run join_nested.test with BKA enabled
+#
+
+set join_cache_level=6;
+show variables like 'join_cache_level';
+
+--source t/join_nested.test
+
+#
+# BUG#35835: queries with nested outer joins with BKA enabled
+#
+
+CREATE TABLE t5 (a int, b int, c int, PRIMARY KEY(a), KEY b_i (b));
+CREATE TABLE t6 (a int, b int, c int, PRIMARY KEY(a), KEY b_i (b));
+CREATE TABLE t7 (a int, b int, c int, PRIMARY KEY(a), KEY b_i (b));
+CREATE TABLE t8 (a int, b int, c int, PRIMARY KEY(a), KEY b_i (b));
+
+INSERT INTO t5 VALUES (1,1,0), (2,2,0), (3,3,0);
+INSERT INTO t6 VALUES (1,2,0), (3,2,0), (6,1,0);
+INSERT INTO t7 VALUES (1,1,0), (2,2,0);
+INSERT INTO t8 VALUES (0,2,0), (1,2,0);
+
+EXPLAIN
+SELECT t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+ FROM t5
+ LEFT JOIN
+ (
+ (t6, t7)
+ LEFT JOIN
+ t8
+ ON t7.b=t8.b AND t6.b < 10
+ )
+ ON t6.b >= 2 AND t5.b=t7.b AND
+ (t8.a > 0 OR t8.c IS NULL);
+
+SELECT t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+ FROM t5
+ LEFT JOIN
+ (
+ (t6, t7)
+ LEFT JOIN
+ t8
+ ON t7.b=t8.b AND t6.b < 10
+ )
+ ON t6.b >= 2 AND t5.b=t7.b AND
+ (t8.a > 0 OR t8.c IS NULL);
+
+DELETE FROM t5;
+DELETE FROM t6;
+DELETE FROM t7;
+DELETE FROM t8;
+
+INSERT INTO t5 VALUES (1,3,0), (3,2,0);
+INSERT INTO t6 VALUES (3,3,0);
+INSERT INTO t7 VALUES (1,2,0);
+INSERT INTO t8 VALUES (1,1,0);
+
+EXPLAIN
+SELECT t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+ FROM t5 LEFT JOIN
+ (t6 LEFT JOIN t7 ON t7.a=1, t8)
+ ON (t5.b=t8.b);
+
+SELECT t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+ FROM t5 LEFT JOIN
+ (t6 LEFT JOIN t7 ON t7.a=1, t8)
+ ON (t5.b=t8.b);
+
+EXPLAIN
+SELECT t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+ FROM t5 LEFT JOIN
+ (t6 LEFT JOIN t7 ON t7.b=2, t8)
+ ON (t5.b=t8.b);
+
+SELECT t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+ FROM t5 LEFT JOIN
+ (t6 LEFT JOIN t7 ON t7.b=2, t8)
+ ON (t5.b=t8.b);
+
+EXPLAIN
+SELECT t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+ FROM t5 LEFT JOIN
+ (t8, t6 LEFT JOIN t7 ON t7.a=1)
+ ON (t5.b=t8.b);
+
+SELECT t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
+ FROM t5 LEFT JOIN
+ (t8, t6 LEFT JOIN t7 ON t7.a=1)
+ ON (t5.b=t8.b);
+
+DROP TABLE t5,t6,t7,t8;
+
+set join_cache_level=default;
+show variables like 'join_cache_level';
=== added file 'mysql-test/t/join_outer_jcl6.test'
--- a/mysql-test/t/join_outer_jcl6.test 1970-01-01 00:00:00 +0000
+++ b/mysql-test/t/join_outer_jcl6.test 2009-12-21 02:26:15 +0000
@@ -0,0 +1,11 @@
+#
+# Run join_outer.test with BKA enabled
+#
+
+set join_cache_level=6;
+show variables like 'join_cache_level';
+
+--source t/join_outer.test
+
+set join_cache_level=default;
+show variables like 'join_cache_level';
=== added file 'mysql-test/t/select_jcl6.test'
--- a/mysql-test/t/select_jcl6.test 1970-01-01 00:00:00 +0000
+++ b/mysql-test/t/select_jcl6.test 2009-12-21 02:26:15 +0000
@@ -0,0 +1,11 @@
+#
+# Run select.test with BKA enabled
+#
+
+set join_cache_level=6;
+show variables like 'join_cache_level';
+
+--source t/select.test
+
+set join_cache_level=default;
+show variables like 'join_cache_level';
=== modified file 'sql/CMakeLists.txt'
--- a/sql/CMakeLists.txt 2009-12-15 22:37:39 +0000
+++ b/sql/CMakeLists.txt 2009-12-21 02:26:15 +0000
@@ -62,9 +62,10 @@ SET (SQL_SOURCE
sp_rcontext.cc spatial.cc sql_acl.cc sql_analyse.cc sql_base.cc
sql_cache.cc sql_class.cc sql_client.cc sql_crypt.cc sql_crypt.h
sql_cursor.cc sql_db.cc sql_delete.cc sql_derived.cc sql_do.cc
- sql_error.cc sql_handler.cc sql_help.cc sql_insert.cc sql_lex.cc
- sql_list.cc sql_load.cc sql_manager.cc sql_map.cc sql_parse.cc
- sql_partition.cc sql_plugin.cc sql_prepare.cc sql_rename.cc
+ sql_error.cc sql_handler.cc sql_help.cc sql_insert.cc
+ sql_join_cache.cc sql_lex.cc sql_list.cc sql_load.cc sql_manager.cc
+ sql_map.cc sql_parse.cc sql_partition.cc sql_plugin.cc
+ sql_prepare.cc sql_rename.cc
sql_repl.cc sql_select.cc sql_show.cc sql_state.c sql_string.cc
sql_table.cc sql_test.cc sql_trigger.cc sql_udf.cc sql_union.cc
sql_update.cc sql_view.cc strfunc.cc table.cc thr_malloc.cc
=== modified file 'sql/Makefile.am'
--- a/sql/Makefile.am 2009-12-15 07:16:46 +0000
+++ b/sql/Makefile.am 2009-12-21 02:26:15 +0000
@@ -91,6 +91,7 @@ mysqld_SOURCES = ds_mrr.cc sql_lex.cc sq
mysqld.cc password.c hash_filo.cc hostname.cc \
sql_connect.cc scheduler.cc sql_parse.cc \
set_var.cc sql_yacc.yy \
+ sql_join_cache.cc \
sql_base.cc table.cc sql_select.cc sql_insert.cc \
sql_profile.cc \
sql_prepare.cc sql_error.cc sql_locale.cc \
=== modified file 'sql/ds_mrr.cc'
--- a/sql/ds_mrr.cc 2009-12-15 17:23:55 +0000
+++ b/sql/ds_mrr.cc 2009-12-21 02:26:15 +0000
@@ -990,9 +990,6 @@ void push_index_cond(JOIN_TAB *tab, uint
{
Item *idx_remainder_cond= 0;
tab->pre_idx_push_select_cond= tab->select_cond;
-#if 0
- /*
- psergey: enable the below when we backport BKA: */
/*
For BKA cache we store condition to special BKA cache field
because evaluation of the condition requires additional operations
@@ -1011,7 +1008,6 @@ void push_index_cond(JOIN_TAB *tab, uint
~(tab->table->map | tab->join->const_table_map)))
tab->cache_idx_cond= idx_cond;
else
-#endif
idx_remainder_cond= tab->table->file->idx_cond_push(keyno, idx_cond);
/*
=== modified file 'sql/field.cc'
--- a/sql/field.cc 2009-11-10 02:32:39 +0000
+++ b/sql/field.cc 2009-12-21 02:26:15 +0000
@@ -1705,13 +1705,12 @@ my_decimal *Field_str::val_decimal(my_de
uint Field::fill_cache_field(CACHE_FIELD *copy)
{
uint store_length;
- copy->str=ptr;
- copy->length=pack_length();
- copy->blob_field=0;
+ copy->str= ptr;
+ copy->length= pack_length();
+ copy->field= this;
if (flags & BLOB_FLAG)
{
- copy->blob_field=(Field_blob*) this;
- copy->strip=0;
+ copy->type= CACHE_BLOB;
copy->length-= table->s->blob_ptr_size;
return copy->length;
}
@@ -1719,15 +1718,21 @@ uint Field::fill_cache_field(CACHE_FIELD
(type() == MYSQL_TYPE_STRING && copy->length >= 4 &&
copy->length < 256))
{
- copy->strip=1; /* Remove end space */
+ copy->type= CACHE_STRIPPED; /* Remove end space */
store_length= 2;
}
+ else if (type() == MYSQL_TYPE_VARCHAR)
+ {
+ copy->type= pack_length()-row_pack_length() == 1 ? CACHE_VARSTR1:
+ CACHE_VARSTR2;
+ store_length= 0;
+ }
else
{
- copy->strip=0;
+ copy->type= 0;
store_length= 0;
}
- return copy->length+ store_length;
+ return copy->length+store_length;
}
=== modified file 'sql/item.cc'
--- a/sql/item.cc 2009-11-10 02:32:39 +0000
+++ b/sql/item.cc 2009-12-21 02:26:15 +0000
@@ -646,6 +646,16 @@ bool Item_field::collect_item_field_proc
}
+bool Item_field::add_field_to_set_processor(uchar *arg)
+{
+ DBUG_ENTER("Item_field::add_field_to_set_processor");
+ DBUG_PRINT("info", ("%s", field->field_name ? field->field_name : "noname"));
+ TABLE *table= (TABLE *) arg;
+ if (field->table == table)
+ bitmap_set_bit(&table->tmp_set, field->field_index);
+ DBUG_RETURN(FALSE);
+}
+
/**
Check if an Item_field references some field from a list of fields.
=== modified file 'sql/item.h'
--- a/sql/item.h 2009-12-15 07:16:46 +0000
+++ b/sql/item.h 2009-12-21 02:26:15 +0000
@@ -913,6 +913,7 @@ public:
virtual bool remove_fixed(uchar * arg) { fixed= 0; return 0; }
virtual bool cleanup_processor(uchar *arg);
virtual bool collect_item_field_processor(uchar * arg) { return 0; }
+ virtual bool add_field_to_set_processor(uchar * arg) { return 0; }
virtual bool find_item_in_field_list_processor(uchar *arg) { return 0; }
virtual bool change_context_processor(uchar *context) { return 0; }
virtual bool reset_query_id_processor(uchar *query_id_arg) { return 0; }
@@ -1571,6 +1572,7 @@ public:
void update_null_value();
Item *get_tmp_table_item(THD *thd);
bool collect_item_field_processor(uchar * arg);
+ bool add_field_to_set_processor(uchar * arg);
bool find_item_in_field_list_processor(uchar *arg);
bool register_field_in_read_map(uchar *arg);
bool register_field_in_bitmap(uchar *arg);
=== modified file 'sql/mysqld.cc'
--- a/sql/mysqld.cc 2009-12-15 07:16:46 +0000
+++ b/sql/mysqld.cc 2009-12-21 02:26:15 +0000
@@ -5754,7 +5754,7 @@ enum options_mysqld
OPT_DELAYED_INSERT_LIMIT, OPT_DELAYED_QUEUE_SIZE,
OPT_FLUSH_TIME, OPT_FT_MIN_WORD_LEN, OPT_FT_BOOLEAN_SYNTAX,
OPT_FT_MAX_WORD_LEN, OPT_FT_QUERY_EXPANSION_LIMIT, OPT_FT_STOPWORD_FILE,
- OPT_INTERACTIVE_TIMEOUT, OPT_JOIN_BUFF_SIZE,
+ OPT_INTERACTIVE_TIMEOUT, OPT_JOIN_BUFF_SIZE, OPT_JOIN_CACHE_LEVEL,
OPT_KEY_BUFFER_SIZE, OPT_KEY_CACHE_BLOCK_SIZE,
OPT_KEY_CACHE_DIVISION_LIMIT, OPT_KEY_CACHE_AGE_THRESHOLD,
OPT_LONG_QUERY_TIME,
@@ -6804,8 +6804,13 @@ log and this option does nothing anymore
"The size of the buffer that is used for full joins.",
(uchar**) &global_system_variables.join_buff_size,
(uchar**) &max_system_variables.join_buff_size, 0, GET_ULONG,
- REQUIRED_ARG, 128*1024L, IO_SIZE*2+MALLOC_OVERHEAD, (longlong) ULONG_MAX,
- MALLOC_OVERHEAD, IO_SIZE, 0},
+ REQUIRED_ARG, 128*1024L, 128+MALLOC_OVERHEAD, (longlong) ULONG_MAX,
+ MALLOC_OVERHEAD, 128, 0},
+ {"join_cache_level", OPT_JOIN_CACHE_LEVEL,
+ "Controls what join operations can be executed with join buffers. Odd numbers are used for plain join buffers while even numbers are used for linked buffers",
+ (uchar**) &global_system_variables.join_cache_level,
+ (uchar**) &max_system_variables.join_cache_level,
+ 0, GET_ULONG, REQUIRED_ARG, 1, 0, 8, 0, 1, 0},
{"keep_files_on_create", OPT_KEEP_FILES_ON_CREATE,
"Don't overwrite stale .MYD and .MYI even if no directory is specified.",
(uchar**) &global_system_variables.keep_files_on_create,
=== modified file 'sql/set_var.cc'
--- a/sql/set_var.cc 2009-12-15 07:16:46 +0000
+++ b/sql/set_var.cc 2009-12-21 02:26:15 +0000
@@ -313,6 +313,8 @@ static sys_var_thd_ulong sys_interactive
&SV::net_interactive_timeout);
static sys_var_thd_ulong sys_join_buffer_size(&vars, "join_buffer_size",
&SV::join_buff_size);
+static sys_var_thd_ulong sys_join_cache_level(&vars, "join_cache_level",
+ &SV::join_cache_level);
static sys_var_key_buffer_size sys_key_buffer_size(&vars, "key_buffer_size");
static sys_var_key_cache_long sys_key_cache_block_size(&vars, "key_cache_block_size",
offsetof(KEY_CACHE,
=== modified file 'sql/sql_class.h'
--- a/sql/sql_class.h 2009-12-15 07:16:46 +0000
+++ b/sql/sql_class.h 2009-12-21 02:26:15 +0000
@@ -306,6 +306,7 @@ struct system_variables
ulong auto_increment_increment, auto_increment_offset;
ulong bulk_insert_buff_size;
ulong join_buff_size;
+ ulong join_cache_level;
ulong max_allowed_packet;
ulong max_error_count;
ulong max_length_for_sort_data;
=== added file 'sql/sql_join_cache.cc'
--- a/sql/sql_join_cache.cc 1970-01-01 00:00:00 +0000
+++ b/sql/sql_join_cache.cc 2009-12-21 02:26:15 +0000
@@ -0,0 +1,3279 @@
+/* Copyright (C) 2000-2006 MySQL AB
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; version 2 of the License.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program; if not, write to the Free Software
+ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
+
+/**
+ @file
+
+ @brief
+ join cache optimizations
+
+ @defgroup Query_Optimizer Query Optimizer
+ @{
+*/
+
+#ifdef USE_PRAGMA_IMPLEMENTATION
+#pragma implementation // gcc: Class implementation
+#endif
+
+#include "mysql_priv.h"
+#include "sql_select.h"
+
+
+/*****************************************************************************
+ * Join cache module
+******************************************************************************/
+
+/*
+ Fill in the descriptor of a flag field associated with a join cache
+
+ SYNOPSIS
+ add_field_flag_to_join_cache()
+ str position in a record buffer to copy the field from/to
+ length length of the field
+ field IN/OUT pointer to the field descriptor to fill in
+
+ DESCRIPTION
+ The function fill in the descriptor of a cache flag field to which
+ the parameter 'field' points to. The function uses the first two
+ parameters to set the position in the record buffer from/to which
+ the field value is to be copied and the length of the copied fragment.
+ Before returning the result the function increments the value of
+ *field by 1.
+ The function ignores the fields 'blob_length' and 'ofset' of the
+ descriptor.
+
+ RETURN
+ the length of the field
+*/
+
+static
+uint add_flag_field_to_join_cache(uchar *str, uint length, CACHE_FIELD **field)
+{
+ CACHE_FIELD *copy= *field;
+ copy->str= str;
+ copy->length= length;
+ copy->type= 0;
+ copy->field= 0;
+ copy->referenced_field_no= 0;
+ (*field)++;
+ return length;
+}
+
+
+/*
+ Fill in the descriptors of table data fields associated with a join cache
+
+ SYNOPSIS
+ add_table_data_fields_to_join_cache()
+ tab descriptors of fields from this table are to be filled
+ field_set descriptors for only these fields are to be created
+ field_cnt IN/OUT counter of data fields
+ descr IN/OUT pointer to the first descriptor to be filled
+ field_ptr_cnt IN/OUT counter of pointers to the data fields
+ descr_ptr IN/OUT pointer to the first pointer to blob descriptors
+
+ DESCRIPTION
+ The function fills in the descriptors of cache data fields from the table
+ 'tab'. The descriptors are filled only for the fields marked in the
+ bitmap 'field_set'.
+ The function fills the descriptors starting from the position pointed
+ by 'descr'. If an added field is of a BLOB type then a pointer to the
+ its descriptor is added to the array descr_ptr.
+ At the return 'descr' points to the position after the last added
+ descriptor while 'descr_ptr' points to the position right after the
+ last added pointer.
+
+ RETURN
+ the total length of the added fields
+*/
+
+static
+uint add_table_data_fields_to_join_cache(JOIN_TAB *tab,
+ MY_BITMAP *field_set,
+ uint *field_cnt,
+ CACHE_FIELD **descr,
+ uint *field_ptr_cnt,
+ CACHE_FIELD ***descr_ptr)
+{
+ Field **fld_ptr;
+ uint len= 0;
+ CACHE_FIELD *copy= *descr;
+ CACHE_FIELD **copy_ptr= *descr_ptr;
+ uint used_fields= bitmap_bits_set(field_set);
+ for (fld_ptr= tab->table->field; used_fields; fld_ptr++)
+ {
+ if (bitmap_is_set(field_set, (*fld_ptr)->field_index))
+ {
+ len+= (*fld_ptr)->fill_cache_field(copy);
+ if (copy->type == CACHE_BLOB)
+ {
+ *copy_ptr= copy;
+ copy_ptr++;
+ (*field_ptr_cnt)++;
+ }
+ copy->field= *fld_ptr;
+ copy->referenced_field_no= 0;
+ copy++;
+ (*field_cnt)++;
+ used_fields--;
+ }
+ }
+ *descr= copy;
+ *descr_ptr= copy_ptr;
+ return len;
+}
+
+
+/*
+ Determine different counters of fields associated with a record in the cache
+
+ SYNOPSIS
+ calc_record_fields()
+
+ DESCRIPTION
+ The function counts the number of total fields stored in a record
+ of the cache and saves this number in the 'fields' member. It also
+ determines the number of flag fields and the number of blobs.
+ The function sets 'with_match_flag' on if 'join_tab' needs a match flag
+ i.e. if it is the first inner table of an outer join or a semi-join.
+
+ RETURN
+ none
+*/
+
+void JOIN_CACHE::calc_record_fields()
+{
+ JOIN_TAB *tab = prev_cache ? prev_cache->join_tab :
+ join->join_tab+join->const_tables;
+ tables= join_tab-tab;
+
+ fields= 0;
+ blobs= 0;
+ flag_fields= 0;
+ data_field_count= 0;
+ data_field_ptr_count= 0;
+ referenced_fields= 0;
+
+ for ( ; tab < join_tab ; tab++)
+ {
+ calc_used_field_length(join->thd, tab);
+ flag_fields+= test(tab->used_null_fields || tab->used_uneven_bit_fields);
+ flag_fields+= test(tab->table->maybe_null);
+ fields+= tab->used_fields;
+ blobs+= tab->used_blobs;
+
+ fields+= tab->check_rowid_field();
+ }
+ if ((with_match_flag= join_tab->use_match_flag()))
+ flag_fields++;
+ fields+= flag_fields;
+}
+
+/*
+ Allocate memory for descriptors and pointers to them associated with the cache
+
+ SYNOPSIS
+ alloc_fields()
+
+ DESCRIPTION
+ The function allocates memory for the array of fields descriptors
+ and the array of pointers to the field descriptors used to copy
+ join record data from record buffers into the join buffer and
+ backward. Some pointers refer to the field descriptor associated
+ with previous caches. They are placed at the beginning of the
+ array of pointers and its total number is specified by the parameter
+ 'external fields'.
+ The pointer of the first array is assigned to field_descr and the
+ number of elements is precalculated by the function calc_record_fields.
+ The allocated arrays are adjacent.
+
+ NOTES
+ The memory is allocated in join->thd->memroot
+
+ RETURN
+ pointer to the first array
+*/
+
+int JOIN_CACHE::alloc_fields(uint external_fields)
+{
+ uint ptr_cnt= external_fields+blobs+1;
+ uint fields_size= sizeof(CACHE_FIELD)*fields;
+ field_descr= (CACHE_FIELD*) sql_alloc(fields_size +
+ sizeof(CACHE_FIELD*)*ptr_cnt);
+ blob_ptr= (CACHE_FIELD **) ((uchar *) field_descr + fields_size);
+ return (field_descr == NULL);
+}
+
+/*
+ Create descriptors of the record flag fields stored in the join buffer
+
+ SYNOPSIS
+ create_flag_fields()
+
+ DESCRIPTION
+ The function creates descriptors of the record flag fields stored
+ in the join buffer. These are descriptors for:
+ - an optional match flag field,
+ - table null bitmap fields,
+ - table null row fields.
+ The match flag field is created when 'join_tab' is the first inner
+ table of an outer join our a semi-join. A null bitmap field is
+ created for any table whose fields are to be stored in the join
+ buffer if at least one of these fields is nullable or is a BIT field
+ whose bits are partially stored with null bits. A null row flag
+ is created for any table assigned to the cache if it is an inner
+ table of an outer join.
+ The descriptor for flag fields are placed one after another at the
+ beginning of the array of field descriptors 'field_descr' that
+ contains 'fields' elements. If there is a match flag field the
+ descriptor for it is always first in the sequence of flag fields.
+ The descriptors for other flag fields can follow in an arbitrary
+ order.
+ The flag field values follow in a record stored in the join buffer
+ in the same order as field descriptors, with the match flag always
+ following first.
+ The function sets the value of 'flag_fields' to the total number
+ of the descriptors created for the flag fields.
+ The function sets the value of 'length' to the total length of the
+ flag fields.
+
+ RETURN
+ none
+*/
+
+void JOIN_CACHE::create_flag_fields()
+{
+ CACHE_FIELD *copy;
+ JOIN_TAB *tab;
+
+ copy= field_descr;
+
+ length=0;
+
+ /* If there is a match flag the first field is always used for this flag */
+ if (with_match_flag)
+ length+= add_flag_field_to_join_cache((uchar*) &join_tab->found,
+ sizeof(join_tab->found),
+ ©);
+
+ /* Create fields for all null bitmaps and null row flags that are needed */
+ for (tab= join_tab-tables; tab < join_tab; tab++)
+ {
+ TABLE *table= tab->table;
+
+ /* Create a field for the null bitmap from table if needed */
+ if (tab->used_null_fields || tab->used_uneven_bit_fields)
+ length+= add_flag_field_to_join_cache(table->null_flags,
+ table->s->null_bytes,
+ ©);
+
+ /* Create table for the null row flag if needed */
+ if (table->maybe_null)
+ length+= add_flag_field_to_join_cache((uchar*) &table->null_row,
+ sizeof(table->null_row),
+ ©);
+ }
+
+ /* Theoretically the new value of flag_fields can be less than the old one */
+ flag_fields= copy-field_descr;
+}
+
+
+/*
+ Create descriptors of all remaining data fields stored in the join buffer
+
+ SYNOPSIS
+ create_remaining_fields()
+ all_read_fields indicates that descriptors for all read data fields
+ are to be created
+
+ DESCRIPTION
+ The function creates descriptors for all remaining data fields of a
+ record from the join buffer. If the parameter 'all_read_fields' is
+ true the function creates fields for all read record fields that
+ comprise the partial join record joined with join_tab. Otherwise,
+ for each table tab, the set of the read fields for which the descriptors
+ have to be added is determined as the difference between all read fields
+ and and those for which the descriptors have been already created.
+ The latter are supposed to be marked in the bitmap tab->table->tmp_set.
+ The function increases the value of 'length' to the the total length of
+ the added fields.
+
+ NOTES
+ If 'all_read_fields' is false the function modifies the value of
+ tab->table->tmp_set for a each table whose fields are stored in the cache.
+ The function calls the method Field::fill_cache_field to figure out
+ the type of the cache field and the maximal length of its representation
+ in the join buffer. If this is a blob field then additionally a pointer
+ to this field is added as an element of the array blob_ptr. For a blob
+ field only the size of the length of the blob data is taken into account.
+ It is assumed that 'data_field_count' contains the number of descriptors
+ for data fields that have been already created and 'data_field_ptr_count'
+ contains the number of the pointers to such descriptors having been
+ stored up to the moment.
+
+ RETURN
+ none
+*/
+
+void JOIN_CACHE:: create_remaining_fields(bool all_read_fields)
+{
+ JOIN_TAB *tab;
+ CACHE_FIELD *copy= field_descr+flag_fields+data_field_count;
+ CACHE_FIELD **copy_ptr= blob_ptr+data_field_ptr_count;
+
+ for (tab= join_tab-tables; tab < join_tab; tab++)
+ {
+ MY_BITMAP *rem_field_set;
+ TABLE *table= tab->table;
+
+ if (all_read_fields)
+ rem_field_set= table->read_set;
+ else
+ {
+ bitmap_invert(&table->tmp_set);
+ bitmap_intersect(&table->tmp_set, table->read_set);
+ rem_field_set= &table->tmp_set;
+ }
+
+ length+= add_table_data_fields_to_join_cache(tab, rem_field_set,
+ &data_field_count, ©,
+ &data_field_ptr_count,
+ ©_ptr);
+
+ /* SemiJoinDuplicateElimination: allocate space for rowid if needed */
+/* !!! NB igor: Enable the code in the comment after backporting the SJ code
+ if (tab->keep_current_rowid)
+ {
+ copy->str= table->file->ref;
+ copy->length= table->file->ref_length;
+ copy->type= 0;
+ copy->field= 0;
+ copy->referenced_field_no= 0;
+ length+= copy->length;
+ data_field_count++;
+ copy++;
+ }
+*/
+ }
+}
+
+
+/*
+ Calculate and set all cache constants
+
+ SYNOPSIS
+ set_constants()
+
+ DESCRIPTION
+ The function calculates and set all precomputed constants that are used
+ when writing records into the join buffer and reading them from it.
+ It calculates the size of offsets of a record within the join buffer
+ and of a field within a record. It also calculates the number of bytes
+ used to store record lengths.
+ The function also calculates the maximal length of the representation
+ of record in the cache excluding blob_data. This value is used when
+ making a dicision whether more records should be added into the join
+ buffer or not.
+
+ RETURN
+ none
+*/
+
+void JOIN_CACHE::set_constants()
+{
+ /*
+ Any record from a BKA cache is prepended with the record length.
+ We use the record length when reading the buffer and building key values
+ for each record. The length allows us not to read the fields that are
+ not needed for keys.
+ If a record has match flag it also may be skipped when the match flag
+ is on. It happens if the cache is used for a semi-join operation or
+ for outer join when the 'not exist' optimization can be applied.
+ If some of the fields are referenced from other caches then
+ the record length allows us to easily reach the saved offsets for
+ these fields since the offsets are stored at the very end of the record.
+ However at this moment we don't know whether we have referenced fields for
+ the cache or not. Later when a referenced field is registered for the cache
+ we adjust the value of the flag 'with_length'.
+ */
+ with_length= is_key_access() || with_match_flag;
+ /*
+ At this moment we don't know yet the value of 'referenced_fields',
+ but in any case it can't be greater than the value of 'fields'.
+ */
+ uint len= length + fields*sizeof(uint)+blobs*sizeof(uchar *) +
+ (prev_cache ? prev_cache->get_size_of_rec_offset() : 0) +
+ sizeof(ulong);
+ buff_size= max(join->thd->variables.join_buff_size, 2*len);
+ size_of_rec_ofs= offset_size(buff_size);
+ size_of_rec_len= blobs ? size_of_rec_ofs : offset_size(len);
+ size_of_fld_ofs= size_of_rec_len;
+ /*
+ The size of the offsets for referenced fields will be added later.
+ The values of 'pack_length' and 'pack_length_with_blob_ptrs' are adjusted
+ every time when the first reference to the referenced field is registered.
+ */
+ pack_length= (with_length ? size_of_rec_len : 0) +
+ (prev_cache ? prev_cache->get_size_of_rec_offset() : 0) +
+ length;
+ pack_length_with_blob_ptrs= pack_length + blobs*sizeof(uchar *);
+}
+
+
+/*
+ Allocate memory for a join buffer
+
+ SYNOPSIS
+ alloc_buffer()
+
+ DESCRIPTION
+ The function allocates a lump of memory for the cache join buffer. The
+ size of the allocated memory is 'buff_size' bytes.
+
+ RETURN
+ 0 - if the memory has been successfully allocated
+ 1 - otherwise
+*/
+
+int JOIN_CACHE::alloc_buffer()
+{
+ buff= (uchar*) my_malloc(buff_size, MYF(0));
+ return buff == NULL;
+}
+
+
+/*
+ Initialize a BNL cache
+
+ SYNOPSIS
+ init()
+
+ DESCRIPTION
+ The function initializes the cache structure. It supposed to be called
+ right after a constructor for the JOIN_CACHE_BNL.
+ The function allocates memory for the join buffer and for descriptors of
+ the record fields stored in the buffer.
+
+ NOTES
+ The code of this function should have been included into the constructor
+ code itself. However the new operator for the class JOIN_CACHE_BNL would
+ never fail while memory allocation for the join buffer is not absolutely
+ unlikely to fail. That's why this memory allocation has to be placed in a
+ separate function that is called in a couple with a cache constructor.
+ It is quite natural to put almost all other constructor actions into
+ this function.
+
+ RETURN
+ 0 initialization with buffer allocations has been succeeded
+ 1 otherwise
+*/
+
+int JOIN_CACHE_BNL::init()
+{
+ DBUG_ENTER("JOIN_CACHE::init");
+
+ calc_record_fields();
+
+ if (alloc_fields(0))
+ DBUG_RETURN(1);
+
+ create_flag_fields();
+
+ create_remaining_fields(TRUE);
+
+ set_constants();
+
+ if (alloc_buffer())
+ DBUG_RETURN(1);
+
+ reset(TRUE);
+
+ DBUG_RETURN(0);
+}
+
+
+/*
+ Initialize a BKA cache
+
+ SYNOPSIS
+ init()
+
+ DESCRIPTION
+ The function initializes the cache structure. It supposed to be called
+ right after a constructor for the JOIN_CACHE_BKA.
+ The function allocates memory for the join buffer and for descriptors of
+ the record fields stored in the buffer.
+
+ NOTES
+ The code of this function should have been included into the constructor
+ code itself. However the new operator for the class JOIN_CACHE_BKA would
+ never fail while memory allocation for the join buffer is not absolutely
+ unlikely to fail. That's why this memory allocation has to be placed in a
+ separate function that is called in a couple with a cache constructor.
+ It is quite natural to put almost all other constructor actions into
+ this function.
+
+ RETURN
+ 0 initialization with buffer allocations has been succeeded
+ 1 otherwise
+*/
+
+int JOIN_CACHE_BKA::init()
+{
+ JOIN_TAB *tab;
+ JOIN_CACHE *cache;
+ local_key_arg_fields= 0;
+ external_key_arg_fields= 0;
+ DBUG_ENTER("JOIN_CACHE_BKA::init");
+
+ calc_record_fields();
+
+ /* Mark all fields that can be used as arguments for this key access */
+ TABLE_REF *ref= &join_tab->ref;
+ cache= this;
+ do
+ {
+ /*
+ Traverse the ref expressions and find the occurrences of fields in them for
+ each table 'tab' whose fields are to be stored in the 'cache' join buffer.
+ Mark these fields in the bitmap tab->table->tmp_set.
+ For these fields count the number of them stored in this cache and the
+ total number of them stored in the previous caches. Save the result
+ of the counting 'in local_key_arg_fields' and 'external_key_arg_fields'
+ respectively.
+ */
+ for (tab= cache->join_tab-cache->tables; tab < cache->join_tab ; tab++)
+ {
+ uint key_args;
+ bitmap_clear_all(&tab->table->tmp_set);
+ for (uint i= 0; i < ref->key_parts; i++)
+ {
+ Item *ref_item= ref->items[i];
+ if (!(tab->table->map & ref_item->used_tables()))
+ continue;
+ ref_item->walk(&Item::add_field_to_set_processor, 1,
+ (uchar *) tab->table);
+ }
+ if ((key_args= bitmap_bits_set(&tab->table->tmp_set)))
+ {
+ if (cache == this)
+ local_key_arg_fields+= key_args;
+ else
+ external_key_arg_fields+= key_args;
+ }
+ }
+ cache= cache->prev_cache;
+ }
+ while (cache);
+
+ if (alloc_fields(external_key_arg_fields))
+ DBUG_RETURN(1);
+
+ create_flag_fields();
+
+ /*
+ Save pointers to the cache fields in previous caches
+ that are used to build keys for this key access.
+ */
+ cache= this;
+ uint ext_key_arg_cnt= external_key_arg_fields;
+ CACHE_FIELD *copy;
+ CACHE_FIELD **copy_ptr= blob_ptr;
+ while (ext_key_arg_cnt)
+ {
+ cache= cache->prev_cache;
+ for (tab= cache->join_tab-cache->tables; tab < cache->join_tab ; tab++)
+ {
+ CACHE_FIELD *copy_end;
+ MY_BITMAP *key_read_set= &tab->table->tmp_set;
+ /* key_read_set contains the bitmap of tab's fields referenced by ref */
+ if (bitmap_is_clear_all(key_read_set))
+ continue;
+ copy_end= cache->field_descr+cache->fields;
+ for (copy= cache->field_descr+cache->flag_fields; copy < copy_end; copy++)
+ {
+ if (copy->field->table == tab->table &&
+ bitmap_is_set(key_read_set, copy->field->field_index))
+ {
+ *copy_ptr++= copy;
+ ext_key_arg_cnt--;
+ if (!copy->referenced_field_no)
+ {
+ /*
+ Register the referenced field 'copy':
+ - set the offset number in copy->referenced_field_no,
+ - adjust the value of the flag 'with_length',
+ - adjust the values of 'pack_length' and
+ of 'pack_length_with_blob_ptrs'.
+ */
+ copy->referenced_field_no= ++cache->referenced_fields;
+ cache->with_length= TRUE;
+ cache->pack_length+= cache->get_size_of_fld_offset();
+ cache->pack_length_with_blob_ptrs+= cache->get_size_of_fld_offset();
+ }
+ }
+ }
+ }
+ }
+ /* After this 'blob_ptr' shall not be be changed */
+ blob_ptr= copy_ptr;
+
+ /* Now create local fields that are used to build ref for this key access */
+ copy= field_descr+flag_fields;
+ for (tab= join_tab-tables; tab < join_tab ; tab++)
+ {
+ length+= add_table_data_fields_to_join_cache(tab, &tab->table->tmp_set,
+ &data_field_count, ©,
+ &data_field_ptr_count,
+ ©_ptr);
+ }
+
+ use_emb_key= check_emb_key_usage();
+
+ create_remaining_fields(FALSE);
+
+ set_constants();
+
+ if (alloc_buffer())
+ DBUG_RETURN(1);
+
+ reset(TRUE);
+
+ DBUG_RETURN(0);
+}
+
+
+/*
+ Check the possibility to read the access keys directly from the join buffer
+
+ SYNOPSIS
+ check_emb_key_usage()
+
+ DESCRIPTION
+ The function checks some conditions at which the key values can be read
+ directly from the join buffer. This is possible when the key values can be
+ composed by concatenation of the record fields stored in the join buffer.
+ Sometimes when the access key is multi-component the function has to re-order
+ the fields written into the join buffer to make keys embedded. If key
+ values for the key access are detected as embedded then 'use_emb_key'
+ is set to TRUE.
+
+ EXAMPLE
+ Let table t2 has an index defined on the columns a,b . Let's assume also
+ that the columns t2.a, t2.b as well as the columns t1.a, t1.b are all
+ of the integer type. Then if the query
+ SELECT COUNT(*) FROM t1, t2 WHERE t1.a=t2.a and t1.b=t2.b
+ is executed with a join cache in such a way that t1 is the driving
+ table then the key values to access table t2 can be read directly
+ from the join buffer.
+
+ NOTES
+ In some cases key values could be read directly from the join buffer but
+ we still do not consider them embedded. In the future we'll expand the
+ the class of keys which we identify as embedded.
+
+ RETURN
+ TRUE - key values will be considered as embedded,
+ FALSE - otherwise.
+*/
+
+bool JOIN_CACHE_BKA::check_emb_key_usage()
+{
+ uint i;
+ Item *item;
+ KEY_PART_INFO *key_part;
+ CACHE_FIELD *copy;
+ CACHE_FIELD *copy_end;
+ uint len= 0;
+ TABLE *table= join_tab->table;
+ TABLE_REF *ref= &join_tab->ref;
+ KEY *keyinfo= table->key_info+ref->key;
+
+ /*
+ If some of the key arguments are not from the local cache the key
+ is not considered as embedded.
+ TODO:
+ Expand it to the case when ref->key_parts=1 and local_key_arg_fields=0.
+ */
+ if (external_key_arg_fields != 0)
+ return FALSE;
+ /*
+ If the number of the local key arguments is not equal to the number
+ of key parts the key value cannot be read directly from the join buffer.
+ */
+ if (local_key_arg_fields != ref->key_parts)
+ return FALSE;
+
+ /*
+ A key is not considered embedded if one of the following is true:
+ - one of its key parts is not equal to a field
+ - it is a partial key
+ - definition of the argument field does not coincide with the
+ definition of the corresponding key component
+ - some of the key components are nullable
+ */
+ for (i=0; i < ref->key_parts; i++)
+ {
+ item= ref->items[i]->real_item();
+ if (item->type() != Item::FIELD_ITEM)
+ return FALSE;
+ key_part= keyinfo->key_part+i;
+ if (key_part->key_part_flag & HA_PART_KEY_SEG)
+ return FALSE;
+ if (!key_part->field->eq_def(((Item_field *) item)->field))
+ return FALSE;
+ if (key_part->field->maybe_null())
+ return FALSE;
+ }
+
+ copy= field_descr+flag_fields;
+ copy_end= copy+local_key_arg_fields;
+ for ( ; copy < copy_end; copy++)
+ {
+ /*
+ If some of the key arguments are of variable length the key
+ is not considered as embedded.
+ */
+ if (copy->type != 0)
+ return FALSE;
+ /*
+ If some of the key arguments are bit fields whose bits are partially
+ stored with null bits the key is not considered as embedded.
+ */
+ if (copy->field->type() == MYSQL_TYPE_BIT &&
+ ((Field_bit*) (copy->field))->bit_len)
+ return FALSE;
+ len+= copy->length;
+ }
+
+ emb_key_length= len;
+
+ /*
+ Make sure that key fields follow the order of the corresponding
+ key components these fields are equal to. For this the descriptors
+ of the fields that comprise the key might be re-ordered.
+ */
+ for (i= 0; i < ref->key_parts; i++)
+ {
+ uint j;
+ Item *item= ref->items[i]->real_item();
+ Field *fld= ((Item_field *) item)->field;
+ CACHE_FIELD *init_copy= field_descr+flag_fields+i;
+ for (j= i, copy= init_copy; i < local_key_arg_fields; i++, copy++)
+ {
+ if (fld->eq(copy->field))
+ {
+ if (j != i)
+ {
+ CACHE_FIELD key_part_copy= *copy;
+ *copy= *init_copy;
+ *init_copy= key_part_copy;
+ }
+ break;
+ }
+ }
+ }
+
+ return TRUE;
+}
+
+
+/*
+ Calculate the increment of the MRR buffer for a record write
+
+ SYNOPSIS
+ aux_buffer_incr()
+
+ DESCRIPTION
+ This implementation of the virtual function aux_buffer_incr determines
+ for how much the size of the MRR buffer should be increased when another
+ record is added to the cache.
+
+ RETURN
+ the increment of the size of the MRR buffer for the next record
+*/
+
+uint JOIN_CACHE_BKA::aux_buffer_incr()
+{
+ uint incr= 0;
+ TABLE_REF *ref= &join_tab->ref;
+ TABLE *tab= join_tab->table;
+ uint rec_per_key= tab->key_info[ref->key].rec_per_key[ref->key_parts-1];
+ set_if_bigger(rec_per_key, 1);
+ if (records == 1)
+ incr= ref->key_length + tab->file->ref_length;
+ incr+= tab->file->stats.mrr_length_per_rec * rec_per_key;
+ return incr;
+}
+
+
+/*
+ Check if the record combination matches the index condition
+
+ SYNOPSIS
+ JOIN_CACHE_BKA::skip_index_tuple()
+ rseq Value returned by bka_range_seq_init()
+ range_info MRR range association data
+
+ DESCRIPTION
+ This function is invoked from MRR implementation to check if an index
+ tuple matches the index condition. It is used in the case where the index
+ condition actually depends on both columns of the used index and columns
+ from previous tables.
+
+ Accessing columns of the previous tables requires special handling with
+ BKA. The idea of BKA is to collect record combinations in a buffer and
+ then do a batch of ref access lookups, i.e. by the time we're doing a
+ lookup its previous-records-combination is not in prev_table->record[0]
+ but somewhere in the join buffer.
+
+ We need to get it from there back into prev_table(s)->record[0] before we
+ can evaluate the index condition, and that's why we need this function
+ instead of regular IndexConditionPushdown.
+
+ NOTE
+ Possible optimization:
+ Before we unpack the record from a previous table
+ check if this table is used in the condition.
+ If so then unpack the record otherwise skip the unpacking.
+ This should be done by a special virtual method
+ get_partial_record_by_pos().
+
+ RETURN
+ 0 The record combination satisfies the index condition
+ 1 Otherwise
+*/
+
+bool JOIN_CACHE_BKA::skip_index_tuple(range_seq_t rseq, char *range_info)
+{
+ DBUG_ENTER("JOIN_CACHE_BKA::skip_index_tuple");
+ JOIN_CACHE_BKA *cache= (JOIN_CACHE_BKA *) rseq;
+ cache->get_record_by_pos((uchar*)range_info);
+ DBUG_RETURN(!join_tab->cache_idx_cond->val_int());
+}
+
+
+/*
+ Check if the record combination matches the index condition
+
+ SYNOPSIS
+ bka_skip_index_tuple()
+ rseq Value returned by bka_range_seq_init()
+ range_info MRR range association data
+
+ DESCRIPTION
+ This is wrapper for JOIN_CACHE_BKA::skip_index_tuple method,
+ see comments there.
+
+ NOTE
+ This function is used as a RANGE_SEQ_IF::skip_index_tuple callback.
+
+ RETURN
+ 0 The record combination satisfies the index condition
+ 1 Otherwise
+*/
+
+static
+bool bka_skip_index_tuple(range_seq_t rseq, char *range_info)
+{
+ DBUG_ENTER("bka_skip_index_tuple");
+ JOIN_CACHE_BKA *cache= (JOIN_CACHE_BKA *) rseq;
+ DBUG_RETURN(cache->skip_index_tuple(rseq, range_info));
+}
+
+
+/*
+ Write record fields and their required offsets into the join cache buffer
+
+ SYNOPSIS
+ write_record_data()
+ link a reference to the associated info in the previous cache
+ is_full OUT true if it has been decided that no more records will be
+ added to the join buffer
+
+ DESCRIPTION
+ This function put into the cache buffer the following info that it reads
+ from the join record buffers or computes somehow:
+ (1) the length of all fields written for the record (optional)
+ (2) an offset to the associated info in the previous cache (if there is any)
+ determined by the link parameter
+ (3) all flag fields of the tables whose data field are put into the cache:
+ - match flag (optional),
+ - null bitmaps for all tables,
+ - null row flags for all tables
+ (4) values of all data fields including
+ - full images of those fixed legth data fields that cannot have
+ trailing spaces
+ - significant part of fixed length fields that can have trailing spaces
+ with the prepanded length
+ - data of non-blob variable length fields with the prepanded data length
+ - blob data from blob fields with the prepanded data length
+ (5) record offset values for the data fields that are referred to from
+ other caches
+
+ The record is written at the current position stored in the field 'pos'.
+ At the end of the function 'pos' points at the position right after the
+ written record data.
+ The function increments the number of records in the cache that is stored
+ in the 'records' field by 1. The function also modifies the values of
+ 'curr_rec_pos' and 'last_rec_pos' to point to the written record.
+ The 'end_pos' cursor is modified accordingly.
+ The 'last_rec_blob_data_is_in_rec_buff' is set on if the blob data
+ remains in the record buffers and not copied to the join buffer. It may
+ happen only to the blob data from the last record added into the cache.
+
+
+ RETURN
+ length of the written record data
+*/
+
+uint JOIN_CACHE::write_record_data(uchar * link, bool *is_full)
+{
+ uint len;
+ bool last_record;
+ CACHE_FIELD *copy;
+ CACHE_FIELD *copy_end;
+ uchar *cp= pos;
+ uchar *init_pos= cp;
+ uchar *rec_len_ptr= 0;
+
+ records++; /* Increment the counter of records in the cache */
+
+ len= pack_length;
+
+ /* Make an adjustment for the size of the auxiliary buffer if there is any */
+ uint incr= aux_buffer_incr();
+ ulong rem= rem_space();
+ aux_buff_size+= len+incr < rem ? incr : rem;
+
+ /*
+ For each blob to be put into cache save its length and a pointer
+ to the value in the corresponding element of the blob_ptr array.
+ Blobs with null values are skipped.
+ Increment 'len' by the total length of all these blobs.
+ */
+ if (blobs)
+ {
+ CACHE_FIELD **copy_ptr= blob_ptr;
+ CACHE_FIELD **copy_ptr_end= copy_ptr+blobs;
+ for ( ; copy_ptr < copy_ptr_end; copy_ptr++)
+ {
+ Field_blob *blob_field= (Field_blob *) (*copy_ptr)->field;
+ if (!blob_field->is_null())
+ {
+ uint blob_len= blob_field->get_length();
+ (*copy_ptr)->blob_length= blob_len;
+ len+= blob_len;
+ blob_field->get_ptr(&(*copy_ptr)->str);
+ }
+ }
+ }
+
+ /*
+ Check whether we won't be able to add any new record into the cache after
+ this one because the cache will be full. Set last_record to TRUE if it's so.
+ The assume that the cache will be full after the record has been written
+ into it if either the remaining space of the cache is not big enough for the
+ record's blob values or if there is a chance that not all non-blob fields
+ of the next record can be placed there.
+ This function is called only in the case when there is enough space left in
+ the cache to store at least non-blob parts of the current record.
+ */
+ last_record= (len+pack_length_with_blob_ptrs) > rem_space();
+
+ /*
+ Save the position for the length of the record in the cache if it's needed.
+ The length of the record will be inserted here when all fields of the record
+ are put into the cache.
+ */
+ if (with_length)
+ {
+ rec_len_ptr= cp;
+ cp+= size_of_rec_len;
+ }
+
+ /*
+ Put a reference to the fields of the record that are stored in the previous
+ cache if there is any. This reference is passed by the 'link' parameter.
+ */
+ if (prev_cache)
+ {
+ cp+= prev_cache->get_size_of_rec_offset();
+ prev_cache->store_rec_ref(cp, link);
+ }
+
+ curr_rec_pos= cp;
+
+ /* If the there is a match flag set its value to 0 */
+ copy= field_descr;
+ if (with_match_flag)
+ *copy[0].str= 0;
+
+ /* First put into the cache the values of all flag fields */
+ copy_end= field_descr+flag_fields;
+ for ( ; copy < copy_end; copy++)
+ {
+ memcpy(cp, copy->str, copy->length);
+ cp+= copy->length;
+ }
+
+ /* Now put the values of the remaining fields as soon as they are not nulls */
+ copy_end= field_descr+fields;
+ for ( ; copy < copy_end; copy++)
+ {
+ Field *field= copy->field;
+ if (field && field->maybe_null() && field->is_null())
+ {
+ /* Do not copy a field if its value is null */
+ if (copy->referenced_field_no)
+ copy->offset= 0;
+ continue;
+ }
+ /* Save the offset of the field to put it later at the end of the record */
+ if (copy->referenced_field_no)
+ copy->offset= cp-curr_rec_pos;
+
+ if (copy->type == CACHE_BLOB)
+ {
+ Field_blob *blob_field= (Field_blob *) copy->field;
+ if (last_record)
+ {
+ last_rec_blob_data_is_in_rec_buff= 1;
+ /* Put down the length of the blob and the pointer to the data */
+ blob_field->get_image(cp, copy->length+sizeof(char*),
+ blob_field->charset());
+ cp+= copy->length+sizeof(char*);
+ }
+ else
+ {
+ /* First put down the length of the blob and then copy the data */
+ blob_field->get_image(cp, copy->length,
+ blob_field->charset());
+ memcpy(cp+copy->length, copy->str, copy->blob_length);
+ cp+= copy->length+copy->blob_length;
+ }
+ }
+ else
+ {
+ switch (copy->type) {
+ case CACHE_VARSTR1:
+ /* Copy the significant part of the short varstring field */
+ len= (uint) copy->str[0] + 1;
+ memcpy(cp, copy->str, len);
+ cp+= len;
+ break;
+ case CACHE_VARSTR2:
+ /* Copy the significant part of the long varstring field */
+ len= uint2korr(copy->str) + 2;
+ memcpy(cp, copy->str, len);
+ cp+= len;
+ break;
+ case CACHE_STRIPPED:
+ {
+ /*
+ Put down the field value stripping all trailing spaces off.
+ After this insert the length of the written sequence of bytes.
+ */
+ uchar *str, *end;
+ for (str= copy->str, end= str+copy->length;
+ end > str && end[-1] == ' ';
+ end--) ;
+ len=(uint) (end-str);
+ int2store(cp, len);
+ memcpy(cp+2, str, len);
+ cp+= len+2;
+ break;
+ }
+ default:
+ /* Copy the entire image of the field from the record buffer */
+ memcpy(cp, copy->str, copy->length);
+ cp+= copy->length;
+ }
+ }
+ }
+
+ /* Add the offsets of the fields that are referenced from other caches */
+ if (referenced_fields)
+ {
+ uint cnt= 0;
+ for (copy= field_descr+flag_fields; copy < copy_end ; copy++)
+ {
+ if (copy->referenced_field_no)
+ {
+ store_fld_offset(cp+size_of_fld_ofs*(copy->referenced_field_no-1),
+ copy->offset);
+ cnt++;
+ }
+ }
+ cp+= size_of_fld_ofs*cnt;
+ }
+
+ if (rec_len_ptr)
+ store_rec_length(rec_len_ptr, (ulong) (cp-rec_len_ptr-size_of_rec_len));
+ last_rec_pos= curr_rec_pos;
+ end_pos= pos= cp;
+ *is_full= last_record;
+ return (uint) (cp-init_pos);
+}
+
+
+/*
+ Reset the join buffer for reading/writing: default implementation
+
+ SYNOPSIS
+ reset()
+ for_writing if it's TRUE the function reset the buffer for writing
+
+ DESCRIPTION
+ This default implementation of the virtual function reset() resets
+ the join buffer for reading or writing.
+ If the buffer is reset for reading only the 'pos' value is reset
+ to point to the very beginning of the join buffer. If the buffer is
+ reset for writing additionally:
+ - the counter of the records in the buffer is set to 0,
+ - the the value of 'last_rec_pos' gets pointing at the position just
+ before the buffer,
+ - 'end_pos' is set to point to the beginning of the join buffer,
+ - the size of the auxiliary buffer is reset to 0,
+ - the flag 'last_rec_blob_data_is_in_rec_buff' is set to 0.
+
+ RETURN
+ none
+*/
+
+void JOIN_CACHE::reset(bool for_writing)
+{
+ pos= buff;
+ curr_rec_link= 0;
+ if (for_writing)
+ {
+ records= 0;
+ last_rec_pos= buff;
+ aux_buff_size= 0;
+ end_pos= pos;
+ last_rec_blob_data_is_in_rec_buff= 0;
+ }
+}
+
+/*
+ Add a record into the join buffer: the default implementation
+
+ SYNOPSIS
+ put_record()
+
+ DESCRIPTION
+ This default implementation of the virtual function put_record writes
+ the next matching record into the join buffer.
+ It also links the record having been written into the join buffer with
+ the matched record in the previous cache if there is any.
+ The implementation assumes that the function get_curr_link()
+ will return exactly the pointer to this matched record.
+
+ RETURN
+ TRUE if it has been decided that it should be the last record
+ in the join buffer,
+ FALSE otherwise
+*/
+
+bool JOIN_CACHE::put_record()
+{
+ bool is_full;
+ uchar *link= 0;
+ if (prev_cache)
+ link= prev_cache->get_curr_rec_link();
+ write_record_data(link, &is_full);
+ return is_full;
+}
+
+
+/*
+ Read the next record from the join buffer: the default implementation
+
+ SYNOPSIS
+ get_record()
+
+ DESCRIPTION
+ This default implementation of the virtual function get_record
+ reads fields of the next record from the join buffer of this cache.
+ The function also reads all other fields associated with this record
+ from the the join buffers of the previous caches. The fields are read
+ into the corresponding record buffers.
+ It is supposed that 'pos' points to the position in the buffer
+ right after the previous record when the function is called.
+ When the function returns the 'pos' values is updated to point
+ to the position after the read record.
+ The value of 'curr_rec_pos' is also updated by the function to
+ point to the beginning of the first field of the record in the
+ join buffer.
+
+ RETURN
+ TRUE - there are no more records to read from the join buffer
+ FALSE - otherwise
+*/
+
+bool JOIN_CACHE::get_record()
+{
+ bool res;
+ uchar *prev_rec_ptr= 0;
+ if (with_length)
+ pos+= size_of_rec_len;
+ if (prev_cache)
+ {
+ pos+= prev_cache->get_size_of_rec_offset();
+ prev_rec_ptr= prev_cache->get_rec_ref(pos);
+ }
+ curr_rec_pos= pos;
+ if (!(res= read_all_record_fields() == 0))
+ {
+ pos+= referenced_fields*size_of_fld_ofs;
+ if (prev_cache)
+ prev_cache->get_record_by_pos(prev_rec_ptr);
+ }
+ return res;
+}
+
+
+/*
+ Read a positioned record from the join buffer: the default implementation
+
+ SYNOPSIS
+ get_record_by_pos()
+ rec_ptr position of the first field of the record in the join buffer
+
+ DESCRIPTION
+ This default implementation of the virtual function get_record_pos
+ reads the fields of the record positioned at 'rec_ptr' from the join buffer.
+ The function also reads all other fields associated with this record
+ from the the join buffers of the previous caches. The fields are read
+ into the corresponding record buffers.
+
+ RETURN
+ none
+*/
+
+void JOIN_CACHE::get_record_by_pos(uchar *rec_ptr)
+{
+ uchar *save_pos= pos;
+ pos= rec_ptr;
+ read_all_record_fields();
+ pos= save_pos;
+ if (prev_cache)
+ {
+ uchar *prev_rec_ptr= prev_cache->get_rec_ref(rec_ptr);
+ prev_cache->get_record_by_pos(prev_rec_ptr);
+ }
+}
+
+
+/*
+ Test the match flag from the referenced record: the default implementation
+
+ SYNOPSIS
+ get_match_flag_by_pos()
+ rec_ptr position of the first field of the record in the join buffer
+
+ DESCRIPTION
+ This default implementation of the virtual function get_match_flag_by_pos
+ test the match flag for the record pointed by the reference at the position
+ rec_ptr. If the match flag in placed one of the previous buffers the function
+ first reaches the linked record fields in this buffer.
+
+ RETURN
+ TRUE if the match flag is set on
+ FALSE otherwise
+*/
+
+bool JOIN_CACHE::get_match_flag_by_pos(uchar *rec_ptr)
+{
+ if (with_match_flag)
+ return test(*rec_ptr);
+ if (prev_cache)
+ {
+ uchar *prev_rec_ptr= prev_cache->get_rec_ref(rec_ptr);
+ return prev_cache->get_match_flag_by_pos(prev_rec_ptr);
+ }
+ DBUG_ASSERT(1);
+ return FALSE;
+}
+
+
+/*
+ Read all flag and data fields of a record from the join buffer
+
+ SYNOPSIS
+ read_all_record_fields()
+
+ DESCRIPTION
+ The function reads all flag and data fields of a record from the join
+ buffer into the corresponding record buffers.
+ The fields are read starting from the position 'pos' which is
+ supposed to point to the beginning og the first record field.
+ The function increments the value of 'pos' by the length of the
+ read data.
+
+ RETURN
+ length of the data read from the join buffer
+*/
+
+uint JOIN_CACHE::read_all_record_fields()
+{
+ uchar *init_pos= pos;
+
+ if (pos > last_rec_pos || !records)
+ return 0;
+
+ /* First match flag, read null bitmaps and null_row flag for each table */
+ read_flag_fields();
+
+ /* Now read the remaining table fields if needed */
+ CACHE_FIELD *copy= field_descr+flag_fields;
+ CACHE_FIELD *copy_end= field_descr+fields;
+ bool blob_in_rec_buff= blob_data_is_in_rec_buff(init_pos);
+ for ( ; copy < copy_end; copy++)
+ read_record_field(copy, blob_in_rec_buff);
+
+ return (uint) (pos-init_pos);
+}
+
+
+/*
+ Read all flag fields of a record from the join buffer
+
+ SYNOPSIS
+ read_flag_fields()
+
+ DESCRIPTION
+ The function reads all flag fields of a record from the join
+ buffer into the corresponding record buffers.
+ The fields are read starting from the position 'pos'.
+ The function increments the value of 'pos' by the length of the
+ read data.
+
+ RETURN
+ length of the data read from the join buffer
+*/
+
+uint JOIN_CACHE::read_flag_fields()
+{
+ uchar *init_pos= pos;
+ CACHE_FIELD *copy= field_descr;
+ CACHE_FIELD *copy_end= copy+flag_fields;
+ for ( ; copy < copy_end; copy++)
+ {
+ memcpy(copy->str, pos, copy->length);
+ pos+= copy->length;
+ }
+ return (pos-init_pos);
+}
+
+
+/*
+ Read a data record field from the join buffer
+
+ SYNOPSIS
+ read_record_field()
+ copy the descriptor of the data field to be read
+ blob_in_rec_buff indicates whether this is the field from the record
+ whose blob data are in record buffers
+
+ DESCRIPTION
+ The function reads the data field specified by the parameter copy
+ from the join buffer into the corresponding record buffer.
+ The field is read starting from the position 'pos'.
+ The data of blob values is not copied from the join buffer.
+ The function increments the value of 'pos' by the length of the
+ read data.
+
+ RETURN
+ length of the data read from the join buffer
+*/
+
+uint JOIN_CACHE::read_record_field(CACHE_FIELD *copy, bool blob_in_rec_buff)
+{
+ uint len;
+ /* Do not copy the field if its value is null */
+ if (copy->field && copy->field->maybe_null() && copy->field->is_null())
+ return 0;
+ if (copy->type == CACHE_BLOB)
+ {
+ Field_blob *blob_field= (Field_blob *) copy->field;
+ /*
+ Copy the length and the pointer to data but not the blob data
+ itself to the record buffer
+ */
+ if (blob_in_rec_buff)
+ {
+ blob_field->set_image(pos, copy->length+sizeof(char*),
+ blob_field->charset());
+ len= copy->length+sizeof(char*);
+ }
+ else
+ {
+ blob_field->set_ptr(pos, pos+copy->length);
+ len= copy->length+blob_field->get_length();
+ }
+ }
+ else
+ {
+ switch (copy->type) {
+ case CACHE_VARSTR1:
+ /* Copy the significant part of the short varstring field */
+ len= (uint) pos[0] + 1;
+ memcpy(copy->str, pos, len);
+ break;
+ case CACHE_VARSTR2:
+ /* Copy the significant part of the long varstring field */
+ len= uint2korr(pos) + 2;
+ memcpy(copy->str, pos, len);
+ break;
+ case CACHE_STRIPPED:
+ /* Pad the value by spaces that has been stripped off */
+ len= uint2korr(pos);
+ memcpy(copy->str, pos+2, len);
+ memset(copy->str+len, ' ', copy->length-len);
+ len+= 2;
+ break;
+ default:
+ /* Copy the entire image of the field from the record buffer */
+ len= copy->length;
+ memcpy(copy->str, pos, len);
+ }
+ }
+ pos+= len;
+ return len;
+}
+
+
+/*
+ Read a referenced field from the join buffer
+
+ SYNOPSIS
+ read_referenced_field()
+ copy pointer to the descriptor of the referenced field
+ rec_ptr pointer to the record that may contain this field
+ len IN/OUT total length of the record fields
+
+ DESCRIPTION
+ The function checks whether copy points to a data field descriptor
+ for this cache object. If it does not then the function returns
+ FALSE. Otherwise the function reads the field of the record in
+ the join buffer pointed by 'rec_ptr' into the corresponding record
+ buffer and returns TRUE.
+ If the value of *len is 0 then the function sets it to the total
+ length of the record fields including possible trailing offset
+ values. Otherwise *len is supposed to provide this value that
+ has been obtained earlier.
+
+ RETURN
+ TRUE 'copy' points to a data descriptor of this join cache
+ FALSE otherwise
+*/
+
+bool JOIN_CACHE::read_referenced_field(CACHE_FIELD *copy,
+ uchar *rec_ptr,
+ uint *len)
+{
+ uchar *ptr;
+ uint offset;
+ if (copy < field_descr || copy >= field_descr+fields)
+ return FALSE;
+ if (!*len)
+ {
+ /* Get the total length of the record fields */
+ uchar *len_ptr= rec_ptr;
+ if (prev_cache)
+ len_ptr-= prev_cache->get_size_of_rec_offset();
+ *len= get_rec_length(len_ptr-size_of_rec_len);
+ }
+
+ ptr= rec_ptr-(prev_cache ? prev_cache->get_size_of_rec_offset() : 0);
+ offset= get_fld_offset(ptr+ *len -
+ size_of_fld_ofs*
+ (referenced_fields+1-copy->referenced_field_no));
+ bool is_null= FALSE;
+ if (offset == 0 && flag_fields)
+ is_null= TRUE;
+ if (is_null)
+ copy->field->set_null();
+ else
+ {
+ uchar *save_pos= pos;
+ copy->field->set_notnull();
+ pos= rec_ptr+offset;
+ read_record_field(copy, blob_data_is_in_rec_buff(rec_ptr));
+ pos= save_pos;
+ }
+ return TRUE;
+}
+
+
+/*
+ Skip record from join buffer if its match flag is on: default implementation
+
+ SYNOPSIS
+ skip_record_if_match()
+
+ DESCRIPTION
+ This default implementation of the virtual function skip_record_if_match
+ skips the next record from the join buffer if its match flag is set on.
+ If the record is skipped the value of 'pos' is set to points to the position
+ right after the record.
+
+ RETURN
+ TRUE - the match flag is on and the record has been skipped
+ FALSE - the match flag is off
+*/
+
+bool JOIN_CACHE::skip_record_if_match()
+{
+ DBUG_ASSERT(with_match_flag && with_length);
+ uint offset= size_of_rec_len;
+ if (prev_cache)
+ offset+= prev_cache->get_size_of_rec_offset();
+ /* Check whether the match flag is on */
+ if (test(*(pos+offset)))
+ {
+ pos+= size_of_rec_len + get_rec_length(pos);
+ return TRUE;
+ }
+ return FALSE;
+}
+
+
+/*
+ Restore the fields of the last record from the join buffer
+
+ SYNOPSIS
+ restore_last_record()
+
+ DESCRIPTION
+ This function restore the values of the fields of the last record put
+ into join buffer in record buffers. The values most probably have been
+ overwritten by the field values from other records when they were read
+ from the join buffer into the record buffer in order to check pushdown
+ predicates.
+
+ RETURN
+ none
+*/
+
+void JOIN_CACHE::restore_last_record()
+{
+ if (records)
+ get_record_by_pos(last_rec_pos);
+}
+
+
+/*
+ Join records from the join buffer with records from the next join table
+
+ SYNOPSIS
+ join_records()
+ skip_last do not find matches for the last record from the buffer
+
+ DESCRIPTION
+ The functions extends all records from the join buffer by the matched
+ records from join_tab. In the case of outer join operation it also
+ adds null complementing extensions for the records from the join buffer
+ that have no match.
+ No extensions are generated for the last record from the buffer if
+ skip_last is true.
+
+ NOTES
+ The function must make sure that if linked join buffers are used then
+ a join buffer cannot be refilled again until all extensions in the
+ buffers chained to this one are generated.
+ Currently an outer join operation with several inner tables always uses
+ at least two linked buffers with the match join flags placed in the
+ first buffer. Any record composed of rows of the inner tables that
+ matches a record in this buffer must refer to the position of the
+ corresponding match flag.
+
+ IMPLEMENTATION
+ When generating extensions for outer tables of an outer join operation
+ first we generate all extensions for those records from the join buffer
+ that have matches, after which null complementing extension for all
+ unmatched records from the join buffer are generated.
+
+ RETURN
+ return one of enum_nested_loop_state, except NESTED_LOOP_NO_MORE_ROWS.
+*/
+
+enum_nested_loop_state JOIN_CACHE::join_records(bool skip_last)
+{
+ JOIN_TAB *tab;
+ enum_nested_loop_state rc= NESTED_LOOP_OK;
+ bool outer_join_first_inner= join_tab->is_first_inner_for_outer_join();
+
+ if (outer_join_first_inner && !join_tab->first_unmatched)
+ join_tab->not_null_compl= TRUE;
+
+ if (!join_tab->first_unmatched)
+ {
+ /* Find all records from join_tab that match records from join buffer */
+ rc= join_matching_records(skip_last);
+ if (rc != NESTED_LOOP_OK && rc != NESTED_LOOP_NO_MORE_ROWS)
+ goto finish;
+ if (outer_join_first_inner)
+ {
+ if (next_cache)
+ {
+ /*
+ Ensure that all matches for outer records from join buffer are to be
+ found. Now we ensure that all full records are found for records from
+ join buffer. Generally this is an overkill.
+ TODO: Ensure that only matches of the inner table records have to be
+ found for the records from join buffer.
+ */
+ rc= next_cache->join_records(skip_last);
+ if (rc != NESTED_LOOP_OK && rc != NESTED_LOOP_NO_MORE_ROWS)
+ goto finish;
+ }
+ join_tab->not_null_compl= FALSE;
+ /* Prepare for generation of null complementing extensions */
+ for (tab= join_tab->first_inner; tab <= join_tab->last_inner; tab++)
+ tab->first_unmatched= join_tab->first_inner;
+ }
+ }
+ if (join_tab->first_unmatched)
+ {
+ if (is_key_access())
+ restore_last_record();
+
+ /*
+ Generate all null complementing extensions for the records from
+ join buffer that don't have any matching rows from the inner tables.
+ */
+ reset(FALSE);
+ rc= join_null_complements(skip_last);
+ if (rc != NESTED_LOOP_OK && rc != NESTED_LOOP_NO_MORE_ROWS)
+ goto finish;
+ }
+ if(next_cache)
+ {
+ /*
+ When using linked caches we must ensure the records in the next caches
+ that refer to the records in the join buffer are fully extended.
+ Otherwise we could have references to the records that have been
+ already erased from the join buffer and replaced for new records.
+ */
+ rc= next_cache->join_records(skip_last);
+ if (rc != NESTED_LOOP_OK && rc != NESTED_LOOP_NO_MORE_ROWS)
+ goto finish;
+ }
+ if (outer_join_first_inner)
+ {
+ /*
+ All null complemented rows have been already generated for all
+ outer records from join buffer. Restore the state of the
+ first_unmatched values to 0 to avoid another null complementing.
+ */
+ for (tab= join_tab->first_inner; tab <= join_tab->last_inner; tab++)
+ tab->first_unmatched= 0;
+ }
+
+ if (skip_last)
+ {
+ DBUG_ASSERT(!is_key_access());
+ /*
+ Restore the last record from the join buffer to generate
+ all extentions for it.
+ */
+ get_record();
+ }
+
+finish:
+ restore_last_record();
+ reset(TRUE);
+ return rc;
+}
+
+
+/*
+ Using BNL find matches from the next table for records from the join buffer
+
+ SYNOPSIS
+ join_matching_records()
+ skip_last do not look for matches for the last partial join record
+
+ DESCRIPTION
+ The function retrieves all rows of the join_tab table and check whether
+ they match partial join records from the join buffer. If a match is found
+ the function will call the sub_select function trying to look for matches
+ for the remaining join operations.
+ This function currently is called only from the function join_records.
+ If the value of skip_last is true the function writes the partial join
+ record from the record buffer into the join buffer to save its value for
+ the future processing in the caller function.
+
+ NOTES
+ The function produces all matching extensions for the records in the
+ join buffer following the path of the Blocked Nested Loops algorithm.
+ When an outer join operation is performed all unmatched records from
+ the join buffer must be extended by null values. The function
+ 'join_null_complements' serves this purpose.
+
+ RETURN
+ return one of enum_nested_loop_state.
+*/
+
+enum_nested_loop_state JOIN_CACHE_BNL::join_matching_records(bool skip_last)
+{
+ uint cnt;
+ int error;
+ JOIN_TAB *tab;
+ READ_RECORD *info;
+ enum_nested_loop_state rc= NESTED_LOOP_OK;
+ bool check_only_first_match= join_tab->check_only_first_match();
+ SQL_SELECT *select= join_tab->cache_select;
+
+ join_tab->table->null_row= 0;
+
+ /* Return at once if there are no records in the join buffer */
+ if (!records)
+ return NESTED_LOOP_OK;
+
+ /*
+ When joining we read records from the join buffer back into record buffers.
+ If matches for the last partial join record are found through a call to
+ the sub_select function then this partial join record must be saved in the
+ join buffer in order to be restored just before the sub_select call.
+ */
+ if (skip_last)
+ put_record();
+
+ if (join_tab->use_quick == 2 && join_tab->select->quick)
+ {
+ /* A dynamic range access was used last. Clean up after it */
+ delete join_tab->select->quick;
+ join_tab->select->quick= 0;
+ }
+
+ for (tab= join->join_tab; tab != join_tab ; tab++)
+ {
+ tab->status= tab->table->status;
+ tab->table->status= 0;
+ }
+
+ /* Start retrieving all records of the joined table */
+ if ((error= join_init_read_record(join_tab)))
+ {
+ rc= error < 0 ? NESTED_LOOP_NO_MORE_ROWS: NESTED_LOOP_ERROR;
+ goto finish;
+ }
+
+ info= &join_tab->read_record;
+ do
+ {
+/* !!! NB igor: Enable the code in the comment after backporting the SJ code
+ if (join_tab->keep_current_rowid)
+ join_tab->table->file->position(join_tab->table->record[0]);
+*/
+
+ if (join->thd->killed)
+ {
+ /* The user has aborted the execution of the query */
+ join->thd->send_kill_message();
+ rc= NESTED_LOOP_KILLED;
+ goto finish;
+ }
+
+ /*
+ Do not look for matches if the last read record of the joined table
+ does not meet the conditions that have been pushed to this table
+ */
+ if (rc == NESTED_LOOP_OK && (!select || !select->skip_record()))
+ {
+ /* Prepare to read records from the join buffer */
+ reset(FALSE);
+
+ /* Read each record from the join buffer and look for matches */
+ for (cnt= records - test(skip_last) ; cnt; cnt--)
+ {
+ /*
+ If only the first match is needed and it has been already found for
+ the next record read from the join buffer then the record is skipped.
+ */
+ if (!check_only_first_match || !skip_record_if_match())
+ {
+ get_record();
+ rc= generate_full_extensions(get_curr_rec());
+ if (rc != NESTED_LOOP_OK && rc != NESTED_LOOP_NO_MORE_ROWS)
+ goto finish;
+ }
+ }
+ }
+ } while (!(error= info->read_record(info)));
+
+ if (error > 0) // Fatal error
+ rc= NESTED_LOOP_ERROR;
+finish:
+ for (tab= join->join_tab; tab != join_tab ; tab++)
+ tab->table->status= tab->status;
+ return rc;
+}
+
+
+/*
+ Set match flag for a record in join buffer if it has not been set yet
+
+ SYNOPSIS
+ set_match_flag_if_none()
+ first_inner the join table to which this flag is attached to
+ rec_ptr pointer to the record in the join buffer
+
+ DESCRIPTION
+ If the records of the table are accumulated in a join buffer the function
+ sets the match flag for the record in the buffer that is referred to by
+ the record from this cache positioned at 'rec_ptr'.
+ The function also sets the match flag 'found' of the table first inner
+ if it has not been set before.
+
+ NOTES
+ The function assumes that the match flag for any record in any cache
+ is placed in the first byte occupied by the record fields.
+
+ RETURN
+ TRUE the match flag is set by this call for the first time
+ FALSE the match flag has been set before this call
+*/
+
+bool JOIN_CACHE::set_match_flag_if_none(JOIN_TAB *first_inner,
+ uchar *rec_ptr)
+{
+ if (!first_inner->cache)
+ {
+ /*
+ Records of the first inner table to which the flag is attached to
+ are not accumulated in a join buffer.
+ */
+ if (first_inner->found)
+ return FALSE;
+ else
+ {
+ first_inner->found= 1;
+ return TRUE;
+ }
+ }
+ JOIN_CACHE *cache= this;
+ while (cache->join_tab != first_inner)
+ {
+ cache= cache->prev_cache;
+ DBUG_ASSERT(cache);
+ rec_ptr= cache->get_rec_ref(rec_ptr);
+ }
+ if (rec_ptr[0] == 0)
+ {
+ rec_ptr[0]= 1;
+ first_inner->found= 1;
+ return TRUE;
+ }
+ return FALSE;
+}
+
+
+/*
+ Generate all full extensions for a partial join record in the buffer
+
+ SYNOPSIS
+ generate_full_extensions()
+ rec_ptr pointer to the record from join buffer to generate extensions
+
+ DESCRIPTION
+ The function first checks whether the current record of 'join_tab' matches
+ the partial join record from join buffer located at 'rec_ptr'. If it is the
+ case the function calls the join_tab->next_select method to generate
+ all full extension for this partial join match.
+
+ RETURN
+ return one of enum_nested_loop_state.
+*/
+
+enum_nested_loop_state JOIN_CACHE::generate_full_extensions(uchar *rec_ptr)
+{
+ enum_nested_loop_state rc= NESTED_LOOP_OK;
+
+ /*
+ Check whether the extended partial join record meets
+ the pushdown conditions.
+ */
+ if (check_match(rec_ptr))
+ {
+ int res= 0;
+/* !!! NB igor: Enable the code in the comment after backporting the SJ code
+ if (!join_tab->check_weed_out_table ||
+ !(res= do_sj_dups_weedout(join->thd, join_tab->check_weed_out_table)))
+*/
+ {
+ set_curr_rec_link(rec_ptr);
+ rc= (join_tab->next_select)(join, join_tab+1, 0);
+ if (rc != NESTED_LOOP_OK && rc != NESTED_LOOP_NO_MORE_ROWS)
+ {
+ reset(TRUE);
+ return rc;
+ }
+ }
+ if (res == -1)
+ {
+ rc= NESTED_LOOP_ERROR;
+ return rc;
+ }
+ }
+ return rc;
+}
+
+
+/*
+ Check matching to a partial join record from the join buffer
+
+ SYNOPSIS
+ check_match()
+ rec_ptr pointer to the record from join buffer to check matching to
+
+ DESCRIPTION
+ The function checks whether the current record of 'join_tab' matches
+ the partial join record from join buffer located at 'rec_ptr'. If this is
+ the case and 'join_tab' is the last inner table of a semi-join or an outer
+ join the function turns on the match flag for the 'rec_ptr' record unless
+ it has been already set.
+
+ NOTES
+ Setting the match flag on can trigger re-evaluation of pushdown conditions
+ for the record when join_tab is the last inner table of an outer join.
+
+ RETURN
+ TRUE there is a match
+ FALSE there is no match
+*/
+
+inline bool JOIN_CACHE::check_match(uchar *rec_ptr)
+{
+ /* Check whether pushdown conditions are satisfied */
+ if (join_tab->select && join_tab->select->skip_record())
+ return FALSE;
+
+ if (!join_tab->is_last_inner_table())
+ return TRUE;
+
+ /*
+ This is the last inner table of an outer join,
+ and maybe of other embedding outer joins, or
+ this is the last inner table of a semi-join.
+ */
+ JOIN_TAB *first_inner= join_tab->get_first_inner_table();
+ do
+ {
+ set_match_flag_if_none(first_inner, rec_ptr);
+ if (first_inner->check_only_first_match() &&
+ !join_tab->first_inner)
+ return TRUE;
+ /*
+ This is the first match for the outer table row.
+ The function set_match_flag_if_none has turned the flag
+ first_inner->found on. The pushdown predicates for
+ inner tables must be re-evaluated with this flag on.
+ Note that, if first_inner is the first inner table
+ of a semi-join, but is not an inner table of an outer join
+ such that 'not exists' optimization can be applied to it,
+ the re-evaluation of the pushdown predicates is not needed.
+ */
+ for (JOIN_TAB *tab= first_inner; tab <= join_tab; tab++)
+ {
+ if (tab->select && tab->select->skip_record())
+ return FALSE;
+ }
+ }
+ while ((first_inner= first_inner->first_upper) &&
+ first_inner->last_inner == join_tab);
+
+ return TRUE;
+}
+
+
+/*
+ Add null complements for unmatched outer records from join buffer
+
+ SYNOPSIS
+ join_null_complements()
+ skip_last do not add null complements for the last record
+
+ DESCRIPTION
+ This function is called only for inner tables of outer joins.
+ The function retrieves all rows from the join buffer and adds null
+ complements for those of them that do not have matches for outer
+ table records.
+ If the 'join_tab' is the last inner table of the embedding outer
+ join and the null complemented record satisfies the outer join
+ condition then the the corresponding match flag is turned on
+ unless it has been set earlier. This setting may trigger
+ re-evaluation of pushdown conditions for the record.
+
+ NOTES
+ The same implementation of the virtual method join_null_complements
+ is used for JOIN_CACHE_BNL and JOIN_CACHE_BKA.
+
+ RETURN
+ return one of enum_nested_loop_state.
+*/
+
+enum_nested_loop_state JOIN_CACHE::join_null_complements(bool skip_last)
+{
+ uint cnt;
+ enum_nested_loop_state rc= NESTED_LOOP_OK;
+ bool is_first_inner= join_tab == join_tab->first_unmatched;
+ bool is_last_inner= join_tab == join_tab->first_unmatched->last_inner;
+
+ /* Return at once if there are no records in the join buffer */
+ if (!records)
+ return NESTED_LOOP_OK;
+
+ cnt= records - (is_key_access() ? 0 : test(skip_last));
+
+ /* This function may be called only for inner tables of outer joins */
+ DBUG_ASSERT(join_tab->first_inner);
+
+ for ( ; cnt; cnt--)
+ {
+ if (join->thd->killed)
+ {
+ /* The user has aborted the execution of the query */
+ join->thd->send_kill_message();
+ rc= NESTED_LOOP_KILLED;
+ goto finish;
+ }
+ /* Just skip the whole record if a match for it has been already found */
+ if (!is_first_inner || !skip_record_if_match())
+ {
+ get_record();
+ /* The outer row is complemented by nulls for each inner table */
+ restore_record(join_tab->table, s->default_values);
+ mark_as_null_row(join_tab->table);
+ /* Check all pushdown conditions attached to the inner table */
+ join_tab->first_unmatched->found= 1;
+ if (join_tab->select && join_tab->select->skip_record())
+ continue;
+ if (is_last_inner)
+ {
+ JOIN_TAB *first_upper= join_tab->first_unmatched->first_upper;
+ while (first_upper && first_upper->last_inner == join_tab)
+ {
+ set_match_flag_if_none(first_upper, get_curr_rec());
+ for (JOIN_TAB* tab= first_upper; tab <= join_tab; tab++)
+ {
+ if (tab->select && tab->select->skip_record())
+ goto next;
+ }
+ first_upper= first_upper->first_upper;
+ }
+ }
+ /* Find all matches for the remaining join tables */
+ rc= (*join_tab->next_select)(join, join_tab+1, 0);
+ if (rc != NESTED_LOOP_OK && rc != NESTED_LOOP_NO_MORE_ROWS)
+ {
+ reset(TRUE);
+ goto finish;
+ }
+ }
+ next:
+ ;
+ }
+
+finish:
+ return rc;
+}
+
+
+/*
+ Initialize retrieval of range sequence for BKA algorithm
+
+ SYNOPSIS
+ bka_range_seq_init()
+ init_params pointer to the BKA join cache object
+ n_ranges the number of ranges obtained
+ flags combination of HA_MRR_SINGLE_POINT, HA_MRR_FIXED_KEY
+
+ DESCRIPTION
+ The function interprets init_param as a pointer to a JOIN_CACHE_BKA
+ object. The function prepares for an iteration over the join keys
+ built for all records from the cache join buffer.
+
+ NOTE
+ This function are used only as a callback function.
+
+ RETURN
+ init_param value that is to be used as a parameter of bka_range_seq_next()
+*/
+
+static
+range_seq_t bka_range_seq_init(void *init_param, uint n_ranges, uint flags)
+{
+ DBUG_ENTER("bka_range_seq_init");
+ JOIN_CACHE_BKA *cache= (JOIN_CACHE_BKA *) init_param;
+ cache->reset(0);
+ DBUG_RETURN((range_seq_t) init_param);
+}
+
+
+/*
+ Get the key over the next record from the join buffer used by BKA
+
+ SYNOPSIS
+ bka_range_seq_next()
+ seq the value returned by bka_range_seq_init
+ range OUT reference to the next range
+
+ DESCRIPTION
+ The function interprets seq as a pointer to a JOIN_CACHE_BKA
+ object. The function returns a pointer to the range descriptor
+ for the key built over the next record from the join buffer.
+
+ NOTE
+ This function are used only as a callback function.
+
+ RETURN
+ 0 ok, the range structure filled with info about the next key
+ 1 no more ranges
+*/
+
+static
+uint bka_range_seq_next(range_seq_t rseq, KEY_MULTI_RANGE *range)
+{
+ DBUG_ENTER("bka_range_seq_next");
+ JOIN_CACHE_BKA *cache= (JOIN_CACHE_BKA *) rseq;
+ TABLE_REF *ref= &cache->join_tab->ref;
+ key_range *start_key= &range->start_key;
+ if ((start_key->length= cache->get_next_key((uchar **) &start_key->key)))
+ {
+ start_key->keypart_map= (1 << ref->key_parts) - 1;
+ start_key->flag= HA_READ_KEY_EXACT;
+ range->end_key= *start_key;
+ range->end_key.flag= HA_READ_AFTER_KEY;
+ range->ptr= (char *) cache->get_curr_rec();
+ range->range_flag= EQ_RANGE;
+ DBUG_RETURN(0);
+ }
+ DBUG_RETURN(1);
+}
+
+
+/*
+ Check whether range_info orders to skip the next record from BKA buffer
+
+ SYNOPSIS
+ bka_range_seq_skip_record()
+ seq value returned by bka_range_seq_init()
+ range_info information about the next range
+ rowid [NOT USED] rowid of the record to be checked
+
+
+ DESCRIPTION
+ The function interprets seq as a pointer to a JOIN_CACHE_BKA object.
+ The function interprets seq as a pointer to the JOIN_CACHE_BKA_UNIQUE
+ object. The function returns TRUE if the record with this range_info
+ is to be filtered out from the stream of records returned by
+ multi_range_read_next().
+
+ NOTE
+ This function are used only as a callback function.
+
+ RETURN
+ 1 record with this range_info is to be filtered out from the stream
+ of records returned by multi_range_read_next()
+ 0 the record is to be left in the stream
+*/
+
+static
+bool bka_range_seq_skip_record(range_seq_t rseq, char *range_info, uchar *rowid)
+{
+ DBUG_ENTER("bka_range_seq_skip_record");
+ JOIN_CACHE_BKA *cache= (JOIN_CACHE_BKA *) rseq;
+ bool res= cache->get_match_flag_by_pos((uchar *) range_info);
+ DBUG_RETURN(res);
+}
+
+/*
+ Using BKA find matches from the next table for records from the join buffer
+
+ SYNOPSIS
+ join_matching_records()
+ skip_last do not look for matches for the last partial join record
+
+ DESCRIPTION
+ This function can be used only when the table join_tab can be accessed
+ by keys built over the fields of previous join tables.
+ The function retrieves all partial join records from the join buffer and
+ for each of them builds the key value to access join_tab, performs index
+ look-up with this key and selects matching records yielded by this look-up
+ If a match is found the function will call the sub_select function trying
+ to look for matches for the remaining join operations.
+ This function currently is called only from the function join_records.
+ It's assumed that this function is always called with the skip_last
+ parameter equal to false.
+
+ NOTES
+ The function produces all matching extensions for the records in the
+ join buffer following the path of the Batched Key Access algorithm.
+ When an outer join operation is performed all unmatched records from
+ the join buffer must be extended by null values. The function
+ join_null_complements serves this purpose.
+ The Batched Key Access algorithm assumes that key accesses are batched.
+ In other words it assumes that, first, either keys themselves or the
+ corresponding rowids (primary keys) are accumulated in a buffer, then
+ data rows from join_tab are fetched for all of them. When a row is
+ fetched it is always returned with a reference to the key by which it
+ has been accessed.
+ When key values are batched we can save on the number of the server
+ requests for index lookups. For the remote engines, like NDB cluster, it
+ essentially reduces the number of round trips between the server and
+ the engine when performing a join operation.
+ When the rowids for the keys are batched we can optimize the order
+ in what we fetch the data for this rowids. The performance benefits of
+ this optimization can be significant for such engines as MyISAM, InnoDB.
+ What is exactly batched are hidden behind implementations of
+ MRR handler interface that is supposed to be appropriately chosen
+ for each engine. If for a engine no specific implementation of the MRR
+ interface is supllied then the default implementation is used. This
+ implementation actually follows the path of Nested Loops Join algorithm.
+ In this case BKA join surely will demonstrate a worse performance than
+ NL join.
+
+ RETURN
+ return one of enum_nested_loop_state
+*/
+
+enum_nested_loop_state JOIN_CACHE_BKA::join_matching_records(bool skip_last)
+{
+ int error;
+ handler *file= join_tab->table->file;
+ enum_nested_loop_state rc= NESTED_LOOP_OK;
+ uchar *rec_ptr= 0;
+ bool check_only_first_match= join_tab->check_only_first_match();
+
+ /* Set functions to iterate over keys in the join buffer */
+
+ RANGE_SEQ_IF seq_funcs= { bka_range_seq_init,
+ bka_range_seq_next,
+ check_only_first_match ?
+ bka_range_seq_skip_record : 0,
+ join_tab->cache_idx_cond ?
+ bka_skip_index_tuple : 0 };
+
+ /* The value of skip_last must be always FALSE when this function is called */
+ DBUG_ASSERT(!skip_last);
+
+ /* Return at once if there are no records in the join buffer */
+ if (!records)
+ return NESTED_LOOP_OK;
+
+ rc= init_join_matching_records(&seq_funcs, records);
+ if (rc != NESTED_LOOP_OK)
+ goto finish;
+
+ while (!(error= file->multi_range_read_next((char **) &rec_ptr)))
+ {
+ if (join->thd->killed)
+ {
+ /* The user has aborted the execution of the query */
+ join->thd->send_kill_message();
+ rc= NESTED_LOOP_KILLED;
+ goto finish;
+ }
+/* !!!NB igor: Enable the statement in the comment after backporting the SJ code
+ if (join_tab->keep_current_rowid)
+ join_tab->table->file->position(join_tab->table->record[0]);
+*/
+ /*
+ If only the first match is needed and it has been already found
+ for the associated partial join record then the returned candidate
+ is discarded.
+ */
+ if (rc == NESTED_LOOP_OK &&
+ (!check_only_first_match || !get_match_flag_by_pos(rec_ptr)))
+ {
+ get_record_by_pos(rec_ptr);
+ rc= generate_full_extensions(rec_ptr);
+ if (rc != NESTED_LOOP_OK && rc != NESTED_LOOP_NO_MORE_ROWS)
+ goto finish;
+ }
+ }
+
+ if (error > 0 && error != HA_ERR_END_OF_FILE)
+ return NESTED_LOOP_ERROR;
+finish:
+ return end_join_matching_records(rc);
+}
+
+
+
+/*
+ Prepare to search for records that match records from the join buffer
+
+ SYNOPSIS
+ init_join_matching_records()
+ seq_funcs structure of range sequence interface
+ ranges number of keys/ranges in the sequence
+
+ DESCRIPTION
+ This function calls the multi_range_read_init function to set up
+ the BKA process of generating the keys from the records in the join
+ buffer and looking for matching records from the table to be joined.
+ The function passes as a parameter a structure of functions that
+ implement the range sequence interface. This interface is used to
+ enumerate all generated keys and optionally to filter the matching
+ records returned by the multi_range_read_next calls from the
+ intended invocation of the join_matching_records method. The
+ multi_range_read_init function also receives the parameters for
+ MRR buffer to be used and flags specifying the mode in which
+ this buffer will be functioning.
+ The number of keys in the sequence expected by multi_range_read_init
+ is passed through the parameter ranges.
+
+ RETURN
+ return one of enum_nested_loop_state
+*/
+
+enum_nested_loop_state
+JOIN_CACHE_BKA::init_join_matching_records(RANGE_SEQ_IF *seq_funcs, uint ranges)
+{
+ int error;
+ handler *file= join_tab->table->file;
+ enum_nested_loop_state rc= NESTED_LOOP_OK;
+
+ join_tab->table->null_row= 0;
+
+
+ /* Dynamic range access is never used with BKA */
+ DBUG_ASSERT(join_tab->use_quick != 2);
+
+ for (JOIN_TAB *tab =join->join_tab; tab != join_tab ; tab++)
+ {
+ tab->status= tab->table->status;
+ tab->table->status= 0;
+ }
+
+ init_mrr_buff();
+
+ /*
+ Prepare to iterate over keys from the join buffer and to get
+ matching candidates obtained with MMR handler functions.
+ */
+ if (!file->inited)
+ file->ha_index_init(join_tab->ref.key, 1);
+ if ((error= file->multi_range_read_init(seq_funcs, (void*) this, ranges,
+ mrr_mode, &mrr_buff)))
+ rc= error < 0 ? NESTED_LOOP_NO_MORE_ROWS: NESTED_LOOP_ERROR;
+
+ return rc;
+}
+
+
+/*
+ Finish searching for records that match records from the join buffer
+
+ SYNOPSIS
+ end_join_matching_records()
+ rc return code passed by the join_matching_records function
+
+ DESCRIPTION
+ This function perform final actions on searching for all matches for
+ the records from the join buffer and building all full join extensions
+ of the records with these matches.
+
+ RETURN
+ return code rc passed to the function as a parameter
+*/
+
+enum_nested_loop_state
+JOIN_CACHE_BKA::end_join_matching_records(enum_nested_loop_state rc)
+{
+ for (JOIN_TAB *tab=join->join_tab; tab != join_tab ; tab++)
+ tab->table->status= tab->status;
+ return rc;
+}
+
+
+/*
+ Get the key built over the next record from BKA join buffer
+
+ SYNOPSIS
+ get_next_key()
+ key pointer to the buffer where the key value is to be placed
+
+ DESCRIPTION
+ The function reads key fields from the current record in the join buffer.
+ and builds the key value out of these fields that will be used to access
+ the 'join_tab' table. Some of key fields may belong to previous caches.
+ They are accessed via record references to the record parts stored in the
+ previous join buffers. The other key fields always are placed right after
+ the flag fields of the record.
+ If the key is embedded, which means that its value can be read directly
+ from the join buffer, then *key is set to the beginning of the key in
+ this buffer. Otherwise the key is built in the join_tab->ref->key_buff.
+ The function returns the length of the key if it succeeds ro read it.
+ If is assumed that the functions starts reading at the position of
+ the record length which is provided for each records in a BKA cache.
+ After the key is built the 'pos' value points to the first position after
+ the current record.
+ The function returns 0 if the initial position is after the beginning
+ of the record fields for last record from the join buffer.
+
+ RETURN
+ length of the key value - if the starting value of 'pos' points to
+ the position before the fields for the last record,
+ 0 - otherwise.
+*/
+
+uint JOIN_CACHE_BKA::get_next_key(uchar ** key)
+{
+ uint len;
+ uint32 rec_len;
+ uchar *init_pos;
+ JOIN_CACHE *cache;
+
+ if (pos > last_rec_pos || !records)
+ return 0;
+
+ /* Any record in a BKA cache is prepended with its length */
+ DBUG_ASSERT(with_length);
+
+ /* Read the length of the record */
+ rec_len= get_rec_length(pos);
+ pos+= size_of_rec_len;
+ init_pos= pos;
+
+ /* Read a reference to the previous cache if any */
+ if (prev_cache)
+ pos+= prev_cache->get_size_of_rec_offset();
+
+ curr_rec_pos= pos;
+
+ /* Read all flag fields of the record */
+ read_flag_fields();
+
+ if (use_emb_key)
+ {
+ /* An embedded key is taken directly from the join buffer */
+ *key= pos;
+ len= emb_key_length;
+ }
+ else
+ {
+ /* Read key arguments from previous caches if there are any such fields */
+ if (external_key_arg_fields)
+ {
+ uchar *rec_ptr= curr_rec_pos;
+ uint key_arg_count= external_key_arg_fields;
+ CACHE_FIELD **copy_ptr= blob_ptr-key_arg_count;
+ for (cache= prev_cache; key_arg_count; cache= cache->prev_cache)
+ {
+ uint len= 0;
+ DBUG_ASSERT(cache);
+ rec_ptr= cache->get_rec_ref(rec_ptr);
+ while (!cache->referenced_fields)
+ {
+ cache= cache->prev_cache;
+ DBUG_ASSERT(cache);
+ rec_ptr= cache->get_rec_ref(rec_ptr);
+ }
+ while (key_arg_count &&
+ cache->read_referenced_field(*copy_ptr, rec_ptr, &len))
+ {
+ copy_ptr++;
+ --key_arg_count;
+ }
+ }
+ }
+
+ /*
+ Read the other key arguments from the current record. The fields for
+ these arguments are always first in the sequence of the record's fields.
+ */
+ CACHE_FIELD *copy= field_descr+flag_fields;
+ CACHE_FIELD *copy_end= copy+local_key_arg_fields;
+ bool blob_in_rec_buff= blob_data_is_in_rec_buff(curr_rec_pos);
+ for ( ; copy < copy_end; copy++)
+ read_record_field(copy, blob_in_rec_buff);
+
+ /* Build the key over the fields read into the record buffers */
+ TABLE_REF *ref= &join_tab->ref;
+ cp_buffer_from_ref(join->thd, join_tab->table, ref);
+ *key= ref->key_buff;
+ len= ref->key_length;
+ }
+
+ pos= init_pos+rec_len;
+
+ return len;
+}
+
+
+/*
+ Initialize a BKA_UNIQUE cache
+
+ SYNOPSIS
+ init()
+
+ DESCRIPTION
+ The function initializes the cache structure. It supposed to be called
+ right after a constructor for the JOIN_CACHE_BKA_UNIQUE.
+ The function allocates memory for the join buffer and for descriptors of
+ the record fields stored in the buffer.
+ The function also estimates the number of hash table entries in the hash
+ table to be used and initializes this hash table.
+
+ NOTES
+ The code of this function should have been included into the constructor
+ code itself. However the new operator for the class JOIN_CACHE_BKA_UNIQUE
+ would never fail while memory allocation for the join buffer is not
+ absolutely unlikely to fail. That's why this memory allocation has to be
+ placed in a separate function that is called in a couple with a cache
+ constructor.
+ It is quite natural to put almost all other constructor actions into
+ this function.
+
+ RETURN
+ 0 initialization with buffer allocations has been succeeded
+ 1 otherwise
+*/
+
+int JOIN_CACHE_BKA_UNIQUE::init()
+{
+ int rc= 0;
+ TABLE_REF *ref= &join_tab->ref;
+
+ DBUG_ENTER("JOIN_CACHE_BKA_UNIQUE::init");
+
+ hash_table= 0;
+ key_entries= 0;
+
+ if ((rc= JOIN_CACHE_BKA::init()))
+ DBUG_RETURN (rc);
+
+ key_length= ref->key_length;
+
+ /* Take into account a reference to the next record in the key chain */
+ pack_length+= get_size_of_rec_offset();
+
+ /* Calculate the minimal possible value of size_of_key_ofs greater than 1 */
+ uint max_size_of_key_ofs= max(2, get_size_of_rec_offset());
+ for (size_of_key_ofs= 2;
+ size_of_key_ofs <= max_size_of_key_ofs;
+ size_of_key_ofs+= 2)
+ {
+ key_entry_length= get_size_of_rec_offset() + // key chain header
+ size_of_key_ofs + // reference to the next key
+ (use_emb_key ? get_size_of_rec_offset() : key_length);
+
+ uint n= buff_size / (pack_length+key_entry_length+size_of_key_ofs);
+
+ /*
+ TODO: Make a better estimate for this upper bound of
+ the number of records in in the join buffer.
+ */
+ uint max_n= buff_size / (pack_length-length+
+ key_entry_length+size_of_key_ofs);
+
+ hash_entries= (uint) (n / 0.7);
+
+ if (offset_size(max_n*key_entry_length) <=
+ size_of_key_ofs)
+ break;
+ }
+
+ /* Initialize the hash table */
+ hash_table= buff + (buff_size-hash_entries*size_of_key_ofs);
+ cleanup_hash_table();
+ curr_key_entry= hash_table;
+
+ pack_length+= key_entry_length;
+ pack_length_with_blob_ptrs+= get_size_of_rec_offset() + key_entry_length;
+
+ rec_fields_offset= get_size_of_rec_offset()+get_size_of_rec_length()+
+ (prev_cache ? prev_cache->get_size_of_rec_offset() : 0);
+
+ data_fields_offset= 0;
+ if (use_emb_key)
+ {
+ CACHE_FIELD *copy= field_descr;
+ CACHE_FIELD *copy_end= copy+flag_fields;
+ for ( ; copy < copy_end; copy++)
+ data_fields_offset+= copy->length;
+ }
+
+ DBUG_RETURN(rc);
+}
+
+
+/*
+ Reset the JOIN_CACHE_BKA_UNIQUE buffer for reading/writing
+
+ SYNOPSIS
+ reset()
+ for_writing if it's TRUE the function reset the buffer for writing
+
+ DESCRIPTION
+ This implementation of the virtual function reset() resets the join buffer
+ of the JOIN_CACHE_BKA_UNIQUE class for reading or writing.
+ Additionally to what the default implementation does this function
+ cleans up the hash table allocated within the buffer.
+
+ RETURN
+ none
+*/
+
+void JOIN_CACHE_BKA_UNIQUE::reset(bool for_writing)
+{
+ this->JOIN_CACHE::reset(for_writing);
+ if (for_writing && hash_table)
+ cleanup_hash_table();
+ curr_key_entry= hash_table;
+}
+
+/*
+ Add a record into the JOIN_CACHE_BKA_UNIQUE buffer
+
+ SYNOPSIS
+ put_record()
+
+ DESCRIPTION
+ This implementation of the virtual function put_record writes the next
+ matching record into the join buffer of the JOIN_CACHE_BKA_UNIQUE class.
+ Additionally to what the default implementation does this function
+ performs the following.
+ It extracts from the record the key value used in lookups for matching
+ records and searches for this key in the hash tables from the join cache.
+ If it finds the key in the hash table it joins the record to the chain
+ of records with this key. If the key is not found in the hash table the
+ key is placed into it and a chain containing only the newly added record
+ is attached to the key entry. The key value is either placed in the hash
+ element added for the key or, if the use_emb_key flag is set, remains in
+ the record from the partial join.
+
+ RETURN
+ TRUE if it has been decided that it should be the last record
+ in the join buffer,
+ FALSE otherwise
+*/
+
+bool JOIN_CACHE_BKA_UNIQUE::put_record()
+{
+ bool is_full;
+ uchar *key;
+ uint key_len= key_length;
+ uchar *key_ref_ptr;
+ uchar *link= 0;
+ TABLE_REF *ref= &join_tab->ref;
+ uchar *next_ref_ptr= pos;
+
+ pos+= get_size_of_rec_offset();
+ /* Write the record into the join buffer */
+ if (prev_cache)
+ link= prev_cache->get_curr_rec_link();
+ write_record_data(link, &is_full);
+
+ if (use_emb_key)
+ key= get_curr_emb_key();
+ else
+ {
+ /* Build the key over the fields read into the record buffers */
+ cp_buffer_from_ref(join->thd, join_tab->table, ref);
+ key= ref->key_buff;
+ }
+
+ /* Look for the key in the hash table */
+ if (key_search(key, key_len, &key_ref_ptr))
+ {
+ uchar *last_next_ref_ptr;
+ /*
+ The key is found in the hash table.
+ Add the record to the circular list of the records attached to this key.
+ Below 'rec' is the record to be added into the record chain for the found
+ key, 'key_ref' points to a flatten representation of the st_key_entry
+ structure that contains the key and the head of the record chain.
+ */
+ last_next_ref_ptr= get_next_rec_ref(key_ref_ptr+get_size_of_key_offset());
+ /* rec->next_rec= key_entry->last_rec->next_rec */
+ memcpy(next_ref_ptr, last_next_ref_ptr, get_size_of_rec_offset());
+ /* key_entry->last_rec->next_rec= rec */
+ store_next_rec_ref(last_next_ref_ptr, next_ref_ptr);
+ /* key_entry->last_rec= rec */
+ store_next_rec_ref(key_ref_ptr+get_size_of_key_offset(), next_ref_ptr);
+ }
+ else
+ {
+ /*
+ The key is not found in the hash table.
+ Put the key into the join buffer linking it with the keys for the
+ corresponding hash entry. Create a circular list with one element
+ referencing the record and attach the list to the key in the buffer.
+ */
+ uchar *cp= last_key_entry;
+ cp-= get_size_of_rec_offset()+get_size_of_key_offset();
+ store_next_key_ref(key_ref_ptr, cp);
+ store_null_key_ref(cp);
+ store_next_rec_ref(next_ref_ptr, next_ref_ptr);
+ store_next_rec_ref(cp+get_size_of_key_offset(), next_ref_ptr);
+ if (use_emb_key)
+ {
+ cp-= get_size_of_rec_offset();
+ store_emb_key_ref(cp, key);
+ }
+ else
+ {
+ cp-= key_len;
+ memcpy(cp, key, key_len);
+ }
+ last_key_entry= cp;
+ /* Increment the counter of key_entries in the hash table */
+ key_entries++;
+ }
+ return is_full;
+}
+
+
+/*
+ Read the next record from the JOIN_CACHE_BKA_UNIQUE buffer
+
+ SYNOPSIS
+ get_record()
+
+ DESCRIPTION
+ Additionally to what the default implementation of the virtual
+ function get_record does this implementation skips the link element
+ used to connect the records with the same key into a chain.
+
+ RETURN
+ TRUE - there are no more records to read from the join buffer
+ FALSE - otherwise
+*/
+
+bool JOIN_CACHE_BKA_UNIQUE::get_record()
+{
+ pos+= get_size_of_rec_offset();
+ return this->JOIN_CACHE::get_record();
+}
+
+
+/*
+ Skip record from the JOIN_CACHE_BKA_UNIQUE join buffer if its match flag is on
+
+ SYNOPSIS
+ skip_record_if_match()
+
+ DESCRIPTION
+ This implementation of the virtual function skip_record_if_match does
+ the same as the default implementation does, but it takes into account
+ the link element used to connect the records with the same key into a chain.
+
+ RETURN
+ TRUE - the match flag is on and the record has been skipped
+ FALSE - the match flag is off
+*/
+
+bool JOIN_CACHE_BKA_UNIQUE::skip_record_if_match()
+{
+ uchar *save_pos= pos;
+ pos+= get_size_of_rec_offset();
+ if (!this->JOIN_CACHE::skip_record_if_match())
+ {
+ pos= save_pos;
+ return FALSE;
+ }
+ return TRUE;
+}
+
+
+/*
+ Search for a key in the hash table of the join buffer
+
+ SYNOPSIS
+ key_search()
+ key pointer to the key value
+ key_len key value length
+ key_ref_ptr OUT position of the reference to the next key from
+ the hash element for the found key , or
+ a position where the reference to the the hash
+ element for the key is to be added in the
+ case when the key has not been found
+
+ DESCRIPTION
+ The function looks for a key in the hash table of the join buffer.
+ If the key is found the functionreturns the position of the reference
+ to the next key from to the hash element for the given key.
+ Otherwise the function returns the position where the reference to the
+ newly created hash element for the given key is to be added.
+
+ RETURN
+ TRUE - the key is found in the hash table
+ FALSE - otherwise
+*/
+
+bool JOIN_CACHE_BKA_UNIQUE::key_search(uchar *key, uint key_len,
+ uchar **key_ref_ptr)
+{
+ bool is_found= FALSE;
+ uint idx= get_hash_idx(key, key_length);
+ uchar *ref_ptr= hash_table+size_of_key_ofs*idx;
+ while (!is_null_key_ref(ref_ptr))
+ {
+ uchar *next_key;
+ ref_ptr= get_next_key_ref(ref_ptr);
+ next_key= use_emb_key ? get_emb_key(ref_ptr-get_size_of_rec_offset()) :
+ ref_ptr-key_length;
+
+ if (memcmp(next_key, key, key_len) == 0)
+ {
+ is_found= TRUE;
+ break;
+ }
+ }
+ *key_ref_ptr= ref_ptr;
+ return is_found;
+}
+
+
+/*
+ Calclulate hash value for a key in the hash table of the join buffer
+
+ SYNOPSIS
+ get_hash_idx()
+ key pointer to the key value
+ key_len key value length
+
+ DESCRIPTION
+ The function calculates an index of the hash entry in the hash table
+ of the join buffer for the given key
+
+ RETURN
+ the calculated index of the hash entry for the given key.
+*/
+
+uint JOIN_CACHE_BKA_UNIQUE::get_hash_idx(uchar* key, uint key_len)
+{
+ ulong nr= 1;
+ ulong nr2= 4;
+ uchar *pos= key;
+ uchar *end= key+key_len;
+ for (; pos < end ; pos++)
+ {
+ nr^= (ulong) ((((uint) nr & 63)+nr2)*((uint) *pos))+ (nr << 8);
+ nr2+= 3;
+ }
+ return nr % hash_entries;
+}
+
+
+/*
+ Clean up the hash table of the join buffer
+
+ SYNOPSIS
+ cleanup_hash_table()
+ key pointer to the key value
+ key_len key value length
+
+ DESCRIPTION
+ The function cleans up the hash table in the join buffer removing all
+ hash elements from the table.
+
+ RETURN
+ none
+*/
+
+void JOIN_CACHE_BKA_UNIQUE:: cleanup_hash_table()
+{
+ last_key_entry= hash_table;
+ bzero(hash_table, (buff+buff_size)-hash_table);
+ key_entries= 0;
+}
+
+
+/*
+ Initialize retrieval of range sequence for BKA_UNIQUE algorithm
+
+ SYNOPSIS
+ bka_range_seq_init()
+ init_params pointer to the BKA_INIQUE join cache object
+ n_ranges the number of ranges obtained
+ flags combination of HA_MRR_SINGLE_POINT, HA_MRR_FIXED_KEY
+
+ DESCRIPTION
+ The function interprets init_param as a pointer to a JOIN_CACHE_BKA_UNIQUE
+ object. The function prepares for an iteration over the unique join keys
+ built over the records from the cache join buffer.
+
+ NOTE
+ This function are used only as a callback function.
+
+ RETURN
+ init_param value that is to be used as a parameter of
+ bka_unique_range_seq_next()
+*/
+
+static
+range_seq_t bka_unique_range_seq_init(void *init_param, uint n_ranges,
+ uint flags)
+{
+ DBUG_ENTER("bka_unique_range_seq_init");
+ JOIN_CACHE_BKA_UNIQUE *cache= (JOIN_CACHE_BKA_UNIQUE *) init_param;
+ cache->reset(0);
+ DBUG_RETURN((range_seq_t) init_param);
+}
+
+
+/*
+ Get the key over the next record from the join buffer used by BKA_UNIQUE
+
+ SYNOPSIS
+ bka_unique_range_seq_next()
+ seq value returned by bka_unique_range_seq_init()
+ range OUT reference to the next range
+
+ DESCRIPTION
+ The function interprets seq as a pointer to the JOIN_CACHE_BKA_UNIQUE
+ object. The function returns a pointer to the range descriptor
+ for the next unique key built over records from the join buffer.
+
+ NOTE
+ This function are used only as a callback function.
+
+ RETURN
+ 0 ok, the range structure filled with info about the next key
+ 1 no more ranges
+*/
+
+static
+uint bka_unique_range_seq_next(range_seq_t rseq, KEY_MULTI_RANGE *range)
+{
+ DBUG_ENTER("bka_unique_range_seq_next");
+ JOIN_CACHE_BKA_UNIQUE *cache= (JOIN_CACHE_BKA_UNIQUE *) rseq;
+ TABLE_REF *ref= &cache->join_tab->ref;
+ key_range *start_key= &range->start_key;
+ if ((start_key->length= cache->get_next_key((uchar **) &start_key->key)))
+ {
+ start_key->keypart_map= (1 << ref->key_parts) - 1;
+ start_key->flag= HA_READ_KEY_EXACT;
+ range->end_key= *start_key;
+ range->end_key.flag= HA_READ_AFTER_KEY;
+ range->ptr= (char *) cache->get_curr_key_chain();
+ range->range_flag= EQ_RANGE;
+ DBUG_RETURN(0);
+ }
+ DBUG_RETURN(1);
+}
+
+
+/*
+ Check whether range_info orders to skip the next record from BKA_UNIQUE buffer
+
+ SYNOPSIS
+ bka_unique_range_seq_skip_record()
+ seq value returned by bka_unique_range_seq_init()
+ range_info information about the next range
+ rowid [NOT USED] rowid of the record to be checked (not used)
+
+ DESCRIPTION
+ The function interprets seq as a pointer to the JOIN_CACHE_BKA_UNIQUE
+ object. The function returns TRUE if the record with this range_info
+ is to be filtered out from the stream of records returned by
+ multi_range_read_next().
+
+ NOTE
+ This function are used only as a callback function.
+
+ RETURN
+ 1 record with this range_info is to be filtered out from the stream
+ of records returned by multi_range_read_next()
+ 0 the record is to be left in the stream
+*/
+
+static
+bool bka_unique_range_seq_skip_record(range_seq_t rseq, char *range_info,
+ uchar *rowid)
+{
+ DBUG_ENTER("bka_unique_range_seq_skip_record");
+ JOIN_CACHE_BKA_UNIQUE *cache= (JOIN_CACHE_BKA_UNIQUE *) rseq;
+ bool res= cache->check_all_match_flags_for_key((uchar *) range_info);
+ DBUG_RETURN(res);
+}
+
+
+/*
+ Check if the record combination matches the index condition
+
+ SYNOPSIS
+ JOIN_CACHE_BKA_UNIQUE::skip_index_tuple()
+ rseq Value returned by bka_range_seq_init()
+ range_info MRR range association data
+
+ DESCRIPTION
+ See JOIN_CACHE_BKA::skip_index_tuple().
+ This function is the variant for use with
+ JOIN_CACHE_BKA_UNIQUE. The difference from JOIN_CACHE_BKA case is that
+ there may be multiple previous table record combinations that share the
+ same key, i.e. they map to the same MRR range.
+ As a consequence, we need to loop through all previous table record
+ combinations that match the given MRR range key range_info until we find
+ one that satisfies the index condition.
+
+ NOTE
+ Possible optimization:
+ Before we unpack the record from a previous table
+ check if this table is used in the condition.
+ If so then unpack the record otherwise skip the unpacking.
+ This should be done by a special virtual method
+ get_partial_record_by_pos().
+
+ RETURN
+ 0 The record combination satisfies the index condition
+ 1 Otherwise
+
+
+*/
+
+bool JOIN_CACHE_BKA_UNIQUE::skip_index_tuple(range_seq_t rseq, char *range_info)
+{
+ DBUG_ENTER("JOIN_CACHE_BKA_UNIQUE::skip_index_tuple");
+ JOIN_CACHE_BKA_UNIQUE *cache= (JOIN_CACHE_BKA_UNIQUE *) rseq;
+ uchar *last_rec_ref_ptr= cache->get_next_rec_ref((uchar*) range_info);
+ uchar *next_rec_ref_ptr= last_rec_ref_ptr;
+ do
+ {
+ next_rec_ref_ptr= cache->get_next_rec_ref(next_rec_ref_ptr);
+ uchar *rec_ptr= next_rec_ref_ptr + cache->rec_fields_offset;
+ cache->get_record_by_pos(rec_ptr);
+ if (join_tab->cache_idx_cond->val_int())
+ DBUG_RETURN(FALSE);
+ } while(next_rec_ref_ptr != last_rec_ref_ptr);
+ DBUG_RETURN(TRUE);
+}
+
+
+/*
+ Check if the record combination matches the index condition
+
+ SYNOPSIS
+ bka_unique_skip_index_tuple()
+ rseq Value returned by bka_range_seq_init()
+ range_info MRR range association data
+
+ DESCRIPTION
+ This is wrapper for JOIN_CACHE_BKA_UNIQUE::skip_index_tuple method,
+ see comments there.
+
+ NOTE
+ This function is used as a RANGE_SEQ_IF::skip_index_tuple callback.
+
+ RETURN
+ 0 The record combination satisfies the index condition
+ 1 Otherwise
+*/
+
+static
+bool bka_unique_skip_index_tuple(range_seq_t rseq, char *range_info)
+{
+ DBUG_ENTER("bka_unique_skip_index_tuple");
+ JOIN_CACHE_BKA_UNIQUE *cache= (JOIN_CACHE_BKA_UNIQUE *) rseq;
+ DBUG_RETURN(cache->skip_index_tuple(rseq, range_info));
+}
+
+
+/*
+ Using BKA_UNIQUE find matches from the next table for records from join buffer
+
+ SYNOPSIS
+ join_matching_records()
+ skip_last do not look for matches for the last partial join record
+
+ DESCRIPTION
+ This function can be used only when the table join_tab can be accessed
+ by keys built over the fields of previous join tables.
+ The function retrieves all keys from the hash table of the join buffer
+ built for partial join records from the buffer. For each of these keys
+ the function performs an index lookup and tries to match records yielded
+ by this lookup with records from the join buffer attached to the key.
+ If a match is found the function will call the sub_select function trying
+ to look for matches for the remaining join operations.
+ This function does not assume that matching records are necessarily
+ returned with references to the keys by which they were found. If the call
+ of the function multi_range_read_init returns flags with
+ HA_MRR_NO_ASSOCIATION then a search for the key built from the returned
+ record is carried on. The search is performed by probing in in the hash
+ table of the join buffer.
+ This function currently is called only from the function join_records.
+ It's assumed that this function is always called with the skip_last
+ parameter equal to false.
+
+ RETURN
+ return one of enum_nested_loop_state
+*/
+
+enum_nested_loop_state
+JOIN_CACHE_BKA_UNIQUE::join_matching_records(bool skip_last)
+{
+ int error;
+ uchar *key_chain_ptr;
+ handler *file= join_tab->table->file;
+ enum_nested_loop_state rc= NESTED_LOOP_OK;
+ bool check_only_first_match= join_tab->check_only_first_match();
+ bool no_association= test(mrr_mode & HA_MRR_NO_ASSOCIATION);
+
+ /* Set functions to iterate over keys in the join buffer */
+ RANGE_SEQ_IF seq_funcs= { bka_unique_range_seq_init,
+ bka_unique_range_seq_next,
+ check_only_first_match && !no_association ?
+ bka_unique_range_seq_skip_record : 0,
+ join_tab->cache_idx_cond ?
+ bka_unique_skip_index_tuple : 0 };
+
+ /* The value of skip_last must be always FALSE when this function is called */
+ DBUG_ASSERT(!skip_last);
+
+ /* Return at once if there are no records in the join buffer */
+ if (!records)
+ return NESTED_LOOP_OK;
+
+ rc= init_join_matching_records(&seq_funcs, key_entries);
+ if (rc != NESTED_LOOP_OK)
+ goto finish;
+
+ while (!(error= file->multi_range_read_next((char **) &key_chain_ptr)))
+ {
+ if (no_association)
+ {
+ uchar *key_ref_ptr;
+ TABLE *table= join_tab->table;
+ TABLE_REF *ref= &join_tab->ref;
+ KEY *keyinfo= table->key_info+ref->key;
+ /*
+ Build the key value out of the record returned by the call of
+ multi_range_read_next in the record buffer
+ */
+ key_copy(ref->key_buff, table->record[0], keyinfo, ref->key_length);
+ /* Look for this key in the join buffer */
+ if (!key_search(ref->key_buff, ref->key_length, &key_ref_ptr))
+ continue;
+ key_chain_ptr= key_ref_ptr+get_size_of_key_offset();
+ }
+
+ uchar *last_rec_ref_ptr= get_next_rec_ref(key_chain_ptr);
+ uchar *next_rec_ref_ptr= last_rec_ref_ptr;
+ do
+ {
+ next_rec_ref_ptr= get_next_rec_ref(next_rec_ref_ptr);
+ uchar *rec_ptr= next_rec_ref_ptr+rec_fields_offset;
+
+ if (join->thd->killed)
+ {
+ /* The user has aborted the execution of the query */
+ join->thd->send_kill_message();
+ rc= NESTED_LOOP_KILLED;
+ goto finish;
+ }
+ /*
+ If only the first match is needed and it has been already found
+ for the associated partial join record then the returned candidate
+ is discarded.
+ */
+ if (rc == NESTED_LOOP_OK &&
+ (!check_only_first_match || !get_match_flag_by_pos(rec_ptr)))
+ {
+ get_record_by_pos(rec_ptr);
+ rc= generate_full_extensions(rec_ptr);
+ if (rc != NESTED_LOOP_OK && rc != NESTED_LOOP_NO_MORE_ROWS)
+ goto finish;
+ }
+ }
+ while (next_rec_ref_ptr != last_rec_ref_ptr);
+ }
+
+ if (error > 0 && error != HA_ERR_END_OF_FILE)
+ return NESTED_LOOP_ERROR;
+finish:
+ return end_join_matching_records(rc);
+}
+
+
+/*
+ Check whether all records in a key chain have their match flags set on
+
+ SYNOPSIS
+ check_all_match_flags_for_key()
+ key_chain_ptr
+
+ DESCRIPTION
+ This function retrieves records in the given circular chain and checks
+ whether their match flags are set on. The parameter key_chain_ptr shall
+ point to the position in the join buffer storing the reference to the
+ last element of this chain.
+
+ RETURN
+ TRUE if each retrieved record has its match flag set on
+ FALSE otherwise
+*/
+
+bool JOIN_CACHE_BKA_UNIQUE::check_all_match_flags_for_key(uchar *key_chain_ptr)
+{
+ uchar *last_rec_ref_ptr= get_next_rec_ref(key_chain_ptr);
+ uchar *next_rec_ref_ptr= last_rec_ref_ptr;
+ do
+ {
+ next_rec_ref_ptr= get_next_rec_ref(next_rec_ref_ptr);
+ uchar *rec_ptr= next_rec_ref_ptr+rec_fields_offset;
+ if (!get_match_flag_by_pos(rec_ptr))
+ return FALSE;
+ }
+ while (next_rec_ref_ptr != last_rec_ref_ptr);
+ return TRUE;
+}
+
+
+/*
+ Get the next key built for the records from BKA_UNIQUE join buffer
+
+ SYNOPSIS
+ get_next_key()
+ key pointer to the buffer where the key value is to be placed
+
+ DESCRIPTION
+ The function reads the next key value stored in the hash table of the
+ join buffer. Depending on the value of the use_emb_key flag of the
+ join cache the value is read either from the table itself or from
+ the record field where it occurs.
+
+ RETURN
+ length of the key value - if the starting value of 'cur_key_entry' refers
+ to the position after that referred by the the value of 'last_key_entry'
+ 0 - otherwise.
+*/
+
+uint JOIN_CACHE_BKA_UNIQUE::get_next_key(uchar ** key)
+{
+ if (curr_key_entry == last_key_entry)
+ return 0;
+
+ curr_key_entry-= key_entry_length;
+
+ *key = use_emb_key ? get_emb_key(curr_key_entry) : curr_key_entry;
+
+ DBUG_ASSERT(*key >= buff && *key < hash_table);
+
+ return key_length;
+}
+
+
+/****************************************************************************
+ * Join cache module end
+ ****************************************************************************/
=== modified file 'sql/sql_select.cc'
--- a/sql/sql_select.cc 2009-12-15 17:23:55 +0000
+++ b/sql/sql_select.cc 2009-12-21 02:26:15 +0000
@@ -94,7 +94,7 @@ static store_key *get_store_key(THD *thd
uint maybe_null);
static void make_outerjoin_info(JOIN *join);
static bool make_join_select(JOIN *join,SQL_SELECT *select,COND *item);
-static void make_join_readinfo(JOIN *join, ulonglong options);
+static bool make_join_readinfo(JOIN *join, ulonglong options, uint no_jbuf_after);
static bool only_eq_ref_tables(JOIN *join, ORDER *order, table_map tables);
static void update_depend_map(JOIN *join);
static void update_depend_map(JOIN *join, ORDER *order);
@@ -165,7 +165,7 @@ static int join_no_more_records(READ_REC
static int join_read_next(READ_RECORD *info);
static int join_init_quick_read_record(JOIN_TAB *tab);
static int test_if_quick_select(JOIN_TAB *tab);
-static int join_init_read_record(JOIN_TAB *tab);
+static bool test_if_use_dynamic_range_scan(JOIN_TAB *join_tab);
static int join_read_first(JOIN_TAB *tab);
static int join_read_next(READ_RECORD *info);
static int join_read_next_same(READ_RECORD *info);
@@ -199,14 +199,7 @@ static int remove_dup_with_compare(THD *
ulong offset,Item *having);
static int remove_dup_with_hash_index(THD *thd,TABLE *table,
uint field_count, Field **first_field,
-
ulong key_length,Item *having);
-static int join_init_cache(THD *thd,JOIN_TAB *tables,uint table_count);
-static ulong used_blob_length(CACHE_FIELD **ptr);
-static bool store_record_in_cache(JOIN_CACHE *cache);
-static void reset_cache_read(JOIN_CACHE *cache);
-static void reset_cache_write(JOIN_CACHE *cache);
-static void read_cached_record(JOIN_TAB *tab);
static bool cmp_buffer_with_ref(JOIN_TAB *tab);
static bool setup_new_fields(THD *thd, List<Item> &fields,
List<Item> &all_fields, ORDER *new_order);
@@ -242,6 +235,7 @@ static Item *remove_additional_cond(Item
static void add_group_and_distinct_keys(JOIN *join, JOIN_TAB *join_tab);
static bool test_if_ref(COND *root_cond,
Item_field *left_item,Item *right_item);
+static uint make_join_orderinfo(JOIN *join);
/**
This handles SELECT with and without UNION.
@@ -776,6 +770,9 @@ static void save_index_subquery_explain_
int
JOIN::optimize()
{
+ ulonglong select_opts_for_readinfo;
+ uint no_jbuf_after;
+
DBUG_ENTER("JOIN::optimize");
// to prevent double initialization on EXPLAIN
if (optimized)
@@ -1268,12 +1265,23 @@ JOIN::optimize()
(group_list && order) ||
test(select_options & OPTION_BUFFER_RESULT)));
- // No cache for MATCH
- make_join_readinfo(this,
- (select_options & (SELECT_DESCRIBE |
- SELECT_NO_JOIN_CACHE)) |
- (select_lex->ftfunc_list->elements ?
- SELECT_NO_JOIN_CACHE : 0));
+ /*
+ If the hint FORCE INDEX FOR ORDER BY/GROUP BY is used for the table
+ whose columns are required to be returned in a sorted order, then
+ the proper value for no_jbuf_after should be yielded by a call to
+ the make_join_orderinfo function.
+ Yet the current implementation of FORCE INDEX hints does not
+ allow us to do it in a clean manner.
+ */
+ no_jbuf_after= 1 ? tables : make_join_orderinfo(this);
+
+ select_opts_for_readinfo=
+ (select_options & (SELECT_DESCRIBE | SELECT_NO_JOIN_CACHE)) |
+ (select_lex->ftfunc_list->elements ? SELECT_NO_JOIN_CACHE : 0);
+
+ // No cache for MATCH == 'Don't use join buffering when we use MATCH'.
+ if (make_join_readinfo(this, select_opts_for_readinfo, no_jbuf_after))
+ DBUG_RETURN(1);
/* Perform FULLTEXT search before all regular searches */
if (!(select_options & SELECT_DESCRIBE))
@@ -5418,13 +5426,14 @@ find_best(JOIN *join,table_map rest_tabl
Find how much space the prevous read not const tables takes in cache.
*/
-static void calc_used_field_length(THD *thd, JOIN_TAB *join_tab)
+void calc_used_field_length(THD *thd, JOIN_TAB *join_tab)
{
uint null_fields,blobs,fields,rec_length;
Field **f_ptr,*field;
- MY_BITMAP *read_set= join_tab->table->read_set;;
+ uint uneven_bit_fields;
+ MY_BITMAP *read_set= join_tab->table->read_set;
- null_fields= blobs= fields= rec_length=0;
+ uneven_bit_fields= null_fields= blobs= fields= rec_length=0;
for (f_ptr=join_tab->table->field ; (field= *f_ptr) ; f_ptr++)
{
if (bitmap_is_set(read_set, field->field_index))
@@ -5436,21 +5445,26 @@ static void calc_used_field_length(THD *
blobs++;
if (!(flags & NOT_NULL_FLAG))
null_fields++;
+ if (field->type() == MYSQL_TYPE_BIT &&
+ ((Field_bit*)field)->bit_len)
+ uneven_bit_fields++;
}
}
- if (null_fields)
+ if (null_fields || uneven_bit_fields)
rec_length+=(join_tab->table->s->null_fields+7)/8;
if (join_tab->table->maybe_null)
rec_length+=sizeof(my_bool);
if (blobs)
{
uint blob_length=(uint) (join_tab->table->file->stats.mean_rec_length-
- (join_tab->table->s->reclength- rec_length));
+ (join_tab->table->s->reclength-rec_length));
rec_length+=(uint) max(4,blob_length);
}
join_tab->used_fields=fields;
join_tab->used_fieldlength=rec_length;
join_tab->used_blobs=blobs;
+ join_tab->used_null_fields= null_fields;
+ join_tab->used_uneven_bit_fields= uneven_bit_fields;
}
@@ -5873,7 +5887,9 @@ JOIN::make_simple_join(JOIN *parent, TAB
row_limit= unit->select_limit_cnt;
do_send_rows= row_limit ? 1 : 0;
- join_tab->cache.buff=0; /* No caching */
+ join_tab->use_join_cache= FALSE;
+ join_tab->cache=0; /* No caching */
+ join_tab->cache_select= 0;
join_tab->table=tmp_table;
join_tab->select=0;
join_tab->set_select_cond(NULL, __LINE__);
@@ -6417,7 +6433,8 @@ make_join_select(JOIN *join,SQL_SELECT *
2 : 1;
sel->read_tables= used_tables & ~current_map;
}
- if (i != join->const_tables && tab->use_quick != 2)
+ if (i != join->const_tables && tab->use_quick != 2 &&
+ !tab->first_inner)
{ /* Read with cache */
if (cond &&
(tmp=make_cond_for_table(cond,
@@ -6426,10 +6443,10 @@ make_join_select(JOIN *join,SQL_SELECT *
current_map)))
{
DBUG_EXECUTE("where",print_where(tmp,"cache", QT_ORDINARY););
- tab->cache.select=(SQL_SELECT*)
+ tab->cache_select=(SQL_SELECT*)
thd->memdup((uchar*) sel, sizeof(SQL_SELECT));
- tab->cache.select->cond=tmp;
- tab->cache.select->read_tables=join->const_table_map;
+ tab->cache_select->cond=tmp;
+ tab->cache_select->read_tables=join->const_table_map;
}
}
}
@@ -6465,6 +6482,8 @@ make_join_select(JOIN *join,SQL_SELECT *
if (!cond_tab->select_cond)
DBUG_RETURN(1);
cond_tab->select_cond->quick_fix_field();
+ if (cond_tab->select)
+ cond_tab->select->cond= cond_tab->select_cond;
}
}
@@ -6530,8 +6549,338 @@ make_join_select(JOIN *join,SQL_SELECT *
DBUG_RETURN(0);
}
-static void
-make_join_readinfo(JOIN *join, ulonglong options)
+/*
+ Determine {after which table we'll produce ordered set}
+
+ SYNOPSIS
+ make_join_orderinfo()
+ join
+
+
+ DESCRIPTION
+ Determine if the set is already ordered for ORDER BY, so it can
+ disable join cache because it will change the ordering of the results.
+ Code handles sort table that is at any location (not only first after
+ the const tables) despite the fact that it's currently prohibited.
+ We must disable join cache if the first non-const table alone is
+ ordered. If there is a temp table the ordering is done as a last
+ operation and doesn't prevent join cache usage.
+
+ RETURN
+ Number of table after which the set will be ordered
+ join->tables if we don't need an ordered set
+*/
+
+static uint make_join_orderinfo(JOIN *join)
+{
+ JOIN_TAB *tab;
+ if (join->need_tmp)
+ return join->tables;
+ tab= join->get_sort_by_join_tab();
+ return tab ? tab-join->join_tab : join->tables;
+}
+
+/*
+ Deny usage of join buffer for the specified table
+
+ SYNOPSIS
+ set_join_cache_denial()
+ tab join table for which join buffer usage is to be denied
+
+ DESCRIPTION
+ The function denies usage of join buffer when joining the table 'tab'.
+ The table is marked as not employing any join buffer. If a join cache
+ object has been already allocated for the table this object is destroyed.
+
+ RETURN
+ none
+*/
+
+static
+void set_join_cache_denial(JOIN_TAB *join_tab)
+{
+ if (join_tab->cache)
+ {
+ join_tab->cache->free();
+ join_tab->cache= 0;
+ }
+ if (join_tab->use_join_cache)
+ {
+ join_tab->use_join_cache= FALSE;
+ /*
+ It could be only sub_select(). It could not be sub_seject_sjm because we
+ don't do join buffering for the first table in sjm nest.
+ */
+ join_tab[-1].next_select= sub_select;
+ }
+}
+
+
+/*
+ Revise usage of join buffer for the specified table and the whole nest
+
+ SYNOPSIS
+ revise_cache_usage()
+ tab join table for which join buffer usage is to be revised
+
+ DESCRIPTION
+ The function revise the decision to use a join buffer for the table 'tab'.
+ If this table happened to be among the inner tables of a nested outer join/
+ semi-join the functions denies usage of join buffers for all of them
+
+ RETURN
+ none
+*/
+
+static
+void revise_cache_usage(JOIN_TAB *join_tab)
+{
+ JOIN_TAB *tab;
+ JOIN_TAB *first_inner;
+
+ if (join_tab->first_inner)
+ {
+ JOIN_TAB *end_tab= join_tab;
+ for (first_inner= join_tab->first_inner;
+ first_inner;
+ first_inner= first_inner->first_upper)
+ {
+ for (tab= end_tab-1; tab >= first_inner; tab--)
+ set_join_cache_denial(tab);
+ end_tab= first_inner;
+ }
+ }
+ else if (join_tab->first_sj_inner_tab)
+ {
+ first_inner= join_tab->first_sj_inner_tab;
+ for (tab= join_tab-1; tab >= first_inner; tab--)
+ {
+ if (tab->first_sj_inner_tab == first_inner)
+ set_join_cache_denial(tab);
+ }
+ }
+ else set_join_cache_denial(join_tab);
+}
+
+
+/*
+ Check whether a join buffer can be used to join the specified table
+
+ SYNOPSIS
+ check_join_cache_usage()
+ tab joined table to check join buffer usage for
+ join join for which the check is performed
+ options options of the join
+ no_jbuf_after don't use join buffering after table with this number
+ icp_other_tables_ok OUT TRUE if condition pushdown supports
+ other tables presence
+
+ DESCRIPTION
+ The function finds out whether the table 'tab' can be joined using a join
+ buffer. This check is performed after the best execution plan for 'join'
+ has been chosen. If the function decides that a join buffer can be employed
+ then it selects the most appropriate join cache object that contains this
+ join buffer.
+ The result of the check and the type of the the join buffer to be used
+ depend on:
+ - the access method to access rows of the joined table
+ - whether the join table is an inner table of an outer join or semi-join
+ - the join cache level set for the query
+ - the join 'options'.
+ In any case join buffer is not used if the number of the joined table is
+ greater than 'no_jbuf_after'. It's also never used if the value of
+ join_cache_level is equal to 0.
+ The other valid settings of join_cache_level lay in the interval 1..8.
+ If join_cache_level==1|2 then join buffer is used only for inner joins
+ with 'JT_ALL' access method.
+ If join_cache_level==3|4 then join buffer is used for any join operation
+ (inner join, outer join, semi-join) with 'JT_ALL' access method.
+ If 'JT_ALL' access method is used to read rows of the joined table then
+ always a JOIN_CACHE_BNL object is employed.
+ If an index is used to access rows of the joined table and the value of
+ join_cache_level==5|6 then a JOIN_CACHE_BKA object is employed.
+ If an index is used to access rows of the joined table and the value of
+ join_cache_level==7|8 then a JOIN_CACHE_BKA_UNIQUE object is employed.
+ If the value of join_cache_level is odd then creation of a non-linked
+ join cache is forced.
+ If the function decides that a join buffer can be used to join the table
+ 'tab' then it sets the value of tab->use_join_buffer to TRUE and assigns
+ the selected join cache object to the field 'cache' of the previous
+ join table.
+ If the function creates a join cache object it tries to initialize it. The
+ failure to do this results in an invocation of the function that destructs
+ the created object.
+
+ NOTES
+ An inner table of a nested outer join or a nested semi-join can be currently
+ joined only when a linked cache object is employed. In these cases setting
+ join cache level to an odd number results in denial of usage of any join
+ buffer when joining the table.
+ For a nested outer join/semi-join, currently, we either use join buffers for
+ all inner tables or for none of them.
+ Some engines (e.g. Falcon) currently allow to use only a join cache
+ of the type JOIN_CACHE_BKA_UNIQUE when the joined table is accessed through
+ an index. For these engines setting the value of join_cache_level to 5 or 6
+ results in that no join buffer is used to join the table.
+
+ TODO
+ Support BKA inside SJ-Materialization nests. When doing this, we'll need
+ to only store sj-inner tables in the join buffer.
+#if 0
+ JOIN_TAB *first_tab= join->join_tab+join->const_tables;
+ uint n_tables= i-join->const_tables;
+ / *
+ We normally put all preceding tables into the join buffer, except
+ for the constant tables.
+ If we're inside a semi-join materialization nest, e.g.
+
+ outer_tbl1 outer_tbl2 ( inner_tbl1, inner_tbl2 ) ...
+ ^-- we're here
+
+ then we need to put into the join buffer only the tables from
+ within the nest.
+ * /
+ if (i >= first_sjm_table && i < last_sjm_table)
+ {
+ n_tables= i - first_sjm_table; // will be >0 if we got here
+ first_tab= join->join_tab + first_sjm_table;
+ }
+#endif
+
+ RETURN
+
+ cache level if cache is used, otherwise returns 0
+*/
+
+static
+uint check_join_cache_usage(JOIN_TAB *tab,
+ JOIN *join, ulonglong options,
+ uint no_jbuf_after,
+ bool *icp_other_tables_ok)
+{
+ uint flags;
+ COST_VECT cost;
+ ha_rows rows;
+ uint bufsz= 4096;
+ JOIN_CACHE *prev_cache=0;
+ uint cache_level= join->thd->variables.join_cache_level;
+ bool force_unlinked_cache= test(cache_level & 1);
+ uint i= tab - join->join_tab;
+
+ *icp_other_tables_ok= TRUE;
+ if (cache_level == 0 || i == join->const_tables)
+ return 0;
+
+ if (options & SELECT_NO_JOIN_CACHE)
+ goto no_join_cache;
+ /*
+ psergey-todo: why the below when execution code seems to handle the
+ "range checked for each record" case?
+ */
+ if (tab->use_quick == 2)
+ goto no_join_cache;
+ /*
+ Use join cache with FirstMatch semi-join strategy only when semi-join
+ contains only one table.
+ */
+ if (tab->is_inner_table_of_semi_join_with_first_match() &&
+ !tab->is_single_inner_of_semi_join_with_first_match())
+ goto no_join_cache;
+ /*
+ Non-linked join buffers can't guarantee one match
+ */
+ if (force_unlinked_cache &&
+ (tab->is_inner_table_of_outer_join() &&
+ !tab->is_single_inner_of_outer_join()))
+ goto no_join_cache;
+
+ /*
+ Don't use join buffering if we're dictated not to by no_jbuf_after (this
+ ...)
+ */
+/* !!!NB igor: Enable the statement in the comment after backporting the SJ code
+ if (!(i <= no_jbuf_after) || tab->loosescan_match_tab ||
+ sj_is_materialize_strategy(join->best_positions[i].sj_strategy))
+ goto no_join_cache;
+*/
+
+ for (JOIN_TAB *first_inner= tab->first_inner; first_inner;
+ first_inner= first_inner->first_upper)
+ {
+ if (first_inner != tab && !first_inner->use_join_cache)
+ goto no_join_cache;
+ }
+ if (tab->first_sj_inner_tab && tab->first_sj_inner_tab != tab &&
+ !tab->first_sj_inner_tab->use_join_cache)
+ goto no_join_cache;
+ if (!tab[-1].use_join_cache)
+ {
+ /*
+ Check whether table tab and the previous one belong to the same nest of
+ inner tables and if so do not use join buffer when joining table tab.
+ */
+ if (tab->first_inner)
+ {
+ for (JOIN_TAB *first_inner= tab[-1].first_inner;
+ first_inner;
+ first_inner= first_inner->first_upper)
+ {
+ if (first_inner == tab->first_inner)
+ goto no_join_cache;
+ }
+ }
+ else if (tab->first_sj_inner_tab &&
+ tab->first_sj_inner_tab == tab[-1].first_sj_inner_tab)
+ goto no_join_cache;
+ }
+
+ if (!force_unlinked_cache)
+ prev_cache= tab[-1].cache;
+
+ switch (tab->type) {
+ case JT_ALL:
+ if (cache_level <= 2 && (tab->first_inner || tab->first_sj_inner_tab))
+ goto no_join_cache;
+ if ((options & SELECT_DESCRIBE) ||
+ (((tab->cache= new JOIN_CACHE_BNL(join, tab, prev_cache))) &&
+ !tab->cache->init()))
+ {
+ *icp_other_tables_ok= FALSE;
+ return cache_level;
+ }
+ goto no_join_cache;
+ case JT_SYSTEM:
+ case JT_CONST:
+ case JT_REF:
+ case JT_EQ_REF:
+ if (cache_level <= 4)
+ return 0;
+ flags= HA_MRR_NO_NULL_ENDPOINTS;
+ if (tab->table->covering_keys.is_set(tab->ref.key))
+ flags|= HA_MRR_INDEX_ONLY;
+ rows= tab->table->file->multi_range_read_info(tab->ref.key, 10, 20,
+ &bufsz, &flags, &cost);
+ if ((rows != HA_POS_ERROR) && !(flags & HA_MRR_USE_DEFAULT_IMPL) &&
+ (!(flags & HA_MRR_NO_ASSOCIATION) || cache_level > 6) &&
+ ((options & SELECT_DESCRIBE) ||
+ (((cache_level <= 6 &&
+ (tab->cache= new JOIN_CACHE_BKA(join, tab, flags, prev_cache))) ||
+ (cache_level > 6 &&
+ (tab->cache= new JOIN_CACHE_BKA_UNIQUE(join, tab, flags, prev_cache)))
+ ) && !tab->cache->init())))
+ return cache_level;
+ goto no_join_cache;
+ default : ;
+ }
+
+no_join_cache:
+ if (cache_level>2)
+ revise_cache_usage(tab);
+ return 0;
+}
+
+static bool
+make_join_readinfo(JOIN *join, ulonglong options, uint no_jbuf_after)
{
uint i;
bool statistics= test(!(join->select_options & SELECT_DESCRIBE));
@@ -6543,6 +6892,7 @@ make_join_readinfo(JOIN *join, ulonglong
{
JOIN_TAB *tab=join->join_tab+i;
TABLE *table=tab->table;
+ bool icp_other_tables_ok;
tab->read_record.table= table;
tab->read_record.file=table->file;
tab->next_select=sub_select; /* normal select */
@@ -6565,14 +6915,18 @@ make_join_readinfo(JOIN *join, ulonglong
sorted= 0; // only first must be sorted
switch (tab->type) {
case JT_SYSTEM: // Only happens with left join
- table->status=STATUS_NO_RECORD;
- tab->read_first_record= join_read_system;
- tab->read_record.read_record= join_no_more_records;
- break;
case JT_CONST: // Only happens with left join
+ /* Only happens with outer joins */
table->status=STATUS_NO_RECORD;
- tab->read_first_record= join_read_const;
+ tab->read_first_record= tab->type == JT_SYSTEM ?
+ join_read_system :join_read_const;
tab->read_record.read_record= join_no_more_records;
+ if (check_join_cache_usage(tab, join, options, no_jbuf_after,
+ &icp_other_tables_ok))
+ {
+ tab->use_join_cache= TRUE;
+ tab[-1].next_select=sub_select_cache;
+ }
if (table->covering_keys.is_set(tab->ref.key) &&
!table->no_keyread)
{
@@ -6580,7 +6934,7 @@ make_join_readinfo(JOIN *join, ulonglong
table->file->extra(HA_EXTRA_KEYREAD);
}
else
- push_index_cond(tab, tab->ref.key, TRUE);
+ push_index_cond(tab, tab->ref.key, icp_other_tables_ok);
break;
case JT_EQ_REF:
table->status=STATUS_NO_RECORD;
@@ -6593,6 +6947,12 @@ make_join_readinfo(JOIN *join, ulonglong
tab->quick=0;
tab->read_first_record= join_read_key;
tab->read_record.read_record= join_no_more_records;
+ if (check_join_cache_usage(tab, join, options, no_jbuf_after,
+ &icp_other_tables_ok))
+ {
+ tab->use_join_cache= TRUE;
+ tab[-1].next_select=sub_select_cache;
+ }
if (table->covering_keys.is_set(tab->ref.key) &&
!table->no_keyread)
{
@@ -6600,7 +6960,7 @@ make_join_readinfo(JOIN *join, ulonglong
table->file->extra(HA_EXTRA_KEYREAD);
}
else
- push_index_cond(tab, tab->ref.key, TRUE);
+ push_index_cond(tab, tab->ref.key, icp_other_tables_ok );
break;
case JT_REF_OR_NULL:
case JT_REF:
@@ -6612,6 +6972,12 @@ make_join_readinfo(JOIN *join, ulonglong
}
delete tab->quick;
tab->quick=0;
+ if (check_join_cache_usage(tab, join, options, no_jbuf_after,
+ &icp_other_tables_ok))
+ {
+ tab->use_join_cache= TRUE;
+ tab[-1].next_select=sub_select_cache;
+ }
if (tab->type == JT_REF)
{
tab->read_first_record= join_read_always_key;
@@ -6629,7 +6995,7 @@ make_join_readinfo(JOIN *join, ulonglong
table->file->extra(HA_EXTRA_KEYREAD);
}
else
- push_index_cond(tab, tab->ref.key, TRUE);
+ push_index_cond(tab, tab->ref.key, icp_other_tables_ok);
break;
case JT_FT:
table->status=STATUS_NO_RECORD;
@@ -6642,15 +7008,11 @@ make_join_readinfo(JOIN *join, ulonglong
If the incoming data set is already sorted don't use cache.
*/
table->status=STATUS_NO_RECORD;
- if (i != join->const_tables && !(options & SELECT_NO_JOIN_CACHE) &&
- tab->use_quick != 2 && !tab->first_inner && !ordered_set)
+ if (check_join_cache_usage(tab, join, options, no_jbuf_after,
+ &icp_other_tables_ok))
{
- if ((options & SELECT_DESCRIBE) ||
- !join_init_cache(join->thd,join->join_tab+join->const_tables,
- i-join->const_tables))
- {
- tab[-1].next_select=sub_select_cache; /* Patch previous */
- }
+ tab->use_join_cache= TRUE;
+ tab[-1].next_select=sub_select_cache;
}
/* These init changes read_record */
if (tab->use_quick == 2)
@@ -6727,7 +7089,7 @@ make_join_readinfo(JOIN *join, ulonglong
}
if (tab->select && tab->select->quick &&
tab->select->quick->index != MAX_KEY && ! tab->table->key_read)
- push_index_cond(tab, tab->select->quick->index, FALSE);
+ push_index_cond(tab, tab->select->quick->index, icp_other_tables_ok);
}
break;
default:
@@ -6739,7 +7101,32 @@ make_join_readinfo(JOIN *join, ulonglong
}
}
join->join_tab[join->tables-1].next_select=0; /* Set by do_select */
- DBUG_VOID_RETURN;
+
+/*
+ If a join buffer is used to join a table the ordering by an index
+ for the first non-constant table cannot be employed anymore.
+ */
+ for (i=join->const_tables ; i < join->tables ; i++)
+ {
+ JOIN_TAB *tab=join->join_tab+i;
+ if (tab->use_join_cache)
+ {
+ JOIN_TAB *sort_by_tab= join->get_sort_by_join_tab();
+ if (sort_by_tab && !join->need_tmp)
+ {
+ join->need_tmp= 1;
+ join->simple_order= join->simple_group= 0;
+ if (sort_by_tab->type == JT_NEXT)
+ {
+ sort_by_tab->type= JT_ALL;
+ sort_by_tab->read_first_record= join_init_read_record;
+ }
+ }
+ break;
+ }
+ }
+
+ DBUG_RETURN(FALSE);
}
@@ -6784,8 +7171,11 @@ void JOIN_TAB::cleanup()
select= 0;
delete quick;
quick= 0;
- x_free(cache.buff);
- cache.buff= 0;
+ if (cache)
+ {
+ cache->free();
+ cache= 0;
+ }
limit= 0;
if (table)
{
@@ -11240,33 +11630,85 @@ do_select(JOIN *join,List<Item> *fields,
}
+/*
+ Fill the join buffer with partial records, retrieve all full matches for them
+
+ SYNOPSIS
+ sub_select_cache()
+ join pointer to the structure providing all context info for the query
+ join_tab the first next table of the execution plan to be retrieved
+ end_records true when we need to perform final steps of the retrieval
+
+ DESCRIPTION
+ For a given table Ti= join_tab from the sequence of tables of the chosen
+ execution plan T1,...,Ti,...,Tn the function just put the partial record
+ t1,...,t[i-1] into the join buffer associated with table Ti unless this
+ is the last record added into the buffer. In this case, the function
+ additionally finds all matching full records for all partial
+ records accumulated in the buffer, after which it cleans the buffer up.
+ If a partial join record t1,...,ti is extended utilizing a dynamic
+ range scan then it is not put into the join buffer. Rather all matching
+ records are found for it at once by the function sub_select.
+
+ NOTES
+ The function implements the algorithmic schema for both Blocked Nested
+ Loop Join and Batched Key Access Join. The difference can be seen only at
+ the level of of the implementation of the put_record and join_records
+ virtual methods for the cache object associated with the join_tab.
+ The put_record method accumulates records in the cache, while the
+ join_records method builds all matching join records and send them into
+ the output stream.
+
+ RETURN
+ return one of enum_nested_loop_state, except NESTED_LOOP_NO_MORE_ROWS.
+*/
+
enum_nested_loop_state
-sub_select_cache(JOIN *join,JOIN_TAB *join_tab,bool end_of_records)
+sub_select_cache(JOIN *join, JOIN_TAB *join_tab, bool end_of_records)
{
enum_nested_loop_state rc;
+ JOIN_CACHE *cache= join_tab->cache;
+
+ DBUG_ENTER("sub_select_cache");
+
+ /* This function cannot be called if join_tab has no associated join buffer */
+ DBUG_ASSERT(cache != NULL);
+
+ join_tab->cache->reset_join(join);
if (end_of_records)
{
- rc= flush_cached_records(join,join_tab,FALSE);
+ rc= cache->join_records(FALSE);
if (rc == NESTED_LOOP_OK || rc == NESTED_LOOP_NO_MORE_ROWS)
- rc= sub_select(join,join_tab,end_of_records);
- return rc;
+ rc= sub_select(join, join_tab, end_of_records);
+ DBUG_RETURN(rc);
}
- if (join->thd->killed) // If aborted by user
+ if (join->thd->killed)
{
+ /* The user has aborted the execution of the query */
join->thd->send_kill_message();
- return NESTED_LOOP_KILLED; /* purecov: inspected */
+ DBUG_RETURN(NESTED_LOOP_KILLED);
}
- if (join_tab->use_quick != 2 || test_if_quick_select(join_tab) <= 0)
+ if (!test_if_use_dynamic_range_scan(join_tab))
{
- if (!store_record_in_cache(&join_tab->cache))
- return NESTED_LOOP_OK; // There is more room in cache
- return flush_cached_records(join,join_tab,FALSE);
+ if (!cache->put_record())
+ DBUG_RETURN(NESTED_LOOP_OK);
+ /*
+ We has decided that after the record we've just put into the buffer
+ won't add any more records. Now try to find all the matching
+ extensions for all records in the buffer.
+ */
+ rc= cache->join_records(FALSE);
+ DBUG_RETURN(rc);
}
- rc= flush_cached_records(join, join_tab, TRUE);
+ /*
+ TODO: Check whether we really need the call below and we can't do
+ without it. If it's not the case remove it.
+ */
+ rc= cache->join_records(TRUE);
if (rc == NESTED_LOOP_OK || rc == NESTED_LOOP_NO_MORE_ROWS)
rc= sub_select(join, join_tab, end_of_records);
- return rc;
+ DBUG_RETURN(rc);
}
/**
@@ -11447,7 +11889,6 @@ sub_select(JOIN *join,JOIN_TAB *join_tab
return rc;
}
-
/**
Process one record of the nested loop join.
@@ -11642,82 +12083,6 @@ evaluate_null_complemented_join_record(J
}
-static enum_nested_loop_state
-flush_cached_records(JOIN *join,JOIN_TAB *join_tab,bool skip_last)
-{
- enum_nested_loop_state rc= NESTED_LOOP_OK;
- int error;
- READ_RECORD *info;
-
- join_tab->table->null_row= 0;
- if (!join_tab->cache.records)
- return NESTED_LOOP_OK; /* Nothing to do */
- if (skip_last)
- (void) store_record_in_cache(&join_tab->cache); // Must save this for later
- if (join_tab->use_quick == 2)
- {
- if (join_tab->select->quick)
- { /* Used quick select last. reset it */
- delete join_tab->select->quick;
- join_tab->select->quick=0;
- }
- }
- /* read through all records */
- if ((error=join_init_read_record(join_tab)))
- {
- reset_cache_write(&join_tab->cache);
- return error < 0 ? NESTED_LOOP_NO_MORE_ROWS: NESTED_LOOP_ERROR;
- }
-
- for (JOIN_TAB *tmp=join->join_tab; tmp != join_tab ; tmp++)
- {
- tmp->status=tmp->table->status;
- tmp->table->status=0;
- }
-
- info= &join_tab->read_record;
- do
- {
- if (join->thd->killed)
- {
- join->thd->send_kill_message();
- return NESTED_LOOP_KILLED; // Aborted by user /* purecov: inspected */
- }
- SQL_SELECT *select=join_tab->select;
- if (rc == NESTED_LOOP_OK)
- update_virtual_fields(join_tab->table);
- if (rc == NESTED_LOOP_OK &&
- (!join_tab->cache.select || !join_tab->cache.select->skip_record()))
- {
- uint i;
- reset_cache_read(&join_tab->cache);
- for (i=(join_tab->cache.records- (skip_last ? 1 : 0)) ; i-- > 0 ;)
- {
- read_cached_record(join_tab);
- if (!select || !select->skip_record())
- {
- rc= (join_tab->next_select)(join,join_tab+1,0);
- if (rc != NESTED_LOOP_OK && rc != NESTED_LOOP_NO_MORE_ROWS)
- {
- reset_cache_write(&join_tab->cache);
- return rc;
- }
- }
- }
- }
- } while (!(error=info->read_record(info)));
-
- if (skip_last)
- read_cached_record(join_tab); // Restore current record
- reset_cache_write(&join_tab->cache);
- if (error > 0) // Fatal error
- return NESTED_LOOP_ERROR; /* purecov: inspected */
- for (JOIN_TAB *tmp2=join->join_tab; tmp2 != join_tab ; tmp2++)
- tmp2->table->status=tmp2->status;
- return NESTED_LOOP_OK;
-}
-
-
/*****************************************************************************
The different ways to read a record
Returns -1 if row was not found, 0 if row was found and 1 on errors
@@ -12131,8 +12496,13 @@ test_if_quick_select(JOIN_TAB *tab)
}
-static int
-join_init_read_record(JOIN_TAB *tab)
+static
+bool test_if_use_dynamic_range_scan(JOIN_TAB *join_tab)
+{
+ return (join_tab->use_quick == 2 && test_if_quick_select(join_tab) > 0);
+}
+
+int join_init_read_record(JOIN_TAB *tab)
{
if (tab->select && tab->select->quick && tab->select->quick->reset())
return 1;
@@ -14329,252 +14699,6 @@ SORT_FIELD *make_unireg_sortorder(ORDER
}
-/*****************************************************************************
- Fill join cache with packed records
- Records are stored in tab->cache.buffer and last record in
- last record is stored with pointers to blobs to support very big
- records
-******************************************************************************/
-
-static int
-join_init_cache(THD *thd,JOIN_TAB *tables,uint table_count)
-{
- reg1 uint i;
- uint length, blobs;
- size_t size;
- CACHE_FIELD *copy,**blob_ptr;
- JOIN_CACHE *cache;
- JOIN_TAB *join_tab;
- DBUG_ENTER("join_init_cache");
-
- cache= &tables[table_count].cache;
- cache->fields=blobs=0;
-
- join_tab=tables;
- for (i=0 ; i < table_count ; i++,join_tab++)
- {
- if (!join_tab->used_fieldlength) /* Not calced yet */
- calc_used_field_length(thd, join_tab);
- cache->fields+=join_tab->used_fields;
- blobs+=join_tab->used_blobs;
- }
- if (!(cache->field=(CACHE_FIELD*)
- sql_alloc(sizeof(CACHE_FIELD)*(cache->fields+table_count*2)+(blobs+1)*
-
- sizeof(CACHE_FIELD*))))
- {
- my_free((uchar*) cache->buff,MYF(0)); /* purecov: inspected */
- cache->buff=0; /* purecov: inspected */
- DBUG_RETURN(1); /* purecov: inspected */
- }
- copy=cache->field;
- blob_ptr=cache->blob_ptr=(CACHE_FIELD**)
- (cache->field+cache->fields+table_count*2);
-
- length=0;
- for (i=0 ; i < table_count ; i++)
- {
- bool have_bit_fields= FALSE;
- uint null_fields=0,used_fields;
- Field **f_ptr,*field;
- MY_BITMAP *read_set= tables[i].table->read_set;
- for (f_ptr=tables[i].table->field,used_fields=tables[i].used_fields ;
- used_fields ;
- f_ptr++)
- {
- field= *f_ptr;
- if (bitmap_is_set(read_set, field->field_index))
- {
- used_fields--;
- length+=field->fill_cache_field(copy);
- if (copy->blob_field)
- (*blob_ptr++)=copy;
- if (field->real_maybe_null())
- null_fields++;
- if (field->type() == MYSQL_TYPE_BIT &&
- ((Field_bit*)field)->bit_len)
- have_bit_fields= TRUE;
- copy++;
- }
- }
- /* Copy null bits from table */
- if (null_fields || have_bit_fields)
- { /* must copy null bits */
- copy->str= tables[i].table->null_flags;
- copy->length= tables[i].table->s->null_bytes;
- copy->strip=0;
- copy->blob_field=0;
- length+=copy->length;
- copy++;
- cache->fields++;
- }
- /* If outer join table, copy null_row flag */
- if (tables[i].table->maybe_null)
- {
- copy->str= (uchar*) &tables[i].table->null_row;
- copy->length=sizeof(tables[i].table->null_row);
- copy->strip=0;
- copy->blob_field=0;
- length+=copy->length;
- copy++;
- cache->fields++;
- }
- }
-
- cache->length=length+blobs*sizeof(char*);
- cache->blobs=blobs;
- *blob_ptr=0; /* End sequentel */
- size=max(thd->variables.join_buff_size, cache->length);
- if (!(cache->buff=(uchar*) my_malloc(size,MYF(0))))
- DBUG_RETURN(1); /* Don't use cache */ /* purecov: inspected */
- cache->end=cache->buff+size;
- reset_cache_write(cache);
- DBUG_RETURN(0);
-}
-
-
-static ulong
-used_blob_length(CACHE_FIELD **ptr)
-{
- uint length,blob_length;
- for (length=0 ; *ptr ; ptr++)
- {
- (*ptr)->blob_length=blob_length=(*ptr)->blob_field->get_length();
- length+=blob_length;
- (*ptr)->blob_field->get_ptr(&(*ptr)->str);
- }
- return length;
-}
-
-
-static bool
-store_record_in_cache(JOIN_CACHE *cache)
-{
- uint length;
- uchar *pos;
- CACHE_FIELD *copy,*end_field;
- bool last_record;
-
- pos=cache->pos;
- end_field=cache->field+cache->fields;
-
- length=cache->length;
- if (cache->blobs)
- length+=used_blob_length(cache->blob_ptr);
- if ((last_record= (length + cache->length > (size_t) (cache->end - pos))))
- cache->ptr_record=cache->records;
-
- /*
- There is room in cache. Put record there
- */
- cache->records++;
- for (copy=cache->field ; copy < end_field; copy++)
- {
- if (copy->blob_field)
- {
- if (last_record)
- {
- copy->blob_field->get_image(pos, copy->length+sizeof(char*),
- copy->blob_field->charset());
- pos+=copy->length+sizeof(char*);
- }
- else
- {
- copy->blob_field->get_image(pos, copy->length, // blob length
- copy->blob_field->charset());
- memcpy(pos+copy->length,copy->str,copy->blob_length); // Blob data
- pos+=copy->length+copy->blob_length;
- }
- }
- else
- {
- if (copy->strip)
- {
- uchar *str,*end;
- for (str=copy->str,end= str+copy->length;
- end > str && end[-1] == ' ' ;
- end--) ;
- length=(uint) (end-str);
- memcpy(pos+2, str, length);
- int2store(pos, length);
- pos+= length+2;
- }
- else
- {
- memcpy(pos,copy->str,copy->length);
- pos+=copy->length;
- }
- }
- }
- cache->pos=pos;
- return last_record || (size_t) (cache->end - pos) < cache->length;
-}
-
-
-static void
-reset_cache_read(JOIN_CACHE *cache)
-{
- cache->record_nr=0;
- cache->pos=cache->buff;
-}
-
-
-static void reset_cache_write(JOIN_CACHE *cache)
-{
- reset_cache_read(cache);
- cache->records= 0;
- cache->ptr_record= (uint) ~0;
-}
-
-
-static void
-read_cached_record(JOIN_TAB *tab)
-{
- uchar *pos;
- uint length;
- bool last_record;
- CACHE_FIELD *copy,*end_field;
-
- last_record=tab->cache.record_nr++ == tab->cache.ptr_record;
- pos=tab->cache.pos;
-
- for (copy=tab->cache.field,end_field=copy+tab->cache.fields ;
- copy < end_field;
- copy++)
- {
- if (copy->blob_field)
- {
- if (last_record)
- {
- copy->blob_field->set_image(pos, copy->length+sizeof(char*),
- copy->blob_field->charset());
- pos+=copy->length+sizeof(char*);
- }
- else
- {
- copy->blob_field->set_ptr(pos, pos+copy->length);
- pos+=copy->length+copy->blob_field->get_length();
- }
- }
- else
- {
- if (copy->strip)
- {
- length= uint2korr(pos);
- memcpy(copy->str, pos+2, length);
- memset(copy->str+length, ' ', copy->length-length);
- pos+= 2 + length;
- }
- else
- {
- memcpy(copy->str,pos,copy->length);
- pos+=copy->length;
- }
- }
- }
- tab->cache.pos=pos;
- return;
-}
/*
@@ -16734,12 +16858,9 @@ static void select_describe(JOIN *join,
if (keyno != MAX_KEY && keyno == table->file->pushed_idx_cond_keyno &&
table->file->pushed_idx_cond)
extra.append(STRING_WITH_LEN("; Using index condition"));
- /*
- psergey: enable the below when we backport BKA:
-
else if (tab->cache_idx_cond)
extra.append(STRING_WITH_LEN("; Using index condition(BKA)"));
- */
+
if (quick_type == QUICK_SELECT_I::QS_TYPE_ROR_UNION ||
quick_type == QUICK_SELECT_I::QS_TYPE_ROR_INTERSECT ||
quick_type == QUICK_SELECT_I::QS_TYPE_INDEX_MERGE)
=== modified file 'sql/sql_select.h'
--- a/sql/sql_select.h 2009-12-15 21:35:55 +0000
+++ b/sql/sql_select.h 2009-12-21 02:26:15 +0000
@@ -94,26 +94,6 @@ typedef struct st_table_ref
} TABLE_REF;
-/**
- CACHE_FIELD and JOIN_CACHE is used on full join to cache records in outer
- table
-*/
-
-typedef struct st_cache_field {
- uchar *str;
- uint length, blob_length;
- Field_blob *blob_field;
- bool strip;
-} CACHE_FIELD;
-
-
-typedef struct st_join_cache {
- uchar *buff,*pos,*end;
- uint records,record_nr,ptr_record,fields,length,blobs;
- CACHE_FIELD *field,**blob_ptr;
- SQL_SELECT *select;
-} JOIN_CACHE;
-
/*
The structs which holds the join connections and join states
@@ -143,6 +123,7 @@ typedef enum_nested_loop_state
typedef int (*Read_record_func)(struct st_join_table *tab);
Next_select_func setup_end_select_func(JOIN *join);
+class JOIN_CACHE;
typedef struct st_join_table {
st_join_table() {} /* Remove gcc warning */
@@ -210,6 +191,9 @@ typedef struct st_join_table {
uint use_quick,index;
uint status; ///< Save status for cache
uint used_fields,used_fieldlength,used_blobs;
+ uint used_null_fields;
+ uint used_rowid_fields;
+ uint used_uneven_bit_fields;
enum join_type type;
bool cached_eq_ref_table,eq_ref_table,not_used_in_distinct;
bool sorted;
@@ -220,11 +204,21 @@ typedef struct st_join_table {
*/
ha_rows limit;
TABLE_REF ref;
- JOIN_CACHE cache;
+ bool use_join_cache;
+ JOIN_CACHE *cache;
+ /*
+ Index condition for BKA access join
+ */
+ Item *cache_idx_cond;
+ SQL_SELECT *cache_select;
JOIN *join;
/** Bitmap of nested joins this table is part of */
nested_join_map embedding_map;
+ /* FirstMatch variables (final QEP) */
+ struct st_join_table *first_sj_inner_tab;
+ struct st_join_table *last_sj_inner_tab;
+
void cleanup();
inline bool is_using_loose_index_scan()
{
@@ -232,6 +226,58 @@ typedef struct st_join_table {
(select->quick->get_type() ==
QUICK_SELECT_I::QS_TYPE_GROUP_MIN_MAX));
}
+ bool check_rowid_field()
+ {
+/* !!!NB igor: enable the code in this comment after backporting the SJ code
+ if (keep_current_rowid && !used_rowid_fields)
+ {
+ used_rowid_fields= 1;
+ used_fieldlength+= table->file->ref_length;
+ }
+*/
+ return test(used_rowid_fields);
+ }
+ bool is_inner_table_of_semi_join_with_first_match()
+ {
+ return first_sj_inner_tab != NULL;
+ }
+ bool is_inner_table_of_outer_join()
+ {
+ return first_inner != NULL;
+ }
+ bool is_single_inner_of_semi_join_with_first_match()
+ {
+ return first_sj_inner_tab == this && last_sj_inner_tab == this;
+ }
+ bool is_single_inner_of_outer_join()
+ {
+ return first_inner == this && first_inner->last_inner == this;
+ }
+ bool is_first_inner_for_outer_join()
+ {
+ return first_inner && first_inner == this;
+ }
+ bool use_match_flag()
+ {
+ return is_first_inner_for_outer_join() || first_sj_inner_tab == this ;
+ }
+ bool check_only_first_match()
+ {
+ return last_sj_inner_tab == this ||
+ (first_inner && first_inner->last_inner == this &&
+ table->reginfo.not_exists_optimize);
+ }
+ bool is_last_inner_table()
+ {
+ return (first_inner && first_inner->last_inner == this) ||
+ last_sj_inner_tab == this;
+ }
+ struct st_join_table *get_first_inner_table()
+ {
+ if (first_inner)
+ return first_inner;
+ return first_sj_inner_tab;
+ }
void set_select_cond(COND *to, uint line)
{
DBUG_PRINT("info", ("select_cond changes %p -> %p at line %u tab %p",
@@ -248,6 +294,849 @@ typedef struct st_join_table {
}
} JOIN_TAB;
+
+/*
+ Categories of data fields of variable length written into join cache buffers.
+ The value of any of these fields is written into cache together with the
+ prepended length of the value.
+*/
+#define CACHE_BLOB 1 /* blob field */
+#define CACHE_STRIPPED 2 /* field stripped of trailing spaces */
+#define CACHE_VARSTR1 3 /* short string value (length takes 1 byte) */
+#define CACHE_VARSTR2 4 /* long string value (length takes 2 bytes) */
+
+/*
+ The CACHE_FIELD structure used to describe fields of records that
+ are written into a join cache buffer from record buffers and backward.
+*/
+typedef struct st_cache_field {
+ uchar *str; /**< buffer from/to where the field is to be copied */
+ uint length; /**< maximal number of bytes to be copied from/to str */
+ /*
+ Field object for the moved field
+ (0 - for a flag field, see JOIN_CACHE::create_flag_fields).
+ */
+ Field *field;
+ uint type; /**< category of the of the copied field (CACHE_BLOB et al.) */
+ /*
+ The number of the record offset value for the field in the sequence
+ of offsets placed after the last field of the record. These
+ offset values are used to access fields referred to from other caches.
+ If the value is 0 then no offset for the field is saved in the
+ trailing sequence of offsets.
+ */
+ uint referenced_field_no;
+ /* The remaining structure fields are used as containers for temp values */
+ uint blob_length; /**< length of the blob to be copied */
+ uint offset; /**< field offset to be saved in cache buffer */
+} CACHE_FIELD;
+
+
+/*
+ JOIN_CACHE is the base class to support the implementations of both
+ Blocked-Based Nested Loops (BNL) Join Algorithm and Batched Key Access (BKA)
+ Join Algorithm. The first algorithm is supported by the derived class
+ JOIN_CACHE_BNL, while the second algorithm is supported by the derived
+ class JOIN_CACHE_BKA.
+ These two algorithms have a lot in common. Both algorithms first
+ accumulate the records of the left join operand in a join buffer and
+ then search for matching rows of the second operand for all accumulated
+ records.
+ For the first algorithm this strategy saves on logical I/O operations:
+ the entire set of records from the join buffer requires only one look-through
+ the records provided by the second operand.
+ For the second algorithm the accumulation of records allows to optimize
+ fetching rows of the second operand from disk for some engines (MyISAM,
+ InnoDB), or to minimize the number of round-trips between the Server and
+ the engine nodes (NDB Cluster).
+*/
+
+class JOIN_CACHE :public Sql_alloc
+{
+
+private:
+
+ /* Size of the offset of a record from the cache */
+ uint size_of_rec_ofs;
+ /* Size of the length of a record in the cache */
+ uint size_of_rec_len;
+ /* Size of the offset of a field within a record in the cache */
+ uint size_of_fld_ofs;
+
+protected:
+
+ /* 3 functions below actually do not use the hidden parameter 'this' */
+
+ /* Calculate the number of bytes used to store an offset value */
+ uint offset_size(uint len)
+ { return (len < 256 ? 1 : len < 256*256 ? 2 : 4); }
+
+ /* Get the offset value that takes ofs_sz bytes at the position ptr */
+ ulong get_offset(uint ofs_sz, uchar *ptr)
+ {
+ switch (ofs_sz) {
+ case 1: return uint(*ptr);
+ case 2: return uint2korr(ptr);
+ case 4: return uint4korr(ptr);
+ }
+ return 0;
+ }
+
+ /* Set the offset value ofs that takes ofs_sz bytes at the position ptr */
+ void store_offset(uint ofs_sz, uchar *ptr, ulong ofs)
+ {
+ switch (ofs_sz) {
+ case 1: *ptr= (uchar) ofs; return;
+ case 2: int2store(ptr, (uint16) ofs); return;
+ case 4: int4store(ptr, (uint32) ofs); return;
+ }
+ }
+
+ /*
+ The total maximal length of the fields stored for a record in the cache.
+ For blob fields only the sizes of the blob lengths are taken into account.
+ */
+ uint length;
+
+ /*
+ Representation of the executed multi-way join through which all needed
+ context can be accessed.
+ */
+ JOIN *join;
+
+ /*
+ Cardinality of the range of join tables whose fields can be put into the
+ cache. (A table from the range not necessarily contributes to the cache.)
+ */
+ uint tables;
+
+ /*
+ The total number of flag and data fields that can appear in a record
+ written into the cache. Fields with null values are always skipped
+ to save space.
+ */
+ uint fields;
+
+ /*
+ The total number of flag fields in a record put into the cache. They are
+ used for table null bitmaps, table null row flags, and an optional match
+ flag. Flag fields go before other fields in a cache record with the match
+ flag field placed always at the very beginning of the record.
+ */
+ uint flag_fields;
+
+ /* The total number of blob fields that are written into the cache */
+ uint blobs;
+
+ /*
+ The total number of fields referenced from field descriptors for other join
+ caches. These fields are used to construct key values to access matching
+ rows with index lookups. Currently the fields can be referenced only from
+ descriptors for bka caches. However they may belong to a cache of any type.
+ */
+ uint referenced_fields;
+
+ /*
+ The current number of already created data field descriptors.
+ This number can be useful for implementations of the init methods.
+ */
+ uint data_field_count;
+
+ /*
+ The current number of already created pointers to the data field
+ descriptors. This number can be useful for implementations of
+ the init methods.
+ */
+ uint data_field_ptr_count;
+ /*
+ Array of the descriptors of fields containing 'fields' elements.
+ These are all fields that are stored for a record in the cache.
+ */
+ CACHE_FIELD *field_descr;
+
+ /*
+ Array of pointers to the blob descriptors that contains 'blobs' elements.
+ */
+ CACHE_FIELD **blob_ptr;
+
+ /*
+ This flag indicates that records written into the join buffer contain
+ a match flag field. The flag must be set by the init method.
+ */
+ bool with_match_flag;
+ /*
+ This flag indicates that any record is prepended with the length of the
+ record which allows us to skip the record or part of it without reading.
+ */
+ bool with_length;
+
+ /*
+ The maximal number of bytes used for a record representation in
+ the cache excluding the space for blob data.
+ For future derived classes this representation may contains some
+ redundant info such as a key value associated with the record.
+ */
+ uint pack_length;
+ /*
+ The value of pack_length incremented by the total size of all
+ pointers of a record in the cache to the blob data.
+ */
+ uint pack_length_with_blob_ptrs;
+
+ /* Pointer to the beginning of the join buffer */
+ uchar *buff;
+ /*
+ Size of the entire memory allocated for the join buffer.
+ Part of this memory may be reserved for the auxiliary buffer.
+ */
+ ulong buff_size;
+ /* Size of the auxiliary buffer. */
+ ulong aux_buff_size;
+
+ /* The number of records put into the join buffer */
+ uint records;
+
+ /*
+ Pointer to the current position in the join buffer.
+ This member is used both when writing to buffer and
+ when reading from it.
+ */
+ uchar *pos;
+ /*
+ Pointer to the first free position in the join buffer,
+ right after the last record into it.
+ */
+ uchar *end_pos;
+
+ /*
+ Pointer to the beginning of first field of the current read/write record
+ from the join buffer. The value is adjusted by the get_record/put_record
+ functions.
+ */
+ uchar *curr_rec_pos;
+ /*
+ Pointer to the beginning of first field of the last record
+ from the join buffer.
+ */
+ uchar *last_rec_pos;
+
+ /*
+ Flag is set if the blob data for the last record in the join buffer
+ is in record buffers rather than in the join cache.
+ */
+ bool last_rec_blob_data_is_in_rec_buff;
+
+ /*
+ Pointer to the position to the current record link.
+ Record links are used only with linked caches. Record links allow to set
+ connections between parts of one join record that are stored in different
+ join buffers.
+ In the simplest case a record link is just a pointer to the beginning of
+ the record stored in the buffer.
+ In a more general case a link could be a reference to an array of pointers
+ to records in the buffer. */
+ uchar *curr_rec_link;
+
+ void calc_record_fields();
+ int alloc_fields(uint external_fields);
+ void create_flag_fields();
+ void create_remaining_fields(bool all_read_fields);
+ void set_constants();
+ int alloc_buffer();
+
+ uint get_size_of_rec_offset() { return size_of_rec_ofs; }
+ uint get_size_of_rec_length() { return size_of_rec_len; }
+ uint get_size_of_fld_offset() { return size_of_fld_ofs; }
+
+ uchar *get_rec_ref(uchar *ptr)
+ {
+ return buff+get_offset(size_of_rec_ofs, ptr-size_of_rec_ofs);
+ }
+ ulong get_rec_length(uchar *ptr)
+ {
+ return (ulong) get_offset(size_of_rec_len, ptr);
+ }
+ ulong get_fld_offset(uchar *ptr)
+ {
+ return (ulong) get_offset(size_of_fld_ofs, ptr);
+ }
+
+ void store_rec_ref(uchar *ptr, uchar* ref)
+ {
+ store_offset(size_of_rec_ofs, ptr-size_of_rec_ofs, (ulong) (ref-buff));
+ }
+
+ void store_rec_length(uchar *ptr, ulong len)
+ {
+ store_offset(size_of_rec_len, ptr, len);
+ }
+ void store_fld_offset(uchar *ptr, ulong ofs)
+ {
+ store_offset(size_of_fld_ofs, ptr, ofs);
+ }
+
+ /* Write record fields and their required offsets into the join buffer */
+ uint write_record_data(uchar *link, bool *is_full);
+
+ /*
+ This method must determine for how much the auxiliary buffer should be
+ incremented when a new record is added to the join buffer.
+ If no auxiliary buffer is needed the function should return 0.
+ */
+ virtual uint aux_buffer_incr() { return 0; }
+
+ /* Shall calculate how much space is remaining in the join buffer */
+ virtual ulong rem_space()
+ {
+ return max(buff_size-(end_pos-buff)-aux_buff_size,0);
+ }
+
+ /* Shall skip record from the join buffer if its match flag is on */
+ virtual bool skip_record_if_match();
+
+ /* Read all flag and data fields of a record from the join buffer */
+ uint read_all_record_fields();
+
+ /* Read all flag fields of a record from the join buffer */
+ uint read_flag_fields();
+
+ /* Read a data record field from the join buffer */
+ uint read_record_field(CACHE_FIELD *copy, bool last_record);
+
+ /* Read a referenced field from the join buffer */
+ bool read_referenced_field(CACHE_FIELD *copy, uchar *rec_ptr, uint *len);
+
+ /*
+ True if rec_ptr points to the record whose blob data stay in
+ record buffers
+ */
+ bool blob_data_is_in_rec_buff(uchar *rec_ptr)
+ {
+ return rec_ptr == last_rec_pos && last_rec_blob_data_is_in_rec_buff;
+ }
+
+ /* Find matches from the next table for records from the join buffer */
+ virtual enum_nested_loop_state join_matching_records(bool skip_last)=0;
+
+ /* Add null complements for unmatched outer records from buffer */
+ virtual enum_nested_loop_state join_null_complements(bool skip_last);
+
+ /* Restore the fields of the last record from the join buffer */
+ virtual void restore_last_record();
+
+ /*Set match flag for a record in join buffer if it has not been set yet */
+ bool set_match_flag_if_none(JOIN_TAB *first_inner, uchar *rec_ptr);
+
+ enum_nested_loop_state generate_full_extensions(uchar *rec_ptr);
+
+ /* Check matching to a partial join record from the join buffer */
+ bool check_match(uchar *rec_ptr);
+
+public:
+
+ /* Table to be joined with the partial join records from the cache */
+ JOIN_TAB *join_tab;
+
+ /* Pointer to the previous join cache if there is any */
+ JOIN_CACHE *prev_cache;
+ /* Pointer to the next join cache if there is any */
+ JOIN_CACHE *next_cache;
+
+ /* Shall initialize the join cache structure */
+ virtual int init()=0;
+
+ /* The function shall return TRUE only for BKA caches */
+ virtual bool is_key_access() { return FALSE; }
+
+ /* Shall reset the join buffer for reading/writing */
+ virtual void reset(bool for_writing);
+
+ /*
+ This function shall add a record into the join buffer and return TRUE
+ if it has been decided that it should be the last record in the buffer.
+ */
+ virtual bool put_record();
+
+ /*
+ This function shall read the next record into the join buffer and return
+ TRUE if there is no more next records.
+ */
+ virtual bool get_record();
+
+ /*
+ This function shall read the record at the position rec_ptr
+ in the join buffer
+ */
+ virtual void get_record_by_pos(uchar *rec_ptr);
+
+ /* Shall return the value of the match flag for the positioned record */
+ virtual bool get_match_flag_by_pos(uchar *rec_ptr);
+
+ /* Shall return the position of the current record */
+ virtual uchar *get_curr_rec() { return curr_rec_pos; }
+
+ /* Shall set the current record link */
+ virtual void set_curr_rec_link(uchar *link) { curr_rec_link= link; }
+
+ /* Shall return the current record link */
+ virtual uchar *get_curr_rec_link()
+ {
+ return (curr_rec_link ? curr_rec_link : get_curr_rec());
+ }
+
+ /* Join records from the join buffer with records from the next join table */
+ enum_nested_loop_state join_records(bool skip_last);
+
+ virtual ~JOIN_CACHE() {}
+ void reset_join(JOIN *j) { join= j; }
+ void free()
+ {
+ x_free(buff);
+ buff= 0;
+ }
+
+ friend class JOIN_CACHE_BNL;
+ friend class JOIN_CACHE_BKA;
+ friend class JOIN_CACHE_BKA_UNIQUE;
+};
+
+
+class JOIN_CACHE_BNL :public JOIN_CACHE
+{
+
+protected:
+
+ /* Using BNL find matches from the next table for records from join buffer */
+ enum_nested_loop_state join_matching_records(bool skip_last);
+
+public:
+
+ /*
+ This constructor creates an unlinked BNL join cache. The cache is to be
+ used to join table 'tab' to the result of joining the previous tables
+ specified by the 'j' parameter.
+ */
+ JOIN_CACHE_BNL(JOIN *j, JOIN_TAB *tab)
+ {
+ join= j;
+ join_tab= tab;
+ prev_cache= next_cache= 0;
+ }
+
+ /*
+ This constructor creates a linked BNL join cache. The cache is to be
+ used to join table 'tab' to the result of joining the previous tables
+ specified by the 'j' parameter. The parameter 'prev' specifies the previous
+ cache object to which this cache is linked.
+ */
+ JOIN_CACHE_BNL(JOIN *j, JOIN_TAB *tab, JOIN_CACHE *prev)
+ {
+ join= j;
+ join_tab= tab;
+ prev_cache= prev;
+ next_cache= 0;
+ if (prev)
+ prev->next_cache= this;
+ }
+
+ /* Initialize the BNL cache */
+ int init();
+
+};
+
+class JOIN_CACHE_BKA :public JOIN_CACHE
+{
+protected:
+
+ /* Flag to to be passed to the MRR interface */
+ uint mrr_mode;
+
+ /* MRR buffer assotiated with this join cache */
+ HANDLER_BUFFER mrr_buff;
+
+ /* Shall initialize the MRR buffer */
+ virtual void init_mrr_buff()
+ {
+ mrr_buff.buffer= end_pos;
+ mrr_buff.buffer_end= buff+buff_size;
+ }
+
+ /*
+ The number of the cache fields that are used in building keys to access
+ the table join_tab
+ */
+ uint local_key_arg_fields;
+ /*
+ The total number of the fields in the previous caches that are used
+ in building keys t access the table join_tab
+ */
+ uint external_key_arg_fields;
+
+ /*
+ This flag indicates that the key values will be read directly from the join
+ buffer. It will save us building key values in the key buffer.
+ */
+ bool use_emb_key;
+ /* The length of an embedded key value */
+ uint emb_key_length;
+
+ /* Check the possibility to read the access keys directly from join buffer */
+ bool check_emb_key_usage();
+
+ /* Calculate the increment of the MM buffer for a record write */
+ uint aux_buffer_incr();
+
+ /* Using BKA find matches from the next table for records from join buffer */
+ enum_nested_loop_state join_matching_records(bool skip_last);
+
+ /* Prepare to search for records that match records from the join buffer */
+ enum_nested_loop_state init_join_matching_records(RANGE_SEQ_IF *seq_funcs,
+ uint ranges);
+
+ /* Finish searching for records that match records from the join buffer */
+ enum_nested_loop_state end_join_matching_records(enum_nested_loop_state rc);
+
+public:
+
+ /*
+ This constructor creates an unlinked BKA join cache. The cache is to be
+ used to join table 'tab' to the result of joining the previous tables
+ specified by the 'j' parameter.
+ The MRR mode initially is set to 'flags'.
+ */
+ JOIN_CACHE_BKA(JOIN *j, JOIN_TAB *tab, uint flags)
+ {
+ join= j;
+ join_tab= tab;
+ prev_cache= next_cache= 0;
+ mrr_mode= flags;
+ }
+
+ /*
+ This constructor creates a linked BKA join cache. The cache is to be
+ used to join table 'tab' to the result of joining the previous tables
+ specified by the 'j' parameter. The parameter 'prev' specifies the cache
+ object to which this cache is linked.
+ The MRR mode initially is set to 'flags'.
+ */
+ JOIN_CACHE_BKA(JOIN *j, JOIN_TAB *tab, uint flags, JOIN_CACHE* prev)
+ {
+ join= j;
+ join_tab= tab;
+ prev_cache= prev;
+ next_cache= 0;
+ if (prev)
+ prev->next_cache= this;
+ mrr_mode= flags;
+ }
+
+ /* Initialize the BKA cache */
+ int init();
+
+ bool is_key_access() { return TRUE; }
+
+ /* Shall get the key built over the next record from the join buffer */
+ virtual uint get_next_key(uchar **key);
+
+ /* Check if the record combination matches the index condition */
+ bool skip_index_tuple(range_seq_t rseq, char *range_info);
+};
+
+/*
+ The class JOIN_CACHE_BKA_UNIQUE supports the variant of the BKA join algorithm
+ that submits only distinct keys to the MRR interface. The records in the join
+ buffer of a cache of this class that have the same access key are linked into
+ a chain attached to a key entry structure that either itself contains the key
+ value, or, in the case when the keys are embedded, refers to its occurance in
+ one of the records from the chain.
+ To build the chains with the same keys a hash table is employed. It is placed
+ at the very end of the join buffer. The array of hash entries is allocated
+ first at the very bottom of the join buffer, then go key entries. A hash entry
+ contains a header of the list of the key entries with the same hash value.
+ Each key entry is a structure of the following type:
+ struct st_join_cache_key_entry {
+ union {
+ uchar[] value;
+ cache_ref *value_ref; // offset from the beginning of the buffer
+ } hash_table_key;
+ key_ref next_key; // offset backward from the beginning of hash table
+ cache_ref *last_rec // offset from the beginning of the buffer
+ }
+ The references linking the records in a chain are always placed at the very
+ beginning of the record info stored in the join buffer. The records are
+ linked in a circular list. A new record is always added to the end of this
+ list. When a key is passed to the MRR interface it can be passed either with
+ an association link containing a reference to the header of the record chain
+ attached to the corresponding key entry in the hash table, or without any
+ association link. When the next record is returned by a call to the MRR
+ function multi_range_read_next without any association (because if was not
+ passed together with the key) then the key value is extracted from the
+ returned record and searched for it in the hash table. If there is any records
+ with such key the chain of them will be yielded as the result of this search.
+
+ The following picture represents a typical layout for the info stored in the
+ join buffer of a join cache object of the JOIN_CACHE_BKA_UNIQUE class.
+
+ buff
+ V
+ +----------------------------------------------------------------------------+
+ | |[*]record_1_1| |
+ | ^ | |
+ | | +--------------------------------------------------+ |
+ | | |[*]record_2_1| | |
+ | | ^ | V |
+ | | | +------------------+ |[*]record_1_2| |
+ | | +--------------------+-+ | |
+ |+--+ +---------------------+ | | +-------------+ |
+ || | | V | | |
+ |||[*]record_3_1| |[*]record_1_3| |[*]record_2_2| | |
+ ||^ ^ ^ | |
+ ||+----------+ | | | |
+ ||^ | |<---------------------------+-------------------+ |
+ |++ | | ... mrr | buffer ... ... | | |
+ | | | | |
+ | +-----+--------+ | +-----|-------+ |
+ | V | | | V | | |
+ ||key_3|[/]|[*]| | | |key_2|[/]|[*]| | |
+ | +-+---|-----------------------+ | |
+ | V | | | | |
+ | |key_1|[*]|[*]| | | ... |[*]| ... |[*]| ... | |
+ +----------------------------------------------------------------------------+
+ ^ ^ ^
+ | i-th entry j-th entry
+ hash table
+
+ i-th hash entry:
+ circular record chain for key_1:
+ record_1_1
+ record_1_2
+ record_1_3 (points to record_1_1)
+ circular record chain for key_3:
+ record_3_1 (points to itself)
+
+ j-th hash entry:
+ circular record chain for key_2:
+ record_2_1
+ record_2_2 (points to record_2_1)
+
+*/
+
+class JOIN_CACHE_BKA_UNIQUE :public JOIN_CACHE_BKA
+{
+
+private:
+
+ /* Size of the offset of a key entry in the hash table */
+ uint size_of_key_ofs;
+
+ /*
+ Length of a key value.
+ It is assumed that all key values have the same length.
+ */
+ uint key_length;
+ /*
+ Length of the key entry in the hash table.
+ A key entry either contains the key value, or it contains a reference
+ to the key value if use_emb_key flag is set for the cache.
+ */
+ uint key_entry_length;
+
+ /* The beginning of the hash table in the join buffer */
+ uchar *hash_table;
+ /* Number of hash entries in the hash table */
+ uint hash_entries;
+
+ /* Number of key entries in the hash table (number of distinct keys) */
+ uint key_entries;
+
+ /* The position of the last key entry in the hash table */
+ uchar *last_key_entry;
+
+ /* The position of the currently retrieved key entry in the hash table */
+ uchar *curr_key_entry;
+
+ /*
+ The offset of the record fields from the beginning of the record
+ representation. The record representation starts with a reference to
+ the next record in the key record chain followed by the length of
+ the trailing record data followed by a reference to the record segment
+ in the previous cache, if any, followed by the record fields.
+ */
+ uint rec_fields_offset;
+ /* The offset of the data fields from the beginning of the record fields */
+ uint data_fields_offset;
+
+ uint get_hash_idx(uchar* key, uint key_len);
+
+ void cleanup_hash_table();
+
+protected:
+
+ uint get_size_of_key_offset() { return size_of_key_ofs; }
+
+ /*
+ Get the position of the next_key_ptr field pointed to by
+ a linking reference stored at the position key_ref_ptr.
+ This reference is actually the offset backward from the
+ beginning of hash table.
+ */
+ uchar *get_next_key_ref(uchar *key_ref_ptr)
+ {
+ return hash_table-get_offset(size_of_key_ofs, key_ref_ptr);
+ }
+
+ /*
+ Store the linking reference to the next_key_ptr field at
+ the position key_ref_ptr. The position of the next_key_ptr
+ field is pointed to by ref. The stored reference is actually
+ the offset backward from the beginning of the hash table.
+ */
+ void store_next_key_ref(uchar *key_ref_ptr, uchar *ref)
+ {
+ store_offset(size_of_key_ofs, key_ref_ptr, (ulong) (hash_table-ref));
+ }
+
+ /*
+ Check whether the reference to the next_key_ptr field at the position
+ key_ref_ptr contains a nil value.
+ */
+ bool is_null_key_ref(uchar *key_ref_ptr)
+ {
+ ulong nil= 0;
+ return memcmp(key_ref_ptr, &nil, size_of_key_ofs ) == 0;
+ }
+
+ /*
+ Set the reference to the next_key_ptr field at the position
+ key_ref_ptr equal to nil.
+ */
+ void store_null_key_ref(uchar *key_ref_ptr)
+ {
+ ulong nil= 0;
+ store_offset(size_of_key_ofs, key_ref_ptr, nil);
+ }
+
+ uchar *get_next_rec_ref(uchar *ref_ptr)
+ {
+ return buff+get_offset(get_size_of_rec_offset(), ref_ptr);
+ }
+
+ void store_next_rec_ref(uchar *ref_ptr, uchar *ref)
+ {
+ store_offset(get_size_of_rec_offset(), ref_ptr, (ulong) (ref-buff));
+ }
+
+ /*
+ Get the position of the embedded key value for the current
+ record pointed to by get_curr_rec().
+ */
+ uchar *get_curr_emb_key()
+ {
+ return get_curr_rec()+data_fields_offset;
+ }
+
+ /*
+ Get the position of the embedded key value pointed to by a reference
+ stored at ref_ptr. The stored reference is actually the offset from
+ the beginning of the join buffer.
+ */
+ uchar *get_emb_key(uchar *ref_ptr)
+ {
+ return buff+get_offset(get_size_of_rec_offset(), ref_ptr);
+ }
+
+ /*
+ Store the reference to an embedded key at the position key_ref_ptr.
+ The position of the embedded key is pointed to by ref. The stored
+ reference is actually the offset from the beginning of the join buffer.
+ */
+ void store_emb_key_ref(uchar *ref_ptr, uchar *ref)
+ {
+ store_offset(get_size_of_rec_offset(), ref_ptr, (ulong) (ref-buff));
+ }
+
+ /*
+ Calculate how much space in the buffer would not be occupied by
+ records, key entries and additional memory for the MMR buffer.
+ */
+ ulong rem_space()
+ {
+ return max(last_key_entry-end_pos-aux_buff_size,0);
+ }
+
+ /*
+ Initialize the MRR buffer allocating some space within the join buffer.
+ The entire space between the last record put into the join buffer and the
+ last key entry added to the hash table is used for the MRR buffer.
+ */
+ void init_mrr_buff()
+ {
+ mrr_buff.buffer= end_pos;
+ mrr_buff.buffer_end= last_key_entry;
+ }
+
+ /* Skip record from JOIN_CACHE_BKA_UNIQUE buffer if its match flag is on */
+ bool skip_record_if_match();
+
+ /* Using BKA_UNIQUE find matches for records from join buffer */
+ enum_nested_loop_state join_matching_records(bool skip_last);
+
+ /* Search for a key in the hash table of the join buffer */
+ bool key_search(uchar *key, uint key_len, uchar **key_ref_ptr);
+
+public:
+
+ /*
+ This constructor creates an unlinked BKA_UNIQUE join cache. The cache is
+ to be used to join table 'tab' to the result of joining the previous tables
+ specified by the 'j' parameter.
+ The MRR mode initially is set to 'flags'.
+ */
+ JOIN_CACHE_BKA_UNIQUE(JOIN *j, JOIN_TAB *tab, uint flags)
+ :JOIN_CACHE_BKA(j, tab, flags) {}
+
+ /*
+ This constructor creates a linked BKA_UNIQUE join cache. The cache is
+ to be used to join table 'tab' to the result of joining the previous tables
+ specified by the 'j' parameter. The parameter 'prev' specifies the cache
+ object to which this cache is linked.
+ The MRR mode initially is set to 'flags'.
+ */
+ JOIN_CACHE_BKA_UNIQUE(JOIN *j, JOIN_TAB *tab, uint flags, JOIN_CACHE* prev)
+ :JOIN_CACHE_BKA(j, tab, flags, prev) {}
+
+ /* Initialize the BKA_UNIQUE cache */
+ int init();
+
+ /* Reset the JOIN_CACHE_BKA_UNIQUE buffer for reading/writing */
+ void reset(bool for_writing);
+
+ /* Add a record into the JOIN_CACHE_BKA_UNIQUE buffer */
+ bool put_record();
+
+ /* Read the next record from the JOIN_CACHE_BKA_UNIQUE buffer */
+ bool get_record();
+
+ /*
+ Shall check whether all records in a key chain have
+ their match flags set on
+ */
+ virtual bool check_all_match_flags_for_key(uchar *key_chain_ptr);
+
+ uint get_next_key(uchar **key);
+
+ /* Get the head of the record chain attached to the current key entry */
+ uchar *get_curr_key_chain()
+ {
+ return get_next_rec_ref(curr_key_entry+key_entry_length-
+ get_size_of_rec_offset());
+ }
+
+ /* Check if the record combination matches the index condition */
+ bool skip_index_tuple(range_seq_t rseq, char *range_info);
+};
+
+
enum_nested_loop_state sub_select_cache(JOIN *join, JOIN_TAB *join_tab, bool
end_of_records);
enum_nested_loop_state sub_select(JOIN *join,JOIN_TAB *join_tab, bool
@@ -306,7 +1195,8 @@ public:
TABLE **table,**all_tables,*sort_by_table;
uint tables,const_tables;
uint send_group_parts;
- bool sort_and_group,first_record,full_join,group, no_field_update;
+ bool sort_and_group,first_record,full_join, no_field_update;
+ bool group; /**< If query contains GROUP BY clause */
bool do_send_rows;
/**
TRUE when we want to resume nested loop iterations when
@@ -571,6 +1461,16 @@ public:
{
return (table_map(1) << tables) - 1;
}
+ /*
+ Return the table for which an index scan can be used to satisfy
+ the sort order needed by the ORDER BY/(implicit) GROUP BY clause
+ */
+ JOIN_TAB *get_sort_by_join_tab()
+ {
+ return (need_tmp || !sort_by_table || skip_sort_order ||
+ ((group || tmp_table_param.sum_func_count) && !group_list)) ?
+ NULL : join_tab+const_tables;
+ }
private:
bool make_simple_join(JOIN *join, TABLE *tmp_table);
};
@@ -771,6 +1671,8 @@ bool error_if_full_join(JOIN *join);
int report_error(TABLE *table, int error);
int safe_index_read(JOIN_TAB *tab);
COND *remove_eq_conds(THD *thd, COND *cond, Item::cond_result *cond_value);
+void calc_used_field_length(THD *thd, JOIN_TAB *join_tab);
+int join_init_read_record(JOIN_TAB *tab);
void set_position(JOIN *join,uint idx,JOIN_TAB *table,KEYUSE *key);
inline bool optimizer_flag(THD *thd, uint flag)
=== modified file 'storage/myisam/ha_myisam.cc'
--- a/storage/myisam/ha_myisam.cc 2009-12-15 07:16:46 +0000
+++ b/storage/myisam/ha_myisam.cc 2009-12-21 02:26:15 +0000
@@ -1865,6 +1865,16 @@ int ha_myisam::info(uint flag)
stats.max_data_file_length= misam_info.max_data_file_length;
stats.max_index_file_length= misam_info.max_index_file_length;
stats.create_time= (ulong) misam_info.create_time;
+ /*
+ We want the value of stats.mrr_length_per_rec to be platform independent.
+ The size of the chunk at the end of the join buffer used for MRR needs
+ is calculated now basing on the values passed in the stats structure.
+ The remaining part of the join buffer is used for records. A different
+ number of records in the buffer results in a different number of buffer
+ refills and in a different order of records in the result set.
+ */
+ stats.mrr_length_per_rec= misam_info.reflength + 8; // 8=max(sizeof(void *))
+
ref_length= misam_info.reflength;
share->db_options_in_use= misam_info.options;
stats.block_size= myisam_block_size; /* record block size */
1
0
[Maria-developers] Updated (by Alexi): Store in binlog text of statements that caused RBR events (47)
by worklog-noreply@askmonty.org 20 Dec '09
by worklog-noreply@askmonty.org 20 Dec '09
20 Dec '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Store in binlog text of statements that caused RBR events
CREATION DATE..: Sat, 15 Aug 2009, 23:48
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Knielsen
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 47 (http://askmonty.org/worklog/?tid=47)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 20
ESTIMATE.......: 35 (hours remain)
ORIG. ESTIMATE.: 35
PROGRESS NOTES:
-=-=(Alexi - Sun, 20 Dec 2009, 16:00)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.19667 2009-12-20 14:00:56.000000000 +0000
+++ /tmp/wklog.47.new.19667 2009-12-20 14:00:56.000000000 +0000
@@ -196,6 +196,11 @@
...
}
+NOTE. mysqlbinlog, when remotely requesting BINLOG_DUMP by calling the
+simple_command() function, should also use this flag if it wants (in case
+of the --print-annotate-rows-events option set) to recieve Annotate_rows
+events.
+
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -212,8 +217,7 @@
Log_event::read_log_event(&log, packet, ...);
...
if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
- flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT ||
- thd->server_id == 0 /* slave == mysqlbinlog */ )
+ flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT)
{
my_net_write(net, packet->ptr(), packet->length());
}
-=-=(Alexi - Sun, 20 Dec 2009, 13:14)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.11350 2009-12-20 13:14:04.000000000 +0200
+++ /tmp/wklog.47.new.11350 2009-12-20 13:14:04.000000000 +0200
@@ -282,23 +282,18 @@
Annotate_rows_log_event* m_annotate_event;
};
-When the saved Annotate_rows object may be deleted? When all corresponding
-Rows events will be processed, i.e. before processing the first non-Rows
-event (note that Annotate_rows object resides in the binary log *after*
-the (possible) 'BEGIN' Query event which accompanies the rows events; note
-also that this deletion is adjusted with the case when some or all
-corresponding Rows events are filtered out by replicate filter rules):
+The saved Annotate_rows object should be deleted when all corresponding
+Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
- if (rli->get_annotate_event() && !IS_RBR_EVENT_TYPE(ev->get_type_code()))
- rli->free_annotate_event();
-
apply_event_and_update_pos(ev, ...);
- if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
+ if (rli->get_annotate_event() && is_last_rows_event(ev))
+ rli->free_annotate_event();
+ else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
@@ -307,10 +302,21 @@
where
- #define IS_RBR_EVENT_TYPE(type) ( (type) == TABLE_MAP_EVENT || \
- (type) == WRITE_ROWS_EVENT || \
+ bool is_last_rows_event(Log_event* ev)
+ {
+ Log_event_type type= ev->get_type_code();
+ if (IS_ROWS_EVENT_TYPE(type))
+ {
+ Rows_log_event* rows= (Rows_log_event*)ev;
+ return rows->get_flags(Rows_log_event::STMT_END_F);
+ }
+
+ return 0;
+ }
+
+ #define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
- (type) == DELETE_ROWS_EVENT )
+ (type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
-=-=(Alexi - Sun, 20 Dec 2009, 09:29)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.32726 2009-12-20 07:29:56.000000000 +0000
+++ /tmp/wklog.47.new.32726 2009-12-20 07:29:56.000000000 +0000
@@ -56,7 +56,7 @@
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
- 00000082 | 00 | 0 - ANNOTATE_RBR_EVENT [51]
+ 00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
-=-=(Alexi - Sat, 19 Dec 2009, 16:10)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.16051 2009-12-19 16:10:48.000000000 +0200
+++ /tmp/wklog.47.new.16051 2009-12-19 16:10:48.000000000 +0200
@@ -253,7 +253,7 @@
thd query to that of the described by the event, i.e. to the query which
caused the subsequent Rows events (see "How Master writes Annotate_rows
events to the binary log" to follow what happens further when the subsequent
-Rows events is applied):
+Rows events are applied):
int Annotate_rows_log_event::do_apply_event(...)
{
-=-=(Alexi - Sat, 19 Dec 2009, 16:02)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.15695 2009-12-19 16:02:33.000000000 +0200
+++ /tmp/wklog.47.new.15695 2009-12-19 16:02:33.000000000 +0200
@@ -12,7 +12,7 @@
post-header and contains the query text in its data part. Example:
************************
- ANNOTATE_RBR_EVENT
+ ANNOTATE_ROWS_EVENT
************************
00000220 | B6 A0 2C 4B | time_when = 1261215926
00000224 | 33 | event_type = 51
-=-=(Alexi - Sat, 19 Dec 2009, 15:58)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.15437 2009-12-19 13:58:12.000000000 +0000
+++ /tmp/wklog.47.new.15437 2009-12-19 13:58:12.000000000 +0000
@@ -1 +1,337 @@
+Content
+~~~~~~~
+ 1. Annotate_rows event number
+ 2. Outline of Annotate_rows event behavior
+ 3. How Master writes Annotate_rows events to the binary log
+ 4. How slave treats replicate-annotate-rows-events option
+ 5. How slave IO thread requests Annotate_rows events
+ 6. How master executes the request
+ 7. How slave SQL thread processes Annotate_rows events
+ 8. General remarks
+
+1. Annotate_rows event number
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+To avoid possible event numbers conflict with MySQL/Sun, we leave a gap
+between the last MySQL event number and the Annotate_rows event number:
+
+ enum Log_event_type
+ { ...
+ INCIDENT_EVENT= 26,
+ // New MySQL event numbers are to be added here
+ MYSQL_EVENTS_END,
+
+ MARIA_EVENTS_BEGIN= 51,
+ // New Maria event numbers start from here
+ ANNOTATE_ROWS_EVENT= 51,
+
+ ENUM_END_EVENT
+ };
+
+together with the corresponding extension of 'post_header_len' array in the
+Format description event. (This extension does not affect the compatibility
+of the binary log). Here is how Format description event looks like with
+this extension:
+
+ ************************
+ FORMAT_DESCRIPTION_EVENT
+ ************************
+ 00000004 | A1 A0 2C 4B | time_when = 1261215905
+ 00000008 | 0F | event_type = 15
+ 00000009 | 64 00 00 00 | server_id = 100
+ 0000000D | 7F 00 00 00 | event_len = 127
+ 00000011 | 83 00 00 00 | log_pos = 00000083
+ 00000015 | 01 00 | flags = LOG_EVENT_BINLOG_IN_USE_F
+ ------------------------
+ 00000017 | 04 00 | binlog_ver = 4
+ 00000019 | 35 2E 32 2E | server_ver = 5.2.0-MariaDB-alpha-debug-log
+ ..... ...
+ 0000004B | A1 A0 2C 4B | time_created = 1261215905
+ 0000004F | 13 | common_header_len = 19
+ ------------------------
+ post_header_len
+ ------------------------
+ 00000050 | 38 | 56 - START_EVENT_V3 [1]
+ ..... ...
+ 00000069 | 02 | 2 - INCIDENT_EVENT [26]
+ 0000006A | 00 | 0 - RESERVED [27]
+ ..... ...
+ 00000081 | 00 | 0 - RESERVED [50]
+ 00000082 | 00 | 0 - ANNOTATE_RBR_EVENT [51]
+ ************************
+
+2. Outline of Annotate_rows event behavior
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Each Annotate_rows_log_event object has two private members describing the
+corresponding query:
+
+ char *m_query_txt;
+ uint m_query_len;
+
+When the object is created for writing to a binary log, this query is taken
+from 'thd' (for short, below we omit the 'Annotate_rows_log_event::' prefix
+as well as other implementation details):
+
+ Annotate_rows_log_event(THD *thd)
+ {
+ m_query_txt = thd->query();
+ m_query_len = thd->query_length();
+ }
+
+When the object is read from a binary log, the query is taken from the buffer
+containing the binary log representation of the event (this buffer is allocated
+in Log_event object from which all Log events are derived):
+
+ Annotate_rows_log_event(char *buf, uint event_len,
+ Format_description_log_event *desc)
+ {
+ m_query_len = event_len - desc->common_header_len;
+ m_query_txt = buf + desc->common_header_len;
+ }
+
+The events are written to the binary log by the Log_event::write() member
+which calls virtual write_data_header() and write_data_body() members
+("data header" and "post header" are synonym in replication terminology).
+In our case, data header is empty and data body is just the query:
+
+ bool write_data_body(IO_CACHE *file)
+ {
+ return my_b_safe_write(file, (uchar*) m_query_txt, m_query_len);
+ }
+
+Printing the event is just printing the query:
+
+ void Annotate_rows_log_event::print(FILE *file, PRINT_EVENT_INFO *pinfo)
+ {
+ my_b_printf(&pinfo->head_cache, "\tQuery: `%s`\n", m_query_txt);
+ }
+
+3. How Master writes Annotate_rows events to the binary log
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The event is written to the binary log just before the group of Table_map
+events which precede corresponding Rows events (one query may generate
+several Table map events in the binary log, but the corresponding
+Annotate_rows event must be written only once before the first Table map
+event; hence the boolean variable 'with_annotate' below):
+
+ int write_locked_table_maps(THD *thd)
+ { ...
+ bool with_annotate= thd->variables.binlog_annotate_rows_events;
+ ...
+ for (uint i= 0; i < ... <number of tables> ...; ++i)
+ { ...
+ thd->binlog_write_table_map(table, ..., with_annotate);
+ with_annotate= 0; // write Annotate_event not more than once
+ ...
+ }
+ ...
+ }
+
+ int THD::binlog_write_table_map(TABLE *table, ..., bool with_annotate)
+ { ...
+ Table_map_log_event the_event(...);
+ ...
+ if (with_annotate)
+ {
+ Annotate_rows_log_event anno(this);
+ mysql_bin_log.write(&anno);
+ }
+
+ mysql_bin_log.write(&the_event);
+ ...
+ }
+
+4. How slave treats replicate-annotate-rows-events option
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The replicate-annotate-rows-events option is treated just as the session
+value of the binlog_annotate_rows_events variable for the slave IO and
+SQL threads. This setting is done during initialization of these threads:
+
+ pthread_handler_t handle_slave_io(void *arg)
+ {
+ THD *thd= new THD;
+ ...
+ init_slave_thread(thd, SLAVE_THD_IO);
+ ...
+ }
+
+ pthread_handler_t handle_slave_sql(void *arg)
+ {
+ THD *thd= new THD;
+ ...
+ init_slave_thread(thd, SLAVE_THD_SQL);
+ ...
+ }
+
+ int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type)
+ { ...
+ thd->variables.binlog_annotate_rows_events=
+ opt_replicate_annotate_rows_events;
+ ...
+ }
+
+5. How slave IO thread requests Annotate_rows events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+When requesting an event, the slave should inform the master whether
+it should send Annotate_rows events or not. To that end we add a new
+BINLOG_SEND_ANNOTATE_ROWS_EVENT flag used when requesting an event:
+
+ #define BINLOG_DUMP_NON_BLOCK 1
+ #define BINLOG_SEND_ANNOTATE_ROWS_EVENT 2
+
+ pthread_handler_t handle_slave_io(void *arg)
+ { ...
+ request_dump(mysql, ...);
+ ...
+ }
+
+ int request_dump(MYSQL* mysql, ...)
+ { ...
+ if (opt_log_slave_updates &&
+ mi->io_thd->variables.binlog_annotate_rows_events)
+ binlog_flags|= BINLOG_SEND_ANNOTATE_ROWS_EVENT;
+ ...
+ int2store(buf + 4, binlog_flags);
+ ...
+ simple_command(mysql, COM_BINLOG_DUMP, buf, ...);
+ ...
+ }
+
+6. How master executes the request
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ case COM_BINLOG_DUMP:
+ { ...
+ flags= uint2korr(packet + 4);
+ ...
+ mysql_binlog_send(thd, ..., flags);
+ ...
+ }
+
+ void mysql_binlog_send(THD* thd, ..., ushort flags)
+ { ...
+ Log_event::read_log_event(&log, packet, ...);
+ ...
+ if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
+ flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT ||
+ thd->server_id == 0 /* slave == mysqlbinlog */ )
+ {
+ my_net_write(net, packet->ptr(), packet->length());
+ }
+ ...
+ }
+
+7. How slave SQL thread processes Annotate_rows events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The slave processes each recieved event by "applying" it, i.e. by
+calling the Log_event::apply_event() function which in turn calls
+the virtual do_apply_event() member specific for each type of the
+event.
+
+ int exec_relay_log_event(THD* thd, Relay_log_info* rli)
+ { ...
+ Log_event *ev = next_event(rli);
+ ...
+ apply_event_and_update_pos(ev, ...);
+
+ if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
+ delete ev;
+ ...
+ }
+
+ int apply_event_and_update_pos(Log_event *ev, ...)
+ { ...
+ ev->apply_event(...);
+ ...
+ }
+
+ int Log_event::apply_event(...)
+ {
+ return do_apply_event(...);
+ }
+
+What does it mean to "apply" an Annotate_rows event? It means to set current
+thd query to that of the described by the event, i.e. to the query which
+caused the subsequent Rows events (see "How Master writes Annotate_rows
+events to the binary log" to follow what happens further when the subsequent
+Rows events is applied):
+
+ int Annotate_rows_log_event::do_apply_event(...)
+ {
+ thd->set_query(m_query_txt, m_query_len);
+ }
+
+NOTE. I am not sure, but possibly current values of thd->query and
+thd->query_length should be saved before calling set_query() and to be
+restored on the Annotate_rows_log_event object deletion.
+Is it really needed ?
+
+After calling this do_apply_event() function we may not delete the
+Annotate_rows_log_event object immediatedly (see exec_relay_log_event()
+above) because thd->query now points to the string inside this object.
+We may keep the pointer to this object in the Relay_log_info:
+
+ class Relay_log_info
+ {
+ public:
+ ...
+ void set_annotate_event(Annotate_rows_log_event*);
+ Annotate_rows_log_event* get_annotate_event();
+ void free_annotate_event();
+ ...
+ private:
+ Annotate_rows_log_event* m_annotate_event;
+ };
+
+When the saved Annotate_rows object may be deleted? When all corresponding
+Rows events will be processed, i.e. before processing the first non-Rows
+event (note that Annotate_rows object resides in the binary log *after*
+the (possible) 'BEGIN' Query event which accompanies the rows events; note
+also that this deletion is adjusted with the case when some or all
+corresponding Rows events are filtered out by replicate filter rules):
+
+ int exec_relay_log_event(THD* thd, Relay_log_info* rli)
+ { ...
+ Log_event *ev= next_event(rli);
+ ...
+ if (rli->get_annotate_event() && !IS_RBR_EVENT_TYPE(ev->get_type_code()))
+ rli->free_annotate_event();
+
+ apply_event_and_update_pos(ev, ...);
+
+ if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
+ rli->set_annotate_event((Annotate_rows_log_event*) ev);
+ else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
+ delete ev;
+ ...
+ }
+
+where
+
+ #define IS_RBR_EVENT_TYPE(type) ( (type) == TABLE_MAP_EVENT || \
+ (type) == WRITE_ROWS_EVENT || \
+ (type) == UPDATE_ROWS_EVENT || \
+ (type) == DELETE_ROWS_EVENT )
+
+8. General remarks
+~~~~~~~~~~~~~~~~~~
+Kristian noticed that introducing new log event type should be coordinated
+somehow with MySQL/Sun:
+
+ Kristian: The numeric code for this event must be assigned carefully.
+ It should be coordinated with MySQL/Sun, otherwise we can get into a
+ situation where MySQL uses the same numeric code for one event that
+ MariaDB uses for ANNOTATE_ROWS_EVENT, which would make merging the two
+ impossible.
+ Alex: I reserved about 20 numbers not to have possible conflicts
+ with MySQL.
+ Kristian: Still, I think it would be appropriate to send a polite email
+ to internals(a)lists.mysql.com about this and suggesting to reserve the
+ event number.
+
+Also we should notice the introduction of the BINLOG_SEND_ANNOTATE_ROWS_EVENT
+flag taking into account that MySQL/Sun may also introduce a flag with the
+same value to be used in the request_dump-mysql_binlog_send interface.
+But this is mainly the question of merging: if a conflict concerning this
+flag occur, we may simply change the BINLOG_SEND_ANNOTATE_ROWS_EVENT value
+(this does not require additional changes in the code).
-=-=(Alexi - Sat, 19 Dec 2009, 15:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.14545 2009-12-19 15:41:21.000000000 +0200
+++ /tmp/wklog.47.new.14545 2009-12-19 15:41:21.000000000 +0200
@@ -1,122 +1,107 @@
-First suggestion:
-
-> I think for this we would actually need a new binlog event type
-> (Comment_log_event?). Unless we want to log an empty statement Query_log_event
-> containing only a comment (a bit of a hack).
-
-New server option
-~~~~~~~~~~~~~~~~~
- --binlog-annotate-rows-events
-
-Setting this option makes RBR (rows-) events in the binary log to be
-preceded by Annotate rows events (see below). The corresponding
-'binlog_annotate_rows_events' system variable is dynamic and has both
-global and session values. Default global value is OFF.
-
-Note. Session values are usefull to make it possible to annotate only
- some selected statements:
+Content
+~~~~~~~
+ 1. Annotate_rows_log_event
+ 2. Server option: --binlog-annotate-rows-events
+ 3. Server option: --replicate-annotate-rows-events
+ 4. mysqlbinlog option: --print-annotate-rows-events
+ 5. mysqlbinlog output
+
+1. Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Describes the query which caused the corresponding rows events. Has empty
+post-header and contains the query text in its data part. Example:
+
+ ************************
+ ANNOTATE_RBR_EVENT
+ ************************
+ 00000220 | B6 A0 2C 4B | time_when = 1261215926
+ 00000224 | 33 | event_type = 51
+ 00000225 | 64 00 00 00 | server_id = 100
+ 00000229 | 36 00 00 00 | event_len = 54
+ 0000022D | 56 02 00 00 | log_pos = 00000256
+ 00000231 | 00 00 | flags = <none>
+ ------------------------
+ 00000233 | 49 4E 53 45 | query = "INSERT INTO t1 VALUES (1), (2), (3)"
+ 00000237 | 52 54 20 49 |
+ 0000023B | 4E 54 4F 20 |
+ 0000023F | 74 31 20 56 |
+ 00000243 | 41 4C 55 45 |
+ 00000247 | 53 20 28 31 |
+ 0000024B | 29 2C 20 28 |
+ 0000024F | 32 29 2C 20 |
+ 00000253 | 28 33 29 |
+ ************************
+
+In binary log, Annotate_rows event follows the (possible) 'BEGIN' Query event
+and precedes the first of Table map events which accompany the corresponding
+rows events. (See example in the "mysqlbinlog output" section below.)
+
+2. Server option: --binlog-annotate-rows-events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Tells the master to write Annotate_rows events to the binary log.
+
+ * Variable Name: binlog_annotate_rows_events
+ * Scope: Global & Session
+ * Access Type: Dynamic
+ * Data Type: bool
+ * Default Value: OFF
+NOTE. Session values allows to annotate only some selected statements:
...
SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
-New binlog event type
-~~~~~~~~~~~~~~~~~~~~~
- Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
-
-Describes the query which caused the corresponding rows event. In binary log,
-precedes each Table_map_log_event. Contains empty post-header and the query
-text in its data part.
-
-The numeric code for this event must be assigned carefully. It should be
-coordinated with MySQL/Sun, otherwise we can get into a situation where MySQL
-uses the same numeric code for one event that MariaDB uses for
-ANNOTATE_ROWS_EVENT, which would make merging the two impossible.
-
-Example:
-
- ...
- ************************
- ANNOTATE_ROWS_EVENT [51]
- ************************
- 000000C7 | 54 1B 12 4B | time_when = 1259477844
- 000000CB | 33 | event_type = 51
- 000000CC | 64 00 00 00 | server_id = 100
- 000000D0 | 2C 00 00 00 | event_len = 44
- 000000D4 | F3 00 00 00 | log_pos = 000000F3
- 000000D8 | 00 00 | flags = <none>
- ------------------------
- 000000DA | 69 6E 73 65 | query = "insert into t1 values (1)"
- 000000DE | 72 74 20 69 |
- 000000E2 | 6E 74 6F 20 |
- 000000E6 | 74 31 20 76 |
- 000000EA | 61 6C 75 65 |
- 000000EE | 73 20 28 31 |
- 000000F2 | 29 |
- ************************
- TABLE_MAP_EVENT [19]
- ************************
- 000000F3 | 54 1B 12 4B | time_when = 1259477844
- 000000F7 | 13 | event_type = 19
- 000000F8 | 64 00 00 00 | server_id = 100
- 000000FC | 29 00 00 00 | event_len = 41
- 00000100 | 1C 01 00 00 | log_pos = 0000011C
- 00000104 | 00 00 | flags = <none>
- ------------------------
- ...
- ************************
- WRITE_ROWS_EVENT [23]
- ************************
- 0000011C | 54 1B 12 4B | time_when = 1259477844
- 00000120 | 17 | event_type = 23
- 00000121 | 64 00 00 00 | server_id = 100
- 00000125 | 22 00 00 00 | event_len = 34
- 00000129 | 3E 01 00 00 | log_pos = 0000013E
- 0000012D | 10 00 | flags = LOG_EVENT_UPDATE_TABLE_MAP_VERSION_F
- ------------------------
- 0000012F | 0F 00 00 00 | table_id = 15
- ...
-
-New mysqlbinlog option
-~~~~~~~~~~~~~~~~~~~~~~
- --print-annotate-rows-events
-
-With this option, mysqlbinlog prints the content of Annotate-rows
-events (if the binary log does contain them). Without this option
-(i.e. by default), mysqlbinlog skips Annotate rows events.
-
-
-mysqlbinlog output
-~~~~~~~~~~~~~~~~~~
-Something like this:
+3. Server option: --replicate-annotate-rows-events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Tells the slave to reproduce Annotate_rows events recieved from the master
+in its own binary log (sensible only in pair with log-slave-updates option).
+
+ * Variable Name: replicate_annotate_rows_events
+ * Scope: Global
+ * Access Type: Read only
+ * Data Type: bool
+ * Default Value: OFF
+
+NOTE. Why do we additionally need this 'replicate' option? Why not to make
+the slave to reproduce this events when its binlog-annotate-rows-events
+global value is ON? Well, because, for example, we may want to configure
+the slave which should reproduce Annotate_rows events but has global
+binlog-annotate-rows-events = OFF meaning this to be the default value for
+the client threads (see also "How slave treats replicate-annotate-rows-events
+option" in LLD part).
+
+4. mysqlbinlog option: --print-annotate-rows-events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+With this option, mysqlbinlog prints the content of Annotate_rows events (if
+the binary log does contain them). Without this option (i.e. by default),
+mysqlbinlog skips Annotate_rows events.
+5. mysqlbinlog output
+~~~~~~~~~~~~~~~~~~~~~
+With --print-annotate-rows-events, mysqlbinlog outputs Annotate_rows events
+in a form like this:
...
- # at 199
- # at 243
- # at 284
- #091129 9:57:24 server id 100 end_log_pos 243 Query: `insert into t1 values
-(1)`
- #091129 9:57:24 server id 100 end_log_pos 284 Table_map: `test`.`t1` mapped
-to number 15
- #091129 9:57:24 server id 100 end_log_pos 318 Write_rows: table id 15
+ # at 1646
+ #091219 12:45:26 server id 100 end_log_pos 1714 Query thread_id=1
+exec_time=0 error_code=0
+ SET TIMESTAMP=1261215926/*!*/;
+ BEGIN
+ /*!*/;
+ # at 1714
+ # at 1812
+ # at 1853
+ # at 1894
+ # at 1938
+ #091219 12:45:26 server id 100 end_log_pos 1812 Query: `DELETE t1, t2 FROM
+t1 INNER JOIN t2 INNER JOIN t3 WHERE t1.a=t2.a AND t2.a=t3.a`
+ #091219 12:45:26 server id 100 end_log_pos 1853 Table_map: `test`.`t1`
+mapped to number 16
+ #091219 12:45:26 server id 100 end_log_pos 1894 Table_map: `test`.`t2`
+mapped to number 17
+ #091219 12:45:26 server id 100 end_log_pos 1938 Delete_rows: table id 16
+ #091219 12:45:26 server id 100 end_log_pos 1982 Delete_rows: table id 17
flags: STMT_END_F
-
- BINLOG '
- VBsSSzNkAAAALAAAAPMAAAAAAGluc2VydCBpbnRvIHQxIHZhbHVlcyAoMSk=
- VBsSSxNkAAAAKQAAABwBAAAAAA8AAAAAAAAABHRlc3QAAnQxAAEDAAE=
- VBsSSxdkAAAAIgAAAD4BAAAQAA8AAAAAAAEAAf/+AQAAAA==
- '/*!*/;
- ### INSERT INTO test.t1
- ### SET
- ### @1=1 /* INT meta=0 nullable=1 is_null=0 */
...
-When master sends Annotate rows events
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-1. Master always sends Annotate_rows events to mysqlbinlog (in
- remote case).
-2. Master sends Annotate_rows events to a slave only if the slave has
- both log-slave-updates and binlog-annotate-rows-events options set.
-
-=-=(Bothorsen - Fri, 18 Dec 2009, 16:22)=-=-
Add estimation time.
Worked 5 hours and estimate 35 hours remain (original estimate increased by 5 hours).
-=-=(Bothorsen - Fri, 18 Dec 2009, 16:16)=-=-
This is the work done on this patch so far. Most of it done by Alex.
Worked 15 hours and estimate 035 hours remain (original estimate increased by 50 hours).
-=-=(Alexi - Fri, 04 Dec 2009, 13:00)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.14001 2009-12-04 13:00:24.000000000 +0200
+++ /tmp/wklog.47.new.14001 2009-12-04 13:00:24.000000000 +0200
@@ -6,27 +6,27 @@
New server option
~~~~~~~~~~~~~~~~~
- --binlog-annotate-row-events
+ --binlog-annotate-rows-events
-Setting this option makes RBR (row-) events in the binary log to be
+Setting this option makes RBR (rows-) events in the binary log to be
preceded by Annotate rows events (see below). The corresponding
-'binlog_annotate_row_events' system variable is dynamic and has both
+'binlog_annotate_rows_events' system variable is dynamic and has both
global and session values. Default global value is OFF.
Note. Session values are usefull to make it possible to annotate only
some selected statements:
...
- SET SESSION binlog_annotate_row_events=ON;
+ SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
- SET SESSION binlog_annotate_row_events=OFF;
+ SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
New binlog event type
~~~~~~~~~~~~~~~~~~~~~
Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
-Describes the query which caused the corresponding row event. In binary log,
+Describes the query which caused the corresponding rows event. In binary log,
precedes each Table_map_log_event. Contains empty post-header and the query
text in its data part.
@@ -79,6 +79,15 @@
0000012F | 0F 00 00 00 | table_id = 15
...
+New mysqlbinlog option
+~~~~~~~~~~~~~~~~~~~~~~
+ --print-annotate-rows-events
+
+With this option, mysqlbinlog prints the content of Annotate-rows
+events (if the binary log does contain them). Without this option
+(i.e. by default), mysqlbinlog skips Annotate rows events.
+
+
mysqlbinlog output
~~~~~~~~~~~~~~~~~~
Something like this:
@@ -109,5 +118,5 @@
1. Master always sends Annotate_rows events to mysqlbinlog (in
remote case).
2. Master sends Annotate_rows events to a slave only if the slave has
- both log-slave-updates and binlog-annotate-row-events options set.
+ both log-slave-updates and binlog-annotate-rows-events options set.
------------------------------------------------------------
-=-=(View All Progress Notes, 19 total)=-=-
http://askmonty.org/worklog/index.pl?tid=47&nolimit=1
DESCRIPTION:
Store in binlog (and show in mysqlbinlog output) texts of statements that
caused RBR events
This is needed for (list from Monty):
- Easier to understand why updates happened
- Would make it easier to find out where in application things went
wrong (as you can search for exact strings)
- Allow one to filter things based on comments in the statement.
The cost of this can be that the binlog will be approximately 2x in size
(especially insert of big blob's would be a bit painful), so this should
be an optional feature.
HIGH-LEVEL SPECIFICATION:
Content
~~~~~~~
1. Annotate_rows_log_event
2. Server option: --binlog-annotate-rows-events
3. Server option: --replicate-annotate-rows-events
4. mysqlbinlog option: --print-annotate-rows-events
5. mysqlbinlog output
1. Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Describes the query which caused the corresponding rows events. Has empty
post-header and contains the query text in its data part. Example:
************************
ANNOTATE_ROWS_EVENT
************************
00000220 | B6 A0 2C 4B | time_when = 1261215926
00000224 | 33 | event_type = 51
00000225 | 64 00 00 00 | server_id = 100
00000229 | 36 00 00 00 | event_len = 54
0000022D | 56 02 00 00 | log_pos = 00000256
00000231 | 00 00 | flags = <none>
------------------------
00000233 | 49 4E 53 45 | query = "INSERT INTO t1 VALUES (1), (2), (3)"
00000237 | 52 54 20 49 |
0000023B | 4E 54 4F 20 |
0000023F | 74 31 20 56 |
00000243 | 41 4C 55 45 |
00000247 | 53 20 28 31 |
0000024B | 29 2C 20 28 |
0000024F | 32 29 2C 20 |
00000253 | 28 33 29 |
************************
In binary log, Annotate_rows event follows the (possible) 'BEGIN' Query event
and precedes the first of Table map events which accompany the corresponding
rows events. (See example in the "mysqlbinlog output" section below.)
2. Server option: --binlog-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the master to write Annotate_rows events to the binary log.
* Variable Name: binlog_annotate_rows_events
* Scope: Global & Session
* Access Type: Dynamic
* Data Type: bool
* Default Value: OFF
NOTE. Session values allows to annotate only some selected statements:
...
SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
3. Server option: --replicate-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the slave to reproduce Annotate_rows events recieved from the master
in its own binary log (sensible only in pair with log-slave-updates option).
* Variable Name: replicate_annotate_rows_events
* Scope: Global
* Access Type: Read only
* Data Type: bool
* Default Value: OFF
NOTE. Why do we additionally need this 'replicate' option? Why not to make
the slave to reproduce this events when its binlog-annotate-rows-events
global value is ON? Well, because, for example, we may want to configure
the slave which should reproduce Annotate_rows events but has global
binlog-annotate-rows-events = OFF meaning this to be the default value for
the client threads (see also "How slave treats replicate-annotate-rows-events
option" in LLD part).
4. mysqlbinlog option: --print-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
With this option, mysqlbinlog prints the content of Annotate_rows events (if
the binary log does contain them). Without this option (i.e. by default),
mysqlbinlog skips Annotate_rows events.
5. mysqlbinlog output
~~~~~~~~~~~~~~~~~~~~~
With --print-annotate-rows-events, mysqlbinlog outputs Annotate_rows events
in a form like this:
...
# at 1646
#091219 12:45:26 server id 100 end_log_pos 1714 Query thread_id=1
exec_time=0 error_code=0
SET TIMESTAMP=1261215926/*!*/;
BEGIN
/*!*/;
# at 1714
# at 1812
# at 1853
# at 1894
# at 1938
#091219 12:45:26 server id 100 end_log_pos 1812 Query: `DELETE t1, t2 FROM
t1 INNER JOIN t2 INNER JOIN t3 WHERE t1.a=t2.a AND t2.a=t3.a`
#091219 12:45:26 server id 100 end_log_pos 1853 Table_map: `test`.`t1`
mapped to number 16
#091219 12:45:26 server id 100 end_log_pos 1894 Table_map: `test`.`t2`
mapped to number 17
#091219 12:45:26 server id 100 end_log_pos 1938 Delete_rows: table id 16
#091219 12:45:26 server id 100 end_log_pos 1982 Delete_rows: table id 17
flags: STMT_END_F
...
LOW-LEVEL DESIGN:
Content
~~~~~~~
1. Annotate_rows event number
2. Outline of Annotate_rows event behavior
3. How Master writes Annotate_rows events to the binary log
4. How slave treats replicate-annotate-rows-events option
5. How slave IO thread requests Annotate_rows events
6. How master executes the request
7. How slave SQL thread processes Annotate_rows events
8. General remarks
1. Annotate_rows event number
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To avoid possible event numbers conflict with MySQL/Sun, we leave a gap
between the last MySQL event number and the Annotate_rows event number:
enum Log_event_type
{ ...
INCIDENT_EVENT= 26,
// New MySQL event numbers are to be added here
MYSQL_EVENTS_END,
MARIA_EVENTS_BEGIN= 51,
// New Maria event numbers start from here
ANNOTATE_ROWS_EVENT= 51,
ENUM_END_EVENT
};
together with the corresponding extension of 'post_header_len' array in the
Format description event. (This extension does not affect the compatibility
of the binary log). Here is how Format description event looks like with
this extension:
************************
FORMAT_DESCRIPTION_EVENT
************************
00000004 | A1 A0 2C 4B | time_when = 1261215905
00000008 | 0F | event_type = 15
00000009 | 64 00 00 00 | server_id = 100
0000000D | 7F 00 00 00 | event_len = 127
00000011 | 83 00 00 00 | log_pos = 00000083
00000015 | 01 00 | flags = LOG_EVENT_BINLOG_IN_USE_F
------------------------
00000017 | 04 00 | binlog_ver = 4
00000019 | 35 2E 32 2E | server_ver = 5.2.0-MariaDB-alpha-debug-log
..... ...
0000004B | A1 A0 2C 4B | time_created = 1261215905
0000004F | 13 | common_header_len = 19
------------------------
post_header_len
------------------------
00000050 | 38 | 56 - START_EVENT_V3 [1]
..... ...
00000069 | 02 | 2 - INCIDENT_EVENT [26]
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Each Annotate_rows_log_event object has two private members describing the
corresponding query:
char *m_query_txt;
uint m_query_len;
When the object is created for writing to a binary log, this query is taken
from 'thd' (for short, below we omit the 'Annotate_rows_log_event::' prefix
as well as other implementation details):
Annotate_rows_log_event(THD *thd)
{
m_query_txt = thd->query();
m_query_len = thd->query_length();
}
When the object is read from a binary log, the query is taken from the buffer
containing the binary log representation of the event (this buffer is allocated
in Log_event object from which all Log events are derived):
Annotate_rows_log_event(char *buf, uint event_len,
Format_description_log_event *desc)
{
m_query_len = event_len - desc->common_header_len;
m_query_txt = buf + desc->common_header_len;
}
The events are written to the binary log by the Log_event::write() member
which calls virtual write_data_header() and write_data_body() members
("data header" and "post header" are synonym in replication terminology).
In our case, data header is empty and data body is just the query:
bool write_data_body(IO_CACHE *file)
{
return my_b_safe_write(file, (uchar*) m_query_txt, m_query_len);
}
Printing the event is just printing the query:
void Annotate_rows_log_event::print(FILE *file, PRINT_EVENT_INFO *pinfo)
{
my_b_printf(&pinfo->head_cache, "\tQuery: `%s`\n", m_query_txt);
}
3. How Master writes Annotate_rows events to the binary log
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The event is written to the binary log just before the group of Table_map
events which precede corresponding Rows events (one query may generate
several Table map events in the binary log, but the corresponding
Annotate_rows event must be written only once before the first Table map
event; hence the boolean variable 'with_annotate' below):
int write_locked_table_maps(THD *thd)
{ ...
bool with_annotate= thd->variables.binlog_annotate_rows_events;
...
for (uint i= 0; i < ... <number of tables> ...; ++i)
{ ...
thd->binlog_write_table_map(table, ..., with_annotate);
with_annotate= 0; // write Annotate_event not more than once
...
}
...
}
int THD::binlog_write_table_map(TABLE *table, ..., bool with_annotate)
{ ...
Table_map_log_event the_event(...);
...
if (with_annotate)
{
Annotate_rows_log_event anno(this);
mysql_bin_log.write(&anno);
}
mysql_bin_log.write(&the_event);
...
}
4. How slave treats replicate-annotate-rows-events option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The replicate-annotate-rows-events option is treated just as the session
value of the binlog_annotate_rows_events variable for the slave IO and
SQL threads. This setting is done during initialization of these threads:
pthread_handler_t handle_slave_io(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_IO);
...
}
pthread_handler_t handle_slave_sql(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_SQL);
...
}
int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type)
{ ...
thd->variables.binlog_annotate_rows_events=
opt_replicate_annotate_rows_events;
...
}
5. How slave IO thread requests Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When requesting an event, the slave should inform the master whether
it should send Annotate_rows events or not. To that end we add a new
BINLOG_SEND_ANNOTATE_ROWS_EVENT flag used when requesting an event:
#define BINLOG_DUMP_NON_BLOCK 1
#define BINLOG_SEND_ANNOTATE_ROWS_EVENT 2
pthread_handler_t handle_slave_io(void *arg)
{ ...
request_dump(mysql, ...);
...
}
int request_dump(MYSQL* mysql, ...)
{ ...
if (opt_log_slave_updates &&
mi->io_thd->variables.binlog_annotate_rows_events)
binlog_flags|= BINLOG_SEND_ANNOTATE_ROWS_EVENT;
...
int2store(buf + 4, binlog_flags);
...
simple_command(mysql, COM_BINLOG_DUMP, buf, ...);
...
}
NOTE. mysqlbinlog, when remotely requesting BINLOG_DUMP by calling the
simple_command() function, should also use this flag if it wants (in case
of the --print-annotate-rows-events option set) to recieve Annotate_rows
events.
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
case COM_BINLOG_DUMP:
{ ...
flags= uint2korr(packet + 4);
...
mysql_binlog_send(thd, ..., flags);
...
}
void mysql_binlog_send(THD* thd, ..., ushort flags)
{ ...
Log_event::read_log_event(&log, packet, ...);
...
if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT)
{
my_net_write(net, packet->ptr(), packet->length());
}
...
}
7. How slave SQL thread processes Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The slave processes each recieved event by "applying" it, i.e. by
calling the Log_event::apply_event() function which in turn calls
the virtual do_apply_event() member specific for each type of the
event.
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev = next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
int apply_event_and_update_pos(Log_event *ev, ...)
{ ...
ev->apply_event(...);
...
}
int Log_event::apply_event(...)
{
return do_apply_event(...);
}
What does it mean to "apply" an Annotate_rows event? It means to set current
thd query to that of the described by the event, i.e. to the query which
caused the subsequent Rows events (see "How Master writes Annotate_rows
events to the binary log" to follow what happens further when the subsequent
Rows events are applied):
int Annotate_rows_log_event::do_apply_event(...)
{
thd->set_query(m_query_txt, m_query_len);
}
NOTE. I am not sure, but possibly current values of thd->query and
thd->query_length should be saved before calling set_query() and to be
restored on the Annotate_rows_log_event object deletion.
Is it really needed ?
After calling this do_apply_event() function we may not delete the
Annotate_rows_log_event object immediatedly (see exec_relay_log_event()
above) because thd->query now points to the string inside this object.
We may keep the pointer to this object in the Relay_log_info:
class Relay_log_info
{
public:
...
void set_annotate_event(Annotate_rows_log_event*);
Annotate_rows_log_event* get_annotate_event();
void free_annotate_event();
...
private:
Annotate_rows_log_event* m_annotate_event;
};
The saved Annotate_rows object should be deleted when all corresponding
Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (rli->get_annotate_event() && is_last_rows_event(ev))
rli->free_annotate_event();
else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
where
bool is_last_rows_event(Log_event* ev)
{
Log_event_type type= ev->get_type_code();
if (IS_ROWS_EVENT_TYPE(type))
{
Rows_log_event* rows= (Rows_log_event*)ev;
return rows->get_flags(Rows_log_event::STMT_END_F);
}
return 0;
}
#define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
(type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
Kristian noticed that introducing new log event type should be coordinated
somehow with MySQL/Sun:
Kristian: The numeric code for this event must be assigned carefully.
It should be coordinated with MySQL/Sun, otherwise we can get into a
situation where MySQL uses the same numeric code for one event that
MariaDB uses for ANNOTATE_ROWS_EVENT, which would make merging the two
impossible.
Alex: I reserved about 20 numbers not to have possible conflicts
with MySQL.
Kristian: Still, I think it would be appropriate to send a polite email
to internals(a)lists.mysql.com about this and suggesting to reserve the
event number.
Also we should notice the introduction of the BINLOG_SEND_ANNOTATE_ROWS_EVENT
flag taking into account that MySQL/Sun may also introduce a flag with the
same value to be used in the request_dump-mysql_binlog_send interface.
But this is mainly the question of merging: if a conflict concerning this
flag occur, we may simply change the BINLOG_SEND_ANNOTATE_ROWS_EVENT value
(this does not require additional changes in the code).
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0
[Maria-developers] Updated (by Alexi): Store in binlog text of statements that caused RBR events (47)
by worklog-noreply@askmonty.org 20 Dec '09
by worklog-noreply@askmonty.org 20 Dec '09
20 Dec '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Store in binlog text of statements that caused RBR events
CREATION DATE..: Sat, 15 Aug 2009, 23:48
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Knielsen
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 47 (http://askmonty.org/worklog/?tid=47)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 20
ESTIMATE.......: 35 (hours remain)
ORIG. ESTIMATE.: 35
PROGRESS NOTES:
-=-=(Alexi - Sun, 20 Dec 2009, 16:00)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.19667 2009-12-20 14:00:56.000000000 +0000
+++ /tmp/wklog.47.new.19667 2009-12-20 14:00:56.000000000 +0000
@@ -196,6 +196,11 @@
...
}
+NOTE. mysqlbinlog, when remotely requesting BINLOG_DUMP by calling the
+simple_command() function, should also use this flag if it wants (in case
+of the --print-annotate-rows-events option set) to recieve Annotate_rows
+events.
+
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -212,8 +217,7 @@
Log_event::read_log_event(&log, packet, ...);
...
if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
- flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT ||
- thd->server_id == 0 /* slave == mysqlbinlog */ )
+ flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT)
{
my_net_write(net, packet->ptr(), packet->length());
}
-=-=(Alexi - Sun, 20 Dec 2009, 13:14)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.11350 2009-12-20 13:14:04.000000000 +0200
+++ /tmp/wklog.47.new.11350 2009-12-20 13:14:04.000000000 +0200
@@ -282,23 +282,18 @@
Annotate_rows_log_event* m_annotate_event;
};
-When the saved Annotate_rows object may be deleted? When all corresponding
-Rows events will be processed, i.e. before processing the first non-Rows
-event (note that Annotate_rows object resides in the binary log *after*
-the (possible) 'BEGIN' Query event which accompanies the rows events; note
-also that this deletion is adjusted with the case when some or all
-corresponding Rows events are filtered out by replicate filter rules):
+The saved Annotate_rows object should be deleted when all corresponding
+Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
- if (rli->get_annotate_event() && !IS_RBR_EVENT_TYPE(ev->get_type_code()))
- rli->free_annotate_event();
-
apply_event_and_update_pos(ev, ...);
- if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
+ if (rli->get_annotate_event() && is_last_rows_event(ev))
+ rli->free_annotate_event();
+ else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
@@ -307,10 +302,21 @@
where
- #define IS_RBR_EVENT_TYPE(type) ( (type) == TABLE_MAP_EVENT || \
- (type) == WRITE_ROWS_EVENT || \
+ bool is_last_rows_event(Log_event* ev)
+ {
+ Log_event_type type= ev->get_type_code();
+ if (IS_ROWS_EVENT_TYPE(type))
+ {
+ Rows_log_event* rows= (Rows_log_event*)ev;
+ return rows->get_flags(Rows_log_event::STMT_END_F);
+ }
+
+ return 0;
+ }
+
+ #define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
- (type) == DELETE_ROWS_EVENT )
+ (type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
-=-=(Alexi - Sun, 20 Dec 2009, 09:29)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.32726 2009-12-20 07:29:56.000000000 +0000
+++ /tmp/wklog.47.new.32726 2009-12-20 07:29:56.000000000 +0000
@@ -56,7 +56,7 @@
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
- 00000082 | 00 | 0 - ANNOTATE_RBR_EVENT [51]
+ 00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
-=-=(Alexi - Sat, 19 Dec 2009, 16:10)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.16051 2009-12-19 16:10:48.000000000 +0200
+++ /tmp/wklog.47.new.16051 2009-12-19 16:10:48.000000000 +0200
@@ -253,7 +253,7 @@
thd query to that of the described by the event, i.e. to the query which
caused the subsequent Rows events (see "How Master writes Annotate_rows
events to the binary log" to follow what happens further when the subsequent
-Rows events is applied):
+Rows events are applied):
int Annotate_rows_log_event::do_apply_event(...)
{
-=-=(Alexi - Sat, 19 Dec 2009, 16:02)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.15695 2009-12-19 16:02:33.000000000 +0200
+++ /tmp/wklog.47.new.15695 2009-12-19 16:02:33.000000000 +0200
@@ -12,7 +12,7 @@
post-header and contains the query text in its data part. Example:
************************
- ANNOTATE_RBR_EVENT
+ ANNOTATE_ROWS_EVENT
************************
00000220 | B6 A0 2C 4B | time_when = 1261215926
00000224 | 33 | event_type = 51
-=-=(Alexi - Sat, 19 Dec 2009, 15:58)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.15437 2009-12-19 13:58:12.000000000 +0000
+++ /tmp/wklog.47.new.15437 2009-12-19 13:58:12.000000000 +0000
@@ -1 +1,337 @@
+Content
+~~~~~~~
+ 1. Annotate_rows event number
+ 2. Outline of Annotate_rows event behavior
+ 3. How Master writes Annotate_rows events to the binary log
+ 4. How slave treats replicate-annotate-rows-events option
+ 5. How slave IO thread requests Annotate_rows events
+ 6. How master executes the request
+ 7. How slave SQL thread processes Annotate_rows events
+ 8. General remarks
+
+1. Annotate_rows event number
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+To avoid possible event numbers conflict with MySQL/Sun, we leave a gap
+between the last MySQL event number and the Annotate_rows event number:
+
+ enum Log_event_type
+ { ...
+ INCIDENT_EVENT= 26,
+ // New MySQL event numbers are to be added here
+ MYSQL_EVENTS_END,
+
+ MARIA_EVENTS_BEGIN= 51,
+ // New Maria event numbers start from here
+ ANNOTATE_ROWS_EVENT= 51,
+
+ ENUM_END_EVENT
+ };
+
+together with the corresponding extension of 'post_header_len' array in the
+Format description event. (This extension does not affect the compatibility
+of the binary log). Here is how Format description event looks like with
+this extension:
+
+ ************************
+ FORMAT_DESCRIPTION_EVENT
+ ************************
+ 00000004 | A1 A0 2C 4B | time_when = 1261215905
+ 00000008 | 0F | event_type = 15
+ 00000009 | 64 00 00 00 | server_id = 100
+ 0000000D | 7F 00 00 00 | event_len = 127
+ 00000011 | 83 00 00 00 | log_pos = 00000083
+ 00000015 | 01 00 | flags = LOG_EVENT_BINLOG_IN_USE_F
+ ------------------------
+ 00000017 | 04 00 | binlog_ver = 4
+ 00000019 | 35 2E 32 2E | server_ver = 5.2.0-MariaDB-alpha-debug-log
+ ..... ...
+ 0000004B | A1 A0 2C 4B | time_created = 1261215905
+ 0000004F | 13 | common_header_len = 19
+ ------------------------
+ post_header_len
+ ------------------------
+ 00000050 | 38 | 56 - START_EVENT_V3 [1]
+ ..... ...
+ 00000069 | 02 | 2 - INCIDENT_EVENT [26]
+ 0000006A | 00 | 0 - RESERVED [27]
+ ..... ...
+ 00000081 | 00 | 0 - RESERVED [50]
+ 00000082 | 00 | 0 - ANNOTATE_RBR_EVENT [51]
+ ************************
+
+2. Outline of Annotate_rows event behavior
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Each Annotate_rows_log_event object has two private members describing the
+corresponding query:
+
+ char *m_query_txt;
+ uint m_query_len;
+
+When the object is created for writing to a binary log, this query is taken
+from 'thd' (for short, below we omit the 'Annotate_rows_log_event::' prefix
+as well as other implementation details):
+
+ Annotate_rows_log_event(THD *thd)
+ {
+ m_query_txt = thd->query();
+ m_query_len = thd->query_length();
+ }
+
+When the object is read from a binary log, the query is taken from the buffer
+containing the binary log representation of the event (this buffer is allocated
+in Log_event object from which all Log events are derived):
+
+ Annotate_rows_log_event(char *buf, uint event_len,
+ Format_description_log_event *desc)
+ {
+ m_query_len = event_len - desc->common_header_len;
+ m_query_txt = buf + desc->common_header_len;
+ }
+
+The events are written to the binary log by the Log_event::write() member
+which calls virtual write_data_header() and write_data_body() members
+("data header" and "post header" are synonym in replication terminology).
+In our case, data header is empty and data body is just the query:
+
+ bool write_data_body(IO_CACHE *file)
+ {
+ return my_b_safe_write(file, (uchar*) m_query_txt, m_query_len);
+ }
+
+Printing the event is just printing the query:
+
+ void Annotate_rows_log_event::print(FILE *file, PRINT_EVENT_INFO *pinfo)
+ {
+ my_b_printf(&pinfo->head_cache, "\tQuery: `%s`\n", m_query_txt);
+ }
+
+3. How Master writes Annotate_rows events to the binary log
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The event is written to the binary log just before the group of Table_map
+events which precede corresponding Rows events (one query may generate
+several Table map events in the binary log, but the corresponding
+Annotate_rows event must be written only once before the first Table map
+event; hence the boolean variable 'with_annotate' below):
+
+ int write_locked_table_maps(THD *thd)
+ { ...
+ bool with_annotate= thd->variables.binlog_annotate_rows_events;
+ ...
+ for (uint i= 0; i < ... <number of tables> ...; ++i)
+ { ...
+ thd->binlog_write_table_map(table, ..., with_annotate);
+ with_annotate= 0; // write Annotate_event not more than once
+ ...
+ }
+ ...
+ }
+
+ int THD::binlog_write_table_map(TABLE *table, ..., bool with_annotate)
+ { ...
+ Table_map_log_event the_event(...);
+ ...
+ if (with_annotate)
+ {
+ Annotate_rows_log_event anno(this);
+ mysql_bin_log.write(&anno);
+ }
+
+ mysql_bin_log.write(&the_event);
+ ...
+ }
+
+4. How slave treats replicate-annotate-rows-events option
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The replicate-annotate-rows-events option is treated just as the session
+value of the binlog_annotate_rows_events variable for the slave IO and
+SQL threads. This setting is done during initialization of these threads:
+
+ pthread_handler_t handle_slave_io(void *arg)
+ {
+ THD *thd= new THD;
+ ...
+ init_slave_thread(thd, SLAVE_THD_IO);
+ ...
+ }
+
+ pthread_handler_t handle_slave_sql(void *arg)
+ {
+ THD *thd= new THD;
+ ...
+ init_slave_thread(thd, SLAVE_THD_SQL);
+ ...
+ }
+
+ int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type)
+ { ...
+ thd->variables.binlog_annotate_rows_events=
+ opt_replicate_annotate_rows_events;
+ ...
+ }
+
+5. How slave IO thread requests Annotate_rows events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+When requesting an event, the slave should inform the master whether
+it should send Annotate_rows events or not. To that end we add a new
+BINLOG_SEND_ANNOTATE_ROWS_EVENT flag used when requesting an event:
+
+ #define BINLOG_DUMP_NON_BLOCK 1
+ #define BINLOG_SEND_ANNOTATE_ROWS_EVENT 2
+
+ pthread_handler_t handle_slave_io(void *arg)
+ { ...
+ request_dump(mysql, ...);
+ ...
+ }
+
+ int request_dump(MYSQL* mysql, ...)
+ { ...
+ if (opt_log_slave_updates &&
+ mi->io_thd->variables.binlog_annotate_rows_events)
+ binlog_flags|= BINLOG_SEND_ANNOTATE_ROWS_EVENT;
+ ...
+ int2store(buf + 4, binlog_flags);
+ ...
+ simple_command(mysql, COM_BINLOG_DUMP, buf, ...);
+ ...
+ }
+
+6. How master executes the request
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ case COM_BINLOG_DUMP:
+ { ...
+ flags= uint2korr(packet + 4);
+ ...
+ mysql_binlog_send(thd, ..., flags);
+ ...
+ }
+
+ void mysql_binlog_send(THD* thd, ..., ushort flags)
+ { ...
+ Log_event::read_log_event(&log, packet, ...);
+ ...
+ if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
+ flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT ||
+ thd->server_id == 0 /* slave == mysqlbinlog */ )
+ {
+ my_net_write(net, packet->ptr(), packet->length());
+ }
+ ...
+ }
+
+7. How slave SQL thread processes Annotate_rows events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The slave processes each recieved event by "applying" it, i.e. by
+calling the Log_event::apply_event() function which in turn calls
+the virtual do_apply_event() member specific for each type of the
+event.
+
+ int exec_relay_log_event(THD* thd, Relay_log_info* rli)
+ { ...
+ Log_event *ev = next_event(rli);
+ ...
+ apply_event_and_update_pos(ev, ...);
+
+ if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
+ delete ev;
+ ...
+ }
+
+ int apply_event_and_update_pos(Log_event *ev, ...)
+ { ...
+ ev->apply_event(...);
+ ...
+ }
+
+ int Log_event::apply_event(...)
+ {
+ return do_apply_event(...);
+ }
+
+What does it mean to "apply" an Annotate_rows event? It means to set current
+thd query to that of the described by the event, i.e. to the query which
+caused the subsequent Rows events (see "How Master writes Annotate_rows
+events to the binary log" to follow what happens further when the subsequent
+Rows events is applied):
+
+ int Annotate_rows_log_event::do_apply_event(...)
+ {
+ thd->set_query(m_query_txt, m_query_len);
+ }
+
+NOTE. I am not sure, but possibly current values of thd->query and
+thd->query_length should be saved before calling set_query() and to be
+restored on the Annotate_rows_log_event object deletion.
+Is it really needed ?
+
+After calling this do_apply_event() function we may not delete the
+Annotate_rows_log_event object immediatedly (see exec_relay_log_event()
+above) because thd->query now points to the string inside this object.
+We may keep the pointer to this object in the Relay_log_info:
+
+ class Relay_log_info
+ {
+ public:
+ ...
+ void set_annotate_event(Annotate_rows_log_event*);
+ Annotate_rows_log_event* get_annotate_event();
+ void free_annotate_event();
+ ...
+ private:
+ Annotate_rows_log_event* m_annotate_event;
+ };
+
+When the saved Annotate_rows object may be deleted? When all corresponding
+Rows events will be processed, i.e. before processing the first non-Rows
+event (note that Annotate_rows object resides in the binary log *after*
+the (possible) 'BEGIN' Query event which accompanies the rows events; note
+also that this deletion is adjusted with the case when some or all
+corresponding Rows events are filtered out by replicate filter rules):
+
+ int exec_relay_log_event(THD* thd, Relay_log_info* rli)
+ { ...
+ Log_event *ev= next_event(rli);
+ ...
+ if (rli->get_annotate_event() && !IS_RBR_EVENT_TYPE(ev->get_type_code()))
+ rli->free_annotate_event();
+
+ apply_event_and_update_pos(ev, ...);
+
+ if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
+ rli->set_annotate_event((Annotate_rows_log_event*) ev);
+ else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
+ delete ev;
+ ...
+ }
+
+where
+
+ #define IS_RBR_EVENT_TYPE(type) ( (type) == TABLE_MAP_EVENT || \
+ (type) == WRITE_ROWS_EVENT || \
+ (type) == UPDATE_ROWS_EVENT || \
+ (type) == DELETE_ROWS_EVENT )
+
+8. General remarks
+~~~~~~~~~~~~~~~~~~
+Kristian noticed that introducing new log event type should be coordinated
+somehow with MySQL/Sun:
+
+ Kristian: The numeric code for this event must be assigned carefully.
+ It should be coordinated with MySQL/Sun, otherwise we can get into a
+ situation where MySQL uses the same numeric code for one event that
+ MariaDB uses for ANNOTATE_ROWS_EVENT, which would make merging the two
+ impossible.
+ Alex: I reserved about 20 numbers not to have possible conflicts
+ with MySQL.
+ Kristian: Still, I think it would be appropriate to send a polite email
+ to internals(a)lists.mysql.com about this and suggesting to reserve the
+ event number.
+
+Also we should notice the introduction of the BINLOG_SEND_ANNOTATE_ROWS_EVENT
+flag taking into account that MySQL/Sun may also introduce a flag with the
+same value to be used in the request_dump-mysql_binlog_send interface.
+But this is mainly the question of merging: if a conflict concerning this
+flag occur, we may simply change the BINLOG_SEND_ANNOTATE_ROWS_EVENT value
+(this does not require additional changes in the code).
-=-=(Alexi - Sat, 19 Dec 2009, 15:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.14545 2009-12-19 15:41:21.000000000 +0200
+++ /tmp/wklog.47.new.14545 2009-12-19 15:41:21.000000000 +0200
@@ -1,122 +1,107 @@
-First suggestion:
-
-> I think for this we would actually need a new binlog event type
-> (Comment_log_event?). Unless we want to log an empty statement Query_log_event
-> containing only a comment (a bit of a hack).
-
-New server option
-~~~~~~~~~~~~~~~~~
- --binlog-annotate-rows-events
-
-Setting this option makes RBR (rows-) events in the binary log to be
-preceded by Annotate rows events (see below). The corresponding
-'binlog_annotate_rows_events' system variable is dynamic and has both
-global and session values. Default global value is OFF.
-
-Note. Session values are usefull to make it possible to annotate only
- some selected statements:
+Content
+~~~~~~~
+ 1. Annotate_rows_log_event
+ 2. Server option: --binlog-annotate-rows-events
+ 3. Server option: --replicate-annotate-rows-events
+ 4. mysqlbinlog option: --print-annotate-rows-events
+ 5. mysqlbinlog output
+
+1. Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Describes the query which caused the corresponding rows events. Has empty
+post-header and contains the query text in its data part. Example:
+
+ ************************
+ ANNOTATE_RBR_EVENT
+ ************************
+ 00000220 | B6 A0 2C 4B | time_when = 1261215926
+ 00000224 | 33 | event_type = 51
+ 00000225 | 64 00 00 00 | server_id = 100
+ 00000229 | 36 00 00 00 | event_len = 54
+ 0000022D | 56 02 00 00 | log_pos = 00000256
+ 00000231 | 00 00 | flags = <none>
+ ------------------------
+ 00000233 | 49 4E 53 45 | query = "INSERT INTO t1 VALUES (1), (2), (3)"
+ 00000237 | 52 54 20 49 |
+ 0000023B | 4E 54 4F 20 |
+ 0000023F | 74 31 20 56 |
+ 00000243 | 41 4C 55 45 |
+ 00000247 | 53 20 28 31 |
+ 0000024B | 29 2C 20 28 |
+ 0000024F | 32 29 2C 20 |
+ 00000253 | 28 33 29 |
+ ************************
+
+In binary log, Annotate_rows event follows the (possible) 'BEGIN' Query event
+and precedes the first of Table map events which accompany the corresponding
+rows events. (See example in the "mysqlbinlog output" section below.)
+
+2. Server option: --binlog-annotate-rows-events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Tells the master to write Annotate_rows events to the binary log.
+
+ * Variable Name: binlog_annotate_rows_events
+ * Scope: Global & Session
+ * Access Type: Dynamic
+ * Data Type: bool
+ * Default Value: OFF
+NOTE. Session values allows to annotate only some selected statements:
...
SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
-New binlog event type
-~~~~~~~~~~~~~~~~~~~~~
- Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
-
-Describes the query which caused the corresponding rows event. In binary log,
-precedes each Table_map_log_event. Contains empty post-header and the query
-text in its data part.
-
-The numeric code for this event must be assigned carefully. It should be
-coordinated with MySQL/Sun, otherwise we can get into a situation where MySQL
-uses the same numeric code for one event that MariaDB uses for
-ANNOTATE_ROWS_EVENT, which would make merging the two impossible.
-
-Example:
-
- ...
- ************************
- ANNOTATE_ROWS_EVENT [51]
- ************************
- 000000C7 | 54 1B 12 4B | time_when = 1259477844
- 000000CB | 33 | event_type = 51
- 000000CC | 64 00 00 00 | server_id = 100
- 000000D0 | 2C 00 00 00 | event_len = 44
- 000000D4 | F3 00 00 00 | log_pos = 000000F3
- 000000D8 | 00 00 | flags = <none>
- ------------------------
- 000000DA | 69 6E 73 65 | query = "insert into t1 values (1)"
- 000000DE | 72 74 20 69 |
- 000000E2 | 6E 74 6F 20 |
- 000000E6 | 74 31 20 76 |
- 000000EA | 61 6C 75 65 |
- 000000EE | 73 20 28 31 |
- 000000F2 | 29 |
- ************************
- TABLE_MAP_EVENT [19]
- ************************
- 000000F3 | 54 1B 12 4B | time_when = 1259477844
- 000000F7 | 13 | event_type = 19
- 000000F8 | 64 00 00 00 | server_id = 100
- 000000FC | 29 00 00 00 | event_len = 41
- 00000100 | 1C 01 00 00 | log_pos = 0000011C
- 00000104 | 00 00 | flags = <none>
- ------------------------
- ...
- ************************
- WRITE_ROWS_EVENT [23]
- ************************
- 0000011C | 54 1B 12 4B | time_when = 1259477844
- 00000120 | 17 | event_type = 23
- 00000121 | 64 00 00 00 | server_id = 100
- 00000125 | 22 00 00 00 | event_len = 34
- 00000129 | 3E 01 00 00 | log_pos = 0000013E
- 0000012D | 10 00 | flags = LOG_EVENT_UPDATE_TABLE_MAP_VERSION_F
- ------------------------
- 0000012F | 0F 00 00 00 | table_id = 15
- ...
-
-New mysqlbinlog option
-~~~~~~~~~~~~~~~~~~~~~~
- --print-annotate-rows-events
-
-With this option, mysqlbinlog prints the content of Annotate-rows
-events (if the binary log does contain them). Without this option
-(i.e. by default), mysqlbinlog skips Annotate rows events.
-
-
-mysqlbinlog output
-~~~~~~~~~~~~~~~~~~
-Something like this:
+3. Server option: --replicate-annotate-rows-events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Tells the slave to reproduce Annotate_rows events recieved from the master
+in its own binary log (sensible only in pair with log-slave-updates option).
+
+ * Variable Name: replicate_annotate_rows_events
+ * Scope: Global
+ * Access Type: Read only
+ * Data Type: bool
+ * Default Value: OFF
+
+NOTE. Why do we additionally need this 'replicate' option? Why not to make
+the slave to reproduce this events when its binlog-annotate-rows-events
+global value is ON? Well, because, for example, we may want to configure
+the slave which should reproduce Annotate_rows events but has global
+binlog-annotate-rows-events = OFF meaning this to be the default value for
+the client threads (see also "How slave treats replicate-annotate-rows-events
+option" in LLD part).
+
+4. mysqlbinlog option: --print-annotate-rows-events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+With this option, mysqlbinlog prints the content of Annotate_rows events (if
+the binary log does contain them). Without this option (i.e. by default),
+mysqlbinlog skips Annotate_rows events.
+5. mysqlbinlog output
+~~~~~~~~~~~~~~~~~~~~~
+With --print-annotate-rows-events, mysqlbinlog outputs Annotate_rows events
+in a form like this:
...
- # at 199
- # at 243
- # at 284
- #091129 9:57:24 server id 100 end_log_pos 243 Query: `insert into t1 values
-(1)`
- #091129 9:57:24 server id 100 end_log_pos 284 Table_map: `test`.`t1` mapped
-to number 15
- #091129 9:57:24 server id 100 end_log_pos 318 Write_rows: table id 15
+ # at 1646
+ #091219 12:45:26 server id 100 end_log_pos 1714 Query thread_id=1
+exec_time=0 error_code=0
+ SET TIMESTAMP=1261215926/*!*/;
+ BEGIN
+ /*!*/;
+ # at 1714
+ # at 1812
+ # at 1853
+ # at 1894
+ # at 1938
+ #091219 12:45:26 server id 100 end_log_pos 1812 Query: `DELETE t1, t2 FROM
+t1 INNER JOIN t2 INNER JOIN t3 WHERE t1.a=t2.a AND t2.a=t3.a`
+ #091219 12:45:26 server id 100 end_log_pos 1853 Table_map: `test`.`t1`
+mapped to number 16
+ #091219 12:45:26 server id 100 end_log_pos 1894 Table_map: `test`.`t2`
+mapped to number 17
+ #091219 12:45:26 server id 100 end_log_pos 1938 Delete_rows: table id 16
+ #091219 12:45:26 server id 100 end_log_pos 1982 Delete_rows: table id 17
flags: STMT_END_F
-
- BINLOG '
- VBsSSzNkAAAALAAAAPMAAAAAAGluc2VydCBpbnRvIHQxIHZhbHVlcyAoMSk=
- VBsSSxNkAAAAKQAAABwBAAAAAA8AAAAAAAAABHRlc3QAAnQxAAEDAAE=
- VBsSSxdkAAAAIgAAAD4BAAAQAA8AAAAAAAEAAf/+AQAAAA==
- '/*!*/;
- ### INSERT INTO test.t1
- ### SET
- ### @1=1 /* INT meta=0 nullable=1 is_null=0 */
...
-When master sends Annotate rows events
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-1. Master always sends Annotate_rows events to mysqlbinlog (in
- remote case).
-2. Master sends Annotate_rows events to a slave only if the slave has
- both log-slave-updates and binlog-annotate-rows-events options set.
-
-=-=(Bothorsen - Fri, 18 Dec 2009, 16:22)=-=-
Add estimation time.
Worked 5 hours and estimate 35 hours remain (original estimate increased by 5 hours).
-=-=(Bothorsen - Fri, 18 Dec 2009, 16:16)=-=-
This is the work done on this patch so far. Most of it done by Alex.
Worked 15 hours and estimate 035 hours remain (original estimate increased by 50 hours).
-=-=(Alexi - Fri, 04 Dec 2009, 13:00)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.14001 2009-12-04 13:00:24.000000000 +0200
+++ /tmp/wklog.47.new.14001 2009-12-04 13:00:24.000000000 +0200
@@ -6,27 +6,27 @@
New server option
~~~~~~~~~~~~~~~~~
- --binlog-annotate-row-events
+ --binlog-annotate-rows-events
-Setting this option makes RBR (row-) events in the binary log to be
+Setting this option makes RBR (rows-) events in the binary log to be
preceded by Annotate rows events (see below). The corresponding
-'binlog_annotate_row_events' system variable is dynamic and has both
+'binlog_annotate_rows_events' system variable is dynamic and has both
global and session values. Default global value is OFF.
Note. Session values are usefull to make it possible to annotate only
some selected statements:
...
- SET SESSION binlog_annotate_row_events=ON;
+ SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
- SET SESSION binlog_annotate_row_events=OFF;
+ SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
New binlog event type
~~~~~~~~~~~~~~~~~~~~~
Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
-Describes the query which caused the corresponding row event. In binary log,
+Describes the query which caused the corresponding rows event. In binary log,
precedes each Table_map_log_event. Contains empty post-header and the query
text in its data part.
@@ -79,6 +79,15 @@
0000012F | 0F 00 00 00 | table_id = 15
...
+New mysqlbinlog option
+~~~~~~~~~~~~~~~~~~~~~~
+ --print-annotate-rows-events
+
+With this option, mysqlbinlog prints the content of Annotate-rows
+events (if the binary log does contain them). Without this option
+(i.e. by default), mysqlbinlog skips Annotate rows events.
+
+
mysqlbinlog output
~~~~~~~~~~~~~~~~~~
Something like this:
@@ -109,5 +118,5 @@
1. Master always sends Annotate_rows events to mysqlbinlog (in
remote case).
2. Master sends Annotate_rows events to a slave only if the slave has
- both log-slave-updates and binlog-annotate-row-events options set.
+ both log-slave-updates and binlog-annotate-rows-events options set.
------------------------------------------------------------
-=-=(View All Progress Notes, 19 total)=-=-
http://askmonty.org/worklog/index.pl?tid=47&nolimit=1
DESCRIPTION:
Store in binlog (and show in mysqlbinlog output) texts of statements that
caused RBR events
This is needed for (list from Monty):
- Easier to understand why updates happened
- Would make it easier to find out where in application things went
wrong (as you can search for exact strings)
- Allow one to filter things based on comments in the statement.
The cost of this can be that the binlog will be approximately 2x in size
(especially insert of big blob's would be a bit painful), so this should
be an optional feature.
HIGH-LEVEL SPECIFICATION:
Content
~~~~~~~
1. Annotate_rows_log_event
2. Server option: --binlog-annotate-rows-events
3. Server option: --replicate-annotate-rows-events
4. mysqlbinlog option: --print-annotate-rows-events
5. mysqlbinlog output
1. Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Describes the query which caused the corresponding rows events. Has empty
post-header and contains the query text in its data part. Example:
************************
ANNOTATE_ROWS_EVENT
************************
00000220 | B6 A0 2C 4B | time_when = 1261215926
00000224 | 33 | event_type = 51
00000225 | 64 00 00 00 | server_id = 100
00000229 | 36 00 00 00 | event_len = 54
0000022D | 56 02 00 00 | log_pos = 00000256
00000231 | 00 00 | flags = <none>
------------------------
00000233 | 49 4E 53 45 | query = "INSERT INTO t1 VALUES (1), (2), (3)"
00000237 | 52 54 20 49 |
0000023B | 4E 54 4F 20 |
0000023F | 74 31 20 56 |
00000243 | 41 4C 55 45 |
00000247 | 53 20 28 31 |
0000024B | 29 2C 20 28 |
0000024F | 32 29 2C 20 |
00000253 | 28 33 29 |
************************
In binary log, Annotate_rows event follows the (possible) 'BEGIN' Query event
and precedes the first of Table map events which accompany the corresponding
rows events. (See example in the "mysqlbinlog output" section below.)
2. Server option: --binlog-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the master to write Annotate_rows events to the binary log.
* Variable Name: binlog_annotate_rows_events
* Scope: Global & Session
* Access Type: Dynamic
* Data Type: bool
* Default Value: OFF
NOTE. Session values allows to annotate only some selected statements:
...
SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
3. Server option: --replicate-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the slave to reproduce Annotate_rows events recieved from the master
in its own binary log (sensible only in pair with log-slave-updates option).
* Variable Name: replicate_annotate_rows_events
* Scope: Global
* Access Type: Read only
* Data Type: bool
* Default Value: OFF
NOTE. Why do we additionally need this 'replicate' option? Why not to make
the slave to reproduce this events when its binlog-annotate-rows-events
global value is ON? Well, because, for example, we may want to configure
the slave which should reproduce Annotate_rows events but has global
binlog-annotate-rows-events = OFF meaning this to be the default value for
the client threads (see also "How slave treats replicate-annotate-rows-events
option" in LLD part).
4. mysqlbinlog option: --print-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
With this option, mysqlbinlog prints the content of Annotate_rows events (if
the binary log does contain them). Without this option (i.e. by default),
mysqlbinlog skips Annotate_rows events.
5. mysqlbinlog output
~~~~~~~~~~~~~~~~~~~~~
With --print-annotate-rows-events, mysqlbinlog outputs Annotate_rows events
in a form like this:
...
# at 1646
#091219 12:45:26 server id 100 end_log_pos 1714 Query thread_id=1
exec_time=0 error_code=0
SET TIMESTAMP=1261215926/*!*/;
BEGIN
/*!*/;
# at 1714
# at 1812
# at 1853
# at 1894
# at 1938
#091219 12:45:26 server id 100 end_log_pos 1812 Query: `DELETE t1, t2 FROM
t1 INNER JOIN t2 INNER JOIN t3 WHERE t1.a=t2.a AND t2.a=t3.a`
#091219 12:45:26 server id 100 end_log_pos 1853 Table_map: `test`.`t1`
mapped to number 16
#091219 12:45:26 server id 100 end_log_pos 1894 Table_map: `test`.`t2`
mapped to number 17
#091219 12:45:26 server id 100 end_log_pos 1938 Delete_rows: table id 16
#091219 12:45:26 server id 100 end_log_pos 1982 Delete_rows: table id 17
flags: STMT_END_F
...
LOW-LEVEL DESIGN:
Content
~~~~~~~
1. Annotate_rows event number
2. Outline of Annotate_rows event behavior
3. How Master writes Annotate_rows events to the binary log
4. How slave treats replicate-annotate-rows-events option
5. How slave IO thread requests Annotate_rows events
6. How master executes the request
7. How slave SQL thread processes Annotate_rows events
8. General remarks
1. Annotate_rows event number
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To avoid possible event numbers conflict with MySQL/Sun, we leave a gap
between the last MySQL event number and the Annotate_rows event number:
enum Log_event_type
{ ...
INCIDENT_EVENT= 26,
// New MySQL event numbers are to be added here
MYSQL_EVENTS_END,
MARIA_EVENTS_BEGIN= 51,
// New Maria event numbers start from here
ANNOTATE_ROWS_EVENT= 51,
ENUM_END_EVENT
};
together with the corresponding extension of 'post_header_len' array in the
Format description event. (This extension does not affect the compatibility
of the binary log). Here is how Format description event looks like with
this extension:
************************
FORMAT_DESCRIPTION_EVENT
************************
00000004 | A1 A0 2C 4B | time_when = 1261215905
00000008 | 0F | event_type = 15
00000009 | 64 00 00 00 | server_id = 100
0000000D | 7F 00 00 00 | event_len = 127
00000011 | 83 00 00 00 | log_pos = 00000083
00000015 | 01 00 | flags = LOG_EVENT_BINLOG_IN_USE_F
------------------------
00000017 | 04 00 | binlog_ver = 4
00000019 | 35 2E 32 2E | server_ver = 5.2.0-MariaDB-alpha-debug-log
..... ...
0000004B | A1 A0 2C 4B | time_created = 1261215905
0000004F | 13 | common_header_len = 19
------------------------
post_header_len
------------------------
00000050 | 38 | 56 - START_EVENT_V3 [1]
..... ...
00000069 | 02 | 2 - INCIDENT_EVENT [26]
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Each Annotate_rows_log_event object has two private members describing the
corresponding query:
char *m_query_txt;
uint m_query_len;
When the object is created for writing to a binary log, this query is taken
from 'thd' (for short, below we omit the 'Annotate_rows_log_event::' prefix
as well as other implementation details):
Annotate_rows_log_event(THD *thd)
{
m_query_txt = thd->query();
m_query_len = thd->query_length();
}
When the object is read from a binary log, the query is taken from the buffer
containing the binary log representation of the event (this buffer is allocated
in Log_event object from which all Log events are derived):
Annotate_rows_log_event(char *buf, uint event_len,
Format_description_log_event *desc)
{
m_query_len = event_len - desc->common_header_len;
m_query_txt = buf + desc->common_header_len;
}
The events are written to the binary log by the Log_event::write() member
which calls virtual write_data_header() and write_data_body() members
("data header" and "post header" are synonym in replication terminology).
In our case, data header is empty and data body is just the query:
bool write_data_body(IO_CACHE *file)
{
return my_b_safe_write(file, (uchar*) m_query_txt, m_query_len);
}
Printing the event is just printing the query:
void Annotate_rows_log_event::print(FILE *file, PRINT_EVENT_INFO *pinfo)
{
my_b_printf(&pinfo->head_cache, "\tQuery: `%s`\n", m_query_txt);
}
3. How Master writes Annotate_rows events to the binary log
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The event is written to the binary log just before the group of Table_map
events which precede corresponding Rows events (one query may generate
several Table map events in the binary log, but the corresponding
Annotate_rows event must be written only once before the first Table map
event; hence the boolean variable 'with_annotate' below):
int write_locked_table_maps(THD *thd)
{ ...
bool with_annotate= thd->variables.binlog_annotate_rows_events;
...
for (uint i= 0; i < ... <number of tables> ...; ++i)
{ ...
thd->binlog_write_table_map(table, ..., with_annotate);
with_annotate= 0; // write Annotate_event not more than once
...
}
...
}
int THD::binlog_write_table_map(TABLE *table, ..., bool with_annotate)
{ ...
Table_map_log_event the_event(...);
...
if (with_annotate)
{
Annotate_rows_log_event anno(this);
mysql_bin_log.write(&anno);
}
mysql_bin_log.write(&the_event);
...
}
4. How slave treats replicate-annotate-rows-events option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The replicate-annotate-rows-events option is treated just as the session
value of the binlog_annotate_rows_events variable for the slave IO and
SQL threads. This setting is done during initialization of these threads:
pthread_handler_t handle_slave_io(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_IO);
...
}
pthread_handler_t handle_slave_sql(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_SQL);
...
}
int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type)
{ ...
thd->variables.binlog_annotate_rows_events=
opt_replicate_annotate_rows_events;
...
}
5. How slave IO thread requests Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When requesting an event, the slave should inform the master whether
it should send Annotate_rows events or not. To that end we add a new
BINLOG_SEND_ANNOTATE_ROWS_EVENT flag used when requesting an event:
#define BINLOG_DUMP_NON_BLOCK 1
#define BINLOG_SEND_ANNOTATE_ROWS_EVENT 2
pthread_handler_t handle_slave_io(void *arg)
{ ...
request_dump(mysql, ...);
...
}
int request_dump(MYSQL* mysql, ...)
{ ...
if (opt_log_slave_updates &&
mi->io_thd->variables.binlog_annotate_rows_events)
binlog_flags|= BINLOG_SEND_ANNOTATE_ROWS_EVENT;
...
int2store(buf + 4, binlog_flags);
...
simple_command(mysql, COM_BINLOG_DUMP, buf, ...);
...
}
NOTE. mysqlbinlog, when remotely requesting BINLOG_DUMP by calling the
simple_command() function, should also use this flag if it wants (in case
of the --print-annotate-rows-events option set) to recieve Annotate_rows
events.
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
case COM_BINLOG_DUMP:
{ ...
flags= uint2korr(packet + 4);
...
mysql_binlog_send(thd, ..., flags);
...
}
void mysql_binlog_send(THD* thd, ..., ushort flags)
{ ...
Log_event::read_log_event(&log, packet, ...);
...
if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT)
{
my_net_write(net, packet->ptr(), packet->length());
}
...
}
7. How slave SQL thread processes Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The slave processes each recieved event by "applying" it, i.e. by
calling the Log_event::apply_event() function which in turn calls
the virtual do_apply_event() member specific for each type of the
event.
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev = next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
int apply_event_and_update_pos(Log_event *ev, ...)
{ ...
ev->apply_event(...);
...
}
int Log_event::apply_event(...)
{
return do_apply_event(...);
}
What does it mean to "apply" an Annotate_rows event? It means to set current
thd query to that of the described by the event, i.e. to the query which
caused the subsequent Rows events (see "How Master writes Annotate_rows
events to the binary log" to follow what happens further when the subsequent
Rows events are applied):
int Annotate_rows_log_event::do_apply_event(...)
{
thd->set_query(m_query_txt, m_query_len);
}
NOTE. I am not sure, but possibly current values of thd->query and
thd->query_length should be saved before calling set_query() and to be
restored on the Annotate_rows_log_event object deletion.
Is it really needed ?
After calling this do_apply_event() function we may not delete the
Annotate_rows_log_event object immediatedly (see exec_relay_log_event()
above) because thd->query now points to the string inside this object.
We may keep the pointer to this object in the Relay_log_info:
class Relay_log_info
{
public:
...
void set_annotate_event(Annotate_rows_log_event*);
Annotate_rows_log_event* get_annotate_event();
void free_annotate_event();
...
private:
Annotate_rows_log_event* m_annotate_event;
};
The saved Annotate_rows object should be deleted when all corresponding
Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (rli->get_annotate_event() && is_last_rows_event(ev))
rli->free_annotate_event();
else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
where
bool is_last_rows_event(Log_event* ev)
{
Log_event_type type= ev->get_type_code();
if (IS_ROWS_EVENT_TYPE(type))
{
Rows_log_event* rows= (Rows_log_event*)ev;
return rows->get_flags(Rows_log_event::STMT_END_F);
}
return 0;
}
#define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
(type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
Kristian noticed that introducing new log event type should be coordinated
somehow with MySQL/Sun:
Kristian: The numeric code for this event must be assigned carefully.
It should be coordinated with MySQL/Sun, otherwise we can get into a
situation where MySQL uses the same numeric code for one event that
MariaDB uses for ANNOTATE_ROWS_EVENT, which would make merging the two
impossible.
Alex: I reserved about 20 numbers not to have possible conflicts
with MySQL.
Kristian: Still, I think it would be appropriate to send a polite email
to internals(a)lists.mysql.com about this and suggesting to reserve the
event number.
Also we should notice the introduction of the BINLOG_SEND_ANNOTATE_ROWS_EVENT
flag taking into account that MySQL/Sun may also introduce a flag with the
same value to be used in the request_dump-mysql_binlog_send interface.
But this is mainly the question of merging: if a conflict concerning this
flag occur, we may simply change the BINLOG_SEND_ANNOTATE_ROWS_EVENT value
(this does not require additional changes in the code).
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0
[Maria-developers] Updated (by Alexi): Store in binlog text of statements that caused RBR events (47)
by worklog-noreply@askmonty.org 20 Dec '09
by worklog-noreply@askmonty.org 20 Dec '09
20 Dec '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Store in binlog text of statements that caused RBR events
CREATION DATE..: Sat, 15 Aug 2009, 23:48
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Knielsen
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 47 (http://askmonty.org/worklog/?tid=47)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 20
ESTIMATE.......: 35 (hours remain)
ORIG. ESTIMATE.: 35
PROGRESS NOTES:
-=-=(Alexi - Sun, 20 Dec 2009, 16:00)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.19667 2009-12-20 14:00:56.000000000 +0000
+++ /tmp/wklog.47.new.19667 2009-12-20 14:00:56.000000000 +0000
@@ -196,6 +196,11 @@
...
}
+NOTE. mysqlbinlog, when remotely requesting BINLOG_DUMP by calling the
+simple_command() function, should also use this flag if it wants (in case
+of the --print-annotate-rows-events option set) to recieve Annotate_rows
+events.
+
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -212,8 +217,7 @@
Log_event::read_log_event(&log, packet, ...);
...
if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
- flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT ||
- thd->server_id == 0 /* slave == mysqlbinlog */ )
+ flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT)
{
my_net_write(net, packet->ptr(), packet->length());
}
-=-=(Alexi - Sun, 20 Dec 2009, 13:14)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.11350 2009-12-20 13:14:04.000000000 +0200
+++ /tmp/wklog.47.new.11350 2009-12-20 13:14:04.000000000 +0200
@@ -282,23 +282,18 @@
Annotate_rows_log_event* m_annotate_event;
};
-When the saved Annotate_rows object may be deleted? When all corresponding
-Rows events will be processed, i.e. before processing the first non-Rows
-event (note that Annotate_rows object resides in the binary log *after*
-the (possible) 'BEGIN' Query event which accompanies the rows events; note
-also that this deletion is adjusted with the case when some or all
-corresponding Rows events are filtered out by replicate filter rules):
+The saved Annotate_rows object should be deleted when all corresponding
+Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
- if (rli->get_annotate_event() && !IS_RBR_EVENT_TYPE(ev->get_type_code()))
- rli->free_annotate_event();
-
apply_event_and_update_pos(ev, ...);
- if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
+ if (rli->get_annotate_event() && is_last_rows_event(ev))
+ rli->free_annotate_event();
+ else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
@@ -307,10 +302,21 @@
where
- #define IS_RBR_EVENT_TYPE(type) ( (type) == TABLE_MAP_EVENT || \
- (type) == WRITE_ROWS_EVENT || \
+ bool is_last_rows_event(Log_event* ev)
+ {
+ Log_event_type type= ev->get_type_code();
+ if (IS_ROWS_EVENT_TYPE(type))
+ {
+ Rows_log_event* rows= (Rows_log_event*)ev;
+ return rows->get_flags(Rows_log_event::STMT_END_F);
+ }
+
+ return 0;
+ }
+
+ #define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
- (type) == DELETE_ROWS_EVENT )
+ (type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
-=-=(Alexi - Sun, 20 Dec 2009, 09:29)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.32726 2009-12-20 07:29:56.000000000 +0000
+++ /tmp/wklog.47.new.32726 2009-12-20 07:29:56.000000000 +0000
@@ -56,7 +56,7 @@
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
- 00000082 | 00 | 0 - ANNOTATE_RBR_EVENT [51]
+ 00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
-=-=(Alexi - Sat, 19 Dec 2009, 16:10)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.16051 2009-12-19 16:10:48.000000000 +0200
+++ /tmp/wklog.47.new.16051 2009-12-19 16:10:48.000000000 +0200
@@ -253,7 +253,7 @@
thd query to that of the described by the event, i.e. to the query which
caused the subsequent Rows events (see "How Master writes Annotate_rows
events to the binary log" to follow what happens further when the subsequent
-Rows events is applied):
+Rows events are applied):
int Annotate_rows_log_event::do_apply_event(...)
{
-=-=(Alexi - Sat, 19 Dec 2009, 16:02)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.15695 2009-12-19 16:02:33.000000000 +0200
+++ /tmp/wklog.47.new.15695 2009-12-19 16:02:33.000000000 +0200
@@ -12,7 +12,7 @@
post-header and contains the query text in its data part. Example:
************************
- ANNOTATE_RBR_EVENT
+ ANNOTATE_ROWS_EVENT
************************
00000220 | B6 A0 2C 4B | time_when = 1261215926
00000224 | 33 | event_type = 51
-=-=(Alexi - Sat, 19 Dec 2009, 15:58)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.15437 2009-12-19 13:58:12.000000000 +0000
+++ /tmp/wklog.47.new.15437 2009-12-19 13:58:12.000000000 +0000
@@ -1 +1,337 @@
+Content
+~~~~~~~
+ 1. Annotate_rows event number
+ 2. Outline of Annotate_rows event behavior
+ 3. How Master writes Annotate_rows events to the binary log
+ 4. How slave treats replicate-annotate-rows-events option
+ 5. How slave IO thread requests Annotate_rows events
+ 6. How master executes the request
+ 7. How slave SQL thread processes Annotate_rows events
+ 8. General remarks
+
+1. Annotate_rows event number
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+To avoid possible event numbers conflict with MySQL/Sun, we leave a gap
+between the last MySQL event number and the Annotate_rows event number:
+
+ enum Log_event_type
+ { ...
+ INCIDENT_EVENT= 26,
+ // New MySQL event numbers are to be added here
+ MYSQL_EVENTS_END,
+
+ MARIA_EVENTS_BEGIN= 51,
+ // New Maria event numbers start from here
+ ANNOTATE_ROWS_EVENT= 51,
+
+ ENUM_END_EVENT
+ };
+
+together with the corresponding extension of 'post_header_len' array in the
+Format description event. (This extension does not affect the compatibility
+of the binary log). Here is how Format description event looks like with
+this extension:
+
+ ************************
+ FORMAT_DESCRIPTION_EVENT
+ ************************
+ 00000004 | A1 A0 2C 4B | time_when = 1261215905
+ 00000008 | 0F | event_type = 15
+ 00000009 | 64 00 00 00 | server_id = 100
+ 0000000D | 7F 00 00 00 | event_len = 127
+ 00000011 | 83 00 00 00 | log_pos = 00000083
+ 00000015 | 01 00 | flags = LOG_EVENT_BINLOG_IN_USE_F
+ ------------------------
+ 00000017 | 04 00 | binlog_ver = 4
+ 00000019 | 35 2E 32 2E | server_ver = 5.2.0-MariaDB-alpha-debug-log
+ ..... ...
+ 0000004B | A1 A0 2C 4B | time_created = 1261215905
+ 0000004F | 13 | common_header_len = 19
+ ------------------------
+ post_header_len
+ ------------------------
+ 00000050 | 38 | 56 - START_EVENT_V3 [1]
+ ..... ...
+ 00000069 | 02 | 2 - INCIDENT_EVENT [26]
+ 0000006A | 00 | 0 - RESERVED [27]
+ ..... ...
+ 00000081 | 00 | 0 - RESERVED [50]
+ 00000082 | 00 | 0 - ANNOTATE_RBR_EVENT [51]
+ ************************
+
+2. Outline of Annotate_rows event behavior
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Each Annotate_rows_log_event object has two private members describing the
+corresponding query:
+
+ char *m_query_txt;
+ uint m_query_len;
+
+When the object is created for writing to a binary log, this query is taken
+from 'thd' (for short, below we omit the 'Annotate_rows_log_event::' prefix
+as well as other implementation details):
+
+ Annotate_rows_log_event(THD *thd)
+ {
+ m_query_txt = thd->query();
+ m_query_len = thd->query_length();
+ }
+
+When the object is read from a binary log, the query is taken from the buffer
+containing the binary log representation of the event (this buffer is allocated
+in Log_event object from which all Log events are derived):
+
+ Annotate_rows_log_event(char *buf, uint event_len,
+ Format_description_log_event *desc)
+ {
+ m_query_len = event_len - desc->common_header_len;
+ m_query_txt = buf + desc->common_header_len;
+ }
+
+The events are written to the binary log by the Log_event::write() member
+which calls virtual write_data_header() and write_data_body() members
+("data header" and "post header" are synonym in replication terminology).
+In our case, data header is empty and data body is just the query:
+
+ bool write_data_body(IO_CACHE *file)
+ {
+ return my_b_safe_write(file, (uchar*) m_query_txt, m_query_len);
+ }
+
+Printing the event is just printing the query:
+
+ void Annotate_rows_log_event::print(FILE *file, PRINT_EVENT_INFO *pinfo)
+ {
+ my_b_printf(&pinfo->head_cache, "\tQuery: `%s`\n", m_query_txt);
+ }
+
+3. How Master writes Annotate_rows events to the binary log
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The event is written to the binary log just before the group of Table_map
+events which precede corresponding Rows events (one query may generate
+several Table map events in the binary log, but the corresponding
+Annotate_rows event must be written only once before the first Table map
+event; hence the boolean variable 'with_annotate' below):
+
+ int write_locked_table_maps(THD *thd)
+ { ...
+ bool with_annotate= thd->variables.binlog_annotate_rows_events;
+ ...
+ for (uint i= 0; i < ... <number of tables> ...; ++i)
+ { ...
+ thd->binlog_write_table_map(table, ..., with_annotate);
+ with_annotate= 0; // write Annotate_event not more than once
+ ...
+ }
+ ...
+ }
+
+ int THD::binlog_write_table_map(TABLE *table, ..., bool with_annotate)
+ { ...
+ Table_map_log_event the_event(...);
+ ...
+ if (with_annotate)
+ {
+ Annotate_rows_log_event anno(this);
+ mysql_bin_log.write(&anno);
+ }
+
+ mysql_bin_log.write(&the_event);
+ ...
+ }
+
+4. How slave treats replicate-annotate-rows-events option
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The replicate-annotate-rows-events option is treated just as the session
+value of the binlog_annotate_rows_events variable for the slave IO and
+SQL threads. This setting is done during initialization of these threads:
+
+ pthread_handler_t handle_slave_io(void *arg)
+ {
+ THD *thd= new THD;
+ ...
+ init_slave_thread(thd, SLAVE_THD_IO);
+ ...
+ }
+
+ pthread_handler_t handle_slave_sql(void *arg)
+ {
+ THD *thd= new THD;
+ ...
+ init_slave_thread(thd, SLAVE_THD_SQL);
+ ...
+ }
+
+ int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type)
+ { ...
+ thd->variables.binlog_annotate_rows_events=
+ opt_replicate_annotate_rows_events;
+ ...
+ }
+
+5. How slave IO thread requests Annotate_rows events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+When requesting an event, the slave should inform the master whether
+it should send Annotate_rows events or not. To that end we add a new
+BINLOG_SEND_ANNOTATE_ROWS_EVENT flag used when requesting an event:
+
+ #define BINLOG_DUMP_NON_BLOCK 1
+ #define BINLOG_SEND_ANNOTATE_ROWS_EVENT 2
+
+ pthread_handler_t handle_slave_io(void *arg)
+ { ...
+ request_dump(mysql, ...);
+ ...
+ }
+
+ int request_dump(MYSQL* mysql, ...)
+ { ...
+ if (opt_log_slave_updates &&
+ mi->io_thd->variables.binlog_annotate_rows_events)
+ binlog_flags|= BINLOG_SEND_ANNOTATE_ROWS_EVENT;
+ ...
+ int2store(buf + 4, binlog_flags);
+ ...
+ simple_command(mysql, COM_BINLOG_DUMP, buf, ...);
+ ...
+ }
+
+6. How master executes the request
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ case COM_BINLOG_DUMP:
+ { ...
+ flags= uint2korr(packet + 4);
+ ...
+ mysql_binlog_send(thd, ..., flags);
+ ...
+ }
+
+ void mysql_binlog_send(THD* thd, ..., ushort flags)
+ { ...
+ Log_event::read_log_event(&log, packet, ...);
+ ...
+ if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
+ flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT ||
+ thd->server_id == 0 /* slave == mysqlbinlog */ )
+ {
+ my_net_write(net, packet->ptr(), packet->length());
+ }
+ ...
+ }
+
+7. How slave SQL thread processes Annotate_rows events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The slave processes each recieved event by "applying" it, i.e. by
+calling the Log_event::apply_event() function which in turn calls
+the virtual do_apply_event() member specific for each type of the
+event.
+
+ int exec_relay_log_event(THD* thd, Relay_log_info* rli)
+ { ...
+ Log_event *ev = next_event(rli);
+ ...
+ apply_event_and_update_pos(ev, ...);
+
+ if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
+ delete ev;
+ ...
+ }
+
+ int apply_event_and_update_pos(Log_event *ev, ...)
+ { ...
+ ev->apply_event(...);
+ ...
+ }
+
+ int Log_event::apply_event(...)
+ {
+ return do_apply_event(...);
+ }
+
+What does it mean to "apply" an Annotate_rows event? It means to set current
+thd query to that of the described by the event, i.e. to the query which
+caused the subsequent Rows events (see "How Master writes Annotate_rows
+events to the binary log" to follow what happens further when the subsequent
+Rows events is applied):
+
+ int Annotate_rows_log_event::do_apply_event(...)
+ {
+ thd->set_query(m_query_txt, m_query_len);
+ }
+
+NOTE. I am not sure, but possibly current values of thd->query and
+thd->query_length should be saved before calling set_query() and to be
+restored on the Annotate_rows_log_event object deletion.
+Is it really needed ?
+
+After calling this do_apply_event() function we may not delete the
+Annotate_rows_log_event object immediatedly (see exec_relay_log_event()
+above) because thd->query now points to the string inside this object.
+We may keep the pointer to this object in the Relay_log_info:
+
+ class Relay_log_info
+ {
+ public:
+ ...
+ void set_annotate_event(Annotate_rows_log_event*);
+ Annotate_rows_log_event* get_annotate_event();
+ void free_annotate_event();
+ ...
+ private:
+ Annotate_rows_log_event* m_annotate_event;
+ };
+
+When the saved Annotate_rows object may be deleted? When all corresponding
+Rows events will be processed, i.e. before processing the first non-Rows
+event (note that Annotate_rows object resides in the binary log *after*
+the (possible) 'BEGIN' Query event which accompanies the rows events; note
+also that this deletion is adjusted with the case when some or all
+corresponding Rows events are filtered out by replicate filter rules):
+
+ int exec_relay_log_event(THD* thd, Relay_log_info* rli)
+ { ...
+ Log_event *ev= next_event(rli);
+ ...
+ if (rli->get_annotate_event() && !IS_RBR_EVENT_TYPE(ev->get_type_code()))
+ rli->free_annotate_event();
+
+ apply_event_and_update_pos(ev, ...);
+
+ if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
+ rli->set_annotate_event((Annotate_rows_log_event*) ev);
+ else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
+ delete ev;
+ ...
+ }
+
+where
+
+ #define IS_RBR_EVENT_TYPE(type) ( (type) == TABLE_MAP_EVENT || \
+ (type) == WRITE_ROWS_EVENT || \
+ (type) == UPDATE_ROWS_EVENT || \
+ (type) == DELETE_ROWS_EVENT )
+
+8. General remarks
+~~~~~~~~~~~~~~~~~~
+Kristian noticed that introducing new log event type should be coordinated
+somehow with MySQL/Sun:
+
+ Kristian: The numeric code for this event must be assigned carefully.
+ It should be coordinated with MySQL/Sun, otherwise we can get into a
+ situation where MySQL uses the same numeric code for one event that
+ MariaDB uses for ANNOTATE_ROWS_EVENT, which would make merging the two
+ impossible.
+ Alex: I reserved about 20 numbers not to have possible conflicts
+ with MySQL.
+ Kristian: Still, I think it would be appropriate to send a polite email
+ to internals(a)lists.mysql.com about this and suggesting to reserve the
+ event number.
+
+Also we should notice the introduction of the BINLOG_SEND_ANNOTATE_ROWS_EVENT
+flag taking into account that MySQL/Sun may also introduce a flag with the
+same value to be used in the request_dump-mysql_binlog_send interface.
+But this is mainly the question of merging: if a conflict concerning this
+flag occur, we may simply change the BINLOG_SEND_ANNOTATE_ROWS_EVENT value
+(this does not require additional changes in the code).
-=-=(Alexi - Sat, 19 Dec 2009, 15:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.14545 2009-12-19 15:41:21.000000000 +0200
+++ /tmp/wklog.47.new.14545 2009-12-19 15:41:21.000000000 +0200
@@ -1,122 +1,107 @@
-First suggestion:
-
-> I think for this we would actually need a new binlog event type
-> (Comment_log_event?). Unless we want to log an empty statement Query_log_event
-> containing only a comment (a bit of a hack).
-
-New server option
-~~~~~~~~~~~~~~~~~
- --binlog-annotate-rows-events
-
-Setting this option makes RBR (rows-) events in the binary log to be
-preceded by Annotate rows events (see below). The corresponding
-'binlog_annotate_rows_events' system variable is dynamic and has both
-global and session values. Default global value is OFF.
-
-Note. Session values are usefull to make it possible to annotate only
- some selected statements:
+Content
+~~~~~~~
+ 1. Annotate_rows_log_event
+ 2. Server option: --binlog-annotate-rows-events
+ 3. Server option: --replicate-annotate-rows-events
+ 4. mysqlbinlog option: --print-annotate-rows-events
+ 5. mysqlbinlog output
+
+1. Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Describes the query which caused the corresponding rows events. Has empty
+post-header and contains the query text in its data part. Example:
+
+ ************************
+ ANNOTATE_RBR_EVENT
+ ************************
+ 00000220 | B6 A0 2C 4B | time_when = 1261215926
+ 00000224 | 33 | event_type = 51
+ 00000225 | 64 00 00 00 | server_id = 100
+ 00000229 | 36 00 00 00 | event_len = 54
+ 0000022D | 56 02 00 00 | log_pos = 00000256
+ 00000231 | 00 00 | flags = <none>
+ ------------------------
+ 00000233 | 49 4E 53 45 | query = "INSERT INTO t1 VALUES (1), (2), (3)"
+ 00000237 | 52 54 20 49 |
+ 0000023B | 4E 54 4F 20 |
+ 0000023F | 74 31 20 56 |
+ 00000243 | 41 4C 55 45 |
+ 00000247 | 53 20 28 31 |
+ 0000024B | 29 2C 20 28 |
+ 0000024F | 32 29 2C 20 |
+ 00000253 | 28 33 29 |
+ ************************
+
+In binary log, Annotate_rows event follows the (possible) 'BEGIN' Query event
+and precedes the first of Table map events which accompany the corresponding
+rows events. (See example in the "mysqlbinlog output" section below.)
+
+2. Server option: --binlog-annotate-rows-events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Tells the master to write Annotate_rows events to the binary log.
+
+ * Variable Name: binlog_annotate_rows_events
+ * Scope: Global & Session
+ * Access Type: Dynamic
+ * Data Type: bool
+ * Default Value: OFF
+NOTE. Session values allows to annotate only some selected statements:
...
SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
-New binlog event type
-~~~~~~~~~~~~~~~~~~~~~
- Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
-
-Describes the query which caused the corresponding rows event. In binary log,
-precedes each Table_map_log_event. Contains empty post-header and the query
-text in its data part.
-
-The numeric code for this event must be assigned carefully. It should be
-coordinated with MySQL/Sun, otherwise we can get into a situation where MySQL
-uses the same numeric code for one event that MariaDB uses for
-ANNOTATE_ROWS_EVENT, which would make merging the two impossible.
-
-Example:
-
- ...
- ************************
- ANNOTATE_ROWS_EVENT [51]
- ************************
- 000000C7 | 54 1B 12 4B | time_when = 1259477844
- 000000CB | 33 | event_type = 51
- 000000CC | 64 00 00 00 | server_id = 100
- 000000D0 | 2C 00 00 00 | event_len = 44
- 000000D4 | F3 00 00 00 | log_pos = 000000F3
- 000000D8 | 00 00 | flags = <none>
- ------------------------
- 000000DA | 69 6E 73 65 | query = "insert into t1 values (1)"
- 000000DE | 72 74 20 69 |
- 000000E2 | 6E 74 6F 20 |
- 000000E6 | 74 31 20 76 |
- 000000EA | 61 6C 75 65 |
- 000000EE | 73 20 28 31 |
- 000000F2 | 29 |
- ************************
- TABLE_MAP_EVENT [19]
- ************************
- 000000F3 | 54 1B 12 4B | time_when = 1259477844
- 000000F7 | 13 | event_type = 19
- 000000F8 | 64 00 00 00 | server_id = 100
- 000000FC | 29 00 00 00 | event_len = 41
- 00000100 | 1C 01 00 00 | log_pos = 0000011C
- 00000104 | 00 00 | flags = <none>
- ------------------------
- ...
- ************************
- WRITE_ROWS_EVENT [23]
- ************************
- 0000011C | 54 1B 12 4B | time_when = 1259477844
- 00000120 | 17 | event_type = 23
- 00000121 | 64 00 00 00 | server_id = 100
- 00000125 | 22 00 00 00 | event_len = 34
- 00000129 | 3E 01 00 00 | log_pos = 0000013E
- 0000012D | 10 00 | flags = LOG_EVENT_UPDATE_TABLE_MAP_VERSION_F
- ------------------------
- 0000012F | 0F 00 00 00 | table_id = 15
- ...
-
-New mysqlbinlog option
-~~~~~~~~~~~~~~~~~~~~~~
- --print-annotate-rows-events
-
-With this option, mysqlbinlog prints the content of Annotate-rows
-events (if the binary log does contain them). Without this option
-(i.e. by default), mysqlbinlog skips Annotate rows events.
-
-
-mysqlbinlog output
-~~~~~~~~~~~~~~~~~~
-Something like this:
+3. Server option: --replicate-annotate-rows-events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Tells the slave to reproduce Annotate_rows events recieved from the master
+in its own binary log (sensible only in pair with log-slave-updates option).
+
+ * Variable Name: replicate_annotate_rows_events
+ * Scope: Global
+ * Access Type: Read only
+ * Data Type: bool
+ * Default Value: OFF
+
+NOTE. Why do we additionally need this 'replicate' option? Why not to make
+the slave to reproduce this events when its binlog-annotate-rows-events
+global value is ON? Well, because, for example, we may want to configure
+the slave which should reproduce Annotate_rows events but has global
+binlog-annotate-rows-events = OFF meaning this to be the default value for
+the client threads (see also "How slave treats replicate-annotate-rows-events
+option" in LLD part).
+
+4. mysqlbinlog option: --print-annotate-rows-events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+With this option, mysqlbinlog prints the content of Annotate_rows events (if
+the binary log does contain them). Without this option (i.e. by default),
+mysqlbinlog skips Annotate_rows events.
+5. mysqlbinlog output
+~~~~~~~~~~~~~~~~~~~~~
+With --print-annotate-rows-events, mysqlbinlog outputs Annotate_rows events
+in a form like this:
...
- # at 199
- # at 243
- # at 284
- #091129 9:57:24 server id 100 end_log_pos 243 Query: `insert into t1 values
-(1)`
- #091129 9:57:24 server id 100 end_log_pos 284 Table_map: `test`.`t1` mapped
-to number 15
- #091129 9:57:24 server id 100 end_log_pos 318 Write_rows: table id 15
+ # at 1646
+ #091219 12:45:26 server id 100 end_log_pos 1714 Query thread_id=1
+exec_time=0 error_code=0
+ SET TIMESTAMP=1261215926/*!*/;
+ BEGIN
+ /*!*/;
+ # at 1714
+ # at 1812
+ # at 1853
+ # at 1894
+ # at 1938
+ #091219 12:45:26 server id 100 end_log_pos 1812 Query: `DELETE t1, t2 FROM
+t1 INNER JOIN t2 INNER JOIN t3 WHERE t1.a=t2.a AND t2.a=t3.a`
+ #091219 12:45:26 server id 100 end_log_pos 1853 Table_map: `test`.`t1`
+mapped to number 16
+ #091219 12:45:26 server id 100 end_log_pos 1894 Table_map: `test`.`t2`
+mapped to number 17
+ #091219 12:45:26 server id 100 end_log_pos 1938 Delete_rows: table id 16
+ #091219 12:45:26 server id 100 end_log_pos 1982 Delete_rows: table id 17
flags: STMT_END_F
-
- BINLOG '
- VBsSSzNkAAAALAAAAPMAAAAAAGluc2VydCBpbnRvIHQxIHZhbHVlcyAoMSk=
- VBsSSxNkAAAAKQAAABwBAAAAAA8AAAAAAAAABHRlc3QAAnQxAAEDAAE=
- VBsSSxdkAAAAIgAAAD4BAAAQAA8AAAAAAAEAAf/+AQAAAA==
- '/*!*/;
- ### INSERT INTO test.t1
- ### SET
- ### @1=1 /* INT meta=0 nullable=1 is_null=0 */
...
-When master sends Annotate rows events
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-1. Master always sends Annotate_rows events to mysqlbinlog (in
- remote case).
-2. Master sends Annotate_rows events to a slave only if the slave has
- both log-slave-updates and binlog-annotate-rows-events options set.
-
-=-=(Bothorsen - Fri, 18 Dec 2009, 16:22)=-=-
Add estimation time.
Worked 5 hours and estimate 35 hours remain (original estimate increased by 5 hours).
-=-=(Bothorsen - Fri, 18 Dec 2009, 16:16)=-=-
This is the work done on this patch so far. Most of it done by Alex.
Worked 15 hours and estimate 035 hours remain (original estimate increased by 50 hours).
-=-=(Alexi - Fri, 04 Dec 2009, 13:00)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.14001 2009-12-04 13:00:24.000000000 +0200
+++ /tmp/wklog.47.new.14001 2009-12-04 13:00:24.000000000 +0200
@@ -6,27 +6,27 @@
New server option
~~~~~~~~~~~~~~~~~
- --binlog-annotate-row-events
+ --binlog-annotate-rows-events
-Setting this option makes RBR (row-) events in the binary log to be
+Setting this option makes RBR (rows-) events in the binary log to be
preceded by Annotate rows events (see below). The corresponding
-'binlog_annotate_row_events' system variable is dynamic and has both
+'binlog_annotate_rows_events' system variable is dynamic and has both
global and session values. Default global value is OFF.
Note. Session values are usefull to make it possible to annotate only
some selected statements:
...
- SET SESSION binlog_annotate_row_events=ON;
+ SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
- SET SESSION binlog_annotate_row_events=OFF;
+ SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
New binlog event type
~~~~~~~~~~~~~~~~~~~~~
Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
-Describes the query which caused the corresponding row event. In binary log,
+Describes the query which caused the corresponding rows event. In binary log,
precedes each Table_map_log_event. Contains empty post-header and the query
text in its data part.
@@ -79,6 +79,15 @@
0000012F | 0F 00 00 00 | table_id = 15
...
+New mysqlbinlog option
+~~~~~~~~~~~~~~~~~~~~~~
+ --print-annotate-rows-events
+
+With this option, mysqlbinlog prints the content of Annotate-rows
+events (if the binary log does contain them). Without this option
+(i.e. by default), mysqlbinlog skips Annotate rows events.
+
+
mysqlbinlog output
~~~~~~~~~~~~~~~~~~
Something like this:
@@ -109,5 +118,5 @@
1. Master always sends Annotate_rows events to mysqlbinlog (in
remote case).
2. Master sends Annotate_rows events to a slave only if the slave has
- both log-slave-updates and binlog-annotate-row-events options set.
+ both log-slave-updates and binlog-annotate-rows-events options set.
------------------------------------------------------------
-=-=(View All Progress Notes, 19 total)=-=-
http://askmonty.org/worklog/index.pl?tid=47&nolimit=1
DESCRIPTION:
Store in binlog (and show in mysqlbinlog output) texts of statements that
caused RBR events
This is needed for (list from Monty):
- Easier to understand why updates happened
- Would make it easier to find out where in application things went
wrong (as you can search for exact strings)
- Allow one to filter things based on comments in the statement.
The cost of this can be that the binlog will be approximately 2x in size
(especially insert of big blob's would be a bit painful), so this should
be an optional feature.
HIGH-LEVEL SPECIFICATION:
Content
~~~~~~~
1. Annotate_rows_log_event
2. Server option: --binlog-annotate-rows-events
3. Server option: --replicate-annotate-rows-events
4. mysqlbinlog option: --print-annotate-rows-events
5. mysqlbinlog output
1. Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Describes the query which caused the corresponding rows events. Has empty
post-header and contains the query text in its data part. Example:
************************
ANNOTATE_ROWS_EVENT
************************
00000220 | B6 A0 2C 4B | time_when = 1261215926
00000224 | 33 | event_type = 51
00000225 | 64 00 00 00 | server_id = 100
00000229 | 36 00 00 00 | event_len = 54
0000022D | 56 02 00 00 | log_pos = 00000256
00000231 | 00 00 | flags = <none>
------------------------
00000233 | 49 4E 53 45 | query = "INSERT INTO t1 VALUES (1), (2), (3)"
00000237 | 52 54 20 49 |
0000023B | 4E 54 4F 20 |
0000023F | 74 31 20 56 |
00000243 | 41 4C 55 45 |
00000247 | 53 20 28 31 |
0000024B | 29 2C 20 28 |
0000024F | 32 29 2C 20 |
00000253 | 28 33 29 |
************************
In binary log, Annotate_rows event follows the (possible) 'BEGIN' Query event
and precedes the first of Table map events which accompany the corresponding
rows events. (See example in the "mysqlbinlog output" section below.)
2. Server option: --binlog-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the master to write Annotate_rows events to the binary log.
* Variable Name: binlog_annotate_rows_events
* Scope: Global & Session
* Access Type: Dynamic
* Data Type: bool
* Default Value: OFF
NOTE. Session values allows to annotate only some selected statements:
...
SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
3. Server option: --replicate-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the slave to reproduce Annotate_rows events recieved from the master
in its own binary log (sensible only in pair with log-slave-updates option).
* Variable Name: replicate_annotate_rows_events
* Scope: Global
* Access Type: Read only
* Data Type: bool
* Default Value: OFF
NOTE. Why do we additionally need this 'replicate' option? Why not to make
the slave to reproduce this events when its binlog-annotate-rows-events
global value is ON? Well, because, for example, we may want to configure
the slave which should reproduce Annotate_rows events but has global
binlog-annotate-rows-events = OFF meaning this to be the default value for
the client threads (see also "How slave treats replicate-annotate-rows-events
option" in LLD part).
4. mysqlbinlog option: --print-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
With this option, mysqlbinlog prints the content of Annotate_rows events (if
the binary log does contain them). Without this option (i.e. by default),
mysqlbinlog skips Annotate_rows events.
5. mysqlbinlog output
~~~~~~~~~~~~~~~~~~~~~
With --print-annotate-rows-events, mysqlbinlog outputs Annotate_rows events
in a form like this:
...
# at 1646
#091219 12:45:26 server id 100 end_log_pos 1714 Query thread_id=1
exec_time=0 error_code=0
SET TIMESTAMP=1261215926/*!*/;
BEGIN
/*!*/;
# at 1714
# at 1812
# at 1853
# at 1894
# at 1938
#091219 12:45:26 server id 100 end_log_pos 1812 Query: `DELETE t1, t2 FROM
t1 INNER JOIN t2 INNER JOIN t3 WHERE t1.a=t2.a AND t2.a=t3.a`
#091219 12:45:26 server id 100 end_log_pos 1853 Table_map: `test`.`t1`
mapped to number 16
#091219 12:45:26 server id 100 end_log_pos 1894 Table_map: `test`.`t2`
mapped to number 17
#091219 12:45:26 server id 100 end_log_pos 1938 Delete_rows: table id 16
#091219 12:45:26 server id 100 end_log_pos 1982 Delete_rows: table id 17
flags: STMT_END_F
...
LOW-LEVEL DESIGN:
Content
~~~~~~~
1. Annotate_rows event number
2. Outline of Annotate_rows event behavior
3. How Master writes Annotate_rows events to the binary log
4. How slave treats replicate-annotate-rows-events option
5. How slave IO thread requests Annotate_rows events
6. How master executes the request
7. How slave SQL thread processes Annotate_rows events
8. General remarks
1. Annotate_rows event number
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To avoid possible event numbers conflict with MySQL/Sun, we leave a gap
between the last MySQL event number and the Annotate_rows event number:
enum Log_event_type
{ ...
INCIDENT_EVENT= 26,
// New MySQL event numbers are to be added here
MYSQL_EVENTS_END,
MARIA_EVENTS_BEGIN= 51,
// New Maria event numbers start from here
ANNOTATE_ROWS_EVENT= 51,
ENUM_END_EVENT
};
together with the corresponding extension of 'post_header_len' array in the
Format description event. (This extension does not affect the compatibility
of the binary log). Here is how Format description event looks like with
this extension:
************************
FORMAT_DESCRIPTION_EVENT
************************
00000004 | A1 A0 2C 4B | time_when = 1261215905
00000008 | 0F | event_type = 15
00000009 | 64 00 00 00 | server_id = 100
0000000D | 7F 00 00 00 | event_len = 127
00000011 | 83 00 00 00 | log_pos = 00000083
00000015 | 01 00 | flags = LOG_EVENT_BINLOG_IN_USE_F
------------------------
00000017 | 04 00 | binlog_ver = 4
00000019 | 35 2E 32 2E | server_ver = 5.2.0-MariaDB-alpha-debug-log
..... ...
0000004B | A1 A0 2C 4B | time_created = 1261215905
0000004F | 13 | common_header_len = 19
------------------------
post_header_len
------------------------
00000050 | 38 | 56 - START_EVENT_V3 [1]
..... ...
00000069 | 02 | 2 - INCIDENT_EVENT [26]
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Each Annotate_rows_log_event object has two private members describing the
corresponding query:
char *m_query_txt;
uint m_query_len;
When the object is created for writing to a binary log, this query is taken
from 'thd' (for short, below we omit the 'Annotate_rows_log_event::' prefix
as well as other implementation details):
Annotate_rows_log_event(THD *thd)
{
m_query_txt = thd->query();
m_query_len = thd->query_length();
}
When the object is read from a binary log, the query is taken from the buffer
containing the binary log representation of the event (this buffer is allocated
in Log_event object from which all Log events are derived):
Annotate_rows_log_event(char *buf, uint event_len,
Format_description_log_event *desc)
{
m_query_len = event_len - desc->common_header_len;
m_query_txt = buf + desc->common_header_len;
}
The events are written to the binary log by the Log_event::write() member
which calls virtual write_data_header() and write_data_body() members
("data header" and "post header" are synonym in replication terminology).
In our case, data header is empty and data body is just the query:
bool write_data_body(IO_CACHE *file)
{
return my_b_safe_write(file, (uchar*) m_query_txt, m_query_len);
}
Printing the event is just printing the query:
void Annotate_rows_log_event::print(FILE *file, PRINT_EVENT_INFO *pinfo)
{
my_b_printf(&pinfo->head_cache, "\tQuery: `%s`\n", m_query_txt);
}
3. How Master writes Annotate_rows events to the binary log
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The event is written to the binary log just before the group of Table_map
events which precede corresponding Rows events (one query may generate
several Table map events in the binary log, but the corresponding
Annotate_rows event must be written only once before the first Table map
event; hence the boolean variable 'with_annotate' below):
int write_locked_table_maps(THD *thd)
{ ...
bool with_annotate= thd->variables.binlog_annotate_rows_events;
...
for (uint i= 0; i < ... <number of tables> ...; ++i)
{ ...
thd->binlog_write_table_map(table, ..., with_annotate);
with_annotate= 0; // write Annotate_event not more than once
...
}
...
}
int THD::binlog_write_table_map(TABLE *table, ..., bool with_annotate)
{ ...
Table_map_log_event the_event(...);
...
if (with_annotate)
{
Annotate_rows_log_event anno(this);
mysql_bin_log.write(&anno);
}
mysql_bin_log.write(&the_event);
...
}
4. How slave treats replicate-annotate-rows-events option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The replicate-annotate-rows-events option is treated just as the session
value of the binlog_annotate_rows_events variable for the slave IO and
SQL threads. This setting is done during initialization of these threads:
pthread_handler_t handle_slave_io(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_IO);
...
}
pthread_handler_t handle_slave_sql(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_SQL);
...
}
int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type)
{ ...
thd->variables.binlog_annotate_rows_events=
opt_replicate_annotate_rows_events;
...
}
5. How slave IO thread requests Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When requesting an event, the slave should inform the master whether
it should send Annotate_rows events or not. To that end we add a new
BINLOG_SEND_ANNOTATE_ROWS_EVENT flag used when requesting an event:
#define BINLOG_DUMP_NON_BLOCK 1
#define BINLOG_SEND_ANNOTATE_ROWS_EVENT 2
pthread_handler_t handle_slave_io(void *arg)
{ ...
request_dump(mysql, ...);
...
}
int request_dump(MYSQL* mysql, ...)
{ ...
if (opt_log_slave_updates &&
mi->io_thd->variables.binlog_annotate_rows_events)
binlog_flags|= BINLOG_SEND_ANNOTATE_ROWS_EVENT;
...
int2store(buf + 4, binlog_flags);
...
simple_command(mysql, COM_BINLOG_DUMP, buf, ...);
...
}
NOTE. mysqlbinlog, when remotely requesting BINLOG_DUMP by calling the
simple_command() function, should also use this flag if it wants (in case
of the --print-annotate-rows-events option set) to recieve Annotate_rows
events.
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
case COM_BINLOG_DUMP:
{ ...
flags= uint2korr(packet + 4);
...
mysql_binlog_send(thd, ..., flags);
...
}
void mysql_binlog_send(THD* thd, ..., ushort flags)
{ ...
Log_event::read_log_event(&log, packet, ...);
...
if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT)
{
my_net_write(net, packet->ptr(), packet->length());
}
...
}
7. How slave SQL thread processes Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The slave processes each recieved event by "applying" it, i.e. by
calling the Log_event::apply_event() function which in turn calls
the virtual do_apply_event() member specific for each type of the
event.
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev = next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
int apply_event_and_update_pos(Log_event *ev, ...)
{ ...
ev->apply_event(...);
...
}
int Log_event::apply_event(...)
{
return do_apply_event(...);
}
What does it mean to "apply" an Annotate_rows event? It means to set current
thd query to that of the described by the event, i.e. to the query which
caused the subsequent Rows events (see "How Master writes Annotate_rows
events to the binary log" to follow what happens further when the subsequent
Rows events are applied):
int Annotate_rows_log_event::do_apply_event(...)
{
thd->set_query(m_query_txt, m_query_len);
}
NOTE. I am not sure, but possibly current values of thd->query and
thd->query_length should be saved before calling set_query() and to be
restored on the Annotate_rows_log_event object deletion.
Is it really needed ?
After calling this do_apply_event() function we may not delete the
Annotate_rows_log_event object immediatedly (see exec_relay_log_event()
above) because thd->query now points to the string inside this object.
We may keep the pointer to this object in the Relay_log_info:
class Relay_log_info
{
public:
...
void set_annotate_event(Annotate_rows_log_event*);
Annotate_rows_log_event* get_annotate_event();
void free_annotate_event();
...
private:
Annotate_rows_log_event* m_annotate_event;
};
The saved Annotate_rows object should be deleted when all corresponding
Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (rli->get_annotate_event() && is_last_rows_event(ev))
rli->free_annotate_event();
else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
where
bool is_last_rows_event(Log_event* ev)
{
Log_event_type type= ev->get_type_code();
if (IS_ROWS_EVENT_TYPE(type))
{
Rows_log_event* rows= (Rows_log_event*)ev;
return rows->get_flags(Rows_log_event::STMT_END_F);
}
return 0;
}
#define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
(type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
Kristian noticed that introducing new log event type should be coordinated
somehow with MySQL/Sun:
Kristian: The numeric code for this event must be assigned carefully.
It should be coordinated with MySQL/Sun, otherwise we can get into a
situation where MySQL uses the same numeric code for one event that
MariaDB uses for ANNOTATE_ROWS_EVENT, which would make merging the two
impossible.
Alex: I reserved about 20 numbers not to have possible conflicts
with MySQL.
Kristian: Still, I think it would be appropriate to send a polite email
to internals(a)lists.mysql.com about this and suggesting to reserve the
event number.
Also we should notice the introduction of the BINLOG_SEND_ANNOTATE_ROWS_EVENT
flag taking into account that MySQL/Sun may also introduce a flag with the
same value to be used in the request_dump-mysql_binlog_send interface.
But this is mainly the question of merging: if a conflict concerning this
flag occur, we may simply change the BINLOG_SEND_ANNOTATE_ROWS_EVENT value
(this does not require additional changes in the code).
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0
[Maria-developers] Updated (by Alexi): Store in binlog text of statements that caused RBR events (47)
by worklog-noreply@askmonty.org 20 Dec '09
by worklog-noreply@askmonty.org 20 Dec '09
20 Dec '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Store in binlog text of statements that caused RBR events
CREATION DATE..: Sat, 15 Aug 2009, 23:48
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Knielsen
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 47 (http://askmonty.org/worklog/?tid=47)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 20
ESTIMATE.......: 35 (hours remain)
ORIG. ESTIMATE.: 35
PROGRESS NOTES:
-=-=(Alexi - Sun, 20 Dec 2009, 13:14)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.11350 2009-12-20 13:14:04.000000000 +0200
+++ /tmp/wklog.47.new.11350 2009-12-20 13:14:04.000000000 +0200
@@ -282,23 +282,18 @@
Annotate_rows_log_event* m_annotate_event;
};
-When the saved Annotate_rows object may be deleted? When all corresponding
-Rows events will be processed, i.e. before processing the first non-Rows
-event (note that Annotate_rows object resides in the binary log *after*
-the (possible) 'BEGIN' Query event which accompanies the rows events; note
-also that this deletion is adjusted with the case when some or all
-corresponding Rows events are filtered out by replicate filter rules):
+The saved Annotate_rows object should be deleted when all corresponding
+Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
- if (rli->get_annotate_event() && !IS_RBR_EVENT_TYPE(ev->get_type_code()))
- rli->free_annotate_event();
-
apply_event_and_update_pos(ev, ...);
- if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
+ if (rli->get_annotate_event() && is_last_rows_event(ev))
+ rli->free_annotate_event();
+ else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
@@ -307,10 +302,21 @@
where
- #define IS_RBR_EVENT_TYPE(type) ( (type) == TABLE_MAP_EVENT || \
- (type) == WRITE_ROWS_EVENT || \
+ bool is_last_rows_event(Log_event* ev)
+ {
+ Log_event_type type= ev->get_type_code();
+ if (IS_ROWS_EVENT_TYPE(type))
+ {
+ Rows_log_event* rows= (Rows_log_event*)ev;
+ return rows->get_flags(Rows_log_event::STMT_END_F);
+ }
+
+ return 0;
+ }
+
+ #define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
- (type) == DELETE_ROWS_EVENT )
+ (type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
-=-=(Alexi - Sun, 20 Dec 2009, 09:29)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.32726 2009-12-20 07:29:56.000000000 +0000
+++ /tmp/wklog.47.new.32726 2009-12-20 07:29:56.000000000 +0000
@@ -56,7 +56,7 @@
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
- 00000082 | 00 | 0 - ANNOTATE_RBR_EVENT [51]
+ 00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
-=-=(Alexi - Sat, 19 Dec 2009, 16:10)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.16051 2009-12-19 16:10:48.000000000 +0200
+++ /tmp/wklog.47.new.16051 2009-12-19 16:10:48.000000000 +0200
@@ -253,7 +253,7 @@
thd query to that of the described by the event, i.e. to the query which
caused the subsequent Rows events (see "How Master writes Annotate_rows
events to the binary log" to follow what happens further when the subsequent
-Rows events is applied):
+Rows events are applied):
int Annotate_rows_log_event::do_apply_event(...)
{
-=-=(Alexi - Sat, 19 Dec 2009, 16:02)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.15695 2009-12-19 16:02:33.000000000 +0200
+++ /tmp/wklog.47.new.15695 2009-12-19 16:02:33.000000000 +0200
@@ -12,7 +12,7 @@
post-header and contains the query text in its data part. Example:
************************
- ANNOTATE_RBR_EVENT
+ ANNOTATE_ROWS_EVENT
************************
00000220 | B6 A0 2C 4B | time_when = 1261215926
00000224 | 33 | event_type = 51
-=-=(Alexi - Sat, 19 Dec 2009, 15:58)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.15437 2009-12-19 13:58:12.000000000 +0000
+++ /tmp/wklog.47.new.15437 2009-12-19 13:58:12.000000000 +0000
@@ -1 +1,337 @@
+Content
+~~~~~~~
+ 1. Annotate_rows event number
+ 2. Outline of Annotate_rows event behavior
+ 3. How Master writes Annotate_rows events to the binary log
+ 4. How slave treats replicate-annotate-rows-events option
+ 5. How slave IO thread requests Annotate_rows events
+ 6. How master executes the request
+ 7. How slave SQL thread processes Annotate_rows events
+ 8. General remarks
+
+1. Annotate_rows event number
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+To avoid possible event numbers conflict with MySQL/Sun, we leave a gap
+between the last MySQL event number and the Annotate_rows event number:
+
+ enum Log_event_type
+ { ...
+ INCIDENT_EVENT= 26,
+ // New MySQL event numbers are to be added here
+ MYSQL_EVENTS_END,
+
+ MARIA_EVENTS_BEGIN= 51,
+ // New Maria event numbers start from here
+ ANNOTATE_ROWS_EVENT= 51,
+
+ ENUM_END_EVENT
+ };
+
+together with the corresponding extension of 'post_header_len' array in the
+Format description event. (This extension does not affect the compatibility
+of the binary log). Here is how Format description event looks like with
+this extension:
+
+ ************************
+ FORMAT_DESCRIPTION_EVENT
+ ************************
+ 00000004 | A1 A0 2C 4B | time_when = 1261215905
+ 00000008 | 0F | event_type = 15
+ 00000009 | 64 00 00 00 | server_id = 100
+ 0000000D | 7F 00 00 00 | event_len = 127
+ 00000011 | 83 00 00 00 | log_pos = 00000083
+ 00000015 | 01 00 | flags = LOG_EVENT_BINLOG_IN_USE_F
+ ------------------------
+ 00000017 | 04 00 | binlog_ver = 4
+ 00000019 | 35 2E 32 2E | server_ver = 5.2.0-MariaDB-alpha-debug-log
+ ..... ...
+ 0000004B | A1 A0 2C 4B | time_created = 1261215905
+ 0000004F | 13 | common_header_len = 19
+ ------------------------
+ post_header_len
+ ------------------------
+ 00000050 | 38 | 56 - START_EVENT_V3 [1]
+ ..... ...
+ 00000069 | 02 | 2 - INCIDENT_EVENT [26]
+ 0000006A | 00 | 0 - RESERVED [27]
+ ..... ...
+ 00000081 | 00 | 0 - RESERVED [50]
+ 00000082 | 00 | 0 - ANNOTATE_RBR_EVENT [51]
+ ************************
+
+2. Outline of Annotate_rows event behavior
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Each Annotate_rows_log_event object has two private members describing the
+corresponding query:
+
+ char *m_query_txt;
+ uint m_query_len;
+
+When the object is created for writing to a binary log, this query is taken
+from 'thd' (for short, below we omit the 'Annotate_rows_log_event::' prefix
+as well as other implementation details):
+
+ Annotate_rows_log_event(THD *thd)
+ {
+ m_query_txt = thd->query();
+ m_query_len = thd->query_length();
+ }
+
+When the object is read from a binary log, the query is taken from the buffer
+containing the binary log representation of the event (this buffer is allocated
+in Log_event object from which all Log events are derived):
+
+ Annotate_rows_log_event(char *buf, uint event_len,
+ Format_description_log_event *desc)
+ {
+ m_query_len = event_len - desc->common_header_len;
+ m_query_txt = buf + desc->common_header_len;
+ }
+
+The events are written to the binary log by the Log_event::write() member
+which calls virtual write_data_header() and write_data_body() members
+("data header" and "post header" are synonym in replication terminology).
+In our case, data header is empty and data body is just the query:
+
+ bool write_data_body(IO_CACHE *file)
+ {
+ return my_b_safe_write(file, (uchar*) m_query_txt, m_query_len);
+ }
+
+Printing the event is just printing the query:
+
+ void Annotate_rows_log_event::print(FILE *file, PRINT_EVENT_INFO *pinfo)
+ {
+ my_b_printf(&pinfo->head_cache, "\tQuery: `%s`\n", m_query_txt);
+ }
+
+3. How Master writes Annotate_rows events to the binary log
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The event is written to the binary log just before the group of Table_map
+events which precede corresponding Rows events (one query may generate
+several Table map events in the binary log, but the corresponding
+Annotate_rows event must be written only once before the first Table map
+event; hence the boolean variable 'with_annotate' below):
+
+ int write_locked_table_maps(THD *thd)
+ { ...
+ bool with_annotate= thd->variables.binlog_annotate_rows_events;
+ ...
+ for (uint i= 0; i < ... <number of tables> ...; ++i)
+ { ...
+ thd->binlog_write_table_map(table, ..., with_annotate);
+ with_annotate= 0; // write Annotate_event not more than once
+ ...
+ }
+ ...
+ }
+
+ int THD::binlog_write_table_map(TABLE *table, ..., bool with_annotate)
+ { ...
+ Table_map_log_event the_event(...);
+ ...
+ if (with_annotate)
+ {
+ Annotate_rows_log_event anno(this);
+ mysql_bin_log.write(&anno);
+ }
+
+ mysql_bin_log.write(&the_event);
+ ...
+ }
+
+4. How slave treats replicate-annotate-rows-events option
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The replicate-annotate-rows-events option is treated just as the session
+value of the binlog_annotate_rows_events variable for the slave IO and
+SQL threads. This setting is done during initialization of these threads:
+
+ pthread_handler_t handle_slave_io(void *arg)
+ {
+ THD *thd= new THD;
+ ...
+ init_slave_thread(thd, SLAVE_THD_IO);
+ ...
+ }
+
+ pthread_handler_t handle_slave_sql(void *arg)
+ {
+ THD *thd= new THD;
+ ...
+ init_slave_thread(thd, SLAVE_THD_SQL);
+ ...
+ }
+
+ int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type)
+ { ...
+ thd->variables.binlog_annotate_rows_events=
+ opt_replicate_annotate_rows_events;
+ ...
+ }
+
+5. How slave IO thread requests Annotate_rows events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+When requesting an event, the slave should inform the master whether
+it should send Annotate_rows events or not. To that end we add a new
+BINLOG_SEND_ANNOTATE_ROWS_EVENT flag used when requesting an event:
+
+ #define BINLOG_DUMP_NON_BLOCK 1
+ #define BINLOG_SEND_ANNOTATE_ROWS_EVENT 2
+
+ pthread_handler_t handle_slave_io(void *arg)
+ { ...
+ request_dump(mysql, ...);
+ ...
+ }
+
+ int request_dump(MYSQL* mysql, ...)
+ { ...
+ if (opt_log_slave_updates &&
+ mi->io_thd->variables.binlog_annotate_rows_events)
+ binlog_flags|= BINLOG_SEND_ANNOTATE_ROWS_EVENT;
+ ...
+ int2store(buf + 4, binlog_flags);
+ ...
+ simple_command(mysql, COM_BINLOG_DUMP, buf, ...);
+ ...
+ }
+
+6. How master executes the request
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ case COM_BINLOG_DUMP:
+ { ...
+ flags= uint2korr(packet + 4);
+ ...
+ mysql_binlog_send(thd, ..., flags);
+ ...
+ }
+
+ void mysql_binlog_send(THD* thd, ..., ushort flags)
+ { ...
+ Log_event::read_log_event(&log, packet, ...);
+ ...
+ if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
+ flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT ||
+ thd->server_id == 0 /* slave == mysqlbinlog */ )
+ {
+ my_net_write(net, packet->ptr(), packet->length());
+ }
+ ...
+ }
+
+7. How slave SQL thread processes Annotate_rows events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The slave processes each recieved event by "applying" it, i.e. by
+calling the Log_event::apply_event() function which in turn calls
+the virtual do_apply_event() member specific for each type of the
+event.
+
+ int exec_relay_log_event(THD* thd, Relay_log_info* rli)
+ { ...
+ Log_event *ev = next_event(rli);
+ ...
+ apply_event_and_update_pos(ev, ...);
+
+ if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
+ delete ev;
+ ...
+ }
+
+ int apply_event_and_update_pos(Log_event *ev, ...)
+ { ...
+ ev->apply_event(...);
+ ...
+ }
+
+ int Log_event::apply_event(...)
+ {
+ return do_apply_event(...);
+ }
+
+What does it mean to "apply" an Annotate_rows event? It means to set current
+thd query to that of the described by the event, i.e. to the query which
+caused the subsequent Rows events (see "How Master writes Annotate_rows
+events to the binary log" to follow what happens further when the subsequent
+Rows events is applied):
+
+ int Annotate_rows_log_event::do_apply_event(...)
+ {
+ thd->set_query(m_query_txt, m_query_len);
+ }
+
+NOTE. I am not sure, but possibly current values of thd->query and
+thd->query_length should be saved before calling set_query() and to be
+restored on the Annotate_rows_log_event object deletion.
+Is it really needed ?
+
+After calling this do_apply_event() function we may not delete the
+Annotate_rows_log_event object immediatedly (see exec_relay_log_event()
+above) because thd->query now points to the string inside this object.
+We may keep the pointer to this object in the Relay_log_info:
+
+ class Relay_log_info
+ {
+ public:
+ ...
+ void set_annotate_event(Annotate_rows_log_event*);
+ Annotate_rows_log_event* get_annotate_event();
+ void free_annotate_event();
+ ...
+ private:
+ Annotate_rows_log_event* m_annotate_event;
+ };
+
+When the saved Annotate_rows object may be deleted? When all corresponding
+Rows events will be processed, i.e. before processing the first non-Rows
+event (note that Annotate_rows object resides in the binary log *after*
+the (possible) 'BEGIN' Query event which accompanies the rows events; note
+also that this deletion is adjusted with the case when some or all
+corresponding Rows events are filtered out by replicate filter rules):
+
+ int exec_relay_log_event(THD* thd, Relay_log_info* rli)
+ { ...
+ Log_event *ev= next_event(rli);
+ ...
+ if (rli->get_annotate_event() && !IS_RBR_EVENT_TYPE(ev->get_type_code()))
+ rli->free_annotate_event();
+
+ apply_event_and_update_pos(ev, ...);
+
+ if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
+ rli->set_annotate_event((Annotate_rows_log_event*) ev);
+ else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
+ delete ev;
+ ...
+ }
+
+where
+
+ #define IS_RBR_EVENT_TYPE(type) ( (type) == TABLE_MAP_EVENT || \
+ (type) == WRITE_ROWS_EVENT || \
+ (type) == UPDATE_ROWS_EVENT || \
+ (type) == DELETE_ROWS_EVENT )
+
+8. General remarks
+~~~~~~~~~~~~~~~~~~
+Kristian noticed that introducing new log event type should be coordinated
+somehow with MySQL/Sun:
+
+ Kristian: The numeric code for this event must be assigned carefully.
+ It should be coordinated with MySQL/Sun, otherwise we can get into a
+ situation where MySQL uses the same numeric code for one event that
+ MariaDB uses for ANNOTATE_ROWS_EVENT, which would make merging the two
+ impossible.
+ Alex: I reserved about 20 numbers not to have possible conflicts
+ with MySQL.
+ Kristian: Still, I think it would be appropriate to send a polite email
+ to internals(a)lists.mysql.com about this and suggesting to reserve the
+ event number.
+
+Also we should notice the introduction of the BINLOG_SEND_ANNOTATE_ROWS_EVENT
+flag taking into account that MySQL/Sun may also introduce a flag with the
+same value to be used in the request_dump-mysql_binlog_send interface.
+But this is mainly the question of merging: if a conflict concerning this
+flag occur, we may simply change the BINLOG_SEND_ANNOTATE_ROWS_EVENT value
+(this does not require additional changes in the code).
-=-=(Alexi - Sat, 19 Dec 2009, 15:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.14545 2009-12-19 15:41:21.000000000 +0200
+++ /tmp/wklog.47.new.14545 2009-12-19 15:41:21.000000000 +0200
@@ -1,122 +1,107 @@
-First suggestion:
-
-> I think for this we would actually need a new binlog event type
-> (Comment_log_event?). Unless we want to log an empty statement Query_log_event
-> containing only a comment (a bit of a hack).
-
-New server option
-~~~~~~~~~~~~~~~~~
- --binlog-annotate-rows-events
-
-Setting this option makes RBR (rows-) events in the binary log to be
-preceded by Annotate rows events (see below). The corresponding
-'binlog_annotate_rows_events' system variable is dynamic and has both
-global and session values. Default global value is OFF.
-
-Note. Session values are usefull to make it possible to annotate only
- some selected statements:
+Content
+~~~~~~~
+ 1. Annotate_rows_log_event
+ 2. Server option: --binlog-annotate-rows-events
+ 3. Server option: --replicate-annotate-rows-events
+ 4. mysqlbinlog option: --print-annotate-rows-events
+ 5. mysqlbinlog output
+
+1. Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Describes the query which caused the corresponding rows events. Has empty
+post-header and contains the query text in its data part. Example:
+
+ ************************
+ ANNOTATE_RBR_EVENT
+ ************************
+ 00000220 | B6 A0 2C 4B | time_when = 1261215926
+ 00000224 | 33 | event_type = 51
+ 00000225 | 64 00 00 00 | server_id = 100
+ 00000229 | 36 00 00 00 | event_len = 54
+ 0000022D | 56 02 00 00 | log_pos = 00000256
+ 00000231 | 00 00 | flags = <none>
+ ------------------------
+ 00000233 | 49 4E 53 45 | query = "INSERT INTO t1 VALUES (1), (2), (3)"
+ 00000237 | 52 54 20 49 |
+ 0000023B | 4E 54 4F 20 |
+ 0000023F | 74 31 20 56 |
+ 00000243 | 41 4C 55 45 |
+ 00000247 | 53 20 28 31 |
+ 0000024B | 29 2C 20 28 |
+ 0000024F | 32 29 2C 20 |
+ 00000253 | 28 33 29 |
+ ************************
+
+In binary log, Annotate_rows event follows the (possible) 'BEGIN' Query event
+and precedes the first of Table map events which accompany the corresponding
+rows events. (See example in the "mysqlbinlog output" section below.)
+
+2. Server option: --binlog-annotate-rows-events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Tells the master to write Annotate_rows events to the binary log.
+
+ * Variable Name: binlog_annotate_rows_events
+ * Scope: Global & Session
+ * Access Type: Dynamic
+ * Data Type: bool
+ * Default Value: OFF
+NOTE. Session values allows to annotate only some selected statements:
...
SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
-New binlog event type
-~~~~~~~~~~~~~~~~~~~~~
- Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
-
-Describes the query which caused the corresponding rows event. In binary log,
-precedes each Table_map_log_event. Contains empty post-header and the query
-text in its data part.
-
-The numeric code for this event must be assigned carefully. It should be
-coordinated with MySQL/Sun, otherwise we can get into a situation where MySQL
-uses the same numeric code for one event that MariaDB uses for
-ANNOTATE_ROWS_EVENT, which would make merging the two impossible.
-
-Example:
-
- ...
- ************************
- ANNOTATE_ROWS_EVENT [51]
- ************************
- 000000C7 | 54 1B 12 4B | time_when = 1259477844
- 000000CB | 33 | event_type = 51
- 000000CC | 64 00 00 00 | server_id = 100
- 000000D0 | 2C 00 00 00 | event_len = 44
- 000000D4 | F3 00 00 00 | log_pos = 000000F3
- 000000D8 | 00 00 | flags = <none>
- ------------------------
- 000000DA | 69 6E 73 65 | query = "insert into t1 values (1)"
- 000000DE | 72 74 20 69 |
- 000000E2 | 6E 74 6F 20 |
- 000000E6 | 74 31 20 76 |
- 000000EA | 61 6C 75 65 |
- 000000EE | 73 20 28 31 |
- 000000F2 | 29 |
- ************************
- TABLE_MAP_EVENT [19]
- ************************
- 000000F3 | 54 1B 12 4B | time_when = 1259477844
- 000000F7 | 13 | event_type = 19
- 000000F8 | 64 00 00 00 | server_id = 100
- 000000FC | 29 00 00 00 | event_len = 41
- 00000100 | 1C 01 00 00 | log_pos = 0000011C
- 00000104 | 00 00 | flags = <none>
- ------------------------
- ...
- ************************
- WRITE_ROWS_EVENT [23]
- ************************
- 0000011C | 54 1B 12 4B | time_when = 1259477844
- 00000120 | 17 | event_type = 23
- 00000121 | 64 00 00 00 | server_id = 100
- 00000125 | 22 00 00 00 | event_len = 34
- 00000129 | 3E 01 00 00 | log_pos = 0000013E
- 0000012D | 10 00 | flags = LOG_EVENT_UPDATE_TABLE_MAP_VERSION_F
- ------------------------
- 0000012F | 0F 00 00 00 | table_id = 15
- ...
-
-New mysqlbinlog option
-~~~~~~~~~~~~~~~~~~~~~~
- --print-annotate-rows-events
-
-With this option, mysqlbinlog prints the content of Annotate-rows
-events (if the binary log does contain them). Without this option
-(i.e. by default), mysqlbinlog skips Annotate rows events.
-
-
-mysqlbinlog output
-~~~~~~~~~~~~~~~~~~
-Something like this:
+3. Server option: --replicate-annotate-rows-events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Tells the slave to reproduce Annotate_rows events recieved from the master
+in its own binary log (sensible only in pair with log-slave-updates option).
+
+ * Variable Name: replicate_annotate_rows_events
+ * Scope: Global
+ * Access Type: Read only
+ * Data Type: bool
+ * Default Value: OFF
+
+NOTE. Why do we additionally need this 'replicate' option? Why not to make
+the slave to reproduce this events when its binlog-annotate-rows-events
+global value is ON? Well, because, for example, we may want to configure
+the slave which should reproduce Annotate_rows events but has global
+binlog-annotate-rows-events = OFF meaning this to be the default value for
+the client threads (see also "How slave treats replicate-annotate-rows-events
+option" in LLD part).
+
+4. mysqlbinlog option: --print-annotate-rows-events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+With this option, mysqlbinlog prints the content of Annotate_rows events (if
+the binary log does contain them). Without this option (i.e. by default),
+mysqlbinlog skips Annotate_rows events.
+5. mysqlbinlog output
+~~~~~~~~~~~~~~~~~~~~~
+With --print-annotate-rows-events, mysqlbinlog outputs Annotate_rows events
+in a form like this:
...
- # at 199
- # at 243
- # at 284
- #091129 9:57:24 server id 100 end_log_pos 243 Query: `insert into t1 values
-(1)`
- #091129 9:57:24 server id 100 end_log_pos 284 Table_map: `test`.`t1` mapped
-to number 15
- #091129 9:57:24 server id 100 end_log_pos 318 Write_rows: table id 15
+ # at 1646
+ #091219 12:45:26 server id 100 end_log_pos 1714 Query thread_id=1
+exec_time=0 error_code=0
+ SET TIMESTAMP=1261215926/*!*/;
+ BEGIN
+ /*!*/;
+ # at 1714
+ # at 1812
+ # at 1853
+ # at 1894
+ # at 1938
+ #091219 12:45:26 server id 100 end_log_pos 1812 Query: `DELETE t1, t2 FROM
+t1 INNER JOIN t2 INNER JOIN t3 WHERE t1.a=t2.a AND t2.a=t3.a`
+ #091219 12:45:26 server id 100 end_log_pos 1853 Table_map: `test`.`t1`
+mapped to number 16
+ #091219 12:45:26 server id 100 end_log_pos 1894 Table_map: `test`.`t2`
+mapped to number 17
+ #091219 12:45:26 server id 100 end_log_pos 1938 Delete_rows: table id 16
+ #091219 12:45:26 server id 100 end_log_pos 1982 Delete_rows: table id 17
flags: STMT_END_F
-
- BINLOG '
- VBsSSzNkAAAALAAAAPMAAAAAAGluc2VydCBpbnRvIHQxIHZhbHVlcyAoMSk=
- VBsSSxNkAAAAKQAAABwBAAAAAA8AAAAAAAAABHRlc3QAAnQxAAEDAAE=
- VBsSSxdkAAAAIgAAAD4BAAAQAA8AAAAAAAEAAf/+AQAAAA==
- '/*!*/;
- ### INSERT INTO test.t1
- ### SET
- ### @1=1 /* INT meta=0 nullable=1 is_null=0 */
...
-When master sends Annotate rows events
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-1. Master always sends Annotate_rows events to mysqlbinlog (in
- remote case).
-2. Master sends Annotate_rows events to a slave only if the slave has
- both log-slave-updates and binlog-annotate-rows-events options set.
-
-=-=(Bothorsen - Fri, 18 Dec 2009, 16:22)=-=-
Add estimation time.
Worked 5 hours and estimate 35 hours remain (original estimate increased by 5 hours).
-=-=(Bothorsen - Fri, 18 Dec 2009, 16:16)=-=-
This is the work done on this patch so far. Most of it done by Alex.
Worked 15 hours and estimate 035 hours remain (original estimate increased by 50 hours).
-=-=(Alexi - Fri, 04 Dec 2009, 13:00)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.14001 2009-12-04 13:00:24.000000000 +0200
+++ /tmp/wklog.47.new.14001 2009-12-04 13:00:24.000000000 +0200
@@ -6,27 +6,27 @@
New server option
~~~~~~~~~~~~~~~~~
- --binlog-annotate-row-events
+ --binlog-annotate-rows-events
-Setting this option makes RBR (row-) events in the binary log to be
+Setting this option makes RBR (rows-) events in the binary log to be
preceded by Annotate rows events (see below). The corresponding
-'binlog_annotate_row_events' system variable is dynamic and has both
+'binlog_annotate_rows_events' system variable is dynamic and has both
global and session values. Default global value is OFF.
Note. Session values are usefull to make it possible to annotate only
some selected statements:
...
- SET SESSION binlog_annotate_row_events=ON;
+ SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
- SET SESSION binlog_annotate_row_events=OFF;
+ SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
New binlog event type
~~~~~~~~~~~~~~~~~~~~~
Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
-Describes the query which caused the corresponding row event. In binary log,
+Describes the query which caused the corresponding rows event. In binary log,
precedes each Table_map_log_event. Contains empty post-header and the query
text in its data part.
@@ -79,6 +79,15 @@
0000012F | 0F 00 00 00 | table_id = 15
...
+New mysqlbinlog option
+~~~~~~~~~~~~~~~~~~~~~~
+ --print-annotate-rows-events
+
+With this option, mysqlbinlog prints the content of Annotate-rows
+events (if the binary log does contain them). Without this option
+(i.e. by default), mysqlbinlog skips Annotate rows events.
+
+
mysqlbinlog output
~~~~~~~~~~~~~~~~~~
Something like this:
@@ -109,5 +118,5 @@
1. Master always sends Annotate_rows events to mysqlbinlog (in
remote case).
2. Master sends Annotate_rows events to a slave only if the slave has
- both log-slave-updates and binlog-annotate-row-events options set.
+ both log-slave-updates and binlog-annotate-rows-events options set.
-=-=(Alexi - Wed, 02 Dec 2009, 13:32)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.17456 2009-12-02 13:32:18.000000000 +0200
+++ /tmp/wklog.47.new.17456 2009-12-02 13:32:18.000000000 +0200
@@ -1,8 +1 @@
-mysql_binlog_send() [sql/sql_repl.cc]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-1. When sending events to a slave, master should simply skip
- Annotate_rows events (they are not needed for replication).
- [ ??? Multi-master - currently not clear ]
-2. When sending events to mysqlbinlog (remote case), master
- must send Annotate_rows events as well.
------------------------------------------------------------
-=-=(View All Progress Notes, 18 total)=-=-
http://askmonty.org/worklog/index.pl?tid=47&nolimit=1
DESCRIPTION:
Store in binlog (and show in mysqlbinlog output) texts of statements that
caused RBR events
This is needed for (list from Monty):
- Easier to understand why updates happened
- Would make it easier to find out where in application things went
wrong (as you can search for exact strings)
- Allow one to filter things based on comments in the statement.
The cost of this can be that the binlog will be approximately 2x in size
(especially insert of big blob's would be a bit painful), so this should
be an optional feature.
HIGH-LEVEL SPECIFICATION:
Content
~~~~~~~
1. Annotate_rows_log_event
2. Server option: --binlog-annotate-rows-events
3. Server option: --replicate-annotate-rows-events
4. mysqlbinlog option: --print-annotate-rows-events
5. mysqlbinlog output
1. Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Describes the query which caused the corresponding rows events. Has empty
post-header and contains the query text in its data part. Example:
************************
ANNOTATE_ROWS_EVENT
************************
00000220 | B6 A0 2C 4B | time_when = 1261215926
00000224 | 33 | event_type = 51
00000225 | 64 00 00 00 | server_id = 100
00000229 | 36 00 00 00 | event_len = 54
0000022D | 56 02 00 00 | log_pos = 00000256
00000231 | 00 00 | flags = <none>
------------------------
00000233 | 49 4E 53 45 | query = "INSERT INTO t1 VALUES (1), (2), (3)"
00000237 | 52 54 20 49 |
0000023B | 4E 54 4F 20 |
0000023F | 74 31 20 56 |
00000243 | 41 4C 55 45 |
00000247 | 53 20 28 31 |
0000024B | 29 2C 20 28 |
0000024F | 32 29 2C 20 |
00000253 | 28 33 29 |
************************
In binary log, Annotate_rows event follows the (possible) 'BEGIN' Query event
and precedes the first of Table map events which accompany the corresponding
rows events. (See example in the "mysqlbinlog output" section below.)
2. Server option: --binlog-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the master to write Annotate_rows events to the binary log.
* Variable Name: binlog_annotate_rows_events
* Scope: Global & Session
* Access Type: Dynamic
* Data Type: bool
* Default Value: OFF
NOTE. Session values allows to annotate only some selected statements:
...
SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
3. Server option: --replicate-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the slave to reproduce Annotate_rows events recieved from the master
in its own binary log (sensible only in pair with log-slave-updates option).
* Variable Name: replicate_annotate_rows_events
* Scope: Global
* Access Type: Read only
* Data Type: bool
* Default Value: OFF
NOTE. Why do we additionally need this 'replicate' option? Why not to make
the slave to reproduce this events when its binlog-annotate-rows-events
global value is ON? Well, because, for example, we may want to configure
the slave which should reproduce Annotate_rows events but has global
binlog-annotate-rows-events = OFF meaning this to be the default value for
the client threads (see also "How slave treats replicate-annotate-rows-events
option" in LLD part).
4. mysqlbinlog option: --print-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
With this option, mysqlbinlog prints the content of Annotate_rows events (if
the binary log does contain them). Without this option (i.e. by default),
mysqlbinlog skips Annotate_rows events.
5. mysqlbinlog output
~~~~~~~~~~~~~~~~~~~~~
With --print-annotate-rows-events, mysqlbinlog outputs Annotate_rows events
in a form like this:
...
# at 1646
#091219 12:45:26 server id 100 end_log_pos 1714 Query thread_id=1
exec_time=0 error_code=0
SET TIMESTAMP=1261215926/*!*/;
BEGIN
/*!*/;
# at 1714
# at 1812
# at 1853
# at 1894
# at 1938
#091219 12:45:26 server id 100 end_log_pos 1812 Query: `DELETE t1, t2 FROM
t1 INNER JOIN t2 INNER JOIN t3 WHERE t1.a=t2.a AND t2.a=t3.a`
#091219 12:45:26 server id 100 end_log_pos 1853 Table_map: `test`.`t1`
mapped to number 16
#091219 12:45:26 server id 100 end_log_pos 1894 Table_map: `test`.`t2`
mapped to number 17
#091219 12:45:26 server id 100 end_log_pos 1938 Delete_rows: table id 16
#091219 12:45:26 server id 100 end_log_pos 1982 Delete_rows: table id 17
flags: STMT_END_F
...
LOW-LEVEL DESIGN:
Content
~~~~~~~
1. Annotate_rows event number
2. Outline of Annotate_rows event behavior
3. How Master writes Annotate_rows events to the binary log
4. How slave treats replicate-annotate-rows-events option
5. How slave IO thread requests Annotate_rows events
6. How master executes the request
7. How slave SQL thread processes Annotate_rows events
8. General remarks
1. Annotate_rows event number
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To avoid possible event numbers conflict with MySQL/Sun, we leave a gap
between the last MySQL event number and the Annotate_rows event number:
enum Log_event_type
{ ...
INCIDENT_EVENT= 26,
// New MySQL event numbers are to be added here
MYSQL_EVENTS_END,
MARIA_EVENTS_BEGIN= 51,
// New Maria event numbers start from here
ANNOTATE_ROWS_EVENT= 51,
ENUM_END_EVENT
};
together with the corresponding extension of 'post_header_len' array in the
Format description event. (This extension does not affect the compatibility
of the binary log). Here is how Format description event looks like with
this extension:
************************
FORMAT_DESCRIPTION_EVENT
************************
00000004 | A1 A0 2C 4B | time_when = 1261215905
00000008 | 0F | event_type = 15
00000009 | 64 00 00 00 | server_id = 100
0000000D | 7F 00 00 00 | event_len = 127
00000011 | 83 00 00 00 | log_pos = 00000083
00000015 | 01 00 | flags = LOG_EVENT_BINLOG_IN_USE_F
------------------------
00000017 | 04 00 | binlog_ver = 4
00000019 | 35 2E 32 2E | server_ver = 5.2.0-MariaDB-alpha-debug-log
..... ...
0000004B | A1 A0 2C 4B | time_created = 1261215905
0000004F | 13 | common_header_len = 19
------------------------
post_header_len
------------------------
00000050 | 38 | 56 - START_EVENT_V3 [1]
..... ...
00000069 | 02 | 2 - INCIDENT_EVENT [26]
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Each Annotate_rows_log_event object has two private members describing the
corresponding query:
char *m_query_txt;
uint m_query_len;
When the object is created for writing to a binary log, this query is taken
from 'thd' (for short, below we omit the 'Annotate_rows_log_event::' prefix
as well as other implementation details):
Annotate_rows_log_event(THD *thd)
{
m_query_txt = thd->query();
m_query_len = thd->query_length();
}
When the object is read from a binary log, the query is taken from the buffer
containing the binary log representation of the event (this buffer is allocated
in Log_event object from which all Log events are derived):
Annotate_rows_log_event(char *buf, uint event_len,
Format_description_log_event *desc)
{
m_query_len = event_len - desc->common_header_len;
m_query_txt = buf + desc->common_header_len;
}
The events are written to the binary log by the Log_event::write() member
which calls virtual write_data_header() and write_data_body() members
("data header" and "post header" are synonym in replication terminology).
In our case, data header is empty and data body is just the query:
bool write_data_body(IO_CACHE *file)
{
return my_b_safe_write(file, (uchar*) m_query_txt, m_query_len);
}
Printing the event is just printing the query:
void Annotate_rows_log_event::print(FILE *file, PRINT_EVENT_INFO *pinfo)
{
my_b_printf(&pinfo->head_cache, "\tQuery: `%s`\n", m_query_txt);
}
3. How Master writes Annotate_rows events to the binary log
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The event is written to the binary log just before the group of Table_map
events which precede corresponding Rows events (one query may generate
several Table map events in the binary log, but the corresponding
Annotate_rows event must be written only once before the first Table map
event; hence the boolean variable 'with_annotate' below):
int write_locked_table_maps(THD *thd)
{ ...
bool with_annotate= thd->variables.binlog_annotate_rows_events;
...
for (uint i= 0; i < ... <number of tables> ...; ++i)
{ ...
thd->binlog_write_table_map(table, ..., with_annotate);
with_annotate= 0; // write Annotate_event not more than once
...
}
...
}
int THD::binlog_write_table_map(TABLE *table, ..., bool with_annotate)
{ ...
Table_map_log_event the_event(...);
...
if (with_annotate)
{
Annotate_rows_log_event anno(this);
mysql_bin_log.write(&anno);
}
mysql_bin_log.write(&the_event);
...
}
4. How slave treats replicate-annotate-rows-events option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The replicate-annotate-rows-events option is treated just as the session
value of the binlog_annotate_rows_events variable for the slave IO and
SQL threads. This setting is done during initialization of these threads:
pthread_handler_t handle_slave_io(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_IO);
...
}
pthread_handler_t handle_slave_sql(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_SQL);
...
}
int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type)
{ ...
thd->variables.binlog_annotate_rows_events=
opt_replicate_annotate_rows_events;
...
}
5. How slave IO thread requests Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When requesting an event, the slave should inform the master whether
it should send Annotate_rows events or not. To that end we add a new
BINLOG_SEND_ANNOTATE_ROWS_EVENT flag used when requesting an event:
#define BINLOG_DUMP_NON_BLOCK 1
#define BINLOG_SEND_ANNOTATE_ROWS_EVENT 2
pthread_handler_t handle_slave_io(void *arg)
{ ...
request_dump(mysql, ...);
...
}
int request_dump(MYSQL* mysql, ...)
{ ...
if (opt_log_slave_updates &&
mi->io_thd->variables.binlog_annotate_rows_events)
binlog_flags|= BINLOG_SEND_ANNOTATE_ROWS_EVENT;
...
int2store(buf + 4, binlog_flags);
...
simple_command(mysql, COM_BINLOG_DUMP, buf, ...);
...
}
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
case COM_BINLOG_DUMP:
{ ...
flags= uint2korr(packet + 4);
...
mysql_binlog_send(thd, ..., flags);
...
}
void mysql_binlog_send(THD* thd, ..., ushort flags)
{ ...
Log_event::read_log_event(&log, packet, ...);
...
if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT ||
thd->server_id == 0 /* slave == mysqlbinlog */ )
{
my_net_write(net, packet->ptr(), packet->length());
}
...
}
7. How slave SQL thread processes Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The slave processes each recieved event by "applying" it, i.e. by
calling the Log_event::apply_event() function which in turn calls
the virtual do_apply_event() member specific for each type of the
event.
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev = next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
int apply_event_and_update_pos(Log_event *ev, ...)
{ ...
ev->apply_event(...);
...
}
int Log_event::apply_event(...)
{
return do_apply_event(...);
}
What does it mean to "apply" an Annotate_rows event? It means to set current
thd query to that of the described by the event, i.e. to the query which
caused the subsequent Rows events (see "How Master writes Annotate_rows
events to the binary log" to follow what happens further when the subsequent
Rows events are applied):
int Annotate_rows_log_event::do_apply_event(...)
{
thd->set_query(m_query_txt, m_query_len);
}
NOTE. I am not sure, but possibly current values of thd->query and
thd->query_length should be saved before calling set_query() and to be
restored on the Annotate_rows_log_event object deletion.
Is it really needed ?
After calling this do_apply_event() function we may not delete the
Annotate_rows_log_event object immediatedly (see exec_relay_log_event()
above) because thd->query now points to the string inside this object.
We may keep the pointer to this object in the Relay_log_info:
class Relay_log_info
{
public:
...
void set_annotate_event(Annotate_rows_log_event*);
Annotate_rows_log_event* get_annotate_event();
void free_annotate_event();
...
private:
Annotate_rows_log_event* m_annotate_event;
};
The saved Annotate_rows object should be deleted when all corresponding
Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (rli->get_annotate_event() && is_last_rows_event(ev))
rli->free_annotate_event();
else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
where
bool is_last_rows_event(Log_event* ev)
{
Log_event_type type= ev->get_type_code();
if (IS_ROWS_EVENT_TYPE(type))
{
Rows_log_event* rows= (Rows_log_event*)ev;
return rows->get_flags(Rows_log_event::STMT_END_F);
}
return 0;
}
#define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
(type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
Kristian noticed that introducing new log event type should be coordinated
somehow with MySQL/Sun:
Kristian: The numeric code for this event must be assigned carefully.
It should be coordinated with MySQL/Sun, otherwise we can get into a
situation where MySQL uses the same numeric code for one event that
MariaDB uses for ANNOTATE_ROWS_EVENT, which would make merging the two
impossible.
Alex: I reserved about 20 numbers not to have possible conflicts
with MySQL.
Kristian: Still, I think it would be appropriate to send a polite email
to internals(a)lists.mysql.com about this and suggesting to reserve the
event number.
Also we should notice the introduction of the BINLOG_SEND_ANNOTATE_ROWS_EVENT
flag taking into account that MySQL/Sun may also introduce a flag with the
same value to be used in the request_dump-mysql_binlog_send interface.
But this is mainly the question of merging: if a conflict concerning this
flag occur, we may simply change the BINLOG_SEND_ANNOTATE_ROWS_EVENT value
(this does not require additional changes in the code).
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0
[Maria-developers] Updated (by Alexi): Store in binlog text of statements that caused RBR events (47)
by worklog-noreply@askmonty.org 20 Dec '09
by worklog-noreply@askmonty.org 20 Dec '09
20 Dec '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Store in binlog text of statements that caused RBR events
CREATION DATE..: Sat, 15 Aug 2009, 23:48
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Knielsen
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 47 (http://askmonty.org/worklog/?tid=47)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 20
ESTIMATE.......: 35 (hours remain)
ORIG. ESTIMATE.: 35
PROGRESS NOTES:
-=-=(Alexi - Sun, 20 Dec 2009, 13:14)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.11350 2009-12-20 13:14:04.000000000 +0200
+++ /tmp/wklog.47.new.11350 2009-12-20 13:14:04.000000000 +0200
@@ -282,23 +282,18 @@
Annotate_rows_log_event* m_annotate_event;
};
-When the saved Annotate_rows object may be deleted? When all corresponding
-Rows events will be processed, i.e. before processing the first non-Rows
-event (note that Annotate_rows object resides in the binary log *after*
-the (possible) 'BEGIN' Query event which accompanies the rows events; note
-also that this deletion is adjusted with the case when some or all
-corresponding Rows events are filtered out by replicate filter rules):
+The saved Annotate_rows object should be deleted when all corresponding
+Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
- if (rli->get_annotate_event() && !IS_RBR_EVENT_TYPE(ev->get_type_code()))
- rli->free_annotate_event();
-
apply_event_and_update_pos(ev, ...);
- if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
+ if (rli->get_annotate_event() && is_last_rows_event(ev))
+ rli->free_annotate_event();
+ else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
@@ -307,10 +302,21 @@
where
- #define IS_RBR_EVENT_TYPE(type) ( (type) == TABLE_MAP_EVENT || \
- (type) == WRITE_ROWS_EVENT || \
+ bool is_last_rows_event(Log_event* ev)
+ {
+ Log_event_type type= ev->get_type_code();
+ if (IS_ROWS_EVENT_TYPE(type))
+ {
+ Rows_log_event* rows= (Rows_log_event*)ev;
+ return rows->get_flags(Rows_log_event::STMT_END_F);
+ }
+
+ return 0;
+ }
+
+ #define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
- (type) == DELETE_ROWS_EVENT )
+ (type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
-=-=(Alexi - Sun, 20 Dec 2009, 09:29)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.32726 2009-12-20 07:29:56.000000000 +0000
+++ /tmp/wklog.47.new.32726 2009-12-20 07:29:56.000000000 +0000
@@ -56,7 +56,7 @@
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
- 00000082 | 00 | 0 - ANNOTATE_RBR_EVENT [51]
+ 00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
-=-=(Alexi - Sat, 19 Dec 2009, 16:10)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.16051 2009-12-19 16:10:48.000000000 +0200
+++ /tmp/wklog.47.new.16051 2009-12-19 16:10:48.000000000 +0200
@@ -253,7 +253,7 @@
thd query to that of the described by the event, i.e. to the query which
caused the subsequent Rows events (see "How Master writes Annotate_rows
events to the binary log" to follow what happens further when the subsequent
-Rows events is applied):
+Rows events are applied):
int Annotate_rows_log_event::do_apply_event(...)
{
-=-=(Alexi - Sat, 19 Dec 2009, 16:02)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.15695 2009-12-19 16:02:33.000000000 +0200
+++ /tmp/wklog.47.new.15695 2009-12-19 16:02:33.000000000 +0200
@@ -12,7 +12,7 @@
post-header and contains the query text in its data part. Example:
************************
- ANNOTATE_RBR_EVENT
+ ANNOTATE_ROWS_EVENT
************************
00000220 | B6 A0 2C 4B | time_when = 1261215926
00000224 | 33 | event_type = 51
-=-=(Alexi - Sat, 19 Dec 2009, 15:58)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.15437 2009-12-19 13:58:12.000000000 +0000
+++ /tmp/wklog.47.new.15437 2009-12-19 13:58:12.000000000 +0000
@@ -1 +1,337 @@
+Content
+~~~~~~~
+ 1. Annotate_rows event number
+ 2. Outline of Annotate_rows event behavior
+ 3. How Master writes Annotate_rows events to the binary log
+ 4. How slave treats replicate-annotate-rows-events option
+ 5. How slave IO thread requests Annotate_rows events
+ 6. How master executes the request
+ 7. How slave SQL thread processes Annotate_rows events
+ 8. General remarks
+
+1. Annotate_rows event number
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+To avoid possible event numbers conflict with MySQL/Sun, we leave a gap
+between the last MySQL event number and the Annotate_rows event number:
+
+ enum Log_event_type
+ { ...
+ INCIDENT_EVENT= 26,
+ // New MySQL event numbers are to be added here
+ MYSQL_EVENTS_END,
+
+ MARIA_EVENTS_BEGIN= 51,
+ // New Maria event numbers start from here
+ ANNOTATE_ROWS_EVENT= 51,
+
+ ENUM_END_EVENT
+ };
+
+together with the corresponding extension of 'post_header_len' array in the
+Format description event. (This extension does not affect the compatibility
+of the binary log). Here is how Format description event looks like with
+this extension:
+
+ ************************
+ FORMAT_DESCRIPTION_EVENT
+ ************************
+ 00000004 | A1 A0 2C 4B | time_when = 1261215905
+ 00000008 | 0F | event_type = 15
+ 00000009 | 64 00 00 00 | server_id = 100
+ 0000000D | 7F 00 00 00 | event_len = 127
+ 00000011 | 83 00 00 00 | log_pos = 00000083
+ 00000015 | 01 00 | flags = LOG_EVENT_BINLOG_IN_USE_F
+ ------------------------
+ 00000017 | 04 00 | binlog_ver = 4
+ 00000019 | 35 2E 32 2E | server_ver = 5.2.0-MariaDB-alpha-debug-log
+ ..... ...
+ 0000004B | A1 A0 2C 4B | time_created = 1261215905
+ 0000004F | 13 | common_header_len = 19
+ ------------------------
+ post_header_len
+ ------------------------
+ 00000050 | 38 | 56 - START_EVENT_V3 [1]
+ ..... ...
+ 00000069 | 02 | 2 - INCIDENT_EVENT [26]
+ 0000006A | 00 | 0 - RESERVED [27]
+ ..... ...
+ 00000081 | 00 | 0 - RESERVED [50]
+ 00000082 | 00 | 0 - ANNOTATE_RBR_EVENT [51]
+ ************************
+
+2. Outline of Annotate_rows event behavior
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Each Annotate_rows_log_event object has two private members describing the
+corresponding query:
+
+ char *m_query_txt;
+ uint m_query_len;
+
+When the object is created for writing to a binary log, this query is taken
+from 'thd' (for short, below we omit the 'Annotate_rows_log_event::' prefix
+as well as other implementation details):
+
+ Annotate_rows_log_event(THD *thd)
+ {
+ m_query_txt = thd->query();
+ m_query_len = thd->query_length();
+ }
+
+When the object is read from a binary log, the query is taken from the buffer
+containing the binary log representation of the event (this buffer is allocated
+in Log_event object from which all Log events are derived):
+
+ Annotate_rows_log_event(char *buf, uint event_len,
+ Format_description_log_event *desc)
+ {
+ m_query_len = event_len - desc->common_header_len;
+ m_query_txt = buf + desc->common_header_len;
+ }
+
+The events are written to the binary log by the Log_event::write() member
+which calls virtual write_data_header() and write_data_body() members
+("data header" and "post header" are synonym in replication terminology).
+In our case, data header is empty and data body is just the query:
+
+ bool write_data_body(IO_CACHE *file)
+ {
+ return my_b_safe_write(file, (uchar*) m_query_txt, m_query_len);
+ }
+
+Printing the event is just printing the query:
+
+ void Annotate_rows_log_event::print(FILE *file, PRINT_EVENT_INFO *pinfo)
+ {
+ my_b_printf(&pinfo->head_cache, "\tQuery: `%s`\n", m_query_txt);
+ }
+
+3. How Master writes Annotate_rows events to the binary log
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The event is written to the binary log just before the group of Table_map
+events which precede corresponding Rows events (one query may generate
+several Table map events in the binary log, but the corresponding
+Annotate_rows event must be written only once before the first Table map
+event; hence the boolean variable 'with_annotate' below):
+
+ int write_locked_table_maps(THD *thd)
+ { ...
+ bool with_annotate= thd->variables.binlog_annotate_rows_events;
+ ...
+ for (uint i= 0; i < ... <number of tables> ...; ++i)
+ { ...
+ thd->binlog_write_table_map(table, ..., with_annotate);
+ with_annotate= 0; // write Annotate_event not more than once
+ ...
+ }
+ ...
+ }
+
+ int THD::binlog_write_table_map(TABLE *table, ..., bool with_annotate)
+ { ...
+ Table_map_log_event the_event(...);
+ ...
+ if (with_annotate)
+ {
+ Annotate_rows_log_event anno(this);
+ mysql_bin_log.write(&anno);
+ }
+
+ mysql_bin_log.write(&the_event);
+ ...
+ }
+
+4. How slave treats replicate-annotate-rows-events option
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The replicate-annotate-rows-events option is treated just as the session
+value of the binlog_annotate_rows_events variable for the slave IO and
+SQL threads. This setting is done during initialization of these threads:
+
+ pthread_handler_t handle_slave_io(void *arg)
+ {
+ THD *thd= new THD;
+ ...
+ init_slave_thread(thd, SLAVE_THD_IO);
+ ...
+ }
+
+ pthread_handler_t handle_slave_sql(void *arg)
+ {
+ THD *thd= new THD;
+ ...
+ init_slave_thread(thd, SLAVE_THD_SQL);
+ ...
+ }
+
+ int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type)
+ { ...
+ thd->variables.binlog_annotate_rows_events=
+ opt_replicate_annotate_rows_events;
+ ...
+ }
+
+5. How slave IO thread requests Annotate_rows events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+When requesting an event, the slave should inform the master whether
+it should send Annotate_rows events or not. To that end we add a new
+BINLOG_SEND_ANNOTATE_ROWS_EVENT flag used when requesting an event:
+
+ #define BINLOG_DUMP_NON_BLOCK 1
+ #define BINLOG_SEND_ANNOTATE_ROWS_EVENT 2
+
+ pthread_handler_t handle_slave_io(void *arg)
+ { ...
+ request_dump(mysql, ...);
+ ...
+ }
+
+ int request_dump(MYSQL* mysql, ...)
+ { ...
+ if (opt_log_slave_updates &&
+ mi->io_thd->variables.binlog_annotate_rows_events)
+ binlog_flags|= BINLOG_SEND_ANNOTATE_ROWS_EVENT;
+ ...
+ int2store(buf + 4, binlog_flags);
+ ...
+ simple_command(mysql, COM_BINLOG_DUMP, buf, ...);
+ ...
+ }
+
+6. How master executes the request
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ case COM_BINLOG_DUMP:
+ { ...
+ flags= uint2korr(packet + 4);
+ ...
+ mysql_binlog_send(thd, ..., flags);
+ ...
+ }
+
+ void mysql_binlog_send(THD* thd, ..., ushort flags)
+ { ...
+ Log_event::read_log_event(&log, packet, ...);
+ ...
+ if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
+ flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT ||
+ thd->server_id == 0 /* slave == mysqlbinlog */ )
+ {
+ my_net_write(net, packet->ptr(), packet->length());
+ }
+ ...
+ }
+
+7. How slave SQL thread processes Annotate_rows events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The slave processes each recieved event by "applying" it, i.e. by
+calling the Log_event::apply_event() function which in turn calls
+the virtual do_apply_event() member specific for each type of the
+event.
+
+ int exec_relay_log_event(THD* thd, Relay_log_info* rli)
+ { ...
+ Log_event *ev = next_event(rli);
+ ...
+ apply_event_and_update_pos(ev, ...);
+
+ if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
+ delete ev;
+ ...
+ }
+
+ int apply_event_and_update_pos(Log_event *ev, ...)
+ { ...
+ ev->apply_event(...);
+ ...
+ }
+
+ int Log_event::apply_event(...)
+ {
+ return do_apply_event(...);
+ }
+
+What does it mean to "apply" an Annotate_rows event? It means to set current
+thd query to that of the described by the event, i.e. to the query which
+caused the subsequent Rows events (see "How Master writes Annotate_rows
+events to the binary log" to follow what happens further when the subsequent
+Rows events is applied):
+
+ int Annotate_rows_log_event::do_apply_event(...)
+ {
+ thd->set_query(m_query_txt, m_query_len);
+ }
+
+NOTE. I am not sure, but possibly current values of thd->query and
+thd->query_length should be saved before calling set_query() and to be
+restored on the Annotate_rows_log_event object deletion.
+Is it really needed ?
+
+After calling this do_apply_event() function we may not delete the
+Annotate_rows_log_event object immediatedly (see exec_relay_log_event()
+above) because thd->query now points to the string inside this object.
+We may keep the pointer to this object in the Relay_log_info:
+
+ class Relay_log_info
+ {
+ public:
+ ...
+ void set_annotate_event(Annotate_rows_log_event*);
+ Annotate_rows_log_event* get_annotate_event();
+ void free_annotate_event();
+ ...
+ private:
+ Annotate_rows_log_event* m_annotate_event;
+ };
+
+When the saved Annotate_rows object may be deleted? When all corresponding
+Rows events will be processed, i.e. before processing the first non-Rows
+event (note that Annotate_rows object resides in the binary log *after*
+the (possible) 'BEGIN' Query event which accompanies the rows events; note
+also that this deletion is adjusted with the case when some or all
+corresponding Rows events are filtered out by replicate filter rules):
+
+ int exec_relay_log_event(THD* thd, Relay_log_info* rli)
+ { ...
+ Log_event *ev= next_event(rli);
+ ...
+ if (rli->get_annotate_event() && !IS_RBR_EVENT_TYPE(ev->get_type_code()))
+ rli->free_annotate_event();
+
+ apply_event_and_update_pos(ev, ...);
+
+ if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
+ rli->set_annotate_event((Annotate_rows_log_event*) ev);
+ else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
+ delete ev;
+ ...
+ }
+
+where
+
+ #define IS_RBR_EVENT_TYPE(type) ( (type) == TABLE_MAP_EVENT || \
+ (type) == WRITE_ROWS_EVENT || \
+ (type) == UPDATE_ROWS_EVENT || \
+ (type) == DELETE_ROWS_EVENT )
+
+8. General remarks
+~~~~~~~~~~~~~~~~~~
+Kristian noticed that introducing new log event type should be coordinated
+somehow with MySQL/Sun:
+
+ Kristian: The numeric code for this event must be assigned carefully.
+ It should be coordinated with MySQL/Sun, otherwise we can get into a
+ situation where MySQL uses the same numeric code for one event that
+ MariaDB uses for ANNOTATE_ROWS_EVENT, which would make merging the two
+ impossible.
+ Alex: I reserved about 20 numbers not to have possible conflicts
+ with MySQL.
+ Kristian: Still, I think it would be appropriate to send a polite email
+ to internals(a)lists.mysql.com about this and suggesting to reserve the
+ event number.
+
+Also we should notice the introduction of the BINLOG_SEND_ANNOTATE_ROWS_EVENT
+flag taking into account that MySQL/Sun may also introduce a flag with the
+same value to be used in the request_dump-mysql_binlog_send interface.
+But this is mainly the question of merging: if a conflict concerning this
+flag occur, we may simply change the BINLOG_SEND_ANNOTATE_ROWS_EVENT value
+(this does not require additional changes in the code).
-=-=(Alexi - Sat, 19 Dec 2009, 15:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.14545 2009-12-19 15:41:21.000000000 +0200
+++ /tmp/wklog.47.new.14545 2009-12-19 15:41:21.000000000 +0200
@@ -1,122 +1,107 @@
-First suggestion:
-
-> I think for this we would actually need a new binlog event type
-> (Comment_log_event?). Unless we want to log an empty statement Query_log_event
-> containing only a comment (a bit of a hack).
-
-New server option
-~~~~~~~~~~~~~~~~~
- --binlog-annotate-rows-events
-
-Setting this option makes RBR (rows-) events in the binary log to be
-preceded by Annotate rows events (see below). The corresponding
-'binlog_annotate_rows_events' system variable is dynamic and has both
-global and session values. Default global value is OFF.
-
-Note. Session values are usefull to make it possible to annotate only
- some selected statements:
+Content
+~~~~~~~
+ 1. Annotate_rows_log_event
+ 2. Server option: --binlog-annotate-rows-events
+ 3. Server option: --replicate-annotate-rows-events
+ 4. mysqlbinlog option: --print-annotate-rows-events
+ 5. mysqlbinlog output
+
+1. Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Describes the query which caused the corresponding rows events. Has empty
+post-header and contains the query text in its data part. Example:
+
+ ************************
+ ANNOTATE_RBR_EVENT
+ ************************
+ 00000220 | B6 A0 2C 4B | time_when = 1261215926
+ 00000224 | 33 | event_type = 51
+ 00000225 | 64 00 00 00 | server_id = 100
+ 00000229 | 36 00 00 00 | event_len = 54
+ 0000022D | 56 02 00 00 | log_pos = 00000256
+ 00000231 | 00 00 | flags = <none>
+ ------------------------
+ 00000233 | 49 4E 53 45 | query = "INSERT INTO t1 VALUES (1), (2), (3)"
+ 00000237 | 52 54 20 49 |
+ 0000023B | 4E 54 4F 20 |
+ 0000023F | 74 31 20 56 |
+ 00000243 | 41 4C 55 45 |
+ 00000247 | 53 20 28 31 |
+ 0000024B | 29 2C 20 28 |
+ 0000024F | 32 29 2C 20 |
+ 00000253 | 28 33 29 |
+ ************************
+
+In binary log, Annotate_rows event follows the (possible) 'BEGIN' Query event
+and precedes the first of Table map events which accompany the corresponding
+rows events. (See example in the "mysqlbinlog output" section below.)
+
+2. Server option: --binlog-annotate-rows-events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Tells the master to write Annotate_rows events to the binary log.
+
+ * Variable Name: binlog_annotate_rows_events
+ * Scope: Global & Session
+ * Access Type: Dynamic
+ * Data Type: bool
+ * Default Value: OFF
+NOTE. Session values allows to annotate only some selected statements:
...
SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
-New binlog event type
-~~~~~~~~~~~~~~~~~~~~~
- Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
-
-Describes the query which caused the corresponding rows event. In binary log,
-precedes each Table_map_log_event. Contains empty post-header and the query
-text in its data part.
-
-The numeric code for this event must be assigned carefully. It should be
-coordinated with MySQL/Sun, otherwise we can get into a situation where MySQL
-uses the same numeric code for one event that MariaDB uses for
-ANNOTATE_ROWS_EVENT, which would make merging the two impossible.
-
-Example:
-
- ...
- ************************
- ANNOTATE_ROWS_EVENT [51]
- ************************
- 000000C7 | 54 1B 12 4B | time_when = 1259477844
- 000000CB | 33 | event_type = 51
- 000000CC | 64 00 00 00 | server_id = 100
- 000000D0 | 2C 00 00 00 | event_len = 44
- 000000D4 | F3 00 00 00 | log_pos = 000000F3
- 000000D8 | 00 00 | flags = <none>
- ------------------------
- 000000DA | 69 6E 73 65 | query = "insert into t1 values (1)"
- 000000DE | 72 74 20 69 |
- 000000E2 | 6E 74 6F 20 |
- 000000E6 | 74 31 20 76 |
- 000000EA | 61 6C 75 65 |
- 000000EE | 73 20 28 31 |
- 000000F2 | 29 |
- ************************
- TABLE_MAP_EVENT [19]
- ************************
- 000000F3 | 54 1B 12 4B | time_when = 1259477844
- 000000F7 | 13 | event_type = 19
- 000000F8 | 64 00 00 00 | server_id = 100
- 000000FC | 29 00 00 00 | event_len = 41
- 00000100 | 1C 01 00 00 | log_pos = 0000011C
- 00000104 | 00 00 | flags = <none>
- ------------------------
- ...
- ************************
- WRITE_ROWS_EVENT [23]
- ************************
- 0000011C | 54 1B 12 4B | time_when = 1259477844
- 00000120 | 17 | event_type = 23
- 00000121 | 64 00 00 00 | server_id = 100
- 00000125 | 22 00 00 00 | event_len = 34
- 00000129 | 3E 01 00 00 | log_pos = 0000013E
- 0000012D | 10 00 | flags = LOG_EVENT_UPDATE_TABLE_MAP_VERSION_F
- ------------------------
- 0000012F | 0F 00 00 00 | table_id = 15
- ...
-
-New mysqlbinlog option
-~~~~~~~~~~~~~~~~~~~~~~
- --print-annotate-rows-events
-
-With this option, mysqlbinlog prints the content of Annotate-rows
-events (if the binary log does contain them). Without this option
-(i.e. by default), mysqlbinlog skips Annotate rows events.
-
-
-mysqlbinlog output
-~~~~~~~~~~~~~~~~~~
-Something like this:
+3. Server option: --replicate-annotate-rows-events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Tells the slave to reproduce Annotate_rows events recieved from the master
+in its own binary log (sensible only in pair with log-slave-updates option).
+
+ * Variable Name: replicate_annotate_rows_events
+ * Scope: Global
+ * Access Type: Read only
+ * Data Type: bool
+ * Default Value: OFF
+
+NOTE. Why do we additionally need this 'replicate' option? Why not to make
+the slave to reproduce this events when its binlog-annotate-rows-events
+global value is ON? Well, because, for example, we may want to configure
+the slave which should reproduce Annotate_rows events but has global
+binlog-annotate-rows-events = OFF meaning this to be the default value for
+the client threads (see also "How slave treats replicate-annotate-rows-events
+option" in LLD part).
+
+4. mysqlbinlog option: --print-annotate-rows-events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+With this option, mysqlbinlog prints the content of Annotate_rows events (if
+the binary log does contain them). Without this option (i.e. by default),
+mysqlbinlog skips Annotate_rows events.
+5. mysqlbinlog output
+~~~~~~~~~~~~~~~~~~~~~
+With --print-annotate-rows-events, mysqlbinlog outputs Annotate_rows events
+in a form like this:
...
- # at 199
- # at 243
- # at 284
- #091129 9:57:24 server id 100 end_log_pos 243 Query: `insert into t1 values
-(1)`
- #091129 9:57:24 server id 100 end_log_pos 284 Table_map: `test`.`t1` mapped
-to number 15
- #091129 9:57:24 server id 100 end_log_pos 318 Write_rows: table id 15
+ # at 1646
+ #091219 12:45:26 server id 100 end_log_pos 1714 Query thread_id=1
+exec_time=0 error_code=0
+ SET TIMESTAMP=1261215926/*!*/;
+ BEGIN
+ /*!*/;
+ # at 1714
+ # at 1812
+ # at 1853
+ # at 1894
+ # at 1938
+ #091219 12:45:26 server id 100 end_log_pos 1812 Query: `DELETE t1, t2 FROM
+t1 INNER JOIN t2 INNER JOIN t3 WHERE t1.a=t2.a AND t2.a=t3.a`
+ #091219 12:45:26 server id 100 end_log_pos 1853 Table_map: `test`.`t1`
+mapped to number 16
+ #091219 12:45:26 server id 100 end_log_pos 1894 Table_map: `test`.`t2`
+mapped to number 17
+ #091219 12:45:26 server id 100 end_log_pos 1938 Delete_rows: table id 16
+ #091219 12:45:26 server id 100 end_log_pos 1982 Delete_rows: table id 17
flags: STMT_END_F
-
- BINLOG '
- VBsSSzNkAAAALAAAAPMAAAAAAGluc2VydCBpbnRvIHQxIHZhbHVlcyAoMSk=
- VBsSSxNkAAAAKQAAABwBAAAAAA8AAAAAAAAABHRlc3QAAnQxAAEDAAE=
- VBsSSxdkAAAAIgAAAD4BAAAQAA8AAAAAAAEAAf/+AQAAAA==
- '/*!*/;
- ### INSERT INTO test.t1
- ### SET
- ### @1=1 /* INT meta=0 nullable=1 is_null=0 */
...
-When master sends Annotate rows events
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-1. Master always sends Annotate_rows events to mysqlbinlog (in
- remote case).
-2. Master sends Annotate_rows events to a slave only if the slave has
- both log-slave-updates and binlog-annotate-rows-events options set.
-
-=-=(Bothorsen - Fri, 18 Dec 2009, 16:22)=-=-
Add estimation time.
Worked 5 hours and estimate 35 hours remain (original estimate increased by 5 hours).
-=-=(Bothorsen - Fri, 18 Dec 2009, 16:16)=-=-
This is the work done on this patch so far. Most of it done by Alex.
Worked 15 hours and estimate 035 hours remain (original estimate increased by 50 hours).
-=-=(Alexi - Fri, 04 Dec 2009, 13:00)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.14001 2009-12-04 13:00:24.000000000 +0200
+++ /tmp/wklog.47.new.14001 2009-12-04 13:00:24.000000000 +0200
@@ -6,27 +6,27 @@
New server option
~~~~~~~~~~~~~~~~~
- --binlog-annotate-row-events
+ --binlog-annotate-rows-events
-Setting this option makes RBR (row-) events in the binary log to be
+Setting this option makes RBR (rows-) events in the binary log to be
preceded by Annotate rows events (see below). The corresponding
-'binlog_annotate_row_events' system variable is dynamic and has both
+'binlog_annotate_rows_events' system variable is dynamic and has both
global and session values. Default global value is OFF.
Note. Session values are usefull to make it possible to annotate only
some selected statements:
...
- SET SESSION binlog_annotate_row_events=ON;
+ SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
- SET SESSION binlog_annotate_row_events=OFF;
+ SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
New binlog event type
~~~~~~~~~~~~~~~~~~~~~
Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
-Describes the query which caused the corresponding row event. In binary log,
+Describes the query which caused the corresponding rows event. In binary log,
precedes each Table_map_log_event. Contains empty post-header and the query
text in its data part.
@@ -79,6 +79,15 @@
0000012F | 0F 00 00 00 | table_id = 15
...
+New mysqlbinlog option
+~~~~~~~~~~~~~~~~~~~~~~
+ --print-annotate-rows-events
+
+With this option, mysqlbinlog prints the content of Annotate-rows
+events (if the binary log does contain them). Without this option
+(i.e. by default), mysqlbinlog skips Annotate rows events.
+
+
mysqlbinlog output
~~~~~~~~~~~~~~~~~~
Something like this:
@@ -109,5 +118,5 @@
1. Master always sends Annotate_rows events to mysqlbinlog (in
remote case).
2. Master sends Annotate_rows events to a slave only if the slave has
- both log-slave-updates and binlog-annotate-row-events options set.
+ both log-slave-updates and binlog-annotate-rows-events options set.
-=-=(Alexi - Wed, 02 Dec 2009, 13:32)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.17456 2009-12-02 13:32:18.000000000 +0200
+++ /tmp/wklog.47.new.17456 2009-12-02 13:32:18.000000000 +0200
@@ -1,8 +1 @@
-mysql_binlog_send() [sql/sql_repl.cc]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-1. When sending events to a slave, master should simply skip
- Annotate_rows events (they are not needed for replication).
- [ ??? Multi-master - currently not clear ]
-2. When sending events to mysqlbinlog (remote case), master
- must send Annotate_rows events as well.
------------------------------------------------------------
-=-=(View All Progress Notes, 18 total)=-=-
http://askmonty.org/worklog/index.pl?tid=47&nolimit=1
DESCRIPTION:
Store in binlog (and show in mysqlbinlog output) texts of statements that
caused RBR events
This is needed for (list from Monty):
- Easier to understand why updates happened
- Would make it easier to find out where in application things went
wrong (as you can search for exact strings)
- Allow one to filter things based on comments in the statement.
The cost of this can be that the binlog will be approximately 2x in size
(especially insert of big blob's would be a bit painful), so this should
be an optional feature.
HIGH-LEVEL SPECIFICATION:
Content
~~~~~~~
1. Annotate_rows_log_event
2. Server option: --binlog-annotate-rows-events
3. Server option: --replicate-annotate-rows-events
4. mysqlbinlog option: --print-annotate-rows-events
5. mysqlbinlog output
1. Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Describes the query which caused the corresponding rows events. Has empty
post-header and contains the query text in its data part. Example:
************************
ANNOTATE_ROWS_EVENT
************************
00000220 | B6 A0 2C 4B | time_when = 1261215926
00000224 | 33 | event_type = 51
00000225 | 64 00 00 00 | server_id = 100
00000229 | 36 00 00 00 | event_len = 54
0000022D | 56 02 00 00 | log_pos = 00000256
00000231 | 00 00 | flags = <none>
------------------------
00000233 | 49 4E 53 45 | query = "INSERT INTO t1 VALUES (1), (2), (3)"
00000237 | 52 54 20 49 |
0000023B | 4E 54 4F 20 |
0000023F | 74 31 20 56 |
00000243 | 41 4C 55 45 |
00000247 | 53 20 28 31 |
0000024B | 29 2C 20 28 |
0000024F | 32 29 2C 20 |
00000253 | 28 33 29 |
************************
In binary log, Annotate_rows event follows the (possible) 'BEGIN' Query event
and precedes the first of Table map events which accompany the corresponding
rows events. (See example in the "mysqlbinlog output" section below.)
2. Server option: --binlog-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the master to write Annotate_rows events to the binary log.
* Variable Name: binlog_annotate_rows_events
* Scope: Global & Session
* Access Type: Dynamic
* Data Type: bool
* Default Value: OFF
NOTE. Session values allows to annotate only some selected statements:
...
SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
3. Server option: --replicate-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the slave to reproduce Annotate_rows events recieved from the master
in its own binary log (sensible only in pair with log-slave-updates option).
* Variable Name: replicate_annotate_rows_events
* Scope: Global
* Access Type: Read only
* Data Type: bool
* Default Value: OFF
NOTE. Why do we additionally need this 'replicate' option? Why not to make
the slave to reproduce this events when its binlog-annotate-rows-events
global value is ON? Well, because, for example, we may want to configure
the slave which should reproduce Annotate_rows events but has global
binlog-annotate-rows-events = OFF meaning this to be the default value for
the client threads (see also "How slave treats replicate-annotate-rows-events
option" in LLD part).
4. mysqlbinlog option: --print-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
With this option, mysqlbinlog prints the content of Annotate_rows events (if
the binary log does contain them). Without this option (i.e. by default),
mysqlbinlog skips Annotate_rows events.
5. mysqlbinlog output
~~~~~~~~~~~~~~~~~~~~~
With --print-annotate-rows-events, mysqlbinlog outputs Annotate_rows events
in a form like this:
...
# at 1646
#091219 12:45:26 server id 100 end_log_pos 1714 Query thread_id=1
exec_time=0 error_code=0
SET TIMESTAMP=1261215926/*!*/;
BEGIN
/*!*/;
# at 1714
# at 1812
# at 1853
# at 1894
# at 1938
#091219 12:45:26 server id 100 end_log_pos 1812 Query: `DELETE t1, t2 FROM
t1 INNER JOIN t2 INNER JOIN t3 WHERE t1.a=t2.a AND t2.a=t3.a`
#091219 12:45:26 server id 100 end_log_pos 1853 Table_map: `test`.`t1`
mapped to number 16
#091219 12:45:26 server id 100 end_log_pos 1894 Table_map: `test`.`t2`
mapped to number 17
#091219 12:45:26 server id 100 end_log_pos 1938 Delete_rows: table id 16
#091219 12:45:26 server id 100 end_log_pos 1982 Delete_rows: table id 17
flags: STMT_END_F
...
LOW-LEVEL DESIGN:
Content
~~~~~~~
1. Annotate_rows event number
2. Outline of Annotate_rows event behavior
3. How Master writes Annotate_rows events to the binary log
4. How slave treats replicate-annotate-rows-events option
5. How slave IO thread requests Annotate_rows events
6. How master executes the request
7. How slave SQL thread processes Annotate_rows events
8. General remarks
1. Annotate_rows event number
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To avoid possible event numbers conflict with MySQL/Sun, we leave a gap
between the last MySQL event number and the Annotate_rows event number:
enum Log_event_type
{ ...
INCIDENT_EVENT= 26,
// New MySQL event numbers are to be added here
MYSQL_EVENTS_END,
MARIA_EVENTS_BEGIN= 51,
// New Maria event numbers start from here
ANNOTATE_ROWS_EVENT= 51,
ENUM_END_EVENT
};
together with the corresponding extension of 'post_header_len' array in the
Format description event. (This extension does not affect the compatibility
of the binary log). Here is how Format description event looks like with
this extension:
************************
FORMAT_DESCRIPTION_EVENT
************************
00000004 | A1 A0 2C 4B | time_when = 1261215905
00000008 | 0F | event_type = 15
00000009 | 64 00 00 00 | server_id = 100
0000000D | 7F 00 00 00 | event_len = 127
00000011 | 83 00 00 00 | log_pos = 00000083
00000015 | 01 00 | flags = LOG_EVENT_BINLOG_IN_USE_F
------------------------
00000017 | 04 00 | binlog_ver = 4
00000019 | 35 2E 32 2E | server_ver = 5.2.0-MariaDB-alpha-debug-log
..... ...
0000004B | A1 A0 2C 4B | time_created = 1261215905
0000004F | 13 | common_header_len = 19
------------------------
post_header_len
------------------------
00000050 | 38 | 56 - START_EVENT_V3 [1]
..... ...
00000069 | 02 | 2 - INCIDENT_EVENT [26]
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Each Annotate_rows_log_event object has two private members describing the
corresponding query:
char *m_query_txt;
uint m_query_len;
When the object is created for writing to a binary log, this query is taken
from 'thd' (for short, below we omit the 'Annotate_rows_log_event::' prefix
as well as other implementation details):
Annotate_rows_log_event(THD *thd)
{
m_query_txt = thd->query();
m_query_len = thd->query_length();
}
When the object is read from a binary log, the query is taken from the buffer
containing the binary log representation of the event (this buffer is allocated
in Log_event object from which all Log events are derived):
Annotate_rows_log_event(char *buf, uint event_len,
Format_description_log_event *desc)
{
m_query_len = event_len - desc->common_header_len;
m_query_txt = buf + desc->common_header_len;
}
The events are written to the binary log by the Log_event::write() member
which calls virtual write_data_header() and write_data_body() members
("data header" and "post header" are synonym in replication terminology).
In our case, data header is empty and data body is just the query:
bool write_data_body(IO_CACHE *file)
{
return my_b_safe_write(file, (uchar*) m_query_txt, m_query_len);
}
Printing the event is just printing the query:
void Annotate_rows_log_event::print(FILE *file, PRINT_EVENT_INFO *pinfo)
{
my_b_printf(&pinfo->head_cache, "\tQuery: `%s`\n", m_query_txt);
}
3. How Master writes Annotate_rows events to the binary log
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The event is written to the binary log just before the group of Table_map
events which precede corresponding Rows events (one query may generate
several Table map events in the binary log, but the corresponding
Annotate_rows event must be written only once before the first Table map
event; hence the boolean variable 'with_annotate' below):
int write_locked_table_maps(THD *thd)
{ ...
bool with_annotate= thd->variables.binlog_annotate_rows_events;
...
for (uint i= 0; i < ... <number of tables> ...; ++i)
{ ...
thd->binlog_write_table_map(table, ..., with_annotate);
with_annotate= 0; // write Annotate_event not more than once
...
}
...
}
int THD::binlog_write_table_map(TABLE *table, ..., bool with_annotate)
{ ...
Table_map_log_event the_event(...);
...
if (with_annotate)
{
Annotate_rows_log_event anno(this);
mysql_bin_log.write(&anno);
}
mysql_bin_log.write(&the_event);
...
}
4. How slave treats replicate-annotate-rows-events option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The replicate-annotate-rows-events option is treated just as the session
value of the binlog_annotate_rows_events variable for the slave IO and
SQL threads. This setting is done during initialization of these threads:
pthread_handler_t handle_slave_io(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_IO);
...
}
pthread_handler_t handle_slave_sql(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_SQL);
...
}
int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type)
{ ...
thd->variables.binlog_annotate_rows_events=
opt_replicate_annotate_rows_events;
...
}
5. How slave IO thread requests Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When requesting an event, the slave should inform the master whether
it should send Annotate_rows events or not. To that end we add a new
BINLOG_SEND_ANNOTATE_ROWS_EVENT flag used when requesting an event:
#define BINLOG_DUMP_NON_BLOCK 1
#define BINLOG_SEND_ANNOTATE_ROWS_EVENT 2
pthread_handler_t handle_slave_io(void *arg)
{ ...
request_dump(mysql, ...);
...
}
int request_dump(MYSQL* mysql, ...)
{ ...
if (opt_log_slave_updates &&
mi->io_thd->variables.binlog_annotate_rows_events)
binlog_flags|= BINLOG_SEND_ANNOTATE_ROWS_EVENT;
...
int2store(buf + 4, binlog_flags);
...
simple_command(mysql, COM_BINLOG_DUMP, buf, ...);
...
}
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
case COM_BINLOG_DUMP:
{ ...
flags= uint2korr(packet + 4);
...
mysql_binlog_send(thd, ..., flags);
...
}
void mysql_binlog_send(THD* thd, ..., ushort flags)
{ ...
Log_event::read_log_event(&log, packet, ...);
...
if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT ||
thd->server_id == 0 /* slave == mysqlbinlog */ )
{
my_net_write(net, packet->ptr(), packet->length());
}
...
}
7. How slave SQL thread processes Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The slave processes each recieved event by "applying" it, i.e. by
calling the Log_event::apply_event() function which in turn calls
the virtual do_apply_event() member specific for each type of the
event.
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev = next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
int apply_event_and_update_pos(Log_event *ev, ...)
{ ...
ev->apply_event(...);
...
}
int Log_event::apply_event(...)
{
return do_apply_event(...);
}
What does it mean to "apply" an Annotate_rows event? It means to set current
thd query to that of the described by the event, i.e. to the query which
caused the subsequent Rows events (see "How Master writes Annotate_rows
events to the binary log" to follow what happens further when the subsequent
Rows events are applied):
int Annotate_rows_log_event::do_apply_event(...)
{
thd->set_query(m_query_txt, m_query_len);
}
NOTE. I am not sure, but possibly current values of thd->query and
thd->query_length should be saved before calling set_query() and to be
restored on the Annotate_rows_log_event object deletion.
Is it really needed ?
After calling this do_apply_event() function we may not delete the
Annotate_rows_log_event object immediatedly (see exec_relay_log_event()
above) because thd->query now points to the string inside this object.
We may keep the pointer to this object in the Relay_log_info:
class Relay_log_info
{
public:
...
void set_annotate_event(Annotate_rows_log_event*);
Annotate_rows_log_event* get_annotate_event();
void free_annotate_event();
...
private:
Annotate_rows_log_event* m_annotate_event;
};
The saved Annotate_rows object should be deleted when all corresponding
Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (rli->get_annotate_event() && is_last_rows_event(ev))
rli->free_annotate_event();
else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
where
bool is_last_rows_event(Log_event* ev)
{
Log_event_type type= ev->get_type_code();
if (IS_ROWS_EVENT_TYPE(type))
{
Rows_log_event* rows= (Rows_log_event*)ev;
return rows->get_flags(Rows_log_event::STMT_END_F);
}
return 0;
}
#define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
(type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
Kristian noticed that introducing new log event type should be coordinated
somehow with MySQL/Sun:
Kristian: The numeric code for this event must be assigned carefully.
It should be coordinated with MySQL/Sun, otherwise we can get into a
situation where MySQL uses the same numeric code for one event that
MariaDB uses for ANNOTATE_ROWS_EVENT, which would make merging the two
impossible.
Alex: I reserved about 20 numbers not to have possible conflicts
with MySQL.
Kristian: Still, I think it would be appropriate to send a polite email
to internals(a)lists.mysql.com about this and suggesting to reserve the
event number.
Also we should notice the introduction of the BINLOG_SEND_ANNOTATE_ROWS_EVENT
flag taking into account that MySQL/Sun may also introduce a flag with the
same value to be used in the request_dump-mysql_binlog_send interface.
But this is mainly the question of merging: if a conflict concerning this
flag occur, we may simply change the BINLOG_SEND_ANNOTATE_ROWS_EVENT value
(this does not require additional changes in the code).
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0
[Maria-developers] Updated (by Alexi): Store in binlog text of statements that caused RBR events (47)
by worklog-noreply@askmonty.org 20 Dec '09
by worklog-noreply@askmonty.org 20 Dec '09
20 Dec '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Store in binlog text of statements that caused RBR events
CREATION DATE..: Sat, 15 Aug 2009, 23:48
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Knielsen
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 47 (http://askmonty.org/worklog/?tid=47)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 20
ESTIMATE.......: 35 (hours remain)
ORIG. ESTIMATE.: 35
PROGRESS NOTES:
-=-=(Alexi - Sun, 20 Dec 2009, 13:14)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.11350 2009-12-20 13:14:04.000000000 +0200
+++ /tmp/wklog.47.new.11350 2009-12-20 13:14:04.000000000 +0200
@@ -282,23 +282,18 @@
Annotate_rows_log_event* m_annotate_event;
};
-When the saved Annotate_rows object may be deleted? When all corresponding
-Rows events will be processed, i.e. before processing the first non-Rows
-event (note that Annotate_rows object resides in the binary log *after*
-the (possible) 'BEGIN' Query event which accompanies the rows events; note
-also that this deletion is adjusted with the case when some or all
-corresponding Rows events are filtered out by replicate filter rules):
+The saved Annotate_rows object should be deleted when all corresponding
+Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
- if (rli->get_annotate_event() && !IS_RBR_EVENT_TYPE(ev->get_type_code()))
- rli->free_annotate_event();
-
apply_event_and_update_pos(ev, ...);
- if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
+ if (rli->get_annotate_event() && is_last_rows_event(ev))
+ rli->free_annotate_event();
+ else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
@@ -307,10 +302,21 @@
where
- #define IS_RBR_EVENT_TYPE(type) ( (type) == TABLE_MAP_EVENT || \
- (type) == WRITE_ROWS_EVENT || \
+ bool is_last_rows_event(Log_event* ev)
+ {
+ Log_event_type type= ev->get_type_code();
+ if (IS_ROWS_EVENT_TYPE(type))
+ {
+ Rows_log_event* rows= (Rows_log_event*)ev;
+ return rows->get_flags(Rows_log_event::STMT_END_F);
+ }
+
+ return 0;
+ }
+
+ #define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
- (type) == DELETE_ROWS_EVENT )
+ (type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
-=-=(Alexi - Sun, 20 Dec 2009, 09:29)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.32726 2009-12-20 07:29:56.000000000 +0000
+++ /tmp/wklog.47.new.32726 2009-12-20 07:29:56.000000000 +0000
@@ -56,7 +56,7 @@
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
- 00000082 | 00 | 0 - ANNOTATE_RBR_EVENT [51]
+ 00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
-=-=(Alexi - Sat, 19 Dec 2009, 16:10)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.16051 2009-12-19 16:10:48.000000000 +0200
+++ /tmp/wklog.47.new.16051 2009-12-19 16:10:48.000000000 +0200
@@ -253,7 +253,7 @@
thd query to that of the described by the event, i.e. to the query which
caused the subsequent Rows events (see "How Master writes Annotate_rows
events to the binary log" to follow what happens further when the subsequent
-Rows events is applied):
+Rows events are applied):
int Annotate_rows_log_event::do_apply_event(...)
{
-=-=(Alexi - Sat, 19 Dec 2009, 16:02)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.15695 2009-12-19 16:02:33.000000000 +0200
+++ /tmp/wklog.47.new.15695 2009-12-19 16:02:33.000000000 +0200
@@ -12,7 +12,7 @@
post-header and contains the query text in its data part. Example:
************************
- ANNOTATE_RBR_EVENT
+ ANNOTATE_ROWS_EVENT
************************
00000220 | B6 A0 2C 4B | time_when = 1261215926
00000224 | 33 | event_type = 51
-=-=(Alexi - Sat, 19 Dec 2009, 15:58)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.15437 2009-12-19 13:58:12.000000000 +0000
+++ /tmp/wklog.47.new.15437 2009-12-19 13:58:12.000000000 +0000
@@ -1 +1,337 @@
+Content
+~~~~~~~
+ 1. Annotate_rows event number
+ 2. Outline of Annotate_rows event behavior
+ 3. How Master writes Annotate_rows events to the binary log
+ 4. How slave treats replicate-annotate-rows-events option
+ 5. How slave IO thread requests Annotate_rows events
+ 6. How master executes the request
+ 7. How slave SQL thread processes Annotate_rows events
+ 8. General remarks
+
+1. Annotate_rows event number
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+To avoid possible event numbers conflict with MySQL/Sun, we leave a gap
+between the last MySQL event number and the Annotate_rows event number:
+
+ enum Log_event_type
+ { ...
+ INCIDENT_EVENT= 26,
+ // New MySQL event numbers are to be added here
+ MYSQL_EVENTS_END,
+
+ MARIA_EVENTS_BEGIN= 51,
+ // New Maria event numbers start from here
+ ANNOTATE_ROWS_EVENT= 51,
+
+ ENUM_END_EVENT
+ };
+
+together with the corresponding extension of 'post_header_len' array in the
+Format description event. (This extension does not affect the compatibility
+of the binary log). Here is how Format description event looks like with
+this extension:
+
+ ************************
+ FORMAT_DESCRIPTION_EVENT
+ ************************
+ 00000004 | A1 A0 2C 4B | time_when = 1261215905
+ 00000008 | 0F | event_type = 15
+ 00000009 | 64 00 00 00 | server_id = 100
+ 0000000D | 7F 00 00 00 | event_len = 127
+ 00000011 | 83 00 00 00 | log_pos = 00000083
+ 00000015 | 01 00 | flags = LOG_EVENT_BINLOG_IN_USE_F
+ ------------------------
+ 00000017 | 04 00 | binlog_ver = 4
+ 00000019 | 35 2E 32 2E | server_ver = 5.2.0-MariaDB-alpha-debug-log
+ ..... ...
+ 0000004B | A1 A0 2C 4B | time_created = 1261215905
+ 0000004F | 13 | common_header_len = 19
+ ------------------------
+ post_header_len
+ ------------------------
+ 00000050 | 38 | 56 - START_EVENT_V3 [1]
+ ..... ...
+ 00000069 | 02 | 2 - INCIDENT_EVENT [26]
+ 0000006A | 00 | 0 - RESERVED [27]
+ ..... ...
+ 00000081 | 00 | 0 - RESERVED [50]
+ 00000082 | 00 | 0 - ANNOTATE_RBR_EVENT [51]
+ ************************
+
+2. Outline of Annotate_rows event behavior
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Each Annotate_rows_log_event object has two private members describing the
+corresponding query:
+
+ char *m_query_txt;
+ uint m_query_len;
+
+When the object is created for writing to a binary log, this query is taken
+from 'thd' (for short, below we omit the 'Annotate_rows_log_event::' prefix
+as well as other implementation details):
+
+ Annotate_rows_log_event(THD *thd)
+ {
+ m_query_txt = thd->query();
+ m_query_len = thd->query_length();
+ }
+
+When the object is read from a binary log, the query is taken from the buffer
+containing the binary log representation of the event (this buffer is allocated
+in Log_event object from which all Log events are derived):
+
+ Annotate_rows_log_event(char *buf, uint event_len,
+ Format_description_log_event *desc)
+ {
+ m_query_len = event_len - desc->common_header_len;
+ m_query_txt = buf + desc->common_header_len;
+ }
+
+The events are written to the binary log by the Log_event::write() member
+which calls virtual write_data_header() and write_data_body() members
+("data header" and "post header" are synonym in replication terminology).
+In our case, data header is empty and data body is just the query:
+
+ bool write_data_body(IO_CACHE *file)
+ {
+ return my_b_safe_write(file, (uchar*) m_query_txt, m_query_len);
+ }
+
+Printing the event is just printing the query:
+
+ void Annotate_rows_log_event::print(FILE *file, PRINT_EVENT_INFO *pinfo)
+ {
+ my_b_printf(&pinfo->head_cache, "\tQuery: `%s`\n", m_query_txt);
+ }
+
+3. How Master writes Annotate_rows events to the binary log
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The event is written to the binary log just before the group of Table_map
+events which precede corresponding Rows events (one query may generate
+several Table map events in the binary log, but the corresponding
+Annotate_rows event must be written only once before the first Table map
+event; hence the boolean variable 'with_annotate' below):
+
+ int write_locked_table_maps(THD *thd)
+ { ...
+ bool with_annotate= thd->variables.binlog_annotate_rows_events;
+ ...
+ for (uint i= 0; i < ... <number of tables> ...; ++i)
+ { ...
+ thd->binlog_write_table_map(table, ..., with_annotate);
+ with_annotate= 0; // write Annotate_event not more than once
+ ...
+ }
+ ...
+ }
+
+ int THD::binlog_write_table_map(TABLE *table, ..., bool with_annotate)
+ { ...
+ Table_map_log_event the_event(...);
+ ...
+ if (with_annotate)
+ {
+ Annotate_rows_log_event anno(this);
+ mysql_bin_log.write(&anno);
+ }
+
+ mysql_bin_log.write(&the_event);
+ ...
+ }
+
+4. How slave treats replicate-annotate-rows-events option
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The replicate-annotate-rows-events option is treated just as the session
+value of the binlog_annotate_rows_events variable for the slave IO and
+SQL threads. This setting is done during initialization of these threads:
+
+ pthread_handler_t handle_slave_io(void *arg)
+ {
+ THD *thd= new THD;
+ ...
+ init_slave_thread(thd, SLAVE_THD_IO);
+ ...
+ }
+
+ pthread_handler_t handle_slave_sql(void *arg)
+ {
+ THD *thd= new THD;
+ ...
+ init_slave_thread(thd, SLAVE_THD_SQL);
+ ...
+ }
+
+ int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type)
+ { ...
+ thd->variables.binlog_annotate_rows_events=
+ opt_replicate_annotate_rows_events;
+ ...
+ }
+
+5. How slave IO thread requests Annotate_rows events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+When requesting an event, the slave should inform the master whether
+it should send Annotate_rows events or not. To that end we add a new
+BINLOG_SEND_ANNOTATE_ROWS_EVENT flag used when requesting an event:
+
+ #define BINLOG_DUMP_NON_BLOCK 1
+ #define BINLOG_SEND_ANNOTATE_ROWS_EVENT 2
+
+ pthread_handler_t handle_slave_io(void *arg)
+ { ...
+ request_dump(mysql, ...);
+ ...
+ }
+
+ int request_dump(MYSQL* mysql, ...)
+ { ...
+ if (opt_log_slave_updates &&
+ mi->io_thd->variables.binlog_annotate_rows_events)
+ binlog_flags|= BINLOG_SEND_ANNOTATE_ROWS_EVENT;
+ ...
+ int2store(buf + 4, binlog_flags);
+ ...
+ simple_command(mysql, COM_BINLOG_DUMP, buf, ...);
+ ...
+ }
+
+6. How master executes the request
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ case COM_BINLOG_DUMP:
+ { ...
+ flags= uint2korr(packet + 4);
+ ...
+ mysql_binlog_send(thd, ..., flags);
+ ...
+ }
+
+ void mysql_binlog_send(THD* thd, ..., ushort flags)
+ { ...
+ Log_event::read_log_event(&log, packet, ...);
+ ...
+ if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
+ flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT ||
+ thd->server_id == 0 /* slave == mysqlbinlog */ )
+ {
+ my_net_write(net, packet->ptr(), packet->length());
+ }
+ ...
+ }
+
+7. How slave SQL thread processes Annotate_rows events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The slave processes each recieved event by "applying" it, i.e. by
+calling the Log_event::apply_event() function which in turn calls
+the virtual do_apply_event() member specific for each type of the
+event.
+
+ int exec_relay_log_event(THD* thd, Relay_log_info* rli)
+ { ...
+ Log_event *ev = next_event(rli);
+ ...
+ apply_event_and_update_pos(ev, ...);
+
+ if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
+ delete ev;
+ ...
+ }
+
+ int apply_event_and_update_pos(Log_event *ev, ...)
+ { ...
+ ev->apply_event(...);
+ ...
+ }
+
+ int Log_event::apply_event(...)
+ {
+ return do_apply_event(...);
+ }
+
+What does it mean to "apply" an Annotate_rows event? It means to set current
+thd query to that of the described by the event, i.e. to the query which
+caused the subsequent Rows events (see "How Master writes Annotate_rows
+events to the binary log" to follow what happens further when the subsequent
+Rows events is applied):
+
+ int Annotate_rows_log_event::do_apply_event(...)
+ {
+ thd->set_query(m_query_txt, m_query_len);
+ }
+
+NOTE. I am not sure, but possibly current values of thd->query and
+thd->query_length should be saved before calling set_query() and to be
+restored on the Annotate_rows_log_event object deletion.
+Is it really needed ?
+
+After calling this do_apply_event() function we may not delete the
+Annotate_rows_log_event object immediatedly (see exec_relay_log_event()
+above) because thd->query now points to the string inside this object.
+We may keep the pointer to this object in the Relay_log_info:
+
+ class Relay_log_info
+ {
+ public:
+ ...
+ void set_annotate_event(Annotate_rows_log_event*);
+ Annotate_rows_log_event* get_annotate_event();
+ void free_annotate_event();
+ ...
+ private:
+ Annotate_rows_log_event* m_annotate_event;
+ };
+
+When the saved Annotate_rows object may be deleted? When all corresponding
+Rows events will be processed, i.e. before processing the first non-Rows
+event (note that Annotate_rows object resides in the binary log *after*
+the (possible) 'BEGIN' Query event which accompanies the rows events; note
+also that this deletion is adjusted with the case when some or all
+corresponding Rows events are filtered out by replicate filter rules):
+
+ int exec_relay_log_event(THD* thd, Relay_log_info* rli)
+ { ...
+ Log_event *ev= next_event(rli);
+ ...
+ if (rli->get_annotate_event() && !IS_RBR_EVENT_TYPE(ev->get_type_code()))
+ rli->free_annotate_event();
+
+ apply_event_and_update_pos(ev, ...);
+
+ if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
+ rli->set_annotate_event((Annotate_rows_log_event*) ev);
+ else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
+ delete ev;
+ ...
+ }
+
+where
+
+ #define IS_RBR_EVENT_TYPE(type) ( (type) == TABLE_MAP_EVENT || \
+ (type) == WRITE_ROWS_EVENT || \
+ (type) == UPDATE_ROWS_EVENT || \
+ (type) == DELETE_ROWS_EVENT )
+
+8. General remarks
+~~~~~~~~~~~~~~~~~~
+Kristian noticed that introducing new log event type should be coordinated
+somehow with MySQL/Sun:
+
+ Kristian: The numeric code for this event must be assigned carefully.
+ It should be coordinated with MySQL/Sun, otherwise we can get into a
+ situation where MySQL uses the same numeric code for one event that
+ MariaDB uses for ANNOTATE_ROWS_EVENT, which would make merging the two
+ impossible.
+ Alex: I reserved about 20 numbers not to have possible conflicts
+ with MySQL.
+ Kristian: Still, I think it would be appropriate to send a polite email
+ to internals(a)lists.mysql.com about this and suggesting to reserve the
+ event number.
+
+Also we should notice the introduction of the BINLOG_SEND_ANNOTATE_ROWS_EVENT
+flag taking into account that MySQL/Sun may also introduce a flag with the
+same value to be used in the request_dump-mysql_binlog_send interface.
+But this is mainly the question of merging: if a conflict concerning this
+flag occur, we may simply change the BINLOG_SEND_ANNOTATE_ROWS_EVENT value
+(this does not require additional changes in the code).
-=-=(Alexi - Sat, 19 Dec 2009, 15:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.14545 2009-12-19 15:41:21.000000000 +0200
+++ /tmp/wklog.47.new.14545 2009-12-19 15:41:21.000000000 +0200
@@ -1,122 +1,107 @@
-First suggestion:
-
-> I think for this we would actually need a new binlog event type
-> (Comment_log_event?). Unless we want to log an empty statement Query_log_event
-> containing only a comment (a bit of a hack).
-
-New server option
-~~~~~~~~~~~~~~~~~
- --binlog-annotate-rows-events
-
-Setting this option makes RBR (rows-) events in the binary log to be
-preceded by Annotate rows events (see below). The corresponding
-'binlog_annotate_rows_events' system variable is dynamic and has both
-global and session values. Default global value is OFF.
-
-Note. Session values are usefull to make it possible to annotate only
- some selected statements:
+Content
+~~~~~~~
+ 1. Annotate_rows_log_event
+ 2. Server option: --binlog-annotate-rows-events
+ 3. Server option: --replicate-annotate-rows-events
+ 4. mysqlbinlog option: --print-annotate-rows-events
+ 5. mysqlbinlog output
+
+1. Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Describes the query which caused the corresponding rows events. Has empty
+post-header and contains the query text in its data part. Example:
+
+ ************************
+ ANNOTATE_RBR_EVENT
+ ************************
+ 00000220 | B6 A0 2C 4B | time_when = 1261215926
+ 00000224 | 33 | event_type = 51
+ 00000225 | 64 00 00 00 | server_id = 100
+ 00000229 | 36 00 00 00 | event_len = 54
+ 0000022D | 56 02 00 00 | log_pos = 00000256
+ 00000231 | 00 00 | flags = <none>
+ ------------------------
+ 00000233 | 49 4E 53 45 | query = "INSERT INTO t1 VALUES (1), (2), (3)"
+ 00000237 | 52 54 20 49 |
+ 0000023B | 4E 54 4F 20 |
+ 0000023F | 74 31 20 56 |
+ 00000243 | 41 4C 55 45 |
+ 00000247 | 53 20 28 31 |
+ 0000024B | 29 2C 20 28 |
+ 0000024F | 32 29 2C 20 |
+ 00000253 | 28 33 29 |
+ ************************
+
+In binary log, Annotate_rows event follows the (possible) 'BEGIN' Query event
+and precedes the first of Table map events which accompany the corresponding
+rows events. (See example in the "mysqlbinlog output" section below.)
+
+2. Server option: --binlog-annotate-rows-events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Tells the master to write Annotate_rows events to the binary log.
+
+ * Variable Name: binlog_annotate_rows_events
+ * Scope: Global & Session
+ * Access Type: Dynamic
+ * Data Type: bool
+ * Default Value: OFF
+NOTE. Session values allows to annotate only some selected statements:
...
SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
-New binlog event type
-~~~~~~~~~~~~~~~~~~~~~
- Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
-
-Describes the query which caused the corresponding rows event. In binary log,
-precedes each Table_map_log_event. Contains empty post-header and the query
-text in its data part.
-
-The numeric code for this event must be assigned carefully. It should be
-coordinated with MySQL/Sun, otherwise we can get into a situation where MySQL
-uses the same numeric code for one event that MariaDB uses for
-ANNOTATE_ROWS_EVENT, which would make merging the two impossible.
-
-Example:
-
- ...
- ************************
- ANNOTATE_ROWS_EVENT [51]
- ************************
- 000000C7 | 54 1B 12 4B | time_when = 1259477844
- 000000CB | 33 | event_type = 51
- 000000CC | 64 00 00 00 | server_id = 100
- 000000D0 | 2C 00 00 00 | event_len = 44
- 000000D4 | F3 00 00 00 | log_pos = 000000F3
- 000000D8 | 00 00 | flags = <none>
- ------------------------
- 000000DA | 69 6E 73 65 | query = "insert into t1 values (1)"
- 000000DE | 72 74 20 69 |
- 000000E2 | 6E 74 6F 20 |
- 000000E6 | 74 31 20 76 |
- 000000EA | 61 6C 75 65 |
- 000000EE | 73 20 28 31 |
- 000000F2 | 29 |
- ************************
- TABLE_MAP_EVENT [19]
- ************************
- 000000F3 | 54 1B 12 4B | time_when = 1259477844
- 000000F7 | 13 | event_type = 19
- 000000F8 | 64 00 00 00 | server_id = 100
- 000000FC | 29 00 00 00 | event_len = 41
- 00000100 | 1C 01 00 00 | log_pos = 0000011C
- 00000104 | 00 00 | flags = <none>
- ------------------------
- ...
- ************************
- WRITE_ROWS_EVENT [23]
- ************************
- 0000011C | 54 1B 12 4B | time_when = 1259477844
- 00000120 | 17 | event_type = 23
- 00000121 | 64 00 00 00 | server_id = 100
- 00000125 | 22 00 00 00 | event_len = 34
- 00000129 | 3E 01 00 00 | log_pos = 0000013E
- 0000012D | 10 00 | flags = LOG_EVENT_UPDATE_TABLE_MAP_VERSION_F
- ------------------------
- 0000012F | 0F 00 00 00 | table_id = 15
- ...
-
-New mysqlbinlog option
-~~~~~~~~~~~~~~~~~~~~~~
- --print-annotate-rows-events
-
-With this option, mysqlbinlog prints the content of Annotate-rows
-events (if the binary log does contain them). Without this option
-(i.e. by default), mysqlbinlog skips Annotate rows events.
-
-
-mysqlbinlog output
-~~~~~~~~~~~~~~~~~~
-Something like this:
+3. Server option: --replicate-annotate-rows-events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Tells the slave to reproduce Annotate_rows events recieved from the master
+in its own binary log (sensible only in pair with log-slave-updates option).
+
+ * Variable Name: replicate_annotate_rows_events
+ * Scope: Global
+ * Access Type: Read only
+ * Data Type: bool
+ * Default Value: OFF
+
+NOTE. Why do we additionally need this 'replicate' option? Why not to make
+the slave to reproduce this events when its binlog-annotate-rows-events
+global value is ON? Well, because, for example, we may want to configure
+the slave which should reproduce Annotate_rows events but has global
+binlog-annotate-rows-events = OFF meaning this to be the default value for
+the client threads (see also "How slave treats replicate-annotate-rows-events
+option" in LLD part).
+
+4. mysqlbinlog option: --print-annotate-rows-events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+With this option, mysqlbinlog prints the content of Annotate_rows events (if
+the binary log does contain them). Without this option (i.e. by default),
+mysqlbinlog skips Annotate_rows events.
+5. mysqlbinlog output
+~~~~~~~~~~~~~~~~~~~~~
+With --print-annotate-rows-events, mysqlbinlog outputs Annotate_rows events
+in a form like this:
...
- # at 199
- # at 243
- # at 284
- #091129 9:57:24 server id 100 end_log_pos 243 Query: `insert into t1 values
-(1)`
- #091129 9:57:24 server id 100 end_log_pos 284 Table_map: `test`.`t1` mapped
-to number 15
- #091129 9:57:24 server id 100 end_log_pos 318 Write_rows: table id 15
+ # at 1646
+ #091219 12:45:26 server id 100 end_log_pos 1714 Query thread_id=1
+exec_time=0 error_code=0
+ SET TIMESTAMP=1261215926/*!*/;
+ BEGIN
+ /*!*/;
+ # at 1714
+ # at 1812
+ # at 1853
+ # at 1894
+ # at 1938
+ #091219 12:45:26 server id 100 end_log_pos 1812 Query: `DELETE t1, t2 FROM
+t1 INNER JOIN t2 INNER JOIN t3 WHERE t1.a=t2.a AND t2.a=t3.a`
+ #091219 12:45:26 server id 100 end_log_pos 1853 Table_map: `test`.`t1`
+mapped to number 16
+ #091219 12:45:26 server id 100 end_log_pos 1894 Table_map: `test`.`t2`
+mapped to number 17
+ #091219 12:45:26 server id 100 end_log_pos 1938 Delete_rows: table id 16
+ #091219 12:45:26 server id 100 end_log_pos 1982 Delete_rows: table id 17
flags: STMT_END_F
-
- BINLOG '
- VBsSSzNkAAAALAAAAPMAAAAAAGluc2VydCBpbnRvIHQxIHZhbHVlcyAoMSk=
- VBsSSxNkAAAAKQAAABwBAAAAAA8AAAAAAAAABHRlc3QAAnQxAAEDAAE=
- VBsSSxdkAAAAIgAAAD4BAAAQAA8AAAAAAAEAAf/+AQAAAA==
- '/*!*/;
- ### INSERT INTO test.t1
- ### SET
- ### @1=1 /* INT meta=0 nullable=1 is_null=0 */
...
-When master sends Annotate rows events
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-1. Master always sends Annotate_rows events to mysqlbinlog (in
- remote case).
-2. Master sends Annotate_rows events to a slave only if the slave has
- both log-slave-updates and binlog-annotate-rows-events options set.
-
-=-=(Bothorsen - Fri, 18 Dec 2009, 16:22)=-=-
Add estimation time.
Worked 5 hours and estimate 35 hours remain (original estimate increased by 5 hours).
-=-=(Bothorsen - Fri, 18 Dec 2009, 16:16)=-=-
This is the work done on this patch so far. Most of it done by Alex.
Worked 15 hours and estimate 035 hours remain (original estimate increased by 50 hours).
-=-=(Alexi - Fri, 04 Dec 2009, 13:00)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.14001 2009-12-04 13:00:24.000000000 +0200
+++ /tmp/wklog.47.new.14001 2009-12-04 13:00:24.000000000 +0200
@@ -6,27 +6,27 @@
New server option
~~~~~~~~~~~~~~~~~
- --binlog-annotate-row-events
+ --binlog-annotate-rows-events
-Setting this option makes RBR (row-) events in the binary log to be
+Setting this option makes RBR (rows-) events in the binary log to be
preceded by Annotate rows events (see below). The corresponding
-'binlog_annotate_row_events' system variable is dynamic and has both
+'binlog_annotate_rows_events' system variable is dynamic and has both
global and session values. Default global value is OFF.
Note. Session values are usefull to make it possible to annotate only
some selected statements:
...
- SET SESSION binlog_annotate_row_events=ON;
+ SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
- SET SESSION binlog_annotate_row_events=OFF;
+ SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
New binlog event type
~~~~~~~~~~~~~~~~~~~~~
Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
-Describes the query which caused the corresponding row event. In binary log,
+Describes the query which caused the corresponding rows event. In binary log,
precedes each Table_map_log_event. Contains empty post-header and the query
text in its data part.
@@ -79,6 +79,15 @@
0000012F | 0F 00 00 00 | table_id = 15
...
+New mysqlbinlog option
+~~~~~~~~~~~~~~~~~~~~~~
+ --print-annotate-rows-events
+
+With this option, mysqlbinlog prints the content of Annotate-rows
+events (if the binary log does contain them). Without this option
+(i.e. by default), mysqlbinlog skips Annotate rows events.
+
+
mysqlbinlog output
~~~~~~~~~~~~~~~~~~
Something like this:
@@ -109,5 +118,5 @@
1. Master always sends Annotate_rows events to mysqlbinlog (in
remote case).
2. Master sends Annotate_rows events to a slave only if the slave has
- both log-slave-updates and binlog-annotate-row-events options set.
+ both log-slave-updates and binlog-annotate-rows-events options set.
-=-=(Alexi - Wed, 02 Dec 2009, 13:32)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.17456 2009-12-02 13:32:18.000000000 +0200
+++ /tmp/wklog.47.new.17456 2009-12-02 13:32:18.000000000 +0200
@@ -1,8 +1 @@
-mysql_binlog_send() [sql/sql_repl.cc]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-1. When sending events to a slave, master should simply skip
- Annotate_rows events (they are not needed for replication).
- [ ??? Multi-master - currently not clear ]
-2. When sending events to mysqlbinlog (remote case), master
- must send Annotate_rows events as well.
------------------------------------------------------------
-=-=(View All Progress Notes, 18 total)=-=-
http://askmonty.org/worklog/index.pl?tid=47&nolimit=1
DESCRIPTION:
Store in binlog (and show in mysqlbinlog output) texts of statements that
caused RBR events
This is needed for (list from Monty):
- Easier to understand why updates happened
- Would make it easier to find out where in application things went
wrong (as you can search for exact strings)
- Allow one to filter things based on comments in the statement.
The cost of this can be that the binlog will be approximately 2x in size
(especially insert of big blob's would be a bit painful), so this should
be an optional feature.
HIGH-LEVEL SPECIFICATION:
Content
~~~~~~~
1. Annotate_rows_log_event
2. Server option: --binlog-annotate-rows-events
3. Server option: --replicate-annotate-rows-events
4. mysqlbinlog option: --print-annotate-rows-events
5. mysqlbinlog output
1. Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Describes the query which caused the corresponding rows events. Has empty
post-header and contains the query text in its data part. Example:
************************
ANNOTATE_ROWS_EVENT
************************
00000220 | B6 A0 2C 4B | time_when = 1261215926
00000224 | 33 | event_type = 51
00000225 | 64 00 00 00 | server_id = 100
00000229 | 36 00 00 00 | event_len = 54
0000022D | 56 02 00 00 | log_pos = 00000256
00000231 | 00 00 | flags = <none>
------------------------
00000233 | 49 4E 53 45 | query = "INSERT INTO t1 VALUES (1), (2), (3)"
00000237 | 52 54 20 49 |
0000023B | 4E 54 4F 20 |
0000023F | 74 31 20 56 |
00000243 | 41 4C 55 45 |
00000247 | 53 20 28 31 |
0000024B | 29 2C 20 28 |
0000024F | 32 29 2C 20 |
00000253 | 28 33 29 |
************************
In binary log, Annotate_rows event follows the (possible) 'BEGIN' Query event
and precedes the first of Table map events which accompany the corresponding
rows events. (See example in the "mysqlbinlog output" section below.)
2. Server option: --binlog-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the master to write Annotate_rows events to the binary log.
* Variable Name: binlog_annotate_rows_events
* Scope: Global & Session
* Access Type: Dynamic
* Data Type: bool
* Default Value: OFF
NOTE. Session values allows to annotate only some selected statements:
...
SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
3. Server option: --replicate-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the slave to reproduce Annotate_rows events recieved from the master
in its own binary log (sensible only in pair with log-slave-updates option).
* Variable Name: replicate_annotate_rows_events
* Scope: Global
* Access Type: Read only
* Data Type: bool
* Default Value: OFF
NOTE. Why do we additionally need this 'replicate' option? Why not to make
the slave to reproduce this events when its binlog-annotate-rows-events
global value is ON? Well, because, for example, we may want to configure
the slave which should reproduce Annotate_rows events but has global
binlog-annotate-rows-events = OFF meaning this to be the default value for
the client threads (see also "How slave treats replicate-annotate-rows-events
option" in LLD part).
4. mysqlbinlog option: --print-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
With this option, mysqlbinlog prints the content of Annotate_rows events (if
the binary log does contain them). Without this option (i.e. by default),
mysqlbinlog skips Annotate_rows events.
5. mysqlbinlog output
~~~~~~~~~~~~~~~~~~~~~
With --print-annotate-rows-events, mysqlbinlog outputs Annotate_rows events
in a form like this:
...
# at 1646
#091219 12:45:26 server id 100 end_log_pos 1714 Query thread_id=1
exec_time=0 error_code=0
SET TIMESTAMP=1261215926/*!*/;
BEGIN
/*!*/;
# at 1714
# at 1812
# at 1853
# at 1894
# at 1938
#091219 12:45:26 server id 100 end_log_pos 1812 Query: `DELETE t1, t2 FROM
t1 INNER JOIN t2 INNER JOIN t3 WHERE t1.a=t2.a AND t2.a=t3.a`
#091219 12:45:26 server id 100 end_log_pos 1853 Table_map: `test`.`t1`
mapped to number 16
#091219 12:45:26 server id 100 end_log_pos 1894 Table_map: `test`.`t2`
mapped to number 17
#091219 12:45:26 server id 100 end_log_pos 1938 Delete_rows: table id 16
#091219 12:45:26 server id 100 end_log_pos 1982 Delete_rows: table id 17
flags: STMT_END_F
...
LOW-LEVEL DESIGN:
Content
~~~~~~~
1. Annotate_rows event number
2. Outline of Annotate_rows event behavior
3. How Master writes Annotate_rows events to the binary log
4. How slave treats replicate-annotate-rows-events option
5. How slave IO thread requests Annotate_rows events
6. How master executes the request
7. How slave SQL thread processes Annotate_rows events
8. General remarks
1. Annotate_rows event number
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To avoid possible event numbers conflict with MySQL/Sun, we leave a gap
between the last MySQL event number and the Annotate_rows event number:
enum Log_event_type
{ ...
INCIDENT_EVENT= 26,
// New MySQL event numbers are to be added here
MYSQL_EVENTS_END,
MARIA_EVENTS_BEGIN= 51,
// New Maria event numbers start from here
ANNOTATE_ROWS_EVENT= 51,
ENUM_END_EVENT
};
together with the corresponding extension of 'post_header_len' array in the
Format description event. (This extension does not affect the compatibility
of the binary log). Here is how Format description event looks like with
this extension:
************************
FORMAT_DESCRIPTION_EVENT
************************
00000004 | A1 A0 2C 4B | time_when = 1261215905
00000008 | 0F | event_type = 15
00000009 | 64 00 00 00 | server_id = 100
0000000D | 7F 00 00 00 | event_len = 127
00000011 | 83 00 00 00 | log_pos = 00000083
00000015 | 01 00 | flags = LOG_EVENT_BINLOG_IN_USE_F
------------------------
00000017 | 04 00 | binlog_ver = 4
00000019 | 35 2E 32 2E | server_ver = 5.2.0-MariaDB-alpha-debug-log
..... ...
0000004B | A1 A0 2C 4B | time_created = 1261215905
0000004F | 13 | common_header_len = 19
------------------------
post_header_len
------------------------
00000050 | 38 | 56 - START_EVENT_V3 [1]
..... ...
00000069 | 02 | 2 - INCIDENT_EVENT [26]
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Each Annotate_rows_log_event object has two private members describing the
corresponding query:
char *m_query_txt;
uint m_query_len;
When the object is created for writing to a binary log, this query is taken
from 'thd' (for short, below we omit the 'Annotate_rows_log_event::' prefix
as well as other implementation details):
Annotate_rows_log_event(THD *thd)
{
m_query_txt = thd->query();
m_query_len = thd->query_length();
}
When the object is read from a binary log, the query is taken from the buffer
containing the binary log representation of the event (this buffer is allocated
in Log_event object from which all Log events are derived):
Annotate_rows_log_event(char *buf, uint event_len,
Format_description_log_event *desc)
{
m_query_len = event_len - desc->common_header_len;
m_query_txt = buf + desc->common_header_len;
}
The events are written to the binary log by the Log_event::write() member
which calls virtual write_data_header() and write_data_body() members
("data header" and "post header" are synonym in replication terminology).
In our case, data header is empty and data body is just the query:
bool write_data_body(IO_CACHE *file)
{
return my_b_safe_write(file, (uchar*) m_query_txt, m_query_len);
}
Printing the event is just printing the query:
void Annotate_rows_log_event::print(FILE *file, PRINT_EVENT_INFO *pinfo)
{
my_b_printf(&pinfo->head_cache, "\tQuery: `%s`\n", m_query_txt);
}
3. How Master writes Annotate_rows events to the binary log
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The event is written to the binary log just before the group of Table_map
events which precede corresponding Rows events (one query may generate
several Table map events in the binary log, but the corresponding
Annotate_rows event must be written only once before the first Table map
event; hence the boolean variable 'with_annotate' below):
int write_locked_table_maps(THD *thd)
{ ...
bool with_annotate= thd->variables.binlog_annotate_rows_events;
...
for (uint i= 0; i < ... <number of tables> ...; ++i)
{ ...
thd->binlog_write_table_map(table, ..., with_annotate);
with_annotate= 0; // write Annotate_event not more than once
...
}
...
}
int THD::binlog_write_table_map(TABLE *table, ..., bool with_annotate)
{ ...
Table_map_log_event the_event(...);
...
if (with_annotate)
{
Annotate_rows_log_event anno(this);
mysql_bin_log.write(&anno);
}
mysql_bin_log.write(&the_event);
...
}
4. How slave treats replicate-annotate-rows-events option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The replicate-annotate-rows-events option is treated just as the session
value of the binlog_annotate_rows_events variable for the slave IO and
SQL threads. This setting is done during initialization of these threads:
pthread_handler_t handle_slave_io(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_IO);
...
}
pthread_handler_t handle_slave_sql(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_SQL);
...
}
int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type)
{ ...
thd->variables.binlog_annotate_rows_events=
opt_replicate_annotate_rows_events;
...
}
5. How slave IO thread requests Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When requesting an event, the slave should inform the master whether
it should send Annotate_rows events or not. To that end we add a new
BINLOG_SEND_ANNOTATE_ROWS_EVENT flag used when requesting an event:
#define BINLOG_DUMP_NON_BLOCK 1
#define BINLOG_SEND_ANNOTATE_ROWS_EVENT 2
pthread_handler_t handle_slave_io(void *arg)
{ ...
request_dump(mysql, ...);
...
}
int request_dump(MYSQL* mysql, ...)
{ ...
if (opt_log_slave_updates &&
mi->io_thd->variables.binlog_annotate_rows_events)
binlog_flags|= BINLOG_SEND_ANNOTATE_ROWS_EVENT;
...
int2store(buf + 4, binlog_flags);
...
simple_command(mysql, COM_BINLOG_DUMP, buf, ...);
...
}
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
case COM_BINLOG_DUMP:
{ ...
flags= uint2korr(packet + 4);
...
mysql_binlog_send(thd, ..., flags);
...
}
void mysql_binlog_send(THD* thd, ..., ushort flags)
{ ...
Log_event::read_log_event(&log, packet, ...);
...
if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT ||
thd->server_id == 0 /* slave == mysqlbinlog */ )
{
my_net_write(net, packet->ptr(), packet->length());
}
...
}
7. How slave SQL thread processes Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The slave processes each recieved event by "applying" it, i.e. by
calling the Log_event::apply_event() function which in turn calls
the virtual do_apply_event() member specific for each type of the
event.
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev = next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
int apply_event_and_update_pos(Log_event *ev, ...)
{ ...
ev->apply_event(...);
...
}
int Log_event::apply_event(...)
{
return do_apply_event(...);
}
What does it mean to "apply" an Annotate_rows event? It means to set current
thd query to that of the described by the event, i.e. to the query which
caused the subsequent Rows events (see "How Master writes Annotate_rows
events to the binary log" to follow what happens further when the subsequent
Rows events are applied):
int Annotate_rows_log_event::do_apply_event(...)
{
thd->set_query(m_query_txt, m_query_len);
}
NOTE. I am not sure, but possibly current values of thd->query and
thd->query_length should be saved before calling set_query() and to be
restored on the Annotate_rows_log_event object deletion.
Is it really needed ?
After calling this do_apply_event() function we may not delete the
Annotate_rows_log_event object immediatedly (see exec_relay_log_event()
above) because thd->query now points to the string inside this object.
We may keep the pointer to this object in the Relay_log_info:
class Relay_log_info
{
public:
...
void set_annotate_event(Annotate_rows_log_event*);
Annotate_rows_log_event* get_annotate_event();
void free_annotate_event();
...
private:
Annotate_rows_log_event* m_annotate_event;
};
The saved Annotate_rows object should be deleted when all corresponding
Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (rli->get_annotate_event() && is_last_rows_event(ev))
rli->free_annotate_event();
else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
where
bool is_last_rows_event(Log_event* ev)
{
Log_event_type type= ev->get_type_code();
if (IS_ROWS_EVENT_TYPE(type))
{
Rows_log_event* rows= (Rows_log_event*)ev;
return rows->get_flags(Rows_log_event::STMT_END_F);
}
return 0;
}
#define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
(type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
Kristian noticed that introducing new log event type should be coordinated
somehow with MySQL/Sun:
Kristian: The numeric code for this event must be assigned carefully.
It should be coordinated with MySQL/Sun, otherwise we can get into a
situation where MySQL uses the same numeric code for one event that
MariaDB uses for ANNOTATE_ROWS_EVENT, which would make merging the two
impossible.
Alex: I reserved about 20 numbers not to have possible conflicts
with MySQL.
Kristian: Still, I think it would be appropriate to send a polite email
to internals(a)lists.mysql.com about this and suggesting to reserve the
event number.
Also we should notice the introduction of the BINLOG_SEND_ANNOTATE_ROWS_EVENT
flag taking into account that MySQL/Sun may also introduce a flag with the
same value to be used in the request_dump-mysql_binlog_send interface.
But this is mainly the question of merging: if a conflict concerning this
flag occur, we may simply change the BINLOG_SEND_ANNOTATE_ROWS_EVENT value
(this does not require additional changes in the code).
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0