December 2009 - developers - lists.mariadb.org

[Maria-developers] Updated (by Sanja): Add info to engine description (61)
by worklog-noreply＠askmonty.org 04 Dec '09

-----------------------------------------------------------------------
                              WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add info to engine description
CREATION DATE..: Mon, 02 Nov 2009, 22:58
SUPERVISOR.....: Monty
IMPLEMENTOR....: Sanja
COPIES TO......:
CATEGORY.......: Server-BackLog
TASK ID........: 61 (http://askmonty.org/worklog/?tid=61)
VERSION........: Server-5.1
STATUS.........: Code-Review
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 5 (hours remain)
ORIG. ESTIMATE.: 5

PROGRESS NOTES:

-=-=(Sanja - Fri, 04 Dec 2009, 13:50)=-=-
Status updated.
--- /tmp/wklog.61.old.16175 2009-12-04 13:50:12.000000000 +0200
+++ /tmp/wklog.61.new.16175 2009-12-04 13:50:12.000000000 +0200
@@ -1 +1 @@
-Assigned
+Code-Review

-=-=(Sanja - Fri, 20 Nov 2009, 00:18)=-=-
High-Level Specification modified.
--- /tmp/wklog.61.old.19932 2009-11-20 00:18:57.000000000 +0200
+++ /tmp/wklog.61.new.19932 2009-11-20 00:18:57.000000000 +0200
@@ -1 +1,9 @@
+There is two time of engines/plugins build in and loadable. For both we add
+structure with additional information
+1) for build in we will modify MYSQL_PLUGIN_STATIC to collect that strucures in
+array also with builtin_*_plugin structures.
+
+2) for dinamic plugins we check in plugin_dl_add() if the pligin defins maria
+simbols we read info in if no then fill maturety with UNKNOWN and version with
+empty string.
 

-=-=(Sanja - Thu, 19 Nov 2009, 23:56)=-=-
High Level Description modified.
--- /tmp/wklog.61.old.19057 2009-11-19 23:56:48.000000000 +0200
+++ /tmp/wklog.61.new.19057 2009-11-19 23:56:48.000000000 +0200
@@ -1,8 +1,6 @@
-Add additional information about engine and show it in SHOW ENGINES:
+Add additional information about engine and show it in information schema:
 
-License (PROPRIETARY, GPL, BSD) (it is already present, just have to be shown)
-
-Maturity (TEST, ALPHA, BETA, GAMMA, RELEASE)
+Maturity (UNKNOWN, TEST, ALPHA, BETA, GAMMA, RELEASE)
 
 Version (just string from engine developer like "0.99 beta", better if it will
 be monotonically increasing in terms of alphabetical sort).

-=-=(Hakan - Thu, 05 Nov 2009, 21:09)=-=-
High Level Description modified.
--- /tmp/wklog.61.old.22402 2009-11-05 21:09:08.000000000 +0200
+++ /tmp/wklog.61.new.22402 2009-11-05 21:09:08.000000000 +0200
@@ -1,8 +1,8 @@
 Add additional information about engine and show it in SHOW ENGINES:
 
-License (PROPRIETARY, GPL, BSD) (it is present just have to be shown)
+License (PROPRIETARY, GPL, BSD) (it is already present, just have to be shown)
 
 Maturity (TEST, ALPHA, BETA, GAMMA, RELEASE)
 
-Version (just string from engine developer like "0.99 betta", better if it will
+Version (just string from engine developer like "0.99 beta", better if it will
 be monotonically increasing in terms of alphabetical sort).

DESCRIPTION:

Add additional information about each engine and show it in the information
schema:

Maturity (UNKNOWN, TEST, ALPHA, BETA, GAMMA, RELEASE)

Version (a free-form string from the engine developer, such as "0.99 beta";
preferably one that increases monotonically under alphabetical sort).

HIGH-LEVEL SPECIFICATION:

There are two types of engines/plugins: built-in and loadable. For both we add
a structure with additional information.

1) For built-in plugins we will modify MYSQL_PLUGIN_STATIC to collect those
structures in an array, alongside the builtin_*_plugin structures.

2) For dynamic plugins we check in plugin_dl_add() whether the plugin defines
the maria symbols. If it does, we read the info in; if not, we fill maturity
with UNKNOWN and version with an empty string.

ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
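The fallback rule for dynamically loaded plugins described in task 61 can be sketched as follows. This is only an illustration in Python, not the server's C code. The Maturity levels come from the task description; read_plugin_info() and the two symbol names are hypothetical stand-ins, since the worklog only says "maria symbols", and a plain dict stands in for the symbol table that plugin_dl_add() would inspect.

```python
from enum import IntEnum


class Maturity(IntEnum):
    """Maturity levels listed in the task description, lowest first."""
    UNKNOWN = 0
    TEST = 1
    ALPHA = 2
    BETA = 3
    GAMMA = 4
    RELEASE = 5


# Hypothetical symbol names; the worklog does not name them.
MATURITY_SYMBOL = "maria_plugin_maturity"
VERSION_SYMBOL = "maria_plugin_version"


def read_plugin_info(symbols):
    """Return (maturity, version) for a plugin's exported symbols.

    `symbols` is a dict standing in for the symbol table of a loaded
    library. If the symbols are missing, fall back to UNKNOWN and an
    empty version string, as the specification describes.
    """
    if MATURITY_SYMBOL in symbols and VERSION_SYMBOL in symbols:
        return Maturity(symbols[MATURITY_SYMBOL]), str(symbols[VERSION_SYMBOL])
    return Maturity.UNKNOWN, ""


if __name__ == "__main__":
    # A plugin that exports both symbols.
    print(read_plugin_info({MATURITY_SYMBOL: 3, VERSION_SYMBOL: "0.99 beta"}))
    # An older plugin that exports neither.
    print(read_plugin_info({}))
    # Version strings are compared as text, not as numbers.
    print(sorted(["0.9", "0.10", "0.99 beta"]))
```

The last line shows why the description asks for version strings that increase under alphabetical sort: compared as text, "0.10" sorts before "0.9".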

[Maria-developers] Updated (by Alexi): Store in binlog text of statements that caused RBR events (47)
by worklog-noreply＠askmonty.org 04 Dec '09

-----------------------------------------------------------------------
                              WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Store in binlog text of statements that caused RBR events
CREATION DATE..: Sat, 15 Aug 2009, 23:48
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Knielsen
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 47 (http://askmonty.org/worklog/?tid=47)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0

PROGRESS NOTES:

-=-=(Alexi - Fri, 04 Dec 2009, 13:00)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.14001 2009-12-04 13:00:24.000000000 +0200
+++ /tmp/wklog.47.new.14001 2009-12-04 13:00:24.000000000 +0200
@@ -6,27 +6,27 @@
 
 New server option
 ~~~~~~~~~~~~~~~~~
-  --binlog-annotate-row-events
+  --binlog-annotate-rows-events
 
-Setting this option makes RBR (row-) events in the binary log to be
+Setting this option makes RBR (rows-) events in the binary log to be
 preceded by Annotate rows events (see below). The corresponding
-'binlog_annotate_row_events' system variable is dynamic and has both
+'binlog_annotate_rows_events' system variable is dynamic and has both
 global and session values. Default global value is OFF.
 
 Note. Session values are usefull to make it possible to annotate only
       some selected statements:
 
       ...
-      SET SESSION binlog_annotate_row_events=ON;
+      SET SESSION binlog_annotate_rows_events=ON;
       ... statements to be annotated ...
-      SET SESSION binlog_annotate_row_events=OFF;
+      SET SESSION binlog_annotate_rows_events=OFF;
       ... statements not to be annotated ...
 
 New binlog event type
 ~~~~~~~~~~~~~~~~~~~~~
   Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
 
-Describes the query which caused the corresponding row event. In binary log,
+Describes the query which caused the corresponding rows event. In binary log,
 precedes each Table_map_log_event. Contains empty post-header and the query
 text in its data part.
 
@@ -79,6 +79,15 @@
   0000012F | 0F 00 00 00 | table_id = 15
   ...
 
+New mysqlbinlog option
+~~~~~~~~~~~~~~~~~~~~~~
+  --print-annotate-rows-events
+
+With this option, mysqlbinlog prints the content of Annotate-rows
+events (if the binary log does contain them). Without this option
+(i.e. by default), mysqlbinlog skips Annotate rows events.
+
+
 mysqlbinlog output
 ~~~~~~~~~~~~~~~~~~
 Something like this:
@@ -109,5 +118,5 @@
 1. Master always sends Annotate_rows events to mysqlbinlog (in
    remote case).
 2. Master sends Annotate_rows events to a slave only if the slave has
-   both log-slave-updates and binlog-annotate-row-events options set.
+   both log-slave-updates and binlog-annotate-rows-events options set.
 

-=-=(Alexi - Wed, 02 Dec 2009, 13:32)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.17456 2009-12-02 13:32:18.000000000 +0200
+++ /tmp/wklog.47.new.17456 2009-12-02 13:32:18.000000000 +0200
@@ -1,8 +1 @@
-mysql_binlog_send() [sql/sql_repl.cc]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-1. When sending events to a slave, master should simply skip
-   Annotate_rows events (they are not needed for replication).
-   [ ??? Multi-master - currently not clear ]
-2. When sending events to mysqlbinlog (remote case), master
-   must send Annotate_rows events as well.
 

-=-=(Alexi - Wed, 02 Dec 2009, 13:31)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.17414 2009-12-02 11:31:59.000000000 +0000
+++ /tmp/wklog.47.new.17414 2009-12-02 11:31:59.000000000 +0000
@@ -104,3 +104,10 @@
   ### @1=1 /* INT meta=0 nullable=1 is_null=0 */
   ...
 
+When master sends Annotate rows events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+1. Master always sends Annotate_rows events to mysqlbinlog (in
+   remote case).
+2. Master sends Annotate_rows events to a slave only if the slave has
+   both log-slave-updates and binlog-annotate-row-events options set.
+

-=-=(Knielsen - Mon, 30 Nov 2009, 11:21)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.18210 2009-11-30 11:21:33.000000000 +0200
+++ /tmp/wklog.47.new.18210 2009-11-30 11:21:33.000000000 +0200
@@ -28,7 +28,14 @@
 
 Describes the query which caused the corresponding row event. In binary log,
 precedes each Table_map_log_event. Contains empty post-header and the query
-text in its data part. Example:
+text in its data part.
+
+The numeric code for this event must be assigned carefully. It should be
+coordinated with MySQL/Sun, otherwise we can get into a situation where MySQL
+uses the same numeric code for one event that MariaDB uses for
+ANNOTATE_ROWS_EVENT, which would make merging the two impossible.
+
+Example:
 
   ...
   ************************

-=-=(Alexi - Mon, 30 Nov 2009, 10:33)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.16188 2009-11-30 10:33:44.000000000 +0200
+++ /tmp/wklog.47.new.16188 2009-11-30 10:33:44.000000000 +0200
@@ -2,6 +2,7 @@
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1. When sending events to a slave, master should simply skip
    Annotate_rows events (they are not needed for replication).
+   [ ??? Multi-master - currently not clear ]
 2. When sending events to mysqlbinlog (remote case), master
    must send Annotate_rows events as well.
 

-=-=(Alexi - Sun, 29 Nov 2009, 13:00)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.32047 2009-11-29 13:00:21.000000000 +0200
+++ /tmp/wklog.47.new.32047 2009-11-29 13:00:21.000000000 +0200
@@ -1 +1,7 @@
+mysql_binlog_send() [sql/sql_repl.cc]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+1. When sending events to a slave, master should simply skip
+   Annotate_rows events (they are not needed for replication).
+2. When sending events to mysqlbinlog (remote case), master
+   must send Annotate_rows events as well.
 

-=-=(Alexi - Sun, 29 Nov 2009, 09:50)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.24993 2009-11-29 07:50:36.000000000 +0000
+++ /tmp/wklog.47.new.24993 2009-11-29 07:50:36.000000000 +0000
@@ -4,3 +4,96 @@
 > (Comment_log_event?). Unless we want to log an empty statement Query_log_event
 > containing only a comment (a bit of a hack).
 
+New server option
+~~~~~~~~~~~~~~~~~
+  --binlog-annotate-row-events
+
+Setting this option makes RBR (row-) events in the binary log to be
+preceded by Annotate rows events (see below). The corresponding
+'binlog_annotate_row_events' system variable is dynamic and has both
+global and session values. Default global value is OFF.
+
+Note. Session values are usefull to make it possible to annotate only
+      some selected statements:
+
+      ...
+      SET SESSION binlog_annotate_row_events=ON;
+      ... statements to be annotated ...
+      SET SESSION binlog_annotate_row_events=OFF;
+      ... statements not to be annotated ...
+
+New binlog event type
+~~~~~~~~~~~~~~~~~~~~~
+  Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
+
+Describes the query which caused the corresponding row event. In binary log,
+precedes each Table_map_log_event. Contains empty post-header and the query
+text in its data part. Example:
+
+  ...
+  ************************
+  ANNOTATE_ROWS_EVENT [51]
+  ************************
+  000000C7 | 54 1B 12 4B | time_when = 1259477844
+  000000CB | 33          | event_type = 51
+  000000CC | 64 00 00 00 | server_id = 100
+  000000D0 | 2C 00 00 00 | event_len = 44
+  000000D4 | F3 00 00 00 | log_pos = 000000F3
+  000000D8 | 00 00       | flags = <none>
+  ------------------------
+  000000DA | 69 6E 73 65 | query = "insert into t1 values (1)"
+  000000DE | 72 74 20 69 |
+  000000E2 | 6E 74 6F 20 |
+  000000E6 | 74 31 20 76 |
+  000000EA | 61 6C 75 65 |
+  000000EE | 73 20 28 31 |
+  000000F2 | 29          |
+  ************************
+  TABLE_MAP_EVENT [19]
+  ************************
+  000000F3 | 54 1B 12 4B | time_when = 1259477844
+  000000F7 | 13          | event_type = 19
+  000000F8 | 64 00 00 00 | server_id = 100
+  000000FC | 29 00 00 00 | event_len = 41
+  00000100 | 1C 01 00 00 | log_pos = 0000011C
+  00000104 | 00 00       | flags = <none>
+  ------------------------
+  ...
+  ************************
+  WRITE_ROWS_EVENT [23]
+  ************************
+  0000011C | 54 1B 12 4B | time_when = 1259477844
+  00000120 | 17          | event_type = 23
+  00000121 | 64 00 00 00 | server_id = 100
+  00000125 | 22 00 00 00 | event_len = 34
+  00000129 | 3E 01 00 00 | log_pos = 0000013E
+  0000012D | 10 00       | flags = LOG_EVENT_UPDATE_TABLE_MAP_VERSION_F
+  ------------------------
+  0000012F | 0F 00 00 00 | table_id = 15
+  ...
+
+mysqlbinlog output
+~~~~~~~~~~~~~~~~~~
+Something like this:
+
+  ...
+  # at 199
+  # at 243
+  # at 284
+  #091129 9:57:24 server id 100 end_log_pos 243 Query: `insert into t1 values
+(1)`
+  #091129 9:57:24 server id 100 end_log_pos 284 Table_map: `test`.`t1` mapped
+to number 15
+  #091129 9:57:24 server id 100 end_log_pos 318 Write_rows: table id 15
+flags: STMT_END_F
+
+  BINLOG '
+  VBsSSzNkAAAALAAAAPMAAAAAAGluc2VydCBpbnRvIHQxIHZhbHVlcyAoMSk=
+  VBsSSxNkAAAAKQAAABwBAAAAAA8AAAAAAAAABHRlc3QAAnQxAAEDAAE=
+  VBsSSxdkAAAAIgAAAD4BAAAQAA8AAAAAAAEAAf/+AQAAAA==
+  '/*!*/;
+  ### INSERT INTO test.t1
+  ### SET
+  ### @1=1 /* INT meta=0 nullable=1 is_null=0 */
+  ...
+

-=-=(Knielsen - Fri, 27 Nov 2009, 13:30)=-=-
Observers changed: Knielsen

-=-=(Psergey - Sun, 16 Aug 2009, 11:08)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.12485 2009-08-16 11:08:33.000000000 +0300
+++ /tmp/wklog.47.new.12485 2009-08-16 11:08:33.000000000 +0300
@@ -1 +1,6 @@
+First suggestion:
+
+> I think for this we would actually need a new binlog event type
+> (Comment_log_event?). Unless we want to log an empty statement Query_log_event
+> containing only a comment (a bit of a hack).
 

-=-=(Psergey - Sun, 16 Aug 2009, 00:02)=-=-
Dependency created: 39 now depends on 47

DESCRIPTION:

Store in the binlog (and show in mysqlbinlog output) the text of statements
that caused RBR events.

This is needed for (list from Monty):
- Easier to understand why updates happened
- Would make it easier to find out where in the application things went
  wrong (as you can search for exact strings)
- Allow one to filter things based on comments in the statement.

The cost of this can be that the binlog will be approximately 2x in size
(inserts of big blobs especially would be a bit painful), so this should
be an optional feature.

HIGH-LEVEL SPECIFICATION:

First suggestion:

> I think for this we would actually need a new binlog event type
> (Comment_log_event?). Unless we want to log an empty statement Query_log_event
> containing only a comment (a bit of a hack).

New server option
~~~~~~~~~~~~~~~~~
  --binlog-annotate-rows-events

Setting this option causes RBR (rows) events in the binary log to be
preceded by Annotate rows events (see below). The corresponding
'binlog_annotate_rows_events' system variable is dynamic and has both
global and session values. The default global value is OFF.

Note. Session values make it possible to annotate only some selected
      statements:

      ...
      SET SESSION binlog_annotate_rows_events=ON;
      ... statements to be annotated ...
      SET SESSION binlog_annotate_rows_events=OFF;
      ... statements not to be annotated ...

New binlog event type
~~~~~~~~~~~~~~~~~~~~~
  Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]

Describes the query which caused the corresponding rows event. In the binary
log, it precedes each Table_map_log_event. It contains an empty post-header
and the query text in its data part.

The numeric code for this event must be assigned carefully. It should be
coordinated with MySQL/Sun; otherwise we can get into a situation where MySQL
uses the same numeric code for one event that MariaDB uses for
ANNOTATE_ROWS_EVENT, which would make merging the two impossible.

Example:

  ...
  ************************
  ANNOTATE_ROWS_EVENT [51]
  ************************
  000000C7 | 54 1B 12 4B | time_when = 1259477844
  000000CB | 33          | event_type = 51
  000000CC | 64 00 00 00 | server_id = 100
  000000D0 | 2C 00 00 00 | event_len = 44
  000000D4 | F3 00 00 00 | log_pos = 000000F3
  000000D8 | 00 00       | flags = <none>
  ------------------------
  000000DA | 69 6E 73 65 | query = "insert into t1 values (1)"
  000000DE | 72 74 20 69 |
  000000E2 | 6E 74 6F 20 |
  000000E6 | 74 31 20 76 |
  000000EA | 61 6C 75 65 |
  000000EE | 73 20 28 31 |
  000000F2 | 29          |
  ************************
  TABLE_MAP_EVENT [19]
  ************************
  000000F3 | 54 1B 12 4B | time_when = 1259477844
  000000F7 | 13          | event_type = 19
  000000F8 | 64 00 00 00 | server_id = 100
  000000FC | 29 00 00 00 | event_len = 41
  00000100 | 1C 01 00 00 | log_pos = 0000011C
  00000104 | 00 00       | flags = <none>
  ------------------------
  ...
  ************************
  WRITE_ROWS_EVENT [23]
  ************************
  0000011C | 54 1B 12 4B | time_when = 1259477844
  00000120 | 17          | event_type = 23
  00000121 | 64 00 00 00 | server_id = 100
  00000125 | 22 00 00 00 | event_len = 34
  00000129 | 3E 01 00 00 | log_pos = 0000013E
  0000012D | 10 00       | flags = LOG_EVENT_UPDATE_TABLE_MAP_VERSION_F
  ------------------------
  0000012F | 0F 00 00 00 | table_id = 15
  ...

New mysqlbinlog option
~~~~~~~~~~~~~~~~~~~~~~
  --print-annotate-rows-events

With this option, mysqlbinlog prints the content of Annotate rows
events (if the binary log contains them). Without this option
(i.e. by default), mysqlbinlog skips Annotate rows events.

mysqlbinlog output
~~~~~~~~~~~~~~~~~~
Something like this:

  ...
  # at 199
  # at 243
  # at 284
  #091129 9:57:24 server id 100 end_log_pos 243 Query: `insert into t1 values (1)`
  #091129 9:57:24 server id 100 end_log_pos 284 Table_map: `test`.`t1` mapped to number 15
  #091129 9:57:24 server id 100 end_log_pos 318 Write_rows: table id 15 flags: STMT_END_F

  BINLOG '
  VBsSSzNkAAAALAAAAPMAAAAAAGluc2VydCBpbnRvIHQxIHZhbHVlcyAoMSk=
  VBsSSxNkAAAAKQAAABwBAAAAAA8AAAAAAAAABHRlc3QAAnQxAAEDAAE=
  VBsSSxdkAAAAIgAAAD4BAAAQAA8AAAAAAAEAAf/+AQAAAA==
  '/*!*/;
  ### INSERT INTO test.t1
  ### SET
  ### @1=1 /* INT meta=0 nullable=1 is_null=0 */
  ...

When master sends Annotate rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1. Master always sends Annotate_rows events to mysqlbinlog (in
   remote case).
2. Master sends Annotate_rows events to a slave only if the slave has
   both log-slave-updates and binlog-annotate-rows-events options set.

ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
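The hex dump and the first line of the BINLOG statement in task 47 describe the same 44-byte event, and the correspondence can be checked with a short script. The following is a minimal sketch in Python, not code from the server or from mysqlbinlog. It assumes the 19-byte common event header shown in the dump (timestamp, type, server id, event length, end position, flags, all little-endian), an empty post-header as the specification states, and no trailing checksum. The function name parse_annotate_event is made up for illustration.

```python
import base64
import struct

# First line of the BINLOG statement in the sample mysqlbinlog output.
ANNOTATE_B64 = "VBsSSzNkAAAALAAAAPMAAAAAAGluc2VydCBpbnRvIHQxIHZhbHVlcyAoMSk="

# Common header, little-endian: time_when (4), event_type (1),
# server_id (4), event_len (4), log_pos (4), flags (2) = 19 bytes.
HEADER = struct.Struct("<IBIIIH")


def parse_annotate_event(raw):
    """Split one Annotate rows event into header fields and query text.

    Assumes an empty post-header, so the query text starts right after
    the common header and runs to the end of the event.
    """
    time_when, event_type, server_id, event_len, log_pos, flags = (
        HEADER.unpack_from(raw)
    )
    query = raw[HEADER.size:event_len].decode("ascii")
    return {
        "time_when": time_when,
        "event_type": event_type,
        "server_id": server_id,
        "event_len": event_len,
        "log_pos": log_pos,
        "flags": flags,
        "query": query,
    }


if __name__ == "__main__":
    event = parse_annotate_event(base64.b64decode(ANNOTATE_B64))
    for name, value in event.items():
        print(f"{name} = {value}")
```

Each printed field can be compared against the ANNOTATE_ROWS_EVENT [51] block of the dump. Note that 51 is the provisional type code used in this example; as the specification says, the final numeric code still has to be coordinated.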

[Maria-developers] Updated (by Alexi): Store in binlog text of statements that caused RBR events (47)
by worklog-noreply＠askmonty.org 04 Dec '09

04 Dec '09

----------------------------------------------------------------------- WORKLOG TASK -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- TASK...........: Store in binlog text of statements that caused RBR events CREATION DATE..: Sat, 15 Aug 2009, 23:48 SUPERVISOR.....: Monty IMPLEMENTOR....: COPIES TO......: Knielsen CATEGORY.......: Server-RawIdeaBin TASK ID........: 47 (http://askmonty.org/worklog/?tid=47) VERSION........: Server-9.x STATUS.........: Un-Assigned PRIORITY.......: 60 WORKED HOURS...: 0 ESTIMATE.......: 0 (hours remain) ORIG. ESTIMATE.: 0 PROGRESS NOTES: -=-=(Alexi - Fri, 04 Dec 2009, 13:00)=-=- High-Level Specification modified. --- /tmp/wklog.47.old.14001 2009-12-04 13:00:24.000000000 +0200 +++ /tmp/wklog.47.new.14001 2009-12-04 13:00:24.000000000 +0200 @@ -6,27 +6,27 @@ New server option ~~~~~~~~~~~~~~~~~ - --binlog-annotate-row-events + --binlog-annotate-rows-events -Setting this option makes RBR (row-) events in the binary log to be +Setting this option makes RBR (rows-) events in the binary log to be preceded by Annotate rows events (see below). The corresponding -'binlog_annotate_row_events' system variable is dynamic and has both +'binlog_annotate_rows_events' system variable is dynamic and has both global and session values. Default global value is OFF. Note. Session values are usefull to make it possible to annotate only some selected statements: ... - SET SESSION binlog_annotate_row_events=ON; + SET SESSION binlog_annotate_rows_events=ON; ... statements to be annotated ... - SET SESSION binlog_annotate_row_events=OFF; + SET SESSION binlog_annotate_rows_events=OFF; ... statements not to be annotated ... New binlog event type ~~~~~~~~~~~~~~~~~~~~~ Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ] -Describes the query which caused the corresponding row event. In binary log, +Describes the query which caused the corresponding rows event. In binary log, precedes each Table_map_log_event. 
Contains empty post-header and the query text in its data part. @@ -79,6 +79,15 @@ 0000012F | 0F 00 00 00 | table_id = 15 ... +New mysqlbinlog option +~~~~~~~~~~~~~~~~~~~~~~ + --print-annotate-rows-events + +With this option, mysqlbinlog prints the content of Annotate-rows +events (if the binary log does contain them). Without this option +(i.e. by default), mysqlbinlog skips Annotate rows events. + + mysqlbinlog output ~~~~~~~~~~~~~~~~~~ Something like this: @@ -109,5 +118,5 @@ 1. Master always sends Annotate_rows events to mysqlbinlog (in remote case). 2. Master sends Annotate_rows events to a slave only if the slave has - both log-slave-updates and binlog-annotate-row-events options set. + both log-slave-updates and binlog-annotate-rows-events options set. -=-=(Alexi - Wed, 02 Dec 2009, 13:32)=-=- Low Level Design modified. --- /tmp/wklog.47.old.17456 2009-12-02 13:32:18.000000000 +0200 +++ /tmp/wklog.47.new.17456 2009-12-02 13:32:18.000000000 +0200 @@ -1,8 +1 @@ -mysql_binlog_send() [sql/sql_repl.cc] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -1. When sending events to a slave, master should simply skip - Annotate_rows events (they are not needed for replication). - [ ??? Multi-master - currently not clear ] -2. When sending events to mysqlbinlog (remote case), master - must send Annotate_rows events as well. -=-=(Alexi - Wed, 02 Dec 2009, 13:31)=-=- High-Level Specification modified. --- /tmp/wklog.47.old.17414 2009-12-02 11:31:59.000000000 +0000 +++ /tmp/wklog.47.new.17414 2009-12-02 11:31:59.000000000 +0000 @@ -104,3 +104,10 @@ ### @1=1 /* INT meta=0 nullable=1 is_null=0 */ ... +When master sends Annotate rows events +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +1. Master always sends Annotate_rows events to mysqlbinlog (in + remote case). +2. Master sends Annotate_rows events to a slave only if the slave has + both log-slave-updates and binlog-annotate-row-events options set. + -=-=(Knielsen - Mon, 30 Nov 2009, 11:21)=-=- High-Level Specification modified. 
--- /tmp/wklog.47.old.18210 2009-11-30 11:21:33.000000000 +0200 +++ /tmp/wklog.47.new.18210 2009-11-30 11:21:33.000000000 +0200 @@ -28,7 +28,14 @@ Describes the query which caused the corresponding row event. In binary log, precedes each Table_map_log_event. Contains empty post-header and the query -text in its data part. Example: +text in its data part. + +The numeric code for this event must be assigned carefully. It should be +coordinated with MySQL/Sun, otherwise we can get into a situation where MySQL +uses the same numeric code for one event that MariaDB uses for +ANNOTATE_ROWS_EVENT, which would make merging the two impossible. + +Example: ... ************************ -=-=(Alexi - Mon, 30 Nov 2009, 10:33)=-=- Low Level Design modified. --- /tmp/wklog.47.old.16188 2009-11-30 10:33:44.000000000 +0200 +++ /tmp/wklog.47.new.16188 2009-11-30 10:33:44.000000000 +0200 @@ -2,6 +2,7 @@ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1. When sending events to a slave, master should simply skip Annotate_rows events (they are not needed for replication). + [ ??? Multi-master - currently not clear ] 2. When sending events to mysqlbinlog (remote case), master must send Annotate_rows events as well. -=-=(Alexi - Sun, 29 Nov 2009, 13:00)=-=- Low Level Design modified. --- /tmp/wklog.47.old.32047 2009-11-29 13:00:21.000000000 +0200 +++ /tmp/wklog.47.new.32047 2009-11-29 13:00:21.000000000 +0200 @@ -1 +1,7 @@ +mysql_binlog_send() [sql/sql_repl.cc] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +1. When sending events to a slave, master should simply skip + Annotate_rows events (they are not needed for replication). +2. When sending events to mysqlbinlog (remote case), master + must send Annotate_rows events as well. -=-=(Alexi - Sun, 29 Nov 2009, 09:50)=-=- High-Level Specification modified. --- /tmp/wklog.47.old.24993 2009-11-29 07:50:36.000000000 +0000 +++ /tmp/wklog.47.new.24993 2009-11-29 07:50:36.000000000 +0000 @@ -4,3 +4,96 @@ > (Comment_log_event?). 
Unless we want to log an empty statement Query_log_event > containing only a comment (a bit of a hack). +New server option +~~~~~~~~~~~~~~~~~ + --binlog-annotate-row-events + +Setting this option makes RBR (row-) events in the binary log to be +preceded by Annotate rows events (see below). The corresponding +'binlog_annotate_row_events' system variable is dynamic and has both +global and session values. Default global value is OFF. + +Note. Session values are usefull to make it possible to annotate only + some selected statements: + + ... + SET SESSION binlog_annotate_row_events=ON; + ... statements to be annotated ... + SET SESSION binlog_annotate_row_events=OFF; + ... statements not to be annotated ... + +New binlog event type +~~~~~~~~~~~~~~~~~~~~~ + Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ] + +Describes the query which caused the corresponding row event. In binary log, +precedes each Table_map_log_event. Contains empty post-header and the query +text in its data part. Example: + + ... + ************************ + ANNOTATE_ROWS_EVENT [51] + ************************ + 000000C7 | 54 1B 12 4B | time_when = 1259477844 + 000000CB | 33 | event_type = 51 + 000000CC | 64 00 00 00 | server_id = 100 + 000000D0 | 2C 00 00 00 | event_len = 44 + 000000D4 | F3 00 00 00 | log_pos = 000000F3 + 000000D8 | 00 00 | flags = <none> + ------------------------ + 000000DA | 69 6E 73 65 | query = "insert into t1 values (1)" + 000000DE | 72 74 20 69 | + 000000E2 | 6E 74 6F 20 | + 000000E6 | 74 31 20 76 | + 000000EA | 61 6C 75 65 | + 000000EE | 73 20 28 31 | + 000000F2 | 29 | + ************************ + TABLE_MAP_EVENT [19] + ************************ + 000000F3 | 54 1B 12 4B | time_when = 1259477844 + 000000F7 | 13 | event_type = 19 + 000000F8 | 64 00 00 00 | server_id = 100 + 000000FC | 29 00 00 00 | event_len = 41 + 00000100 | 1C 01 00 00 | log_pos = 0000011C + 00000104 | 00 00 | flags = <none> + ------------------------ + ... 
+ ************************ + WRITE_ROWS_EVENT [23] + ************************ + 0000011C | 54 1B 12 4B | time_when = 1259477844 + 00000120 | 17 | event_type = 23 + 00000121 | 64 00 00 00 | server_id = 100 + 00000125 | 22 00 00 00 | event_len = 34 + 00000129 | 3E 01 00 00 | log_pos = 0000013E + 0000012D | 10 00 | flags = LOG_EVENT_UPDATE_TABLE_MAP_VERSION_F + ------------------------ + 0000012F | 0F 00 00 00 | table_id = 15 + ... + +mysqlbinlog output +~~~~~~~~~~~~~~~~~~ +Something like this: + + ... + # at 199 + # at 243 + # at 284 + #091129 9:57:24 server id 100 end_log_pos 243 Query: `insert into t1 values +(1)` + #091129 9:57:24 server id 100 end_log_pos 284 Table_map: `test`.`t1` mapped +to number 15 + #091129 9:57:24 server id 100 end_log_pos 318 Write_rows: table id 15 +flags: STMT_END_F + + BINLOG ' + VBsSSzNkAAAALAAAAPMAAAAAAGluc2VydCBpbnRvIHQxIHZhbHVlcyAoMSk= + VBsSSxNkAAAAKQAAABwBAAAAAA8AAAAAAAAABHRlc3QAAnQxAAEDAAE= + VBsSSxdkAAAAIgAAAD4BAAAQAA8AAAAAAAEAAf/+AQAAAA== + '/*!*/; + ### INSERT INTO test.t1 + ### SET + ### @1=1 /* INT meta=0 nullable=1 is_null=0 */ + ... + -=-=(Knielsen - Fri, 27 Nov 2009, 13:30)=-=- Observers changed: Knielsen -=-=(Psergey - Sun, 16 Aug 2009, 11:08)=-=- High-Level Specification modified. --- /tmp/wklog.47.old.12485 2009-08-16 11:08:33.000000000 +0300 +++ /tmp/wklog.47.new.12485 2009-08-16 11:08:33.000000000 +0300 @@ -1 +1,6 @@ +First suggestion: + +> I think for this we would actually need a new binlog event type +> (Comment_log_event?). Unless we want to log an empty statement Query_log_event +> containing only a comment (a bit of a hack). 
-=-=(Psergey - Sun, 16 Aug 2009, 00:02)=-=- Dependency created: 39 now depends on 47 DESCRIPTION: Store in binlog (and show in mysqlbinlog output) texts of statements that caused RBR events This is needed for (list from Monty): - Easier to understand why updates happened - Would make it easier to find out where in application things went wrong (as you can search for exact strings) - Allow one to filter things based on comments in the statement. The cost of this can be that the binlog will be approximately 2x in size (especially insert of big blob's would be a bit painful), so this should be an optional feature. HIGH-LEVEL SPECIFICATION: First suggestion: > I think for this we would actually need a new binlog event type > (Comment_log_event?). Unless we want to log an empty statement Query_log_event > containing only a comment (a bit of a hack). New server option ~~~~~~~~~~~~~~~~~ --binlog-annotate-rows-events Setting this option makes RBR (rows-) events in the binary log to be preceded by Annotate rows events (see below). The corresponding 'binlog_annotate_rows_events' system variable is dynamic and has both global and session values. Default global value is OFF. Note. Session values are usefull to make it possible to annotate only some selected statements: ... SET SESSION binlog_annotate_rows_events=ON; ... statements to be annotated ... SET SESSION binlog_annotate_rows_events=OFF; ... statements not to be annotated ... New binlog event type ~~~~~~~~~~~~~~~~~~~~~ Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ] Describes the query which caused the corresponding rows event. In binary log, precedes each Table_map_log_event. Contains empty post-header and the query text in its data part. The numeric code for this event must be assigned carefully. It should be coordinated with MySQL/Sun, otherwise we can get into a situation where MySQL uses the same numeric code for one event that MariaDB uses for ANNOTATE_ROWS_EVENT, which would make merging the two impossible. 
Example: ... ************************ ANNOTATE_ROWS_EVENT [51] ************************ 000000C7 | 54 1B 12 4B | time_when = 1259477844 000000CB | 33 | event_type = 51 000000CC | 64 00 00 00 | server_id = 100 000000D0 | 2C 00 00 00 | event_len = 44 000000D4 | F3 00 00 00 | log_pos = 000000F3 000000D8 | 00 00 | flags = <none> ------------------------ 000000DA | 69 6E 73 65 | query = "insert into t1 values (1)" 000000DE | 72 74 20 69 | 000000E2 | 6E 74 6F 20 | 000000E6 | 74 31 20 76 | 000000EA | 61 6C 75 65 | 000000EE | 73 20 28 31 | 000000F2 | 29 | ************************ TABLE_MAP_EVENT [19] ************************ 000000F3 | 54 1B 12 4B | time_when = 1259477844 000000F7 | 13 | event_type = 19 000000F8 | 64 00 00 00 | server_id = 100 000000FC | 29 00 00 00 | event_len = 41 00000100 | 1C 01 00 00 | log_pos = 0000011C 00000104 | 00 00 | flags = <none> ------------------------ ... ************************ WRITE_ROWS_EVENT [23] ************************ 0000011C | 54 1B 12 4B | time_when = 1259477844 00000120 | 17 | event_type = 23 00000121 | 64 00 00 00 | server_id = 100 00000125 | 22 00 00 00 | event_len = 34 00000129 | 3E 01 00 00 | log_pos = 0000013E 0000012D | 10 00 | flags = LOG_EVENT_UPDATE_TABLE_MAP_VERSION_F ------------------------ 0000012F | 0F 00 00 00 | table_id = 15 ... New mysqlbinlog option ~~~~~~~~~~~~~~~~~~~~~~ --print-annotate-rows-events With this option, mysqlbinlog prints the content of Annotate-rows events (if the binary log does contain them). Without this option (i.e. by default), mysqlbinlog skips Annotate rows events. mysqlbinlog output ~~~~~~~~~~~~~~~~~~ Something like this: ... 
# at 199 # at 243 # at 284 #091129 9:57:24 server id 100 end_log_pos 243 Query: `insert into t1 values (1)` #091129 9:57:24 server id 100 end_log_pos 284 Table_map: `test`.`t1` mapped to number 15 #091129 9:57:24 server id 100 end_log_pos 318 Write_rows: table id 15 flags: STMT_END_F BINLOG ' VBsSSzNkAAAALAAAAPMAAAAAAGluc2VydCBpbnRvIHQxIHZhbHVlcyAoMSk= VBsSSxNkAAAAKQAAABwBAAAAAA8AAAAAAAAABHRlc3QAAnQxAAEDAAE= VBsSSxdkAAAAIgAAAD4BAAAQAA8AAAAAAAEAAf/+AQAAAA== '/*!*/; ### INSERT INTO test.t1 ### SET ### @1=1 /* INT meta=0 nullable=1 is_null=0 */ ... When master sends Annotate rows events ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1. Master always sends Annotate_rows events to mysqlbinlog (in remote case). 2. Master sends Annotate_rows events to a slave only if the slave has both log-slave-updates and binlog-annotate-rows-events options set. ESTIMATED WORK TIME ESTIMATED COMPLETION DATE ----------------------------------------------------------------------- WorkLog (v3.5.9)


[Maria-developers] Updated (by Alexi): Store in binlog text of statements that caused RBR events (47)
by worklog-noreply＠askmonty.org 04 Dec '09

----------------------------------------------------------------------- WORKLOG TASK -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- TASK...........: Store in binlog text of statements that caused RBR events CREATION DATE..: Sat, 15 Aug 2009, 23:48 SUPERVISOR.....: Monty IMPLEMENTOR....: COPIES TO......: Knielsen CATEGORY.......: Server-RawIdeaBin TASK ID........: 47 (http://askmonty.org/worklog/?tid=47) VERSION........: Server-9.x STATUS.........: Un-Assigned PRIORITY.......: 60 WORKED HOURS...: 0 ESTIMATE.......: 0 (hours remain) ORIG. ESTIMATE.: 0 PROGRESS NOTES: -=-=(Alexi - Fri, 04 Dec 2009, 13:00)=-=- High-Level Specification modified. --- /tmp/wklog.47.old.14001 2009-12-04 13:00:24.000000000 +0200 +++ /tmp/wklog.47.new.14001 2009-12-04 13:00:24.000000000 +0200 @@ -6,27 +6,27 @@ New server option ~~~~~~~~~~~~~~~~~ - --binlog-annotate-row-events + --binlog-annotate-rows-events -Setting this option makes RBR (row-) events in the binary log to be +Setting this option makes RBR (rows-) events in the binary log to be preceded by Annotate rows events (see below). The corresponding -'binlog_annotate_row_events' system variable is dynamic and has both +'binlog_annotate_rows_events' system variable is dynamic and has both global and session values. Default global value is OFF. Note. Session values are usefull to make it possible to annotate only some selected statements: ... - SET SESSION binlog_annotate_row_events=ON; + SET SESSION binlog_annotate_rows_events=ON; ... statements to be annotated ... - SET SESSION binlog_annotate_row_events=OFF; + SET SESSION binlog_annotate_rows_events=OFF; ... statements not to be annotated ... New binlog event type ~~~~~~~~~~~~~~~~~~~~~ Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ] -Describes the query which caused the corresponding row event. In binary log, +Describes the query which caused the corresponding rows event. In binary log, precedes each Table_map_log_event. 
Contains empty post-header and the query text in its data part. @@ -79,6 +79,15 @@ 0000012F | 0F 00 00 00 | table_id = 15 ... +New mysqlbinlog option +~~~~~~~~~~~~~~~~~~~~~~ + --print-annotate-rows-events + +With this option, mysqlbinlog prints the content of Annotate-rows +events (if the binary log does contain them). Without this option +(i.e. by default), mysqlbinlog skips Annotate rows events. + + mysqlbinlog output ~~~~~~~~~~~~~~~~~~ Something like this: @@ -109,5 +118,5 @@ 1. Master always sends Annotate_rows events to mysqlbinlog (in remote case). 2. Master sends Annotate_rows events to a slave only if the slave has - both log-slave-updates and binlog-annotate-row-events options set. + both log-slave-updates and binlog-annotate-rows-events options set. -=-=(Alexi - Wed, 02 Dec 2009, 13:32)=-=- Low Level Design modified. --- /tmp/wklog.47.old.17456 2009-12-02 13:32:18.000000000 +0200 +++ /tmp/wklog.47.new.17456 2009-12-02 13:32:18.000000000 +0200 @@ -1,8 +1 @@ -mysql_binlog_send() [sql/sql_repl.cc] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -1. When sending events to a slave, master should simply skip - Annotate_rows events (they are not needed for replication). - [ ??? Multi-master - currently not clear ] -2. When sending events to mysqlbinlog (remote case), master - must send Annotate_rows events as well. -=-=(Alexi - Wed, 02 Dec 2009, 13:31)=-=- High-Level Specification modified. --- /tmp/wklog.47.old.17414 2009-12-02 11:31:59.000000000 +0000 +++ /tmp/wklog.47.new.17414 2009-12-02 11:31:59.000000000 +0000 @@ -104,3 +104,10 @@ ### @1=1 /* INT meta=0 nullable=1 is_null=0 */ ... +When master sends Annotate rows events +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +1. Master always sends Annotate_rows events to mysqlbinlog (in + remote case). +2. Master sends Annotate_rows events to a slave only if the slave has + both log-slave-updates and binlog-annotate-row-events options set. + -=-=(Knielsen - Mon, 30 Nov 2009, 11:21)=-=- High-Level Specification modified. 
--- /tmp/wklog.47.old.18210 2009-11-30 11:21:33.000000000 +0200 +++ /tmp/wklog.47.new.18210 2009-11-30 11:21:33.000000000 +0200 @@ -28,7 +28,14 @@ Describes the query which caused the corresponding row event. In binary log, precedes each Table_map_log_event. Contains empty post-header and the query -text in its data part. Example: +text in its data part. + +The numeric code for this event must be assigned carefully. It should be +coordinated with MySQL/Sun, otherwise we can get into a situation where MySQL +uses the same numeric code for one event that MariaDB uses for +ANNOTATE_ROWS_EVENT, which would make merging the two impossible. + +Example: ... ************************ -=-=(Alexi - Mon, 30 Nov 2009, 10:33)=-=- Low Level Design modified. --- /tmp/wklog.47.old.16188 2009-11-30 10:33:44.000000000 +0200 +++ /tmp/wklog.47.new.16188 2009-11-30 10:33:44.000000000 +0200 @@ -2,6 +2,7 @@ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1. When sending events to a slave, master should simply skip Annotate_rows events (they are not needed for replication). + [ ??? Multi-master - currently not clear ] 2. When sending events to mysqlbinlog (remote case), master must send Annotate_rows events as well. -=-=(Alexi - Sun, 29 Nov 2009, 13:00)=-=- Low Level Design modified. --- /tmp/wklog.47.old.32047 2009-11-29 13:00:21.000000000 +0200 +++ /tmp/wklog.47.new.32047 2009-11-29 13:00:21.000000000 +0200 @@ -1 +1,7 @@ +mysql_binlog_send() [sql/sql_repl.cc] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +1. When sending events to a slave, master should simply skip + Annotate_rows events (they are not needed for replication). +2. When sending events to mysqlbinlog (remote case), master + must send Annotate_rows events as well. -=-=(Alexi - Sun, 29 Nov 2009, 09:50)=-=- High-Level Specification modified. --- /tmp/wklog.47.old.24993 2009-11-29 07:50:36.000000000 +0000 +++ /tmp/wklog.47.new.24993 2009-11-29 07:50:36.000000000 +0000 @@ -4,3 +4,96 @@ > (Comment_log_event?). 
Unless we want to log an empty statement Query_log_event > containing only a comment (a bit of a hack). +New server option +~~~~~~~~~~~~~~~~~ + --binlog-annotate-row-events + +Setting this option makes RBR (row-) events in the binary log to be +preceded by Annotate rows events (see below). The corresponding +'binlog_annotate_row_events' system variable is dynamic and has both +global and session values. Default global value is OFF. + +Note. Session values are usefull to make it possible to annotate only + some selected statements: + + ... + SET SESSION binlog_annotate_row_events=ON; + ... statements to be annotated ... + SET SESSION binlog_annotate_row_events=OFF; + ... statements not to be annotated ... + +New binlog event type +~~~~~~~~~~~~~~~~~~~~~ + Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ] + +Describes the query which caused the corresponding row event. In binary log, +precedes each Table_map_log_event. Contains empty post-header and the query +text in its data part. Example: + + ... + ************************ + ANNOTATE_ROWS_EVENT [51] + ************************ + 000000C7 | 54 1B 12 4B | time_when = 1259477844 + 000000CB | 33 | event_type = 51 + 000000CC | 64 00 00 00 | server_id = 100 + 000000D0 | 2C 00 00 00 | event_len = 44 + 000000D4 | F3 00 00 00 | log_pos = 000000F3 + 000000D8 | 00 00 | flags = <none> + ------------------------ + 000000DA | 69 6E 73 65 | query = "insert into t1 values (1)" + 000000DE | 72 74 20 69 | + 000000E2 | 6E 74 6F 20 | + 000000E6 | 74 31 20 76 | + 000000EA | 61 6C 75 65 | + 000000EE | 73 20 28 31 | + 000000F2 | 29 | + ************************ + TABLE_MAP_EVENT [19] + ************************ + 000000F3 | 54 1B 12 4B | time_when = 1259477844 + 000000F7 | 13 | event_type = 19 + 000000F8 | 64 00 00 00 | server_id = 100 + 000000FC | 29 00 00 00 | event_len = 41 + 00000100 | 1C 01 00 00 | log_pos = 0000011C + 00000104 | 00 00 | flags = <none> + ------------------------ + ... 
+ ************************ + WRITE_ROWS_EVENT [23] + ************************ + 0000011C | 54 1B 12 4B | time_when = 1259477844 + 00000120 | 17 | event_type = 23 + 00000121 | 64 00 00 00 | server_id = 100 + 00000125 | 22 00 00 00 | event_len = 34 + 00000129 | 3E 01 00 00 | log_pos = 0000013E + 0000012D | 10 00 | flags = LOG_EVENT_UPDATE_TABLE_MAP_VERSION_F + ------------------------ + 0000012F | 0F 00 00 00 | table_id = 15 + ... + +mysqlbinlog output +~~~~~~~~~~~~~~~~~~ +Something like this: + + ... + # at 199 + # at 243 + # at 284 + #091129 9:57:24 server id 100 end_log_pos 243 Query: `insert into t1 values +(1)` + #091129 9:57:24 server id 100 end_log_pos 284 Table_map: `test`.`t1` mapped +to number 15 + #091129 9:57:24 server id 100 end_log_pos 318 Write_rows: table id 15 +flags: STMT_END_F + + BINLOG ' + VBsSSzNkAAAALAAAAPMAAAAAAGluc2VydCBpbnRvIHQxIHZhbHVlcyAoMSk= + VBsSSxNkAAAAKQAAABwBAAAAAA8AAAAAAAAABHRlc3QAAnQxAAEDAAE= + VBsSSxdkAAAAIgAAAD4BAAAQAA8AAAAAAAEAAf/+AQAAAA== + '/*!*/; + ### INSERT INTO test.t1 + ### SET + ### @1=1 /* INT meta=0 nullable=1 is_null=0 */ + ... + -=-=(Knielsen - Fri, 27 Nov 2009, 13:30)=-=- Observers changed: Knielsen -=-=(Psergey - Sun, 16 Aug 2009, 11:08)=-=- High-Level Specification modified. --- /tmp/wklog.47.old.12485 2009-08-16 11:08:33.000000000 +0300 +++ /tmp/wklog.47.new.12485 2009-08-16 11:08:33.000000000 +0300 @@ -1 +1,6 @@ +First suggestion: + +> I think for this we would actually need a new binlog event type +> (Comment_log_event?). Unless we want to log an empty statement Query_log_event +> containing only a comment (a bit of a hack). 
-=-=(Psergey - Sun, 16 Aug 2009, 00:02)=-=-
Dependency created: 39 now depends on 47

DESCRIPTION:

Store in binlog (and show in mysqlbinlog output) texts of statements that
caused RBR events

This is needed for (list from Monty):
- Easier to understand why updates happened
- Would make it easier to find out where in the application things went
  wrong (as you can search for exact strings)
- Allow one to filter things based on comments in the statement.

The cost of this can be that the binlog will be approximately 2x in size
(inserts of big blobs in particular would be a bit painful), so this should
be an optional feature.

HIGH-LEVEL SPECIFICATION:

First suggestion:

> I think for this we would actually need a new binlog event type
> (Comment_log_event?). Unless we want to log an empty statement Query_log_event
> containing only a comment (a bit of a hack).

New server option
~~~~~~~~~~~~~~~~~
  --binlog-annotate-rows-events

Setting this option causes RBR (rows) events in the binary log to be
preceded by Annotate_rows events (see below). The corresponding
'binlog_annotate_rows_events' system variable is dynamic and has both
global and session values. Default global value is OFF.

Note. Session values make it possible to annotate only selected
      statements:

      ...
      SET SESSION binlog_annotate_rows_events=ON;
      ... statements to be annotated ...
      SET SESSION binlog_annotate_rows_events=OFF;
      ... statements not to be annotated ...

New binlog event type
~~~~~~~~~~~~~~~~~~~~~
  Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]

Describes the query which caused the corresponding rows event. In the binary
log, it precedes each Table_map_log_event. Contains an empty post-header and
the query text in its data part.

The numeric code for this event must be assigned carefully. It should be
coordinated with MySQL/Sun, otherwise we can get into a situation where MySQL
uses the same numeric code for one event that MariaDB uses for
ANNOTATE_ROWS_EVENT, which would make merging the two impossible.

Example:

  ...
  ************************
  ANNOTATE_ROWS_EVENT [51]
  ************************
  000000C7 | 54 1B 12 4B | time_when = 1259477844
  000000CB | 33          | event_type = 51
  000000CC | 64 00 00 00 | server_id = 100
  000000D0 | 2C 00 00 00 | event_len = 44
  000000D4 | F3 00 00 00 | log_pos = 000000F3
  000000D8 | 00 00       | flags = <none>
  ------------------------
  000000DA | 69 6E 73 65 | query = "insert into t1 values (1)"
  000000DE | 72 74 20 69 |
  000000E2 | 6E 74 6F 20 |
  000000E6 | 74 31 20 76 |
  000000EA | 61 6C 75 65 |
  000000EE | 73 20 28 31 |
  000000F2 | 29          |
  ************************
  TABLE_MAP_EVENT [19]
  ************************
  000000F3 | 54 1B 12 4B | time_when = 1259477844
  000000F7 | 13          | event_type = 19
  000000F8 | 64 00 00 00 | server_id = 100
  000000FC | 29 00 00 00 | event_len = 41
  00000100 | 1C 01 00 00 | log_pos = 0000011C
  00000104 | 00 00       | flags = <none>
  ------------------------
  ...
  ************************
  WRITE_ROWS_EVENT [23]
  ************************
  0000011C | 54 1B 12 4B | time_when = 1259477844
  00000120 | 17          | event_type = 23
  00000121 | 64 00 00 00 | server_id = 100
  00000125 | 22 00 00 00 | event_len = 34
  00000129 | 3E 01 00 00 | log_pos = 0000013E
  0000012D | 10 00       | flags = LOG_EVENT_UPDATE_TABLE_MAP_VERSION_F
  ------------------------
  0000012F | 0F 00 00 00 | table_id = 15
  ...

New mysqlbinlog option
~~~~~~~~~~~~~~~~~~~~~~
  --print-annotate-rows-events

With this option, mysqlbinlog prints the content of Annotate_rows
events (if the binary log contains them). Without this option
(i.e. by default), mysqlbinlog skips Annotate_rows events.

mysqlbinlog output
~~~~~~~~~~~~~~~~~~
Something like this:

  ...
  # at 199
  # at 243
  # at 284
  #091129 9:57:24 server id 100 end_log_pos 243 Query: `insert into t1 values (1)`
  #091129 9:57:24 server id 100 end_log_pos 284 Table_map: `test`.`t1` mapped to number 15
  #091129 9:57:24 server id 100 end_log_pos 318 Write_rows: table id 15 flags: STMT_END_F

  BINLOG '
  VBsSSzNkAAAALAAAAPMAAAAAAGluc2VydCBpbnRvIHQxIHZhbHVlcyAoMSk=
  VBsSSxNkAAAAKQAAABwBAAAAAA8AAAAAAAAABHRlc3QAAnQxAAEDAAE=
  VBsSSxdkAAAAIgAAAD4BAAAQAA8AAAAAAAEAAf/+AQAAAA==
  '/*!*/;
  ### INSERT INTO test.t1
  ### SET
  ###   @1=1 /* INT meta=0 nullable=1 is_null=0 */
  ...

When master sends Annotate rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1. Master always sends Annotate_rows events to mysqlbinlog (in the
   remote case).
2. Master sends Annotate_rows events to a slave only if the slave has
   both the log-slave-updates and binlog-annotate-rows-events options set.

ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
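The ANNOTATE_ROWS_EVENT dump in the specification above can be checked by hand with a few lines of Python. This is only a sketch: it assumes the 19-byte little-endian common header that the dump itself shows (time_when, event_type, server_id, event_len, log_pos, flags) followed directly by the query text, since the post-header is empty. The function name parse_annotate_rows_event and the field names in the returned dictionary are made up for this illustration and are not part of the server code.

```python
import struct

# Bytes copied from the ANNOTATE_ROWS_EVENT dump (offsets 0xC7..0xF2).
EVENT_HEX = (
    "54 1B 12 4B"   # time_when
    " 33"           # event_type
    " 64 00 00 00"  # server_id
    " 2C 00 00 00"  # event_len
    " F3 00 00 00"  # log_pos
    " 00 00"        # flags
    " 69 6E 73 65 72 74 20 69 6E 74 6F 20 74 31 20 76"
    " 61 6C 75 65 73 20 28 31 29"  # query text
)

# Common header layout as shown in the dump: 4 + 1 + 4 + 4 + 4 + 2 = 19 bytes.
HEADER = struct.Struct("<IBIIIH")


def parse_annotate_rows_event(raw: bytes) -> dict:
    """Split one event into its header fields and the query text.

    Hypothetical helper for illustration; assumes an empty post-header,
    as the specification states for Annotate_rows events.
    """
    time_when, event_type, server_id, event_len, log_pos, flags = HEADER.unpack_from(raw, 0)
    query = raw[HEADER.size:event_len].decode("ascii")
    return {
        "time_when": time_when,
        "event_type": event_type,
        "server_id": server_id,
        "event_len": event_len,
        "log_pos": log_pos,
        "flags": flags,
        "query": query,
    }


event = parse_annotate_rows_event(bytes.fromhex(EVENT_HEX))
print(event)
```

The decoded values agree with the annotations in the dump: event type 51, server id 100, event length 44 (19 header bytes plus 25 bytes of query text), and log_pos 243 (0xF3), which is where the following TABLE_MAP_EVENT starts.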


[Maria-developers] Updated (by Timour): Subquery optimization: Efficient NOT IN execution with NULLs (68)
by worklog-noreply＠askmonty.org 04 Dec '09

04 Dec '09

----------------------------------------------------------------------- WORKLOG TASK -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- TASK...........: Subquery optimization: Efficient NOT IN execution with NULLs CREATION DATE..: Fri, 27 Nov 2009, 13:22 SUPERVISOR.....: Monty IMPLEMENTOR....: Timour COPIES TO......: CATEGORY.......: Server-Sprint TASK ID........: 68 (http://askmonty.org/worklog/?tid=68) VERSION........: Server-9.x STATUS.........: Assigned PRIORITY.......: 60 WORKED HOURS...: 0 ESTIMATE.......: 0 (hours remain) ORIG. ESTIMATE.: 0 PROGRESS NOTES: -=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=- Version updated. --- /tmp/wklog.68.old.5257 2009-12-04 11:27:11.000000000 +0200 +++ /tmp/wklog.68.new.5257 2009-12-04 11:27:11.000000000 +0200 @@ -1 +1 @@ -Benchmarks-3.0 +Server-9.x -=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=- Category updated. --- /tmp/wklog.68.old.5242 2009-12-04 11:27:02.000000000 +0200 +++ /tmp/wklog.68.new.5242 2009-12-04 11:27:02.000000000 +0200 @@ -1 +1 @@ -Server-RawIdeaBin +Server-Sprint -=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=- Status updated. --- /tmp/wklog.68.old.5242 2009-12-04 11:27:02.000000000 +0200 +++ /tmp/wklog.68.new.5242 2009-12-04 11:27:02.000000000 +0200 @@ -1 +1 @@ -Un-Assigned +Assigned -=-=(Timour - Fri, 04 Dec 2009, 11:26)=-=- High-Level Specification modified. --- /tmp/wklog.68.old.5182 2009-12-04 11:26:25.000000000 +0200 +++ /tmp/wklog.68.new.5182 2009-12-04 11:26:25.000000000 +0200 @@ -50,23 +50,39 @@ The array can be created on demand. Other consideration that may be taken into account: + 1. If columns a_j1,...,a_jm do not contain null values in the temporary -table at all, create for them only one index array (and of course do not -create any bitmaps for them). -2. Consider the ratio d(a_i)=N'/ V(a_i), where N' is the number of rows -where a_i is not null and V(a_i) is the number of distinct values for -a_i excluding nulls. 
- If d(a_i) is close to 1 then do not create any index array: check +table at all and v_j1,...,v_jm cannot be null, create for these columns +only one index array (and of course do not create any bitmaps for them). + +2. Consider the ratio d(a_i)=N'(a_i)/V(a_i), where N'(a_i) is the number +of rows, where a_i is not null and V(a_i) is the number of distinct +values for a_i excluding nulls. +If d(a_i) is close to N'(a_i) then do not create any index array: check whether there is a match running through the records that have been -filtered in. Anyway if d(a_i) is close to 1 then a intersection with -rowid{a_i=v_i} would not reduce the number of remaining rowids +filtered in. Anyway if d(a_i) is close to N'(a_i) then the intersection + with rowid{a_i=v_i} will not reduce the number of remaining rowids significantly. - If additionally N-N' is small do not create a bitmap for this column -either. -3. If for a column a_i d(a_i) is not close to 1, but N-N' is small a -sorted array of rowids from the set rowid{a_i is null} can be used -instead of a bitmap. +In other words is V(a_i) exceeds some threshold there is no sense to +create an index for a_i. +If additionally N-N'(a_i) is small do not create a bitmap for this +column either. + +3. If for a column a_i d(a_i) is not close to N'(a_i), but N-N'(a_i) is +small a sorted array of rowids from the set rowid{a_i is null} can be +used instead of a bitmap. + 4. We always have a match if R0= INTERSECT rowid{a_i is null} is not empty. Here i runs through all indexes from [1..n] such that v_i is not null. For a given subset of columns this fact has to be checked only once. It can be easily done with bitmap intersection. + +5. If v1,...,vn never can be a null, then indexes (sorted arrays) can be +created only for rows with nulls. + +6. If v1,...,vn never can be a null and number of rows with nulls is +small do not create indexes and do not create bitmaps. + +7. 
If you get a row with nulls in all columns stop filling the temporary +table and return UNKNOWN for any tuple <v1,...,vn>. + -=-=(Timour - Fri, 27 Nov 2009, 13:23)=-=- High-Level Specification modified. --- /tmp/wklog.68.old.17140 2009-11-27 11:23:17.000000000 +0000 +++ /tmp/wklog.68.new.17140 2009-11-27 11:23:17.000000000 +0000 @@ -1 +1,72 @@ +This a copy of the initial algorithm proposed by Igor: +====================================================== +For each left side tuple (v_1,...,v_n) we have to find the following set +of rowids for the temp table containing N rows as the result of +materialization of the subquery: + + R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null} where i runs +trough all indexes from [1..n] such that v_i is not null. + +Bear in mind the following specifics of this intersection: + (1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint + (2) For each i: rowid{a_i is null} is the same for each tuple + +Due to (2) it makes sense to build rowid{a_i is null} only once. +A good representation for such sets would be bitmaps: +- it requires minimum memory: not more than N*n bits in total +- search of an element in a set is extremely cheap + +Taken all above into account I could suggest the following algorithm to +build R: + + Using indexes (read about them below) for each column participating in the + intersection, + merge ordered sets rowid{a_i=v_i} in the following manner. + If a rowid r has been encountered maximum in k sets +rowid{a_i1=v_i1},...,rowid(a_ik=v_ik), + then it has to be checked against all rowid{a_i=v_i} such that i is +not in {i1,...,ik}. + As soon as we fail to find r in one of these sets we discard it. + If r has been found in all of them then r belongs to the set R. + +Here we use the property (1): any r from rowid{a_i=v_i} UNION rowid{a_i +is null} is either +belongs to rowid{a_i=v_i} or to rowid{a_i is null}. 
From this we can +infer that for any r from R +indexes a_i can be uniquely divided into two groups: one contains +indexes a_i where r belongs to +the sets rowid{a_i=v_i}, the other contains indexes a_j such that r +belongs to rowid{a_j is null}. + +Now let's talk how to get elements from rowid{a_i=v_i} in a sorted order +needed for the merge procedure. We could use BTREE indexes for temp +table. But they are rather expensive and +take a lot of memory as the are implemented with RB trees. +I would suggest creating for each column from the temporary table just +an array of rowids sorted by the value from column a. +Index lookup in such an array is cheap. It's also rather cheap to check +that the next rowid refers to a row with a different value in column a. +The array can be created on demand. + +Other consideration that may be taken into account: +1. If columns a_j1,...,a_jm do not contain null values in the temporary +table at all, create for them only one index array (and of course do not +create any bitmaps for them). +2. Consider the ratio d(a_i)=N'/ V(a_i), where N' is the number of rows +where a_i is not null and V(a_i) is the number of distinct values for +a_i excluding nulls. + If d(a_i) is close to 1 then do not create any index array: check +whether there is a match running through the records that have been +filtered in. Anyway if d(a_i) is close to 1 then a intersection with +rowid{a_i=v_i} would not reduce the number of remaining rowids +significantly. + If additionally N-N' is small do not create a bitmap for this column +either. +3. If for a column a_i d(a_i) is not close to 1, but N-N' is small a +sorted array of rowids from the set rowid{a_i is null} can be used +instead of a bitmap. +4. We always have a match if R0= INTERSECT rowid{a_i is null} is not +empty. Here i runs through all indexes from [1..n] such that v_i is not +null. For a given subset of columns this fact has to be checked only +once. It can be easily done with bitmap intersection. 
DESCRIPTION:

The goal of this task is to implement efficient execution of NOT IN
subquery predicates of the form:
  <oe_1,...,oe_n> NOT IN <non_correlated subquery>
when either some oe_i, or some subquery result column contains NULLs.

The problem with such predicates is that it is possible to use index
lookups only when neither argument of the predicate contains NULLs.
If some argument contains a NULL, then due to NULL semantics, it
plays the role of a wildcard. If we were to use regular index lookups,
then we would get 'no match' for some outer tuple (thus the predicate
evaluates to FALSE), while the SQL semantics means 'partial match', and
the predicate should evaluate to NULL.

This task implements an efficient algorithm to compute such 'partial
matches', where a NULL matches any value.

HIGH-LEVEL SPECIFICATION:

This is a copy of the initial algorithm proposed by Igor:
======================================================
For each left side tuple (v_1,...,v_n) we have to find the following set
of rowids for the temp table containing N rows as the result of
materialization of the subquery:

  R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null}) where i runs
through all indexes from [1..n] such that v_i is not null.

Bear in mind the following specifics of this intersection:
  (1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint
  (2) For each i: rowid{a_i is null} is the same for each tuple

Due to (2) it makes sense to build rowid{a_i is null} only once.
A good representation for such sets would be bitmaps:
- it requires minimum memory: not more than N*n bits in total
- search of an element in a set is extremely cheap

Taking all of the above into account I could suggest the following
algorithm to build R:

  Using indexes (read about them below) for each column participating in the
  intersection,
  merge the ordered sets rowid{a_i=v_i} in the following manner.
  If a rowid r has been encountered in a maximum of k sets
  rowid{a_i1=v_i1},...,rowid{a_ik=v_ik},
  then it has to be checked against all rowid{a_i=v_i} such that i is
  not in {i1,...,ik}.
  As soon as we fail to find r in one of these sets we discard it.
  If r has been found in all of them then r belongs to the set R.

Here we use the property (1): any r from rowid{a_i=v_i} UNION rowid{a_i
is null} belongs either to rowid{a_i=v_i} or to rowid{a_i is null}.
From this we can infer that for any r from R the indexes a_i can be
uniquely divided into two groups: one contains the indexes a_i where r
belongs to the sets rowid{a_i=v_i}, the other contains the indexes a_j
such that r belongs to rowid{a_j is null}.

Now let's talk about how to get elements from rowid{a_i=v_i} in the
sorted order needed for the merge procedure. We could use BTREE indexes
for the temp table. But they are rather expensive and take a lot of
memory as they are implemented with RB trees.
I would suggest creating for each column from the temporary table just
an array of rowids sorted by the value from column a.
Index lookup in such an array is cheap. It's also rather cheap to check
that the next rowid refers to a row with a different value in column a.
The array can be created on demand.

Other considerations that may be taken into account:

1. If columns a_j1,...,a_jm do not contain null values in the temporary
table at all and v_j1,...,v_jm cannot be null, create for these columns
only one index array (and of course do not create any bitmaps for them).

2. Consider the ratio d(a_i)=N'(a_i)/V(a_i), where N'(a_i) is the number
of rows where a_i is not null and V(a_i) is the number of distinct
values for a_i excluding nulls.
If d(a_i) is close to N'(a_i) then do not create any index array: check
whether there is a match running through the records that have been
filtered in. Anyway if d(a_i) is close to N'(a_i) then the intersection
with rowid{a_i=v_i} will not reduce the number of remaining rowids
significantly.
In other words if V(a_i) exceeds some threshold there is no sense to
create an index for a_i.
If additionally N-N'(a_i) is small do not create a bitmap for this
column either.

3. If for a column a_i d(a_i) is not close to N'(a_i), but N-N'(a_i) is
small, a sorted array of rowids from the set rowid{a_i is null} can be
used instead of a bitmap.

4. We always have a match if R0= INTERSECT rowid{a_i is null} is not
empty. Here i runs through all indexes from [1..n] such that v_i is not
null. For a given subset of columns this fact has to be checked only
once. It can be easily done with bitmap intersection.

5. If v1,...,vn can never be null, then indexes (sorted arrays) can be
created only for rows with nulls.

6. If v1,...,vn can never be null and the number of rows with nulls is
small, do not create indexes and do not create bitmaps.

7. If you get a row with nulls in all columns stop filling the temporary
table and return UNKNOWN for any tuple <v1,...,vn>.

ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
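The set R from the specification above, and the three-valued result of NOT IN that follows from it, can be sketched in Python. This is an illustration under stated assumptions, not the proposed server implementation: Python integers stand in for the bitmaps, a per-column dictionary from value to rowid bitmap replaces the sorted rowid arrays and the merge procedure, and all function names (build_indexes, partial_matches, not_in) are invented for this example. None plays the role of SQL NULL, and a return value of None means UNKNOWN.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Row = Tuple[Optional[int], ...]


def build_indexes(rows: Sequence[Row], n_cols: int):
    """Build, per column, rowid{a_i is null} and a value -> rowid{a_i=value} map."""
    null_bitmaps: List[int] = [0] * n_cols
    value_maps: List[Dict[int, int]] = [dict() for _ in range(n_cols)]
    for rowid, row in enumerate(rows):
        bit = 1 << rowid
        for i, a_i in enumerate(row):
            if a_i is None:
                null_bitmaps[i] |= bit
            else:
                value_maps[i][a_i] = value_maps[i].get(a_i, 0) | bit
    return null_bitmaps, value_maps


def partial_matches(v: Row, n_rows: int, null_bitmaps, value_maps) -> int:
    """R = INTERSECT over non-null v_i of (rowid{a_i=v_i} UNION rowid{a_i is null})."""
    r = (1 << n_rows) - 1  # start with every rowid
    for i, v_i in enumerate(v):
        if v_i is None:
            continue  # a NULL on the left side constrains nothing
        r &= value_maps[i].get(v_i, 0) | null_bitmaps[i]
        if r == 0:
            break
    return r


def not_in(v: Row, rows: Sequence[Row]) -> Optional[bool]:
    """Three-valued v NOT IN rows: True, False, or None for UNKNOWN."""
    null_bitmaps, value_maps = build_indexes(rows, len(v))
    r = partial_matches(v, len(rows), null_bitmaps, value_maps)
    if r == 0:
        return True  # no row matches even partially
    rows_with_null = 0
    for bitmap in null_bitmaps:
        rows_with_null |= bitmap
    # An exact match needs no NULL on either side of the comparison.
    if all(v_i is not None for v_i in v) and (r & ~rows_with_null):
        return False
    return None  # only partial matches: UNKNOWN


temp_table = [(1, 2), (None, 3), (4, None)]
print(not_in((1, 2), temp_table))  # exact match with row 0
print(not_in((5, 3), temp_table))  # partial match with row 1
print(not_in((5, 6), temp_table))  # no match at all
```

With this temporary table the three calls give False, None (UNKNOWN) and True respectively. A real implementation would build the indexes once per materialized subquery rather than on every call, as the specification notes for rowid{a_i is null}.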


[Maria-developers] Updated (by Timour): Subquery optimization: Efficient NOT IN execution with NULLs (68)
by worklog-noreply＠askmonty.org 04 Dec '09

04 Dec '09

----------------------------------------------------------------------- WORKLOG TASK -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- TASK...........: Subquery optimization: Efficient NOT IN execution with NULLs CREATION DATE..: Fri, 27 Nov 2009, 13:22 SUPERVISOR.....: Monty IMPLEMENTOR....: Timour COPIES TO......: CATEGORY.......: Server-Sprint TASK ID........: 68 (http://askmonty.org/worklog/?tid=68) VERSION........: Server-9.x STATUS.........: Assigned PRIORITY.......: 60 WORKED HOURS...: 0 ESTIMATE.......: 0 (hours remain) ORIG. ESTIMATE.: 0 PROGRESS NOTES: -=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=- Version updated. --- /tmp/wklog.68.old.5257 2009-12-04 11:27:11.000000000 +0200 +++ /tmp/wklog.68.new.5257 2009-12-04 11:27:11.000000000 +0200 @@ -1 +1 @@ -Benchmarks-3.0 +Server-9.x -=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=- Category updated. --- /tmp/wklog.68.old.5242 2009-12-04 11:27:02.000000000 +0200 +++ /tmp/wklog.68.new.5242 2009-12-04 11:27:02.000000000 +0200 @@ -1 +1 @@ -Server-RawIdeaBin +Server-Sprint -=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=- Status updated. --- /tmp/wklog.68.old.5242 2009-12-04 11:27:02.000000000 +0200 +++ /tmp/wklog.68.new.5242 2009-12-04 11:27:02.000000000 +0200 @@ -1 +1 @@ -Un-Assigned +Assigned -=-=(Timour - Fri, 04 Dec 2009, 11:26)=-=- High-Level Specification modified. --- /tmp/wklog.68.old.5182 2009-12-04 11:26:25.000000000 +0200 +++ /tmp/wklog.68.new.5182 2009-12-04 11:26:25.000000000 +0200 @@ -50,23 +50,39 @@ The array can be created on demand. Other consideration that may be taken into account: + 1. If columns a_j1,...,a_jm do not contain null values in the temporary -table at all, create for them only one index array (and of course do not -create any bitmaps for them). -2. Consider the ratio d(a_i)=N'/ V(a_i), where N' is the number of rows -where a_i is not null and V(a_i) is the number of distinct values for -a_i excluding nulls. 
- If d(a_i) is close to 1 then do not create any index array: check +table at all and v_j1,...,v_jm cannot be null, create for these columns +only one index array (and of course do not create any bitmaps for them). + +2. Consider the ratio d(a_i)=N'(a_i)/V(a_i), where N'(a_i) is the number +of rows, where a_i is not null and V(a_i) is the number of distinct +values for a_i excluding nulls. +If d(a_i) is close to N'(a_i) then do not create any index array: check whether there is a match running through the records that have been -filtered in. Anyway if d(a_i) is close to 1 then a intersection with -rowid{a_i=v_i} would not reduce the number of remaining rowids +filtered in. Anyway if d(a_i) is close to N'(a_i) then the intersection + with rowid{a_i=v_i} will not reduce the number of remaining rowids significantly. - If additionally N-N' is small do not create a bitmap for this column -either. -3. If for a column a_i d(a_i) is not close to 1, but N-N' is small a -sorted array of rowids from the set rowid{a_i is null} can be used -instead of a bitmap. +In other words is V(a_i) exceeds some threshold there is no sense to +create an index for a_i. +If additionally N-N'(a_i) is small do not create a bitmap for this +column either. + +3. If for a column a_i d(a_i) is not close to N'(a_i), but N-N'(a_i) is +small a sorted array of rowids from the set rowid{a_i is null} can be +used instead of a bitmap. + 4. We always have a match if R0= INTERSECT rowid{a_i is null} is not empty. Here i runs through all indexes from [1..n] such that v_i is not null. For a given subset of columns this fact has to be checked only once. It can be easily done with bitmap intersection. + +5. If v1,...,vn never can be a null, then indexes (sorted arrays) can be +created only for rows with nulls. + +6. If v1,...,vn never can be a null and number of rows with nulls is +small do not create indexes and do not create bitmaps. + +7. 
If you get a row with nulls in all columns stop filling the temporary +table and return UNKNOWN for any tuple <v1,...,vn>. + -=-=(Timour - Fri, 27 Nov 2009, 13:23)=-=- High-Level Specification modified. --- /tmp/wklog.68.old.17140 2009-11-27 11:23:17.000000000 +0000 +++ /tmp/wklog.68.new.17140 2009-11-27 11:23:17.000000000 +0000 @@ -1 +1,72 @@ +This a copy of the initial algorithm proposed by Igor: +====================================================== +For each left side tuple (v_1,...,v_n) we have to find the following set +of rowids for the temp table containing N rows as the result of +materialization of the subquery: + + R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null} where i runs +trough all indexes from [1..n] such that v_i is not null. + +Bear in mind the following specifics of this intersection: + (1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint + (2) For each i: rowid{a_i is null} is the same for each tuple + +Due to (2) it makes sense to build rowid{a_i is null} only once. +A good representation for such sets would be bitmaps: +- it requires minimum memory: not more than N*n bits in total +- search of an element in a set is extremely cheap + +Taken all above into account I could suggest the following algorithm to +build R: + + Using indexes (read about them below) for each column participating in the + intersection, + merge ordered sets rowid{a_i=v_i} in the following manner. + If a rowid r has been encountered maximum in k sets +rowid{a_i1=v_i1},...,rowid(a_ik=v_ik), + then it has to be checked against all rowid{a_i=v_i} such that i is +not in {i1,...,ik}. + As soon as we fail to find r in one of these sets we discard it. + If r has been found in all of them then r belongs to the set R. + +Here we use the property (1): any r from rowid{a_i=v_i} UNION rowid{a_i +is null} is either +belongs to rowid{a_i=v_i} or to rowid{a_i is null}. 
From this we can +infer that for any r from R +indexes a_i can be uniquely divided into two groups: one contains +indexes a_i where r belongs to +the sets rowid{a_i=v_i}, the other contains indexes a_j such that r +belongs to rowid{a_j is null}. + +Now let's talk how to get elements from rowid{a_i=v_i} in a sorted order +needed for the merge procedure. We could use BTREE indexes for temp +table. But they are rather expensive and +take a lot of memory as the are implemented with RB trees. +I would suggest creating for each column from the temporary table just +an array of rowids sorted by the value from column a. +Index lookup in such an array is cheap. It's also rather cheap to check +that the next rowid refers to a row with a different value in column a. +The array can be created on demand. + +Other consideration that may be taken into account: +1. If columns a_j1,...,a_jm do not contain null values in the temporary +table at all, create for them only one index array (and of course do not +create any bitmaps for them). +2. Consider the ratio d(a_i)=N'/ V(a_i), where N' is the number of rows +where a_i is not null and V(a_i) is the number of distinct values for +a_i excluding nulls. + If d(a_i) is close to 1 then do not create any index array: check +whether there is a match running through the records that have been +filtered in. Anyway if d(a_i) is close to 1 then a intersection with +rowid{a_i=v_i} would not reduce the number of remaining rowids +significantly. + If additionally N-N' is small do not create a bitmap for this column +either. +3. If for a column a_i d(a_i) is not close to 1, but N-N' is small a +sorted array of rowids from the set rowid{a_i is null} can be used +instead of a bitmap. +4. We always have a match if R0= INTERSECT rowid{a_i is null} is not +empty. Here i runs through all indexes from [1..n] such that v_i is not +null. For a given subset of columns this fact has to be checked only +once. It can be easily done with bitmap intersection. 
DESCRIPTION:

The goal of this task is to implement efficient execution of NOT IN subquery predicates of the form:

  <oe_1,...,oe_n> NOT IN <non_correlated subquery>

when either some oe_i, or some subquery result column, contains NULLs.

The problem with such predicates is that index lookups can be used only when neither argument of the predicate contains NULLs. If some argument contains a NULL, then due to NULL semantics it plays the role of a wildcard. If we were to use regular index lookups, we would get 'no match' for some outer tuple (thus the underlying IN evaluates to FALSE), while the SQL semantics means 'partial match', and the predicate should evaluate to NULL.

This task implements an efficient algorithm to compute such 'partial matches', where a NULL matches any value.

HIGH-LEVEL SPECIFICATION:

This is a copy of the initial algorithm proposed by Igor:
======================================================
For each left side tuple (v_1,...,v_n) we have to find the following set of rowids for the temp table containing N rows as the result of materialization of the subquery:

  R = INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null})

where i runs through all indexes from [1..n] such that v_i is not null.

Bear in mind the following specifics of this intersection:
  (1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint
  (2) For each i: rowid{a_i is null} is the same for each tuple

Due to (2) it makes sense to build rowid{a_i is null} only once. A good representation for such sets would be bitmaps:
- it requires minimum memory: not more than N*n bits in total
- search of an element in a set is extremely cheap

Taking all of the above into account, I could suggest the following algorithm to build R:

  Using indexes (read about them below) for each column participating in the intersection, merge the ordered sets rowid{a_i=v_i} in the following manner. If a rowid r has been encountered in at most k sets rowid{a_i1=v_i1},...,rowid{a_ik=v_ik}, then it has to be checked against all rowid{a_i is null} such that i is not in {i1,...,ik}. As soon as we fail to find r in one of these sets we discard it. If r has been found in all of them then r belongs to the set R.

Here we use property (1): any r from rowid{a_i=v_i} UNION rowid{a_i is null} belongs either to rowid{a_i=v_i} or to rowid{a_i is null}. From this we can infer that for any r from R the indexes a_i can be uniquely divided into two groups: one contains indexes a_i where r belongs to the sets rowid{a_i=v_i}, the other contains indexes a_j such that r belongs to rowid{a_j is null}.

Now let's talk about how to get elements from rowid{a_i=v_i} in the sorted order needed for the merge procedure. We could use BTREE indexes for the temp table, but they are rather expensive and take a lot of memory as they are implemented with RB trees. I would suggest creating for each column of the temporary table just an array of rowids sorted by the value from column a. Index lookup in such an array is cheap. It's also rather cheap to check that the next rowid refers to a row with a different value in column a. The array can be created on demand.

Other considerations that may be taken into account:

1. If columns a_j1,...,a_jm do not contain null values in the temporary table at all and v_j1,...,v_jm cannot be null, create for these columns only one index array (and of course do not create any bitmaps for them).

2. Consider the ratio d(a_i)=N'(a_i)/V(a_i), where N'(a_i) is the number of rows where a_i is not null and V(a_i) is the number of distinct values for a_i excluding nulls. If d(a_i) is close to N'(a_i) then do not create any index array: check whether there is a match by running through the records that have been filtered in. Anyway, if d(a_i) is close to N'(a_i) then the intersection with rowid{a_i=v_i} will not reduce the number of remaining rowids significantly. In other words, if V(a_i) exceeds some threshold there is no sense in creating an index for a_i. If additionally N-N'(a_i) is small, do not create a bitmap for this column either.

3. If for a column a_i d(a_i) is not close to N'(a_i), but N-N'(a_i) is small, a sorted array of rowids from the set rowid{a_i is null} can be used instead of a bitmap.

4. We always have a match if R0 = INTERSECT rowid{a_i is null} is not empty. Here i runs through all indexes from [1..n] such that v_i is not null. For a given subset of columns this fact has to be checked only once. It can be easily done with bitmap intersection.

5. If v1,...,vn can never be null, then indexes (sorted arrays) can be created only for rows with nulls.

6. If v1,...,vn can never be null and the number of rows with nulls is small, do not create indexes and do not create bitmaps.

7. If you get a row with nulls in all columns, stop filling the temporary table and return UNKNOWN for any tuple <v1,...,vn>.

ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
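The 'partial match' semantics from the task description can be illustrated with a small brute-force sketch. This is only an illustration of the three-valued result that NOT IN has to produce, not the proposed algorithm and not server code; the function names (exact_match, partial_match, not_in) are hypothetical, None stands in for SQL NULL, and every row is scanned.

```python
from typing import Optional, Sequence

Row = Sequence[Optional[int]]  # None plays the role of SQL NULL


def exact_match(outer: Row, row: Row) -> bool:
    """All components are non-NULL on both sides and equal."""
    return all(o is not None and r is not None and o == r
               for o, r in zip(outer, row))


def partial_match(outer: Row, row: Row) -> bool:
    """No component definitely differs: a NULL on either side acts as a wildcard."""
    return all(o is None or r is None or o == r
               for o, r in zip(outer, row))


def not_in(outer: Row, rows: Sequence[Row]) -> Optional[bool]:
    """Three-valued result of `outer NOT IN rows`: True, False or None (NULL)."""
    if any(exact_match(outer, row) for row in rows):
        return False   # IN is TRUE, so NOT IN is FALSE
    if any(partial_match(outer, row) for row in rows):
        return None    # IN is UNKNOWN, so NOT IN is UNKNOWN
    return True        # every row definitely differs


print(not_in((1, 2), [(1, None)]))  # partial match
print(not_in((1, 2), [(3, None)]))  # first column definitely differs
```

A plain index lookup for (1, 2) finds nothing in either call, which is why lookups alone cannot tell the two cases apart.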


[Maria-developers] Updated (by Timour): Subquery optimization: Efficient NOT IN execution with NULLs (68)
by worklog-noreply＠askmonty.org 04 Dec '09

04 Dec '09

----------------------------------------------------------------------- WORKLOG TASK -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- TASK...........: Subquery optimization: Efficient NOT IN execution with NULLs CREATION DATE..: Fri, 27 Nov 2009, 13:22 SUPERVISOR.....: Monty IMPLEMENTOR....: Timour COPIES TO......: CATEGORY.......: Server-Sprint TASK ID........: 68 (http://askmonty.org/worklog/?tid=68) VERSION........: Benchmarks-3.0 STATUS.........: Assigned PRIORITY.......: 60 WORKED HOURS...: 0 ESTIMATE.......: 0 (hours remain) ORIG. ESTIMATE.: 0 PROGRESS NOTES: -=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=- Category updated. --- /tmp/wklog.68.old.5242 2009-12-04 11:27:02.000000000 +0200 +++ /tmp/wklog.68.new.5242 2009-12-04 11:27:02.000000000 +0200 @@ -1 +1 @@ -Server-RawIdeaBin +Server-Sprint -=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=- Status updated. --- /tmp/wklog.68.old.5242 2009-12-04 11:27:02.000000000 +0200 +++ /tmp/wklog.68.new.5242 2009-12-04 11:27:02.000000000 +0200 @@ -1 +1 @@ -Un-Assigned +Assigned -=-=(Timour - Fri, 04 Dec 2009, 11:26)=-=- High-Level Specification modified. --- /tmp/wklog.68.old.5182 2009-12-04 11:26:25.000000000 +0200 +++ /tmp/wklog.68.new.5182 2009-12-04 11:26:25.000000000 +0200 @@ -50,23 +50,39 @@ The array can be created on demand. Other consideration that may be taken into account: + 1. If columns a_j1,...,a_jm do not contain null values in the temporary -table at all, create for them only one index array (and of course do not -create any bitmaps for them). -2. Consider the ratio d(a_i)=N'/ V(a_i), where N' is the number of rows -where a_i is not null and V(a_i) is the number of distinct values for -a_i excluding nulls. - If d(a_i) is close to 1 then do not create any index array: check +table at all and v_j1,...,v_jm cannot be null, create for these columns +only one index array (and of course do not create any bitmaps for them). + +2. 
Consider the ratio d(a_i)=N'(a_i)/V(a_i), where N'(a_i) is the number +of rows, where a_i is not null and V(a_i) is the number of distinct +values for a_i excluding nulls. +If d(a_i) is close to N'(a_i) then do not create any index array: check whether there is a match running through the records that have been -filtered in. Anyway if d(a_i) is close to 1 then a intersection with -rowid{a_i=v_i} would not reduce the number of remaining rowids +filtered in. Anyway if d(a_i) is close to N'(a_i) then the intersection + with rowid{a_i=v_i} will not reduce the number of remaining rowids significantly. - If additionally N-N' is small do not create a bitmap for this column -either. -3. If for a column a_i d(a_i) is not close to 1, but N-N' is small a -sorted array of rowids from the set rowid{a_i is null} can be used -instead of a bitmap. +In other words is V(a_i) exceeds some threshold there is no sense to +create an index for a_i. +If additionally N-N'(a_i) is small do not create a bitmap for this +column either. + +3. If for a column a_i d(a_i) is not close to N'(a_i), but N-N'(a_i) is +small a sorted array of rowids from the set rowid{a_i is null} can be +used instead of a bitmap. + 4. We always have a match if R0= INTERSECT rowid{a_i is null} is not empty. Here i runs through all indexes from [1..n] such that v_i is not null. For a given subset of columns this fact has to be checked only once. It can be easily done with bitmap intersection. + +5. If v1,...,vn never can be a null, then indexes (sorted arrays) can be +created only for rows with nulls. + +6. If v1,...,vn never can be a null and number of rows with nulls is +small do not create indexes and do not create bitmaps. + +7. If you get a row with nulls in all columns stop filling the temporary +table and return UNKNOWN for any tuple <v1,...,vn>. + -=-=(Timour - Fri, 27 Nov 2009, 13:23)=-=- High-Level Specification modified. 
--- /tmp/wklog.68.old.17140 2009-11-27 11:23:17.000000000 +0000 +++ /tmp/wklog.68.new.17140 2009-11-27 11:23:17.000000000 +0000 @@ -1 +1,72 @@ +This a copy of the initial algorithm proposed by Igor: +====================================================== +For each left side tuple (v_1,...,v_n) we have to find the following set +of rowids for the temp table containing N rows as the result of +materialization of the subquery: + + R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null} where i runs +trough all indexes from [1..n] such that v_i is not null. + +Bear in mind the following specifics of this intersection: + (1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint + (2) For each i: rowid{a_i is null} is the same for each tuple + +Due to (2) it makes sense to build rowid{a_i is null} only once. +A good representation for such sets would be bitmaps: +- it requires minimum memory: not more than N*n bits in total +- search of an element in a set is extremely cheap + +Taken all above into account I could suggest the following algorithm to +build R: + + Using indexes (read about them below) for each column participating in the + intersection, + merge ordered sets rowid{a_i=v_i} in the following manner. + If a rowid r has been encountered maximum in k sets +rowid{a_i1=v_i1},...,rowid(a_ik=v_ik), + then it has to be checked against all rowid{a_i=v_i} such that i is +not in {i1,...,ik}. + As soon as we fail to find r in one of these sets we discard it. + If r has been found in all of them then r belongs to the set R. + +Here we use the property (1): any r from rowid{a_i=v_i} UNION rowid{a_i +is null} is either +belongs to rowid{a_i=v_i} or to rowid{a_i is null}. From this we can +infer that for any r from R +indexes a_i can be uniquely divided into two groups: one contains +indexes a_i where r belongs to +the sets rowid{a_i=v_i}, the other contains indexes a_j such that r +belongs to rowid{a_j is null}. 
+
+Now let's talk how to get elements from rowid{a_i=v_i} in a sorted order
+needed for the merge procedure. We could use BTREE indexes for temp
+table. But they are rather expensive and
+take a lot of memory as the are implemented with RB trees.
+I would suggest creating for each column from the temporary table just
+an array of rowids sorted by the value from column a.
+Index lookup in such an array is cheap. It's also rather cheap to check
+that the next rowid refers to a row with a different value in column a.
+The array can be created on demand.
+
+Other consideration that may be taken into account:
+1. If columns a_j1,...,a_jm do not contain null values in the temporary
+table at all, create for them only one index array (and of course do not
+create any bitmaps for them).
+2. Consider the ratio d(a_i)=N'/ V(a_i), where N' is the number of rows
+where a_i is not null and V(a_i) is the number of distinct values for
+a_i excluding nulls.
+ If d(a_i) is close to 1 then do not create any index array: check
+whether there is a match running through the records that have been
+filtered in. Anyway if d(a_i) is close to 1 then a intersection with
+rowid{a_i=v_i} would not reduce the number of remaining rowids
+significantly.
+ If additionally N-N' is small do not create a bitmap for this column
+either.
+3. If for a column a_i d(a_i) is not close to 1, but N-N' is small a
+sorted array of rowids from the set rowid{a_i is null} can be used
+instead of a bitmap.
+4. We always have a match if R0= INTERSECT rowid{a_i is null} is not
+empty. Here i runs through all indexes from [1..n] such that v_i is not
+null. For a given subset of columns this fact has to be checked only
+once. It can be easily done with bitmap intersection.

DESCRIPTION:

The goal of this task is to implement efficient execution of NOT IN subquery predicates of the form:

  <oe_1,...,oe_n> NOT IN <non_correlated subquery>

when either some oe_i, or some subquery result column, contains NULLs.

The problem with such predicates is that index lookups can be used only when neither argument of the predicate contains NULLs. If some argument contains a NULL, then due to NULL semantics it plays the role of a wildcard. If we were to use regular index lookups, we would get 'no match' for some outer tuple (thus the underlying IN evaluates to FALSE), while the SQL semantics means 'partial match', and the predicate should evaluate to NULL.

This task implements an efficient algorithm to compute such 'partial matches', where a NULL matches any value.

HIGH-LEVEL SPECIFICATION:

This is a copy of the initial algorithm proposed by Igor:
======================================================
For each left side tuple (v_1,...,v_n) we have to find the following set of rowids for the temp table containing N rows as the result of materialization of the subquery:

  R = INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null})

where i runs through all indexes from [1..n] such that v_i is not null.

Bear in mind the following specifics of this intersection:
  (1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint
  (2) For each i: rowid{a_i is null} is the same for each tuple

Due to (2) it makes sense to build rowid{a_i is null} only once. A good representation for such sets would be bitmaps:
- it requires minimum memory: not more than N*n bits in total
- search of an element in a set is extremely cheap

Taking all of the above into account, I could suggest the following algorithm to build R:

  Using indexes (read about them below) for each column participating in the intersection, merge the ordered sets rowid{a_i=v_i} in the following manner. If a rowid r has been encountered in at most k sets rowid{a_i1=v_i1},...,rowid{a_ik=v_ik}, then it has to be checked against all rowid{a_i is null} such that i is not in {i1,...,ik}. As soon as we fail to find r in one of these sets we discard it. If r has been found in all of them then r belongs to the set R.

Here we use property (1): any r from rowid{a_i=v_i} UNION rowid{a_i is null} belongs either to rowid{a_i=v_i} or to rowid{a_i is null}. From this we can infer that for any r from R the indexes a_i can be uniquely divided into two groups: one contains indexes a_i where r belongs to the sets rowid{a_i=v_i}, the other contains indexes a_j such that r belongs to rowid{a_j is null}.

Now let's talk about how to get elements from rowid{a_i=v_i} in the sorted order needed for the merge procedure. We could use BTREE indexes for the temp table, but they are rather expensive and take a lot of memory as they are implemented with RB trees. I would suggest creating for each column of the temporary table just an array of rowids sorted by the value from column a. Index lookup in such an array is cheap. It's also rather cheap to check that the next rowid refers to a row with a different value in column a. The array can be created on demand.

Other considerations that may be taken into account:

1. If columns a_j1,...,a_jm do not contain null values in the temporary table at all and v_j1,...,v_jm cannot be null, create for these columns only one index array (and of course do not create any bitmaps for them).

2. Consider the ratio d(a_i)=N'(a_i)/V(a_i), where N'(a_i) is the number of rows where a_i is not null and V(a_i) is the number of distinct values for a_i excluding nulls. If d(a_i) is close to N'(a_i) then do not create any index array: check whether there is a match by running through the records that have been filtered in. Anyway, if d(a_i) is close to N'(a_i) then the intersection with rowid{a_i=v_i} will not reduce the number of remaining rowids significantly. In other words, if V(a_i) exceeds some threshold there is no sense in creating an index for a_i. If additionally N-N'(a_i) is small, do not create a bitmap for this column either.

3. If for a column a_i d(a_i) is not close to N'(a_i), but N-N'(a_i) is small, a sorted array of rowids from the set rowid{a_i is null} can be used instead of a bitmap.

4. We always have a match if R0 = INTERSECT rowid{a_i is null} is not empty. Here i runs through all indexes from [1..n] such that v_i is not null. For a given subset of columns this fact has to be checked only once. It can be easily done with bitmap intersection.

5. If v1,...,vn can never be null, then indexes (sorted arrays) can be created only for rows with nulls.

6. If v1,...,vn can never be null and the number of rows with nulls is small, do not create indexes and do not create bitmaps.

7. If you get a row with nulls in all columns, stop filling the temporary table and return UNKNOWN for any tuple <v1,...,vn>.

ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
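The construction of R in the specification above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the server implementation: rowids are list positions, each per-column "index" is a sorted list of (value, rowid) pairs for non-NULL values, each NULL bitmap is a Python int, and all names (build_structures, rowids_equal, matching_rowids) are hypothetical. Rows that are NULL in every participating column (the set R0 from consideration 4) never appear in the merge, so they are collected separately by bitmap intersection.

```python
import heapq
from bisect import bisect_left, bisect_right
from itertools import groupby


def build_structures(rows):
    """Per column: sorted (value, rowid) pairs for non-NULLs and a NULL bitmap."""
    n_cols = len(rows[0]) if rows else 0
    sorted_idx, null_bits = [], []
    for i in range(n_cols):
        sorted_idx.append(sorted((row[i], rid) for rid, row in enumerate(rows)
                                 if row[i] is not None))
        bits = 0
        for rid, row in enumerate(rows):
            if row[i] is None:
                bits |= 1 << rid   # bit rid set means a_i is NULL in row rid
        null_bits.append(bits)
    return sorted_idx, null_bits


def rowids_equal(pairs, v):
    """rowid{a_i = v} in ascending rowid order, found by binary search."""
    lo = bisect_left(pairs, (v, -1))
    hi = bisect_right(pairs, (v, float("inf")))
    return [rid for _, rid in pairs[lo:hi]]


def matching_rowids(outer, sorted_idx, null_bits, n_rows):
    """R = INTERSECT over non-NULL v_i of (rowid{a_i=v_i} UNION rowid{a_i is null})."""
    cols = [i for i, v in enumerate(outer) if v is not None]
    # R0: rows that are NULL in all participating columns (bitmap intersection).
    r0 = (1 << n_rows) - 1
    for i in cols:
        r0 &= null_bits[i]
    result = [rid for rid in range(n_rows) if (r0 >> rid) & 1]
    # Merge the ordered sets rowid{a_i=v_i}; tag each rowid with its column.
    streams = [[(rid, i) for rid in rowids_equal(sorted_idx[i], outer[i])]
               for i in cols]
    for rid, group in groupby(heapq.merge(*streams), key=lambda t: t[0]):
        seen = {i for _, i in group}
        # Columns where rid was not found by equality must be NULL in that row.
        if all(i in seen or (null_bits[i] >> rid) & 1 for i in cols):
            result.append(rid)
    return sorted(result)


rows = [(1, 2), (1, None), (None, None), (3, 4)]
idx, nulls = build_structures(rows)
print(matching_rowids((1, 2), idx, nulls, len(rows)))
```

A non-empty R means the outer tuple has at least a partial match. The heuristics in considerations 1 to 7 (skipping indexes or bitmaps for some columns) are deliberately left out of this sketch.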




[Maria-developers] Updated (by Timour): Subquery optimization: Efficient NOT IN execution with NULLs (68)
by worklog-noreply＠askmonty.org 04 Dec '09

04 Dec '09

-----------------------------------------------------------------------
                              WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Subquery optimization: Efficient NOT IN execution with NULLs
CREATION DATE..: Fri, 27 Nov 2009, 13:22
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 68 (http://askmonty.org/worklog/?tid=68)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0

PROGRESS NOTES:

-=-=(Timour - Fri, 04 Dec 2009, 11:26)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.5182 2009-12-04 11:26:25.000000000 +0200
+++ /tmp/wklog.68.new.5182 2009-12-04 11:26:25.000000000 +0200
@@ -50,23 +50,39 @@
 The array can be created on demand.

 Other consideration that may be taken into account:
+
 1. If columns a_j1,...,a_jm do not contain null values in the temporary
-table at all, create for them only one index array (and of course do not
-create any bitmaps for them).
-2. Consider the ratio d(a_i)=N'/ V(a_i), where N' is the number of rows
-where a_i is not null and V(a_i) is the number of distinct values for
-a_i excluding nulls.
- If d(a_i) is close to 1 then do not create any index array: check
+table at all and v_j1,...,v_jm cannot be null, create for these columns
+only one index array (and of course do not create any bitmaps for them).
+
+2. Consider the ratio d(a_i)=N'(a_i)/V(a_i), where N'(a_i) is the number
+of rows, where a_i is not null and V(a_i) is the number of distinct
+values for a_i excluding nulls.
+If d(a_i) is close to N'(a_i) then do not create any index array: check
 whether there is a match running through the records that have been
-filtered in. Anyway if d(a_i) is close to 1 then a intersection with
-rowid{a_i=v_i} would not reduce the number of remaining rowids
+filtered in. Anyway if d(a_i) is close to N'(a_i) then the intersection
+ with rowid{a_i=v_i} will not reduce the number of remaining rowids
 significantly.
- If additionally N-N' is small do not create a bitmap for this column
-either.
-3. If for a column a_i d(a_i) is not close to 1, but N-N' is small a
-sorted array of rowids from the set rowid{a_i is null} can be used
-instead of a bitmap.
+In other words is V(a_i) exceeds some threshold there is no sense to
+create an index for a_i.
+If additionally N-N'(a_i) is small do not create a bitmap for this
+column either.
+
+3. If for a column a_i d(a_i) is not close to N'(a_i), but N-N'(a_i) is
+small a sorted array of rowids from the set rowid{a_i is null} can be
+used instead of a bitmap.
+
 4. We always have a match if R0= INTERSECT rowid{a_i is null} is not
 empty. Here i runs through all indexes from [1..n] such that v_i is not
 null. For a given subset of columns this fact has to be checked only
 once. It can be easily done with bitmap intersection.
+
+5. If v1,...,vn never can be a null, then indexes (sorted arrays) can be
+created only for rows with nulls.
+
+6. If v1,...,vn never can be a null and number of rows with nulls is
+small do not create indexes and do not create bitmaps.
+
+7. If you get a row with nulls in all columns stop filling the temporary
+table and return UNKNOWN for any tuple <v1,...,vn>.
+

-=-=(Timour - Fri, 27 Nov 2009, 13:23)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.17140 2009-11-27 11:23:17.000000000 +0000
+++ /tmp/wklog.68.new.17140 2009-11-27 11:23:17.000000000 +0000
@@ -1 +1,72 @@

+This a copy of the initial algorithm proposed by Igor:
+======================================================
+For each left side tuple (v_1,...,v_n) we have to find the following set
+of rowids for the temp table containing N rows as the result of
+materialization of the subquery:
+
+ R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null} where i runs
+trough all indexes from [1..n] such that v_i is not null.
+
+Bear in mind the following specifics of this intersection:
+ (1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint
+ (2) For each i: rowid{a_i is null} is the same for each tuple
+
+Due to (2) it makes sense to build rowid{a_i is null} only once.
+A good representation for such sets would be bitmaps:
+- it requires minimum memory: not more than N*n bits in total
+- search of an element in a set is extremely cheap
+
+Taken all above into account I could suggest the following algorithm to
+build R:
+
+ Using indexes (read about them below) for each column participating in the
+ intersection,
+ merge ordered sets rowid{a_i=v_i} in the following manner.
+ If a rowid r has been encountered maximum in k sets
+rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
+ then it has to be checked against all rowid{a_i=v_i} such that i is
+not in {i1,...,ik}.
+ As soon as we fail to find r in one of these sets we discard it.
+ If r has been found in all of them then r belongs to the set R.
+
+Here we use the property (1): any r from rowid{a_i=v_i} UNION rowid{a_i
+is null} is either
+belongs to rowid{a_i=v_i} or to rowid{a_i is null}. From this we can
+infer that for any r from R
+indexes a_i can be uniquely divided into two groups: one contains
+indexes a_i where r belongs to
+the sets rowid{a_i=v_i}, the other contains indexes a_j such that r
+belongs to rowid{a_j is null}.
+
+Now let's talk how to get elements from rowid{a_i=v_i} in a sorted order
+needed for the merge procedure. We could use BTREE indexes for temp
+table. But they are rather expensive and
+take a lot of memory as the are implemented with RB trees.
+I would suggest creating for each column from the temporary table just
+an array of rowids sorted by the value from column a.
+Index lookup in such an array is cheap. It's also rather cheap to check
+that the next rowid refers to a row with a different value in column a.
+The array can be created on demand.
+
+Other consideration that may be taken into account:
+1. If columns a_j1,...,a_jm do not contain null values in the temporary
+table at all, create for them only one index array (and of course do not
+create any bitmaps for them).
+2. Consider the ratio d(a_i)=N'/ V(a_i), where N' is the number of rows
+where a_i is not null and V(a_i) is the number of distinct values for
+a_i excluding nulls.
+ If d(a_i) is close to 1 then do not create any index array: check
+whether there is a match running through the records that have been
+filtered in. Anyway if d(a_i) is close to 1 then a intersection with
+rowid{a_i=v_i} would not reduce the number of remaining rowids
+significantly.
+ If additionally N-N' is small do not create a bitmap for this column
+either.
+3. If for a column a_i d(a_i) is not close to 1, but N-N' is small a
+sorted array of rowids from the set rowid{a_i is null} can be used
+instead of a bitmap.
+4. We always have a match if R0= INTERSECT rowid{a_i is null} is not
+empty. Here i runs through all indexes from [1..n] such that v_i is not
+null. For a given subset of columns this fact has to be checked only
+once. It can be easily done with bitmap intersection.

DESCRIPTION:

The goal of this task is to implement efficient execution of NOT IN
subquery predicates of the form:
<oe_1,...,oe_n> NOT IN <non_correlated subquery>
when either some oe_i, or some subquery result column contains NULLs.
The problem with such predicates is that it is possible to use index
lookups only when neither argument of the predicate contains NULLs. If
some argument contains a NULL, then due to NULL semantics, it plays the
role of a wildcard. If we were to use regular index lookups, then we
would get 'no match' for some outer tuple (so the underlying IN
comparison evaluates to FALSE), while the SQL semantics means 'partial
match', and the predicate should evaluate to NULL.

This task implements an efficient algorithm to compute such 'partial
matches', where a NULL matches any value.

HIGH-LEVEL SPECIFICATION:

This is a copy of the initial algorithm proposed by Igor:
======================================================
For each left side tuple (v_1,...,v_n) we have to find the following set
of rowids for the temp table containing N rows as the result of
materialization of the subquery:

 R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null}) where i runs
through all indexes from [1..n] such that v_i is not null.

Bear in mind the following specifics of this intersection:
 (1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint
 (2) For each i: rowid{a_i is null} is the same for each tuple

Due to (2) it makes sense to build rowid{a_i is null} only once.
A good representation for such sets would be bitmaps:
- it requires minimum memory: not more than N*n bits in total
- search of an element in a set is extremely cheap

Taking all of the above into account, I suggest the following algorithm
to build R:

 Using indexes (read about them below) for each column participating in
 the intersection, merge the ordered sets rowid{a_i=v_i} in the following
 manner.
 If a rowid r has been encountered in exactly k of the sets, namely
 rowid{a_i1=v_i1},...,rowid{a_ik=v_ik}, then it has to be checked against
 all rowid{a_i is null} such that i is not in {i1,...,ik}.
 As soon as we fail to find r in one of these sets we discard it.
 If r has been found in all of them then r belongs to the set R.
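The construction of R described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the server implementation: the class name PartialMatchIndex and its methods are hypothetical, NULL is modelled as None, a rowid is a list position, each null bitmap is a Python integer, and the per-column "index" is a rowid array sorted by column value and searched with bisect. For brevity the membership test for rowid{a_j=v_j} compares the row value directly instead of continuing the merge.

```python
from bisect import bisect_left, bisect_right


class PartialMatchIndex:
    """Hypothetical helper: per-column null bitmaps plus sorted rowid arrays."""

    def __init__(self, rows):
        self.rows = rows
        ncols = len(rows[0]) if rows else 0
        self.null_bitmap = []    # bit r of null_bitmap[i] is set iff rows[r][i] is NULL
        self.sorted_rowids = []  # non-null rowids of column i, ordered by column value
        self.sorted_keys = []    # column values in the same order, for binary search
        for i in range(ncols):
            bitmap = 0
            non_null = []
            for r, row in enumerate(rows):
                if row[i] is None:
                    bitmap |= 1 << r
                else:
                    non_null.append(r)
            # sort() is stable, so rowids stay ascending within equal values
            non_null.sort(key=lambda r, i=i: rows[r][i])
            self.null_bitmap.append(bitmap)
            self.sorted_rowids.append(non_null)
            self.sorted_keys.append([rows[r][i] for r in non_null])

    def eq_rowids(self, i, v):
        """rowid{a_i = v}: binary search in the sorted array of column i."""
        keys = self.sorted_keys[i]
        return self.sorted_rowids[i][bisect_left(keys, v):bisect_right(keys, v)]

    def in_null_set(self, i, r):
        return (self.null_bitmap[i] >> r) & 1 == 1

    def partial_matches(self, left):
        """Return R, the sorted rowids that partially match the tuple `left`."""
        cols = [i for i, v in enumerate(left) if v is not None]
        # R0: rows that are NULL in every participating column always match.
        r0 = (1 << len(self.rows)) - 1
        for i in cols:
            r0 &= self.null_bitmap[i]
        result = {r for r in range(len(self.rows)) if (r0 >> r) & 1}
        # Every other member of R occurs in at least one rowid{a_i = v_i}.
        seen = set()
        for i in cols:
            for r in self.eq_rowids(i, left[i]):
                if r in seen:
                    continue
                seen.add(r)
                # Keep r only if, for each column, it is in the equality set
                # or in the null set; discard it on the first failure.
                if all(self.rows[r][j] == left[j] or self.in_null_set(j, r)
                       for j in cols):
                    result.add(r)
        return sorted(result)


idx = PartialMatchIndex([(1, 2), (None, 3), (4, None)])
print(idx.partial_matches((1, 3)))  # row 1 matches: NULL in a_1, equal in a_2
```

A non-empty result means the outer tuple has at least one partial match; an empty result means no row of the materialized subquery can match it.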
Here we use property (1): any r from rowid{a_i=v_i} UNION rowid{a_i is
null} belongs either to rowid{a_i=v_i} or to rowid{a_i is null}. From
this we can infer that for any r from R the indexes a_i can be uniquely
divided into two groups: one contains the indexes a_i where r belongs to
the sets rowid{a_i=v_i}, the other contains the indexes a_j such that r
belongs to rowid{a_j is null}.

Now let's discuss how to get the elements of rowid{a_i=v_i} in the sorted
order needed for the merge procedure. We could use BTREE indexes for the
temp table, but they are rather expensive and take a lot of memory, as
they are implemented with RB trees.
I would suggest creating, for each column a of the temporary table, just
an array of rowids sorted by the value in column a.
Index lookup in such an array is cheap. It is also rather cheap to check
whether the next rowid refers to a row with a different value in column a.
The array can be created on demand.

Other considerations that may be taken into account:

1. If columns a_j1,...,a_jm do not contain null values in the temporary
table at all and v_j1,...,v_jm cannot be null, create for these columns
only one index array (and of course do not create any bitmaps for them).

2. Consider the ratio d(a_i)=N'(a_i)/V(a_i), where N'(a_i) is the number
of rows where a_i is not null and V(a_i) is the number of distinct
values for a_i excluding nulls.
If d(a_i) is close to N'(a_i) then do not create any index array:
instead, check for a match by running through the records that have
been filtered in. In any case, if d(a_i) is close to N'(a_i) then the
intersection with rowid{a_i=v_i} will not reduce the number of remaining
rowids significantly.
In other words, if V(a_i) exceeds some threshold there is no sense in
creating an index for a_i.
If additionally N-N'(a_i) is small, do not create a bitmap for this
column either.

3. If for a column a_i d(a_i) is not close to N'(a_i), but N-N'(a_i) is
small, a sorted array of rowids from the set rowid{a_i is null} can be
used instead of a bitmap.

4. We always have a match if R0= INTERSECT rowid{a_i is null} is not
empty. Here i runs through all indexes from [1..n] such that v_i is not
null. For a given subset of columns this fact has to be checked only
once. It can be easily done with bitmap intersection.

5. If v1,...,vn can never be null, then indexes (sorted arrays) can be
created only for rows with nulls.

6. If v1,...,vn can never be null and the number of rows with nulls is
small, do not create indexes and do not create bitmaps.

7. If you get a row with nulls in all columns, stop filling the temporary
table and return UNKNOWN for any tuple <v1,...,vn>.

ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)


