developers
Threads by month
- ----- 2024 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2023 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2022 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2021 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2020 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2019 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2018 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2017 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2016 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2015 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2014 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2013 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2012 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2011 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2010 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2009 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
April 2010
- 22 participants
- 164 discussions
[Maria-developers] Progress (by Knielsen): Store in binlog text of statements that caused RBR events (47)
by worklog-noreply@askmonty.org 06 Apr '10
by worklog-noreply@askmonty.org 06 Apr '10
06 Apr '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Store in binlog text of statements that caused RBR events
CREATION DATE..: Sat, 15 Aug 2009, 23:48
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Knielsen, Serg
CATEGORY.......: Server-Sprint
TASK ID........: 47 (http://askmonty.org/worklog/?tid=47)
VERSION........: Server-9.x
STATUS.........: In-Progress
PRIORITY.......: 60
WORKED HOURS...: 47
ESTIMATE.......: 8 (hours remain)
ORIG. ESTIMATE.: 35
PROGRESS NOTES:
-=-=(Knielsen - Tue, 06 Apr 2010, 15:26)=-=-
Code review (mailed to maria-developers@).
Worked 7 hours and estimate 8 hours remain (original estimate unchanged).
-=-=(Knielsen - Tue, 06 Apr 2010, 15:25)=-=-
Status updated.
--- /tmp/wklog.47.old.12734 2010-04-06 15:25:54.000000000 +0000
+++ /tmp/wklog.47.new.12734 2010-04-06 15:25:54.000000000 +0000
@@ -1 +1 @@
-Code-Review
+In-Progress
-=-=(Knielsen - Mon, 29 Mar 2010, 10:59)=-=-
Status updated.
--- /tmp/wklog.47.old.27790 2010-03-29 10:59:53.000000000 +0000
+++ /tmp/wklog.47.new.27790 2010-03-29 10:59:53.000000000 +0000
@@ -1 +1 @@
-In-Progress
+Code-Review
-=-=(Alexi - Thu, 18 Feb 2010, 19:29)=-=-
Worked 20 hours (alexi)
Worked 20 hours and estimate 15 hours remain (original estimate unchanged).
-=-=(Serg - Fri, 05 Feb 2010, 14:04)=-=-
Observers changed: Knielsen,Serg
-=-=(Guest - Fri, 05 Feb 2010, 13:40)=-=-
Category updated.
--- /tmp/wklog.47.old.9197 2010-02-05 13:40:36.000000000 +0200
+++ /tmp/wklog.47.new.9197 2010-02-05 13:40:36.000000000 +0200
@@ -1 +1 @@
-Server-RawIdeaBin
+Server-Sprint
-=-=(Guest - Fri, 05 Feb 2010, 13:40)=-=-
Status updated.
--- /tmp/wklog.47.old.9197 2010-02-05 13:40:36.000000000 +0200
+++ /tmp/wklog.47.new.9197 2010-02-05 13:40:36.000000000 +0200
@@ -1 +1 @@
-Un-Assigned
+In-Progress
-=-=(Alexi - Thu, 04 Feb 2010, 09:54)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.16174 2010-02-04 09:54:13.000000000 +0200
+++ /tmp/wklog.47.new.16174 2010-02-04 09:54:13.000000000 +0200
@@ -171,35 +171,20 @@
5. How slave IO thread requests Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-When requesting an event, the slave should inform the master whether
-it should send Annotate_rows events or not. To that end we add a new
-BINLOG_SEND_ANNOTATE_ROWS_EVENT flag used when requesting an event:
+If the replicate-annotate-rows-events option is not set on a slave, there
+is no need for master to send Annotate_rows events to this slave. The slave
+(or mysqlbinlog in remote case), before requesting binlog dump via the
+COM_BINLOG_DUMP command, informs the master whether it should send these
+events by executing the newly added COM_BINLOG_DUMP_OPTIONS_EXT server
+command:
+
+ case COM_BINLOG_DUMP_OPTIONS_EXT:
+ thd->binlog_dump_flags_ext= packet[0];
+ my_ok(thd);
+ break;
- #define BINLOG_DUMP_NON_BLOCK 1
- #define BINLOG_SEND_ANNOTATE_ROWS_EVENT 2
-
- pthread_handler_t handle_slave_io(void *arg)
- { ...
- request_dump(mysql, ...);
- ...
- }
-
- int request_dump(MYSQL* mysql, ...)
- { ...
- if (opt_log_slave_updates &&
- mi->io_thd->variables.binlog_annotate_rows_events)
- binlog_flags|= BINLOG_SEND_ANNOTATE_ROWS_EVENT;
- ...
- int2store(buf + 4, binlog_flags);
- ...
- simple_command(mysql, COM_BINLOG_DUMP, buf, ...);
- ...
- }
-
-NOTE. mysqlbinlog, when remotely requesting BINLOG_DUMP by calling the
-simple_command() function, should also use this flag if it wants (in case
-of the --print-annotate-rows-events option set) to recieve Annotate_rows
-events.
+Note. We add this new command and don't use COM_BINLOG_DUMP to avoid possible
+conflicts with MySQL/Sun.
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -338,10 +323,4 @@
to internals(a)lists.mysql.com about this and suggesting to reserve the
event number.
-Also we should notice the introduction of the BINLOG_SEND_ANNOTATE_ROWS_EVENT
-flag taking into account that MySQL/Sun may also introduce a flag with the
-same value to be used in the request_dump-mysql_binlog_send interface.
-But this is mainly the question of merging: if a conflict concerning this
-flag occur, we may simply change the BINLOG_SEND_ANNOTATE_ROWS_EVENT value
-(this does not require additional changes in the code).
-=-=(Alexi - Sun, 20 Dec 2009, 16:00)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.19667 2009-12-20 14:00:56.000000000 +0000
+++ /tmp/wklog.47.new.19667 2009-12-20 14:00:56.000000000 +0000
@@ -196,6 +196,11 @@
...
}
+NOTE. mysqlbinlog, when remotely requesting BINLOG_DUMP by calling the
+simple_command() function, should also use this flag if it wants (in case
+of the --print-annotate-rows-events option set) to recieve Annotate_rows
+events.
+
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -212,8 +217,7 @@
Log_event::read_log_event(&log, packet, ...);
...
if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
- flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT ||
- thd->server_id == 0 /* slave == mysqlbinlog */ )
+ flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT)
{
my_net_write(net, packet->ptr(), packet->length());
}
-=-=(Alexi - Sun, 20 Dec 2009, 13:14)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.11350 2009-12-20 13:14:04.000000000 +0200
+++ /tmp/wklog.47.new.11350 2009-12-20 13:14:04.000000000 +0200
@@ -282,23 +282,18 @@
Annotate_rows_log_event* m_annotate_event;
};
-When the saved Annotate_rows object may be deleted? When all corresponding
-Rows events will be processed, i.e. before processing the first non-Rows
-event (note that Annotate_rows object resides in the binary log *after*
-the (possible) 'BEGIN' Query event which accompanies the rows events; note
-also that this deletion is adjusted with the case when some or all
-corresponding Rows events are filtered out by replicate filter rules):
+The saved Annotate_rows object should be deleted when all corresponding
+Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
- if (rli->get_annotate_event() && !IS_RBR_EVENT_TYPE(ev->get_type_code()))
- rli->free_annotate_event();
-
apply_event_and_update_pos(ev, ...);
- if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
+ if (rli->get_annotate_event() && is_last_rows_event(ev))
+ rli->free_annotate_event();
+ else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
@@ -307,10 +302,21 @@
where
- #define IS_RBR_EVENT_TYPE(type) ( (type) == TABLE_MAP_EVENT || \
- (type) == WRITE_ROWS_EVENT || \
+ bool is_last_rows_event(Log_event* ev)
+ {
+ Log_event_type type= ev->get_type_code();
+ if (IS_ROWS_EVENT_TYPE(type))
+ {
+ Rows_log_event* rows= (Rows_log_event*)ev;
+ return rows->get_flags(Rows_log_event::STMT_END_F);
+ }
+
+ return 0;
+ }
+
+ #define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
- (type) == DELETE_ROWS_EVENT )
+ (type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
------------------------------------------------------------
-=-=(View All Progress Notes, 27 total)=-=-
http://askmonty.org/worklog/index.pl?tid=47&nolimit=1
DESCRIPTION:
Store in binlog (and show in mysqlbinlog output) texts of statements that
caused RBR events
This is needed for (list from Monty):
- Easier to understand why updates happened
- Would make it easier to find out where in application things went
wrong (as you can search for exact strings)
- Allow one to filter things based on comments in the statement.
The cost of this can be that the binlog will be approximately 2x in size
(especially insert of big blob's would be a bit painful), so this should
be an optional feature.
HIGH-LEVEL SPECIFICATION:
Content
~~~~~~~
1. Annotate_rows_log_event
2. Server option: --binlog-annotate-rows-events
3. Server option: --replicate-annotate-rows-events
4. mysqlbinlog option: --print-annotate-rows-events
5. mysqlbinlog output
1. Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Describes the query which caused the corresponding rows events. Has empty
post-header and contains the query text in its data part. Example:
************************
ANNOTATE_ROWS_EVENT
************************
00000220 | B6 A0 2C 4B | time_when = 1261215926
00000224 | 33 | event_type = 51
00000225 | 64 00 00 00 | server_id = 100
00000229 | 36 00 00 00 | event_len = 54
0000022D | 56 02 00 00 | log_pos = 00000256
00000231 | 00 00 | flags = <none>
------------------------
00000233 | 49 4E 53 45 | query = "INSERT INTO t1 VALUES (1), (2), (3)"
00000237 | 52 54 20 49 |
0000023B | 4E 54 4F 20 |
0000023F | 74 31 20 56 |
00000243 | 41 4C 55 45 |
00000247 | 53 20 28 31 |
0000024B | 29 2C 20 28 |
0000024F | 32 29 2C 20 |
00000253 | 28 33 29 |
************************
In binary log, Annotate_rows event follows the (possible) 'BEGIN' Query event
and precedes the first of Table map events which accompany the corresponding
rows events. (See example in the "mysqlbinlog output" section below.)
2. Server option: --binlog-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the master to write Annotate_rows events to the binary log.
* Variable Name: binlog_annotate_rows_events
* Scope: Global & Session
* Access Type: Dynamic
* Data Type: bool
* Default Value: OFF
NOTE. Session values allows to annotate only some selected statements:
...
SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
3. Server option: --replicate-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the slave to reproduce Annotate_rows events recieved from the master
in its own binary log (sensible only in pair with log-slave-updates option).
* Variable Name: replicate_annotate_rows_events
* Scope: Global
* Access Type: Read only
* Data Type: bool
* Default Value: OFF
NOTE. Why do we additionally need this 'replicate' option? Why not to make
the slave to reproduce this events when its binlog-annotate-rows-events
global value is ON? Well, because, for example, we may want to configure
the slave which should reproduce Annotate_rows events but has global
binlog-annotate-rows-events = OFF meaning this to be the default value for
the client threads (see also "How slave treats replicate-annotate-rows-events
option" in LLD part).
4. mysqlbinlog option: --print-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
With this option, mysqlbinlog prints the content of Annotate_rows events (if
the binary log does contain them). Without this option (i.e. by default),
mysqlbinlog skips Annotate_rows events.
5. mysqlbinlog output
~~~~~~~~~~~~~~~~~~~~~
With --print-annotate-rows-events, mysqlbinlog outputs Annotate_rows events
in a form like this:
...
# at 1646
#091219 12:45:26 server id 100 end_log_pos 1714 Query thread_id=1
exec_time=0 error_code=0
SET TIMESTAMP=1261215926/*!*/;
BEGIN
/*!*/;
# at 1714
# at 1812
# at 1853
# at 1894
# at 1938
#091219 12:45:26 server id 100 end_log_pos 1812 Query: `DELETE t1, t2 FROM
t1 INNER JOIN t2 INNER JOIN t3 WHERE t1.a=t2.a AND t2.a=t3.a`
#091219 12:45:26 server id 100 end_log_pos 1853 Table_map: `test`.`t1`
mapped to number 16
#091219 12:45:26 server id 100 end_log_pos 1894 Table_map: `test`.`t2`
mapped to number 17
#091219 12:45:26 server id 100 end_log_pos 1938 Delete_rows: table id 16
#091219 12:45:26 server id 100 end_log_pos 1982 Delete_rows: table id 17
flags: STMT_END_F
...
LOW-LEVEL DESIGN:
Content
~~~~~~~
1. Annotate_rows event number
2. Outline of Annotate_rows event behavior
3. How Master writes Annotate_rows events to the binary log
4. How slave treats replicate-annotate-rows-events option
5. How slave IO thread requests Annotate_rows events
6. How master executes the request
7. How slave SQL thread processes Annotate_rows events
8. General remarks
1. Annotate_rows event number
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To avoid possible event numbers conflict with MySQL/Sun, we leave a gap
between the last MySQL event number and the Annotate_rows event number:
enum Log_event_type
{ ...
INCIDENT_EVENT= 26,
// New MySQL event numbers are to be added here
MYSQL_EVENTS_END,
MARIA_EVENTS_BEGIN= 51,
// New Maria event numbers start from here
ANNOTATE_ROWS_EVENT= 51,
ENUM_END_EVENT
};
together with the corresponding extension of 'post_header_len' array in the
Format description event. (This extension does not affect the compatibility
of the binary log). Here is how Format description event looks like with
this extension:
************************
FORMAT_DESCRIPTION_EVENT
************************
00000004 | A1 A0 2C 4B | time_when = 1261215905
00000008 | 0F | event_type = 15
00000009 | 64 00 00 00 | server_id = 100
0000000D | 7F 00 00 00 | event_len = 127
00000011 | 83 00 00 00 | log_pos = 00000083
00000015 | 01 00 | flags = LOG_EVENT_BINLOG_IN_USE_F
------------------------
00000017 | 04 00 | binlog_ver = 4
00000019 | 35 2E 32 2E | server_ver = 5.2.0-MariaDB-alpha-debug-log
..... ...
0000004B | A1 A0 2C 4B | time_created = 1261215905
0000004F | 13 | common_header_len = 19
------------------------
post_header_len
------------------------
00000050 | 38 | 56 - START_EVENT_V3 [1]
..... ...
00000069 | 02 | 2 - INCIDENT_EVENT [26]
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Each Annotate_rows_log_event object has two private members describing the
corresponding query:
char *m_query_txt;
uint m_query_len;
When the object is created for writing to a binary log, this query is taken
from 'thd' (for short, below we omit the 'Annotate_rows_log_event::' prefix
as well as other implementation details):
Annotate_rows_log_event(THD *thd)
{
m_query_txt = thd->query();
m_query_len = thd->query_length();
}
When the object is read from a binary log, the query is taken from the buffer
containing the binary log representation of the event (this buffer is allocated
in Log_event object from which all Log events are derived):
Annotate_rows_log_event(char *buf, uint event_len,
Format_description_log_event *desc)
{
m_query_len = event_len - desc->common_header_len;
m_query_txt = buf + desc->common_header_len;
}
The events are written to the binary log by the Log_event::write() member
which calls virtual write_data_header() and write_data_body() members
("data header" and "post header" are synonym in replication terminology).
In our case, data header is empty and data body is just the query:
bool write_data_body(IO_CACHE *file)
{
return my_b_safe_write(file, (uchar*) m_query_txt, m_query_len);
}
Printing the event is just printing the query:
void Annotate_rows_log_event::print(FILE *file, PRINT_EVENT_INFO *pinfo)
{
my_b_printf(&pinfo->head_cache, "\tQuery: `%s`\n", m_query_txt);
}
3. How Master writes Annotate_rows events to the binary log
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The event is written to the binary log just before the group of Table_map
events which precede corresponding Rows events (one query may generate
several Table map events in the binary log, but the corresponding
Annotate_rows event must be written only once before the first Table map
event; hence the boolean variable 'with_annotate' below):
int write_locked_table_maps(THD *thd)
{ ...
bool with_annotate= thd->variables.binlog_annotate_rows_events;
...
for (uint i= 0; i < ... <number of tables> ...; ++i)
{ ...
thd->binlog_write_table_map(table, ..., with_annotate);
with_annotate= 0; // write Annotate_event not more than once
...
}
...
}
int THD::binlog_write_table_map(TABLE *table, ..., bool with_annotate)
{ ...
Table_map_log_event the_event(...);
...
if (with_annotate)
{
Annotate_rows_log_event anno(this);
mysql_bin_log.write(&anno);
}
mysql_bin_log.write(&the_event);
...
}
4. How slave treats replicate-annotate-rows-events option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The replicate-annotate-rows-events option is treated just as the session
value of the binlog_annotate_rows_events variable for the slave IO and
SQL threads. This setting is done during initialization of these threads:
pthread_handler_t handle_slave_io(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_IO);
...
}
pthread_handler_t handle_slave_sql(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_SQL);
...
}
int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type)
{ ...
thd->variables.binlog_annotate_rows_events=
opt_replicate_annotate_rows_events;
...
}
5. How slave IO thread requests Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If the replicate-annotate-rows-events option is not set on a slave, there
is no need for master to send Annotate_rows events to this slave. The slave
(or mysqlbinlog in remote case), before requesting binlog dump via the
COM_BINLOG_DUMP command, informs the master whether it should send these
events by executing the newly added COM_BINLOG_DUMP_OPTIONS_EXT server
command:
case COM_BINLOG_DUMP_OPTIONS_EXT:
thd->binlog_dump_flags_ext= packet[0];
my_ok(thd);
break;
Note. We add this new command and don't use COM_BINLOG_DUMP to avoid possible
conflicts with MySQL/Sun.
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
case COM_BINLOG_DUMP:
{ ...
flags= uint2korr(packet + 4);
...
mysql_binlog_send(thd, ..., flags);
...
}
void mysql_binlog_send(THD* thd, ..., ushort flags)
{ ...
Log_event::read_log_event(&log, packet, ...);
...
if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT)
{
my_net_write(net, packet->ptr(), packet->length());
}
...
}
7. How slave SQL thread processes Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The slave processes each recieved event by "applying" it, i.e. by
calling the Log_event::apply_event() function which in turn calls
the virtual do_apply_event() member specific for each type of the
event.
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev = next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
int apply_event_and_update_pos(Log_event *ev, ...)
{ ...
ev->apply_event(...);
...
}
int Log_event::apply_event(...)
{
return do_apply_event(...);
}
What does it mean to "apply" an Annotate_rows event? It means to set current
thd query to that of the described by the event, i.e. to the query which
caused the subsequent Rows events (see "How Master writes Annotate_rows
events to the binary log" to follow what happens further when the subsequent
Rows events are applied):
int Annotate_rows_log_event::do_apply_event(...)
{
thd->set_query(m_query_txt, m_query_len);
}
NOTE. I am not sure, but possibly current values of thd->query and
thd->query_length should be saved before calling set_query() and to be
restored on the Annotate_rows_log_event object deletion.
Is it really needed ?
After calling this do_apply_event() function we may not delete the
Annotate_rows_log_event object immediatedly (see exec_relay_log_event()
above) because thd->query now points to the string inside this object.
We may keep the pointer to this object in the Relay_log_info:
class Relay_log_info
{
public:
...
void set_annotate_event(Annotate_rows_log_event*);
Annotate_rows_log_event* get_annotate_event();
void free_annotate_event();
...
private:
Annotate_rows_log_event* m_annotate_event;
};
The saved Annotate_rows object should be deleted when all corresponding
Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (rli->get_annotate_event() && is_last_rows_event(ev))
rli->free_annotate_event();
else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
where
bool is_last_rows_event(Log_event* ev)
{
Log_event_type type= ev->get_type_code();
if (IS_ROWS_EVENT_TYPE(type))
{
Rows_log_event* rows= (Rows_log_event*)ev;
return rows->get_flags(Rows_log_event::STMT_END_F);
}
return 0;
}
#define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
(type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
Kristian noticed that introducing new log event type should be coordinated
somehow with MySQL/Sun:
Kristian: The numeric code for this event must be assigned carefully.
It should be coordinated with MySQL/Sun, otherwise we can get into a
situation where MySQL uses the same numeric code for one event that
MariaDB uses for ANNOTATE_ROWS_EVENT, which would make merging the two
impossible.
Alex: I reserved about 20 numbers not to have possible conflicts
with MySQL.
Kristian: Still, I think it would be appropriate to send a polite email
to internals(a)lists.mysql.com about this and suggesting to reserve the
event number.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0
[Maria-developers] Updated (by Knielsen): Store in binlog text of statements that caused RBR events (47)
by worklog-noreply@askmonty.org 06 Apr '10
by worklog-noreply@askmonty.org 06 Apr '10
06 Apr '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Store in binlog text of statements that caused RBR events
CREATION DATE..: Sat, 15 Aug 2009, 23:48
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Knielsen, Serg
CATEGORY.......: Server-Sprint
TASK ID........: 47 (http://askmonty.org/worklog/?tid=47)
VERSION........: Server-9.x
STATUS.........: In-Progress
PRIORITY.......: 60
WORKED HOURS...: 40
ESTIMATE.......: 15 (hours remain)
ORIG. ESTIMATE.: 35
PROGRESS NOTES:
-=-=(Knielsen - Tue, 06 Apr 2010, 15:25)=-=-
Status updated.
--- /tmp/wklog.47.old.12734 2010-04-06 15:25:54.000000000 +0000
+++ /tmp/wklog.47.new.12734 2010-04-06 15:25:54.000000000 +0000
@@ -1 +1 @@
-Code-Review
+In-Progress
-=-=(Knielsen - Mon, 29 Mar 2010, 10:59)=-=-
Status updated.
--- /tmp/wklog.47.old.27790 2010-03-29 10:59:53.000000000 +0000
+++ /tmp/wklog.47.new.27790 2010-03-29 10:59:53.000000000 +0000
@@ -1 +1 @@
-In-Progress
+Code-Review
-=-=(Alexi - Thu, 18 Feb 2010, 19:29)=-=-
Worked 20 hours (alexi)
Worked 20 hours and estimate 15 hours remain (original estimate unchanged).
-=-=(Serg - Fri, 05 Feb 2010, 14:04)=-=-
Observers changed: Knielsen,Serg
-=-=(Guest - Fri, 05 Feb 2010, 13:40)=-=-
Category updated.
--- /tmp/wklog.47.old.9197 2010-02-05 13:40:36.000000000 +0200
+++ /tmp/wklog.47.new.9197 2010-02-05 13:40:36.000000000 +0200
@@ -1 +1 @@
-Server-RawIdeaBin
+Server-Sprint
-=-=(Guest - Fri, 05 Feb 2010, 13:40)=-=-
Status updated.
--- /tmp/wklog.47.old.9197 2010-02-05 13:40:36.000000000 +0200
+++ /tmp/wklog.47.new.9197 2010-02-05 13:40:36.000000000 +0200
@@ -1 +1 @@
-Un-Assigned
+In-Progress
-=-=(Alexi - Thu, 04 Feb 2010, 09:54)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.16174 2010-02-04 09:54:13.000000000 +0200
+++ /tmp/wklog.47.new.16174 2010-02-04 09:54:13.000000000 +0200
@@ -171,35 +171,20 @@
5. How slave IO thread requests Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-When requesting an event, the slave should inform the master whether
-it should send Annotate_rows events or not. To that end we add a new
-BINLOG_SEND_ANNOTATE_ROWS_EVENT flag used when requesting an event:
+If the replicate-annotate-rows-events option is not set on a slave, there
+is no need for master to send Annotate_rows events to this slave. The slave
+(or mysqlbinlog in remote case), before requesting binlog dump via the
+COM_BINLOG_DUMP command, informs the master whether it should send these
+events by executing the newly added COM_BINLOG_DUMP_OPTIONS_EXT server
+command:
+
+ case COM_BINLOG_DUMP_OPTIONS_EXT:
+ thd->binlog_dump_flags_ext= packet[0];
+ my_ok(thd);
+ break;
- #define BINLOG_DUMP_NON_BLOCK 1
- #define BINLOG_SEND_ANNOTATE_ROWS_EVENT 2
-
- pthread_handler_t handle_slave_io(void *arg)
- { ...
- request_dump(mysql, ...);
- ...
- }
-
- int request_dump(MYSQL* mysql, ...)
- { ...
- if (opt_log_slave_updates &&
- mi->io_thd->variables.binlog_annotate_rows_events)
- binlog_flags|= BINLOG_SEND_ANNOTATE_ROWS_EVENT;
- ...
- int2store(buf + 4, binlog_flags);
- ...
- simple_command(mysql, COM_BINLOG_DUMP, buf, ...);
- ...
- }
-
-NOTE. mysqlbinlog, when remotely requesting BINLOG_DUMP by calling the
-simple_command() function, should also use this flag if it wants (in case
-of the --print-annotate-rows-events option set) to recieve Annotate_rows
-events.
+Note. We add this new command and don't use COM_BINLOG_DUMP to avoid possible
+conflicts with MySQL/Sun.
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -338,10 +323,4 @@
to internals(a)lists.mysql.com about this and suggesting to reserve the
event number.
-Also we should notice the introduction of the BINLOG_SEND_ANNOTATE_ROWS_EVENT
-flag taking into account that MySQL/Sun may also introduce a flag with the
-same value to be used in the request_dump-mysql_binlog_send interface.
-But this is mainly the question of merging: if a conflict concerning this
-flag occur, we may simply change the BINLOG_SEND_ANNOTATE_ROWS_EVENT value
-(this does not require additional changes in the code).
-=-=(Alexi - Sun, 20 Dec 2009, 16:00)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.19667 2009-12-20 14:00:56.000000000 +0000
+++ /tmp/wklog.47.new.19667 2009-12-20 14:00:56.000000000 +0000
@@ -196,6 +196,11 @@
...
}
+NOTE. mysqlbinlog, when remotely requesting BINLOG_DUMP by calling the
+simple_command() function, should also use this flag if it wants (in case
+of the --print-annotate-rows-events option set) to recieve Annotate_rows
+events.
+
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -212,8 +217,7 @@
Log_event::read_log_event(&log, packet, ...);
...
if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
- flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT ||
- thd->server_id == 0 /* slave == mysqlbinlog */ )
+ flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT)
{
my_net_write(net, packet->ptr(), packet->length());
}
-=-=(Alexi - Sun, 20 Dec 2009, 13:14)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.11350 2009-12-20 13:14:04.000000000 +0200
+++ /tmp/wklog.47.new.11350 2009-12-20 13:14:04.000000000 +0200
@@ -282,23 +282,18 @@
Annotate_rows_log_event* m_annotate_event;
};
-When the saved Annotate_rows object may be deleted? When all corresponding
-Rows events will be processed, i.e. before processing the first non-Rows
-event (note that Annotate_rows object resides in the binary log *after*
-the (possible) 'BEGIN' Query event which accompanies the rows events; note
-also that this deletion is adjusted with the case when some or all
-corresponding Rows events are filtered out by replicate filter rules):
+The saved Annotate_rows object should be deleted when all corresponding
+Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
- if (rli->get_annotate_event() && !IS_RBR_EVENT_TYPE(ev->get_type_code()))
- rli->free_annotate_event();
-
apply_event_and_update_pos(ev, ...);
- if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
+ if (rli->get_annotate_event() && is_last_rows_event(ev))
+ rli->free_annotate_event();
+ else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
@@ -307,10 +302,21 @@
where
- #define IS_RBR_EVENT_TYPE(type) ( (type) == TABLE_MAP_EVENT || \
- (type) == WRITE_ROWS_EVENT || \
+ bool is_last_rows_event(Log_event* ev)
+ {
+ Log_event_type type= ev->get_type_code();
+ if (IS_ROWS_EVENT_TYPE(type))
+ {
+ Rows_log_event* rows= (Rows_log_event*)ev;
+ return rows->get_flags(Rows_log_event::STMT_END_F);
+ }
+
+ return 0;
+ }
+
+ #define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
- (type) == DELETE_ROWS_EVENT )
+ (type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
-=-=(Alexi - Sun, 20 Dec 2009, 09:29)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.32726 2009-12-20 07:29:56.000000000 +0000
+++ /tmp/wklog.47.new.32726 2009-12-20 07:29:56.000000000 +0000
@@ -56,7 +56,7 @@
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
- 00000082 | 00 | 0 - ANNOTATE_RBR_EVENT [51]
+ 00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
------------------------------------------------------------
-=-=(View All Progress Notes, 26 total)=-=-
http://askmonty.org/worklog/index.pl?tid=47&nolimit=1
DESCRIPTION:
Store in binlog (and show in mysqlbinlog output) texts of statements that
caused RBR events
This is needed for (list from Monty):
- Easier to understand why updates happened
- Would make it easier to find out where in application things went
wrong (as you can search for exact strings)
- Allow one to filter things based on comments in the statement.
The cost of this can be that the binlog will be approximately 2x in size
(especially insert of big blob's would be a bit painful), so this should
be an optional feature.
HIGH-LEVEL SPECIFICATION:
Content
~~~~~~~
1. Annotate_rows_log_event
2. Server option: --binlog-annotate-rows-events
3. Server option: --replicate-annotate-rows-events
4. mysqlbinlog option: --print-annotate-rows-events
5. mysqlbinlog output
1. Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Describes the query which caused the corresponding rows events. Has empty
post-header and contains the query text in its data part. Example:
************************
ANNOTATE_ROWS_EVENT
************************
00000220 | B6 A0 2C 4B | time_when = 1261215926
00000224 | 33 | event_type = 51
00000225 | 64 00 00 00 | server_id = 100
00000229 | 36 00 00 00 | event_len = 54
0000022D | 56 02 00 00 | log_pos = 00000256
00000231 | 00 00 | flags = <none>
------------------------
00000233 | 49 4E 53 45 | query = "INSERT INTO t1 VALUES (1), (2), (3)"
00000237 | 52 54 20 49 |
0000023B | 4E 54 4F 20 |
0000023F | 74 31 20 56 |
00000243 | 41 4C 55 45 |
00000247 | 53 20 28 31 |
0000024B | 29 2C 20 28 |
0000024F | 32 29 2C 20 |
00000253 | 28 33 29 |
************************
In binary log, Annotate_rows event follows the (possible) 'BEGIN' Query event
and precedes the first of Table map events which accompany the corresponding
rows events. (See example in the "mysqlbinlog output" section below.)
2. Server option: --binlog-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the master to write Annotate_rows events to the binary log.
* Variable Name: binlog_annotate_rows_events
* Scope: Global & Session
* Access Type: Dynamic
* Data Type: bool
* Default Value: OFF
NOTE. Session values allows to annotate only some selected statements:
...
SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
3. Server option: --replicate-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the slave to reproduce Annotate_rows events recieved from the master
in its own binary log (sensible only in pair with log-slave-updates option).
* Variable Name: replicate_annotate_rows_events
* Scope: Global
* Access Type: Read only
* Data Type: bool
* Default Value: OFF
NOTE. Why do we additionally need this 'replicate' option? Why not to make
the slave to reproduce this events when its binlog-annotate-rows-events
global value is ON? Well, because, for example, we may want to configure
the slave which should reproduce Annotate_rows events but has global
binlog-annotate-rows-events = OFF meaning this to be the default value for
the client threads (see also "How slave treats replicate-annotate-rows-events
option" in LLD part).
4. mysqlbinlog option: --print-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
With this option, mysqlbinlog prints the content of Annotate_rows events (if
the binary log does contain them). Without this option (i.e. by default),
mysqlbinlog skips Annotate_rows events.
5. mysqlbinlog output
~~~~~~~~~~~~~~~~~~~~~
With --print-annotate-rows-events, mysqlbinlog outputs Annotate_rows events
in a form like this:
...
# at 1646
#091219 12:45:26 server id 100 end_log_pos 1714 Query thread_id=1
exec_time=0 error_code=0
SET TIMESTAMP=1261215926/*!*/;
BEGIN
/*!*/;
# at 1714
# at 1812
# at 1853
# at 1894
# at 1938
#091219 12:45:26 server id 100 end_log_pos 1812 Query: `DELETE t1, t2 FROM
t1 INNER JOIN t2 INNER JOIN t3 WHERE t1.a=t2.a AND t2.a=t3.a`
#091219 12:45:26 server id 100 end_log_pos 1853 Table_map: `test`.`t1`
mapped to number 16
#091219 12:45:26 server id 100 end_log_pos 1894 Table_map: `test`.`t2`
mapped to number 17
#091219 12:45:26 server id 100 end_log_pos 1938 Delete_rows: table id 16
#091219 12:45:26 server id 100 end_log_pos 1982 Delete_rows: table id 17
flags: STMT_END_F
...
LOW-LEVEL DESIGN:
Content
~~~~~~~
1. Annotate_rows event number
2. Outline of Annotate_rows event behavior
3. How Master writes Annotate_rows events to the binary log
4. How slave treats replicate-annotate-rows-events option
5. How slave IO thread requests Annotate_rows events
6. How master executes the request
7. How slave SQL thread processes Annotate_rows events
8. General remarks
1. Annotate_rows event number
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To avoid possible event numbers conflict with MySQL/Sun, we leave a gap
between the last MySQL event number and the Annotate_rows event number:
enum Log_event_type
{ ...
INCIDENT_EVENT= 26,
// New MySQL event numbers are to be added here
MYSQL_EVENTS_END,
MARIA_EVENTS_BEGIN= 51,
// New Maria event numbers start from here
ANNOTATE_ROWS_EVENT= 51,
ENUM_END_EVENT
};
together with the corresponding extension of 'post_header_len' array in the
Format description event. (This extension does not affect the compatibility
of the binary log). Here is how Format description event looks like with
this extension:
************************
FORMAT_DESCRIPTION_EVENT
************************
00000004 | A1 A0 2C 4B | time_when = 1261215905
00000008 | 0F | event_type = 15
00000009 | 64 00 00 00 | server_id = 100
0000000D | 7F 00 00 00 | event_len = 127
00000011 | 83 00 00 00 | log_pos = 00000083
00000015 | 01 00 | flags = LOG_EVENT_BINLOG_IN_USE_F
------------------------
00000017 | 04 00 | binlog_ver = 4
00000019 | 35 2E 32 2E | server_ver = 5.2.0-MariaDB-alpha-debug-log
..... ...
0000004B | A1 A0 2C 4B | time_created = 1261215905
0000004F | 13 | common_header_len = 19
------------------------
post_header_len
------------------------
00000050 | 38 | 56 - START_EVENT_V3 [1]
..... ...
00000069 | 02 | 2 - INCIDENT_EVENT [26]
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Each Annotate_rows_log_event object has two private members describing the
corresponding query:
char *m_query_txt;
uint m_query_len;
When the object is created for writing to a binary log, this query is taken
from 'thd' (for short, below we omit the 'Annotate_rows_log_event::' prefix
as well as other implementation details):
Annotate_rows_log_event(THD *thd)
{
m_query_txt = thd->query();
m_query_len = thd->query_length();
}
When the object is read from a binary log, the query is taken from the buffer
containing the binary log representation of the event (this buffer is allocated
in Log_event object from which all Log events are derived):
Annotate_rows_log_event(char *buf, uint event_len,
Format_description_log_event *desc)
{
m_query_len = event_len - desc->common_header_len;
m_query_txt = buf + desc->common_header_len;
}
The events are written to the binary log by the Log_event::write() member
which calls virtual write_data_header() and write_data_body() members
("data header" and "post header" are synonym in replication terminology).
In our case, data header is empty and data body is just the query:
bool write_data_body(IO_CACHE *file)
{
return my_b_safe_write(file, (uchar*) m_query_txt, m_query_len);
}
Printing the event is just printing the query:
void Annotate_rows_log_event::print(FILE *file, PRINT_EVENT_INFO *pinfo)
{
my_b_printf(&pinfo->head_cache, "\tQuery: `%s`\n", m_query_txt);
}
3. How Master writes Annotate_rows events to the binary log
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The event is written to the binary log just before the group of Table_map
events which precede corresponding Rows events (one query may generate
several Table map events in the binary log, but the corresponding
Annotate_rows event must be written only once before the first Table map
event; hence the boolean variable 'with_annotate' below):
int write_locked_table_maps(THD *thd)
{ ...
bool with_annotate= thd->variables.binlog_annotate_rows_events;
...
for (uint i= 0; i < ... <number of tables> ...; ++i)
{ ...
thd->binlog_write_table_map(table, ..., with_annotate);
with_annotate= 0; // write Annotate_event not more than once
...
}
...
}
int THD::binlog_write_table_map(TABLE *table, ..., bool with_annotate)
{ ...
Table_map_log_event the_event(...);
...
if (with_annotate)
{
Annotate_rows_log_event anno(this);
mysql_bin_log.write(&anno);
}
mysql_bin_log.write(&the_event);
...
}
4. How slave treats replicate-annotate-rows-events option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The replicate-annotate-rows-events option is treated just as the session
value of the binlog_annotate_rows_events variable for the slave IO and
SQL threads. This setting is done during initialization of these threads:
pthread_handler_t handle_slave_io(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_IO);
...
}
pthread_handler_t handle_slave_sql(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_SQL);
...
}
int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type)
{ ...
thd->variables.binlog_annotate_rows_events=
opt_replicate_annotate_rows_events;
...
}
5. How slave IO thread requests Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If the replicate-annotate-rows-events option is not set on a slave, there
is no need for master to send Annotate_rows events to this slave. The slave
(or mysqlbinlog in remote case), before requesting binlog dump via the
COM_BINLOG_DUMP command, informs the master whether it should send these
events by executing the newly added COM_BINLOG_DUMP_OPTIONS_EXT server
command:
case COM_BINLOG_DUMP_OPTIONS_EXT:
thd->binlog_dump_flags_ext= packet[0];
my_ok(thd);
break;
Note. We add this new command and don't use COM_BINLOG_DUMP to avoid possible
conflicts with MySQL/Sun.
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
case COM_BINLOG_DUMP:
{ ...
flags= uint2korr(packet + 4);
...
mysql_binlog_send(thd, ..., flags);
...
}
void mysql_binlog_send(THD* thd, ..., ushort flags)
{ ...
Log_event::read_log_event(&log, packet, ...);
...
if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT)
{
my_net_write(net, packet->ptr(), packet->length());
}
...
}
7. How slave SQL thread processes Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The slave processes each recieved event by "applying" it, i.e. by
calling the Log_event::apply_event() function which in turn calls
the virtual do_apply_event() member specific for each type of the
event.
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev = next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
int apply_event_and_update_pos(Log_event *ev, ...)
{ ...
ev->apply_event(...);
...
}
int Log_event::apply_event(...)
{
return do_apply_event(...);
}
What does it mean to "apply" an Annotate_rows event? It means to set current
thd query to that of the described by the event, i.e. to the query which
caused the subsequent Rows events (see "How Master writes Annotate_rows
events to the binary log" to follow what happens further when the subsequent
Rows events are applied):
int Annotate_rows_log_event::do_apply_event(...)
{
thd->set_query(m_query_txt, m_query_len);
}
NOTE. I am not sure, but possibly current values of thd->query and
thd->query_length should be saved before calling set_query() and to be
restored on the Annotate_rows_log_event object deletion.
Is it really needed ?
After calling this do_apply_event() function we may not delete the
Annotate_rows_log_event object immediatedly (see exec_relay_log_event()
above) because thd->query now points to the string inside this object.
We may keep the pointer to this object in the Relay_log_info:
class Relay_log_info
{
public:
...
void set_annotate_event(Annotate_rows_log_event*);
Annotate_rows_log_event* get_annotate_event();
void free_annotate_event();
...
private:
Annotate_rows_log_event* m_annotate_event;
};
The saved Annotate_rows object should be deleted when all corresponding
Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (rli->get_annotate_event() && is_last_rows_event(ev))
rli->free_annotate_event();
else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
where
bool is_last_rows_event(Log_event* ev)
{
Log_event_type type= ev->get_type_code();
if (IS_ROWS_EVENT_TYPE(type))
{
Rows_log_event* rows= (Rows_log_event*)ev;
return rows->get_flags(Rows_log_event::STMT_END_F);
}
return 0;
}
#define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
(type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
Kristian noticed that introducing new log event type should be coordinated
somehow with MySQL/Sun:
Kristian: The numeric code for this event must be assigned carefully.
It should be coordinated with MySQL/Sun, otherwise we can get into a
situation where MySQL uses the same numeric code for one event that
MariaDB uses for ANNOTATE_ROWS_EVENT, which would make merging the two
impossible.
Alex: I reserved about 20 numbers not to have possible conflicts
with MySQL.
Kristian: Still, I think it would be appropriate to send a polite email
to internals(a)lists.mysql.com about this and suggesting to reserve the
event number.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0
[Maria-developers] Updated (by Knielsen): Store in binlog text of statements that caused RBR events (47)
by worklog-noreply@askmonty.org 06 Apr '10
by worklog-noreply@askmonty.org 06 Apr '10
06 Apr '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Store in binlog text of statements that caused RBR events
CREATION DATE..: Sat, 15 Aug 2009, 23:48
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Knielsen, Serg
CATEGORY.......: Server-Sprint
TASK ID........: 47 (http://askmonty.org/worklog/?tid=47)
VERSION........: Server-9.x
STATUS.........: In-Progress
PRIORITY.......: 60
WORKED HOURS...: 40
ESTIMATE.......: 15 (hours remain)
ORIG. ESTIMATE.: 35
PROGRESS NOTES:
-=-=(Knielsen - Tue, 06 Apr 2010, 15:25)=-=-
Status updated.
--- /tmp/wklog.47.old.12734 2010-04-06 15:25:54.000000000 +0000
+++ /tmp/wklog.47.new.12734 2010-04-06 15:25:54.000000000 +0000
@@ -1 +1 @@
-Code-Review
+In-Progress
-=-=(Knielsen - Mon, 29 Mar 2010, 10:59)=-=-
Status updated.
--- /tmp/wklog.47.old.27790 2010-03-29 10:59:53.000000000 +0000
+++ /tmp/wklog.47.new.27790 2010-03-29 10:59:53.000000000 +0000
@@ -1 +1 @@
-In-Progress
+Code-Review
-=-=(Alexi - Thu, 18 Feb 2010, 19:29)=-=-
Worked 20 hours (alexi)
Worked 20 hours and estimate 15 hours remain (original estimate unchanged).
-=-=(Serg - Fri, 05 Feb 2010, 14:04)=-=-
Observers changed: Knielsen,Serg
-=-=(Guest - Fri, 05 Feb 2010, 13:40)=-=-
Category updated.
--- /tmp/wklog.47.old.9197 2010-02-05 13:40:36.000000000 +0200
+++ /tmp/wklog.47.new.9197 2010-02-05 13:40:36.000000000 +0200
@@ -1 +1 @@
-Server-RawIdeaBin
+Server-Sprint
-=-=(Guest - Fri, 05 Feb 2010, 13:40)=-=-
Status updated.
--- /tmp/wklog.47.old.9197 2010-02-05 13:40:36.000000000 +0200
+++ /tmp/wklog.47.new.9197 2010-02-05 13:40:36.000000000 +0200
@@ -1 +1 @@
-Un-Assigned
+In-Progress
-=-=(Alexi - Thu, 04 Feb 2010, 09:54)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.16174 2010-02-04 09:54:13.000000000 +0200
+++ /tmp/wklog.47.new.16174 2010-02-04 09:54:13.000000000 +0200
@@ -171,35 +171,20 @@
5. How slave IO thread requests Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-When requesting an event, the slave should inform the master whether
-it should send Annotate_rows events or not. To that end we add a new
-BINLOG_SEND_ANNOTATE_ROWS_EVENT flag used when requesting an event:
+If the replicate-annotate-rows-events option is not set on a slave, there
+is no need for master to send Annotate_rows events to this slave. The slave
+(or mysqlbinlog in remote case), before requesting binlog dump via the
+COM_BINLOG_DUMP command, informs the master whether it should send these
+events by executing the newly added COM_BINLOG_DUMP_OPTIONS_EXT server
+command:
+
+ case COM_BINLOG_DUMP_OPTIONS_EXT:
+ thd->binlog_dump_flags_ext= packet[0];
+ my_ok(thd);
+ break;
- #define BINLOG_DUMP_NON_BLOCK 1
- #define BINLOG_SEND_ANNOTATE_ROWS_EVENT 2
-
- pthread_handler_t handle_slave_io(void *arg)
- { ...
- request_dump(mysql, ...);
- ...
- }
-
- int request_dump(MYSQL* mysql, ...)
- { ...
- if (opt_log_slave_updates &&
- mi->io_thd->variables.binlog_annotate_rows_events)
- binlog_flags|= BINLOG_SEND_ANNOTATE_ROWS_EVENT;
- ...
- int2store(buf + 4, binlog_flags);
- ...
- simple_command(mysql, COM_BINLOG_DUMP, buf, ...);
- ...
- }
-
-NOTE. mysqlbinlog, when remotely requesting BINLOG_DUMP by calling the
-simple_command() function, should also use this flag if it wants (in case
-of the --print-annotate-rows-events option set) to recieve Annotate_rows
-events.
+Note. We add this new command and don't use COM_BINLOG_DUMP to avoid possible
+conflicts with MySQL/Sun.
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -338,10 +323,4 @@
to internals(a)lists.mysql.com about this and suggesting to reserve the
event number.
-Also we should notice the introduction of the BINLOG_SEND_ANNOTATE_ROWS_EVENT
-flag taking into account that MySQL/Sun may also introduce a flag with the
-same value to be used in the request_dump-mysql_binlog_send interface.
-But this is mainly the question of merging: if a conflict concerning this
-flag occur, we may simply change the BINLOG_SEND_ANNOTATE_ROWS_EVENT value
-(this does not require additional changes in the code).
-=-=(Alexi - Sun, 20 Dec 2009, 16:00)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.19667 2009-12-20 14:00:56.000000000 +0000
+++ /tmp/wklog.47.new.19667 2009-12-20 14:00:56.000000000 +0000
@@ -196,6 +196,11 @@
...
}
+NOTE. mysqlbinlog, when remotely requesting BINLOG_DUMP by calling the
+simple_command() function, should also use this flag if it wants (in case
+of the --print-annotate-rows-events option set) to recieve Annotate_rows
+events.
+
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -212,8 +217,7 @@
Log_event::read_log_event(&log, packet, ...);
...
if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
- flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT ||
- thd->server_id == 0 /* slave == mysqlbinlog */ )
+ flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT)
{
my_net_write(net, packet->ptr(), packet->length());
}
-=-=(Alexi - Sun, 20 Dec 2009, 13:14)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.11350 2009-12-20 13:14:04.000000000 +0200
+++ /tmp/wklog.47.new.11350 2009-12-20 13:14:04.000000000 +0200
@@ -282,23 +282,18 @@
Annotate_rows_log_event* m_annotate_event;
};
-When the saved Annotate_rows object may be deleted? When all corresponding
-Rows events will be processed, i.e. before processing the first non-Rows
-event (note that Annotate_rows object resides in the binary log *after*
-the (possible) 'BEGIN' Query event which accompanies the rows events; note
-also that this deletion is adjusted with the case when some or all
-corresponding Rows events are filtered out by replicate filter rules):
+The saved Annotate_rows object should be deleted when all corresponding
+Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
- if (rli->get_annotate_event() && !IS_RBR_EVENT_TYPE(ev->get_type_code()))
- rli->free_annotate_event();
-
apply_event_and_update_pos(ev, ...);
- if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
+ if (rli->get_annotate_event() && is_last_rows_event(ev))
+ rli->free_annotate_event();
+ else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
@@ -307,10 +302,21 @@
where
- #define IS_RBR_EVENT_TYPE(type) ( (type) == TABLE_MAP_EVENT || \
- (type) == WRITE_ROWS_EVENT || \
+ bool is_last_rows_event(Log_event* ev)
+ {
+ Log_event_type type= ev->get_type_code();
+ if (IS_ROWS_EVENT_TYPE(type))
+ {
+ Rows_log_event* rows= (Rows_log_event*)ev;
+ return rows->get_flags(Rows_log_event::STMT_END_F);
+ }
+
+ return 0;
+ }
+
+ #define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
- (type) == DELETE_ROWS_EVENT )
+ (type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
-=-=(Alexi - Sun, 20 Dec 2009, 09:29)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.32726 2009-12-20 07:29:56.000000000 +0000
+++ /tmp/wklog.47.new.32726 2009-12-20 07:29:56.000000000 +0000
@@ -56,7 +56,7 @@
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
- 00000082 | 00 | 0 - ANNOTATE_RBR_EVENT [51]
+ 00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
------------------------------------------------------------
-=-=(View All Progress Notes, 26 total)=-=-
http://askmonty.org/worklog/index.pl?tid=47&nolimit=1
DESCRIPTION:
Store in binlog (and show in mysqlbinlog output) texts of statements that
caused RBR events
This is needed for (list from Monty):
- Easier to understand why updates happened
- Would make it easier to find out where in application things went
wrong (as you can search for exact strings)
- Allow one to filter things based on comments in the statement.
The cost of this can be that the binlog will be approximately 2x in size
(especially insert of big blob's would be a bit painful), so this should
be an optional feature.
HIGH-LEVEL SPECIFICATION:
Content
~~~~~~~
1. Annotate_rows_log_event
2. Server option: --binlog-annotate-rows-events
3. Server option: --replicate-annotate-rows-events
4. mysqlbinlog option: --print-annotate-rows-events
5. mysqlbinlog output
1. Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Describes the query which caused the corresponding rows events. Has empty
post-header and contains the query text in its data part. Example:
************************
ANNOTATE_ROWS_EVENT
************************
00000220 | B6 A0 2C 4B | time_when = 1261215926
00000224 | 33 | event_type = 51
00000225 | 64 00 00 00 | server_id = 100
00000229 | 36 00 00 00 | event_len = 54
0000022D | 56 02 00 00 | log_pos = 00000256
00000231 | 00 00 | flags = <none>
------------------------
00000233 | 49 4E 53 45 | query = "INSERT INTO t1 VALUES (1), (2), (3)"
00000237 | 52 54 20 49 |
0000023B | 4E 54 4F 20 |
0000023F | 74 31 20 56 |
00000243 | 41 4C 55 45 |
00000247 | 53 20 28 31 |
0000024B | 29 2C 20 28 |
0000024F | 32 29 2C 20 |
00000253 | 28 33 29 |
************************
In binary log, Annotate_rows event follows the (possible) 'BEGIN' Query event
and precedes the first of Table map events which accompany the corresponding
rows events. (See example in the "mysqlbinlog output" section below.)
2. Server option: --binlog-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the master to write Annotate_rows events to the binary log.
* Variable Name: binlog_annotate_rows_events
* Scope: Global & Session
* Access Type: Dynamic
* Data Type: bool
* Default Value: OFF
NOTE. Session values allows to annotate only some selected statements:
...
SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
3. Server option: --replicate-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the slave to reproduce Annotate_rows events recieved from the master
in its own binary log (sensible only in pair with log-slave-updates option).
* Variable Name: replicate_annotate_rows_events
* Scope: Global
* Access Type: Read only
* Data Type: bool
* Default Value: OFF
NOTE. Why do we additionally need this 'replicate' option? Why not to make
the slave to reproduce this events when its binlog-annotate-rows-events
global value is ON? Well, because, for example, we may want to configure
the slave which should reproduce Annotate_rows events but has global
binlog-annotate-rows-events = OFF meaning this to be the default value for
the client threads (see also "How slave treats replicate-annotate-rows-events
option" in LLD part).
4. mysqlbinlog option: --print-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
With this option, mysqlbinlog prints the content of Annotate_rows events (if
the binary log does contain them). Without this option (i.e. by default),
mysqlbinlog skips Annotate_rows events.
5. mysqlbinlog output
~~~~~~~~~~~~~~~~~~~~~
With --print-annotate-rows-events, mysqlbinlog outputs Annotate_rows events
in a form like this:
...
# at 1646
#091219 12:45:26 server id 100 end_log_pos 1714 Query thread_id=1
exec_time=0 error_code=0
SET TIMESTAMP=1261215926/*!*/;
BEGIN
/*!*/;
# at 1714
# at 1812
# at 1853
# at 1894
# at 1938
#091219 12:45:26 server id 100 end_log_pos 1812 Query: `DELETE t1, t2 FROM
t1 INNER JOIN t2 INNER JOIN t3 WHERE t1.a=t2.a AND t2.a=t3.a`
#091219 12:45:26 server id 100 end_log_pos 1853 Table_map: `test`.`t1`
mapped to number 16
#091219 12:45:26 server id 100 end_log_pos 1894 Table_map: `test`.`t2`
mapped to number 17
#091219 12:45:26 server id 100 end_log_pos 1938 Delete_rows: table id 16
#091219 12:45:26 server id 100 end_log_pos 1982 Delete_rows: table id 17
flags: STMT_END_F
...
LOW-LEVEL DESIGN:
Content
~~~~~~~
1. Annotate_rows event number
2. Outline of Annotate_rows event behavior
3. How Master writes Annotate_rows events to the binary log
4. How slave treats replicate-annotate-rows-events option
5. How slave IO thread requests Annotate_rows events
6. How master executes the request
7. How slave SQL thread processes Annotate_rows events
8. General remarks
1. Annotate_rows event number
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To avoid possible event numbers conflict with MySQL/Sun, we leave a gap
between the last MySQL event number and the Annotate_rows event number:
enum Log_event_type
{ ...
INCIDENT_EVENT= 26,
// New MySQL event numbers are to be added here
MYSQL_EVENTS_END,
MARIA_EVENTS_BEGIN= 51,
// New Maria event numbers start from here
ANNOTATE_ROWS_EVENT= 51,
ENUM_END_EVENT
};
together with the corresponding extension of 'post_header_len' array in the
Format description event. (This extension does not affect the compatibility
of the binary log). Here is how Format description event looks like with
this extension:
************************
FORMAT_DESCRIPTION_EVENT
************************
00000004 | A1 A0 2C 4B | time_when = 1261215905
00000008 | 0F | event_type = 15
00000009 | 64 00 00 00 | server_id = 100
0000000D | 7F 00 00 00 | event_len = 127
00000011 | 83 00 00 00 | log_pos = 00000083
00000015 | 01 00 | flags = LOG_EVENT_BINLOG_IN_USE_F
------------------------
00000017 | 04 00 | binlog_ver = 4
00000019 | 35 2E 32 2E | server_ver = 5.2.0-MariaDB-alpha-debug-log
..... ...
0000004B | A1 A0 2C 4B | time_created = 1261215905
0000004F | 13 | common_header_len = 19
------------------------
post_header_len
------------------------
00000050 | 38 | 56 - START_EVENT_V3 [1]
..... ...
00000069 | 02 | 2 - INCIDENT_EVENT [26]
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Each Annotate_rows_log_event object has two private members describing the
corresponding query:
char *m_query_txt;
uint m_query_len;
When the object is created for writing to a binary log, this query is taken
from 'thd' (for short, below we omit the 'Annotate_rows_log_event::' prefix
as well as other implementation details):
Annotate_rows_log_event(THD *thd)
{
m_query_txt = thd->query();
m_query_len = thd->query_length();
}
When the object is read from a binary log, the query is taken from the buffer
containing the binary log representation of the event (this buffer is allocated
in Log_event object from which all Log events are derived):
Annotate_rows_log_event(char *buf, uint event_len,
Format_description_log_event *desc)
{
m_query_len = event_len - desc->common_header_len;
m_query_txt = buf + desc->common_header_len;
}
The events are written to the binary log by the Log_event::write() member
which calls virtual write_data_header() and write_data_body() members
("data header" and "post header" are synonym in replication terminology).
In our case, data header is empty and data body is just the query:
bool write_data_body(IO_CACHE *file)
{
return my_b_safe_write(file, (uchar*) m_query_txt, m_query_len);
}
Printing the event is just printing the query:
void Annotate_rows_log_event::print(FILE *file, PRINT_EVENT_INFO *pinfo)
{
my_b_printf(&pinfo->head_cache, "\tQuery: `%s`\n", m_query_txt);
}
3. How Master writes Annotate_rows events to the binary log
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The event is written to the binary log just before the group of Table_map
events which precede corresponding Rows events (one query may generate
several Table map events in the binary log, but the corresponding
Annotate_rows event must be written only once before the first Table map
event; hence the boolean variable 'with_annotate' below):
int write_locked_table_maps(THD *thd)
{ ...
bool with_annotate= thd->variables.binlog_annotate_rows_events;
...
for (uint i= 0; i < ... <number of tables> ...; ++i)
{ ...
thd->binlog_write_table_map(table, ..., with_annotate);
with_annotate= 0; // write Annotate_event not more than once
...
}
...
}
int THD::binlog_write_table_map(TABLE *table, ..., bool with_annotate)
{ ...
Table_map_log_event the_event(...);
...
if (with_annotate)
{
Annotate_rows_log_event anno(this);
mysql_bin_log.write(&anno);
}
mysql_bin_log.write(&the_event);
...
}
4. How slave treats replicate-annotate-rows-events option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The replicate-annotate-rows-events option is treated just as the session
value of the binlog_annotate_rows_events variable for the slave IO and
SQL threads. This setting is done during initialization of these threads:
pthread_handler_t handle_slave_io(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_IO);
...
}
pthread_handler_t handle_slave_sql(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_SQL);
...
}
int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type)
{ ...
thd->variables.binlog_annotate_rows_events=
opt_replicate_annotate_rows_events;
...
}
5. How slave IO thread requests Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If the replicate-annotate-rows-events option is not set on a slave, there
is no need for master to send Annotate_rows events to this slave. The slave
(or mysqlbinlog in remote case), before requesting binlog dump via the
COM_BINLOG_DUMP command, informs the master whether it should send these
events by executing the newly added COM_BINLOG_DUMP_OPTIONS_EXT server
command:
case COM_BINLOG_DUMP_OPTIONS_EXT:
thd->binlog_dump_flags_ext= packet[0];
my_ok(thd);
break;
Note. We add this new command and don't use COM_BINLOG_DUMP to avoid possible
conflicts with MySQL/Sun.
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
case COM_BINLOG_DUMP:
{ ...
flags= uint2korr(packet + 4);
...
mysql_binlog_send(thd, ..., flags);
...
}
void mysql_binlog_send(THD* thd, ..., ushort flags)
{ ...
Log_event::read_log_event(&log, packet, ...);
...
if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT)
{
my_net_write(net, packet->ptr(), packet->length());
}
...
}
7. How slave SQL thread processes Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The slave processes each recieved event by "applying" it, i.e. by
calling the Log_event::apply_event() function which in turn calls
the virtual do_apply_event() member specific for each type of the
event.
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev = next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
int apply_event_and_update_pos(Log_event *ev, ...)
{ ...
ev->apply_event(...);
...
}
int Log_event::apply_event(...)
{
return do_apply_event(...);
}
What does it mean to "apply" an Annotate_rows event? It means to set current
thd query to that of the described by the event, i.e. to the query which
caused the subsequent Rows events (see "How Master writes Annotate_rows
events to the binary log" to follow what happens further when the subsequent
Rows events are applied):
int Annotate_rows_log_event::do_apply_event(...)
{
thd->set_query(m_query_txt, m_query_len);
}
NOTE. I am not sure, but possibly current values of thd->query and
thd->query_length should be saved before calling set_query() and to be
restored on the Annotate_rows_log_event object deletion.
Is it really needed ?
After calling this do_apply_event() function we may not delete the
Annotate_rows_log_event object immediatedly (see exec_relay_log_event()
above) because thd->query now points to the string inside this object.
We may keep the pointer to this object in the Relay_log_info:
class Relay_log_info
{
public:
...
void set_annotate_event(Annotate_rows_log_event*);
Annotate_rows_log_event* get_annotate_event();
void free_annotate_event();
...
private:
Annotate_rows_log_event* m_annotate_event;
};
The saved Annotate_rows object should be deleted when all corresponding
Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (rli->get_annotate_event() && is_last_rows_event(ev))
rli->free_annotate_event();
else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
where
bool is_last_rows_event(Log_event* ev)
{
Log_event_type type= ev->get_type_code();
if (IS_ROWS_EVENT_TYPE(type))
{
Rows_log_event* rows= (Rows_log_event*)ev;
return rows->get_flags(Rows_log_event::STMT_END_F);
}
return 0;
}
#define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
(type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
Kristian noticed that introducing new log event type should be coordinated
somehow with MySQL/Sun:
Kristian: The numeric code for this event must be assigned carefully.
It should be coordinated with MySQL/Sun, otherwise we can get into a
situation where MySQL uses the same numeric code for one event that
MariaDB uses for ANNOTATE_ROWS_EVENT, which would make merging the two
impossible.
Alex: I reserved about 20 numbers not to have possible conflicts
with MySQL.
Kristian: Still, I think it would be appropriate to send a polite email
to internals(a)lists.mysql.com about this and suggesting to reserve the
event number.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0
[Maria-developers] Updated (by Knielsen): Store in binlog text of statements that caused RBR events (47)
by worklog-noreply@askmonty.org 06 Apr '10
by worklog-noreply@askmonty.org 06 Apr '10
06 Apr '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Store in binlog text of statements that caused RBR events
CREATION DATE..: Sat, 15 Aug 2009, 23:48
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Knielsen, Serg
CATEGORY.......: Server-Sprint
TASK ID........: 47 (http://askmonty.org/worklog/?tid=47)
VERSION........: Server-9.x
STATUS.........: In-Progress
PRIORITY.......: 60
WORKED HOURS...: 40
ESTIMATE.......: 15 (hours remain)
ORIG. ESTIMATE.: 35
PROGRESS NOTES:
-=-=(Knielsen - Tue, 06 Apr 2010, 15:25)=-=-
Status updated.
--- /tmp/wklog.47.old.12734 2010-04-06 15:25:54.000000000 +0000
+++ /tmp/wklog.47.new.12734 2010-04-06 15:25:54.000000000 +0000
@@ -1 +1 @@
-Code-Review
+In-Progress
-=-=(Knielsen - Mon, 29 Mar 2010, 10:59)=-=-
Status updated.
--- /tmp/wklog.47.old.27790 2010-03-29 10:59:53.000000000 +0000
+++ /tmp/wklog.47.new.27790 2010-03-29 10:59:53.000000000 +0000
@@ -1 +1 @@
-In-Progress
+Code-Review
-=-=(Alexi - Thu, 18 Feb 2010, 19:29)=-=-
Worked 20 hours (alexi)
Worked 20 hours and estimate 15 hours remain (original estimate unchanged).
-=-=(Serg - Fri, 05 Feb 2010, 14:04)=-=-
Observers changed: Knielsen,Serg
-=-=(Guest - Fri, 05 Feb 2010, 13:40)=-=-
Category updated.
--- /tmp/wklog.47.old.9197 2010-02-05 13:40:36.000000000 +0200
+++ /tmp/wklog.47.new.9197 2010-02-05 13:40:36.000000000 +0200
@@ -1 +1 @@
-Server-RawIdeaBin
+Server-Sprint
-=-=(Guest - Fri, 05 Feb 2010, 13:40)=-=-
Status updated.
--- /tmp/wklog.47.old.9197 2010-02-05 13:40:36.000000000 +0200
+++ /tmp/wklog.47.new.9197 2010-02-05 13:40:36.000000000 +0200
@@ -1 +1 @@
-Un-Assigned
+In-Progress
-=-=(Alexi - Thu, 04 Feb 2010, 09:54)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.16174 2010-02-04 09:54:13.000000000 +0200
+++ /tmp/wklog.47.new.16174 2010-02-04 09:54:13.000000000 +0200
@@ -171,35 +171,20 @@
5. How slave IO thread requests Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-When requesting an event, the slave should inform the master whether
-it should send Annotate_rows events or not. To that end we add a new
-BINLOG_SEND_ANNOTATE_ROWS_EVENT flag used when requesting an event:
+If the replicate-annotate-rows-events option is not set on a slave, there
+is no need for master to send Annotate_rows events to this slave. The slave
+(or mysqlbinlog in remote case), before requesting binlog dump via the
+COM_BINLOG_DUMP command, informs the master whether it should send these
+events by executing the newly added COM_BINLOG_DUMP_OPTIONS_EXT server
+command:
+
+ case COM_BINLOG_DUMP_OPTIONS_EXT:
+ thd->binlog_dump_flags_ext= packet[0];
+ my_ok(thd);
+ break;
- #define BINLOG_DUMP_NON_BLOCK 1
- #define BINLOG_SEND_ANNOTATE_ROWS_EVENT 2
-
- pthread_handler_t handle_slave_io(void *arg)
- { ...
- request_dump(mysql, ...);
- ...
- }
-
- int request_dump(MYSQL* mysql, ...)
- { ...
- if (opt_log_slave_updates &&
- mi->io_thd->variables.binlog_annotate_rows_events)
- binlog_flags|= BINLOG_SEND_ANNOTATE_ROWS_EVENT;
- ...
- int2store(buf + 4, binlog_flags);
- ...
- simple_command(mysql, COM_BINLOG_DUMP, buf, ...);
- ...
- }
-
-NOTE. mysqlbinlog, when remotely requesting BINLOG_DUMP by calling the
-simple_command() function, should also use this flag if it wants (in case
-of the --print-annotate-rows-events option set) to recieve Annotate_rows
-events.
+Note. We add this new command and don't use COM_BINLOG_DUMP to avoid possible
+conflicts with MySQL/Sun.
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -338,10 +323,4 @@
to internals(a)lists.mysql.com about this and suggesting to reserve the
event number.
-Also we should notice the introduction of the BINLOG_SEND_ANNOTATE_ROWS_EVENT
-flag taking into account that MySQL/Sun may also introduce a flag with the
-same value to be used in the request_dump-mysql_binlog_send interface.
-But this is mainly the question of merging: if a conflict concerning this
-flag occur, we may simply change the BINLOG_SEND_ANNOTATE_ROWS_EVENT value
-(this does not require additional changes in the code).
-=-=(Alexi - Sun, 20 Dec 2009, 16:00)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.19667 2009-12-20 14:00:56.000000000 +0000
+++ /tmp/wklog.47.new.19667 2009-12-20 14:00:56.000000000 +0000
@@ -196,6 +196,11 @@
...
}
+NOTE. mysqlbinlog, when remotely requesting BINLOG_DUMP by calling the
+simple_command() function, should also use this flag if it wants (in case
+of the --print-annotate-rows-events option set) to recieve Annotate_rows
+events.
+
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -212,8 +217,7 @@
Log_event::read_log_event(&log, packet, ...);
...
if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
- flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT ||
- thd->server_id == 0 /* slave == mysqlbinlog */ )
+ flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT)
{
my_net_write(net, packet->ptr(), packet->length());
}
-=-=(Alexi - Sun, 20 Dec 2009, 13:14)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.11350 2009-12-20 13:14:04.000000000 +0200
+++ /tmp/wklog.47.new.11350 2009-12-20 13:14:04.000000000 +0200
@@ -282,23 +282,18 @@
Annotate_rows_log_event* m_annotate_event;
};
-When the saved Annotate_rows object may be deleted? When all corresponding
-Rows events will be processed, i.e. before processing the first non-Rows
-event (note that Annotate_rows object resides in the binary log *after*
-the (possible) 'BEGIN' Query event which accompanies the rows events; note
-also that this deletion is adjusted with the case when some or all
-corresponding Rows events are filtered out by replicate filter rules):
+The saved Annotate_rows object should be deleted when all corresponding
+Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
- if (rli->get_annotate_event() && !IS_RBR_EVENT_TYPE(ev->get_type_code()))
- rli->free_annotate_event();
-
apply_event_and_update_pos(ev, ...);
- if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
+ if (rli->get_annotate_event() && is_last_rows_event(ev))
+ rli->free_annotate_event();
+ else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
@@ -307,10 +302,21 @@
where
- #define IS_RBR_EVENT_TYPE(type) ( (type) == TABLE_MAP_EVENT || \
- (type) == WRITE_ROWS_EVENT || \
+ bool is_last_rows_event(Log_event* ev)
+ {
+ Log_event_type type= ev->get_type_code();
+ if (IS_ROWS_EVENT_TYPE(type))
+ {
+ Rows_log_event* rows= (Rows_log_event*)ev;
+ return rows->get_flags(Rows_log_event::STMT_END_F);
+ }
+
+ return 0;
+ }
+
+ #define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
- (type) == DELETE_ROWS_EVENT )
+ (type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
-=-=(Alexi - Sun, 20 Dec 2009, 09:29)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.32726 2009-12-20 07:29:56.000000000 +0000
+++ /tmp/wklog.47.new.32726 2009-12-20 07:29:56.000000000 +0000
@@ -56,7 +56,7 @@
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
- 00000082 | 00 | 0 - ANNOTATE_RBR_EVENT [51]
+ 00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
------------------------------------------------------------
-=-=(View All Progress Notes, 26 total)=-=-
http://askmonty.org/worklog/index.pl?tid=47&nolimit=1
DESCRIPTION:
Store in binlog (and show in mysqlbinlog output) texts of statements that
caused RBR events
This is needed for (list from Monty):
- Easier to understand why updates happened
- Would make it easier to find out where in application things went
wrong (as you can search for exact strings)
- Allow one to filter things based on comments in the statement.
The cost of this can be that the binlog will be approximately 2x in size
(especially insert of big blob's would be a bit painful), so this should
be an optional feature.
HIGH-LEVEL SPECIFICATION:
Content
~~~~~~~
1. Annotate_rows_log_event
2. Server option: --binlog-annotate-rows-events
3. Server option: --replicate-annotate-rows-events
4. mysqlbinlog option: --print-annotate-rows-events
5. mysqlbinlog output
1. Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Describes the query which caused the corresponding rows events. Has empty
post-header and contains the query text in its data part. Example:
************************
ANNOTATE_ROWS_EVENT
************************
00000220 | B6 A0 2C 4B | time_when = 1261215926
00000224 | 33 | event_type = 51
00000225 | 64 00 00 00 | server_id = 100
00000229 | 36 00 00 00 | event_len = 54
0000022D | 56 02 00 00 | log_pos = 00000256
00000231 | 00 00 | flags = <none>
------------------------
00000233 | 49 4E 53 45 | query = "INSERT INTO t1 VALUES (1), (2), (3)"
00000237 | 52 54 20 49 |
0000023B | 4E 54 4F 20 |
0000023F | 74 31 20 56 |
00000243 | 41 4C 55 45 |
00000247 | 53 20 28 31 |
0000024B | 29 2C 20 28 |
0000024F | 32 29 2C 20 |
00000253 | 28 33 29 |
************************
In binary log, Annotate_rows event follows the (possible) 'BEGIN' Query event
and precedes the first of Table map events which accompany the corresponding
rows events. (See example in the "mysqlbinlog output" section below.)
2. Server option: --binlog-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the master to write Annotate_rows events to the binary log.
* Variable Name: binlog_annotate_rows_events
* Scope: Global & Session
* Access Type: Dynamic
* Data Type: bool
* Default Value: OFF
NOTE. Session values allows to annotate only some selected statements:
...
SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
3. Server option: --replicate-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the slave to reproduce Annotate_rows events recieved from the master
in its own binary log (sensible only in pair with log-slave-updates option).
* Variable Name: replicate_annotate_rows_events
* Scope: Global
* Access Type: Read only
* Data Type: bool
* Default Value: OFF
NOTE. Why do we additionally need this 'replicate' option? Why not to make
the slave to reproduce this events when its binlog-annotate-rows-events
global value is ON? Well, because, for example, we may want to configure
the slave which should reproduce Annotate_rows events but has global
binlog-annotate-rows-events = OFF meaning this to be the default value for
the client threads (see also "How slave treats replicate-annotate-rows-events
option" in LLD part).
4. mysqlbinlog option: --print-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
With this option, mysqlbinlog prints the content of Annotate_rows events (if
the binary log does contain them). Without this option (i.e. by default),
mysqlbinlog skips Annotate_rows events.
5. mysqlbinlog output
~~~~~~~~~~~~~~~~~~~~~
With --print-annotate-rows-events, mysqlbinlog outputs Annotate_rows events
in a form like this:
...
# at 1646
#091219 12:45:26 server id 100 end_log_pos 1714 Query thread_id=1
exec_time=0 error_code=0
SET TIMESTAMP=1261215926/*!*/;
BEGIN
/*!*/;
# at 1714
# at 1812
# at 1853
# at 1894
# at 1938
#091219 12:45:26 server id 100 end_log_pos 1812 Query: `DELETE t1, t2 FROM
t1 INNER JOIN t2 INNER JOIN t3 WHERE t1.a=t2.a AND t2.a=t3.a`
#091219 12:45:26 server id 100 end_log_pos 1853 Table_map: `test`.`t1`
mapped to number 16
#091219 12:45:26 server id 100 end_log_pos 1894 Table_map: `test`.`t2`
mapped to number 17
#091219 12:45:26 server id 100 end_log_pos 1938 Delete_rows: table id 16
#091219 12:45:26 server id 100 end_log_pos 1982 Delete_rows: table id 17
flags: STMT_END_F
...
LOW-LEVEL DESIGN:
Content
~~~~~~~
1. Annotate_rows event number
2. Outline of Annotate_rows event behavior
3. How Master writes Annotate_rows events to the binary log
4. How slave treats replicate-annotate-rows-events option
5. How slave IO thread requests Annotate_rows events
6. How master executes the request
7. How slave SQL thread processes Annotate_rows events
8. General remarks
1. Annotate_rows event number
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To avoid possible event numbers conflict with MySQL/Sun, we leave a gap
between the last MySQL event number and the Annotate_rows event number:
enum Log_event_type
{ ...
INCIDENT_EVENT= 26,
// New MySQL event numbers are to be added here
MYSQL_EVENTS_END,
MARIA_EVENTS_BEGIN= 51,
// New Maria event numbers start from here
ANNOTATE_ROWS_EVENT= 51,
ENUM_END_EVENT
};
together with the corresponding extension of 'post_header_len' array in the
Format description event. (This extension does not affect the compatibility
of the binary log). Here is how Format description event looks like with
this extension:
************************
FORMAT_DESCRIPTION_EVENT
************************
00000004 | A1 A0 2C 4B | time_when = 1261215905
00000008 | 0F | event_type = 15
00000009 | 64 00 00 00 | server_id = 100
0000000D | 7F 00 00 00 | event_len = 127
00000011 | 83 00 00 00 | log_pos = 00000083
00000015 | 01 00 | flags = LOG_EVENT_BINLOG_IN_USE_F
------------------------
00000017 | 04 00 | binlog_ver = 4
00000019 | 35 2E 32 2E | server_ver = 5.2.0-MariaDB-alpha-debug-log
..... ...
0000004B | A1 A0 2C 4B | time_created = 1261215905
0000004F | 13 | common_header_len = 19
------------------------
post_header_len
------------------------
00000050 | 38 | 56 - START_EVENT_V3 [1]
..... ...
00000069 | 02 | 2 - INCIDENT_EVENT [26]
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Each Annotate_rows_log_event object has two private members describing the
corresponding query:
char *m_query_txt;
uint m_query_len;
When the object is created for writing to a binary log, this query is taken
from 'thd' (for short, below we omit the 'Annotate_rows_log_event::' prefix
as well as other implementation details):
Annotate_rows_log_event(THD *thd)
{
m_query_txt = thd->query();
m_query_len = thd->query_length();
}
When the object is read from a binary log, the query is taken from the buffer
containing the binary log representation of the event (this buffer is allocated
in Log_event object from which all Log events are derived):
Annotate_rows_log_event(char *buf, uint event_len,
Format_description_log_event *desc)
{
m_query_len = event_len - desc->common_header_len;
m_query_txt = buf + desc->common_header_len;
}
The events are written to the binary log by the Log_event::write() member
which calls virtual write_data_header() and write_data_body() members
("data header" and "post header" are synonym in replication terminology).
In our case, data header is empty and data body is just the query:
bool write_data_body(IO_CACHE *file)
{
return my_b_safe_write(file, (uchar*) m_query_txt, m_query_len);
}
Printing the event is just printing the query:
void Annotate_rows_log_event::print(FILE *file, PRINT_EVENT_INFO *pinfo)
{
my_b_printf(&pinfo->head_cache, "\tQuery: `%s`\n", m_query_txt);
}
3. How Master writes Annotate_rows events to the binary log
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The event is written to the binary log just before the group of Table_map
events which precede corresponding Rows events (one query may generate
several Table map events in the binary log, but the corresponding
Annotate_rows event must be written only once before the first Table map
event; hence the boolean variable 'with_annotate' below):
int write_locked_table_maps(THD *thd)
{ ...
bool with_annotate= thd->variables.binlog_annotate_rows_events;
...
for (uint i= 0; i < ... <number of tables> ...; ++i)
{ ...
thd->binlog_write_table_map(table, ..., with_annotate);
with_annotate= 0; // write Annotate_event not more than once
...
}
...
}
int THD::binlog_write_table_map(TABLE *table, ..., bool with_annotate)
{ ...
Table_map_log_event the_event(...);
...
if (with_annotate)
{
Annotate_rows_log_event anno(this);
mysql_bin_log.write(&anno);
}
mysql_bin_log.write(&the_event);
...
}
4. How slave treats replicate-annotate-rows-events option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The replicate-annotate-rows-events option is treated just as the session
value of the binlog_annotate_rows_events variable for the slave IO and
SQL threads. This setting is done during initialization of these threads:
pthread_handler_t handle_slave_io(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_IO);
...
}
pthread_handler_t handle_slave_sql(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_SQL);
...
}
int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type)
{ ...
thd->variables.binlog_annotate_rows_events=
opt_replicate_annotate_rows_events;
...
}
5. How slave IO thread requests Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If the replicate-annotate-rows-events option is not set on a slave, there
is no need for master to send Annotate_rows events to this slave. The slave
(or mysqlbinlog in remote case), before requesting binlog dump via the
COM_BINLOG_DUMP command, informs the master whether it should send these
events by executing the newly added COM_BINLOG_DUMP_OPTIONS_EXT server
command:
case COM_BINLOG_DUMP_OPTIONS_EXT:
thd->binlog_dump_flags_ext= packet[0];
my_ok(thd);
break;
Note. We add this new command and don't use COM_BINLOG_DUMP to avoid possible
conflicts with MySQL/Sun.
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
case COM_BINLOG_DUMP:
{ ...
flags= uint2korr(packet + 4);
...
mysql_binlog_send(thd, ..., flags);
...
}
void mysql_binlog_send(THD* thd, ..., ushort flags)
{ ...
Log_event::read_log_event(&log, packet, ...);
...
if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT)
{
my_net_write(net, packet->ptr(), packet->length());
}
...
}
7. How slave SQL thread processes Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The slave processes each recieved event by "applying" it, i.e. by
calling the Log_event::apply_event() function which in turn calls
the virtual do_apply_event() member specific for each type of the
event.
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev = next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
int apply_event_and_update_pos(Log_event *ev, ...)
{ ...
ev->apply_event(...);
...
}
int Log_event::apply_event(...)
{
return do_apply_event(...);
}
What does it mean to "apply" an Annotate_rows event? It means to set current
thd query to that of the described by the event, i.e. to the query which
caused the subsequent Rows events (see "How Master writes Annotate_rows
events to the binary log" to follow what happens further when the subsequent
Rows events are applied):
int Annotate_rows_log_event::do_apply_event(...)
{
thd->set_query(m_query_txt, m_query_len);
}
NOTE. I am not sure, but possibly current values of thd->query and
thd->query_length should be saved before calling set_query() and to be
restored on the Annotate_rows_log_event object deletion.
Is it really needed ?
After calling this do_apply_event() function we may not delete the
Annotate_rows_log_event object immediatedly (see exec_relay_log_event()
above) because thd->query now points to the string inside this object.
We may keep the pointer to this object in the Relay_log_info:
class Relay_log_info
{
public:
...
void set_annotate_event(Annotate_rows_log_event*);
Annotate_rows_log_event* get_annotate_event();
void free_annotate_event();
...
private:
Annotate_rows_log_event* m_annotate_event;
};
The saved Annotate_rows object should be deleted when all corresponding
Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (rli->get_annotate_event() && is_last_rows_event(ev))
rli->free_annotate_event();
else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
where
bool is_last_rows_event(Log_event* ev)
{
Log_event_type type= ev->get_type_code();
if (IS_ROWS_EVENT_TYPE(type))
{
Rows_log_event* rows= (Rows_log_event*)ev;
return rows->get_flags(Rows_log_event::STMT_END_F);
}
return 0;
}
#define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
(type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
Kristian noticed that introducing new log event type should be coordinated
somehow with MySQL/Sun:
Kristian: The numeric code for this event must be assigned carefully.
It should be coordinated with MySQL/Sun, otherwise we can get into a
situation where MySQL uses the same numeric code for one event that
MariaDB uses for ANNOTATE_ROWS_EVENT, which would make merging the two
impossible.
Alex: I reserved about 20 numbers not to have possible conflicts
with MySQL.
Kristian: Still, I think it would be appropriate to send a polite email
to internals(a)lists.mysql.com about this and suggesting to reserve the
event number.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0
[Maria-developers] Updated (by Knielsen): Store in binlog text of statements that caused RBR events (47)
by worklog-noreply@askmonty.org 06 Apr '10
by worklog-noreply@askmonty.org 06 Apr '10
06 Apr '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Store in binlog text of statements that caused RBR events
CREATION DATE..: Sat, 15 Aug 2009, 23:48
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Knielsen, Serg
CATEGORY.......: Server-Sprint
TASK ID........: 47 (http://askmonty.org/worklog/?tid=47)
VERSION........: Server-9.x
STATUS.........: In-Progress
PRIORITY.......: 60
WORKED HOURS...: 40
ESTIMATE.......: 15 (hours remain)
ORIG. ESTIMATE.: 35
PROGRESS NOTES:
-=-=(Knielsen - Tue, 06 Apr 2010, 15:25)=-=-
Status updated.
--- /tmp/wklog.47.old.12734 2010-04-06 15:25:54.000000000 +0000
+++ /tmp/wklog.47.new.12734 2010-04-06 15:25:54.000000000 +0000
@@ -1 +1 @@
-Code-Review
+In-Progress
-=-=(Knielsen - Mon, 29 Mar 2010, 10:59)=-=-
Status updated.
--- /tmp/wklog.47.old.27790 2010-03-29 10:59:53.000000000 +0000
+++ /tmp/wklog.47.new.27790 2010-03-29 10:59:53.000000000 +0000
@@ -1 +1 @@
-In-Progress
+Code-Review
-=-=(Alexi - Thu, 18 Feb 2010, 19:29)=-=-
Worked 20 hours (alexi)
Worked 20 hours and estimate 15 hours remain (original estimate unchanged).
-=-=(Serg - Fri, 05 Feb 2010, 14:04)=-=-
Observers changed: Knielsen,Serg
-=-=(Guest - Fri, 05 Feb 2010, 13:40)=-=-
Category updated.
--- /tmp/wklog.47.old.9197 2010-02-05 13:40:36.000000000 +0200
+++ /tmp/wklog.47.new.9197 2010-02-05 13:40:36.000000000 +0200
@@ -1 +1 @@
-Server-RawIdeaBin
+Server-Sprint
-=-=(Guest - Fri, 05 Feb 2010, 13:40)=-=-
Status updated.
--- /tmp/wklog.47.old.9197 2010-02-05 13:40:36.000000000 +0200
+++ /tmp/wklog.47.new.9197 2010-02-05 13:40:36.000000000 +0200
@@ -1 +1 @@
-Un-Assigned
+In-Progress
-=-=(Alexi - Thu, 04 Feb 2010, 09:54)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.16174 2010-02-04 09:54:13.000000000 +0200
+++ /tmp/wklog.47.new.16174 2010-02-04 09:54:13.000000000 +0200
@@ -171,35 +171,20 @@
5. How slave IO thread requests Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-When requesting an event, the slave should inform the master whether
-it should send Annotate_rows events or not. To that end we add a new
-BINLOG_SEND_ANNOTATE_ROWS_EVENT flag used when requesting an event:
+If the replicate-annotate-rows-events option is not set on a slave, there
+is no need for master to send Annotate_rows events to this slave. The slave
+(or mysqlbinlog in remote case), before requesting binlog dump via the
+COM_BINLOG_DUMP command, informs the master whether it should send these
+events by executing the newly added COM_BINLOG_DUMP_OPTIONS_EXT server
+command:
+
+ case COM_BINLOG_DUMP_OPTIONS_EXT:
+ thd->binlog_dump_flags_ext= packet[0];
+ my_ok(thd);
+ break;
- #define BINLOG_DUMP_NON_BLOCK 1
- #define BINLOG_SEND_ANNOTATE_ROWS_EVENT 2
-
- pthread_handler_t handle_slave_io(void *arg)
- { ...
- request_dump(mysql, ...);
- ...
- }
-
- int request_dump(MYSQL* mysql, ...)
- { ...
- if (opt_log_slave_updates &&
- mi->io_thd->variables.binlog_annotate_rows_events)
- binlog_flags|= BINLOG_SEND_ANNOTATE_ROWS_EVENT;
- ...
- int2store(buf + 4, binlog_flags);
- ...
- simple_command(mysql, COM_BINLOG_DUMP, buf, ...);
- ...
- }
-
-NOTE. mysqlbinlog, when remotely requesting BINLOG_DUMP by calling the
-simple_command() function, should also use this flag if it wants (in case
-of the --print-annotate-rows-events option set) to recieve Annotate_rows
-events.
+Note. We add this new command and don't use COM_BINLOG_DUMP to avoid possible
+conflicts with MySQL/Sun.
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -338,10 +323,4 @@
to internals(a)lists.mysql.com about this and suggesting to reserve the
event number.
-Also we should notice the introduction of the BINLOG_SEND_ANNOTATE_ROWS_EVENT
-flag taking into account that MySQL/Sun may also introduce a flag with the
-same value to be used in the request_dump-mysql_binlog_send interface.
-But this is mainly the question of merging: if a conflict concerning this
-flag occur, we may simply change the BINLOG_SEND_ANNOTATE_ROWS_EVENT value
-(this does not require additional changes in the code).
-=-=(Alexi - Sun, 20 Dec 2009, 16:00)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.19667 2009-12-20 14:00:56.000000000 +0000
+++ /tmp/wklog.47.new.19667 2009-12-20 14:00:56.000000000 +0000
@@ -196,6 +196,11 @@
...
}
+NOTE. mysqlbinlog, when remotely requesting BINLOG_DUMP by calling the
+simple_command() function, should also use this flag if it wants (in case
+of the --print-annotate-rows-events option set) to recieve Annotate_rows
+events.
+
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -212,8 +217,7 @@
Log_event::read_log_event(&log, packet, ...);
...
if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
- flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT ||
- thd->server_id == 0 /* slave == mysqlbinlog */ )
+ flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT)
{
my_net_write(net, packet->ptr(), packet->length());
}
-=-=(Alexi - Sun, 20 Dec 2009, 13:14)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.11350 2009-12-20 13:14:04.000000000 +0200
+++ /tmp/wklog.47.new.11350 2009-12-20 13:14:04.000000000 +0200
@@ -282,23 +282,18 @@
Annotate_rows_log_event* m_annotate_event;
};
-When the saved Annotate_rows object may be deleted? When all corresponding
-Rows events will be processed, i.e. before processing the first non-Rows
-event (note that Annotate_rows object resides in the binary log *after*
-the (possible) 'BEGIN' Query event which accompanies the rows events; note
-also that this deletion is adjusted with the case when some or all
-corresponding Rows events are filtered out by replicate filter rules):
+The saved Annotate_rows object should be deleted when all corresponding
+Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
- if (rli->get_annotate_event() && !IS_RBR_EVENT_TYPE(ev->get_type_code()))
- rli->free_annotate_event();
-
apply_event_and_update_pos(ev, ...);
- if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
+ if (rli->get_annotate_event() && is_last_rows_event(ev))
+ rli->free_annotate_event();
+ else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
@@ -307,10 +302,21 @@
where
- #define IS_RBR_EVENT_TYPE(type) ( (type) == TABLE_MAP_EVENT || \
- (type) == WRITE_ROWS_EVENT || \
+ bool is_last_rows_event(Log_event* ev)
+ {
+ Log_event_type type= ev->get_type_code();
+ if (IS_ROWS_EVENT_TYPE(type))
+ {
+ Rows_log_event* rows= (Rows_log_event*)ev;
+ return rows->get_flags(Rows_log_event::STMT_END_F);
+ }
+
+ return 0;
+ }
+
+ #define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
- (type) == DELETE_ROWS_EVENT )
+ (type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
-=-=(Alexi - Sun, 20 Dec 2009, 09:29)=-=-
Low Level Design modified.
--- /tmp/wklog.47.old.32726 2009-12-20 07:29:56.000000000 +0000
+++ /tmp/wklog.47.new.32726 2009-12-20 07:29:56.000000000 +0000
@@ -56,7 +56,7 @@
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
- 00000082 | 00 | 0 - ANNOTATE_RBR_EVENT [51]
+ 00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
------------------------------------------------------------
-=-=(View All Progress Notes, 26 total)=-=-
http://askmonty.org/worklog/index.pl?tid=47&nolimit=1
DESCRIPTION:
Store in binlog (and show in mysqlbinlog output) texts of statements that
caused RBR events
This is needed for (list from Monty):
- Easier to understand why updates happened
- Would make it easier to find out where in application things went
wrong (as you can search for exact strings)
- Allow one to filter things based on comments in the statement.
The cost of this can be that the binlog will be approximately 2x in size
(especially insert of big blob's would be a bit painful), so this should
be an optional feature.
HIGH-LEVEL SPECIFICATION:
Content
~~~~~~~
1. Annotate_rows_log_event
2. Server option: --binlog-annotate-rows-events
3. Server option: --replicate-annotate-rows-events
4. mysqlbinlog option: --print-annotate-rows-events
5. mysqlbinlog output
1. Annotate_rows_log_event [ ANNOTATE_ROWS_EVENT ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Describes the query which caused the corresponding rows events. Has empty
post-header and contains the query text in its data part. Example:
************************
ANNOTATE_ROWS_EVENT
************************
00000220 | B6 A0 2C 4B | time_when = 1261215926
00000224 | 33 | event_type = 51
00000225 | 64 00 00 00 | server_id = 100
00000229 | 36 00 00 00 | event_len = 54
0000022D | 56 02 00 00 | log_pos = 00000256
00000231 | 00 00 | flags = <none>
------------------------
00000233 | 49 4E 53 45 | query = "INSERT INTO t1 VALUES (1), (2), (3)"
00000237 | 52 54 20 49 |
0000023B | 4E 54 4F 20 |
0000023F | 74 31 20 56 |
00000243 | 41 4C 55 45 |
00000247 | 53 20 28 31 |
0000024B | 29 2C 20 28 |
0000024F | 32 29 2C 20 |
00000253 | 28 33 29 |
************************
In binary log, Annotate_rows event follows the (possible) 'BEGIN' Query event
and precedes the first of Table map events which accompany the corresponding
rows events. (See example in the "mysqlbinlog output" section below.)
2. Server option: --binlog-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the master to write Annotate_rows events to the binary log.
* Variable Name: binlog_annotate_rows_events
* Scope: Global & Session
* Access Type: Dynamic
* Data Type: bool
* Default Value: OFF
NOTE. Session values allows to annotate only some selected statements:
...
SET SESSION binlog_annotate_rows_events=ON;
... statements to be annotated ...
SET SESSION binlog_annotate_rows_events=OFF;
... statements not to be annotated ...
3. Server option: --replicate-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tells the slave to reproduce Annotate_rows events recieved from the master
in its own binary log (sensible only in pair with log-slave-updates option).
* Variable Name: replicate_annotate_rows_events
* Scope: Global
* Access Type: Read only
* Data Type: bool
* Default Value: OFF
NOTE. Why do we additionally need this 'replicate' option? Why not to make
the slave to reproduce this events when its binlog-annotate-rows-events
global value is ON? Well, because, for example, we may want to configure
the slave which should reproduce Annotate_rows events but has global
binlog-annotate-rows-events = OFF meaning this to be the default value for
the client threads (see also "How slave treats replicate-annotate-rows-events
option" in LLD part).
4. mysqlbinlog option: --print-annotate-rows-events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
With this option, mysqlbinlog prints the content of Annotate_rows events (if
the binary log does contain them). Without this option (i.e. by default),
mysqlbinlog skips Annotate_rows events.
5. mysqlbinlog output
~~~~~~~~~~~~~~~~~~~~~
With --print-annotate-rows-events, mysqlbinlog outputs Annotate_rows events
in a form like this:
...
# at 1646
#091219 12:45:26 server id 100 end_log_pos 1714 Query thread_id=1
exec_time=0 error_code=0
SET TIMESTAMP=1261215926/*!*/;
BEGIN
/*!*/;
# at 1714
# at 1812
# at 1853
# at 1894
# at 1938
#091219 12:45:26 server id 100 end_log_pos 1812 Query: `DELETE t1, t2 FROM
t1 INNER JOIN t2 INNER JOIN t3 WHERE t1.a=t2.a AND t2.a=t3.a`
#091219 12:45:26 server id 100 end_log_pos 1853 Table_map: `test`.`t1`
mapped to number 16
#091219 12:45:26 server id 100 end_log_pos 1894 Table_map: `test`.`t2`
mapped to number 17
#091219 12:45:26 server id 100 end_log_pos 1938 Delete_rows: table id 16
#091219 12:45:26 server id 100 end_log_pos 1982 Delete_rows: table id 17
flags: STMT_END_F
...
LOW-LEVEL DESIGN:
Content
~~~~~~~
1. Annotate_rows event number
2. Outline of Annotate_rows event behavior
3. How Master writes Annotate_rows events to the binary log
4. How slave treats replicate-annotate-rows-events option
5. How slave IO thread requests Annotate_rows events
6. How master executes the request
7. How slave SQL thread processes Annotate_rows events
8. General remarks
1. Annotate_rows event number
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To avoid possible event numbers conflict with MySQL/Sun, we leave a gap
between the last MySQL event number and the Annotate_rows event number:
enum Log_event_type
{ ...
INCIDENT_EVENT= 26,
// New MySQL event numbers are to be added here
MYSQL_EVENTS_END,
MARIA_EVENTS_BEGIN= 51,
// New Maria event numbers start from here
ANNOTATE_ROWS_EVENT= 51,
ENUM_END_EVENT
};
together with the corresponding extension of 'post_header_len' array in the
Format description event. (This extension does not affect the compatibility
of the binary log). Here is how Format description event looks like with
this extension:
************************
FORMAT_DESCRIPTION_EVENT
************************
00000004 | A1 A0 2C 4B | time_when = 1261215905
00000008 | 0F | event_type = 15
00000009 | 64 00 00 00 | server_id = 100
0000000D | 7F 00 00 00 | event_len = 127
00000011 | 83 00 00 00 | log_pos = 00000083
00000015 | 01 00 | flags = LOG_EVENT_BINLOG_IN_USE_F
------------------------
00000017 | 04 00 | binlog_ver = 4
00000019 | 35 2E 32 2E | server_ver = 5.2.0-MariaDB-alpha-debug-log
..... ...
0000004B | A1 A0 2C 4B | time_created = 1261215905
0000004F | 13 | common_header_len = 19
------------------------
post_header_len
------------------------
00000050 | 38 | 56 - START_EVENT_V3 [1]
..... ...
00000069 | 02 | 2 - INCIDENT_EVENT [26]
0000006A | 00 | 0 - RESERVED [27]
..... ...
00000081 | 00 | 0 - RESERVED [50]
00000082 | 00 | 0 - ANNOTATE_ROWS_EVENT [51]
************************
2. Outline of Annotate_rows event behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Each Annotate_rows_log_event object has two private members describing the
corresponding query:
char *m_query_txt;
uint m_query_len;
When the object is created for writing to a binary log, this query is taken
from 'thd' (for short, below we omit the 'Annotate_rows_log_event::' prefix
as well as other implementation details):
Annotate_rows_log_event(THD *thd)
{
m_query_txt = thd->query();
m_query_len = thd->query_length();
}
When the object is read from a binary log, the query is taken from the buffer
containing the binary log representation of the event (this buffer is allocated
in Log_event object from which all Log events are derived):
Annotate_rows_log_event(char *buf, uint event_len,
Format_description_log_event *desc)
{
m_query_len = event_len - desc->common_header_len;
m_query_txt = buf + desc->common_header_len;
}
The events are written to the binary log by the Log_event::write() member
which calls virtual write_data_header() and write_data_body() members
("data header" and "post header" are synonym in replication terminology).
In our case, data header is empty and data body is just the query:
bool write_data_body(IO_CACHE *file)
{
return my_b_safe_write(file, (uchar*) m_query_txt, m_query_len);
}
Printing the event is just printing the query:
void Annotate_rows_log_event::print(FILE *file, PRINT_EVENT_INFO *pinfo)
{
my_b_printf(&pinfo->head_cache, "\tQuery: `%s`\n", m_query_txt);
}
3. How Master writes Annotate_rows events to the binary log
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The event is written to the binary log just before the group of Table_map
events which precede corresponding Rows events (one query may generate
several Table map events in the binary log, but the corresponding
Annotate_rows event must be written only once before the first Table map
event; hence the boolean variable 'with_annotate' below):
int write_locked_table_maps(THD *thd)
{ ...
bool with_annotate= thd->variables.binlog_annotate_rows_events;
...
for (uint i= 0; i < ... <number of tables> ...; ++i)
{ ...
thd->binlog_write_table_map(table, ..., with_annotate);
with_annotate= 0; // write Annotate_event not more than once
...
}
...
}
int THD::binlog_write_table_map(TABLE *table, ..., bool with_annotate)
{ ...
Table_map_log_event the_event(...);
...
if (with_annotate)
{
Annotate_rows_log_event anno(this);
mysql_bin_log.write(&anno);
}
mysql_bin_log.write(&the_event);
...
}
4. How slave treats replicate-annotate-rows-events option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The replicate-annotate-rows-events option is treated just as the session
value of the binlog_annotate_rows_events variable for the slave IO and
SQL threads. This setting is done during initialization of these threads:
pthread_handler_t handle_slave_io(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_IO);
...
}
pthread_handler_t handle_slave_sql(void *arg)
{
THD *thd= new THD;
...
init_slave_thread(thd, SLAVE_THD_SQL);
...
}
int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type)
{ ...
thd->variables.binlog_annotate_rows_events=
opt_replicate_annotate_rows_events;
...
}
5. How slave IO thread requests Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If the replicate-annotate-rows-events option is not set on a slave, there
is no need for master to send Annotate_rows events to this slave. The slave
(or mysqlbinlog in remote case), before requesting binlog dump via the
COM_BINLOG_DUMP command, informs the master whether it should send these
events by executing the newly added COM_BINLOG_DUMP_OPTIONS_EXT server
command:
case COM_BINLOG_DUMP_OPTIONS_EXT:
thd->binlog_dump_flags_ext= packet[0];
my_ok(thd);
break;
Note. We add this new command and don't use COM_BINLOG_DUMP to avoid possible
conflicts with MySQL/Sun.
6. How master executes the request
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
case COM_BINLOG_DUMP:
{ ...
flags= uint2korr(packet + 4);
...
mysql_binlog_send(thd, ..., flags);
...
}
void mysql_binlog_send(THD* thd, ..., ushort flags)
{ ...
Log_event::read_log_event(&log, packet, ...);
...
if ((*packet)[EVENT_TYPE_OFFSET + 1] != ANNOTATE_ROWS_EVENT ||
flags & BINLOG_SEND_ANNOTATE_ROWS_EVENT)
{
my_net_write(net, packet->ptr(), packet->length());
}
...
}
7. How slave SQL thread processes Annotate_rows events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The slave processes each recieved event by "applying" it, i.e. by
calling the Log_event::apply_event() function which in turn calls
the virtual do_apply_event() member specific for each type of the
event.
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev = next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
int apply_event_and_update_pos(Log_event *ev, ...)
{ ...
ev->apply_event(...);
...
}
int Log_event::apply_event(...)
{
return do_apply_event(...);
}
What does it mean to "apply" an Annotate_rows event? It means to set current
thd query to that of the described by the event, i.e. to the query which
caused the subsequent Rows events (see "How Master writes Annotate_rows
events to the binary log" to follow what happens further when the subsequent
Rows events are applied):
int Annotate_rows_log_event::do_apply_event(...)
{
thd->set_query(m_query_txt, m_query_len);
}
NOTE. I am not sure, but possibly current values of thd->query and
thd->query_length should be saved before calling set_query() and to be
restored on the Annotate_rows_log_event object deletion.
Is it really needed ?
After calling this do_apply_event() function we may not delete the
Annotate_rows_log_event object immediatedly (see exec_relay_log_event()
above) because thd->query now points to the string inside this object.
We may keep the pointer to this object in the Relay_log_info:
class Relay_log_info
{
public:
...
void set_annotate_event(Annotate_rows_log_event*);
Annotate_rows_log_event* get_annotate_event();
void free_annotate_event();
...
private:
Annotate_rows_log_event* m_annotate_event;
};
The saved Annotate_rows object should be deleted when all corresponding
Rows events will be processed:
int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{ ...
Log_event *ev= next_event(rli);
...
apply_event_and_update_pos(ev, ...);
if (rli->get_annotate_event() && is_last_rows_event(ev))
rli->free_annotate_event();
else if (ev->get_type_code() == ANNOTATE_ROWS_EVENT)
rli->set_annotate_event((Annotate_rows_log_event*) ev);
else if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
delete ev;
...
}
where
bool is_last_rows_event(Log_event* ev)
{
Log_event_type type= ev->get_type_code();
if (IS_ROWS_EVENT_TYPE(type))
{
Rows_log_event* rows= (Rows_log_event*)ev;
return rows->get_flags(Rows_log_event::STMT_END_F);
}
return 0;
}
#define IS_ROWS_EVENT_TYPE(type) ((type) == WRITE_ROWS_EVENT || \
(type) == UPDATE_ROWS_EVENT || \
(type) == DELETE_ROWS_EVENT)
8. General remarks
~~~~~~~~~~~~~~~~~~
Kristian noticed that introducing new log event type should be coordinated
somehow with MySQL/Sun:
Kristian: The numeric code for this event must be assigned carefully.
It should be coordinated with MySQL/Sun, otherwise we can get into a
situation where MySQL uses the same numeric code for one event that
MariaDB uses for ANNOTATE_ROWS_EVENT, which would make merging the two
impossible.
Alex: I reserved about 20 numbers not to have possible conflicts
with MySQL.
Kristian: Still, I think it would be appropriate to send a polite email
to internals(a)lists.mysql.com about this and suggesting to reserve the
event number.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0
Alexi1952 <Alexi1952(a)yandex.ru> writes:
> Hi Sergey,
>
> Send you for review:
>
> 1. WL#47 branch published on:
> https://code.launchpad.net/~alexi1952/maria/maria-5.2-wl47
>
> 2. Diff file attached.
>
> 3. Commit message attached.
Hi Alexi,
Thanks for sending the patch. I did the code review as Sergey is really busy
with preparations for the User's conference.
General comments, most of them with details inline in the patch below:
I started by reading the text in MWL#47. It was _very_ nice to have detailed
description of the approach here, thanks!
The patch also looks very well worked-through.
We should run this through our Buildbot in a MWL#47 feature branch before
pushing to main, as there may be more result file updates needed in special
parts of the test suite (eg --big). I'll help you set it up when you're ready.
It seems your Windows editor (?) has added Windows line ending (CTRL-M)
characters to some of the lines in the patch. You need to fix this and remove
those extra characters.
The new COM_BINLOG_DUMP_OPTIONS_EXT will cause problems the next time MySQL
adds a new command (which will have same numeric value but different
semantics). This needs to be coordinated, which I think means we might as well
use (and coordinate) a new flag for the normal COM_BINLOG_DUMP command.
If the server does not support COM_BINLOG_DUMP_OPTIONS_EXT, this should not be
a fatal error (as it will prevent working against older MariaDB, or
MySQL). This is for both mysqlbinlog and slave thread.
I think a newline in a query will cause mysqlbinlog to output a file that
cannot be correctly read by the mysql command-line client.
I suggest adding a couple more test cases, with multi-session INSERT DELAYED,
different character sets in query, and newline in query (see detailed comments
below).
I am wondering about the --print-annotate-rows-events option for
mysqlbinlog. Maybe we should have it on by default? My reasoning is that there
is little concern about backwards compatibility, as output will only change if
the corresponding option is enabled explicitly in the server (and the extra
output is just a comment anyway). And the user may want to see the annotations
by default. What do you think?
I found it a bit unnatural to have the annotate event be logged inside the
binlog_write_table_map() method, using the extra with_annotate
parameter. Check the suggestions for how to do this differently (below), and
see if it makes sense.
Things that need fixing:
- Windows line ending issue.
- The COM_BINLOG_DUMP_OPTIONS_EXT potential conflict issue
- Extra test cases
- Newline-in-query problem
- Not fail if server does not support annotate events.
- Minor coding style, comment spelling, etc. issues.
- Full Buildbot run before pushing (I'll set it up later).
Also try to see if the other suggestions can be implemented, or if not lets
discuss.
Detailed comments inline with the patch below.
- Kristian.
> === modified file 'client/mysqlbinlog.cc'
> --- client/mysqlbinlog.cc 2010-03-15 11:51:23 +0000
> +++ client/mysqlbinlog.cc 2010-03-24 06:25:13 +0000
> @@ -128,6 +129,69 @@
> OK_STOP
> };
>
> +/**
> + Pointer to the last read Annotate_rows_log_event. Having read an
> + Annotate_rows event, we should not print it immediatedly because all
> + subsequent rbr events can be filtered away, and have to keep it for a while.
> + Also because of that when reading a remote Annotate event we have to keep
> + its binary log representation in a separately allocated buffer.
> +*/
> +static Annotate_rows_log_event *annotate_event= NULL;
> +
> +inline void free_annotate_event()
Do not make this 'inline'.
> +inline void print_annotate_event(PRINT_EVENT_INFO *print_event_info)
> +{
> + if (annotate_event)
> + {
> + annotate_event->print(result_file, print_event_info);
> + delete annotate_event; // the event should be printed not more than once
s/should be printed not more than once/should not be printed more than once/
> @@ -1673,6 +1769,16 @@
> if ((retval= check_master_version()) != OK_CONTINUE)
> DBUG_RETURN(retval);
>
> + buf[0]= 0;
> + if (opt_print_annotate_rows_events)
> + buf[0]|= BINLOG_SEND_ANNOTATE_ROWS_EVENT;
> +
> + if (simple_command(mysql, COM_BINLOG_DUMP_OPTIONS_EXT, buf, 1, 0))
> + {
> + error("Got fatal error sending the log dump options command.");
> + DBUG_RETURN(ERROR_STOP);
> + }
What if the server we are trying to dump from does not support the
COM_BINLOG_DUMP_OPTIONS_EXT command? This should not be a fatal error.
At most, it should be an error only if the user actually requested printing of
annotate_rows_events. But I think it should not be an error at all - it seems
more useful to have an option that just prints any annotate events that are
available. Maybe print a warning if the user explicitly requests this option,
no more than that I think.
And I am wondering if the opt_print_annotate_rows_events should not be on by
default in mysqlbinlog. After all, the events will only be there if they were
explicitly enabled in the database, so most likely they will be wanted, or if
not most likely they will not do much harm as they are comments only.
> === modified file 'include/mysql_com.h'
> --- include/mysql_com.h 2010-03-15 11:51:23 +0000
> +++ include/mysql_com.h 2010-03-24 06:25:13 +0000
> @@ -62,6 +62,7 @@
> COM_TABLE_DUMP, COM_CONNECT_OUT, COM_REGISTER_SLAVE,
> COM_STMT_PREPARE, COM_STMT_EXECUTE, COM_STMT_SEND_LONG_DATA, COM_STMT_CLOSE,
> COM_STMT_RESET, COM_SET_OPTION, COM_STMT_FETCH, COM_DAEMON,
> + COM_BINLOG_DUMP_OPTIONS_EXT,
In the MWL#47 text, you explain that
"Note. We add this new command and don't use COM_BINLOG_DUMP to avoid
possible conflicts with MySQL/Sun."
But this does not avoid such conflicts, on the contrary, we will for sure get
serious problems with this the next time upstream MySQL adds _any_ command
here.
One way or the other, when extending the communication between servers and
clients, we need to coordinate with MySQL/Sun/Oracle. And given this, it is
probably better to just use COM_BINLOG_DUMP, do you agree?
(Lars Thalman actually asked if we would be willing to contribute this work
upstream, which suggests that there should be no problem getting such
coordination working).
> === modified file 'mysql-test/extra/rpl_tests/rpl_deadlock.test'
> --- mysql-test/extra/rpl_tests/rpl_deadlock.test 2009-04-03 21:33:13 +0000
> +++ mysql-test/extra/rpl_tests/rpl_deadlock.test 2010-03-24 06:25:13 +0000
> @@ -72,7 +72,7 @@
> --source include/stop_slave.inc
> DELETE FROM t2;
> # Set slave position to the BEGIN log event
> ---replace_result $master_pos_begin MASTER_POS_BEGIN
> +--replace_result $master_pos_begin <master_pos_begin>
Why this change?
> === added file 'mysql-test/extra/rpl_tests/rpl_row_annotate.test'
> --- mysql-test/extra/rpl_tests/rpl_row_annotate.test 1970-01-01 00:00:00 +0000
> +++ mysql-test/extra/rpl_tests/rpl_row_annotate.test 2010-03-24 06:25:13 +0000
> +INSERT DELAYED INTO t2 VALUES (1,1), (2,2), (3,3);
Add to this test to do INSERT DELAYED from two extra connections, one of which
has binlog_annotate_rows_events set and one which does not.
> +INSERT INTO t3 VALUES (1,1), (2,2), (3,3);
> +DELETE t1, t2 FROM t1 INNER JOIN t2 INNER JOIN t3 WHERE t1.a=t2.a AND t2.a=t3.a;
> +
> +INSERT INTO xt1 VALUES (1,1), (2,2), (3,3);
> +INSERT INTO t2 VALUES (1,1), (2,2), (3,3);
> +DELETE xt1, t2 FROM xt1 INNER JOIN t2 INNER JOIN t3 WHERE xt1.a=t2.a AND t2.a=t3.a;
> +
> +INSERT INTO xt1 VALUES (1,1), (2,2), (3,3);
> +INSERT INTO xt2 VALUES (1,1), (2,2), (3,3);
> +DELETE xt1, xt2 FROM xt1 INNER JOIN xt2 INNER JOIN t3 WHERE xt1.a=xt2.a AND xt2.a=t3.a;
Also add tests with a query containing linefeed and other special characters
(eg. insert of blob value and varchar with special character set), to test the
annotation of such queries.
> === added file 'mysql-test/include/binlog_start_pos.inc'
> --- mysql-test/include/binlog_start_pos.inc 1970-01-01 00:00:00 +0000
> +++ mysql-test/include/binlog_start_pos.inc 2010-03-24 06:25:13 +0000
> @@ -0,0 +1,5 @@
> +let $binlog_start_pos=240;
> +--disable_query_log
> +SET @binlog_start_pos=240;
> +--enable_query_log
> +
It would be nice with a short comment explaining what this include file does
and the magic value 240.
(I think what you've tried to do is to clean up this stuff a bit so that the
next change of format_description_event will be easier to integrate in the
test suite, which is good! So just requesting a short explanation here).
> === added file 'mysql-test/suite/binlog/r/binlog_row_annotate.result'
> --- mysql-test/suite/binlog/r/binlog_row_annotate.result 1970-01-01 00:00:00 +0000
> +++ mysql-test/suite/binlog/r/binlog_row_annotate.result 2010-03-24 06:25:13 +0000
> +/*!*/;
> +# at #
> +#010909 4:46:40 server id # end_log_pos # Query thread_id=# exec_time=# error_code=0
> +SET TIMESTAMP=1000000000/*!*/;
> +BEGIN
> +/*!*/;
> +# at #
> +# at #
> +# at #
> +# at #
> +# at #
> +#010909 4:46:40 server id # end_log_pos # Query: `DELETE test1.t1, test2.t2
> +FROM test1.t1 INNER JOIN test2.t2 INNER JOIN test3.t3
> +WHERE test1.t1.a=test2.t2.a AND test2.t2.a=test3.t3.a`
Right, here we see the problem ... this will cause an error if one tries to
apply this mysqlbinlog output to a server.
> === added file 'mysql-test/suite/binlog/t/binlog_row_annotate.test'
> --- mysql-test/suite/binlog/t/binlog_row_annotate.test 1970-01-01 00:00:00 +0000
> +++ mysql-test/suite/binlog/t/binlog_row_annotate.test 2010-03-24 06:25:14 +0000
> +--echo #
> +--echo #####################################################################################
> +--echo # mysqlbinlog --read-from-remote-server --print-annotate-rows-events
> +--echo # Only two Annotates should appear below:
> +--echo # The following Annotates should appear in this output:
These two lines of comment seem to contradict each other? Please fix/clarify.
> === modified file 'sql/handler.cc'
> --- sql/handler.cc 2010-03-15 11:51:23 +0000
> +++ sql/handler.cc 2010-03-24 06:25:16 +0000
> @@ -4594,7 +4594,8 @@
>
> /** @brief
> Write table maps for all (manually or automatically) locked tables
> - to the binary log.
> + to the binary log. Also, if binlog_annotate_rows_events is ON,
You have ^M (chr 13) Windows line endings in this line, and in others. Please
check the patch and remove all Windows line endings introduced.
> @@ -4648,7 +4652,8 @@
> check_table_binlog_row_based(thd, table))
> {
> int const has_trans= table->file->has_transactions();
> - int const error= thd->binlog_write_table_map(table, has_trans);
> + int const error= thd->binlog_write_table_map(table, has_trans,
> + &with_annotate);
I think we can avoid the extra complexity with passing the with_annotate
parameter.
Couldn't we write the annotate_rows_event _after_ the table maps? Then there
could just be a call to mysql_bin_log.write(Annotate_rows_log_event(thd))
after the loop that writes the table maps. (In fact if we moved the
"if (thd->get_binlog_table_maps() == 0)" from write_locked_table_maps() into
the caller binlog_log_row(), the writing of Annotate_rows_log_event could also
be moved there, which seems to me to be the natural place to put it).
Or is there some reason that the annotate event must come before the table
maps? If so, we can still avoid the with_annotate flag, but we need this
before the loop writing the table maps:
if (is_trans)
binlog_start_trans_and_stmt();
mysql_bin_log.write(Annotate_rows_log_event(thd))
> === modified file 'sql/log_event.cc'
> --- sql/log_event.cc 2010-03-15 11:51:23 +0000
> +++ sql/log_event.cc 2010-03-24 06:25:16 +0000
> +Annotate_rows_log_event::Annotate_rows_log_event(const char *buf,
> + uint event_len,
> + const Format_description_log_event *desc)
> + : Log_event(buf, desc),
> + m_save_thd_query_txt(0),
> + m_save_thd_query_len(0)
> +{
> + m_query_len= event_len - desc->common_header_len;
> + m_query_txt= (char*) buf + desc->common_header_len;
Instead of this cast, couldn't we make the m_query_txt member a const char * ?
(from a quick check of the patch, it seems we can).
> +}
> +
> +Annotate_rows_log_event::~Annotate_rows_log_event()
> +{
> +#ifndef MYSQL_CLIENT
> + if (m_save_thd_query_txt)
> + thd->set_query(m_save_thd_query_txt, m_save_thd_query_len);
> +#endif
> +}
I don't like this, it's much too magic that thd->query can change in the
destructor of a completely unrelated object.
I would suggest instead putting in explicit code to restore thd->query in the
places where it needs to be done (you can have a
Annotate_rows_log_event::restore_thd_query() method for that). This makes it
explicit in the code where the restoration happens.
But actually, I am wondering if it would be better not to change thd->query at
all? And instead just pick the query from a saved annotate_rows_log_event when
needed? But I haven't checked the code sufficient to see if this can be done
in a clean way, so think about it, and see if it makes sense.
> +#ifdef MYSQL_CLIENT
> +void Annotate_rows_log_event::print(FILE *file, PRINT_EVENT_INFO *pinfo)
> +{
> + if (!pinfo->short_form)
> + {
> + print_header(&pinfo->head_cache, pinfo, TRUE);
> + my_b_printf(&pinfo->head_cache, "\tQuery: `%s`\n", m_query_txt);
> + }
> +}
> +#endif
I think this will cause problems if the query contains linefeed characters,
right? The mysqlbinlog output will span multiple lines which will not be
marked as comment lines, so trying to apply the output (by piping through
`mysql`) will fail with a syntax error (or worse).
It seems to me we need some kind of quoting here.
Also, I see that mysqlbinlog is careful to make the query string zero
terminated (as required by my_b_printf("%s"), but that's a bit subtle. So
maybe that could also be avoided if some kind of quoting is enabled, or
alternatively add a comment about this zero-termination requirement for
Annotate_rows_log_event::print().
> === modified file 'sql/mysql_priv.h'
> --- sql/mysql_priv.h 2010-03-15 11:51:23 +0000
> +++ sql/mysql_priv.h 2010-03-24 06:25:17 +0000
> @@ -624,7 +624,12 @@
> /* BINLOG_DUMP options */
>
> #define BINLOG_DUMP_NON_BLOCK 1
> -
> +#endif //MYSQL_CLIENT
Make this #endif /* !MYSQL_CLIENT */
(as it is a negative test that this #endif ends)
> +
> +/* Extended BINLOG_DUMP options (passed to COM_BINLOG_DUMP_OPTIONS_EXT) */
> +#define BINLOG_SEND_ANNOTATE_ROWS_EVENT 1
> +
> +#ifndef MYSQL_CLIENT
> /* sql_show.cc:show_log_files() */
> #define SHOW_LOG_STATUS_FREE "FREE"
> #define SHOW_LOG_STATUS_INUSE "IN USE"
> === modified file 'sql/slave.cc'
> --- sql/slave.cc 2010-03-04 08:03:07 +0000
> +++ sql/slave.cc 2010-03-24 06:25:18 +0000
> @@ -1829,6 +1829,36 @@
> }
>
>
> +static int set_master_binlog_dump_options_ext(MYSQL *mysql,
> + bool *suppress_warnings)
> +{
> + uchar buf[1];
> + uchar flags= 0;
> + DBUG_ENTER("set_master_binlog_dump_options_ext");
> +
> + *suppress_warnings= FALSE;
> +
> + if (opt_log_slave_updates && opt_replicate_annotate_rows_events)
> + flags|= BINLOG_SEND_ANNOTATE_ROWS_EVENT;
> +
> + buf[0]= flags;
> +
> + if (simple_command(mysql, COM_BINLOG_DUMP_OPTIONS_EXT, buf, 1, 0))
> + {
> + if (mysql_errno(mysql) == ER_NET_READ_INTERRUPTED)
> + *suppress_warnings= TRUE;
> + else
> + sql_print_error("Error on COM_BINLOG_DUMP_OPTIONS_EXT: %d %s,"
> + " will retry in %d secs",
> + mysql_errno(mysql), mysql_error(mysql),
> + master_connect_retry);
> + DBUG_RETURN(1);
> + }
This will break replication from a master that does not know the
COM_BINLOG_DUMP_OPTIONS_EXT command. We could maybe log a warning in this
case (saying that query annotation is not available from master), but we
should not fail I think.
> +
> + DBUG_RETURN(0);
> +}
> @@ -2234,16 +2264,33 @@
> }
> exec_res= apply_event_and_update_pos(ev, thd, rli);
>
> - /*
> - Format_description_log_event should not be deleted because it will be
> - used to read info about the relay log's format; it will be deleted when
> - the SQL thread does not need it, i.e. when this thread terminates.
> - */
> - if (ev->get_type_code() != FORMAT_DESCRIPTION_EVENT)
> - {
> - DBUG_PRINT("info", ("Deleting the event after it has been executed"));
> - delete ev;
> - }
> + switch (ev->get_type_code()) {
> + case FORMAT_DESCRIPTION_EVENT:
> + // Format_description_log_event should not be deleted because it
> + // will be used to read info about the relay log's format;
> + // it will be deleted when the SQL thread does not need it,
> + // i.e. when this thread terminates.
> + break;
According to coding style, multi-line comments should use C-style /* ... */,
not C++-style //
http://forge.mysql.com/wiki/MySQL_Internals_Coding_Guidelines#Commenting_Co…
(Should be changed here and in several other places where it occurs).
> + case ANNOTATE_ROWS_EVENT:
> + // Annotate_rows event should not be deleted because after it has
> + // been applied, thd->query points to the string inside this event.
> + // The thd->query will be used to generate new Annotate_rows event
> + // during applying the subsequent Rows events.
> + rli->set_annotate_event((Annotate_rows_log_event*) ev);
> + break;
> + case DELETE_ROWS_EVENT:
> + case UPDATE_ROWS_EVENT:
> + case WRITE_ROWS_EVENT:
> + // After the last Rows event has been applied, the saved Annotate_rows
> + // event (if any) is not needed anymore and can be deleted.
> + if (((Rows_log_event*)ev)->get_flags(Rows_log_event::STMT_END_F))
> + rli->free_annotate_event();
This seems a natural place to explicitly restore thd->query rather than inside
the annotate event destructor (if such restoration is really needed).
> + /* fall through */
> + default:
> + DBUG_PRINT("info", ("Deleting the event after it has been executed"));
> + delete ev;
> + break;
> + }
1
0
[Maria-developers] Updated (by Serg): pluggable authentication (101)
by worklog-noreply@askmonty.org 06 Apr '10
by worklog-noreply@askmonty.org 06 Apr '10
06 Apr '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: pluggable authentication
CREATION DATE..: Tue, 09 Mar 2010, 14:58
SUPERVISOR.....: Sergei
IMPLEMENTOR....: Sergei
COPIES TO......:
CATEGORY.......: Server-Sprint
TASK ID........: 101 (http://askmonty.org/worklog/?tid=101)
VERSION........: Server-5.2
STATUS.........: Complete
PRIORITY.......: 90
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Serg - Tue, 06 Apr 2010, 13:18)=-=-
Status updated.
--- /tmp/wklog.101.old.1262 2010-04-06 13:18:00.000000000 +0000
+++ /tmp/wklog.101.new.1262 2010-04-06 13:18:00.000000000 +0000
@@ -1 +1 @@
-Code-Review
+Complete
DESCRIPTION:
push the pluggable authentication feature
https://code.launchpad.net/~maria-captains/maria/5.2-pluggable-auth
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0
[Maria-developers] Updated (by Serg): pluggable authentication (101)
by worklog-noreply@askmonty.org 06 Apr '10
by worklog-noreply@askmonty.org 06 Apr '10
06 Apr '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: pluggable authentication
CREATION DATE..: Tue, 09 Mar 2010, 14:58
SUPERVISOR.....: Sergei
IMPLEMENTOR....: Sergei
COPIES TO......:
CATEGORY.......: Server-Sprint
TASK ID........: 101 (http://askmonty.org/worklog/?tid=101)
VERSION........: Server-5.2
STATUS.........: Complete
PRIORITY.......: 90
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Serg - Tue, 06 Apr 2010, 13:18)=-=-
Status updated.
--- /tmp/wklog.101.old.1262 2010-04-06 13:18:00.000000000 +0000
+++ /tmp/wklog.101.new.1262 2010-04-06 13:18:00.000000000 +0000
@@ -1 +1 @@
-Code-Review
+Complete
DESCRIPTION:
push the pluggable authentication feature
https://code.launchpad.net/~maria-captains/maria/5.2-pluggable-auth
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0
[Maria-developers] Rev 2775: keep acl_user->auth_string and acl_user->salt always in sync in http://bazaar.launchpad.net/~maria-captains/maria/5.2/
by serg@askmonty.org 06 Apr '10
by serg@askmonty.org 06 Apr '10
06 Apr '10
At http://bazaar.launchpad.net/~maria-captains/maria/5.2/
------------------------------------------------------------
revno: 2775
revision-id: sergii(a)pisem.net-20100406110456-zhkf13tvdcywb9tb
parent: sergii(a)pisem.net-20100406110224-0wtdkmquq8wmdfer
committer: Sergei Golubchik <sergii(a)pisem.net>
branch nick: 5.2
timestamp: Tue 2010-04-06 13:04:56 +0200
message:
keep acl_user->auth_string and acl_user->salt always in sync
print authenticated via in show grants
=== modified file 'sql/sql_acl.cc'
--- a/sql/sql_acl.cc 2010-04-05 09:50:31 +0000
+++ b/sql/sql_acl.cc 2010-04-06 11:04:56 +0000
@@ -287,9 +287,8 @@ static void init_check_host(void);
static void rebuild_check_host(void);
static ACL_USER *find_acl_user(const char *host, const char *user,
my_bool exact);
-static bool update_user_table(THD *thd, TABLE *table,
- const char *host, const char *user,
- const char *new_password, uint new_password_len);
+static bool update_user_table(THD *, TABLE *, const char *, const char *,
+ const char *, uint);
static void update_hostname(acl_host_and_ip *host, const char *hostname);
static bool compare_hostname(const acl_host_and_ip *host,const char *hostname,
const char *ip);
@@ -1161,7 +1160,12 @@ static void acl_update_user(const char *
acl_user->plugin.str= strmake_root(&mem, plugin->str, plugin->length);
}
else
- set_user_salt(acl_user, password, password_len);
+ if (password)
+ {
+ acl_user->auth_string.str= strmake_root(&mem, password, password_len);
+ acl_user->auth_string.length= password_len;
+ set_user_salt(acl_user, password, password_len);
+ }
acl_user->access=privileges;
if (mqh->specified_limits & USER_RESOURCES::QUERIES_PER_HOUR)
acl_user->user_resource.questions=mqh->questions;
@@ -1611,7 +1615,13 @@ bool change_password(THD *thd, const cha
goto end;
}
/* update loaded acl entry: */
- set_user_salt(acl_user, new_password, new_password_len);
+ if (acl_user->plugin.str == native_password_plugin_name.str ||
+ acl_user->plugin.str == old_password_plugin_name.str)
+ {
+ acl_user->auth_string.str= strmake_root(&mem, new_password, new_password_len);
+ acl_user->auth_string.length= new_password_len;
+ set_user_salt(acl_user, new_password, new_password_len);
+ }
if (update_user_table(thd, table,
acl_user->host.hostname ? acl_user->host.hostname : "",
@@ -4642,16 +4652,27 @@ bool mysql_show_grants(THD *thd,LEX_USER
global.append(lex_user->host.str,lex_user->host.length,
system_charset_info);
global.append ('\'');
- if (acl_user->salt_len)
+ if (acl_user->plugin.str == native_password_plugin_name.str ||
+ acl_user->plugin.str == old_password_plugin_name.str)
{
- char passwd_buff[SCRAMBLED_PASSWORD_CHAR_LENGTH+1];
- if (acl_user->salt_len == SCRAMBLE_LENGTH)
- make_password_from_salt(passwd_buff, acl_user->salt);
- else
- make_password_from_salt_323(passwd_buff, (ulong *) acl_user->salt);
- global.append(STRING_WITH_LEN(" IDENTIFIED BY PASSWORD '"));
- global.append(passwd_buff);
- global.append('\'');
+ if (acl_user->auth_string.length)
+ {
+ DBUG_ASSERT(acl_user->salt_len);
+ global.append(STRING_WITH_LEN(" IDENTIFIED BY PASSWORD '"));
+ global.append(acl_user->auth_string.str, acl_user->auth_string.length);
+ global.append('\'');
+ }
+ }
+ else
+ {
+ global.append(STRING_WITH_LEN(" IDENTIFIED VIA "));
+ global.append(acl_user->plugin.str, acl_user->plugin.length);
+ if (acl_user->auth_string.length)
+ {
+ global.append(STRING_WITH_LEN(" USING '"));
+ global.append(acl_user->auth_string.str, acl_user->auth_string.length);
+ global.append('\'');
+ }
}
/* "show grants" SSL related stuff */
if (acl_user->ssl_type == SSL_TYPE_ANY)
1
0
[Maria-developers] Rev 2774: fix a warning in http://bazaar.launchpad.net/~maria-captains/maria/5.2/
by serg@askmonty.org 06 Apr '10
by serg@askmonty.org 06 Apr '10
06 Apr '10
At http://bazaar.launchpad.net/~maria-captains/maria/5.2/
------------------------------------------------------------
revno: 2774
revision-id: sergii(a)pisem.net-20100406110224-0wtdkmquq8wmdfer
parent: igor(a)askmonty.org-20100406065639-w39nrood9amqc5zk
committer: Sergei Golubchik <sergii(a)pisem.net>
branch nick: 5.2
timestamp: Tue 2010-04-06 13:02:24 +0200
message:
fix a warning
=== modified file 'mysys/mf_keycache.c'
--- a/mysys/mf_keycache.c 2010-04-06 02:08:10 +0000
+++ b/mysys/mf_keycache.c 2010-04-06 11:02:24 +0000
@@ -5207,7 +5207,7 @@ int init_partitioned_key_cache(PARTITION
my_free(partition, MYF(0));
partition= 0;
}
- if (i == 0 && cnt < 0 || i > 0)
+ if ((i == 0 && cnt < 0) || i > 0)
{
/*
Here we have two cases:
1
0