developers
November 2009
- 22 participants
- 180 discussions
[Maria-developers] Progress (by Bothorsen): Add a mysqlbinlog option to change the used database (36)
by worklog-noreply@askmonty.org 03 Nov '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add a mysqlbinlog option to change the used database
CREATION DATE..: Fri, 07 Aug 2009, 14:57
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 36 (http://askmonty.org/worklog/?tid=36)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 49
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Bothorsen - Tue, 03 Nov 2009, 13:49)=-=-
More cleanup work done by Alexi, Bo and Sergey.
Worked 4 hours and estimate 0 hours remain (original estimate increased by 4 hours).
-=-=(Bothorsen - Tue, 03 Nov 2009, 13:49)=-=-
Sergey and Bo have been working on getting the patch ready, and Alexi has fixed some issues with the
patch.
Worked 15 hours and estimate 0 hours remain (original estimate increased by 15 hours).
-=-=(Bothorsen - Tue, 03 Nov 2009, 13:47)=-=-
Alexi has implemented a patch for this item.
Worked 30 hours and estimate 0 hours remain (original estimate increased by 30 hours).
-=-=(Guest - Tue, 15 Sep 2009, 18:04)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.19322 2009-09-15 18:04:49.000000000 +0300
+++ /tmp/wklog.36.new.19322 2009-09-15 18:04:49.000000000 +0300
@@ -191,7 +191,7 @@
- In process_event() function add switch case for Load_log_event and
add print_use_stmt() invocations where needed (according to the
- events lis above), e.g.:
+ events list above), e.g.:
Exit_status process_event(
PRINT_EVENT_INFO *print_event_info,
-=-=(Guest - Tue, 15 Sep 2009, 15:53)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.13421 2009-09-15 15:53:31.000000000 +0300
+++ /tmp/wklog.36.new.13421 2009-09-15 15:53:31.000000000 +0300
@@ -150,10 +150,17 @@
following events (see process_event() function):
- Query_log_event
-- Execute_load_query_log_event
-- Create_file_log_event
-
-TODO. Needed to check this list requires carefully !!!
+- Load_log_event
+- Execute_load_query_log_event [ :public Query_log_event ]
+- Create_file_log_event [ :public Load_log_event ]
+
+TODO. Needed to check this list carefully (not sure for Create_file_log_event)
+ Notes.
+ - In replication, only Query_log_event and Load_log_event uses
+ rpl_filter->get_rewrite_db();
+ - In mysqlbinlog (process_event), Execute_load_query_log_event
+ and Create_file_log_event are processed in separate switch
+ cases. And Load_log_event is processed in the default switch case.
Conditions for emiting use-statement:
- LOG_EVENT_SUPPRESS_USE_F is OFF for the event
@@ -182,8 +189,9 @@
*/
}
-- In process_event() function add print_use_stmt() invocations where
- needed (according to the events lis above), e.g.:
+- In process_event() function add switch case for Load_log_event and
+ add print_use_stmt() invocations where needed (according to the
+ events lis above), e.g.:
Exit_status process_event(
PRINT_EVENT_INFO *print_event_info,
@@ -207,6 +215,11 @@
}
break;
...
+ case LOAD_EVENT:
+ print_use_stmt((Load_log_event*)ev, print_event_info);
+ break;
+ default:
+ ...
}
...
}
-=-=(Guest - Tue, 15 Sep 2009, 12:12)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.3961 2009-09-15 12:12:26.000000000 +0300
+++ /tmp/wklog.36.new.3961 2009-09-15 12:12:26.000000000 +0300
@@ -144,6 +144,8 @@
3. Supporting rewrite-db for SBR events
---------------------------------------
+Limited to emiting USE <db_to> instead of USE <db_from>.
+
USE statements can be emited by mysqlbinlog as a result of processing the
following events (see process_event() function):
-=-=(Guest - Tue, 15 Sep 2009, 12:08)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.3794 2009-09-15 12:08:54.000000000 +0300
+++ /tmp/wklog.36.new.3794 2009-09-15 12:08:54.000000000 +0300
@@ -1 +1,229 @@
+Content
+-------
+1. Adding rewrite-db option
+2. Supporting rewrite-db option for RBR events
+3. Supporting rewrite-db option for SBR events
+ (Limited to affecting only USE statements)
+4. Current status
+
+1. Adding rewrite-db option
+---------------------------
+
+1.1. Syntax:
+ --rewrite-db='db_from->db_to'
+
+1.2. Add 'OPT_REWRITE_DB' to 'options_client' (in client_priv.h).
+
+1.3. In mysqlbinlog.cc:
+
+- Add { "rewrite-db", OPT_REWRITE_DB, ...} record to my_long_options:
+- Add Rpl_filter object to mysqlbinlog.cc
+
+ Rpl_filter* binlog_filter;
+
+- Add corresponding switch case to get_one_option():
+
+ case OPT_REWRITE_DB:
+ <extract db-from and db-to strings>
+ binlog_filter->add_db_rewrite(db_from, db_to);
+ break;
+ .
+Note. To make Rpl_filter usable in a MYSQL_CLIENT context, few small
+additional changes are required:
+
+- In sql_list.cc/h, Sql_alloc::new(size_t) and Sql_alloc::new[](size_t)
+ uses sql_alloc() which is THD dependent. These are to be modified
+ as follows:
+
+ #ifdef MYSQL_CLIENT
+ extern MEM_ROOT sql_list_client_mem_root; // defined in sql_list.cc
+ #endif
+
+ class Sql_alloc
+ { ...
+ static void *operator new(size_t size) throw ()
+ {
+ #ifndef MYSQL_CLIENT
+ return sql_alloc(size);
+ #else
+ return alloc_root(&sql_list_client_mem_root, size);
+ #endif
+ }
+ static void *operator new[](size_t size) throw ()
+ {
+ #ifndef MYSQL_CLIENT
+ return sql_alloc(size);
+ #else
+ return alloc_root(&sql_list_client_mem_root, size);
+ #endif
+ }
+ ...
+ }
+
+- In rpl_filter.cc:
+
+ Rpl_filter::Rpl_filter() :
+ ...
+ {
+ #ifdef MYSQL_CLIENT
+ init_alloc_root(&sql_list_client_mem_root, ...);
+ #endif
+ ...
+ }
+
+ Rpl_filter::~Rpl_filter()
+ { ...
+ #ifdef MYSQL_CLIENT
+ free_root(&sql_list_client_mem_root, ...);
+ #endif
+ }
+
+2. Supporting rewrite-db for RBR events
+---------------------------------------
+
+In binlog, each row operation event is preceded by Table map event(s) which maps
+table id(s) to database and table names. So, it's enough to support rewriting
+database name in a Table map.
+
+2.1. Add rewrite_db() member to Table_map_log_event:
+
+ int Table_map_log_event::rewrite_db(
+ const char* new_db,
+ size_t new_db_len,
+ const Format_description_log_event* desc)
+ {
+ /* 1. In temp_buf member (possibly reallocating it) rewrite
+ event length, db length, and db parts
+ 2. Change m_dblen and m_dbnam members
+ */
+ }
+
+Comment. This function assumes that temp_buf member contains Table map
+binlog representaion (temp_buf is used for creating corresponding
+BINLOG statement).
+
+2.2. In mysqlbinlog modify corresponding switch case in the
+process_event() function:
+
+ Exit_status process_event(
+ PRINT_EVENT_INFO *print_event_info,
+ Log_event *ev, ...)
+ {
+ ...
+ switch (ev_type) {
+ ...
+ case TABLE_MAP_EVENT:
+ {
+ Table_map_log_event *map= ((Table_map_log_event *)ev);
+ if (shall_skip_database(map->get_db_name()))
+ { ...
+ }
+ // WL36
+ size_t new_len= 0;
+ const char* new_db= binlog_filter->get_rewrite_db(
+ map->get_db_name(), &new_len);
+ if (new_len && map->rewrite_db(new_db, new_len,
+ glob_description_event))
+ { error("Could not rewrite database name");
+ goto err;
+ }
+ }
+ case WRITE_ROWS_EVENT:
+ case DELETE_ROWS_EVENT:
+ case UPDATE_ROWS_EVENT:
+ ...
+ }
+ ...
+ }
+
+Comment. Rpl_filter::get_rewrite_db(db_from, &len): if filter contains
+a (db_from, db_to) pair, this function returns pointer to db_to and
+sets len = db_to length; otherwise, it returns db_from and does not
+change len value.
+
+3. Supporting rewrite-db for SBR events
+---------------------------------------
+
+USE statements can be emited by mysqlbinlog as a result of processing the
+following events (see process_event() function):
+
+- Query_log_event
+- Execute_load_query_log_event
+- Create_file_log_event
+
+TODO. Needed to check this list requires carefully !!!
+
+Conditions for emiting use-statement:
+- LOG_EVENT_SUPPRESS_USE_F is OFF for the event
+ (e.g. it is ON for 'create database' statement)
+- event's db name differs from db_name in PRINT_EVENT_INFO
+ (PRINT_EVENT_INFO keeps db name of the last issued USE statement;
+ initially, this db name is empty).
+
+3.1. In mysqlbinlog.cc
+
+- Add the following function:
+
+ void print_use_stmt(Log_event* event, PRINT_EVENT_INFO* pinfo)
+ {
+ if (event->flags & LOG_EVENT_SUPPRESS_USE_F)
+ return;
+ /*
+ - For events listed above get db_from = event->db;
+ - If db_from is the same as pinfo->db then return;
+ - If there is rewrite-db rule db_from->db_to,
+ set db = db_to. Else set db = db_from;
+ - Print "use <db>" to mysqlbinlog output
+ - Set pinfo->db = db_from
+ (this suppresses emiting use-statements by corresponding
+ log_event's print-function)
+ */
+ }
+
+- In process_event() function add print_use_stmt() invocations where
+ needed (according to the events lis above), e.g.:
+
+ Exit_status process_event(
+ PRINT_EVENT_INFO *print_event_info,
+ Log_event *ev, ...)
+ {
+ ...
+ switch (ev_type) {
+ case QUERY_EVENT:
+ if (shall_skip_database(((Query_log_event*)ev)->db))
+ goto end;
+ if (opt_base64_output_mode == BASE64_OUTPUT_ALWAYS)
+ {
+ // Possibly in case of rewite-db rule for ev->db
+ // a warning should be emited here (see note below)
+ ... write_event_header_and_base64(ev, ...) ...
+ }
+ else
+ {
+ print_use_stmt((Query_log_event*)ev, print_event_info);
+ ev->print(result_file, print_event_info);
+ }
+ break;
+ ...
+ }
+ ...
+ }
+
+Note. write_event_header_and_base64() does not print use-statement. It
+produces BINLOG statement using ev->temp_buf content (i.e. the binary
+log representation of the event). We don't rewrite temp_buf here with
+db_to name (as we do it for Table map event) - this implies the
+limitation 3 mentioned above.
+Question: Is supporting of rewite_db + --base64-output really needed
+currently?
+
+4. Current status
+-----------------
+
+The outlined design (implemented for mysql-5.1.37) is tested for
+simple test-cases.
+
+TODO. 1. Check list of events which can emit use-statement.
+ 2. Supporting of rewite_db + --base64-output ?
+
-=-=(Guest - Mon, 14 Sep 2009, 11:51)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.9711 2009-09-14 11:51:43.000000000 +0300
+++ /tmp/wklog.36.new.9711 2009-09-14 11:51:43.000000000 +0300
@@ -1 +1 @@
-Pay no attention: just check for having access
+
-=-=(Guest - Mon, 14 Sep 2009, 11:51)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.9678 2009-09-14 11:51:28.000000000 +0300
+++ /tmp/wklog.36.new.9678 2009-09-14 11:51:28.000000000 +0300
@@ -1 +1 @@
-
+Pay no attention: just check for having access
-=-=(Knielsen - Mon, 17 Aug 2009, 12:44)=-=-
High-Level Specification modified.
--- /tmp/wklog.36.old.7834 2009-08-17 12:44:17.000000000 +0300
+++ /tmp/wklog.36.new.7834 2009-08-17 12:44:17.000000000 +0300
@@ -13,7 +13,9 @@
statement refers to tables in current database, so that changing the current
database will make the statement to work on a table in a different database).
-See also MySQL BUG#42941.
+See also MySQL BUG#42941. Note this bug is fixed in MySQL 5.1.37, which is not
+merged into MariaDB at the time of writing, but planned to be merged before
+release.
What we could do
----------------
------------------------------------------------------------
-=-=(View All Progress Notes, 16 total)=-=-
http://askmonty.org/worklog/index.pl?tid=36&nolimit=1
DESCRIPTION:
Sometimes a binary log has to be applied to a database whose name differs from
the name of the original database on the binlog producer.
With statement-based replication, this can be achieved by filtering the
"USE dbname" statements out of the mysqlbinlog output (*). With row-based
replication this is no longer possible, because the database name is encoded
inside the BINLOG '....' statement.
This task is about adding an option to mysqlbinlog that allows changing the
names of the databases used in both RBR and SBR events.
(*) This assumes that all statements refer to tables in the current database
and does not catch updates made inside stored functions and so forth, but it
still works for a practically important subset of cases.
HIGH-LEVEL SPECIFICATION:
Context
-------
(See http://askmonty.org/wiki/index.php/Scratch/ReplicationOptions for global
overview)
At the moment, the server has a replication slave option
  --replicate-rewrite-db="from->to"
The option affects:
- Table_map_log_event (all RBR events)
- Load_log_event (LOAD DATA)
- Query_log_event (SBR-based updates, with the usual assumption that the
  statement refers to tables in the current database, so that changing the
  current database makes the statement work on a table in a different
  database).
See also MySQL BUG#42941. Note that this bug is fixed in MySQL 5.1.37, which is
not merged into MariaDB at the time of writing, but is planned to be merged
before release.
What we could do
----------------
Option1: make mysqlbinlog accept --replicate-rewrite-db option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Make mysqlbinlog accept --replicate-rewrite-db options and process them to the
same extent as replication slave would process --replicate-rewrite-db option.
Option2: Add database-agnostic RBR events and --strip-db option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Right now RBR events require a database name. It is not possible to have an
RBR event stream that does not mention which database the events are for. When
I used a debugger to specify an empty database name, the attempt to apply the
binlog resulted in this error:
090809 17:38:44 [ERROR] Slave SQL: Error 'Table '.tablename' doesn't exist' on
opening tables,
We could do as follows:
- Make the server interpret an empty database name in an RBR event (i.e. in a
  Table_map_log_event) as "use current database". The binlog slave thread
  probably should not allow such events, as it doesn't have a natural current
  database.
- Add a mysqlbinlog --strip-db option that would
  = not produce any "USE dbname" statements
  = change the database name for all RBR events to be empty
That way, mysqlbinlog output will be database-agnostic and apply to the
current database.
(This has the usual limitation: we assume that all statements in
the binlog refer to the current database.)
Option3: Enhance database rewrite
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If there is a need to support database change for statements that use
dbname.tablename notation and are replicated as statements (i.e. are DDL
statements and/or DML statements that are binlogged as statements),
then that could be supported as follows:
- Make the server's parser recognize a special form of comment
  /* !database-alias(oldname,newname) */
  and save the mapping somewhere
- Put hooks in the table open and name resolution code to use the saved
  mapping.
Once the above is done, it will be easy to perform a complete database name
change in the binary log, without compromises or restrictions.
It will be possible to do the rewrites either on the slave
(--replicate-rewrite-db will work for all kinds of statements) or in
mysqlbinlog (adding a comment is easy and doesn't require mysqlbinlog to
parse the statement).
LOW-LEVEL DESIGN:
Content
-------
1. Adding rewrite-db option
2. Supporting rewrite-db option for RBR events
3. Supporting rewrite-db option for SBR events
(Limited to affecting only USE statements)
4. Current status
1. Adding rewrite-db option
---------------------------
1.1. Syntax:
--rewrite-db='db_from->db_to'
1.2. Add 'OPT_REWRITE_DB' to 'options_client' (in client_priv.h).
1.3. In mysqlbinlog.cc:
- Add { "rewrite-db", OPT_REWRITE_DB, ...} record to my_long_options.
- Add Rpl_filter object to mysqlbinlog.cc:
    Rpl_filter* binlog_filter;
- Add corresponding switch case to get_one_option():
    case OPT_REWRITE_DB:
      <extract db-from and db-to strings>
      binlog_filter->add_db_rewrite(db_from, db_to);
      break;
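The "<extract db-from and db-to strings>" step above is left open in the design. A minimal sketch of one way to do it follows. The helper name parse_rewrite_db and the decision to trim spaces around the names are assumptions made for illustration; they are not taken from the mysqlbinlog sources.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical helper (not part of mysqlbinlog): splits "db_from->db_to"
// into its two names. Returns std::nullopt when the "->" separator is
// missing or either name is empty.
std::optional<std::pair<std::string, std::string>>
parse_rewrite_db(const std::string& arg)
{
  const std::string sep = "->";
  const std::size_t pos = arg.find(sep);
  if (pos == std::string::npos)
    return std::nullopt;

  // Assumption: spaces around the names are ignored, so "a -> b" is accepted.
  auto trim = [](const std::string& s) {
    const std::size_t first = s.find_first_not_of(' ');
    if (first == std::string::npos)
      return std::string();
    const std::size_t last = s.find_last_not_of(' ');
    return s.substr(first, last - first + 1);
  };

  std::string from = trim(arg.substr(0, pos));
  std::string to = trim(arg.substr(pos + sep.size()));
  if (from.empty() || to.empty())
    return std::nullopt;
  return std::make_pair(std::move(from), std::move(to));
}
```

With such a helper, the OPT_REWRITE_DB case would pass the two names to add_db_rewrite() and report an error when nothing is returned.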
Note. To make Rpl_filter usable in a MYSQL_CLIENT context, a few small
additional changes are required:
- In sql_list.cc/h, Sql_alloc::new(size_t) and Sql_alloc::new[](size_t)
  use sql_alloc(), which is THD dependent. These are to be modified
  as follows:
  #ifdef MYSQL_CLIENT
  extern MEM_ROOT sql_list_client_mem_root; // defined in sql_list.cc
  #endif
  class Sql_alloc
  { ...
    static void *operator new(size_t size) throw ()
    {
  #ifndef MYSQL_CLIENT
      return sql_alloc(size);
  #else
      return alloc_root(&sql_list_client_mem_root, size);
  #endif
    }
    static void *operator new[](size_t size) throw ()
    {
  #ifndef MYSQL_CLIENT
      return sql_alloc(size);
  #else
      return alloc_root(&sql_list_client_mem_root, size);
  #endif
    }
    ...
  };
- In rpl_filter.cc:
  Rpl_filter::Rpl_filter() :
    ...
  {
  #ifdef MYSQL_CLIENT
    init_alloc_root(&sql_list_client_mem_root, ...);
  #endif
    ...
  }
  Rpl_filter::~Rpl_filter()
  { ...
  #ifdef MYSQL_CLIENT
    free_root(&sql_list_client_mem_root, ...);
  #endif
  }
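The idea behind the Sql_alloc change is that objects get their memory from a shared pool that is released in one step, instead of from a per-connection THD. The following standalone sketch illustrates that pattern only. Arena, client_arena, ArenaAllocated and RewriteRule are hypothetical names standing in for MEM_ROOT, sql_list_client_mem_root, Sql_alloc and a filter entry; none of this is the server's actual code.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for MEM_ROOT: memory is handed out piecemeal and
// released all at once.
struct Arena {
  std::vector<void*> blocks;

  void* alloc(std::size_t size) {
    void* p = std::malloc(size ? size : 1);
    if (p)
      blocks.push_back(p);
    return p;
  }

  void free_all() {
    for (void* p : blocks)
      std::free(p);
    blocks.clear();
  }
};

Arena client_arena;  // plays the role of sql_list_client_mem_root

// Base class whose operator new draws from the arena, as Sql_alloc would
// in a MYSQL_CLIENT build under the design above.
struct ArenaAllocated {
  static void* operator new(std::size_t size) noexcept {
    return client_arena.alloc(size);
  }
  static void* operator new[](std::size_t size) noexcept {
    return client_arena.alloc(size);
  }
  // Individual deletes are no-ops; the arena owns the memory.
  static void operator delete(void*) noexcept {}
  static void operator delete[](void*) noexcept {}
};

// Example of a type that would live in the arena.
struct RewriteRule : ArenaAllocated {
  const char* from;
  const char* to;
  RewriteRule(const char* f, const char* t) : from(f), to(t) {}
};
```

In this sketch, free_all() corresponds to the free_root() call in the Rpl_filter destructor: nothing is returned to the system until the owner of the pool goes away.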
2. Supporting rewrite-db for RBR events
---------------------------------------
In the binlog, each row operation event is preceded by Table map event(s) which
map table id(s) to database and table names. So, it's enough to support
rewriting the database name in a Table map.
2.1. Add rewrite_db() member to Table_map_log_event:
  int Table_map_log_event::rewrite_db(
    const char* new_db,
    size_t new_db_len,
    const Format_description_log_event* desc)
  {
    /* 1. In temp_buf member (possibly reallocating it) rewrite
          event length, db length, and db parts
       2. Change m_dblen and m_dbnam members
    */
  }
Comment. This function assumes that the temp_buf member contains the Table map
binlog representation (temp_buf is used for creating the corresponding
BINLOG statement).
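Step 1 of rewrite_db() amounts to splicing a new name into a serialized buffer and patching two length fields. The sketch below shows that operation on a deliberately simplified layout. The layout, the function name rewrite_db_in_buffer and the offsets are assumptions for illustration and do not match the real Table map event format.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Simplified, hypothetical layout (NOT the real Table map event format):
//   bytes 0..3  total buffer length, little-endian
//   byte  4     database name length
//   bytes 5..   database name, a NUL byte, then the rest of the event
// Returns 0 on success and 1 on error, in the spirit of rewrite_db().
int rewrite_db_in_buffer(std::vector<std::uint8_t>& buf,
                         const std::string& new_db)
{
  const std::size_t kLenOffset = 0, kDbLenOffset = 4, kDbOffset = 5;
  if (buf.size() < kDbOffset + 1 || new_db.empty() || new_db.size() > 255)
    return 1;
  const std::size_t old_len = buf[kDbLenOffset];
  if (kDbOffset + old_len + 1 > buf.size())
    return 1;  // declared name length runs past the end of the buffer

  // Replace the old name; the vector grows or shrinks as needed, which
  // corresponds to "possibly reallocating" temp_buf.
  buf.erase(buf.begin() + kDbOffset, buf.begin() + kDbOffset + old_len);
  buf.insert(buf.begin() + kDbOffset, new_db.begin(), new_db.end());

  // Patch both length fields so the buffer stays self-consistent.
  buf[kDbLenOffset] = static_cast<std::uint8_t>(new_db.size());
  const std::uint32_t total = static_cast<std::uint32_t>(buf.size());
  for (int i = 0; i < 4; ++i)
    buf[kLenOffset + i] = static_cast<std::uint8_t>((total >> (8 * i)) & 0xFF);
  return 0;
}
```

The real function additionally has to update the m_dblen and m_dbnam members, since they would otherwise still describe the old name.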
2.2. In mysqlbinlog modify corresponding switch case in the
process_event() function:
  Exit_status process_event(
    PRINT_EVENT_INFO *print_event_info,
    Log_event *ev, ...)
  {
    ...
    switch (ev_type) {
    ...
    case TABLE_MAP_EVENT:
    {
      Table_map_log_event *map= ((Table_map_log_event *)ev);
      if (shall_skip_database(map->get_db_name()))
      { ...
      }
      // WL36
      size_t new_len= 0;
      const char* new_db= binlog_filter->get_rewrite_db(
        map->get_db_name(), &new_len);
      if (new_len && map->rewrite_db(new_db, new_len,
                                     glob_description_event))
      { error("Could not rewrite database name");
        goto err;
      }
    }
    case WRITE_ROWS_EVENT:
    case DELETE_ROWS_EVENT:
    case UPDATE_ROWS_EVENT:
      ...
    }
    ...
  }
Comment. Rpl_filter::get_rewrite_db(db_from, &len): if filter contains
a (db_from, db_to) pair, this function returns pointer to db_to and
sets len = db_to length; otherwise, it returns db_from and does not
change len value.
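The contract described in this comment is what lets the caller use "new_len != 0" as the signal that a rule matched. A minimal standalone sketch of that contract is given below. RewriteFilter is a hypothetical stand-in and not the real Rpl_filter class; only the method names follow the text.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the rewrite part of Rpl_filter.
class RewriteFilter {
public:
  void add_db_rewrite(const std::string& from, const std::string& to) {
    rules_.emplace_back(from, to);
  }

  // Returns db_to and sets *new_len when a rule matches db_from;
  // otherwise returns db_from itself and leaves *new_len untouched.
  const char* get_rewrite_db(const char* db_from, std::size_t* new_len) const {
    for (const auto& rule : rules_) {
      if (rule.first == db_from) {
        *new_len = rule.second.size();
        return rule.second.c_str();
      }
    }
    return db_from;
  }

private:
  std::vector<std::pair<std::string, std::string>> rules_;
};
```

Because the length is only written on a match, the caller has to initialise it to 0 before the call, as the process_event() outline does. In this sketch the returned pointer stays valid only until the next add_db_rewrite() call.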
3. Supporting rewrite-db for SBR events
---------------------------------------
Limited to emitting USE <db_to> instead of USE <db_from>.
USE statements can be emitted by mysqlbinlog as a result of processing the
following events (see process_event() function):
- Query_log_event
- Load_log_event
- Execute_load_query_log_event [ :public Query_log_event ]
- Create_file_log_event [ :public Load_log_event ]
TODO. This list needs to be checked carefully (not sure about
Create_file_log_event).
Notes.
- In replication, only Query_log_event and Load_log_event use
  rpl_filter->get_rewrite_db();
- In mysqlbinlog (process_event), Execute_load_query_log_event
  and Create_file_log_event are processed in separate switch
  cases, and Load_log_event is processed in the default switch case.
Conditions for emitting a use-statement:
- LOG_EVENT_SUPPRESS_USE_F is OFF for the event
  (e.g. it is ON for a 'create database' statement)
- the event's db name differs from db_name in PRINT_EVENT_INFO
  (PRINT_EVENT_INFO keeps the db name of the last issued USE statement;
  initially, this db name is empty).
3.1. In mysqlbinlog.cc
- Add the following function:
  void print_use_stmt(Log_event* event, PRINT_EVENT_INFO* pinfo)
  {
    if (event->flags & LOG_EVENT_SUPPRESS_USE_F)
      return;
    /*
      - For events listed above get db_from = event->db;
      - If db_from is the same as pinfo->db then return;
      - If there is rewrite-db rule db_from->db_to,
        set db = db_to. Else set db = db_from;
      - Print "use <db>" to mysqlbinlog output
      - Set pinfo->db = db_from
        (this suppresses emitting use-statements by corresponding
        log_event's print-function)
    */
  }
- In process_event() function add switch case for Load_log_event and
  add print_use_stmt() invocations where needed (according to the
  events list above), e.g.:
  Exit_status process_event(
    PRINT_EVENT_INFO *print_event_info,
    Log_event *ev, ...)
  {
    ...
    switch (ev_type) {
    case QUERY_EVENT:
      if (shall_skip_database(((Query_log_event*)ev)->db))
        goto end;
      if (opt_base64_output_mode == BASE64_OUTPUT_ALWAYS)
      {
        // Possibly in case of rewrite-db rule for ev->db
        // a warning should be emitted here (see note below)
        ... write_event_header_and_base64(ev, ...) ...
      }
      else
      {
        print_use_stmt((Query_log_event*)ev, print_event_info);
        ev->print(result_file, print_event_info);
      }
      break;
    ...
    case LOAD_EVENT:
      print_use_stmt((Load_log_event*)ev, print_event_info);
      break;
    default:
      ...
    }
    ...
  }
Note. write_event_header_and_base64() does not print a use-statement. It
produces a BINLOG statement using the ev->temp_buf content (i.e. the binary
log representation of the event). We don't rewrite temp_buf here with the
db_to name (as we do for the Table map event); this implies the limitation
noted for section 3 above.
Question: Is support for rewrite-db + --base64-output really needed
currently?
4. Current status
-----------------
The outlined design (implemented for mysql-5.1.37) has been tested on
simple test cases.
TODO. 1. Check the list of events which can emit a use-statement.
      2. Support for rewrite-db + --base64-output ?
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
+
-=-=(Guest - Mon, 14 Sep 2009, 11:51)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.9678 2009-09-14 11:51:28.000000000 +0300
+++ /tmp/wklog.36.new.9678 2009-09-14 11:51:28.000000000 +0300
@@ -1 +1 @@
-
+Pay no attention: just check for having access
-=-=(Knielsen - Mon, 17 Aug 2009, 12:44)=-=-
High-Level Specification modified.
--- /tmp/wklog.36.old.7834 2009-08-17 12:44:17.000000000 +0300
+++ /tmp/wklog.36.new.7834 2009-08-17 12:44:17.000000000 +0300
@@ -13,7 +13,9 @@
statement refers to tables in current database, so that changing the current
database will make the statement to work on a table in a different database).
-See also MySQL BUG#42941.
+See also MySQL BUG#42941. Note this bug is fixed in MySQL 5.1.37, which is not
+merged into MariaDB at the time of writing, but planned to be merged before
+release.
What we could do
----------------
------------------------------------------------------------
-=-=(View All Progress Notes, 16 total)=-=-
http://askmonty.org/worklog/index.pl?tid=36&nolimit=1
DESCRIPTION:
Sometimes there is a need to take a binary log and apply it to a database with
a different name than the original name of the database on the binlog producer.
With statement-based replication, one can achieve this by grepping the
"USE dbname" statements out of the output of mysqlbinlog(*). With
row-based replication this is no longer possible, as the database name is encoded
within the BINLOG '....' statement.
This task is about adding an option to mysqlbinlog that allows changing
the names of the databases used in both RBR and SBR events.
(*) This assumes that all statements refer to tables in the current database
and does not catch updates made inside stored functions and so forth, but it
still works for a practically important subset of cases.
HIGH-LEVEL SPECIFICATION:
Context
-------
(See http://askmonty.org/wiki/index.php/Scratch/ReplicationOptions for global
overview)
At the moment, the server has a replication slave option
--replicate-rewrite-db="from->to"
the option affects
- Table_map_log_event (all RBR events)
- Load_log_event (LOAD DATA)
- Query_log_event (SBR-based updates, with the usual assumption that the
statement refers to tables in the current database, so that changing the current
database makes the statement work on a table in a different database).
See also MySQL BUG#42941. Note this bug is fixed in MySQL 5.1.37, which is not
merged into MariaDB at the time of writing, but planned to be merged before
release.
What we could do
----------------
Option1: make mysqlbinlog accept --replicate-rewrite-db option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Make mysqlbinlog accept --replicate-rewrite-db options and process them to the
same extent as replication slave would process --replicate-rewrite-db option.
Option2: Add database-agnostic RBR events and --strip-db option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Right now RBR events require a database name. It is not possible to have an RBR
event stream that does not mention which database the events are for. When I
used a debugger to specify an empty database name, the attempt to apply the
binlog resulted in this error:
090809 17:38:44 [ERROR] Slave SQL: Error 'Table '.tablename' doesn't exist' on
opening tables,
We could do as follows:
- Make the server interpret an empty database name in an RBR event (i.e. in a
Table_map_log_event) as "use current database". The binlog slave thread
probably should not allow such events, as it doesn't have a natural current
database.
- Add a mysqlbinlog --strip-db option that would
= not produce any "USE dbname" statements
= change databasename for all RBR events to be empty
That way, mysqlbinlog output will be database-agnostic and apply to the
current database.
(this will have the usual limitations that we assume that all statements in
the binlog refer to the current database).
Option3: Enhance database rewrite
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If there is a need to support database change for statements that use
dbname.tablename notation and are replicated as statements (i.e. are DDL
statements and/or DML statements that are binlogged as statements),
then that could be supported as follows:
- Make the server's parser recognize special form of comments
/* !database-alias(oldname,newname) */
and save the mapping somewhere
- Put the hooks in table open and name resolution code to use the saved
mapping.
Once we've done the above, it will be easy to perform a complete database
name change in the binary log, without compromises or restrictions.
It will be possible to do the rewrites either on the slave
(--replicate-rewrite-db will work for all kinds of statements) or in
mysqlbinlog (adding a comment is easy and doesn't require mysqlbinlog to
parse the statement).
LOW-LEVEL DESIGN:
Content
-------
1. Adding rewrite-db option
2. Supporting rewrite-db option for RBR events
3. Supporting rewrite-db option for SBR events
(Limited to affecting only USE statements)
4. Current status
1. Adding rewrite-db option
---------------------------
1.1. Syntax:
--rewrite-db='db_from->db_to'
1.2. Add 'OPT_REWRITE_DB' to 'options_client' (in client_priv.h).
1.3. In mysqlbinlog.cc:
- Add { "rewrite-db", OPT_REWRITE_DB, ...} record to my_long_options:
- Add Rpl_filter object to mysqlbinlog.cc
Rpl_filter* binlog_filter;
- Add corresponding switch case to get_one_option():
case OPT_REWRITE_DB:
<extract db-from and db-to strings>
binlog_filter->add_db_rewrite(db_from, db_to);
break;
.
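The <extract db-from and db-to strings> step above is left open. A minimal
sketch of one way to split the option value, assuming the plain
'db_from->db_to' form with optional blanks around the arrow.
split_rewrite_rule() is a hypothetical helper name used only for
illustration; it is not a function from the MySQL sources:

```cpp
#include <string>
#include <utility>

// Hypothetical helper: splits "db_from->db_to" into its two parts.
// Returns false if the "->" separator is missing or either part is empty.
static bool split_rewrite_rule(const std::string &arg,
                               std::pair<std::string, std::string> *out)
{
  const std::string::size_type sep = arg.find("->");
  if (sep == std::string::npos)
    return false;

  // Trim blanks around each part so that "a -> b" is accepted too.
  auto trim = [](const std::string &s) {
    const std::string::size_type b = s.find_first_not_of(" \t");
    if (b == std::string::npos)
      return std::string();
    const std::string::size_type e = s.find_last_not_of(" \t");
    return s.substr(b, e - b + 1);
  };

  const std::string from = trim(arg.substr(0, sep));
  const std::string to = trim(arg.substr(sep + 2));
  if (from.empty() || to.empty())
    return false;

  *out = std::make_pair(from, to);
  return true;
}
```

On success the two strings would be handed to
binlog_filter->add_db_rewrite(db_from, db_to); a malformed value would be
reported as an option error.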
Note. To make Rpl_filter usable in a MYSQL_CLIENT context, a few small
additional changes are required:
- In sql_list.cc/h, Sql_alloc::new(size_t) and Sql_alloc::new[](size_t)
use sql_alloc(), which is THD-dependent. These are to be modified
as follows:
#ifdef MYSQL_CLIENT
extern MEM_ROOT sql_list_client_mem_root; // defined in sql_list.cc
#endif
class Sql_alloc
{ ...
static void *operator new(size_t size) throw ()
{
#ifndef MYSQL_CLIENT
return sql_alloc(size);
#else
return alloc_root(&sql_list_client_mem_root, size);
#endif
}
static void *operator new[](size_t size) throw ()
{
#ifndef MYSQL_CLIENT
return sql_alloc(size);
#else
return alloc_root(&sql_list_client_mem_root, size);
#endif
}
...
}
- In rpl_filter.cc:
Rpl_filter::Rpl_filter() :
...
{
#ifdef MYSQL_CLIENT
init_alloc_root(&sql_list_client_mem_root, ...);
#endif
...
}
Rpl_filter::~Rpl_filter()
{ ...
#ifdef MYSQL_CLIENT
free_root(&sql_list_client_mem_root, ...);
#endif
}
2. Supporting rewrite-db for RBR events
---------------------------------------
In the binlog, each row operation event is preceded by Table map event(s) that map
table id(s) to database and table names. So it is enough to support rewriting
the database name in a Table map event.
2.1. Add rewrite_db() member to Table_map_log_event:
int Table_map_log_event::rewrite_db(
const char* new_db,
size_t new_db_len,
const Format_description_log_event* desc)
{
/* 1. In temp_buf member (possibly reallocating it) rewrite
event length, db length, and db parts
2. Change m_dblen and m_dbnam members
*/
}
Comment. This function assumes that the temp_buf member contains the binlog
representation of the Table map event (temp_buf is used for creating the
corresponding BINLOG statement).
2.2. In mysqlbinlog modify corresponding switch case in the
process_event() function:
Exit_status process_event(
PRINT_EVENT_INFO *print_event_info,
Log_event *ev, ...)
{
...
switch (ev_type) {
...
case TABLE_MAP_EVENT:
{
Table_map_log_event *map= ((Table_map_log_event *)ev);
if (shall_skip_database(map->get_db_name()))
{ ...
}
// WL36
size_t new_len= 0;
const char* new_db= binlog_filter->get_rewrite_db(
map->get_db_name(), &new_len);
if (new_len && map->rewrite_db(new_db, new_len,
glob_description_event))
{ error("Could not rewrite database name");
goto err;
}
}
case WRITE_ROWS_EVENT:
case DELETE_ROWS_EVENT:
case UPDATE_ROWS_EVENT:
...
}
...
}
Comment. Rpl_filter::get_rewrite_db(db_from, &len): if the filter contains
a (db_from, db_to) pair, this function returns a pointer to db_to and
sets len to the length of db_to; otherwise, it returns db_from and does not
change the value of len.
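The lookup contract described in this comment can be sketched as follows.
Rewrite_rules is a hypothetical stand-in written for illustration; the real
Rpl_filter keeps its rules in its own list type, and std::map is used here
only to keep the example self-contained:

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for the rewrite part of Rpl_filter.
class Rewrite_rules
{
public:
  void add_db_rewrite(const std::string &from, const std::string &to)
  {
    m_rules[from] = to;
  }

  // On a match, returns db_to and stores its length in *new_len;
  // otherwise returns db_from unchanged and leaves *new_len untouched.
  const char *get_rewrite_db(const char *db_from, size_t *new_len) const
  {
    std::map<std::string, std::string>::const_iterator it =
        m_rules.find(db_from);
    if (it == m_rules.end())
      return db_from;
    *new_len = it->second.length();
    return it->second.c_str();
  }

private:
  std::map<std::string, std::string> m_rules;
};
```

Because len is left untouched when no rule matches, the caller has to
initialise it to 0 and treat a non-zero value as "rule found", which is
what the TABLE_MAP_EVENT case above does with new_len.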
3. Supporting rewrite-db for SBR events
---------------------------------------
Limited to emitting USE <db_to> instead of USE <db_from>.
USE statements can be emitted by mysqlbinlog as a result of processing the
following events (see process_event() function):
- Query_log_event
- Load_log_event
- Execute_load_query_log_event [ :public Query_log_event ]
- Create_file_log_event [ :public Load_log_event ]
TODO. This list needs to be checked carefully (not sure about Create_file_log_event).
Notes.
- In replication, only Query_log_event and Load_log_event use
rpl_filter->get_rewrite_db();
- In mysqlbinlog (process_event), Execute_load_query_log_event
and Create_file_log_event are processed in separate switch
cases, and Load_log_event is processed in the default switch case.
Conditions for emitting a USE statement:
- LOG_EVENT_SUPPRESS_USE_F is OFF for the event
(e.g. it is ON for 'create database' statement)
- event's db name differs from db_name in PRINT_EVENT_INFO
(PRINT_EVENT_INFO keeps db name of the last issued USE statement;
initially, this db name is empty).
3.1. In mysqlbinlog.cc
- Add the following function:
void print_use_stmt(Log_event* event, PRINT_EVENT_INFO* pinfo)
{
if (event->flags & LOG_EVENT_SUPPRESS_USE_F)
return;
/*
- For events listed above get db_from = event->db;
- If db_from is the same as pinfo->db then return;
- If there is rewrite-db rule db_from->db_to,
set db = db_to. Else set db = db_from;
- Print "use <db>" to mysqlbinlog output
- Set pinfo->db = db_from
(this suppresses emitting of USE statements by the corresponding
log_event's print function)
*/
}
- In process_event() function add switch case for Load_log_event and
add print_use_stmt() invocations where needed (according to the
events list above), e.g.:
Exit_status process_event(
PRINT_EVENT_INFO *print_event_info,
Log_event *ev, ...)
{
...
switch (ev_type) {
case QUERY_EVENT:
if (shall_skip_database(((Query_log_event*)ev)->db))
goto end;
if (opt_base64_output_mode == BASE64_OUTPUT_ALWAYS)
{
// Possibly, if there is a rewrite-db rule for ev->db,
// a warning should be emitted here (see note below)
... write_event_header_and_base64(ev, ...) ...
}
else
{
print_use_stmt((Query_log_event*)ev, print_event_info);
ev->print(result_file, print_event_info);
}
break;
...
case LOAD_EVENT:
print_use_stmt((Load_log_event*)ev, print_event_info);
break;
default:
...
}
...
}
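The steps listed in the print_use_stmt() comment above can be sketched as
follows. Event_stub and Print_info_stub are simplified, hypothetical
stand-ins for Log_event and PRINT_EVENT_INFO, the rules map stands in for
the Rpl_filter lookup, and the function returns the text instead of writing
it to the result file. The statement delimiter is simplified as well:

```cpp
#include <map>
#include <string>

// Simplified, hypothetical stand-ins for Log_event and PRINT_EVENT_INFO.
struct Event_stub
{
  bool suppress_use;   // models LOG_EVENT_SUPPRESS_USE_F being set
  std::string db;      // database the event was logged under
};

struct Print_info_stub
{
  std::string db;      // database of the last USE statement issued
};

// Returns the USE statement to print, or an empty string if none is needed.
static std::string build_use_stmt(
    const Event_stub &ev, Print_info_stub *pinfo,
    const std::map<std::string, std::string> &rules)
{
  if (ev.suppress_use)
    return std::string();
  if (ev.db == pinfo->db)
    return std::string();               // already the current database

  // Apply the rewrite rule db_from->db_to if there is one.
  std::map<std::string, std::string>::const_iterator it = rules.find(ev.db);
  const std::string &db = (it != rules.end()) ? it->second : ev.db;

  // Remember the original name, so that the event's own print function
  // sees a match and does not emit a second USE statement.
  pinfo->db = ev.db;
  return "use " + db + ";\n";
}
```

Storing db_from rather than db_to in the print info is the point of the
last step: the comparison in the event's print function is made against the
name recorded in the event, not against the rewritten one.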
Note. write_event_header_and_base64() does not print a USE statement. It
produces a BINLOG statement using the ev->temp_buf content (i.e. the binary
log representation of the event). We don't rewrite temp_buf here with the
db_to name (as we do for the Table map event); this implies the
limitation 3 mentioned above.
Question: Is support for rewrite-db together with --base64-output really
needed currently?
4. Current status
-----------------
The outlined design (implemented for mysql-5.1.37) has been tested on
simple test cases.
TODO. 1. Check the list of events which can emit a USE statement.
2. Support for rewrite-db together with --base64-output?
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Progress (by Bothorsen): Add a mysqlbinlog option to change the used database (36)
by worklog-noreply@askmonty.org 03 Nov '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add a mysqlbinlog option to change the used database
CREATION DATE..: Fri, 07 Aug 2009, 14:57
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 36 (http://askmonty.org/worklog/?tid=36)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 45
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Bothorsen - Tue, 03 Nov 2009, 13:49)=-=-
Sergey and Bo have been working on getting the patch ready, and Alexi has fixed some issues with the
patch.
Worked 15 hours and estimate 0 hours remain (original estimate increased by 15 hours).
-=-=(Bothorsen - Tue, 03 Nov 2009, 13:47)=-=-
Alexi has implemented a patch for this item.
Worked 30 hours and estimate 0 hours remain (original estimate increased by 30 hours).
-=-=(Guest - Tue, 15 Sep 2009, 18:04)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.19322 2009-09-15 18:04:49.000000000 +0300
+++ /tmp/wklog.36.new.19322 2009-09-15 18:04:49.000000000 +0300
@@ -191,7 +191,7 @@
- In process_event() function add switch case for Load_log_event and
add print_use_stmt() invocations where needed (according to the
- events lis above), e.g.:
+ events list above), e.g.:
Exit_status process_event(
PRINT_EVENT_INFO *print_event_info,
-=-=(Guest - Tue, 15 Sep 2009, 15:53)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.13421 2009-09-15 15:53:31.000000000 +0300
+++ /tmp/wklog.36.new.13421 2009-09-15 15:53:31.000000000 +0300
@@ -150,10 +150,17 @@
following events (see process_event() function):
- Query_log_event
-- Execute_load_query_log_event
-- Create_file_log_event
-
-TODO. Needed to check this list requires carefully !!!
+- Load_log_event
+- Execute_load_query_log_event [ :public Query_log_event ]
+- Create_file_log_event [ :public Load_log_event ]
+
+TODO. Needed to check this list carefully (not sure for Create_file_log_event)
+ Notes.
+ - In replication, only Query_log_event and Load_log_event uses
+ rpl_filter->get_rewrite_db();
+ - In mysqlbinlog (process_event), Execute_load_query_log_event
+ and Create_file_log_event are processed in separate switch
+ cases. And Load_log_event is processed in the default switch case.
Conditions for emiting use-statement:
- LOG_EVENT_SUPPRESS_USE_F is OFF for the event
@@ -182,8 +189,9 @@
*/
}
-- In process_event() function add print_use_stmt() invocations where
- needed (according to the events lis above), e.g.:
+- In process_event() function add switch case for Load_log_event and
+ add print_use_stmt() invocations where needed (according to the
+ events lis above), e.g.:
Exit_status process_event(
PRINT_EVENT_INFO *print_event_info,
@@ -207,6 +215,11 @@
}
break;
...
+ case LOAD_EVENT:
+ print_use_stmt((Load_log_event*)ev, print_event_info);
+ break;
+ default:
+ ...
}
...
}
-=-=(Guest - Tue, 15 Sep 2009, 12:12)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.3961 2009-09-15 12:12:26.000000000 +0300
+++ /tmp/wklog.36.new.3961 2009-09-15 12:12:26.000000000 +0300
@@ -144,6 +144,8 @@
3. Supporting rewrite-db for SBR events
---------------------------------------
+Limited to emiting USE <db_to> instead of USE <db_from>.
+
USE statements can be emited by mysqlbinlog as a result of processing the
following events (see process_event() function):
-=-=(Guest - Tue, 15 Sep 2009, 12:08)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.3794 2009-09-15 12:08:54.000000000 +0300
+++ /tmp/wklog.36.new.3794 2009-09-15 12:08:54.000000000 +0300
@@ -1 +1,229 @@
+Content
+-------
+1. Adding rewrite-db option
+2. Supporting rewrite-db option for RBR events
+3. Supporting rewrite-db option for SBR events
+ (Limited to affecting only USE statements)
+4. Current status
+
+1. Adding rewrite-db option
+---------------------------
+
+1.1. Syntax:
+ --rewrite-db='db_from->db_to'
+
+1.2. Add 'OPT_REWRITE_DB' to 'options_client' (in client_priv.h).
+
+1.3. In mysqlbinlog.cc:
+
+- Add { "rewrite-db", OPT_REWRITE_DB, ...} record to my_long_options:
+- Add Rpl_filter object to mysqlbinlog.cc
+
+ Rpl_filter* binlog_filter;
+
+- Add corresponding switch case to get_one_option():
+
+ case OPT_REWRITE_DB:
+ <extract db-from and db-to strings>
+ binlog_filter->add_db_rewrite(db_from, db_to);
+ break;
+ .
+Note. To make Rpl_filter usable in a MYSQL_CLIENT context, few small
+additional changes are required:
+
+- In sql_list.cc/h, Sql_alloc::new(size_t) and Sql_alloc::new[](size_t)
+ uses sql_alloc() which is THD dependent. These are to be modified
+ as follows:
+
+ #ifdef MYSQL_CLIENT
+ extern MEM_ROOT sql_list_client_mem_root; // defined in sql_list.cc
+ #endif
+
+ class Sql_alloc
+ { ...
+ static void *operator new(size_t size) throw ()
+ {
+ #ifndef MYSQL_CLIENT
+ return sql_alloc(size);
+ #else
+ return alloc_root(&sql_list_client_mem_root, size);
+ #endif
+ }
+ static void *operator new[](size_t size) throw ()
+ {
+ #ifndef MYSQL_CLIENT
+ return sql_alloc(size);
+ #else
+ return alloc_root(&sql_list_client_mem_root, size);
+ #endif
+ }
+ ...
+ }
+
+- In rpl_filter.cc:
+
+ Rpl_filter::Rpl_filter() :
+ ...
+ {
+ #ifdef MYSQL_CLIENT
+ init_alloc_root(&sql_list_client_mem_root, ...);
+ #endif
+ ...
+ }
+
+ Rpl_filter::~Rpl_filter()
+ { ...
+ #ifdef MYSQL_CLIENT
+ free_root(&sql_list_client_mem_root, ...);
+ #endif
+ }
+
+2. Supporting rewrite-db for RBR events
+---------------------------------------
+
+In binlog, each row operation event is preceded by Table map event(s) which maps
+table id(s) to database and table names. So, it's enough to support rewriting
+database name in a Table map.
+
+2.1. Add rewrite_db() member to Table_map_log_event:
+
+ int Table_map_log_event::rewrite_db(
+ const char* new_db,
+ size_t new_db_len,
+ const Format_description_log_event* desc)
+ {
+ /* 1. In temp_buf member (possibly reallocating it) rewrite
+ event length, db length, and db parts
+ 2. Change m_dblen and m_dbnam members
+ */
+ }
+
+Comment. This function assumes that temp_buf member contains Table map
+binlog representaion (temp_buf is used for creating corresponding
+BINLOG statement).
+
+2.2. In mysqlbinlog modify corresponding switch case in the
+process_event() function:
+
+ Exit_status process_event(
+ PRINT_EVENT_INFO *print_event_info,
+ Log_event *ev, ...)
+ {
+ ...
+ switch (ev_type) {
+ ...
+ case TABLE_MAP_EVENT:
+ {
+ Table_map_log_event *map= ((Table_map_log_event *)ev);
+ if (shall_skip_database(map->get_db_name()))
+ { ...
+ }
+ // WL36
+ size_t new_len= 0;
+ const char* new_db= binlog_filter->get_rewrite_db(
+ map->get_db_name(), &new_len);
+ if (new_len && map->rewrite_db(new_db, new_len,
+ glob_description_event))
+ { error("Could not rewrite database name");
+ goto err;
+ }
+ }
+ case WRITE_ROWS_EVENT:
+ case DELETE_ROWS_EVENT:
+ case UPDATE_ROWS_EVENT:
+ ...
+ }
+ ...
+ }
+
+Comment. Rpl_filter::get_rewrite_db(db_from, &len): if filter contains
+a (db_from, db_to) pair, this function returns pointer to db_to and
+sets len = db_to length; otherwise, it returns db_from and does not
+change len value.
+
+3. Supporting rewrite-db for SBR events
+---------------------------------------
+
+USE statements can be emited by mysqlbinlog as a result of processing the
+following events (see process_event() function):
+
+- Query_log_event
+- Execute_load_query_log_event
+- Create_file_log_event
+
+TODO. Needed to check this list requires carefully !!!
+
+Conditions for emiting use-statement:
+- LOG_EVENT_SUPPRESS_USE_F is OFF for the event
+ (e.g. it is ON for 'create database' statement)
+- event's db name differs from db_name in PRINT_EVENT_INFO
+ (PRINT_EVENT_INFO keeps db name of the last issued USE statement;
+ initially, this db name is empty).
+
+3.1. In mysqlbinlog.cc
+
+- Add the following function:
+
+ void print_use_stmt(Log_event* event, PRINT_EVENT_INFO* pinfo)
+ {
+ if (event->flags & LOG_EVENT_SUPPRESS_USE_F)
+ return;
+ /*
+ - For events listed above get db_from = event->db;
+ - If db_from is the same as pinfo->db then return;
+ - If there is rewrite-db rule db_from->db_to,
+ set db = db_to. Else set db = db_from;
+ - Print "use <db>" to mysqlbinlog output
+ - Set pinfo->db = db_from
+ (this suppresses emiting use-statements by corresponding
+ log_event's print-function)
+ */
+ }
+
+- In process_event() function add print_use_stmt() invocations where
+ needed (according to the events lis above), e.g.:
+
+ Exit_status process_event(
+ PRINT_EVENT_INFO *print_event_info,
+ Log_event *ev, ...)
+ {
+ ...
+ switch (ev_type) {
+ case QUERY_EVENT:
+ if (shall_skip_database(((Query_log_event*)ev)->db))
+ goto end;
+ if (opt_base64_output_mode == BASE64_OUTPUT_ALWAYS)
+ {
+ // Possibly in case of rewite-db rule for ev->db
+ // a warning should be emited here (see note below)
+ ... write_event_header_and_base64(ev, ...) ...
+ }
+ else
+ {
+ print_use_stmt((Query_log_event*)ev, print_event_info);
+ ev->print(result_file, print_event_info);
+ }
+ break;
+ ...
+ }
+ ...
+ }
+
+Note. write_event_header_and_base64() does not print use-statement. It
+produces BINLOG statement using ev->temp_buf content (i.e. the binary
+log representation of the event). We don't rewrite temp_buf here with
+db_to name (as we do it for Table map event) - this implies the
+limitation 3 mentioned above.
+Question: Is supporting of rewite_db + --base64-output really needed
+currently?
+
+4. Current status
+-----------------
+
+The outlined design (implemented for mysql-5.1.37) is tested for
+simple test-cases.
+
+TODO. 1. Check list of events which can emit use-statement.
+ 2. Supporting of rewite_db + --base64-output ?
+
-=-=(Guest - Mon, 14 Sep 2009, 11:51)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.9711 2009-09-14 11:51:43.000000000 +0300
+++ /tmp/wklog.36.new.9711 2009-09-14 11:51:43.000000000 +0300
@@ -1 +1 @@
-Pay no attention: just check for having access
+
-=-=(Guest - Mon, 14 Sep 2009, 11:51)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.9678 2009-09-14 11:51:28.000000000 +0300
+++ /tmp/wklog.36.new.9678 2009-09-14 11:51:28.000000000 +0300
@@ -1 +1 @@
-
+Pay no attention: just check for having access
-=-=(Knielsen - Mon, 17 Aug 2009, 12:44)=-=-
High-Level Specification modified.
--- /tmp/wklog.36.old.7834 2009-08-17 12:44:17.000000000 +0300
+++ /tmp/wklog.36.new.7834 2009-08-17 12:44:17.000000000 +0300
@@ -13,7 +13,9 @@
statement refers to tables in current database, so that changing the current
database will make the statement to work on a table in a different database).
-See also MySQL BUG#42941.
+See also MySQL BUG#42941. Note this bug is fixed in MySQL 5.1.37, which is not
+merged into MariaDB at the time of writing, but planned to be merged before
+release.
What we could do
----------------
-=-=(Guest - Sun, 16 Aug 2009, 17:11)=-=-
High-Level Specification modified.
--- /tmp/wklog.36.old.27162 2009-08-16 17:11:12.000000000 +0300
+++ /tmp/wklog.36.new.27162 2009-08-16 17:11:12.000000000 +0300
@@ -13,6 +13,8 @@
statement refers to tables in current database, so that changing the current
database will make the statement to work on a table in a different database).
+See also MySQL BUG#42941.
+
What we could do
----------------
------------------------------------------------------------
-=-=(View All Progress Notes, 15 total)=-=-
http://askmonty.org/worklog/index.pl?tid=36&nolimit=1
DESCRIPTION:
Sometimes there is a need to take a binary log and apply it to a database with
a different name than the original name of the database on the binlog producer.
With statement-based replication, one can achieve this by grepping the
"USE dbname" statements out of the output of mysqlbinlog(*). With
row-based replication this is no longer possible, as the database name is encoded
within the BINLOG '....' statement.
This task is about adding an option to mysqlbinlog that allows changing
the names of the databases used in both RBR and SBR events.
(*) This assumes that all statements refer to tables in the current database
and does not catch updates made inside stored functions and so forth, but it
still works for a practically important subset of cases.
HIGH-LEVEL SPECIFICATION:
Context
-------
(See http://askmonty.org/wiki/index.php/Scratch/ReplicationOptions for global
overview)
At the moment, the server has a replication slave option
--replicate-rewrite-db="from->to"
the option affects
- Table_map_log_event (all RBR events)
- Load_log_event (LOAD DATA)
- Query_log_event (SBR-based updates, with the usual assumption that the
statement refers to tables in the current database, so that changing the current
database makes the statement work on a table in a different database).
See also MySQL BUG#42941. Note this bug is fixed in MySQL 5.1.37, which is not
merged into MariaDB at the time of writing, but planned to be merged before
release.
What we could do
----------------
Option1: make mysqlbinlog accept --replicate-rewrite-db option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Make mysqlbinlog accept --replicate-rewrite-db options and process them to the
same extent as replication slave would process --replicate-rewrite-db option.
Option2: Add database-agnostic RBR events and --strip-db option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Right now RBR events require a database name. It is not possible to have an RBR
event stream that does not mention which database the events are for. When I
used a debugger to specify an empty database name, the attempt to apply the
binlog resulted in this error:
090809 17:38:44 [ERROR] Slave SQL: Error 'Table '.tablename' doesn't exist' on
opening tables,
We could do as follows:
- Make the server interpret an empty database name in an RBR event (i.e. in a
Table_map_log_event) as "use current database". The binlog slave thread
probably should not allow such events, as it doesn't have a natural current
database.
- Add a mysqlbinlog --strip-db option that would
= not produce any "USE dbname" statements
= change databasename for all RBR events to be empty
That way, mysqlbinlog output will be database-agnostic and apply to the
current database.
(this will have the usual limitations that we assume that all statements in
the binlog refer to the current database).
Option3: Enhance database rewrite
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If there is a need to support database change for statements that use
dbname.tablename notation and are replicated as statements (i.e. are DDL
statements and/or DML statements that are binlogged as statements),
then that could be supported as follows:
- Make the server's parser recognize special form of comments
/* !database-alias(oldname,newname) */
and save the mapping somewhere
- Put the hooks in table open and name resolution code to use the saved
mapping.
Once we've done the above, it will be easy to perform a complete database
name change in the binary log, without compromises or restrictions.
It will be possible to do the rewrites either on the slave
(--replicate-rewrite-db will work for all kinds of statements) or in
mysqlbinlog (adding a comment is easy and doesn't require mysqlbinlog to
parse the statement).
LOW-LEVEL DESIGN:
Content
-------
1. Adding rewrite-db option
2. Supporting rewrite-db option for RBR events
3. Supporting rewrite-db option for SBR events
(Limited to affecting only USE statements)
4. Current status
1. Adding rewrite-db option
---------------------------
1.1. Syntax:
--rewrite-db='db_from->db_to'
1.2. Add 'OPT_REWRITE_DB' to 'options_client' (in client_priv.h).
1.3. In mysqlbinlog.cc:
- Add { "rewrite-db", OPT_REWRITE_DB, ...} record to my_long_options:
- Add Rpl_filter object to mysqlbinlog.cc
Rpl_filter* binlog_filter;
- Add corresponding switch case to get_one_option():
case OPT_REWRITE_DB:
<extract db-from and db-to strings>
binlog_filter->add_db_rewrite(db_from, db_to);
break;
.
Note. To make Rpl_filter usable in a MYSQL_CLIENT context, a few small
additional changes are required:
- In sql_list.cc/h, Sql_alloc::new(size_t) and Sql_alloc::new[](size_t)
use sql_alloc(), which is THD-dependent. These are to be modified
as follows:
#ifdef MYSQL_CLIENT
extern MEM_ROOT sql_list_client_mem_root; // defined in sql_list.cc
#endif
class Sql_alloc
{ ...
static void *operator new(size_t size) throw ()
{
#ifndef MYSQL_CLIENT
return sql_alloc(size);
#else
return alloc_root(&sql_list_client_mem_root, size);
#endif
}
static void *operator new[](size_t size) throw ()
{
#ifndef MYSQL_CLIENT
return sql_alloc(size);
#else
return alloc_root(&sql_list_client_mem_root, size);
#endif
}
...
}
- In rpl_filter.cc:
Rpl_filter::Rpl_filter() :
...
{
#ifdef MYSQL_CLIENT
init_alloc_root(&sql_list_client_mem_root, ...);
#endif
...
}
Rpl_filter::~Rpl_filter()
{ ...
#ifdef MYSQL_CLIENT
free_root(&sql_list_client_mem_root, ...);
#endif
}
2. Supporting rewrite-db for RBR events
---------------------------------------
In the binlog, each row operation event is preceded by Table map event(s) that map
table id(s) to database and table names. So it is enough to support rewriting
the database name in a Table map event.
2.1. Add rewrite_db() member to Table_map_log_event:
int Table_map_log_event::rewrite_db(
const char* new_db,
size_t new_db_len,
const Format_description_log_event* desc)
{
/* 1. In temp_buf member (possibly reallocating it) rewrite
event length, db length, and db parts
2. Change m_dblen and m_dbnam members
*/
}
Comment. This function assumes that the temp_buf member contains the binlog
representation of the Table map event (temp_buf is used for creating the
corresponding BINLOG statement).
2.2. In mysqlbinlog modify corresponding switch case in the
process_event() function:
Exit_status process_event(
PRINT_EVENT_INFO *print_event_info,
Log_event *ev, ...)
{
...
switch (ev_type) {
...
case TABLE_MAP_EVENT:
{
Table_map_log_event *map= ((Table_map_log_event *)ev);
if (shall_skip_database(map->get_db_name()))
{ ...
}
// WL36
size_t new_len= 0;
const char* new_db= binlog_filter->get_rewrite_db(
map->get_db_name(), &new_len);
if (new_len && map->rewrite_db(new_db, new_len,
glob_description_event))
{ error("Could not rewrite database name");
goto err;
}
}
case WRITE_ROWS_EVENT:
case DELETE_ROWS_EVENT:
case UPDATE_ROWS_EVENT:
...
}
...
}
Comment. Rpl_filter::get_rewrite_db(db_from, &len): if the filter contains
a (db_from, db_to) pair, this function returns a pointer to db_to and
sets len to the length of db_to; otherwise, it returns db_from and does not
change the value of len.
3. Supporting rewrite-db for SBR events
---------------------------------------
Limited to emitting USE <db_to> instead of USE <db_from>.
USE statements can be emitted by mysqlbinlog as a result of processing the
following events (see process_event() function):
- Query_log_event
- Load_log_event
- Execute_load_query_log_event [ :public Query_log_event ]
- Create_file_log_event [ :public Load_log_event ]
TODO. This list needs to be checked carefully (not sure about Create_file_log_event)
Notes.
- In replication, only Query_log_event and Load_log_event use
rpl_filter->get_rewrite_db();
- In mysqlbinlog (process_event), Execute_load_query_log_event
and Create_file_log_event are processed in separate switch
cases. And Load_log_event is processed in the default switch case.
Conditions for emitting a use-statement:
- LOG_EVENT_SUPPRESS_USE_F is OFF for the event
(e.g. it is ON for 'create database' statement)
- event's db name differs from db_name in PRINT_EVENT_INFO
(PRINT_EVENT_INFO keeps db name of the last issued USE statement;
initially, this db name is empty).
3.1. In mysqlbinlog.cc
- Add the following function:
void print_use_stmt(Log_event* event, PRINT_EVENT_INFO* pinfo)
{
if (event->flags & LOG_EVENT_SUPPRESS_USE_F)
return;
/*
- For events listed above get db_from = event->db;
- If db_from is the same as pinfo->db then return;
- If there is rewrite-db rule db_from->db_to,
set db = db_to. Else set db = db_from;
- Print "use <db>" to mysqlbinlog output
- Set pinfo->db = db_from
(this suppresses emitting use-statements by the corresponding
log_event's print-function)
*/
}
- In process_event() function add switch case for Load_log_event and
add print_use_stmt() invocations where needed (according to the
events list above), e.g.:
Exit_status process_event(
PRINT_EVENT_INFO *print_event_info,
Log_event *ev, ...)
{
...
switch (ev_type) {
case QUERY_EVENT:
if (shall_skip_database(((Query_log_event*)ev)->db))
goto end;
if (opt_base64_output_mode == BASE64_OUTPUT_ALWAYS)
{
// Possibly, if there is a rewrite-db rule for ev->db,
// a warning should be emitted here (see note below)
... write_event_header_and_base64(ev, ...) ...
}
else
{
print_use_stmt((Query_log_event*)ev, print_event_info);
ev->print(result_file, print_event_info);
}
break;
...
case LOAD_EVENT:
print_use_stmt((Load_log_event*)ev, print_event_info);
break;
default:
...
}
...
}
Note. write_event_header_and_base64() does not print a use-statement. It
produces a BINLOG statement using the ev->temp_buf content (i.e. the binary
log representation of the event). We don't rewrite temp_buf here with the
db_to name (as we do for the Table map event) - this implies the
limitation 3 mentioned above.
Question: Is support for rewrite-db together with --base64-output really
needed currently?
4. Current status
-----------------
The outlined design (implemented for mysql-5.1.37) is tested for
simple test-cases.
TODO. 1. Check the list of events which can emit a use-statement.
2. Support for rewrite-db + --base64-output ?
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Progress (by Bothorsen): Add a mysqlbinlog option to change the used database (36)
by worklog-noreply@askmonty.org 03 Nov '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add a mysqlbinlog option to change the used database
CREATION DATE..: Fri, 07 Aug 2009, 14:57
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 36 (http://askmonty.org/worklog/?tid=36)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 45
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Bothorsen - Tue, 03 Nov 2009, 13:49)=-=-
Sergey and Bo have been working on getting the patch ready, and Alexi has fixed some issues with the
patch.
Worked 15 hours and estimate 0 hours remain (original estimate increased by 15 hours).
-=-=(Bothorsen - Tue, 03 Nov 2009, 13:47)=-=-
Alexi has implemented a patch for this item.
Worked 30 hours and estimate 0 hours remain (original estimate increased by 30 hours).
-=-=(Guest - Tue, 15 Sep 2009, 18:04)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.19322 2009-09-15 18:04:49.000000000 +0300
+++ /tmp/wklog.36.new.19322 2009-09-15 18:04:49.000000000 +0300
@@ -191,7 +191,7 @@
- In process_event() function add switch case for Load_log_event and
add print_use_stmt() invocations where needed (according to the
- events lis above), e.g.:
+ events list above), e.g.:
Exit_status process_event(
PRINT_EVENT_INFO *print_event_info,
-=-=(Guest - Tue, 15 Sep 2009, 15:53)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.13421 2009-09-15 15:53:31.000000000 +0300
+++ /tmp/wklog.36.new.13421 2009-09-15 15:53:31.000000000 +0300
@@ -150,10 +150,17 @@
following events (see process_event() function):
- Query_log_event
-- Execute_load_query_log_event
-- Create_file_log_event
-
-TODO. Needed to check this list requires carefully !!!
+- Load_log_event
+- Execute_load_query_log_event [ :public Query_log_event ]
+- Create_file_log_event [ :public Load_log_event ]
+
+TODO. Needed to check this list carefully (not sure for Create_file_log_event)
+ Notes.
+ - In replication, only Query_log_event and Load_log_event uses
+ rpl_filter->get_rewrite_db();
+ - In mysqlbinlog (process_event), Execute_load_query_log_event
+ and Create_file_log_event are processed in separate switch
+ cases. And Load_log_event is processed in the default switch case.
Conditions for emiting use-statement:
- LOG_EVENT_SUPPRESS_USE_F is OFF for the event
@@ -182,8 +189,9 @@
*/
}
-- In process_event() function add print_use_stmt() invocations where
- needed (according to the events lis above), e.g.:
+- In process_event() function add switch case for Load_log_event and
+ add print_use_stmt() invocations where needed (according to the
+ events lis above), e.g.:
Exit_status process_event(
PRINT_EVENT_INFO *print_event_info,
@@ -207,6 +215,11 @@
}
break;
...
+ case LOAD_EVENT:
+ print_use_stmt((Load_log_event*)ev, print_event_info);
+ break;
+ default:
+ ...
}
...
}
-=-=(Guest - Tue, 15 Sep 2009, 12:12)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.3961 2009-09-15 12:12:26.000000000 +0300
+++ /tmp/wklog.36.new.3961 2009-09-15 12:12:26.000000000 +0300
@@ -144,6 +144,8 @@
3. Supporting rewrite-db for SBR events
---------------------------------------
+Limited to emiting USE <db_to> instead of USE <db_from>.
+
USE statements can be emited by mysqlbinlog as a result of processing the
following events (see process_event() function):
-=-=(Guest - Tue, 15 Sep 2009, 12:08)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.3794 2009-09-15 12:08:54.000000000 +0300
+++ /tmp/wklog.36.new.3794 2009-09-15 12:08:54.000000000 +0300
@@ -1 +1,229 @@
+Content
+-------
+1. Adding rewrite-db option
+2. Supporting rewrite-db option for RBR events
+3. Supporting rewrite-db option for SBR events
+ (Limited to affecting only USE statements)
+4. Current status
+
+1. Adding rewrite-db option
+---------------------------
+
+1.1. Syntax:
+ --rewrite-db='db_from->db_to'
+
+1.2. Add 'OPT_REWRITE_DB' to 'options_client' (in client_priv.h).
+
+1.3. In mysqlbinlog.cc:
+
+- Add { "rewrite-db", OPT_REWRITE_DB, ...} record to my_long_options:
+- Add Rpl_filter object to mysqlbinlog.cc
+
+ Rpl_filter* binlog_filter;
+
+- Add corresponding switch case to get_one_option():
+
+ case OPT_REWRITE_DB:
+ <extract db-from and db-to strings>
+ binlog_filter->add_db_rewrite(db_from, db_to);
+ break;
+ .
+Note. To make Rpl_filter usable in a MYSQL_CLIENT context, few small
+additional changes are required:
+
+- In sql_list.cc/h, Sql_alloc::new(size_t) and Sql_alloc::new[](size_t)
+ uses sql_alloc() which is THD dependent. These are to be modified
+ as follows:
+
+ #ifdef MYSQL_CLIENT
+ extern MEM_ROOT sql_list_client_mem_root; // defined in sql_list.cc
+ #endif
+
+ class Sql_alloc
+ { ...
+ static void *operator new(size_t size) throw ()
+ {
+ #ifndef MYSQL_CLIENT
+ return sql_alloc(size);
+ #else
+ return alloc_root(&sql_list_client_mem_root, size);
+ #endif
+ }
+ static void *operator new[](size_t size) throw ()
+ {
+ #ifndef MYSQL_CLIENT
+ return sql_alloc(size);
+ #else
+ return alloc_root(&sql_list_client_mem_root, size);
+ #endif
+ }
+ ...
+ }
+
+- In rpl_filter.cc:
+
+ Rpl_filter::Rpl_filter() :
+ ...
+ {
+ #ifdef MYSQL_CLIENT
+ init_alloc_root(&sql_list_client_mem_root, ...);
+ #endif
+ ...
+ }
+
+ Rpl_filter::~Rpl_filter()
+ { ...
+ #ifdef MYSQL_CLIENT
+ free_root(&sql_list_client_mem_root, ...);
+ #endif
+ }
+
+2. Supporting rewrite-db for RBR events
+---------------------------------------
+
+In binlog, each row operation event is preceded by Table map event(s) which maps
+table id(s) to database and table names. So, it's enough to support rewriting
+database name in a Table map.
+
+2.1. Add rewrite_db() member to Table_map_log_event:
+
+ int Table_map_log_event::rewrite_db(
+ const char* new_db,
+ size_t new_db_len,
+ const Format_description_log_event* desc)
+ {
+ /* 1. In temp_buf member (possibly reallocating it) rewrite
+ event length, db length, and db parts
+ 2. Change m_dblen and m_dbnam members
+ */
+ }
+
+Comment. This function assumes that temp_buf member contains Table map
+binlog representaion (temp_buf is used for creating corresponding
+BINLOG statement).
+
+2.2. In mysqlbinlog modify corresponding switch case in the
+process_event() function:
+
+ Exit_status process_event(
+ PRINT_EVENT_INFO *print_event_info,
+ Log_event *ev, ...)
+ {
+ ...
+ switch (ev_type) {
+ ...
+ case TABLE_MAP_EVENT:
+ {
+ Table_map_log_event *map= ((Table_map_log_event *)ev);
+ if (shall_skip_database(map->get_db_name()))
+ { ...
+ }
+ // WL36
+ size_t new_len= 0;
+ const char* new_db= binlog_filter->get_rewrite_db(
+ map->get_db_name(), &new_len);
+ if (new_len && map->rewrite_db(new_db, new_len,
+ glob_description_event))
+ { error("Could not rewrite database name");
+ goto err;
+ }
+ }
+ case WRITE_ROWS_EVENT:
+ case DELETE_ROWS_EVENT:
+ case UPDATE_ROWS_EVENT:
+ ...
+ }
+ ...
+ }
+
+Comment. Rpl_filter::get_rewrite_db(db_from, &len): if filter contains
+a (db_from, db_to) pair, this function returns pointer to db_to and
+sets len = db_to length; otherwise, it returns db_from and does not
+change len value.
+
+3. Supporting rewrite-db for SBR events
+---------------------------------------
+
+USE statements can be emited by mysqlbinlog as a result of processing the
+following events (see process_event() function):
+
+- Query_log_event
+- Execute_load_query_log_event
+- Create_file_log_event
+
+TODO. Needed to check this list requires carefully !!!
+
+Conditions for emiting use-statement:
+- LOG_EVENT_SUPPRESS_USE_F is OFF for the event
+ (e.g. it is ON for 'create database' statement)
+- event's db name differs from db_name in PRINT_EVENT_INFO
+ (PRINT_EVENT_INFO keeps db name of the last issued USE statement;
+ initially, this db name is empty).
+
+3.1. In mysqlbinlog.cc
+
+- Add the following function:
+
+ void print_use_stmt(Log_event* event, PRINT_EVENT_INFO* pinfo)
+ {
+ if (event->flags & LOG_EVENT_SUPPRESS_USE_F)
+ return;
+ /*
+ - For events listed above get db_from = event->db;
+ - If db_from is the same as pinfo->db then return;
+ - If there is rewrite-db rule db_from->db_to,
+ set db = db_to. Else set db = db_from;
+ - Print "use <db>" to mysqlbinlog output
+ - Set pinfo->db = db_from
+ (this suppresses emiting use-statements by corresponding
+ log_event's print-function)
+ */
+ }
+
+- In process_event() function add print_use_stmt() invocations where
+ needed (according to the events lis above), e.g.:
+
+ Exit_status process_event(
+ PRINT_EVENT_INFO *print_event_info,
+ Log_event *ev, ...)
+ {
+ ...
+ switch (ev_type) {
+ case QUERY_EVENT:
+ if (shall_skip_database(((Query_log_event*)ev)->db))
+ goto end;
+ if (opt_base64_output_mode == BASE64_OUTPUT_ALWAYS)
+ {
+ // Possibly in case of rewite-db rule for ev->db
+ // a warning should be emited here (see note below)
+ ... write_event_header_and_base64(ev, ...) ...
+ }
+ else
+ {
+ print_use_stmt((Query_log_event*)ev, print_event_info);
+ ev->print(result_file, print_event_info);
+ }
+ break;
+ ...
+ }
+ ...
+ }
+
+Note. write_event_header_and_base64() does not print use-statement. It
+produces BINLOG statement using ev->temp_buf content (i.e. the binary
+log representation of the event). We don't rewrite temp_buf here with
+db_to name (as we do it for Table map event) - this implies the
+limitation 3 mentioned above.
+Question: Is supporting of rewite_db + --base64-output really needed
+currently?
+
+4. Current status
+-----------------
+
+The outlined design (implemented for mysql-5.1.37) is tested for
+simple test-cases.
+
+TODO. 1. Check list of events which can emit use-statement.
+ 2. Supporting of rewite_db + --base64-output ?
+
-=-=(Guest - Mon, 14 Sep 2009, 11:51)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.9711 2009-09-14 11:51:43.000000000 +0300
+++ /tmp/wklog.36.new.9711 2009-09-14 11:51:43.000000000 +0300
@@ -1 +1 @@
-Pay no attention: just check for having access
+
-=-=(Guest - Mon, 14 Sep 2009, 11:51)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.9678 2009-09-14 11:51:28.000000000 +0300
+++ /tmp/wklog.36.new.9678 2009-09-14 11:51:28.000000000 +0300
@@ -1 +1 @@
-
+Pay no attention: just check for having access
-=-=(Knielsen - Mon, 17 Aug 2009, 12:44)=-=-
High-Level Specification modified.
--- /tmp/wklog.36.old.7834 2009-08-17 12:44:17.000000000 +0300
+++ /tmp/wklog.36.new.7834 2009-08-17 12:44:17.000000000 +0300
@@ -13,7 +13,9 @@
statement refers to tables in current database, so that changing the current
database will make the statement to work on a table in a different database).
-See also MySQL BUG#42941.
+See also MySQL BUG#42941. Note this bug is fixed in MySQL 5.1.37, which is not
+merged into MariaDB at the time of writing, but planned to be merged before
+release.
What we could do
----------------
-=-=(Guest - Sun, 16 Aug 2009, 17:11)=-=-
High-Level Specification modified.
--- /tmp/wklog.36.old.27162 2009-08-16 17:11:12.000000000 +0300
+++ /tmp/wklog.36.new.27162 2009-08-16 17:11:12.000000000 +0300
@@ -13,6 +13,8 @@
statement refers to tables in current database, so that changing the current
database will make the statement to work on a table in a different database).
+See also MySQL BUG#42941.
+
What we could do
----------------
------------------------------------------------------------
-=-=(View All Progress Notes, 15 total)=-=-
http://askmonty.org/worklog/index.pl?tid=36&nolimit=1
DESCRIPTION:
Sometimes there is a need to take a binary log and apply it to a database with
a different name than the original name of the database on the binlog producer.
With statement-based replication, one can achieve this by grepping the
"USE dbname" statements out of the output of mysqlbinlog(*). With
row-based replication this is no longer possible, as the database name is encoded
within the BINLOG '....' statement.
This task is about adding an option to mysqlbinlog that would allow changing
the names of used databases in both RBR and SBR events.
(*) This assumes that all statements refer to tables in the current database
and does not catch updates made inside stored functions and so forth, but it
still works for a practically important subset of cases.
HIGH-LEVEL SPECIFICATION:
Context
-------
(See http://askmonty.org/wiki/index.php/Scratch/ReplicationOptions for global
overview)
At the moment, the server has a replication slave option
--replicate-rewrite-db="from->to"
the option affects
- Table_map_log_event (all RBR events)
- Load_log_event (LOAD DATA)
- Query_log_event (SBR-based updates, with the usual assumption that the
statement refers to tables in the current database, so that changing the current
database will make the statement work on a table in a different database).
See also MySQL BUG#42941. Note this bug is fixed in MySQL 5.1.37, which is not
merged into MariaDB at the time of writing, but planned to be merged before
release.
What we could do
----------------
Option1: make mysqlbinlog accept --replicate-rewrite-db option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Make mysqlbinlog accept --replicate-rewrite-db options and process them to the
same extent as replication slave would process --replicate-rewrite-db option.
Option2: Add database-agnostic RBR events and --strip-db option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Right now RBR events require a database name. It is not possible to have an RBR
event stream that does not mention which database the events are for. When I
used a debugger to specify an empty database name, the attempt to apply the
binlog resulted in this error:
090809 17:38:44 [ERROR] Slave SQL: Error 'Table '.tablename' doesn't exist' on
opening tables,
We could do as follows:
- Make the server interpret empty database name in RBR event (i.e. in a
Table_map_log_event) as "use current database". Binlog slave thread
probably should not allow such events as it doesn't have a natural current
database.
- Add a mysqlbinlog --strip-db option that would
= not produce any "USE dbname" statements
= change the database name for all RBR events to be empty
That way, mysqlbinlog output will be database-agnostic and apply to the
current database.
(this will have the usual limitations that we assume that all statements in
the binlog refer to the current database).
Option3: Enhance database rewrite
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If there is a need to support database change for statements that use
dbname.tablename notation and are replicated as statements (i.e. are DDL
statements and/or DML statements that are binlogged as statements),
then that could be supported as follows:
- Make the server's parser recognize special form of comments
/* !database-alias(oldname,newname) */
and save the mapping somewhere
- Put the hooks in table open and name resolution code to use the saved
mapping.
Once we've done the above, it will be easy to perform a complete database
name change in the binary log, without compromises or restrictions.
It will be possible to do the rewrites either on the slave (
--replicate-rewrite-db will work for all kinds of statements), or in
mysqlbinlog (adding a comment is easy and doesn't require mysqlbinlog to
parse the statement).
LOW-LEVEL DESIGN:
Content
-------
1. Adding rewrite-db option
2. Supporting rewrite-db option for RBR events
3. Supporting rewrite-db option for SBR events
(Limited to affecting only USE statements)
4. Current status
1. Adding rewrite-db option
---------------------------
1.1. Syntax:
--rewrite-db='db_from->db_to'
1.2. Add 'OPT_REWRITE_DB' to 'options_client' (in client_priv.h).
1.3. In mysqlbinlog.cc:
- Add a { "rewrite-db", OPT_REWRITE_DB, ...} record to my_long_options
- Add an Rpl_filter object to mysqlbinlog.cc:
Rpl_filter* binlog_filter;
- Add corresponding switch case to get_one_option():
case OPT_REWRITE_DB:
<extract db-from and db-to strings>
binlog_filter->add_db_rewrite(db_from, db_to);
break;
.
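The step that extracts the db-from and db-to strings in the switch case above could look roughly like the sketch below. It is not taken from the patch: parse_rewrite_rule() and trim_spaces() are hypothetical names, and trimming blanks around the two names is an assumption about the desired behaviour.

```cpp
#include <string>

// Hypothetical helper: strip leading and trailing blanks and tabs.
static std::string trim_spaces(const std::string &s)
{
  const std::string::size_type b = s.find_first_not_of(" \t");
  if (b == std::string::npos)
    return std::string();
  const std::string::size_type e = s.find_last_not_of(" \t");
  return s.substr(b, e - b + 1);
}

// Hypothetical helper: split "db_from->db_to" into its two names.
// Returns true on success; false when "->" is missing or a name is empty.
bool parse_rewrite_rule(const std::string &arg,
                        std::string *db_from, std::string *db_to)
{
  const std::string::size_type pos = arg.find("->");
  if (pos == std::string::npos)
    return false;
  const std::string from = trim_spaces(arg.substr(0, pos));
  const std::string to = trim_spaces(arg.substr(pos + 2));
  if (from.empty() || to.empty())
    return false;
  *db_from = from;
  *db_to = to;
  return true;
}
```

On success the two names would be passed to binlog_filter->add_db_rewrite(); on failure the option handler would presumably report a malformed rule and exit.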
Note. To make Rpl_filter usable in a MYSQL_CLIENT context, a few small
additional changes are required:
- In sql_list.cc/h, Sql_alloc::new(size_t) and Sql_alloc::new[](size_t)
use sql_alloc(), which is THD dependent. These are to be modified
as follows:
#ifdef MYSQL_CLIENT
extern MEM_ROOT sql_list_client_mem_root; // defined in sql_list.cc
#endif
class Sql_alloc
{ ...
static void *operator new(size_t size) throw ()
{
#ifndef MYSQL_CLIENT
return sql_alloc(size);
#else
return alloc_root(&sql_list_client_mem_root, size);
#endif
}
static void *operator new[](size_t size) throw ()
{
#ifndef MYSQL_CLIENT
return sql_alloc(size);
#else
return alloc_root(&sql_list_client_mem_root, size);
#endif
}
...
}
- In rpl_filter.cc:
Rpl_filter::Rpl_filter() :
...
{
#ifdef MYSQL_CLIENT
init_alloc_root(&sql_list_client_mem_root, ...);
#endif
...
}
Rpl_filter::~Rpl_filter()
{ ...
#ifdef MYSQL_CLIENT
free_root(&sql_list_client_mem_root, ...);
#endif
}
2. Supporting rewrite-db for RBR events
---------------------------------------
In the binlog, each row operation event is preceded by Table map event(s) which map
table id(s) to database and table names. So it is enough to support rewriting
the database name in a Table map event.
2.1. Add rewrite_db() member to Table_map_log_event:
int Table_map_log_event::rewrite_db(
const char* new_db,
size_t new_db_len,
const Format_description_log_event* desc)
{
/* 1. In temp_buf member (possibly reallocating it) rewrite
event length, db length, and db parts
2. Change m_dblen and m_dbnam members
*/
}
Comment. This function assumes that the temp_buf member contains the Table map
binlog representation (temp_buf is used for creating the corresponding
BINLOG statement).
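As an illustration of step 1 in the comment of rewrite_db(), here is a minimal sketch of the buffer rewrite. It is not the patch code. It assumes that the buffer holds exactly one event, that the common header is 19 bytes with the 4-byte little-endian event length at offset 9, and that the Table map post-header is 8 bytes followed by a 1-byte db length, the db name and a NUL; these offsets should be verified against log_event.h. The function name and constants are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumed layout constants (to be checked against log_event.h).
const std::size_t kCommonHeaderLen = 19;   // common event header
const std::size_t kEventLenOffset = 9;     // 4-byte little-endian length
const std::size_t kTableMapHeaderLen = 8;  // 6-byte table id + 2-byte flags

// Hypothetical sketch: replace the db name inside a Table map event image.
// Returns 0 on success, 1 on a malformed buffer or an over-long name.
int rewrite_db_in_buffer(std::vector<std::uint8_t> &buf,
                         const std::string &new_db)
{
  const std::size_t len_pos = kCommonHeaderLen + kTableMapHeaderLen;
  if (buf.size() < len_pos + 2 || new_db.size() > 255)
    return 1;

  const std::size_t old_len = buf[len_pos];
  const std::size_t name_pos = len_pos + 1;
  if (buf.size() < name_pos + old_len + 1)  // name plus its trailing NUL
    return 1;

  // Swap the name bytes; the NUL and the rest of the event follow unchanged.
  buf.erase(buf.begin() + name_pos, buf.begin() + name_pos + old_len);
  buf.insert(buf.begin() + name_pos, new_db.begin(), new_db.end());
  buf[len_pos] = static_cast<std::uint8_t>(new_db.size());

  // Store the new total event length, little-endian.
  const std::uint32_t total = static_cast<std::uint32_t>(buf.size());
  for (int i = 0; i < 4; ++i)
    buf[kEventLenOffset + i] =
        static_cast<std::uint8_t>((total >> (8 * i)) & 0xff);
  return 0;
}
```

Step 2 (updating m_dblen and m_dbnam) is left out here, since it depends on how the event object points into temp_buf.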
2.2. In mysqlbinlog modify corresponding switch case in the
process_event() function:
Exit_status process_event(
PRINT_EVENT_INFO *print_event_info,
Log_event *ev, ...)
{
...
switch (ev_type) {
...
case TABLE_MAP_EVENT:
{
Table_map_log_event *map= ((Table_map_log_event *)ev);
if (shall_skip_database(map->get_db_name()))
{ ...
}
// WL36
size_t new_len= 0;
const char* new_db= binlog_filter->get_rewrite_db(
map->get_db_name(), &new_len);
if (new_len && map->rewrite_db(new_db, new_len,
glob_description_event))
{ error("Could not rewrite database name");
goto err;
}
}
case WRITE_ROWS_EVENT:
case DELETE_ROWS_EVENT:
case UPDATE_ROWS_EVENT:
...
}
...
}
Comment. Rpl_filter::get_rewrite_db(db_from, &len): if filter contains
a (db_from, db_to) pair, this function returns pointer to db_to and
sets len = db_to length; otherwise, it returns db_from and does not
change len value.
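The contract described in this comment can be modelled with a small stand-alone class. This is only a sketch of the described behaviour: Rewrite_rules is a hypothetical stand-in, and the use of std::map is an assumption (the real Rpl_filter keeps its pairs in its own list type).

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for the rewrite part of Rpl_filter.
class Rewrite_rules
{
public:
  void add_db_rewrite(const std::string &from, const std::string &to)
  {
    rules_[from] = to;
  }

  // If a (db_from, db_to) pair exists, returns db_to and stores its length
  // in *new_len; otherwise returns db_from and leaves *new_len untouched.
  const char *get_rewrite_db(const char *db_from, std::size_t *new_len) const
  {
    if (db_from == 0)
      return db_from;
    std::map<std::string, std::string>::const_iterator it =
        rules_.find(db_from);
    if (it == rules_.end())
      return db_from;
    *new_len = it->second.size();
    return it->second.c_str();
  }

private:
  std::map<std::string, std::string> rules_;
};
```

Because the length is left untouched when no rule matches, the caller in process_event() initializes new_len to 0 and treats a non-zero value as "a rule matched".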
3. Supporting rewrite-db for SBR events
---------------------------------------
Limited to emitting USE <db_to> instead of USE <db_from>.
USE statements can be emitted by mysqlbinlog as a result of processing the
following events (see process_event() function):
- Query_log_event
- Load_log_event
- Execute_load_query_log_event [ :public Query_log_event ]
- Create_file_log_event [ :public Load_log_event ]
TODO. This list needs to be checked carefully (not sure about Create_file_log_event)
Notes.
- In replication, only Query_log_event and Load_log_event use
rpl_filter->get_rewrite_db();
- In mysqlbinlog (process_event), Execute_load_query_log_event
and Create_file_log_event are processed in separate switch
cases. And Load_log_event is processed in the default switch case.
Conditions for emitting a use-statement:
- LOG_EVENT_SUPPRESS_USE_F is OFF for the event
(e.g. it is ON for 'create database' statement)
- event's db name differs from db_name in PRINT_EVENT_INFO
(PRINT_EVENT_INFO keeps db name of the last issued USE statement;
initially, this db name is empty).
3.1. In mysqlbinlog.cc
- Add the following function:
void print_use_stmt(Log_event* event, PRINT_EVENT_INFO* pinfo)
{
if (event->flags & LOG_EVENT_SUPPRESS_USE_F)
return;
/*
- For events listed above get db_from = event->db;
- If db_from is the same as pinfo->db then return;
- If there is rewrite-db rule db_from->db_to,
set db = db_to. Else set db = db_from;
- Print "use <db>" to mysqlbinlog output
- Set pinfo->db = db_from
(this suppresses emitting use-statements by the corresponding
log_event's print-function)
*/
}
- In process_event() function add switch case for Load_log_event and
add print_use_stmt() invocations where needed (according to the
events list above), e.g.:
Exit_status process_event(
PRINT_EVENT_INFO *print_event_info,
Log_event *ev, ...)
{
...
switch (ev_type) {
case QUERY_EVENT:
if (shall_skip_database(((Query_log_event*)ev)->db))
goto end;
if (opt_base64_output_mode == BASE64_OUTPUT_ALWAYS)
{
// Possibly, if there is a rewrite-db rule for ev->db,
// a warning should be emitted here (see note below)
... write_event_header_and_base64(ev, ...) ...
}
else
{
print_use_stmt((Query_log_event*)ev, print_event_info);
ev->print(result_file, print_event_info);
}
break;
...
case LOAD_EVENT:
print_use_stmt((Load_log_event*)ev, print_event_info);
break;
default:
...
}
...
}
Note. write_event_header_and_base64() does not print a use-statement. It
produces a BINLOG statement using the ev->temp_buf content (i.e. the binary
log representation of the event). We don't rewrite temp_buf here with the
db_to name (as we do for the Table map event) - this implies the
limitation 3 mentioned above.
Question: Is support for rewrite-db together with --base64-output really
needed currently?
4. Current status
-----------------
The outlined design (implemented for mysql-5.1.37) is tested for
simple test-cases.
TODO. 1. Check the list of events which can emit a use-statement.
2. Support for rewrite-db + --base64-output ?
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Progress (by Bothorsen): Add a mysqlbinlog option to change the used database (36)
by worklog-noreply@askmonty.org 03 Nov '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add a mysqlbinlog option to change the used database
CREATION DATE..: Fri, 07 Aug 2009, 14:57
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 36 (http://askmonty.org/worklog/?tid=36)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 30
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Bothorsen - Tue, 03 Nov 2009, 13:47)=-=-
Alexi has implemented a patch for this item.
Worked 30 hours and estimate 0 hours remain (original estimate increased by 30 hours).
-=-=(Guest - Tue, 15 Sep 2009, 18:04)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.19322 2009-09-15 18:04:49.000000000 +0300
+++ /tmp/wklog.36.new.19322 2009-09-15 18:04:49.000000000 +0300
@@ -191,7 +191,7 @@
- In process_event() function add switch case for Load_log_event and
add print_use_stmt() invocations where needed (according to the
- events lis above), e.g.:
+ events list above), e.g.:
Exit_status process_event(
PRINT_EVENT_INFO *print_event_info,
-=-=(Guest - Tue, 15 Sep 2009, 15:53)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.13421 2009-09-15 15:53:31.000000000 +0300
+++ /tmp/wklog.36.new.13421 2009-09-15 15:53:31.000000000 +0300
@@ -150,10 +150,17 @@
following events (see process_event() function):
- Query_log_event
-- Execute_load_query_log_event
-- Create_file_log_event
-
-TODO. Needed to check this list requires carefully !!!
+- Load_log_event
+- Execute_load_query_log_event [ :public Query_log_event ]
+- Create_file_log_event [ :public Load_log_event ]
+
+TODO. Needed to check this list carefully (not sure for Create_file_log_event)
+ Notes.
+ - In replication, only Query_log_event and Load_log_event uses
+ rpl_filter->get_rewrite_db();
+ - In mysqlbinlog (process_event), Execute_load_query_log_event
+ and Create_file_log_event are processed in separate switch
+ cases. And Load_log_event is processed in the default switch case.
Conditions for emiting use-statement:
- LOG_EVENT_SUPPRESS_USE_F is OFF for the event
@@ -182,8 +189,9 @@
*/
}
-- In process_event() function add print_use_stmt() invocations where
- needed (according to the events lis above), e.g.:
+- In process_event() function add switch case for Load_log_event and
+ add print_use_stmt() invocations where needed (according to the
+ events lis above), e.g.:
Exit_status process_event(
PRINT_EVENT_INFO *print_event_info,
@@ -207,6 +215,11 @@
}
break;
...
+ case LOAD_EVENT:
+ print_use_stmt((Load_log_event*)ev, print_event_info);
+ break;
+ default:
+ ...
}
...
}
-=-=(Guest - Tue, 15 Sep 2009, 12:12)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.3961 2009-09-15 12:12:26.000000000 +0300
+++ /tmp/wklog.36.new.3961 2009-09-15 12:12:26.000000000 +0300
@@ -144,6 +144,8 @@
3. Supporting rewrite-db for SBR events
---------------------------------------
+Limited to emiting USE <db_to> instead of USE <db_from>.
+
USE statements can be emited by mysqlbinlog as a result of processing the
following events (see process_event() function):
-=-=(Guest - Tue, 15 Sep 2009, 12:08)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.3794 2009-09-15 12:08:54.000000000 +0300
+++ /tmp/wklog.36.new.3794 2009-09-15 12:08:54.000000000 +0300
@@ -1 +1,229 @@
+Content
+-------
+1. Adding rewrite-db option
+2. Supporting rewrite-db option for RBR events
+3. Supporting rewrite-db option for SBR events
+ (Limited to affecting only USE statements)
+4. Current status
+
+1. Adding rewrite-db option
+---------------------------
+
+1.1. Syntax:
+ --rewrite-db='db_from->db_to'
+
+1.2. Add 'OPT_REWRITE_DB' to 'options_client' (in client_priv.h).
+
+1.3. In mysqlbinlog.cc:
+
+- Add { "rewrite-db", OPT_REWRITE_DB, ...} record to my_long_options:
+- Add Rpl_filter object to mysqlbinlog.cc
+
+ Rpl_filter* binlog_filter;
+
+- Add corresponding switch case to get_one_option():
+
+ case OPT_REWRITE_DB:
+ <extract db-from and db-to strings>
+ binlog_filter->add_db_rewrite(db_from, db_to);
+ break;
+ .
+Note. To make Rpl_filter usable in a MYSQL_CLIENT context, few small
+additional changes are required:
+
+- In sql_list.cc/h, Sql_alloc::new(size_t) and Sql_alloc::new[](size_t)
+ uses sql_alloc() which is THD dependent. These are to be modified
+ as follows:
+
+ #ifdef MYSQL_CLIENT
+ extern MEM_ROOT sql_list_client_mem_root; // defined in sql_list.cc
+ #endif
+
+ class Sql_alloc
+ { ...
+ static void *operator new(size_t size) throw ()
+ {
+ #ifndef MYSQL_CLIENT
+ return sql_alloc(size);
+ #else
+ return alloc_root(&sql_list_client_mem_root, size);
+ #endif
+ }
+ static void *operator new[](size_t size) throw ()
+ {
+ #ifndef MYSQL_CLIENT
+ return sql_alloc(size);
+ #else
+ return alloc_root(&sql_list_client_mem_root, size);
+ #endif
+ }
+ ...
+ }
+
+- In rpl_filter.cc:
+
+ Rpl_filter::Rpl_filter() :
+ ...
+ {
+ #ifdef MYSQL_CLIENT
+ init_alloc_root(&sql_list_client_mem_root, ...);
+ #endif
+ ...
+ }
+
+ Rpl_filter::~Rpl_filter()
+ { ...
+ #ifdef MYSQL_CLIENT
+ free_root(&sql_list_client_mem_root, ...);
+ #endif
+ }
+
+2. Supporting rewrite-db for RBR events
+---------------------------------------
+
+In binlog, each row operation event is preceded by Table map event(s) which maps
+table id(s) to database and table names. So, it's enough to support rewriting
+database name in a Table map.
+
+2.1. Add rewrite_db() member to Table_map_log_event:
+
+ int Table_map_log_event::rewrite_db(
+ const char* new_db,
+ size_t new_db_len,
+ const Format_description_log_event* desc)
+ {
+ /* 1. In temp_buf member (possibly reallocating it) rewrite
+ event length, db length, and db parts
+ 2. Change m_dblen and m_dbnam members
+ */
+ }
+
+Comment. This function assumes that temp_buf member contains Table map
+binlog representaion (temp_buf is used for creating corresponding
+BINLOG statement).
+
+2.2. In mysqlbinlog modify corresponding switch case in the
+process_event() function:
+
+ Exit_status process_event(
+ PRINT_EVENT_INFO *print_event_info,
+ Log_event *ev, ...)
+ {
+ ...
+ switch (ev_type) {
+ ...
+ case TABLE_MAP_EVENT:
+ {
+ Table_map_log_event *map= ((Table_map_log_event *)ev);
+ if (shall_skip_database(map->get_db_name()))
+ { ...
+ }
+ // WL36
+ size_t new_len= 0;
+ const char* new_db= binlog_filter->get_rewrite_db(
+ map->get_db_name(), &new_len);
+ if (new_len && map->rewrite_db(new_db, new_len,
+ glob_description_event))
+ { error("Could not rewrite database name");
+ goto err;
+ }
+ }
+ case WRITE_ROWS_EVENT:
+ case DELETE_ROWS_EVENT:
+ case UPDATE_ROWS_EVENT:
+ ...
+ }
+ ...
+ }
+
+Comment. Rpl_filter::get_rewrite_db(db_from, &len): if filter contains
+a (db_from, db_to) pair, this function returns pointer to db_to and
+sets len = db_to length; otherwise, it returns db_from and does not
+change len value.
+
+3. Supporting rewrite-db for SBR events
+---------------------------------------
+
+USE statements can be emited by mysqlbinlog as a result of processing the
+following events (see process_event() function):
+
+- Query_log_event
+- Execute_load_query_log_event
+- Create_file_log_event
+
+TODO. Needed to check this list requires carefully !!!
+
+Conditions for emiting use-statement:
+- LOG_EVENT_SUPPRESS_USE_F is OFF for the event
+ (e.g. it is ON for 'create database' statement)
+- event's db name differs from db_name in PRINT_EVENT_INFO
+ (PRINT_EVENT_INFO keeps db name of the last issued USE statement;
+ initially, this db name is empty).
+
+3.1. In mysqlbinlog.cc
+
+- Add the following function:
+
+ void print_use_stmt(Log_event* event, PRINT_EVENT_INFO* pinfo)
+ {
+ if (event->flags & LOG_EVENT_SUPPRESS_USE_F)
+ return;
+ /*
+ - For events listed above get db_from = event->db;
+ - If db_from is the same as pinfo->db then return;
+ - If there is rewrite-db rule db_from->db_to,
+ set db = db_to. Else set db = db_from;
+ - Print "use <db>" to mysqlbinlog output
+ - Set pinfo->db = db_from
+ (this suppresses emiting use-statements by corresponding
+ log_event's print-function)
+ */
+ }
+
+- In process_event() function add print_use_stmt() invocations where
+ needed (according to the events lis above), e.g.:
+
+ Exit_status process_event(
+ PRINT_EVENT_INFO *print_event_info,
+ Log_event *ev, ...)
+ {
+ ...
+ switch (ev_type) {
+ case QUERY_EVENT:
+ if (shall_skip_database(((Query_log_event*)ev)->db))
+ goto end;
+ if (opt_base64_output_mode == BASE64_OUTPUT_ALWAYS)
+ {
+ // Possibly in case of rewite-db rule for ev->db
+ // a warning should be emited here (see note below)
+ ... write_event_header_and_base64(ev, ...) ...
+ }
+ else
+ {
+ print_use_stmt((Query_log_event*)ev, print_event_info);
+ ev->print(result_file, print_event_info);
+ }
+ break;
+ ...
+ }
+ ...
+ }
+
+Note. write_event_header_and_base64() does not print use-statement. It
+produces BINLOG statement using ev->temp_buf content (i.e. the binary
+log representation of the event). We don't rewrite temp_buf here with
+db_to name (as we do it for Table map event) - this implies the
+limitation 3 mentioned above.
+Question: Is supporting of rewite_db + --base64-output really needed
+currently?
+
+4. Current status
+-----------------
+
+The outlined design (implemented for mysql-5.1.37) is tested for
+simple test-cases.
+
+TODO. 1. Check list of events which can emit use-statement.
+ 2. Supporting of rewite_db + --base64-output ?
+
-=-=(Guest - Mon, 14 Sep 2009, 11:51)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.9711 2009-09-14 11:51:43.000000000 +0300
+++ /tmp/wklog.36.new.9711 2009-09-14 11:51:43.000000000 +0300
@@ -1 +1 @@
-Pay no attention: just check for having access
+
-=-=(Guest - Mon, 14 Sep 2009, 11:51)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.9678 2009-09-14 11:51:28.000000000 +0300
+++ /tmp/wklog.36.new.9678 2009-09-14 11:51:28.000000000 +0300
@@ -1 +1 @@
-
+Pay no attention: just check for having access
-=-=(Knielsen - Mon, 17 Aug 2009, 12:44)=-=-
High-Level Specification modified.
--- /tmp/wklog.36.old.7834 2009-08-17 12:44:17.000000000 +0300
+++ /tmp/wklog.36.new.7834 2009-08-17 12:44:17.000000000 +0300
@@ -13,7 +13,9 @@
statement refers to tables in current database, so that changing the current
database will make the statement to work on a table in a different database).
-See also MySQL BUG#42941.
+See also MySQL BUG#42941. Note this bug is fixed in MySQL 5.1.37, which is not
+merged into MariaDB at the time of writing, but planned to be merged before
+release.
What we could do
----------------
-=-=(Guest - Sun, 16 Aug 2009, 17:11)=-=-
High-Level Specification modified.
--- /tmp/wklog.36.old.27162 2009-08-16 17:11:12.000000000 +0300
+++ /tmp/wklog.36.new.27162 2009-08-16 17:11:12.000000000 +0300
@@ -13,6 +13,8 @@
statement refers to tables in current database, so that changing the current
database will make the statement to work on a table in a different database).
+See also MySQL BUG#42941.
+
What we could do
----------------
-=-=(Psergey - Mon, 10 Aug 2009, 15:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.36.old.13035 2009-08-10 15:41:51.000000000 +0300
+++ /tmp/wklog.36.new.13035 2009-08-10 15:41:51.000000000 +0300
@@ -1,5 +1,7 @@
Context
-------
+(See http://askmonty.org/wiki/index.php/Scratch/ReplicationOptions for global
+overview)
At the moment, the server has a replication slave option
--replicate-rewrite-db="from->to"
------------------------------------------------------------
-=-=(View All Progress Notes, 14 total)=-=-
http://askmonty.org/worklog/index.pl?tid=36&nolimit=1
DESCRIPTION:
Sometimes there is a need to take a binary log and apply it to a database with
a different name than the original name of the database on binlog producer.
If one is using statement-based replication, one can achieve this by grepping
out "USE dbname" statements from the output of mysqlbinlog(*). With
row-based replication this is no longer possible, as the database name is
encoded within the BINLOG '....' statement.
This task is about adding an option to mysqlbinlog that would allow changing
the names of the databases used in both RBR and SBR events.
(*) this assumes that all statements refer to tables in the current database
and doesn't catch updates made inside stored functions and so forth, but still
works for a practically important subset of cases.
HIGH-LEVEL SPECIFICATION:
Context
-------
(See http://askmonty.org/wiki/index.php/Scratch/ReplicationOptions for global
overview)
At the moment, the server has a replication slave option
--replicate-rewrite-db="from->to"
the option affects
- Table_map_log_event (all RBR events)
- Load_log_event (LOAD DATA)
- Query_log_event (SBR-based updates, with the usual assumption that the
statement refers to tables in the current database, so that changing the
current database will make the statement work on a table in a different
database).
See also MySQL BUG#42941. Note this bug is fixed in MySQL 5.1.37, which is not
merged into MariaDB at the time of writing, but planned to be merged before
release.
What we could do
----------------
Option1: make mysqlbinlog accept --replicate-rewrite-db option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Make mysqlbinlog accept --replicate-rewrite-db options and process them to the
same extent as replication slave would process --replicate-rewrite-db option.
Option2: Add database-agnostic RBR events and --strip-db option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Right now RBR events require a database name. It is not possible to have an
RBR event stream that doesn't mention which database the events are for. When
I tried to use a debugger to specify an empty database name, the attempt to
apply the binlog resulted in this error:
090809 17:38:44 [ERROR] Slave SQL: Error 'Table '.tablename' doesn't exist' on
opening tables,
We could do as follows:
- Make the server interpret empty database name in RBR event (i.e. in a
Table_map_log_event) as "use current database". Binlog slave thread
probably should not allow such events as it doesn't have a natural current
database.
- Add a mysqlbinlog --strip-db option that would
= not produce any "USE dbname" statements
= change the database name for all RBR events to be empty
That way, mysqlbinlog output will be database-agnostic and apply to the
current database.
(this will have the usual limitations that we assume that all statements in
the binlog refer to the current database).
Option3: Enhance database rewrite
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If there is a need to support database change for statements that use
dbname.tablename notation and are replicated as statements (i.e. are DDL
statements and/or DML statements that are binlogged as statements),
then that could be supported as follows:
- Make the server's parser recognize special form of comments
/* !database-alias(oldname,newname) */
and save the mapping somewhere
- Put the hooks in table open and name resolution code to use the saved
mapping.
Once we've done the above, it will be easy to perform a complete database
name change in the binary log, with no compromises or restrictions.
It will be possible to do the rewrites either on the slave
(--replicate-rewrite-db will work for all kinds of statements), or in
mysqlbinlog (adding a comment is easy and doesn't require mysqlbinlog to
parse the statement).
LOW-LEVEL DESIGN:
Content
-------
1. Adding rewrite-db option
2. Supporting rewrite-db option for RBR events
3. Supporting rewrite-db option for SBR events
(Limited to affecting only USE statements)
4. Current status
1. Adding rewrite-db option
---------------------------
1.1. Syntax:
--rewrite-db='db_from->db_to'
1.2. Add 'OPT_REWRITE_DB' to 'options_client' (in client_priv.h).
1.3. In mysqlbinlog.cc:
- Add { "rewrite-db", OPT_REWRITE_DB, ...} record to my_long_options:
- Add Rpl_filter object to mysqlbinlog.cc
Rpl_filter* binlog_filter;
- Add corresponding switch case to get_one_option():
case OPT_REWRITE_DB:
<extract db-from and db-to strings>
binlog_filter->add_db_rewrite(db_from, db_to);
break;
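The "<extract db-from and db-to strings>" step is left open above. One way it
could look is sketched below, assuming a C++17 compiler. The helper name
parse_rewrite_rule() is hypothetical and not part of mysqlbinlog, and the
trimming of blanks around "->" is an assumption rather than something the
design specifies.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical helper: splits "db_from->db_to" into its two names.
// Returns std::nullopt when the separator is missing or either side is empty.
std::optional<std::pair<std::string, std::string>>
parse_rewrite_rule(const std::string& arg)
{
  const std::string sep = "->";
  const std::string::size_type pos = arg.find(sep);
  if (pos == std::string::npos)
    return std::nullopt;

  // Trim surrounding blanks so that "a -> b" is accepted too (an assumption).
  auto trim = [](const std::string& s) {
    const char* ws = " \t";
    const std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos)
      return std::string();
    const std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
  };

  std::string from = trim(arg.substr(0, pos));
  std::string to = trim(arg.substr(pos + sep.size()));
  if (from.empty() || to.empty())
    return std::nullopt;
  return std::make_pair(std::move(from), std::move(to));
}
```

In the OPT_REWRITE_DB case, a missing result would be reported as a bad option
value; otherwise the two strings would be handed to add_db_rewrite().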
Note. To make Rpl_filter usable in a MYSQL_CLIENT context, a few small
additional changes are required:
- In sql_list.cc/h, Sql_alloc::new(size_t) and Sql_alloc::new[](size_t)
use sql_alloc(), which is THD dependent. These are to be modified
as follows:
#ifdef MYSQL_CLIENT
extern MEM_ROOT sql_list_client_mem_root; // defined in sql_list.cc
#endif
class Sql_alloc
{ ...
static void *operator new(size_t size) throw ()
{
#ifndef MYSQL_CLIENT
return sql_alloc(size);
#else
return alloc_root(&sql_list_client_mem_root, size);
#endif
}
static void *operator new[](size_t size) throw ()
{
#ifndef MYSQL_CLIENT
return sql_alloc(size);
#else
return alloc_root(&sql_list_client_mem_root, size);
#endif
}
...
}
- In rpl_filter.cc:
Rpl_filter::Rpl_filter() :
...
{
#ifdef MYSQL_CLIENT
init_alloc_root(&sql_list_client_mem_root, ...);
#endif
...
}
Rpl_filter::~Rpl_filter()
{ ...
#ifdef MYSQL_CLIENT
free_root(&sql_list_client_mem_root, ...);
#endif
}
2. Supporting rewrite-db for RBR events
---------------------------------------
In the binlog, each row operation event is preceded by Table map event(s),
which map table id(s) to database and table names. So it is enough to support
rewriting the database name in a Table map event.
2.1. Add rewrite_db() member to Table_map_log_event:
int Table_map_log_event::rewrite_db(
const char* new_db,
size_t new_db_len,
const Format_description_log_event* desc)
{
/* 1. In temp_buf member (possibly reallocating it) rewrite
event length, db length, and db parts
2. Change m_dblen and m_dbnam members
*/
}
Comment. This function assumes that the temp_buf member contains the Table
map binlog representation (temp_buf is used for creating the corresponding
BINLOG statement).
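To illustrate step 1 of rewrite_db() (replace the name, then patch the db
length and the event length), here is a minimal sketch. It works on a
deliberately simplified, hypothetical buffer layout, not on the real binlog
format; the function name and the field offsets are assumptions made for the
example only.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Simplified, hypothetical layout (NOT the real binlog format):
//   bytes 0..3  total buffer length, little-endian
//   byte  4     database name length (at most 255)
//   bytes 5..   database name, then a NUL byte, then the remaining payload
// Returns 0 on success and 1 on error, mirroring the int result of rewrite_db().
int rewrite_db_in_buffer(std::vector<std::uint8_t>& buf,
                         const std::string& new_db)
{
  const std::size_t kLenOffset = 0, kDbLenOffset = 4, kDbOffset = 5;
  if (buf.size() < kDbOffset + 1 || new_db.empty() || new_db.size() > 255)
    return 1;

  const std::size_t old_len = buf[kDbLenOffset];
  if (buf.size() < kDbOffset + old_len + 1)
    return 1;  // truncated buffer

  // Replace the old name; the vector grows or shrinks as needed, which
  // stands in for "possibly reallocating" temp_buf.
  buf.erase(buf.begin() + kDbOffset, buf.begin() + kDbOffset + old_len);
  buf.insert(buf.begin() + kDbOffset, new_db.begin(), new_db.end());

  // Patch the two length fields so they match the new content.
  buf[kDbLenOffset] = static_cast<std::uint8_t>(new_db.size());
  const std::uint32_t total = static_cast<std::uint32_t>(buf.size());
  for (std::size_t i = 0; i < 4; ++i)
    buf[kLenOffset + i] = static_cast<std::uint8_t>((total >> (8 * i)) & 0xFF);
  return 0;
}
```

The real function would additionally update m_dblen and m_dbnam (step 2) and
take the actual field offsets from the format description event.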
2.2. In mysqlbinlog modify corresponding switch case in the
process_event() function:
Exit_status process_event(
PRINT_EVENT_INFO *print_event_info,
Log_event *ev, ...)
{
...
switch (ev_type) {
...
case TABLE_MAP_EVENT:
{
Table_map_log_event *map= ((Table_map_log_event *)ev);
if (shall_skip_database(map->get_db_name()))
{ ...
}
// WL36
size_t new_len= 0;
const char* new_db= binlog_filter->get_rewrite_db(
map->get_db_name(), &new_len);
if (new_len && map->rewrite_db(new_db, new_len,
glob_description_event))
{ error("Could not rewrite database name");
goto err;
}
}
case WRITE_ROWS_EVENT:
case DELETE_ROWS_EVENT:
case UPDATE_ROWS_EVENT:
...
}
...
}
Comment. Rpl_filter::get_rewrite_db(db_from, &len): if the filter contains
a (db_from, db_to) pair, this function returns a pointer to db_to and
sets len = db_to length; otherwise, it returns db_from and does not
change the value of len.
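The TABLE_MAP_EVENT case relies on this contract: new_len starts at 0 and
stays 0 unless a rule matched. A small stand-in with the same contract might
look as follows. RewriteRules is a hypothetical class built on std::map for
illustration; it is not the Rpl_filter implementation.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for the rewrite part of Rpl_filter.
class RewriteRules
{
public:
  void add_db_rewrite(const std::string& from, const std::string& to)
  { rules_[from] = to; }

  // If a rule exists for db, returns db_to and stores its length in *new_len;
  // otherwise returns db itself and leaves *new_len untouched.
  const char* get_rewrite_db(const char* db, std::size_t* new_len) const
  {
    const auto it = rules_.find(db);
    if (it == rules_.end())
      return db;
    *new_len = it->second.size();
    return it->second.c_str();
  }

private:
  std::map<std::string, std::string> rules_;
};
```

A caller that initializes the length to 0 can therefore use a non-zero length
as the "a rule matched" signal, as the switch case above does.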
3. Supporting rewrite-db for SBR events
---------------------------------------
Limited to emitting USE <db_to> instead of USE <db_from>.
USE statements can be emitted by mysqlbinlog as a result of processing the
following events (see process_event() function):
- Query_log_event
- Load_log_event
- Execute_load_query_log_event [ :public Query_log_event ]
- Create_file_log_event [ :public Load_log_event ]
TODO. This list needs to be checked carefully (not sure about
Create_file_log_event).
Notes.
- In replication, only Query_log_event and Load_log_event use
rpl_filter->get_rewrite_db();
- In mysqlbinlog (process_event), Execute_load_query_log_event
and Create_file_log_event are processed in separate switch
cases, and Load_log_event is processed in the default switch case.
Conditions for emitting a use-statement:
- LOG_EVENT_SUPPRESS_USE_F is OFF for the event
(e.g. it is ON for 'create database' statement)
- event's db name differs from db_name in PRINT_EVENT_INFO
(PRINT_EVENT_INFO keeps db name of the last issued USE statement;
initially, this db name is empty).
3.1. In mysqlbinlog.cc
- Add the following function:
void print_use_stmt(Log_event* event, PRINT_EVENT_INFO* pinfo)
{
if (event->flags & LOG_EVENT_SUPPRESS_USE_F)
return;
/*
- For events listed above get db_from = event->db;
- If db_from is the same as pinfo->db then return;
- If there is rewrite-db rule db_from->db_to,
set db = db_to. Else set db = db_from;
- Print "use <db>" to mysqlbinlog output
- Set pinfo->db = db_from
(this suppresses emitting use-statements by the corresponding
log_event's print-function)
*/
}
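The steps listed in the comment of print_use_stmt() can be sketched as below.
EventSketch and PrintInfoSketch are hypothetical, simplified stand-ins for
Log_event and PRINT_EVENT_INFO, the rules are kept in a plain std::map, and
the function returns the text instead of printing it. The exact statement
text and delimiter that mysqlbinlog prints are not reproduced here.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified stand-ins for Log_event and PRINT_EVENT_INFO.
struct EventSketch
{
  std::string db;     // database the event was logged for
  bool suppress_use;  // models LOG_EVENT_SUPPRESS_USE_F being set
};

struct PrintInfoSketch
{
  std::string db;     // database of the last USE statement; initially empty
};

// Returns the USE statement to print, or an empty string if none is needed.
std::string use_stmt_for(const EventSketch& ev, PrintInfoSketch& pinfo,
                         const std::map<std::string, std::string>& rules)
{
  if (ev.suppress_use)
    return std::string();
  if (ev.db == pinfo.db)
    return std::string();  // a USE for this database was already issued

  // Apply the rewrite rule if there is one, else keep the original name.
  const auto it = rules.find(ev.db);
  const std::string& db = (it != rules.end()) ? it->second : ev.db;

  // Remember the original name, not the rewritten one, so that the event's
  // own print function sees a matching name and emits no second USE.
  pinfo.db = ev.db;
  return "use " + db + ";\n";
}
```

Storing db_from rather than db_to in the print info is the point of the last
step: the event still carries db_from, so the comparison done later by the
event's print function finds the names equal.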
- In process_event() function add switch case for Load_log_event and
add print_use_stmt() invocations where needed (according to the
events list above), e.g.:
Exit_status process_event(
PRINT_EVENT_INFO *print_event_info,
Log_event *ev, ...)
{
...
switch (ev_type) {
case QUERY_EVENT:
if (shall_skip_database(((Query_log_event*)ev)->db))
goto end;
if (opt_base64_output_mode == BASE64_OUTPUT_ALWAYS)
{
// Possibly, if there is a rewrite-db rule for ev->db,
// a warning should be emitted here (see note below)
... write_event_header_and_base64(ev, ...) ...
}
else
{
print_use_stmt((Query_log_event*)ev, print_event_info);
ev->print(result_file, print_event_info);
}
break;
...
case LOAD_EVENT:
print_use_stmt((Load_log_event*)ev, print_event_info);
break;
default:
...
}
...
}
Note. write_event_header_and_base64() does not print a use-statement. It
produces a BINLOG statement using the ev->temp_buf content (i.e. the binary
log representation of the event). We don't rewrite temp_buf here with the
db_to name (as we do for the Table map event); this implies
limitation 3 mentioned above.
Question: Is support for rewrite-db + --base64-output really needed
at present?
4. Current status
-----------------
The outlined design (implemented for mysql-5.1.37) has been tested on
simple test cases.
TODO. 1. Check the list of events which can emit a use-statement.
2. Support for rewrite-db + --base64-output?
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Progress (by Bothorsen): Add a mysqlbinlog option to change the used database (36)
by worklog-noreply@askmonty.org 03 Nov '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add a mysqlbinlog option to change the used database
CREATION DATE..: Fri, 07 Aug 2009, 14:57
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 36 (http://askmonty.org/worklog/?tid=36)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 30
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Bothorsen - Tue, 03 Nov 2009, 13:47)=-=-
Alexi has implemented a patch for this item.
Worked 30 hours and estimate 0 hours remain (original estimate increased by 30 hours).
-=-=(Guest - Tue, 15 Sep 2009, 18:04)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.19322 2009-09-15 18:04:49.000000000 +0300
+++ /tmp/wklog.36.new.19322 2009-09-15 18:04:49.000000000 +0300
@@ -191,7 +191,7 @@
- In process_event() function add switch case for Load_log_event and
add print_use_stmt() invocations where needed (according to the
- events lis above), e.g.:
+ events list above), e.g.:
Exit_status process_event(
PRINT_EVENT_INFO *print_event_info,
-=-=(Guest - Tue, 15 Sep 2009, 15:53)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.13421 2009-09-15 15:53:31.000000000 +0300
+++ /tmp/wklog.36.new.13421 2009-09-15 15:53:31.000000000 +0300
@@ -150,10 +150,17 @@
following events (see process_event() function):
- Query_log_event
-- Execute_load_query_log_event
-- Create_file_log_event
-
-TODO. Needed to check this list requires carefully !!!
+- Load_log_event
+- Execute_load_query_log_event [ :public Query_log_event ]
+- Create_file_log_event [ :public Load_log_event ]
+
+TODO. Needed to check this list carefully (not sure for Create_file_log_event)
+ Notes.
+ - In replication, only Query_log_event and Load_log_event uses
+ rpl_filter->get_rewrite_db();
+ - In mysqlbinlog (process_event), Execute_load_query_log_event
+ and Create_file_log_event are processed in separate switch
+ cases. And Load_log_event is processed in the default switch case.
Conditions for emiting use-statement:
- LOG_EVENT_SUPPRESS_USE_F is OFF for the event
@@ -182,8 +189,9 @@
*/
}
-- In process_event() function add print_use_stmt() invocations where
- needed (according to the events lis above), e.g.:
+- In process_event() function add switch case for Load_log_event and
+ add print_use_stmt() invocations where needed (according to the
+ events lis above), e.g.:
Exit_status process_event(
PRINT_EVENT_INFO *print_event_info,
@@ -207,6 +215,11 @@
}
break;
...
+ case LOAD_EVENT:
+ print_use_stmt((Load_log_event*)ev, print_event_info);
+ break;
+ default:
+ ...
}
...
}
-=-=(Guest - Tue, 15 Sep 2009, 12:12)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.3961 2009-09-15 12:12:26.000000000 +0300
+++ /tmp/wklog.36.new.3961 2009-09-15 12:12:26.000000000 +0300
@@ -144,6 +144,8 @@
3. Supporting rewrite-db for SBR events
---------------------------------------
+Limited to emiting USE <db_to> instead of USE <db_from>.
+
USE statements can be emited by mysqlbinlog as a result of processing the
following events (see process_event() function):
-=-=(Guest - Tue, 15 Sep 2009, 12:08)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.3794 2009-09-15 12:08:54.000000000 +0300
+++ /tmp/wklog.36.new.3794 2009-09-15 12:08:54.000000000 +0300
@@ -1 +1,229 @@
+Content
+-------
+1. Adding rewrite-db option
+2. Supporting rewrite-db option for RBR events
+3. Supporting rewrite-db option for SBR events
+ (Limited to affecting only USE statements)
+4. Current status
+
+1. Adding rewrite-db option
+---------------------------
+
+1.1. Syntax:
+ --rewrite-db='db_from->db_to'
+
+1.2. Add 'OPT_REWRITE_DB' to 'options_client' (in client_priv.h).
+
+1.3. In mysqlbinlog.cc:
+
+- Add { "rewrite-db", OPT_REWRITE_DB, ...} record to my_long_options:
+- Add Rpl_filter object to mysqlbinlog.cc
+
+ Rpl_filter* binlog_filter;
+
+- Add corresponding switch case to get_one_option():
+
+ case OPT_REWRITE_DB:
+ <extract db-from and db-to strings>
+ binlog_filter->add_db_rewrite(db_from, db_to);
+ break;
+ .
+Note. To make Rpl_filter usable in a MYSQL_CLIENT context, few small
+additional changes are required:
+
+- In sql_list.cc/h, Sql_alloc::new(size_t) and Sql_alloc::new[](size_t)
+ uses sql_alloc() which is THD dependent. These are to be modified
+ as follows:
+
+ #ifdef MYSQL_CLIENT
+ extern MEM_ROOT sql_list_client_mem_root; // defined in sql_list.cc
+ #endif
+
+ class Sql_alloc
+ { ...
+ static void *operator new(size_t size) throw ()
+ {
+ #ifndef MYSQL_CLIENT
+ return sql_alloc(size);
+ #else
+ return alloc_root(&sql_list_client_mem_root, size);
+ #endif
+ }
+ static void *operator new[](size_t size) throw ()
+ {
+ #ifndef MYSQL_CLIENT
+ return sql_alloc(size);
+ #else
+ return alloc_root(&sql_list_client_mem_root, size);
+ #endif
+ }
+ ...
+ }
+
+- In rpl_filter.cc:
+
+ Rpl_filter::Rpl_filter() :
+ ...
+ {
+ #ifdef MYSQL_CLIENT
+ init_alloc_root(&sql_list_client_mem_root, ...);
+ #endif
+ ...
+ }
+
+ Rpl_filter::~Rpl_filter()
+ { ...
+ #ifdef MYSQL_CLIENT
+ free_root(&sql_list_client_mem_root, ...);
+ #endif
+ }
+
+2. Supporting rewrite-db for RBR events
+---------------------------------------
+
+In binlog, each row operation event is preceded by Table map event(s) which maps
+table id(s) to database and table names. So, it's enough to support rewriting
+database name in a Table map.
+
+2.1. Add rewrite_db() member to Table_map_log_event:
+
+ int Table_map_log_event::rewrite_db(
+ const char* new_db,
+ size_t new_db_len,
+ const Format_description_log_event* desc)
+ {
+ /* 1. In temp_buf member (possibly reallocating it) rewrite
+ event length, db length, and db parts
+ 2. Change m_dblen and m_dbnam members
+ */
+ }
+
+Comment. This function assumes that temp_buf member contains Table map
+binlog representaion (temp_buf is used for creating corresponding
+BINLOG statement).
+
+2.2. In mysqlbinlog modify corresponding switch case in the
+process_event() function:
+
+ Exit_status process_event(
+ PRINT_EVENT_INFO *print_event_info,
+ Log_event *ev, ...)
+ {
+ ...
+ switch (ev_type) {
+ ...
+ case TABLE_MAP_EVENT:
+ {
+ Table_map_log_event *map= ((Table_map_log_event *)ev);
+ if (shall_skip_database(map->get_db_name()))
+ { ...
+ }
+ // WL36
+ size_t new_len= 0;
+ const char* new_db= binlog_filter->get_rewrite_db(
+ map->get_db_name(), &new_len);
+ if (new_len && map->rewrite_db(new_db, new_len,
+ glob_description_event))
+ { error("Could not rewrite database name");
+ goto err;
+ }
+ }
+ case WRITE_ROWS_EVENT:
+ case DELETE_ROWS_EVENT:
+ case UPDATE_ROWS_EVENT:
+ ...
+ }
+ ...
+ }
+
+Comment. Rpl_filter::get_rewrite_db(db_from, &len): if filter contains
+a (db_from, db_to) pair, this function returns pointer to db_to and
+sets len = db_to length; otherwise, it returns db_from and does not
+change len value.
+
+3. Supporting rewrite-db for SBR events
+---------------------------------------
+
+USE statements can be emited by mysqlbinlog as a result of processing the
+following events (see process_event() function):
+
+- Query_log_event
+- Execute_load_query_log_event
+- Create_file_log_event
+
+TODO. Needed to check this list requires carefully !!!
+
+Conditions for emiting use-statement:
+- LOG_EVENT_SUPPRESS_USE_F is OFF for the event
+ (e.g. it is ON for 'create database' statement)
+- event's db name differs from db_name in PRINT_EVENT_INFO
+ (PRINT_EVENT_INFO keeps db name of the last issued USE statement;
+ initially, this db name is empty).
+
+3.1. In mysqlbinlog.cc
+
+- Add the following function:
+
+ void print_use_stmt(Log_event* event, PRINT_EVENT_INFO* pinfo)
+ {
+ if (event->flags & LOG_EVENT_SUPPRESS_USE_F)
+ return;
+ /*
+ - For events listed above get db_from = event->db;
+ - If db_from is the same as pinfo->db then return;
+ - If there is rewrite-db rule db_from->db_to,
+ set db = db_to. Else set db = db_from;
+ - Print "use <db>" to mysqlbinlog output
+ - Set pinfo->db = db_from
+ (this suppresses emiting use-statements by corresponding
+ log_event's print-function)
+ */
+ }
+
+- In process_event() function add print_use_stmt() invocations where
+ needed (according to the events lis above), e.g.:
+
+ Exit_status process_event(
+ PRINT_EVENT_INFO *print_event_info,
+ Log_event *ev, ...)
+ {
+ ...
+ switch (ev_type) {
+ case QUERY_EVENT:
+ if (shall_skip_database(((Query_log_event*)ev)->db))
+ goto end;
+ if (opt_base64_output_mode == BASE64_OUTPUT_ALWAYS)
+ {
+ // Possibly in case of rewite-db rule for ev->db
+ // a warning should be emited here (see note below)
+ ... write_event_header_and_base64(ev, ...) ...
+ }
+ else
+ {
+ print_use_stmt((Query_log_event*)ev, print_event_info);
+ ev->print(result_file, print_event_info);
+ }
+ break;
+ ...
+ }
+ ...
+ }
+
+Note. write_event_header_and_base64() does not print use-statement. It
+produces BINLOG statement using ev->temp_buf content (i.e. the binary
+log representation of the event). We don't rewrite temp_buf here with
+db_to name (as we do it for Table map event) - this implies the
+limitation 3 mentioned above.
+Question: Is supporting of rewite_db + --base64-output really needed
+currently?
+
+4. Current status
+-----------------
+
+The outlined design (implemented for mysql-5.1.37) is tested for
+simple test-cases.
+
+TODO. 1. Check list of events which can emit use-statement.
+ 2. Supporting of rewite_db + --base64-output ?
+
-=-=(Guest - Mon, 14 Sep 2009, 11:51)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.9711 2009-09-14 11:51:43.000000000 +0300
+++ /tmp/wklog.36.new.9711 2009-09-14 11:51:43.000000000 +0300
@@ -1 +1 @@
-Pay no attention: just check for having access
+
-=-=(Guest - Mon, 14 Sep 2009, 11:51)=-=-
Low Level Design modified.
--- /tmp/wklog.36.old.9678 2009-09-14 11:51:28.000000000 +0300
+++ /tmp/wklog.36.new.9678 2009-09-14 11:51:28.000000000 +0300
@@ -1 +1 @@
-
+Pay no attention: just check for having access
-=-=(Knielsen - Mon, 17 Aug 2009, 12:44)=-=-
High-Level Specification modified.
--- /tmp/wklog.36.old.7834 2009-08-17 12:44:17.000000000 +0300
+++ /tmp/wklog.36.new.7834 2009-08-17 12:44:17.000000000 +0300
@@ -13,7 +13,9 @@
statement refers to tables in current database, so that changing the current
database will make the statement to work on a table in a different database).
-See also MySQL BUG#42941.
+See also MySQL BUG#42941. Note this bug is fixed in MySQL 5.1.37, which is not
+merged into MariaDB at the time of writing, but planned to be merged before
+release.
What we could do
----------------
-=-=(Guest - Sun, 16 Aug 2009, 17:11)=-=-
High-Level Specification modified.
--- /tmp/wklog.36.old.27162 2009-08-16 17:11:12.000000000 +0300
+++ /tmp/wklog.36.new.27162 2009-08-16 17:11:12.000000000 +0300
@@ -13,6 +13,8 @@
statement refers to tables in current database, so that changing the current
database will make the statement to work on a table in a different database).
+See also MySQL BUG#42941.
+
What we could do
----------------
-=-=(Psergey - Mon, 10 Aug 2009, 15:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.36.old.13035 2009-08-10 15:41:51.000000000 +0300
+++ /tmp/wklog.36.new.13035 2009-08-10 15:41:51.000000000 +0300
@@ -1,5 +1,7 @@
Context
-------
+(See http://askmonty.org/wiki/index.php/Scratch/ReplicationOptions for global
+overview)
At the moment, the server has a replication slave option
--replicate-rewrite-db="from->to"
------------------------------------------------------------
-=-=(View All Progress Notes, 14 total)=-=-
http://askmonty.org/worklog/index.pl?tid=36&nolimit=1
DESCRIPTION:
Sometimes there is a need to take a binary log and apply it to a database with
a different name than the original name of the database on binlog producer.
If one is using statement-based replication, one can achieve this by grepping
out "USE dbname" statements from the output of mysqlbinlog(*). With
row-based replication this is no longer possible, as the database name is
encoded within the BINLOG '....' statement.
This task is about adding an option to mysqlbinlog that would allow changing
the names of the databases used in both RBR and SBR events.
(*) this assumes that all statements refer to tables in the current database
and doesn't catch updates made inside stored functions and so forth, but still
works for a practically important subset of cases.
HIGH-LEVEL SPECIFICATION:
Context
-------
(See http://askmonty.org/wiki/index.php/Scratch/ReplicationOptions for global
overview)
At the moment, the server has a replication slave option
--replicate-rewrite-db="from->to"
the option affects
- Table_map_log_event (all RBR events)
- Load_log_event (LOAD DATA)
- Query_log_event (SBR-based updates, with the usual assumption that the
statement refers to tables in the current database, so that changing the
current database will make the statement work on a table in a different
database).
See also MySQL BUG#42941. Note this bug is fixed in MySQL 5.1.37, which is not
merged into MariaDB at the time of writing, but planned to be merged before
release.
What we could do
----------------
Option1: make mysqlbinlog accept --replicate-rewrite-db option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Make mysqlbinlog accept --replicate-rewrite-db options and process them to the
same extent as replication slave would process --replicate-rewrite-db option.
Option2: Add database-agnostic RBR events and --strip-db option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Right now RBR events require a database name. It is not possible to have an
RBR event stream that doesn't mention which database the events are for. When
I tried to use a debugger to specify an empty database name, the attempt to
apply the binlog resulted in this error:
090809 17:38:44 [ERROR] Slave SQL: Error 'Table '.tablename' doesn't exist' on
opening tables,
We could do as follows:
- Make the server interpret empty database name in RBR event (i.e. in a
Table_map_log_event) as "use current database". Binlog slave thread
probably should not allow such events as it doesn't have a natural current
database.
- Add a mysqlbinlog --strip-db option that would
= not produce any "USE dbname" statements
= change the database name for all RBR events to be empty
That way, mysqlbinlog output will be database-agnostic and apply to the
current database.
(this will have the usual limitations that we assume that all statements in
the binlog refer to the current database).
Option3: Enhance database rewrite
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If there is a need to support database change for statements that use
dbname.tablename notation and are replicated as statements (i.e. are DDL
statements and/or DML statements that are binlogged as statements),
then that could be supported as follows:
- Make the server's parser recognize special form of comments
/* !database-alias(oldname,newname) */
and save the mapping somewhere
- Put the hooks in table open and name resolution code to use the saved
mapping.
Once we've done the above, it will be easy to perform a complete database
name change in the binary log, with no compromises or restrictions.
It will be possible to do the rewrites either on the slave
(--replicate-rewrite-db will work for all kinds of statements), or in
mysqlbinlog (adding a comment is easy and doesn't require mysqlbinlog to
parse the statement).
LOW-LEVEL DESIGN:
Content
-------
1. Adding rewrite-db option
2. Supporting rewrite-db option for RBR events
3. Supporting rewrite-db option for SBR events
(Limited to affecting only USE statements)
4. Current status
1. Adding rewrite-db option
---------------------------
1.1. Syntax:
--rewrite-db='db_from->db_to'
1.2. Add 'OPT_REWRITE_DB' to 'options_client' (in client_priv.h).
1.3. In mysqlbinlog.cc:
- Add { "rewrite-db", OPT_REWRITE_DB, ...} record to my_long_options:
- Add Rpl_filter object to mysqlbinlog.cc
Rpl_filter* binlog_filter;
- Add corresponding switch case to get_one_option():
case OPT_REWRITE_DB:
<extract db-from and db-to strings>
binlog_filter->add_db_rewrite(db_from, db_to);
break;
Note. To make Rpl_filter usable in a MYSQL_CLIENT context, a few small
additional changes are required:
- In sql_list.cc/h, Sql_alloc::new(size_t) and Sql_alloc::new[](size_t)
use sql_alloc(), which is THD dependent. These are to be modified
as follows:
#ifdef MYSQL_CLIENT
extern MEM_ROOT sql_list_client_mem_root; // defined in sql_list.cc
#endif
class Sql_alloc
{ ...
static void *operator new(size_t size) throw ()
{
#ifndef MYSQL_CLIENT
return sql_alloc(size);
#else
return alloc_root(&sql_list_client_mem_root, size);
#endif
}
static void *operator new[](size_t size) throw ()
{
#ifndef MYSQL_CLIENT
return sql_alloc(size);
#else
return alloc_root(&sql_list_client_mem_root, size);
#endif
}
...
}
- In rpl_filter.cc:
Rpl_filter::Rpl_filter() :
...
{
#ifdef MYSQL_CLIENT
init_alloc_root(&sql_list_client_mem_root, ...);
#endif
...
}
Rpl_filter::~Rpl_filter()
{ ...
#ifdef MYSQL_CLIENT
free_root(&sql_list_client_mem_root, ...);
#endif
}
2. Supporting rewrite-db for RBR events
---------------------------------------
In the binlog, each row operation event is preceded by Table map event(s),
which map table id(s) to database and table names. So it is enough to support
rewriting the database name in a Table map event.
2.1. Add rewrite_db() member to Table_map_log_event:
    int Table_map_log_event::rewrite_db(
      const char* new_db,
      size_t new_db_len,
      const Format_description_log_event* desc)
    {
      /* 1. In temp_buf member (possibly reallocating it) rewrite
            event length, db length, and db parts
         2. Change m_dblen and m_dbnam members
      */
    }
Comment. This function assumes that the temp_buf member contains the Table map
event's binlog representation (temp_buf is used for creating the corresponding
BINLOG statement).
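Step 1 of rewrite_db() (rewriting event length, db length and db name inside the buffer) can be illustrated on a plain byte vector. This is only a sketch under stated assumptions: the function name rewrite_db_in_buffer is hypothetical, the offsets of the length field and of the db length byte are passed in by the caller (in the real code they would be derived from the format description event), and the buffer is assumed to hold exactly one event.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Simplified stand-in for the temp_buf rewrite described above.
// Assumed layout (offsets are hypothetical parameters):
//   buf[len_off .. len_off+3]  total event length, little-endian
//   buf[db_off]                db name length (1 byte)
//   buf[db_off+1 ..]           db name followed by a terminating NUL
// Returns false if the buffer does not match this layout or the new
// name does not fit into the one-byte length field.
bool rewrite_db_in_buffer(std::vector<unsigned char> &buf,
                          std::size_t len_off, std::size_t db_off,
                          const std::string &new_db)
{
  if (new_db.size() > 255 || len_off + 4 > buf.size() || db_off >= buf.size())
    return false;
  const std::size_t old_len = buf[db_off];
  const std::size_t nul_pos = db_off + 1 + old_len;
  if (nul_pos >= buf.size() || buf[nul_pos] != 0)
    return false;

  // Replace the old name bytes with the new ones; the NUL is kept.
  const auto name_begin = buf.begin() + static_cast<std::ptrdiff_t>(db_off + 1);
  const auto tail =
      buf.erase(name_begin, name_begin + static_cast<std::ptrdiff_t>(old_len));
  buf.insert(tail, new_db.begin(), new_db.end());

  // Store the new db length and check that the terminator is still in place.
  buf[db_off] = static_cast<unsigned char>(new_db.size());
  assert(buf[db_off + 1 + new_db.size()] == 0);

  // Store the new total event length, little-endian.
  const std::uint32_t total = static_cast<std::uint32_t>(buf.size());
  for (int i = 0; i < 4; ++i)
    buf[len_off + i] = static_cast<unsigned char>((total >> (8 * i)) & 0xFF);
  return true;
}
```

Step 2 (updating m_dblen and m_dbnam) is not modelled here, since it only concerns the in-memory members of the event object.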
2.2. In mysqlbinlog modify corresponding switch case in the
process_event() function:
    Exit_status process_event(
      PRINT_EVENT_INFO *print_event_info,
      Log_event *ev, ...)
    {
      ...
      switch (ev_type) {
      ...
      case TABLE_MAP_EVENT:
      {
        Table_map_log_event *map= ((Table_map_log_event *)ev);
        if (shall_skip_database(map->get_db_name()))
        { ...
        }
        // WL36
        size_t new_len= 0;
        const char* new_db= binlog_filter->get_rewrite_db(
          map->get_db_name(), &new_len);
        if (new_len && map->rewrite_db(new_db, new_len,
                                       glob_description_event))
        { error("Could not rewrite database name");
          goto err;
        }
      }
      case WRITE_ROWS_EVENT:
      case DELETE_ROWS_EVENT:
      case UPDATE_ROWS_EVENT:
      ...
      }
      ...
    }
Comment. Rpl_filter::get_rewrite_db(db_from, &len): if the filter contains
a (db_from, db_to) pair, this function returns a pointer to db_to and
sets len = db_to length; otherwise, it returns db_from and does not
change the len value.
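The contract of get_rewrite_db() described in the comment above can be made concrete with a toy class. Rewrite_rules is a hypothetical stand-in for Rpl_filter and uses std::map instead of the server's own list types; only the return-value and length semantics are meant to match the description.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Toy stand-in for Rpl_filter (hypothetical class name), illustrating
// only the get_rewrite_db() contract.
class Rewrite_rules
{
public:
  void add_db_rewrite(const std::string &from, const std::string &to)
  {
    rules_[from] = to;
  }

  // Returns db_to and sets *new_len when a rule for db_from exists;
  // otherwise returns db_from itself and leaves *new_len untouched.
  const char *get_rewrite_db(const char *db_from, std::size_t *new_len) const
  {
    assert(db_from != nullptr && new_len != nullptr);
    const auto it = rules_.find(db_from);
    if (it == rules_.end())
      return db_from;
    *new_len = it->second.size();
    return it->second.c_str();
  }

private:
  std::map<std::string, std::string> rules_;
};
```

Because the length is left unchanged when no rule matches, the caller initialises it to 0 and treats a non-zero value as "a rewrite rule applies", which is what the TABLE_MAP_EVENT case above relies on.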
3. Supporting rewrite-db for SBR events
---------------------------------------
Limited to emitting USE <db_to> instead of USE <db_from>.
USE statements can be emitted by mysqlbinlog as a result of processing the
following events (see process_event() function):
- Query_log_event
- Load_log_event
- Execute_load_query_log_event [ :public Query_log_event ]
- Create_file_log_event [ :public Load_log_event ]
TODO. This list needs to be checked carefully (not sure about
Create_file_log_event).
Notes.
- In replication, only Query_log_event and Load_log_event use
rpl_filter->get_rewrite_db();
- In mysqlbinlog (process_event), Execute_load_query_log_event
and Create_file_log_event are processed in separate switch
cases, and Load_log_event is processed in the default switch case.
Conditions for emitting a use-statement:
- LOG_EVENT_SUPPRESS_USE_F is OFF for the event
(e.g. it is ON for 'create database' statement)
- event's db name differs from db_name in PRINT_EVENT_INFO
(PRINT_EVENT_INFO keeps db name of the last issued USE statement;
initially, this db name is empty).
3.1. In mysqlbinlog.cc
- Add the following function:
    void print_use_stmt(Log_event* event, PRINT_EVENT_INFO* pinfo)
    {
      if (event->flags & LOG_EVENT_SUPPRESS_USE_F)
        return;
      /*
        - For events listed above get db_from = event->db;
        - If db_from is the same as pinfo->db then return;
        - If there is a rewrite-db rule db_from->db_to,
          set db = db_to. Else set db = db_from;
        - Print "use <db>" to mysqlbinlog output
        - Set pinfo->db = db_from
          (this suppresses emitting use-statements by the corresponding
          log_event's print-function)
      */
    }
- In process_event() function add switch case for Load_log_event and
add print_use_stmt() invocations where needed (according to the
events list above), e.g.:
    Exit_status process_event(
      PRINT_EVENT_INFO *print_event_info,
      Log_event *ev, ...)
    {
      ...
      switch (ev_type) {
      case QUERY_EVENT:
        if (shall_skip_database(((Query_log_event*)ev)->db))
          goto end;
        if (opt_base64_output_mode == BASE64_OUTPUT_ALWAYS)
        {
          // Possibly, if there is a rewrite-db rule for ev->db,
          // a warning should be emitted here (see note below)
          ... write_event_header_and_base64(ev, ...) ...
        }
        else
        {
          print_use_stmt((Query_log_event*)ev, print_event_info);
          ev->print(result_file, print_event_info);
        }
        break;
      ...
      case LOAD_EVENT:
        print_use_stmt((Load_log_event*)ev, print_event_info);
        break;
      default:
        ...
      }
      ...
    }
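The steps listed in the print_use_stmt() comment can be sketched as follows. Event_stub and Print_info_stub are hypothetical, simplified stand-ins for Log_event and PRINT_EVENT_INFO, the rewrite rules are passed as a std::map rather than through Rpl_filter, and the output is a bare "use <db>;" line without the delimiter handling or identifier quoting that mysqlbinlog would need.

```cpp
#include <cassert>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-ins; only the fields used by the USE logic are modelled.
struct Event_stub
{
  std::string db;
  bool suppress_use;   // stands in for LOG_EVENT_SUPPRESS_USE_F
};

struct Print_info_stub
{
  std::string db;      // db of the last USE statement; initially empty
};

void print_use_stmt(const Event_stub &ev, Print_info_stub &pinfo,
                    const std::map<std::string, std::string> &rewrite_rules,
                    std::ostream &out)
{
  if (ev.suppress_use || ev.db == pinfo.db)
    return;   // suppressed, or USE for this db was already emitted

  // Extra guard (an assumption, not one of the listed steps): an event
  // without a database produces no USE statement.
  if (!ev.db.empty())
  {
    const auto rule = rewrite_rules.find(ev.db);
    const std::string &db =
        (rule != rewrite_rules.end()) ? rule->second : ev.db;
    assert(!db.empty());   // a rule must not map to an empty name
    out << "use " << db << ";\n";
  }

  // Remember db_from (not db_to), so that the event's own print function
  // sees a matching db and does not emit a second USE statement.
  pinfo.db = ev.db;
}
```

Storing db_from rather than db_to in the print info is the design point here: the comparison against the next event's db is always done on the original names, so the rewrite stays invisible to the existing print functions.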
Note. write_event_header_and_base64() does not print a use-statement. It
produces a BINLOG statement using the ev->temp_buf content (i.e. the binary
log representation of the event). We don't rewrite temp_buf here with the
db_to name (as we do for the Table map event) - this implies the
limitation 3 mentioned above.
Question: Is support for rewrite-db together with --base64-output really
needed at present?
4. Current status
-----------------
The outlined design (implemented for mysql-5.1.37) has been tested on
simple test cases.
TODO. 1. Check the list of events which can emit a use-statement.
2. Support rewrite-db together with --base64-output?
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Updated (by Alexi): Add a mysqlbinlog option to filter updates to certain tables (40)
by worklog-noreply@askmonty.org 03 Nov '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add a mysqlbinlog option to filter updates to certain tables
CREATION DATE..: Mon, 10 Aug 2009, 13:25
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Psergey
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 40 (http://askmonty.org/worklog/?tid=40)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 2
ESTIMATE.......: 48 (hours remain)
ORIG. ESTIMATE.: 48
PROGRESS NOTES:
-=-=(Alexi - Tue, 03 Nov 2009, 11:19)=-=-
Low Level Design modified.
--- /tmp/wklog.40.old.7187 2009-11-03 11:19:22.000000000 +0200
+++ /tmp/wklog.40.new.7187 2009-11-03 11:19:22.000000000 +0200
@@ -1 +1,132 @@
+OPTION: 2.5 Extend Query Events With Tables Info
+================================================
+1. Query_log_event Binary Format
+********************************
+Changes to be done:
+
+ Query_log_event binary format
+ ---------------------------------
+ Name Size (bytes)
+ ---------------------------------
+ COMMON HEADER:
+ timestamp 4
+ type 1
+ server_id 4
+ total_size 4
+ master_position 4
+ flags 2
+ ---------------------------------
+ POST HEADER:
+ slave_proxy_id 4
+ exec_time 4
+ db_len 1
++ query_len 2 (see Note 1)
+ error_code 2
+ status_vars_len 2
++ tables_info_len 2 (see Note 2)
+ ---------------------------------
+ BODY:
+ status_vars status_vars_len
+- db db_len + 1
++ db db_len (see Note 3)
+ query query_len
++ tables_info
+
+ tables_info binary format
+ ---------------------------------
+ Name Size (bytes)
+ ---------------------------------
+ db_len 1 (see Note 4)
+ db db_len
+ table_name_len 1
+ table_name table_name_len
+ ...
+ db_len 1
+ db db_len
+ table_name_len 1
+ table_name table_name_len
+
+NOTES
+1. Currently Query_log_event format doesn't include 'query_len' because
+ it considers the query to extend to the end of the event.
+2. If tables_info is not included in the event (--binlog-with-tables-info
+ option), tables_info_len = 0.
+3. The trailing zero is redundant since the length is already known.
+4. In case of db = current db, db_len = 0 and db = empty, because
+ current db is already included in the current event format.
+
+2. Where to get tables info from?
+*********************************
+
+2.1. Case study: CREATE TABLE
+******************************
+
+*** CREATE TABLE table [SELECT ...]
+
+ bool mysql_create_table_no_lock(
+ THD *thd,
+ const char *db,
+ const char *table_name, ...)
+ {
+ ...
+ // -------------------------------------
+ // WL40: To be included in tables_info:
+ // * db, table_name
+ // * thd->lex->query_tables (tables referred to in
+ // the select-part; empty if no select-part)
+ // -------------------------------------
+ write_bin_log(thd, TRUE, thd->query, thd->query_length);
+ }
+
+*** CREATE TABLE table LIKE src-table
+
+ bool mysql_create_like_table(
+ ...
+ TABLE_LIST *table,
+ TABLE_LIST *src_table,
+ ...)
+ {
+ ...
+ if (thd->current_stmt_binlog_row_based)
+ { // RBR: In this case we don't replicate temp tables
+ if (!(create_info->options & HA_LEX_CREATE_TMP_TABLE))
+ {
+ if (src_table->table->s->tmp_table)
+ { // CREATE normal-table LIKE temp-table:
+
+ // Generate new query without LIKE-part
+ store_create_info(thd, table, &query, create_info, FALSE);
+
+ // -------------------------------------
+ // WL40: To include to tables_info:
+ // * table (src_table is not included)
+ // -------------------------------------
+ write_bin_log(thd, TRUE, query.ptr(), query.length());
+ }
+ else
+ { // CREATE normal-table LIKE normal-table
+
+ // -------------------------------------
+ // WL40: To include to log_tables_info:
+ // * table
+ // * src_table
+ // -------------------------------------
+ write_bin_log(thd, TRUE, thd->query, thd->query_length);
+ }
+ }
+ // CREATE temp-table LIKE ...
+ // This case is not replicated
+ }
+ else
+ { // SBR:
+ // -------------------------------------
+ // WL40: To include to tables_info:
+ // * table
+ // * src_table
+ // -------------------------------------
+ write_bin_log(thd, TRUE, thd->query, thd->query_length);
+ }
+ }
+
+To be continued
-=-=(Alexi - Mon, 02 Nov 2009, 11:34)=-=-
Worked 2 hours on option 2.5
Worked 2 hours and estimate 48 hours remain (original estimate increased by 50 hours).
-=-=(Alexi - Mon, 02 Nov 2009, 11:20)=-=-
High-Level Specification modified.
--- /tmp/wklog.40.old.4848 2009-11-02 11:20:24.000000000 +0200
+++ /tmp/wklog.40.new.4848 2009-11-02 11:20:24.000000000 +0200
@@ -90,3 +90,25 @@
It might be useful to integrate this with the code that already handles
--replicate-ignore-db and similar slave options.
+2.5 Extend Query Events With Tables Info
+----------------------------------------
+
+We could extend query events structure with a tables info - a list of tables
+which the query refers to:
+
+ <current query event structure>
+ tables_info_len
+ dbase_len dbase
+ table_len table
+ ...
+ dbase_len dbase
+ table_len table
+
+Note. In case of <dbase> = current data base, we can set dbase_len = 0
+ and dbase = empty because current query event structure already
+ includes current data base name.
+
+Note. Possibly it is reasonable also to add a --binlog-with-tables-info
+ option which defines whether tables info must be included to the
+ query events.
+
-=-=(Knielsen - Fri, 14 Aug 2009, 15:47)=-=-
High-Level Specification modified.
--- /tmp/wklog.40.old.10896 2009-08-14 15:47:39.000000000 +0300
+++ /tmp/wklog.40.new.10896 2009-08-14 15:47:39.000000000 +0300
@@ -72,3 +72,21 @@
/* !mysqlbinlog: updates t1,db3.t2 */ UPDATE t1 LEFT JOIN ...
and further processing in mysqlbinlog will be trivial.
+
+2.4 Implement server functionality to ignore certain tables
+-----------------------------------------------------------
+
+We could add a general facility in the server to ignore certain tables:
+
+ SET SESSION ignored_tables = "db1.t1,db2.t2";
+
+This would work similar to --replicate-ignore-table, but in a general way not
+restricted to the slave SQL thread.
+
+It would then be trivial for mysqlbinlog to add such statements at the start
+of the output, or probably the user could just do it manually with no need for
+additional options for mysqlbinlog.
+
+It might be useful to integrate this with the code that already handles
+--replicate-ignore-db and similar slave options.
+
-=-=(Psergey - Mon, 10 Aug 2009, 15:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.40.old.12989 2009-08-10 15:41:23.000000000 +0300
+++ /tmp/wklog.40.new.12989 2009-08-10 15:41:23.000000000 +0300
@@ -1,6 +1,7 @@
-
1. Context
----------
+(See http://askmonty.org/wiki/index.php/Scratch/ReplicationOptions for global
+overview)
At the moment, the server has these replication slave options:
--replicate-do-table=db.tbl
-=-=(Guest - Mon, 10 Aug 2009, 14:52)=-=-
Dependency created: 39 now depends on 40
-=-=(Guest - Mon, 10 Aug 2009, 14:51)=-=-
High Level Description modified.
--- /tmp/wklog.40.old.16985 2009-08-10 14:51:59.000000000 +0300
+++ /tmp/wklog.40.new.16985 2009-08-10 14:51:59.000000000 +0300
@@ -1,3 +1,4 @@
Replication slave can be set to filter updates to certain tables with
---replicate-[wild-]{do,ignore}-table options. This task is about adding similar
-functionality to mysqlbinlog.
+--replicate-[wild-]{do,ignore}-table options.
+
+This task is about adding similar functionality to mysqlbinlog.
-=-=(Guest - Mon, 10 Aug 2009, 14:51)=-=-
High-Level Specification modified.
--- /tmp/wklog.40.old.16949 2009-08-10 14:51:33.000000000 +0300
+++ /tmp/wklog.40.new.16949 2009-08-10 14:51:33.000000000 +0300
@@ -1 +1,73 @@
+1. Context
+----------
+At the moment, the server has these replication slave options:
+
+ --replicate-do-table=db.tbl
+ --replicate-ignore-table=db.tbl
+ --replicate-wild-do-table=pattern.pattern
+ --replicate-wild-ignore-table=pattern.pattern
+
+They affect both RBR and SBR events. SBR events are checked after the
+statement has been parsed, the server iterates over list of used tables and
+checks them against --replicate instructions.
+
+What is interesting is that this scheme still allows to update the ignored
+table through a VIEW.
+
+2. Table filtering in mysqlbinlog
+---------------------------------
+
+Per-table filtering of RBR events is easy (as it is relatively easy to extract
+the name of the table that the event applies to).
+
+Per-table filtering of SBR events is hard, as generally it is not apparent
+which tables the statement refers to.
+
+This opens possible options:
+
+2.1 Put the parser into mysqlbinlog
+-----------------------------------
+Once we have a full parser in mysqlbinlog, we'll be able to check which tables
+are used by a statement, and will allow to show behaviour identical to those
+that one obtains when using --replicate-* slave options.
+
+(It is not clear how much effort is needed to put the parser into mysqlbinlog.
+Any guesses?)
+
+
+2.2 Use dumb regexp match
+-------------------------
+Use a really dumb approach. A query is considered to be modifying table X if
+it matches an expression
+
+CREATE TABLE $tablename
+DROP $tablename
+UPDATE ...$tablename ... SET // here '...' can't contain the word 'SET'
+DELETE ...$tablename ... WHERE // same as above
+ALTER TABLE $tablename
+.. etc (go get from the grammar) ..
+
+The advantage over doing the same in awk is that mysqlbinlog will also process
+RBR statements, and together with that will provide a working solution for
+those who are careful with their table names not mixing with string constants
+and such.
+
+(TODO: string constants are of particular concern as they come from
+[potentially hostile] users, unlike e.g. table aliases which come from
+[not hostile] developers. Remove also all string constants before attempting
+to do match?)
+
+2.3 Have the master put annotations
+-----------------------------------
+We could add a master option so that it injects into query a mark that tells
+which tables the query will affect, e.g. for the query
+
+ UPDATE t1 LEFT JOIN db3.t2 ON ... WHERE ...
+
+
+the binlog will have
+
+ /* !mysqlbinlog: updates t1,db3.t2 */ UPDATE t1 LEFT JOIN ...
+
+and further processing in mysqlbinlog will be trivial.
DESCRIPTION:
Replication slave can be set to filter updates to certain tables with
--replicate-[wild-]{do,ignore}-table options.
This task is about adding similar functionality to mysqlbinlog.
HIGH-LEVEL SPECIFICATION:
1. Context
----------
(See http://askmonty.org/wiki/index.php/Scratch/ReplicationOptions for global
overview)
At the moment, the server has these replication slave options:
--replicate-do-table=db.tbl
--replicate-ignore-table=db.tbl
--replicate-wild-do-table=pattern.pattern
--replicate-wild-ignore-table=pattern.pattern
They affect both RBR and SBR events. SBR events are checked after the
statement has been parsed: the server iterates over the list of used tables
and checks them against the --replicate-* rules.
Interestingly, this scheme still allows an ignored table to be updated
through a VIEW.
2. Table filtering in mysqlbinlog
---------------------------------
Per-table filtering of RBR events is easy (as it is relatively easy to extract
the name of the table that the event applies to).
Per-table filtering of SBR events is hard, as generally it is not apparent
which tables the statement refers to.
This leaves several possible options:
2.1 Put the parser into mysqlbinlog
-----------------------------------
Once we have a full parser in mysqlbinlog, we'll be able to check which tables
are used by a statement, which will give behaviour identical to what one
obtains when using the --replicate-* slave options.
(It is not clear how much effort is needed to put the parser into mysqlbinlog.
Any guesses?)
2.2 Use dumb regexp match
-------------------------
Use a really dumb approach. A query is considered to be modifying table X if
it matches one of the expressions
CREATE TABLE $tablename
DROP $tablename
UPDATE ...$tablename ... SET // here '...' can't contain the word 'SET'
DELETE ...$tablename ... WHERE // same as above
ALTER TABLE $tablename
.. etc (go get from the grammar) ..
The advantage over doing the same in awk is that mysqlbinlog will also process
RBR statements, and together with that it will provide a working solution for
those who take care that their table names do not also occur in string
constants and such.
(TODO: string constants are of particular concern as they come from
[potentially hostile] users, unlike e.g. table aliases which come from
[not hostile] developers. Should we also remove all string constants before
attempting the match?)
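To give a feel for option 2.2, here is a minimal sketch of such a matcher using std::regex. The function names are hypothetical, only a subset of the statement patterns is covered (DROP is written as DROP TABLE), the table name is assumed to be a plain identifier, and multi-line queries are not handled. It does not strip string constants, so the concern raised in the TODO still applies.

```cpp
#include <cassert>
#include <cctype>
#include <regex>
#include <string>

// True if `s` is non-empty and made of letters, digits and '_' only,
// so it can be embedded in a regular expression without escaping.
bool is_plain_identifier(const std::string &s)
{
  if (s.empty())
    return false;
  for (unsigned char c : s)
    if (!std::isalnum(c) && c != '_')
      return false;
  return true;
}

// Returns true if `query` looks like it modifies `table` according to a
// few naive patterns (hypothetical helper, illustrative subset only).
bool query_touches_table(const std::string &query, const std::string &table)
{
  assert(is_plain_identifier(table));

  const std::string name = "\\b" + table + "\\b";
  // "Any text that does not contain the word SET / WHERE".
  const std::string no_set = "(?:(?!\\bSET\\b).)*";
  const std::string no_where = "(?:(?!\\bWHERE\\b).)*";

  const std::string pattern =
      "^\\s*(?:"
      "CREATE\\s+TABLE\\s+" + name + "|"
      "DROP\\s+TABLE\\s+" + name + "|"
      "ALTER\\s+TABLE\\s+" + name + "|"
      "UPDATE\\b" + no_set + name + no_set + "\\bSET\\b|"
      "DELETE\\b" + no_where + name + no_where + "\\bWHERE\\b"
      ")";
  const std::regex re(pattern, std::regex::ECMAScript | std::regex::icase);
  return std::regex_search(query, re);
}
```

The negative lookahead implements the "'...' can't contain the word 'SET'" restriction: the table name has to appear before the first SET (or WHERE), which already rules out some matches inside string constants on the right-hand side.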
2.3 Have the master put annotations
-----------------------------------
We could add a master option so that it injects into the query a mark that
tells which tables the query will affect, e.g. for the query
UPDATE t1 LEFT JOIN db3.t2 ON ... WHERE ...
the binlog will have
/* !mysqlbinlog: updates t1,db3.t2 */ UPDATE t1 LEFT JOIN ...
and further processing in mysqlbinlog will be trivial.
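A sketch of what that processing might look like is given below. It assumes the annotation is always the very first thing in the query text and is written exactly as in the example above; the function name annotated_tables is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Extracts the table list from a leading annotation of the form
//   /* !mysqlbinlog: updates t1,db3.t2 */
// Returns an empty vector when the query carries no such annotation.
std::vector<std::string> annotated_tables(const std::string &query)
{
  const std::string prefix = "/* !mysqlbinlog: updates ";
  const std::string suffix = " */";
  std::vector<std::string> tables;

  if (query.compare(0, prefix.size(), prefix) != 0)
    return tables;
  const std::string::size_type end = query.find(suffix, prefix.size());
  if (end == std::string::npos)
    return tables;
  assert(end >= prefix.size());

  // Split the comma-separated list between prefix and suffix.
  std::string::size_type pos = prefix.size();
  while (pos < end)
  {
    std::string::size_type comma = query.find(',', pos);
    if (comma == std::string::npos || comma > end)
      comma = end;
    if (comma > pos)
      tables.push_back(query.substr(pos, comma - pos));
    pos = comma + 1;
  }
  return tables;
}
```

mysqlbinlog would then compare each returned name (qualifying unqualified ones with the event's current database) against its do/ignore rules.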
2.4 Implement server functionality to ignore certain tables
-----------------------------------------------------------
We could add a general facility in the server to ignore certain tables:
SET SESSION ignored_tables = "db1.t1,db2.t2";
This would work similarly to --replicate-ignore-table, but in a general way not
restricted to the slave SQL thread.
It would then be trivial for mysqlbinlog to add such statements at the start
of the output, or the user could probably just do it manually with no need for
additional options for mysqlbinlog.
It might be useful to integrate this with the code that already handles
--replicate-ignore-db and similar slave options.
2.5 Extend Query Events With Tables Info
----------------------------------------
We could extend the query event structure with tables info - a list of the
tables which the query refers to:
    <current query event structure>
    tables_info_len
    dbase_len dbase
    table_len table
    ...
    dbase_len dbase
    table_len table
Note. In case of <dbase> = current database, we can set dbase_len = 0
and dbase = empty because the current query event structure already
includes the current database name.
Note. Possibly it is also reasonable to add a --binlog-with-tables-info
option which defines whether tables info must be included in the
query events.
LOW-LEVEL DESIGN:
OPTION: 2.5 Extend Query Events With Tables Info
================================================
1. Query_log_event Binary Format
********************************
Changes to be done:
    Query_log_event binary format
    ---------------------------------------------
      Name                 Size (bytes)
    ---------------------------------------------
      COMMON HEADER:
        timestamp          4
        type               1
        server_id          4
        total_size         4
        master_position    4
        flags              2
    ---------------------------------------------
      POST HEADER:
        slave_proxy_id     4
        exec_time          4
        db_len             1
    +   query_len          2   (see Note 1)
        error_code         2
        status_vars_len    2
    +   tables_info_len    2   (see Note 2)
    ---------------------------------------------
      BODY:
        status_vars        status_vars_len
    -   db                 db_len + 1
    +   db                 db_len   (see Note 3)
        query              query_len
    +   tables_info

    tables_info binary format
    ---------------------------------------------
      Name                 Size (bytes)
    ---------------------------------------------
        db_len             1   (see Note 4)
        db                 db_len
        table_name_len     1
        table_name         table_name_len
        ...
        db_len             1
        db                 db_len
        table_name_len     1
        table_name         table_name_len
NOTES
1. Currently the Query_log_event format doesn't include 'query_len' because
it considers the query to extend to the end of the event.
2. If tables_info is not included in the event (as controlled by the
--binlog-with-tables-info option), tables_info_len = 0.
3. The trailing zero is redundant since the length is already known.
4. In case of db = current db, db_len = 0 and db = empty, because the
current db is already included in the current event format.
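The proposed tables_info layout (repeated db_len, db, table_name_len, table_name records) can be illustrated with a small encoder and decoder. The function and type names below are hypothetical and the code works on std::vector rather than on the server's buffers; the 255-byte and 65535-byte limits follow from the one-byte name lengths and the two-byte tables_info_len field in the tables above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// (db, table) pair; an empty db means "the event's current database".
using Table_ref = std::pair<std::string, std::string>;

// Serializes the list as [db_len][db][table_name_len][table_name] records.
// Returns false if a name or the whole block exceeds the field sizes.
bool encode_tables_info(const std::vector<Table_ref> &tables,
                        std::vector<unsigned char> &out)
{
  std::vector<unsigned char> buf;
  for (const Table_ref &t : tables)
  {
    if (t.first.size() > 255 || t.second.size() > 255)
      return false;
    buf.push_back(static_cast<unsigned char>(t.first.size()));
    buf.insert(buf.end(), t.first.begin(), t.first.end());
    buf.push_back(static_cast<unsigned char>(t.second.size()));
    buf.insert(buf.end(), t.second.begin(), t.second.end());
  }
  if (buf.size() > 0xFFFF)   // tables_info_len is a 2-byte field
    return false;
  out = std::move(buf);
  return true;
}

// Parses the records back; returns false on a truncated buffer.
bool decode_tables_info(const std::vector<unsigned char> &in,
                        std::vector<Table_ref> &tables)
{
  std::vector<Table_ref> result;
  std::size_t pos = 0;
  auto read_name = [&](std::string &name) {
    if (pos >= in.size())
      return false;
    const std::size_t len = in[pos++];
    if (pos + len > in.size())
      return false;
    name.assign(in.begin() + pos, in.begin() + pos + len);
    pos += len;
    return true;
  };
  while (pos < in.size())
  {
    Table_ref t;
    if (!read_name(t.first) || !read_name(t.second))
      return false;
    result.push_back(std::move(t));
  }
  assert(pos == in.size());
  tables = std::move(result);
  return true;
}
```

For the tables t1 (in the current database) and db3.t2 this yields 11 bytes: 0, 2, "t1", 3, "db3", 2, "t2".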
2. Where to get tables info from?
*********************************
2.1. Case study: CREATE TABLE
******************************
*** CREATE TABLE table [SELECT ...]
    bool mysql_create_table_no_lock(
      THD *thd,
      const char *db,
      const char *table_name, ...)
    {
      ...
      // -------------------------------------
      // WL40: To be included in tables_info:
      // * db, table_name
      // * thd->lex->query_tables (tables referred to in
      //   the select-part; empty if no select-part)
      // -------------------------------------
      write_bin_log(thd, TRUE, thd->query, thd->query_length);
    }
*** CREATE TABLE table LIKE src-table
    bool mysql_create_like_table(
      ...
      TABLE_LIST *table,
      TABLE_LIST *src_table,
      ...)
    {
      ...
      if (thd->current_stmt_binlog_row_based)
      { // RBR: In this case we don't replicate temp tables
        if (!(create_info->options & HA_LEX_CREATE_TMP_TABLE))
        {
          if (src_table->table->s->tmp_table)
          { // CREATE normal-table LIKE temp-table:
            // Generate new query without LIKE-part
            store_create_info(thd, table, &query, create_info, FALSE);
            // -------------------------------------
            // WL40: To be included in tables_info:
            // * table (src_table is not included)
            // -------------------------------------
            write_bin_log(thd, TRUE, query.ptr(), query.length());
          }
          else
          { // CREATE normal-table LIKE normal-table
            // -------------------------------------
            // WL40: To be included in tables_info:
            // * table
            // * src_table
            // -------------------------------------
            write_bin_log(thd, TRUE, thd->query, thd->query_length);
          }
        }
        // CREATE temp-table LIKE ...
        // This case is not replicated
      }
      else
      { // SBR:
        // -------------------------------------
        // WL40: To be included in tables_info:
        // * table
        // * src_table
        // -------------------------------------
        write_bin_log(thd, TRUE, thd->query, thd->query_length);
      }
    }
To be continued
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Updated (by Alexi): Add a mysqlbinlog option to filter updates to certain tables (40)
by worklog-noreply@askmonty.org 03 Nov '09
by worklog-noreply@askmonty.org 03 Nov '09
03 Nov '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add a mysqlbinlog option to filter updates to certain tables
CREATION DATE..: Mon, 10 Aug 2009, 13:25
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Psergey
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 40 (http://askmonty.org/worklog/?tid=40)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 2
ESTIMATE.......: 48 (hours remain)
ORIG. ESTIMATE.: 48
PROGRESS NOTES:
-=-=(Alexi - Tue, 03 Nov 2009, 11:19)=-=-
Low Level Design modified.
--- /tmp/wklog.40.old.7187 2009-11-03 11:19:22.000000000 +0200
+++ /tmp/wklog.40.new.7187 2009-11-03 11:19:22.000000000 +0200
@@ -1 +1,132 @@
+OPTION: 2.5 Extend Query Events With Tables Info
+================================================
+1. Query_log_event Binary Format
+********************************
+Changes to be done:
+
+ Query_log_event binary format
+ ---------------------------------
+ Name Size (bytes)
+ ---------------------------------
+ COMMON HEADER:
+ timestamp 4
+ type 1
+ server_id 4
+ total_size 4
+ master_position 4
+ flags 2
+ ---------------------------------
+ POST HEADER:
+ slave_proxy_id 4
+ exec_time 4
+ db_len 1
++ query_len 2 (see Note 1)
+ error_code 2
+ status_vars_len 2
++ tables_info_len 2 (see Note 2)
+ ---------------------------------
+ BODY:
+ status_vars status_vars_len
+- db db_len + 1
++ db db_len (see Note 3)
+ query query_len
++ tables_info
+
+ tables_info binary format
+ ---------------------------------
+ Name Size (bytes)
+ ---------------------------------
+ db_len 1 (see Note 4)
+ db db_len
+ table_name_len 1
+ table_name table_name_len
+ ...
+ db_len 1
+ db db_len
+ table_name_len 1
+ table_name table_name_len
+
+NOTES
+1. Currently Query_log_event format doesn't include 'query_len' because
+ it considers the query to extent to the end of the event.
+2. If tables_info is not included in the event (--binlog-with-tables-info
+ option), tables_info_len = 0.
+3. The trailing zero is redundant since the length is already known.
+4. In case of db = current db, db_len = 0 and db = empty, because
+ current db is already included in the current event format.
+
+2. Where to get tables info from?
+*********************************
+
+2.1. Case study: CREATE TABLE
+******************************
+
+*** CREATE TABLE table [SELECT ...]
+
+ bool mysql_create_table_no_lock(
+ THD *thd,
+ const char *db,
+ const char *table_name, ...)
+ {
+ ...
+ // -------------------------------------
+ // WL40: To be included in tables_info:
+ // * db, table_name
+ // * thd->lex->query_tables (tables refered to in
+ // the select-part; empty if no select-part)
+ // -------------------------------------
+ write_bin_log(thd, TRUE, thd->query, thd->query_length);
+ }
+
+*** CREATE TABLE table LIKE src-table
+
+ bool mysql_create_like_table(
+ ...
+ TABLE_LIST *table,
+ TABLE_LIST *src_table,
+ ...)
+ {
+ ...
+ if (thd->current_stmt_binlog_row_based)
+ { // RBR: In this case we don't replicate temp tables
+ if (!(create_info->options & HA_LEX_CREATE_TMP_TABLE))
+ {
+ if (src_table->table->s->tmp_table)
+ { // CREATE normal-table LIKE temp-table:
+
+ // Generate new query without LIKE-part
+ store_create_info(thd, table, &query, create_info, FALSE);
+
+ // -------------------------------------
+ // WL40: To include to tables_info:
+ // * table (src_table is not included)
+ // -------------------------------------
+ write_bin_log(thd, TRUE, query.ptr(), query.length());
+ }
+ else
+ { // CREATE normal-table LIKE normal-table
+
+ // -------------------------------------
+ // WL40: To include to log_tables_info:
+ // * table
+ // * src_table
+ // -------------------------------------
+ write_bin_log(thd, TRUE, thd->query, thd->query_length);
+ }
+ }
+ // CREATE temp-table LIKE ...
+ // This case is not replicated
+ }
+ else
+ { // SBR:
+ // -------------------------------------
+ // WL40: To include to tables_info:
+ // * table
+ // * src_table
+ // -------------------------------------
+ write_bin_log(thd, TRUE, thd->query, thd->query_length);
+ }
+ }
+
+To be continued
-=-=(Alexi - Mon, 02 Nov 2009, 11:34)=-=-
Worked 2 hours on option 2.5
Worked 2 hours and estimate 48 hours remain (original estimate increased by 50 hours).
-=-=(Alexi - Mon, 02 Nov 2009, 11:20)=-=-
High-Level Specification modified.
--- /tmp/wklog.40.old.4848 2009-11-02 11:20:24.000000000 +0200
+++ /tmp/wklog.40.new.4848 2009-11-02 11:20:24.000000000 +0200
@@ -90,3 +90,25 @@
It might be useful to integrate this with the code that already handles
--replicate-ignore-db and similar slave options.
+2.5 Extend Query Events With Tables Info
+----------------------------------------
+
+We could extend query events structure with a tables info - a list of tables
+which the query refers to:
+
+ <current query event structure>
+ tables_info_len
+ dbase_len dbase
+ table_len table
+ ...
+ dbase_len dbase
+ table_len table
+
+Note. In case of <dbase> = current data base, we can set dbase_len = 0
+ and dbase = empty because current query event structure already
+ includes current data base name.
+
+Note. Possibly it is reasonable also to add a --binlog-with-tables-info
+ option which defines whether tables info must be included to the
+ query events.
+
-=-=(Knielsen - Fri, 14 Aug 2009, 15:47)=-=-
High-Level Specification modified.
--- /tmp/wklog.40.old.10896 2009-08-14 15:47:39.000000000 +0300
+++ /tmp/wklog.40.new.10896 2009-08-14 15:47:39.000000000 +0300
@@ -72,3 +72,21 @@
/* !mysqlbinlog: updates t1,db3.t2 */ UPDATE t1 LEFT JOIN ...
and further processing in mysqlbinlog will be trivial.
+
+2.4 Implement server functionality to ignore certain tables
+-----------------------------------------------------------
+
+We could add a general facility in the server to ignore certain tables:
+
+ SET SESSION ignored_tables = "db1.t1,db2.t2";
+
+This would work similar to --replicate-ignore-table, but in a general way not
+restricted to the slave SQL thread.
+
+It would then be trivial for mysqlbinlog to add such statements at the start
+of the output, or probably the user could just do it manually with no need for
+additional options for mysqlbinlog.
+
+It might be useful to integrate this with the code that already handles
+--replicate-ignore-db and similar slave options.
+
-=-=(Psergey - Mon, 10 Aug 2009, 15:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.40.old.12989 2009-08-10 15:41:23.000000000 +0300
+++ /tmp/wklog.40.new.12989 2009-08-10 15:41:23.000000000 +0300
@@ -1,6 +1,7 @@
-
1. Context
----------
+(See http://askmonty.org/wiki/index.php/Scratch/ReplicationOptions for global
+overview)
At the moment, the server has these replication slave options:
--replicate-do-table=db.tbl
-=-=(Guest - Mon, 10 Aug 2009, 14:52)=-=-
Dependency created: 39 now depends on 40
-=-=(Guest - Mon, 10 Aug 2009, 14:51)=-=-
High Level Description modified.
--- /tmp/wklog.40.old.16985 2009-08-10 14:51:59.000000000 +0300
+++ /tmp/wklog.40.new.16985 2009-08-10 14:51:59.000000000 +0300
@@ -1,3 +1,4 @@
Replication slave can be set to filter updates to certain tables with
---replicate-[wild-]{do,ignore}-table options. This task is about adding similar
-functionality to mysqlbinlog.
+--replicate-[wild-]{do,ignore}-table options.
+
+This task is about adding similar functionality to mysqlbinlog.
-=-=(Guest - Mon, 10 Aug 2009, 14:51)=-=-
High-Level Specification modified.
--- /tmp/wklog.40.old.16949 2009-08-10 14:51:33.000000000 +0300
+++ /tmp/wklog.40.new.16949 2009-08-10 14:51:33.000000000 +0300
@@ -1 +1,73 @@
+1. Context
+----------
+At the moment, the server has these replication slave options:
+
+ --replicate-do-table=db.tbl
+ --replicate-ignore-table=db.tbl
+ --replicate-wild-do-table=pattern.pattern
+ --replicate-wild-ignore-table=pattern.pattern
+
+They affect both RBR and SBR events. SBR events are checked after the
+statement has been parsed, the server iterates over list of used tables and
+checks them againist --replicate instructions.
+
+What is interesting is that this scheme still allows to update the ignored
+table through a VIEW.
+
+2. Table filtering in mysqlbinlog
+---------------------------------
+
+Per-table filtering of RBR events is easy (as it is relatively easy to extract
+the name of the table that the event applies to).
+
+Per-table filtering of SBR events is hard, as generally it is not apparent
+which tables the statement refers to.
+
+This opens possible options:
+
+2.1 Put the parser into mysqlbinlog
+-----------------------------------
+Once we have a full parser in mysqlbinlog, we'll be able to check which tables
+are used by a statement, and will allow to show behaviour identical to those
+that one obtains when using --replicate-* slave options.
+
+(It is not clear how much effort is needed to put the parser into mysqlbinlog.
+Any guesses?)
+
+
+2.2 Use dumb regexp match
+-------------------------
+Use a really dumb approach. A query is considered to be modifying table X if
+it matches an expression
+
+CREATE TABLE $tablename
+DROP $tablename
+UPDATE ...$tablename ... SET // here '...' can't contain the word 'SET'
+DELETE ...$tablename ... WHERE // same as above
+ALTER TABLE $tablename
+.. etc (go get from the grammar) ..
+
+The advantage over doing the same in awk is that mysqlbinlog will also process
+RBR statements, and together with that will provide a working solution for
+those who are careful with their table names not mixing with string constants
+and such.
+
+(TODO: string constants are of particular concern as they come from
+[potentially hostile] users, unlike e.g. table aliases which come from
+[not hostile] developers. Remove also all string constants before attempting
+to do match?)
+
+2.3 Have the master put annotations
+-----------------------------------
+We could add a master option so that it injects into query a mark that tells
+which tables the query will affect, e.g. for the query
+
+ UPDATE t1 LEFT JOIN db3.t2 ON ... WHERE ...
+
+
+the binlog will have
+
+ /* !mysqlbinlog: updates t1,db3.t2 */ UPDATE t1 LEFT JOIN ...
+
+and further processing in mysqlbinlog will be trivial.
DESCRIPTION:
A replication slave can be set to filter updates to certain tables with the
--replicate-[wild-]{do,ignore}-table options.
This task is about adding similar functionality to mysqlbinlog.
HIGH-LEVEL SPECIFICATION:
1. Context
----------
(See http://askmonty.org/wiki/index.php/Scratch/ReplicationOptions for global
overview)
At the moment, the server has these replication slave options:
--replicate-do-table=db.tbl
--replicate-ignore-table=db.tbl
--replicate-wild-do-table=pattern.pattern
--replicate-wild-ignore-table=pattern.pattern
They affect both RBR and SBR events. SBR events are checked after the
statement has been parsed: the server iterates over the list of used tables
and checks them against the --replicate-* options.
Interestingly, this scheme still allows the ignored table to be updated
through a VIEW.
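The check described above can be sketched roughly as follows. This is a simplified illustration, not the server's code: the TableFilter struct and the function names are made up for this note, the evaluation order is an assumption modelled on the documented behaviour of the table-level options, and wildcard escaping is ignored.

```cpp
#include <string>
#include <vector>

// Hypothetical container for the four slave options (not a server type).
struct TableFilter {
  std::vector<std::string> do_table;          // --replicate-do-table
  std::vector<std::string> ignore_table;      // --replicate-ignore-table
  std::vector<std::string> wild_do_table;     // --replicate-wild-do-table
  std::vector<std::string> wild_ignore_table; // --replicate-wild-ignore-table
};

// LIKE-style match: '%' matches any run of characters, '_' exactly one.
// Escaped wildcards are not handled in this sketch.
static bool wild_match(const char *pat, const char *str) {
  if (*pat == '\0') return *str == '\0';
  if (*pat == '%') {
    for (const char *s = str;; ++s) {  // try every possible split point
      if (wild_match(pat + 1, s)) return true;
      if (*s == '\0') return false;
    }
  }
  if (*str == '\0') return false;
  if (*pat == '_' || *pat == *str) return wild_match(pat + 1, str + 1);
  return false;
}

static bool listed(const std::vector<std::string> &list,
                   const std::string &name) {
  for (const std::string &e : list)
    if (e == name) return true;
  return false;
}

static bool wild_listed(const std::vector<std::string> &list,
                        const std::string &name) {
  for (const std::string &p : list)
    if (wild_match(p.c_str(), name.c_str())) return true;
  return false;
}

// Decide whether a statement using the given "db.tbl" names is applied.
bool tables_ok(const TableFilter &f, const std::vector<std::string> &used) {
  for (const std::string &name : used) {
    if (listed(f.do_table, name)) return true;
    if (listed(f.ignore_table, name)) return false;
    if (wild_listed(f.wild_do_table, name)) return true;
    if (wild_listed(f.wild_ignore_table, name)) return false;
  }
  // No rule matched: apply only if no "do" rules were configured at all.
  return f.do_table.empty() && f.wild_do_table.empty();
}
```

Because the list of used tables comes from the parsed statement, a statement that names only a VIEW never presents the underlying ignored table to this check.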
2. Table filtering in mysqlbinlog
---------------------------------
Per-table filtering of RBR events is easy (as it is relatively easy to extract
the name of the table that the event applies to).
Per-table filtering of SBR events is hard, as generally it is not apparent
which tables the statement refers to.
This leaves the following options:
2.1 Put the parser into mysqlbinlog
-----------------------------------
Once we have a full parser in mysqlbinlog, we'll be able to check which tables
are used by a statement, and mysqlbinlog will be able to show behaviour
identical to what one gets with the --replicate-* slave options.
(It is not clear how much effort is needed to put the parser into mysqlbinlog.
Any guesses?)
2.2 Use dumb regexp match
-------------------------
Use a really dumb approach. A query is considered to be modifying table X if
it matches one of these expressions:
CREATE TABLE $tablename
DROP $tablename
UPDATE ...$tablename ... SET // here '...' can't contain the word 'SET'
DELETE ...$tablename ... WHERE // same as above
ALTER TABLE $tablename
.. etc (go get from the grammar) ..
The advantage over doing the same in awk is that mysqlbinlog will also process
RBR events, and so will provide a working solution for those who are careful
to keep their table names from clashing with string constants and such.
(TODO: string constants are of particular concern as they come from
[potentially hostile] users, unlike e.g. table aliases, which come from
[not hostile] developers. Should we also remove all string constants before
attempting the match?)
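A minimal sketch of this approach, assuming table names made of ordinary identifier characters. The function names (strip_string_literals, regex_escape, statement_modifies_table) are made up for illustration and are not part of mysqlbinlog. String constants are blanked out first, as the TODO suggests, then a handful of patterns are tried. Only a few statement types are covered, and DROP is matched as DROP TABLE.

```cpp
#include <regex>
#include <string>

// Replace every '...' or "..." constant with an empty one, so that text
// inside string constants can never look like a table name.
std::string strip_string_literals(const std::string &query) {
  static const std::regex lit(
      R"re('(?:[^'\\]|\\.|'')*'|"(?:[^"\\]|\\.|"")*")re");
  return std::regex_replace(query, lit, "''");
}

// Backslash-escape regex metacharacters (e.g. the dot in "db3.t2").
std::string regex_escape(const std::string &s) {
  static const std::string meta = ".^$|()[]{}*+?\\";
  std::string out;
  for (char c : s) {
    if (meta.find(c) != std::string::npos) out += '\\';
    out += c;
  }
  return out;
}

// Dumb check: does the statement look like it modifies `table`?
bool statement_modifies_table(const std::string &query,
                              const std::string &table) {
  const std::string q = strip_string_literals(query);
  const std::string t = "\\b" + regex_escape(table) + "\\b";
  // "..." that must not contain the word SET (resp. WHERE).
  const std::string no_set = R"((?:(?!\bSET\b)[\s\S])*?)";
  const std::string no_where = R"((?:(?!\bWHERE\b)[\s\S])*?)";
  const std::string patterns[] = {
      R"(^\s*CREATE\s+TABLE\s+)" + t,
      R"(^\s*DROP\s+TABLE\s+)" + t,
      R"(^\s*ALTER\s+TABLE\s+)" + t,
      R"(^\s*UPDATE\b)" + no_set + t + no_set + R"(\bSET\b)",
      R"(^\s*DELETE\b)" + no_where + t + no_where + R"(\bWHERE\b)",
  };
  for (const std::string &p : patterns) {
    if (std::regex_search(q, std::regex(p, std::regex::icase))) return true;
  }
  return false;
}
```

For example, "UPDATE t2 JOIN t3 ON t3.c = 't1' SET t2.a = 1" is not reported as modifying t1, because the constant 't1' is removed before matching.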
2.3 Have the master put annotations
-----------------------------------
We could add a master option that makes it inject into the query a marker
telling which tables the query will affect, e.g. for the query
UPDATE t1 LEFT JOIN db3.t2 ON ... WHERE ...
the binlog will have
/* !mysqlbinlog: updates t1,db3.t2 */ UPDATE t1 LEFT JOIN ...
and further processing in mysqlbinlog will be trivial.
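To illustrate how little mysqlbinlog would have to do, here is a sketch that extracts the table list from such a marker. It assumes the marker has exactly the form shown above and sits at the very start of the query. The function name annotated_tables is hypothetical.

```cpp
#include <string>
#include <vector>

// Return the tables named in a leading
//   /* !mysqlbinlog: updates t1,db3.t2 */
// marker, or an empty vector if the query does not start with one.
std::vector<std::string> annotated_tables(const std::string &query) {
  static const std::string prefix = "/* !mysqlbinlog: updates ";
  std::vector<std::string> tables;
  if (query.compare(0, prefix.size(), prefix) != 0) return tables;
  const std::size_t end = query.find(" */", prefix.size());
  if (end == std::string::npos) return tables;

  // Split the comma-separated list between the prefix and the " */".
  const std::string list = query.substr(prefix.size(), end - prefix.size());
  std::size_t start = 0;
  while (start <= list.size()) {
    std::size_t comma = list.find(',', start);
    if (comma == std::string::npos) comma = list.size();
    if (comma > start) tables.push_back(list.substr(start, comma - start));
    start = comma + 1;
  }
  return tables;
}
```

The returned names can then be checked against the filtering options without looking at the statement text at all.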
2.4 Implement server functionality to ignore certain tables
-----------------------------------------------------------
We could add a general facility in the server to ignore certain tables:
SET SESSION ignored_tables = "db1.t1,db2.t2";
This would work similarly to --replicate-ignore-table, but in a general way
not restricted to the slave SQL thread.
It would then be trivial for mysqlbinlog to add such statements at the start
of the output, or probably the user could just do it manually with no need for
additional options for mysqlbinlog.
It might be useful to integrate this with the code that already handles
--replicate-ignore-db and similar slave options.
2.5 Extend Query Events With Tables Info
----------------------------------------
We could extend the query event structure with tables info, a list of the
tables which the query refers to:
<current query event structure>
tables_info_len
dbase_len dbase
table_len table
...
dbase_len dbase
table_len table
Note. If <dbase> is the current database, we can set dbase_len = 0 and leave
dbase empty, because the current query event structure already includes the
current database name.
Note. It may also be reasonable to add a --binlog-with-tables-info option
which defines whether tables info must be included in the query events.
LOW-LEVEL DESIGN:
OPTION: 2.5 Extend Query Events With Tables Info
================================================
1. Query_log_event Binary Format
********************************
Changes to be done:
Query_log_event binary format
---------------------------------
Name                Size (bytes)
---------------------------------
COMMON HEADER:
  timestamp         4
  type              1
  server_id         4
  total_size        4
  master_position   4
  flags             2
---------------------------------
POST HEADER:
  slave_proxy_id    4
  exec_time         4
  db_len            1
+ query_len         2 (see Note 1)
  error_code        2
  status_vars_len   2
+ tables_info_len   2 (see Note 2)
---------------------------------
BODY:
  status_vars       status_vars_len
- db                db_len + 1
+ db                db_len (see Note 3)
  query             query_len
+ tables_info
tables_info binary format
---------------------------------
Name                Size (bytes)
---------------------------------
  db_len            1 (see Note 4)
  db                db_len
  table_name_len    1
  table_name        table_name_len
  ...
  db_len            1
  db                db_len
  table_name_len    1
  table_name        table_name_len
NOTES
1. Currently the Query_log_event format doesn't include 'query_len' because
it considers the query to extend to the end of the event.
2. If tables_info is not included in the event (see the
--binlog-with-tables-info option), tables_info_len = 0.
3. The trailing zero is redundant since the length is already known.
4. If db is the current db, db_len = 0 and db is empty, because the
current db is already included in the current event format.
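The tables_info layout above could be written and read back along these lines. This is only a sketch of the proposed format, not existing server code: TableRef, pack_tables_info and unpack_tables_info are hypothetical names, and the caller is assumed to check that the result fits into the 2-byte tables_info_len.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// One entry of tables_info (hypothetical type for this sketch).
struct TableRef {
  std::string db;
  std::string table;
};

// Append a field as a 1-byte length followed by the bytes themselves.
static void put_field(std::string &out, const std::string &s) {
  if (s.size() > 255) throw std::length_error("name longer than 255 bytes");
  out.push_back(static_cast<char>(s.size()));
  out += s;
}

// Serialize the list; a db equal to the current db is stored with
// db_len = 0 (Note 4).
std::string pack_tables_info(const std::vector<TableRef> &tables,
                             const std::string &current_db) {
  std::string out;
  for (const TableRef &t : tables) {
    put_field(out, t.db == current_db ? std::string() : t.db);
    put_field(out, t.table);
  }
  return out;
}

// Parse the list back; an empty db is replaced by the current db.
std::vector<TableRef> unpack_tables_info(const std::string &buf,
                                         const std::string &current_db) {
  std::vector<TableRef> tables;
  std::size_t pos = 0;
  auto get_field = [&]() {
    if (pos >= buf.size()) throw std::runtime_error("truncated tables_info");
    const std::size_t len = static_cast<unsigned char>(buf[pos++]);
    if (buf.size() - pos < len)
      throw std::runtime_error("truncated tables_info");
    std::string s = buf.substr(pos, len);
    pos += len;
    return s;
  };
  while (pos < buf.size()) {
    TableRef t;
    t.db = get_field();
    t.table = get_field();
    if (t.db.empty()) t.db = current_db;
    tables.push_back(t);
  }
  return tables;
}
```

With the current db "test", the tables test.t1 and db3.t2 take 11 bytes: 0, then 2 "t1", then 3 "db3", then 2 "t2".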
2. Where to get tables info from?
*********************************
2.1. Case study: CREATE TABLE
******************************
*** CREATE TABLE table [SELECT ...]
bool mysql_create_table_no_lock(
  THD *thd,
  const char *db,
  const char *table_name, ...)
{
  ...
  // -------------------------------------
  // WL40: To be included in tables_info:
  // * db, table_name
  // * thd->lex->query_tables (tables referred to in
  //   the select-part; empty if no select-part)
  // -------------------------------------
  write_bin_log(thd, TRUE, thd->query, thd->query_length);
}
*** CREATE TABLE table LIKE src-table
bool mysql_create_like_table(
  ...
  TABLE_LIST *table,
  TABLE_LIST *src_table,
  ...)
{
  ...
  if (thd->current_stmt_binlog_row_based)
  { // RBR: In this case we don't replicate temp tables
    if (!(create_info->options & HA_LEX_CREATE_TMP_TABLE))
    {
      if (src_table->table->s->tmp_table)
      { // CREATE normal-table LIKE temp-table:
        // Generate new query without LIKE-part
        store_create_info(thd, table, &query, create_info, FALSE);
        // -------------------------------------
        // WL40: To be included in tables_info:
        // * table (src_table is not included)
        // -------------------------------------
        write_bin_log(thd, TRUE, query.ptr(), query.length());
      }
      else
      { // CREATE normal-table LIKE normal-table
        // -------------------------------------
        // WL40: To be included in tables_info:
        // * table
        // * src_table
        // -------------------------------------
        write_bin_log(thd, TRUE, thd->query, thd->query_length);
      }
    }
    // CREATE temp-table LIKE ...
    // This case is not replicated
  }
  else
  { // SBR:
    // -------------------------------------
    // WL40: To be included in tables_info:
    // * table
    // * src_table
    // -------------------------------------
    write_bin_log(thd, TRUE, thd->query, thd->query_length);
  }
}
To be continued
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Rewrite of the free documentation to make it about MariaDB, not MySQL
by Henrik Ingo 03 Nov '09
Daniel
I glanced through some of the commits today and noticed Kristian
committing the new version of MySQL 5.1.38 documentation (man pages
and other free docs).
The docs seem to include lots of references to where you can download
MySQL RPMs, where you can buy MySQL support, Sun's support lifecycle
policies, etc. As part of your documentation work, could you be in
touch with Kristian and find out how you could help in cleaning that
up? Meaning: either remove such sections, or replace them with relevant
MariaDB information.
While at it, I'd like to review the Readme file of MariaDB. Could you
help me with that (i.e. extract it from bzr and work with me to rewrite
it, so that I can just participate by email)?
We should fix the Readme file before the GA release. The other docs are
only semi-urgent.
henrik
--
email: henrik.ingo(a)avoinelama.fi
tel: +358-40-5697354
www: www.avoinelama.fi/~hingo
book: www.openlife.cc
[Maria-developers] New (by Sanja): Add info to engine description (61)
by worklog-noreply@askmonty.org 02 Nov '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add info to engine description
CREATION DATE..: Mon, 02 Nov 2009, 22:58
SUPERVISOR.....: Monty
IMPLEMENTOR....: Sanja
COPIES TO......:
CATEGORY.......: Server-BackLog
TASK ID........: 61 (http://askmonty.org/worklog/?tid=61)
VERSION........: Server-5.1
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 5 (hours remain)
ORIG. ESTIMATE.: 5
PROGRESS NOTES:
DESCRIPTION:
Add additional information about each engine and show it in SHOW ENGINES:
License (PROPRIETARY, GPL, BSD) (already present, it just has to be shown)
Maturity (TEST, ALPHA, BETA, GAMMA, RELEASE)
Version (just a string from the engine developer like "0.99 beta"; better if
it is monotonically increasing in terms of alphabetical sort).
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)