Alexi1952 <Alexi1952@yandex.ru> writes:

Hi Alexi! Sorry for the delay in answering, things have been quite busy...
With WL36 and WL40 we have:

mysqlbinlog options:
  --database=db
  --rewrite-db=db_from->db_to
  --do-table=db.tbl
  --ignore-table=db.tbl
  --wild-do-table=pattern.pattern
  --wild-ignore-table=pattern.pattern

replication options:
  --replicate-rewrite-db=db_from->db_to
  --replicate-do-db=db
  --replicate-ignore-db=db
  --replicate-do-table=tbl
  --replicate-ignore-table=tbl
  --replicate-wild-do-table=db.tbl
  --replicate-wild-ignore-table=db.tbl
1. In mysqlbinlog we do not have --do_db and --ignore_db options. Does this mean that one is instead supposed to use

  --replicate-wild-do-table=db.%
  --replicate-wild-ignore-table=db.%

respectively?
I think the --database option of mysqlbinlog is supposed to be similar to mysqldump of a particular database, rather than similar to replication.
Compared with the other options, the --database option looks like a "foreign body":
- unlike the other options, it allows only one database to be specified (with multiple --database options, only the last one is used);
- while --database gives us an analog of do_db, there is no corresponding analog of ignore_db.
2. In replication two functions are used for filtering databases:
- db_ok(const char* db) which matches db only with do-db and ignore-db rules;
- db_ok_with_wild_table(const char* db) which matches db only with wild-do-table=db.% and wild-ignore-table=db.% rules. This function is applied only to CREATE DATABASE, DROP DATABASE, and ALTER DATABASE statements.
In mysqlbinlog, should we follow the same scheme, namely:
- db_ok() for matching db with --database option only;
- db_ok_with_wild_table() for statements listed above?
It is a bit of a complex issue, but your suggestion sounds reasonable.
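The two-function scheme from point 2 can be modelled roughly as follows. This is a minimal Python sketch based only on the description above, not the server's implementation: the function names mirror db_ok() and db_ok_with_wild_table(), but the rule containers, the wild_match helper, the lack of escape handling in patterns, and the precedence between do and ignore rules are all assumptions made for illustration.

```python
import re


def wild_match(pattern, name):
    """SQL-style wildcard match: % is any run of characters, _ is any single one.

    Simplification: backslash escapes in the pattern are not handled.
    """
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, name, flags=re.DOTALL) is not None


def db_ok(db, do_db=(), ignore_db=()):
    # Assumed precedence: a non-empty do list decides alone,
    # otherwise the ignore list decides; no rules at all means accept.
    if do_db:
        return db in do_db
    return db not in ignore_db


def db_ok_with_wild_table(db, wild_do=(), wild_ignore=()):
    # Only rules of the form "<db pattern>.%" are considered here;
    # rules that name specific tables do not take part in this check.
    def db_parts(patterns):
        return [p[:-2] for p in patterns if p.endswith(".%")]

    if any(wild_match(p, db) for p in db_parts(wild_do)):
        return True
    if any(wild_match(p, db) for p in db_parts(wild_ignore)):
        return False
    # Assumption: if do rules exist and none matched, reject.
    return not wild_do
```

In this model, db_ok_with_wild_table("db", wild_do=["db.%"]) accepts a CREATE DATABASE db statement, while a rule such as "db.t%" is not a database-level rule and so does not make the statement pass.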
3. According to replication filtering rules, --replicate-rewrite-db is always done _before_ other --replicate-* rules are tested; see the explanation for --replicate-rewrite-db in RefMan (16.1.3.3. Replication Slave Options and Variables), or the following piece of code in log_event.cc:

    int Table_map_log_event::do_apply_event(Relay_log_info const* rli)
    {
      RPL_TABLE_LIST* table_list;
      ...
      strmov(table_list->db, rpl_filter->get_rewrite_db(m_dbnam, &dummy_len));
      ...
      if (...!rpl_filter->db_ok(table_list->db) ...)
        ...
    }

And what about --database + rewrite-db for mysqlbinlog? If we mean to output only database xxx with renaming it to yyy, should we use
(1) mysqlbinlog --database=xxx --rewrite-db=xxx->yyy
or
(2) mysqlbinlog --database=yyy --rewrite-db=xxx->yyy
In the current WL36 design it is assumed that (1) should be used (surely, this can easily be redesigned). But this becomes confusing when --wild-do-table is used with the replication filtering rules, under which we would have to write:
(3) --wild-do-table=yyy.% --rewrite-db=xxx->yyy
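
The ordering question can be made concrete with a small model. The sketch below is illustrative Python, not mysqlbinlog code: select_events and its parameters are hypothetical, and the filter is reduced to an exact database-name match. It contrasts filtering on the original name (the current WL36 design, variant (1)) with rewriting first and filtering on the new name (the replication order, variant (2)).

```python
def select_events(events, database, rewrite, rewrite_first):
    """Return the database names written to the output.

    events        -- database names of the events in the binlog
    database      -- the single name accepted by the filter
    rewrite       -- dict mapping db_from to db_to
    rewrite_first -- True: rewrite, then filter (replication order)
                     False: filter on the original name, then rewrite
    """
    out = []
    for db in events:
        renamed = rewrite.get(db, db)
        tested = renamed if rewrite_first else db
        if tested == database:
            out.append(renamed)
    return out


events = ["xxx", "yyy", "zzz"]
rewrite = {"xxx": "yyy"}

# (1) --database=xxx --rewrite-db=xxx->yyy, filter before rewrite
print(select_events(events, "xxx", rewrite, rewrite_first=False))
# (2) --database=yyy --rewrite-db=xxx->yyy, rewrite before filter
print(select_events(events, "yyy", rewrite, rewrite_first=True))
```

In this model, variant (2) also lets through the event that was in yyy to begin with, whereas variant (1) outputs only the events that originated in xxx.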
Personally, I think it makes more sense to apply the filter before the rewrite in mysqlbinlog, even though this is different from how replication works. (2) and (3) appear quite confusing (to the user).

I think the difference is that with replication, the --replicate-[wild-]{do,ignore}-table options concern how the binlog is _applied_. So it makes sense (maybe) to rewrite the events first, and then apply them (since the other way around is impossible). But with mysqlbinlog we are not applying events, only filtering. So I think it makes more sense for filtering rules to apply to the name before rewriting.

We can't change replication to do filtering before rewriting, which leaves it inconsistent with mysqlbinlog according to this. But instead we could use different names for the options, as I think the -do- implies actually applying the events rather than filtering. So maybe

  --include-table
  --exclude-table
  --wild-include-table
  --wild-exclude-table

and these (and --database) are done before rewrite. And the --wild-* forms should be sufficient for more complex database filtering (so no need for options similar to --replicate-{do,ignore}-db).

At least that's my immediate opinion. Good analysis BTW!

 - Kristian.