Kristian,
Thanks for the explanation.
I see why the record is there, but I still consider it a glitch because the event was not replicated to that server: it originated there. In other topologies (which do not include log-slave-updates in any node) there is no such problem.
This issue does not prevent the setup from working. As I mentioned before, the star topology works to my satisfaction. Expect an article on this topic next week (after the ones about fan-in and all-masters point-to-point topologies).

Cheers

Giuseppe



On Thu, Aug 13, 2015 at 11:20 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
Giuseppe Maxia <g.maxia@gmail.com> writes:

> Thanks for the quick answer. What you are saying is that the transaction in
> node 101 is retained because it is part of the stream coming from node 100.

Yes.

> Here I note a difference with the usage of domain_ids.
> * If the domain_id is the same, then the transaction from 101 is discarded
> and replaced by the latest events coming from 100.
> * If I set a different domain ID for each master, the transaction from 101
> is retained even if I then insert hundreds of events from 100 while 101
> stays idle.

The way it works is that @@gtid_slave_pos records, for each configured
domain_id, the last GTID seen within that replication domain. So in general,
@@gtid_slave_pos will have as many elements as there are domains in the
replication topology (though a domain may have no entry if no GTIDs from it
were ever replicated on a given server).

So this difference is always there: configuring one more domain_id will
result in one extra entry in @@gtid_slave_pos.
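
For illustration, on a slave replicating two domains it could look something
like this (all ids and sequence numbers below are made up):

    MariaDB> SELECT @@gtid_slave_pos;
    +------------------+
    | @@gtid_slave_pos |
    +------------------+
    | 1-100-52,2-101-7 |
    +------------------+

Each element has the form domain_id-server_id-seq_no, one per replication
domain.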

In a simple replication topology with only a single master, domain_id is not
needed. Slaves preserve commit order from their master, so the order of
GTIDs in all binlogs is the same, and a slave that reconnects need only
start at the point of the last GTID it applied before. When a slave switches
to a new master, it is essential that binlog order is the same on old and
new master for this to work correctly.
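
This is also why, with GTID, pointing a slave at a new master is a single
command - something like (the hostname is just a placeholder):

    STOP SLAVE;
    CHANGE MASTER TO
      master_host = 'new-master.example.com',
      master_use_gtid = slave_pos;
    START SLAVE;

With master_use_gtid=slave_pos, the slave asks the new master to start from
the last GTID recorded in @@gtid_slave_pos, rather than from a binlog file
name and offset.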

With multiple masters (multi-source, ring topology, or otherwise), in
general we cannot be sure that binlog order will be identical between old
and new master when a slave switches master. This is the purpose of
domain_id. Effectively, it makes the binlog consist of independent streams,
one per replication domain. The slave keeps track of its position (last
applied GTID) for each domain individually. Then only the order within each
stream needs to be consistent for a slave to connect to a new master
correctly. And this is ensured by configuring domain_id so that each domain
has at most one active master at any one time.
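
Configuring this is just a matter of giving each master its own domain in
my.cnf - for example (the values 1 and 2 are arbitrary, they only need to
differ between masters):

    # my.cnf on master A
    [mysqld]
    gtid_domain_id = 1

    # my.cnf on master B
    [mysqld]
    gtid_domain_id = 2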

So in your star topology, if you configure different domain ids, and you
have a slave off one endpoint, you can switch it to use a different endpoint
(or the hub) as a new master - and later switch it back. This requires
remembering the position within each domain indefinitely.
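
Concretely, moving such a slave between endpoints is then just (hostname
again a placeholder):

    STOP SLAVE;
    CHANGE MASTER TO master_host = 'endpoint-2.example.com';
    START SLAVE;

since @@gtid_slave_pos already records the position within every domain, no
binlog file name or offset needs to be supplied.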

If you were to use the same domain id everywhere, replication would still
work, same as in non-GTID mode. But you would not be able to easily switch a
slave from one endpoint to another - again the same as in non-GTID mode.

> To give you a clearer view of what I am doing, I am experimenting with a
> star topology, where I have endpoints that are masters connected to a hub,
> which is the node with log-slave-updates enabled. The topology works
> perfectly: data produced in the endpoints or in the hub reach all the other
> endpoints. The only glitch is the endpoints (master nodes without
> log-slave-updates) have their own transactions in gtid_slave_pos long after
> purging the logs in all nodes.

If this entry were missing, it would mean that upon the next slave connect,
that server would have to fetch _all_ events in that domain (those it
created itself) from the master, only to skip them because of
--replicate-same-server-id=0. Which surely is not intended.
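
The entry is stored in the mysql.gtid_slave_pos table, so you can inspect it
directly - the row below is invented for illustration:

    SELECT * FROM mysql.gtid_slave_pos;
    +-----------+--------+-----------+--------+
    | domain_id | sub_id | server_id | seq_no |
    +-----------+--------+-----------+--------+
    |       101 |     14 |       101 |      3 |
    +-----------+--------+-----------+--------+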

I guess the thing that is not clear to me is why you consider this a glitch?
Does it create any problems for your setup?

 - Kristian.