Do you have any errors in the error log about failure to delete rows?
Nope, no errors.
Anything else special about your setup that might be causing this?
At some point I thought maybe tokudb_analyze_in_background / tokudb_auto_analyze messes things up as it does the background check (you can also see the row count growing here):

2018-09-29 11:05:48 134488 [Note] TokuDB: Auto scheduling background analysis for ./mysql/gtid_slave_pos_TokuDB, delta_activity 423840 is greater than 40 percent of 1059601 rows. - succeeded.
2018-09-29 11:09:35 134490 [Note] TokuDB: Auto scheduling background analysis for ./mysql/gtid_slave_pos_TokuDB, delta_activity 424359 is greater than 40 percent of 1060885 rows. - succeeded.
2018-09-29 11:13:23 134488 [Note] TokuDB: Auto scheduling background analysis for ./mysql/gtid_slave_pos_TokuDB, delta_activity 424888 is greater than 40 percent of 1062196 rows. - succeeded.

(It also triggers in conservative mode, but there it happens just because a single row is >40% of the table.)

I tried switching off gtid_pos_auto_engines to use a single InnoDB gtid_pos table, and it makes no difference: in conservative mode everything is fine, in optimistic mode the table fills up.

The odd thing is that I'm actually not using GTID for the replication:

MariaDB [mysql]> show slave status\G
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 10.0.8.211
                   Master_User: repl
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: mysql-bin.096519
           Read_Master_Log_Pos: 79697585
                Relay_Log_File: db-relay-bin.000142
                 Relay_Log_Pos: 78464847
         Relay_Master_Log_File: mysql-bin.096519
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 0
                    Last_Error:
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 79697245
               Relay_Log_Space: 595992008
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
..
         Seconds_Behind_Master: 0
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 0
                 Last_IO_Error:
                Last_SQL_Errno: 0
                Last_SQL_Error:
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 211
                Master_SSL_Crl:
            Master_SSL_Crlpath:
                    Using_Gtid: No
                   Gtid_IO_Pos:
       Replicate_Do_Domain_Ids:
   Replicate_Ignore_Domain_Ids:
                 Parallel_Mode: optimistic
                     SQL_Delay: 0
           SQL_Remaining_Delay: NULL
       Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
              Slave_DDL_Groups: 25
Slave_Non_Transactional_Groups: 284193
    Slave_Transactional_Groups: 452098720

The other "special" thing, maybe, is that the master is still on 10.2.4 - but that shouldn't be the problem? I have 2 slaves (both 10.3.9; I might try downgrading back to 10.2.x or to earlier 10.3.x versions, as I don't know the exact point when this started to happen), and the issue is triggered immediately when switching the parallel mode.
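For reference, this is roughly how I flip between the two modes to trigger it (a minimal sketch; as far as I know slave_parallel_mode can only be changed while the slave is stopped):

STOP SLAVE;
SET GLOBAL slave_parallel_mode = 'optimistic';  -- gtid_slave_pos starts growing right after this
START SLAVE;

Setting it back to 'conservative' the same way and the table behaves normally again.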
Can you share the contents of the mysql.gtid_slave_pos table when this happens?
Sure,

MariaDB [mysql]> select @@gtid_slave_pos;
+--------------------+
| @@gtid_slave_pos   |
+--------------------+
| 0-211-211038653075 |
+--------------------+
1 row in set (0.000 sec)

MariaDB [mysql]> select * from gtid_slave_pos limit 10;
+-----------+--------+-----------+--------------+
| domain_id | sub_id | server_id | seq_no       |
+-----------+--------+-----------+--------------+
|         0 |  29488 |       211 | 210594092751 |
|         0 |  29490 |       211 | 210594092753 |
|         0 |  29957 |       211 | 210594093220 |
|         0 |  29958 |       211 | 210594093221 |
|         0 |  29961 |       211 | 210594093224 |
|         0 |  29962 |       211 | 210594093225 |
|         0 |  30095 |       211 | 210594093358 |
|         0 |  30096 |       211 | 210594093359 |
|         0 |  30247 |       211 | 210594093510 |
|         0 |  30275 |       211 | 210594093538 |
+-----------+--------+-----------+--------------+
10 rows in set (0.000 sec)

MariaDB [mysql]> select count(*) from gtid_slave_pos;
+----------+
| count(*) |
+----------+
|  2395877 |
+----------+
1 row in set (0.578 sec)

MariaDB [mysql]> select * from gtid_slave_pos_TokuDB limit 10;
+-----------+--------+-----------+--------------+
| domain_id | sub_id | server_id | seq_no       |
+-----------+--------+-----------+--------------+
|         0 |  29373 |       211 | 210594092636 |
|         0 |  29911 |       211 | 210594093174 |
|         0 |  29912 |       211 | 210594093175 |
|         0 |  30282 |       211 | 210594093545 |
|         0 |  30283 |       211 | 210594093546 |
|         0 |  30284 |       211 | 210594093547 |
|         0 |  30285 |       211 | 210594093548 |
|         0 |  30287 |       211 | 210594093550 |
|         0 |  30348 |       211 | 210594093611 |
|         0 |  30349 |       211 | 210594093612 |
+-----------+--------+-----------+--------------+
10 rows in set (0.001 sec)

MariaDB [mysql]> select count(*) from gtid_slave_pos_TokuDB;
+----------+
| count(*) |
+----------+
|   840001 |
+----------+
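If it helps, here is a quick way to count how much of that is stale (a sketch; my understanding is that only the row with the highest sub_id per domain_id is still needed, and the older rows are supposed to be deleted in the background):

SELECT domain_id, COUNT(*) AS total_rows, MAX(sub_id) AS newest_sub_id
FROM mysql.gtid_slave_pos
GROUP BY domain_id;

With a single replication domain that should normally leave only a handful of rows, so essentially all of the rows here look like old positions that were never deleted.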
Is there something wrong with the purger? (Something similar to https://jira.mariadb.org/browse/MDEV-12147?)
That bug is rather different - there the row count in the table is not growing, but the number of unpurged rows is.
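To make the difference concrete, a rough sketch of how to tell the two apart (assuming an InnoDB gtid_slave_pos table; the Rows value from SHOW TABLE STATUS is an estimate that can be inflated by delete-marked rows the purge thread has not yet removed, while COUNT(*) sees only live rows):

SELECT COUNT(*) FROM mysql.gtid_slave_pos;
SHOW TABLE STATUS FROM mysql LIKE 'gtid_slave_pos';
SHOW GLOBAL STATUS LIKE 'Innodb_history_list_length';

In MDEV-12147, COUNT(*) stays flat while the estimate and the purge backlog grow; in your case COUNT(*) itself is growing.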
True, I just searched Jira for similar issues (related to gtid_slave_pos) and saw that this one is still open. Optimistic mode makes a big difference in our setup, as with conservative mode there are times when the slaves start to lag several days behind.