Re: [Maria-developers] [server] Mdev 7802 binlog groupcommit stats (#30)
Daniel Black <notifications@github.com> writes: Thanks for your patch. I agree that these kinds of status information could be useful. But I have some parts that are not clear, and I want you to think about the questions below and answer with your thoughts. This is to make sure you understand yourself the issue in detail; to help me understand it; and in some cases maybe some changes should be made.
The following global status variables where added: * binlog_group_commit_reason_count * binlog_group_commit_reason_usec
So, these names seem a bit odd. Neither "count" nor "usec" makes much sense as a reason for something to happen. "_timeout" would be a better reason that "_usec", but I am not sure of a better name for "_count". The good thing about these names of course is that they match the corresponding --binlog-commit-wait-* option. What do you think of something like binlog_commit_trigger_count binlog_commit_trigger_timeout instead ?
* binlog_group_commit_reason_transaction * binlog_group_commit_reason_immediate
binlog_group_commit_reason_transaction is a result of ordered transaction that need to occur in the same order on the slave and can't be parallelised.
binlog_group_commit_reason_immediate is caused to prevent stalls with row locks as described in log.cc:binlog_report_wait_for. This immediate count is also counted a second time in binlog_group_commit_reason_transaction.
Please re-read that explanation, and consider: do you think an average user/DBA will fully understand what these status variables mean? Honestly, I am wondering if you can even yourself explain exactly what the difference is between the two :-). Can you give two simple examples of transactions, one example of which will increment the _transaction status variable and one of which will increment the _immediate? Do you think the distinction between the two is useful for the user? If not, maybe just have a single status variable, like binlog_commit_trigger_lock_wait for example?
Overall binlog_group_commits = binlog_group_commit_reason_count + binlog_group_commit_reason_usec + binlog_group_commit_reason_transaction
Is this also true if --binlog-commit-wait-count=0? If so, what will the values of the different status variables be in this case? Now, to the patches.
+ { + if (++count >= opt_binlog_commit_wait_count) + { + group_commit_reason_count++; + return; + } + if (unlikely(e->thd->has_waiter)) + { + group_commit_reason_transaction++; return; + } + }
So this increments the global status variables directly, I think? How does the thread locking work? Is it possible for a reader to see a corrupt value (eg. one word of the old value and one word of the new) on 32-bit platform? How is this handled for other similar status variables? It would not be good if we had to take some LOCK_status or something inside the critical group commit code. With respect to the test cases: Can you please comment yourself on the different places where you added output of the status variables, and explain why this output will always be the same, no matter how threads are scheduled on the machine running the test? (This is the critical part of most tests related to replication; because of the inherently multi-threaded nature of such tests, a lot of care is needed to make sure the test will not produce different output on different test runs depending on the speed of the host or how threads are scheduled on a loaded machine). - Kristian.
----- Original Message -----
Daniel Black <notifications@github.com> writes:
Thanks for your patch. I agree that these kinds of status information could be useful. But I have some parts that are not clear, and I want you to think about the questions below and answer with your thoughts. This is to make sure you understand yourself the issue in detail; to help me understand it; and in some cases maybe some changes should be made.
The following global status variables where added: * binlog_group_commit_reason_count * binlog_group_commit_reason_usec
So, these names seem a bit odd. Neither "count" nor "usec" makes much sense as a reason for something to happen. "_timeout" would be a better reason that "_usec", but I am not sure of a better name for "_count". The good thing about these names of course is that they match the corresponding --binlog-commit-wait-* option.
What do you think of something like
binlog_commit_trigger_count binlog_commit_trigger_timeout
instead ?
Sure. reason->trigger and usec->timeout work. Regarding the dropping of "group_", I was trying to keep it aligned in name the total - binlog_group_commits so instead binlog_group_commit_trigger_{count,timeout,....}
* binlog_group_commit_reason_transaction * binlog_group_commit_reason_immediate
binlog_group_commit_reason_transaction is a result of ordered transaction that need to occur in the same order on the slave and can't be parallelised.
binlog_group_commit_reason_immediate is caused to prevent stalls with row locks as described in log.cc:binlog_report_wait_for. This immediate count is also counted a second time in binlog_group_commit_reason_transaction.
Please re-read that explanation, and consider: do you think an average user/DBA will fully understand what these status variables mean? Honestly, I am wondering if you can even yourself explain exactly what the difference is between the two :-).
Good test. I'd be struggling a bit without the above explanation. I was looking at something that would make up the difference to
Can you give two simple examples of transactions, one example of which will increment the _transaction status variable
e.g.1 create temporary table x ( x int); insert into x values (4); (probably don't even need this one(?)). e.g.2 C1: begin C1: update y set x=x/3 where z=4 C1: commit C2: begin C2: update y set x=4 where z=4 C2: commit where they occur in different connections within the same binlog-commit-wait-usec so the first is committed in a different group so that C2 doesn't have to deadlock.
and one of which will increment the _immediate?
update x set y=3; (statement) insert into y select * from x where y=3; Perhaps I was measuring in the wrong spot however group_commit_reason_immediate was always 0 in the patch to binlog_commit_wait.test/result that I added.
Do you think the distinction between the two is useful for the user?
the _immediate suggests a tighter contention in row locks that perhaps can't be avoided however _transaction would indicate that perhaps binlog-commit-wait-{} is too high
If not, maybe just have a single status variable, like
binlog_commit_trigger_lock_wait
for example?
Overall binlog_group_commits = binlog_group_commit_reason_count + binlog_group_commit_reason_usec + binlog_group_commit_reason_transaction
Yes holding that relationship would be good. the overlap of transaction/immediate was bad form.
Is this also true if --binlog-commit-wait-count=0?
I'd hope so but I'm not sure.
If so, what will the values of the different status variables be in this case?
Now, to the patches.
+ { + if (++count >= opt_binlog_commit_wait_count) + { + group_commit_reason_count++; + return; + } + if (unlikely(e->thd->has_waiter)) + { + group_commit_reason_transaction++; return; + } + }
So this increments the global status variables directly, I think?
group_commit_reason_count / group_commit_reason_transaction are members of the MYSQL_BIN_LOG class so I'm assuming there is only one of these.
How does the thread locking work? Is it possible for a reader to see a corrupt value (eg. one word of the old value and one word of the new) on 32-bit platform?
Badly. Looks like I'll need to move some bits so they are under LOCK_prepare_ordered which seems to be acquired close to reachable. I'll push this as the next commit on to the PR.
How is this handled for other similar status variables?
LOCK_commit_ordered seems to cover ++num_groups_commits (status var group_commits)
It would not be good if we had to take some LOCK_status or something inside the critical group commit code.
With respect to the test cases: Can you please comment yourself on the different places where you added output of the status variables, and explain why this output will always be the same, no matter how threads are scheduled on the machine running the test? (This is the critical part of most tests related to replication; because of the inherently multi-threaded nature of such tests, a lot of care is needed to make sure the test will not produce different output on different test runs depending on the speed of the host or how threads are scheduled on a loaded machine).
Ack.
- Kristian.
Thank you. -- -- Daniel Black, Engineer @ Open Query (http://openquery.com.au) Remote expertise & maintenance for MySQL/MariaDB server environments.
----- Original Message -----
Daniel Black <notifications@github.com> writes: What do you think of something like
binlog_commit_trigger_count binlog_commit_trigger_timeout
instead ?
to maintain consistency with binlog_group_commits... binlog_group_commit_trigger_count binlog_group_commit_trigger_timeout binlog_group_commit_reason_immediate removed ref: https://github.com/openquery/mariadb-server/commit/1d5220d1124111f563f9faec3...
If not, maybe just have a single status variable, like
binlog_commit_trigger_lock_wait
done.
So this increments the global status variables directly, I think? How does the thread locking work? Is it possible for a reader to see a corrupt value (eg. one word of the old value and one word of the new) on 32-bit platform?
Incrementing of counters was already under LOCK_prepare_ordered. Used this lock in the status retrieval and made sure it didn't overlap with the LOCK_commit_ordered used by the other binlog status variables. https://github.com/openquery/mariadb-server/commit/dd7026a703aabdbe0430bf5f3...
With respect to the test cases: Can you please comment yourself on the different places where you added output of the status variables, and explain why this output will always be the same, no matter how threads are scheduled on the machine running the test? (This is the critical part of most tests related to replication; because of the inherently multi-threaded nature of such tests, a lot of care is needed to make sure the test will not produce different output on different test runs depending on the speed of the host or how threads are scheduled on a loaded machine).
Added: https://github.com/openquery/mariadb-server/commit/0695fdd9df3501a02ae473c23... -- -- Daniel Black, Engineer @ Open Query (http://openquery.com.au) Remote expertise & maintenance for MySQL/MariaDB server environments.
participants (2)
-
Daniel Black
-
Kristian Nielsen