Hi Marko:
 
thanks for your prompt reply.
 
What would EXPLAIN show for the SELECT of session 1?  Can you set a breakpoint on lock_rec_lock() while the SELECT? What are the invocations and their stack traces?
 
explain select * from sbtest1 where id=200057 for update;
+----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | sbtest1 | NULL       | const | PRIMARY       | PRIMARY | 4       | const |    1 |   100.00 | NULL  |
+----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.01 sec)
 
use the primary key for searching, not a full table scan, so add X Lock on record(id=200057),  no doubt about it. 
I debugged it, so I'm sure the lock is added on supremum in row0sel.cc::row_search_mvcc:5072(ver: mysql-5.7.12), it says:
 
if (page_rec_is_infimum(rec)) {
 
    /* The infimum record on a page cannot be in the result set,
    and neither can a record lock be placed on it: we skip such
    a record. */
 
    goto next_rec;
  }
 
  if (page_rec_is_supremum(rec)) {
 
    if (set_also_gap_locks
        && !(srv_locks_unsafe_for_binlog
       || trx->isolation_level <= TRX_ISO_READ_COMMITTED)
        && prebuilt->select_lock_type != LOCK_NONE
        && !dict_index_is_spatial(index)) {
 
      /* Try to place a lock on the index record */
 
      /* If innodb_locks_unsafe_for_binlog option is used
      or this session is using a READ COMMITTED isolation
      level we do not lock gaps. Supremum record is really
      a gap and therefore we do not set locks there. */
 
      offsets = rec_get_offsets(rec, index, offsets,
              ULINT_UNDEFINED, &heap);
      err = sel_set_rec_lock(pcur,
                 rec, index, offsets,
                 prebuilt->select_lock_type,
                 LOCK_ORDINARY, thr, &mtr);
      .....
      }
    }
 
This would not be the only scenario where the internal workings of InnoDB are exposed to the SQL locking layer. Another example is……
 
Yes,I known this,  gap Lock could be added on the record even if it's deleted but not purged yet.
 
Could you please try to write .test files for these? ...
 
Yes, I will write a test case soon.

 
2017-06-05
boyce@netease

发件人:Marko Mäkelä <marko.makela@mariadb.com>
发送时间:2017-06-05 18:07
主题:Re: [Maria-developers] the Lock on supremum is neccessary ?
收件人:"jhx1008"<jhx1008@gmail.com>
抄送:"maria-developers"<maria-developers@lists.launchpad.net>
 
Hi,
I can confirm that this looks strange. Unfortunately I cannot study the details right now, but I thought I would give you some generic advice, so that you can debug this further.

On Mon, Jun 5, 2017 at 11:53 AM, jhx1008 <jhx1008@gmail.com> wrote:
secondary index is not neccessary,  I do not create it!!
 
I have two sessions:
 
session1: begin; select * from sbtest1 where id=123456 for update
session2: insert into sbtest1 values(123450, 0,'aaa','bbb');
 
the records in the table like below:
(...),(123440, 264454, xxxxxx, xxxxx), (123456, 251028, xxxxx, xxxxxx),(....)
 
in my opinion,  the record with the id=123456 existed, so  session1 add the X lock on this record,  session2 add the insert intention lock between(123440, 123456), these two locks are not conflicted.  session2 will not be blocked by session1.
 
unfortunately, in my case,  session2 is blocked by session1, what happens?
What would EXPLAIN show for the SELECT of session 1?  Can you set a breakpoint on lock_rec_lock() while the SELECT? What are the invocations and their stack traces?
 
then I have debugged this instance(row0sel.cc::row_search_mvcc), I got that: when a record existed on the page for the first one, if you want to add the X Lock on it, the supremum of pre-page will be add X Lock(maybe next key lock, supremum can be add gap lock only) also.

It appears you already did something like my suggestion above.
 
In my case, the records' primary key stored like this:
 
(...,  123440, supremum), (infimum, 123456, ...), the id=123456 is the first record of that page, when add X Lock,  first, find the pre-record id=123440, and then add the X lock after this record(id=123440) until match the record witch id=123456, so the supremum is locked also..., because of this lock(supremum), session2 is blocked.  if id=123456 is not the first record of page,   session2 will not be blocked.

This would not be the only scenario where the internal workings of InnoDB are exposed to the SQL locking layer. Another example is
MySQL Bug#19762 InnoDB should not lock a delete-marked record
which I filed eleven years ago. One issue with that bug is that the timing of the InnoDB purge thread is affecting the ‘width’ of a gap lock. If you get lots of purge lag, you might have a gap lock attached to a delete-marked record, and you’d have no problem inserting into the gap followed by the purgeable record. However, if purge removes the record, it will widen the gap lock to the successor of the purgeable record. So, after purge has run, there would be a lock conflict with future operations.
so my doubt is:    in my case,  the session1 lock the supremum is necessary? if it's necessary,  the lock on supremum maybe resulting the difference even if two mysql instances have the same records,  just as in a replication.
To answer that, I would have to see the details of the supremum lock creation. Also page splits and merges could play a role here; locks can be moved by them.
the record located on the page not all the same even if two instances have the same table sturcture & data(if slave define the table compressed, or innodb page_size not set as the same as master, etc..).  if lock on supremum is neccessary,  two instances execute the same SQLs, master is OK, but slave conflicted? on master, the two transactions may be committed in one group,  but on slave maybe conflicted,  slave's parallell replication is OK?  also in MySQL 5.7.x,  the xa prepare recorded in binlog,  if on the master, execute two xa transactions:
 
xa start 'aaa'; delete from table_name where id=xxxx; xa end 'aaa';xa prepare 'aaa';
xa start 'bbb';insert into table_name values(xxxx); end 'bbb'; xa prepare 'bbb';
xa commit 'aaa'; xa commit 'bbb'
 
on the master, these xa not conflicted,  I can committed these transaction,  but  on the slave,  maybe conflicted sometimes,  the replication is broken.

This is very true. Somewhat related to this I can name two bugs:
MySQL Bug#3300 a feature request for ‘semi-consistent read’ (do not lock non-matching rows; I implemented this in 5.1)
MySQL Bug#57978 PREPARED transactions forget locks on server restart

how to repeat:
 
  • create a table with primary key(id auto_increment is better)
  • insert some records in this table
  • use 3 sessions to find the first record on one page(perhaps, there are many pages, any one is OK).how to?
       session1: begin; select * from table_name where id=xxx for update // id=xxx must be exist
       session2: select * from table_name where id=xxx for update
       session3: when session2 is blocked, execute: show engine innodb status\G;
       in the transaction section, you can find some details like this:

   ------- TRX HAS BEEN WAITING 2 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 29 page no 2772 n bits 144 index PRIMARY of table `mydb`.`sbtest1` trx id 2388 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

   from the information "space id 29 page no 2772",  we can find the page of this record.
   then test for next id until the page no become to 2773, remember this id, this record is the first one of page 2773
  • delete some records before this id witch get in step 3
  • use 2 sessions to do what i described
All of your scenarios sound so simple that it should be feasible to repeat these in mysql-test-run.
Could you please try to write .test files for these? Even if the problems could not be fixed soon, I think that they should be filed on https://jira.mariadb.org/ for future reference.

Best regards,

 Marko
--
Marko Mäkelä, Lead Developer InnoDB
MariaDB Corporation