Re: [Maria-developers] Implementing implicit primary key in mysql server
Hi Sergei, You are right, let users add primary key is best. But I can't let users who don't want to create primary key can't use our MySQL service. Amazon RDS also allow users to create the tables without primary key, just change binlog_format to MIXED to solve replication problem. I think this feature is very useful for any MySQL cloud service providers, all of them will face this problem in one day. I will try to do some analysis/research in this feature implement first, I will need your help :-) I will notice any new updates in this email. Thanks, Lixun On Wed, Jun 19, 2013 at 5:14 PM, Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Lixun!
On Jun 18, Lixun Peng wrote:
Hi,
As we know, InnoDB has implicit primary key if a table hasn't defined a primary key. However MySQL server doesn't know this primary key, so this primary key will not apear in binlog.
When we are using ROW format binlog for replication, if a table has no any indexes, that's a disaster. If a table without indexes do a DML (UPDATE/DELETE), of course it will run a long time in master, but in slave, it still need a long time. It will cause serious slave replication delay. ... I think mysql server can do the same thing as InnoDB do, if user doesn't define the primary key, mysql can add the primary key automatically.
How do you think?
Well, that's doable. A much easier solution would be to require a user to create a primary key. It's a one-line change:
- Table_flags ha_table_flags() const { return cached_table_flags; } + Table_flags ha_table_flags() const { return cached_table_flags | HA_REQUIRE_PRIMARY_KEY; }
But what you suggest is possible too, I believe.
Regards, Sergei
-- Senior Database Engineer @ Taobao.com Mobile Phone: +86 18658156856 (Hangzhou) Gtalk: penglixun(at)gmail.com Twitter: http://www.twitter.com/plinux Blog: http://www.penglixun.com
There is also sql_safe_updates which (among other things) will cause updates/deletes not using keys to fail. Also, MySQL 5.6 made an attempt at fixing the replication part of this issue (see slave_allow_batching and slave-rows-search-algorithms). Davi On Wed, Jun 19, 2013 at 2:40 AM, Lixun Peng <penglixun@gmail.com> wrote:
Hi Sergei,
You are right, let users add primary key is best. But I can't let users who don't want to create primary key can't use our MySQL service. Amazon RDS also allow users to create the tables without primary key, just change binlog_format to MIXED to solve replication problem.
I think this feature is very useful for any MySQL cloud service providers, all of them will face this problem in one day. I will try to do some analysis/research in this feature implement first, I will need your help :-)
I will notice any new updates in this email.
Thanks, Lixun
On Wed, Jun 19, 2013 at 5:14 PM, Sergei Golubchik <serg@mariadb.org>wrote:
Hi, Lixun!
On Jun 18, Lixun Peng wrote:
Hi,
As we know, InnoDB has implicit primary key if a table hasn't defined a primary key. However MySQL server doesn't know this primary key, so this primary key will not apear in binlog.
When we are using ROW format binlog for replication, if a table has no any indexes, that's a disaster. If a table without indexes do a DML (UPDATE/DELETE), of course it will run a long time in master, but in slave, it still need a long time. It will cause serious slave replication delay. ...
I think mysql server can do the same thing as InnoDB do, if user doesn't define the primary key, mysql can add the primary key automatically.
How do you think?
Well, that's doable. A much easier solution would be to require a user to create a primary key. It's a one-line change:
- Table_flags ha_table_flags() const { return cached_table_flags; } + Table_flags ha_table_flags() const { return cached_table_flags | HA_REQUIRE_PRIMARY_KEY; }
But what you suggest is possible too, I believe.
Regards, Sergei
-- Senior Database Engineer @ Taobao.com
Mobile Phone: +86 18658156856 (Hangzhou) Gtalk: penglixun(at)gmail.com Twitter: http://www.twitter.com/plinux Blog: http://www.penglixun.com
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
Lixun, I've thought about this a bit and I'm not sure this will be very simple to do (or rather it's more complicated than it might seem). While I think it is not that hard to expose the __id field to replication internals, I think in order for this to really work it would need to be exposed to other tools, such as mysqldump. It is also unclear how to decide when it is safe to use this __id field (how to determine if it is in sync between master and slave). As an alternate suggestion, what about ignoring the current implicit PK behavior, and instead automatically adding a field using auto_increment when the user doesn't provide a PK: __id BIGINT UNSIGNED NOT NULL auto_increment, PRIMARY KEY(__id) Regards, Jeremy On Wed, Jun 19, 2013 at 2:40 AM, Lixun Peng <penglixun@gmail.com> wrote:
Hi Sergei,
You are right, let users add primary key is best. But I can't let users who don't want to create primary key can't use our MySQL service. Amazon RDS also allow users to create the tables without primary key, just change binlog_format to MIXED to solve replication problem.
I think this feature is very useful for any MySQL cloud service providers, all of them will face this problem in one day. I will try to do some analysis/research in this feature implement first, I will need your help :-)
I will notice any new updates in this email.
Thanks, Lixun
On Wed, Jun 19, 2013 at 5:14 PM, Sergei Golubchik <serg@mariadb.org>wrote:
Hi, Lixun!
On Jun 18, Lixun Peng wrote:
Hi,
As we know, InnoDB has implicit primary key if a table hasn't defined a primary key. However MySQL server doesn't know this primary key, so this primary key will not apear in binlog.
When we are using ROW format binlog for replication, if a table has no any indexes, that's a disaster. If a table without indexes do a DML (UPDATE/DELETE), of course it will run a long time in master, but in slave, it still need a long time. It will cause serious slave replication delay. ...
I think mysql server can do the same thing as InnoDB do, if user doesn't define the primary key, mysql can add the primary key automatically.
How do you think?
Well, that's doable. A much easier solution would be to require a user to create a primary key. It's a one-line change:
- Table_flags ha_table_flags() const { return cached_table_flags; } + Table_flags ha_table_flags() const { return cached_table_flags | HA_REQUIRE_PRIMARY_KEY; }
But what you suggest is possible too, I believe.
Regards, Sergei
-- Senior Database Engineer @ Taobao.com
Mobile Phone: +86 18658156856 (Hangzhou) Gtalk: penglixun(at)gmail.com Twitter: http://www.twitter.com/plinux Blog: http://www.penglixun.com
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
Hi Jeremy, Thank you for your suggestion. I also want to just add the PK field for custom automatically, but some of our customs can't accept it. Because they are using "SELECT * FROM table .... " or " INSERT INTO table VALUES(...) ", if I add a visible PK for them, "SELECT *" will show this value, then their applications will report errors. So I have to set this field as an implicit filed. Thanks, Lixun On Sat, Jun 22, 2013 at 4:36 AM, Jeremy Cole <jeremycole@google.com> wrote:
Lixun,
I've thought about this a bit and I'm not sure this will be very simple to do (or rather it's more complicated than it might seem). While I think it is not that hard to expose the __id field to replication internals, I think in order for this to really work it would need to be exposed to other tools, such as mysqldump. It is also unclear how to decide when it is safe to use this __id field (how to determine if it is in sync between master and slave).
As an alternate suggestion, what about ignoring the current implicit PK behavior, and instead automatically adding a field using auto_increment when the user doesn't provide a PK:
__id BIGINT UNSIGNED NOT NULL auto_increment, PRIMARY KEY(__id)
Regards,
Jeremy
On Wed, Jun 19, 2013 at 2:40 AM, Lixun Peng <penglixun@gmail.com> wrote:
Hi Sergei,
You are right, let users add primary key is best. But I can't let users who don't want to create primary key can't use our MySQL service. Amazon RDS also allow users to create the tables without primary key, just change binlog_format to MIXED to solve replication problem.
I think this feature is very useful for any MySQL cloud service providers, all of them will face this problem in one day. I will try to do some analysis/research in this feature implement first, I will need your help :-)
I will notice any new updates in this email.
Thanks, Lixun
On Wed, Jun 19, 2013 at 5:14 PM, Sergei Golubchik <serg@mariadb.org>wrote:
Hi, Lixun!
On Jun 18, Lixun Peng wrote:
Hi,
As we know, InnoDB has implicit primary key if a table hasn't defined a primary key. However MySQL server doesn't know this primary key, so this primary key will not apear in binlog.
When we are using ROW format binlog for replication, if a table has no any indexes, that's a disaster. If a table without indexes do a DML (UPDATE/DELETE), of course it will run a long time in master, but in slave, it still need a long time. It will cause serious slave replication delay. ...
I think mysql server can do the same thing as InnoDB do, if user doesn't define the primary key, mysql can add the primary key automatically.
How do you think?
Well, that's doable. A much easier solution would be to require a user to create a primary key. It's a one-line change:
- Table_flags ha_table_flags() const { return cached_table_flags; } + Table_flags ha_table_flags() const { return cached_table_flags | HA_REQUIRE_PRIMARY_KEY; }
But what you suggest is possible too, I believe.
Regards, Sergei
-- Senior Database Engineer @ Taobao.com
Mobile Phone: +86 18658156856 (Hangzhou) Gtalk: penglixun(at)gmail.com Twitter: http://www.twitter.com/plinux Blog: http://www.penglixun.com
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
-- Senior Database Engineer @ Taobao.com Mobile Phone: +86 18658156856 (Hangzhou) Gtalk: penglixun(at)gmail.com Twitter: http://www.twitter.com/plinux Blog: http://www.penglixun.com
Hi all, I implement a demo patch, based on 5.5.18. *1. CREATE TABLE* root@localhost : plx 11:54:46> create table test_no_pk (col1 varchar(32)); Query OK, 0 rows affected (0.01 sec) root@localhost : plx 11:55:05> desc test_no_pk; +----------+-------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +----------+-------------+------+-----+---------+----------------+ | col1 | varchar(32) | YES | | NULL | | | __row_id | bigint(20) | NO | PRI | NULL | auto_increment | +----------+-------------+------+-----+---------+----------------+ 2 rows in set (0.01 sec) if users has not defined a PK, I will add it automatically. *2. ALTER TABLE* root@localhost : plx 11:55:10> alter table test_no_pk add id int, add primary key(id); Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0 root@localhost : plx 11:57:02> desc test_no_pk; +-------+-------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-------------+------+-----+---------+-------+ | col1 | varchar(32) | YES | | NULL | | | id | int(11) | NO | PRI | 0 | | +-------+-------------+------+-----+---------+-------+ 2 rows in set (0.01 sec) When users add a PK, I will remove implicit PK automatically. root@localhost : plx 11:57:07> alter table test_no_pk drop primary key; Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0 root@localhost : plx 11:57:42> desc test_no_pk; +----------+-------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +----------+-------------+------+-----+---------+----------------+ | col1 | varchar(32) | YES | | NULL | | | id | int(11) | NO | | 0 | | | __row_id | bigint(20) | NO | PRI | NULL | auto_increment | +----------+-------------+------+-----+---------+----------------+ 3 rows in set (0.00 sec) When users dropped PK, I will add it automatically. *3. INSERT VALUES* root@localhost : plx 11:59:22> insert into test_no_pk values('abc',2); ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY' root@localhost : plx 11:59:23> insert into test_no_pk values('abc',4); ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY' it will report duplicate, *Sergei, can you help me to find why?* *4. SELECT ** root@localhost : plx 12:07:23> select * from test_no_pk; +------+----+ | col1 | id | +------+----+ | abc | 6 | +------+----+ 1 row in set (0.00 sec) root@localhost : plx 12:07:30> select __row_id from test_no_pk; +----------+ | __row_id | +----------+ | 1 | +----------+ 1 row in set (0.00 sec) When users run "SELECT *", row_id will be filter. *5. SHOW CREATE* root@localhost : plx 12:07:35> show create table test_no_pk\G *************************** 1. row *************************** Table: test_no_pk Create Table: CREATE TABLE `test_no_pk` ( `col1` varchar(32) DEFAULT NULL, `id` int(11) NOT NULL DEFAULT '0' ) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 1 row in set (0.00 sec) row_id will be hidden. Thanks, Lixun On Wed, Jun 26, 2013 at 10:59 AM, Lixun Peng <penglixun@gmail.com> wrote:
Hi Jeremy,
Thank you for your suggestion. I also want to just add the PK field for custom automatically, but some of our customs can't accept it. Because they are using "SELECT * FROM table .... " or " INSERT INTO table VALUES(...) ", if I add a visible PK for them, "SELECT *" will show this value, then their applications will report errors. So I have to set this field as an implicit filed.
Thanks, Lixun
On Sat, Jun 22, 2013 at 4:36 AM, Jeremy Cole <jeremycole@google.com>wrote:
Lixun,
I've thought about this a bit and I'm not sure this will be very simple to do (or rather it's more complicated than it might seem). While I think it is not that hard to expose the __id field to replication internals, I think in order for this to really work it would need to be exposed to other tools, such as mysqldump. It is also unclear how to decide when it is safe to use this __id field (how to determine if it is in sync between master and slave).
As an alternate suggestion, what about ignoring the current implicit PK behavior, and instead automatically adding a field using auto_increment when the user doesn't provide a PK:
__id BIGINT UNSIGNED NOT NULL auto_increment, PRIMARY KEY(__id)
Regards,
Jeremy
On Wed, Jun 19, 2013 at 2:40 AM, Lixun Peng <penglixun@gmail.com> wrote:
Hi Sergei,
You are right, let users add primary key is best. But I can't let users who don't want to create primary key can't use our MySQL service. Amazon RDS also allow users to create the tables without primary key, just change binlog_format to MIXED to solve replication problem.
I think this feature is very useful for any MySQL cloud service providers, all of them will face this problem in one day. I will try to do some analysis/research in this feature implement first, I will need your help :-)
I will notice any new updates in this email.
Thanks, Lixun
On Wed, Jun 19, 2013 at 5:14 PM, Sergei Golubchik <serg@mariadb.org>wrote:
Hi, Lixun!
On Jun 18, Lixun Peng wrote:
Hi,
As we know, InnoDB has implicit primary key if a table hasn't defined a primary key. However MySQL server doesn't know this primary key, so this primary key will not apear in binlog.
When we are using ROW format binlog for replication, if a table has no any indexes, that's a disaster. If a table without indexes do a DML (UPDATE/DELETE), of course it will run a long time in master, but in slave, it still need a long time. It will cause serious slave replication delay. ...
I think mysql server can do the same thing as InnoDB do, if user doesn't define the primary key, mysql can add the primary key automatically.
How do you think?
Well, that's doable. A much easier solution would be to require a user to create a primary key. It's a one-line change:
- Table_flags ha_table_flags() const { return cached_table_flags; } + Table_flags ha_table_flags() const { return cached_table_flags | HA_REQUIRE_PRIMARY_KEY; }
But what you suggest is possible too, I believe.
Regards, Sergei
-- Senior Database Engineer @ Taobao.com
Mobile Phone: +86 18658156856 (Hangzhou) Gtalk: penglixun(at)gmail.com Twitter: http://www.twitter.com/plinux Blog: http://www.penglixun.com
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
-- Senior Database Engineer @ Taobao.com
Mobile Phone: +86 18658156856 (Hangzhou) Gtalk: penglixun(at)gmail.com Twitter: http://www.twitter.com/plinux Blog: http://www.penglixun.com
-- Senior Database Engineer @ Taobao.com Mobile Phone: +86 18658156856 (Hangzhou) Gtalk: penglixun(at)gmail.com Twitter: http://www.twitter.com/plinux Blog: http://www.penglixun.com
it makes hardly sense to add a primary key not used in select statements this way and leads only in a lot of problems and wasted space / performance it is common practice to remove keys before large bulk inserts and add the key *after* inserting the data which would not work with the expected benefit with your patch Am 05.07.2013 06:08, schrieb Lixun Peng:
Hi all,
I implement a demo patch, based on 5.5.18.
*1. CREATE TABLE* root@localhost : plx 11:54:46> create table test_no_pk (col1 varchar(32)); Query OK, 0 rows affected (0.01 sec)
root@localhost : plx 11:55:05> desc test_no_pk; +----------+-------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +----------+-------------+------+-----+---------+----------------+ | col1 | varchar(32) | YES | | NULL | | | __row_id | bigint(20) | NO | PRI | NULL | auto_increment | +----------+-------------+------+-----+---------+----------------+ 2 rows in set (0.01 sec)
if users has not defined a PK, I will add it automatically.
*2. ALTER TABLE* root@localhost : plx 11:55:10> alter table test_no_pk add id int, add primary key(id); Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0
root@localhost : plx 11:57:02> desc test_no_pk; +-------+-------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-------------+------+-----+---------+-------+ | col1 | varchar(32) | YES | | NULL | | | id | int(11) | NO | PRI | 0 | | +-------+-------------+------+-----+---------+-------+ 2 rows in set (0.01 sec)
When users add a PK, I will remove implicit PK automatically.
root@localhost : plx 11:57:07> alter table test_no_pk drop primary key; Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0
root@localhost : plx 11:57:42> desc test_no_pk; +----------+-------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +----------+-------------+------+-----+---------+----------------+ | col1 | varchar(32) | YES | | NULL | | | id | int(11) | NO | | 0 | | | __row_id | bigint(20) | NO | PRI | NULL | auto_increment | +----------+-------------+------+-----+---------+----------------+ 3 rows in set (0.00 sec)
When users dropped PK, I will add it automatically.
*3. INSERT VALUES* root@localhost : plx 11:59:22> insert into test_no_pk values('abc',2); ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY' root@localhost : plx 11:59:23> insert into test_no_pk values('abc',4); ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY'
it will report duplicate, *Sergei, can you help me to find why?*
*4. SELECT ** root@localhost : plx 12:07:23> select * from test_no_pk; +------+----+ | col1 | id | +------+----+ | abc | 6 | +------+----+ 1 row in set (0.00 sec)
root@localhost : plx 12:07:30> select __row_id from test_no_pk; +----------+ | __row_id | +----------+ | 1 | +----------+ 1 row in set (0.00 sec)
When users run "SELECT *", row_id will be filter.
*5. SHOW CREATE*
root@localhost : plx 12:07:35> show create table test_no_pk\G *************************** 1. row *************************** Table: test_no_pk Create Table: CREATE TABLE `test_no_pk` ( `col1` varchar(32) DEFAULT NULL, `id` int(11) NOT NULL DEFAULT '0' ) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 1 row in set (0.00 sec)
row_id will be hidden.
Thanks, Lixun
On Wed, Jun 26, 2013 at 10:59 AM, Lixun Peng <penglixun@gmail.com <mailto:penglixun@gmail.com>> wrote:
Hi Jeremy,
Thank you for your suggestion. I also want to just add the PK field for custom automatically, but some of our customs can't accept it. Because they are using "SELECT * FROM table .... " or " INSERT INTO table VALUES(...) ", if I add a visible PK for them, "SELECT *" will show this value, then their applications will report errors. So I have to set this field as an implicit filed.
Thanks, Lixun
On Sat, Jun 22, 2013 at 4:36 AM, Jeremy Cole <jeremycole@google.com <mailto:jeremycole@google.com>> wrote:
Lixun,
I've thought about this a bit and I'm not sure this will be very simple to do (or rather it's more complicated than it might seem). While I think it is not that hard to expose the __id field to replication internals, I think in order for this to really work it would need to be exposed to other tools, such as mysqldump. It is also unclear how to decide when it is safe to use this __id field (how to determine if it is in sync between master and slave).
As an alternate suggestion, what about ignoring the current implicit PK behavior, and instead automatically adding a field using auto_increment when the user doesn't provide a PK:
__id BIGINT UNSIGNED NOT NULL auto_increment, PRIMARY KEY(__id)
Regards,
Jeremy
On Wed, Jun 19, 2013 at 2:40 AM, Lixun Peng <penglixun@gmail.com <mailto:penglixun@gmail.com>> wrote:
Hi Sergei,
You are right, let users add primary key is best. But I can't let users who don't want to create primary key can't use our MySQL service. Amazon RDS also allow users to create the tables without primary key, just change binlog_format to MIXED to solve replication problem.
I think this feature is very useful for any MySQL cloud service providers, all of them will face this problem in one day. I will try to do some analysis/research in this feature implement first, I will need your help :-)
I will notice any new updates in this email.
Thanks, Lixun
On Wed, Jun 19, 2013 at 5:14 PM, Sergei Golubchik <serg@mariadb.org <mailto:serg@mariadb.org>> wrote:
Hi, Lixun!
On Jun 18, Lixun Peng wrote: > Hi, > > As we know, InnoDB has implicit primary key if a table hasn't defined > a primary key. However MySQL server doesn't know this primary key, so > this primary key will not apear in binlog. > > When we are using ROW format binlog for replication, if a table has no any > indexes, that's a disaster. If a table without indexes do a DML > (UPDATE/DELETE), of course it will run a long time in master, but in slave, > it still need a long time. It will cause serious slave replication delay. ...
> I think mysql server can do the same thing as InnoDB do, if user doesn't > define the primary key, mysql can add the primary key automatically. > > How do you think?
Well, that's doable. A much easier solution would be to require a user to create a primary key. It's a one-line change:
- Table_flags ha_table_flags() const { return cached_table_flags; } + Table_flags ha_table_flags() const { return cached_table_flags | HA_REQUIRE_PRIMARY_KEY; }
But what you suggest is possible too, I believe.
Regards, Sergei
Yes, for normal user is un-meaningful. The case is our MySQL Cloud Service, so many users are using our MySQL db for CMS or other programs. And it usually has no primary key or any unique keys, so it makes me headache. I think you know, if binlog_format=ROW, and tables have no any unique keys, what will happen. Now we just change binlog_forma=MIXED to avoid this problem. But our middleware need ROW format, so it still makes me headache. Because our users don't want to add PK by themselves, they don't know how to modify their application, they just download it and install in their web server. So I want to add a implicit Primary Key for each tables that have no unique keys. Then we can use ROW binlog format, so many problems will be solved. On Fri, Jul 5, 2013 at 12:15 PM, Reindl Harald <h.reindl@thelounge.net>wrote:
it makes hardly sense to add a primary key not used in select statements this way and leads only in a lot of problems and wasted space / performance
it is common practice to remove keys before large bulk inserts and add the key *after* inserting the data which would not work with the expected benefit with your patch
Hi all,
I implement a demo patch, based on 5.5.18.
*1. CREATE TABLE* root@localhost : plx 11:54:46> create table test_no_pk (col1 varchar(32)); Query OK, 0 rows affected (0.01 sec)
root@localhost : plx 11:55:05> desc test_no_pk; +----------+-------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +----------+-------------+------+-----+---------+----------------+ | col1 | varchar(32) | YES | | NULL | | | __row_id | bigint(20) | NO | PRI | NULL | auto_increment | +----------+-------------+------+-----+---------+----------------+ 2 rows in set (0.01 sec)
if users has not defined a PK, I will add it automatically.
*2. ALTER TABLE* root@localhost : plx 11:55:10> alter table test_no_pk add id int, add
Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0
root@localhost : plx 11:57:02> desc test_no_pk; +-------+-------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-------------+------+-----+---------+-------+ | col1 | varchar(32) | YES | | NULL | | | id | int(11) | NO | PRI | 0 | | +-------+-------------+------+-----+---------+-------+ 2 rows in set (0.01 sec)
When users add a PK, I will remove implicit PK automatically.
root@localhost : plx 11:57:07> alter table test_no_pk drop primary key; Query OK, 0 rows affected (0.00 sec) Records: 0 Duplicates: 0 Warnings: 0
root@localhost : plx 11:57:42> desc test_no_pk; +----------+-------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +----------+-------------+------+-----+---------+----------------+ | col1 | varchar(32) | YES | | NULL | | | id | int(11) | NO | | 0 | | | __row_id | bigint(20) | NO | PRI | NULL | auto_increment | +----------+-------------+------+-----+---------+----------------+ 3 rows in set (0.00 sec)
When users dropped PK, I will add it automatically.
*3. INSERT VALUES* root@localhost : plx 11:59:22> insert into test_no_pk values('abc',2); ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY' root@localhost : plx 11:59:23> insert into test_no_pk values('abc',4); ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY'
it will report duplicate, *Sergei, can you help me to find why?*
*4. SELECT ** root@localhost : plx 12:07:23> select * from test_no_pk; +------+----+ | col1 | id | +------+----+ | abc | 6 | +------+----+ 1 row in set (0.00 sec)
root@localhost : plx 12:07:30> select __row_id from test_no_pk; +----------+ | __row_id | +----------+ | 1 | +----------+ 1 row in set (0.00 sec)
When users run "SELECT *", row_id will be filter.
*5. SHOW CREATE*
root@localhost : plx 12:07:35> show create table test_no_pk\G *************************** 1. row *************************** Table: test_no_pk Create Table: CREATE TABLE `test_no_pk` ( `col1` varchar(32) DEFAULT NULL, `id` int(11) NOT NULL DEFAULT '0' ) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 1 row in set (0.00 sec)
row_id will be hidden.
Thanks, Lixun
On Wed, Jun 26, 2013 at 10:59 AM, Lixun Peng <penglixun@gmail.com<mailto:
Am 05.07.2013 06:08, schrieb Lixun Peng: primary key(id); penglixun@gmail.com>> wrote:
Hi Jeremy,
Thank you for your suggestion. I also want to just add the PK field for custom automatically, but
some of our customs can't accept it.
Because they are using "SELECT * FROM table .... " or " INSERT INTO
table VALUES(...) ", if I add a visible PK
for them, "SELECT *" will show this value, then their applications
will report errors.
So I have to set this field as an implicit filed.
Thanks, Lixun
On Sat, Jun 22, 2013 at 4:36 AM, Jeremy Cole <jeremycole@google.com<mailto:
jeremycole@google.com>> wrote:
Lixun,
I've thought about this a bit and I'm not sure this will be very
simple to do (or rather it's more
complicated than it might seem). While I think it is not that
hard to expose the __id field to replication
internals, I think in order for this to really work it would
need to be exposed to other tools, such as
mysqldump. It is also unclear how to decide when it is safe to
use this __id field (how to determine if it
is in sync between master and slave).
As an alternate suggestion, what about ignoring the current
implicit PK behavior, and instead automatically
adding a field using auto_increment when the user doesn't
provide a PK:
__id BIGINT UNSIGNED NOT NULL auto_increment, PRIMARY KEY(__id)
Regards,
Jeremy
On Wed, Jun 19, 2013 at 2:40 AM, Lixun Peng <penglixun@gmail.com<mailto:
penglixun@gmail.com>> wrote:
Hi Sergei,
You are right, let users add primary key is best. But I can't let users who don't want to create primary key
can't use our MySQL service.
Amazon RDS also allow users to create the tables without
primary key, just change binlog_format to
MIXED to solve replication problem.
I think this feature is very useful for any MySQL cloud
service providers, all of them will face this
problem in one day. I will try to do some analysis/research in this feature
implement first, I will need your help :-)
I will notice any new updates in this email.
Thanks, Lixun
On Wed, Jun 19, 2013 at 5:14 PM, Sergei Golubchik <
serg@mariadb.org <mailto:serg@mariadb.org>> wrote:
Hi, Lixun!
On Jun 18, Lixun Peng wrote: > Hi, > > As we know, InnoDB has implicit primary key if a table
hasn't defined
> a primary key. However MySQL server doesn't know this
primary key, so
> this primary key will not apear in binlog. > > When we are using ROW format binlog for replication,
if a table has no any
> indexes, that's a disaster. If a table without indexes
do a DML
> (UPDATE/DELETE), of course it will run a long time in
master, but in slave,
> it still need a long time. It will cause serious slave
replication delay.
...
> I think mysql server can do the same thing as InnoDB
do, if user doesn't
> define the primary key, mysql can add the primary key
automatically.
> > How do you think?
Well, that's doable. A much easier solution would be to
require a user
to create a primary key. It's a one-line change:
- Table_flags ha_table_flags() const { return
cached_table_flags; }
+ Table_flags ha_table_flags() const { return
cached_table_flags | HA_REQUIRE_PRIMARY_KEY; }
But what you suggest is possible too, I believe.
Regards, Sergei
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
-- Senior Database Engineer @ Taobao.com Mobile Phone: +86 18658156856 (Hangzhou) Gtalk: penglixun(at)gmail.com Twitter: http://www.twitter.com/plinux Blog: http://www.penglixun.com
Am 05.07.2013 06:32, schrieb Lixun Peng:
Yes, for normal user is un-meaningful
it is not only un-meaningful it is harmful for people knwoing what they are doing by massive overhead with no benefit - hence after a bulk insert your implicit key has to be removed while as example the intented unique key on a varchar added
The case is our MySQL Cloud Service, so many users are using our MySQL db for CMS or other programs. And it usually has no primary key or any unique keys, so it makes me headache.
i doubt you can solve social problems with tech
I think you know, if binlog_format=ROW, and tables have no any unique keys, what will happen. Now we just change binlog_forma=MIXED to avoid this problem. But our middleware need ROW format, so it still makes me headache. Because our users don't want to add PK by themselves, they don't know how to modify their application, they just download it and install in their web server.
So I want to add a implicit Primary Key for each tables that have no unique keys. Then we can use ROW binlog format, so many problems will be solved.
corner cases - this at least needs to be enabled via "my.cnf" and must not affect users with well database designs and shoot them in the leg due bulk inserts
On Fri, Jul 5, 2013 at 12:15 PM, Reindl Harald <h.reindl@thelounge.net <mailto:h.reindl@thelounge.net>> wrote:
it makes hardly sense to add a primary key not used in select statements this way and leads only in a lot of problems and wasted space / performance
it is common practice to remove keys before large bulk inserts and add the key *after* inserting the data which would not work with the expected benefit with your patch
Am 05.07.2013 06:08, schrieb Lixun Peng: > Hi all, > > I implement a demo patch, based on 5.5.18. > > > *1. CREATE TABLE* > root@localhost : plx 11:54:46> create table test_no_pk (col1 varchar(32)); > Query OK, 0 rows affected (0.01 sec) > > root@localhost : plx 11:55:05> desc test_no_pk; > +----------+-------------+------+-----+---------+----------------+ > | Field | Type | Null | Key | Default | Extra | > +----------+-------------+------+-----+---------+----------------+ > | col1 | varchar(32) | YES | | NULL | | > | __row_id | bigint(20) | NO | PRI | NULL | auto_increment | > +----------+-------------+------+-----+---------+----------------+ > 2 rows in set (0.01 sec) > > if users has not defined a PK, I will add it automatically. > > *2. ALTER TABLE* > root@localhost : plx 11:55:10> alter table test_no_pk add id int, add primary key(id); > Query OK, 0 rows affected (0.00 sec) > Records: 0 Duplicates: 0 Warnings: 0 > > root@localhost : plx 11:57:02> desc test_no_pk; > +-------+-------------+------+-----+---------+-------+ > | Field | Type | Null | Key | Default | Extra | > +-------+-------------+------+-----+---------+-------+ > | col1 | varchar(32) | YES | | NULL | | > | id | int(11) | NO | PRI | 0 | | > +-------+-------------+------+-----+---------+-------+ > 2 rows in set (0.01 sec) > > When users add a PK, I will remove implicit PK automatically. > > root@localhost : plx 11:57:07> alter table test_no_pk drop primary key; > Query OK, 0 rows affected (0.00 sec) > Records: 0 Duplicates: 0 Warnings: 0 > > root@localhost : plx 11:57:42> desc test_no_pk; > +----------+-------------+------+-----+---------+----------------+ > | Field | Type | Null | Key | Default | Extra | > +----------+-------------+------+-----+---------+----------------+ > | col1 | varchar(32) | YES | | NULL | | > | id | int(11) | NO | | 0 | | > | __row_id | bigint(20) | NO | PRI | NULL | auto_increment | > +----------+-------------+------+-----+---------+----------------+ > 3 rows in set (0.00 sec) > > When users dropped PK, I will add it automatically. > > *3. INSERT VALUES* > root@localhost : plx 11:59:22> insert into test_no_pk values('abc',2); > ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY' > root@localhost : plx 11:59:23> insert into test_no_pk values('abc',4); > ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY' > > it will report duplicate, *Sergei, can you help me to find why?* > > *4. SELECT ** > root@localhost : plx 12:07:23> select * from test_no_pk; > +------+----+ > | col1 | id | > +------+----+ > | abc | 6 | > +------+----+ > 1 row in set (0.00 sec) > > root@localhost : plx 12:07:30> select __row_id from test_no_pk; > +----------+ > | __row_id | > +----------+ > | 1 | > +----------+ > 1 row in set (0.00 sec) > > When users run "SELECT *", row_id will be filter. > > *5. SHOW CREATE* > > root@localhost : plx 12:07:35> show create table test_no_pk\G > *************************** 1. row *************************** > Table: test_no_pk > Create Table: CREATE TABLE `test_no_pk` ( > `col1` varchar(32) DEFAULT NULL, > `id` int(11) NOT NULL DEFAULT '0' > ) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 > 1 row in set (0.00 sec) > > row_id will be hidden. > > > Thanks, > Lixun > > > > > On Wed, Jun 26, 2013 at 10:59 AM, Lixun Peng <penglixun@gmail.com <mailto:penglixun@gmail.com> <mailto:penglixun@gmail.com <mailto:penglixun@gmail.com>>> wrote: > > Hi Jeremy, > > Thank you for your suggestion. > I also want to just add the PK field for custom automatically, but some of our customs can't accept it. > Because they are using "SELECT * FROM table .... " or " INSERT INTO table VALUES(...) ", if I add a visible PK > for them, "SELECT *" will show this value, then their applications will report errors. > So I have to set this field as an implicit filed. > > Thanks, > Lixun > > > > On Sat, Jun 22, 2013 at 4:36 AM, Jeremy Cole <jeremycole@google.com <mailto:jeremycole@google.com> <mailto:jeremycole@google.com <mailto:jeremycole@google.com>>> wrote: > > Lixun, > > I've thought about this a bit and I'm not sure this will be very simple to do (or rather it's more > complicated than it might seem). While I think it is not that hard to expose the __id field to replication > internals, I think in order for this to really work it would need to be exposed to other tools, such as > mysqldump. It is also unclear how to decide when it is safe to use this __id field (how to determine if it > is in sync between master and slave). > > As an alternate suggestion, what about ignoring the current implicit PK behavior, and instead automatically > adding a field using auto_increment when the user doesn't provide a PK: > > __id BIGINT UNSIGNED NOT NULL auto_increment, > PRIMARY KEY(__id) > > Regards, > > Jeremy > > > On Wed, Jun 19, 2013 at 2:40 AM, Lixun Peng <penglixun@gmail.com <mailto:penglixun@gmail.com> <mailto:penglixun@gmail.com <mailto:penglixun@gmail.com>>> wrote: > > Hi Sergei, > > You are right, let users add primary key is best. > But I can't let users who don't want to create primary key can't use our MySQL service. > Amazon RDS also allow users to create the tables without primary key, just change binlog_format to > MIXED to solve replication problem. > > I think this feature is very useful for any MySQL cloud service providers, all of them will face this > problem in one day. > I will try to do some analysis/research in this feature implement first, I will need your help :-) > > I will notice any new updates in this email. > > > Thanks, > Lixun > > > > > On Wed, Jun 19, 2013 at 5:14 PM, Sergei Golubchik <serg@mariadb.org <mailto:serg@mariadb.org> <mailto:serg@mariadb.org <mailto:serg@mariadb.org>>> wrote: > > Hi, Lixun! > > > On Jun 18, Lixun Peng wrote: > > Hi, > > > > As we know, InnoDB has implicit primary key if a table hasn't defined > > a primary key. However MySQL server doesn't know this primary key, so > > this primary key will not apear in binlog. > > > > When we are using ROW format binlog for replication, if a table has no any > > indexes, that's a disaster. If a table without indexes do a DML > > (UPDATE/DELETE), of course it will run a long time in master, but in slave, > > it still need a long time. It will cause serious slave replication delay. > ... > > > I think mysql server can do the same thing as InnoDB do, if user doesn't > > define the primary key, mysql can add the primary key automatically. > > > > How do you think? > > Well, that's doable. A much easier solution would be to require a user > to create a primary key. It's a one-line change: > > - Table_flags ha_table_flags() const { return cached_table_flags; } > + Table_flags ha_table_flags() const { return cached_table_flags | HA_REQUIRE_PRIMARY_KEY; } > > But what you suggest is possible too, I believe. > > Regards, > Sergei
Hi All, New patch about this feature, fixed "*INSERT VALUES*" bug. Adding a new option "implicit_primary_key", when this feature is un-useful, can set implicit_primary_key=0. On Fri, Jul 5, 2013 at 12:42 PM, Reindl Harald <h.reindl@thelounge.net>wrote:
Am 05.07.2013 06:32, schrieb Lixun Peng:
Yes, for normal user is un-meaningful
it is not only un-meaningful it is harmful for people knwoing what they are doing by massive overhead with no benefit - hence after a bulk insert your implicit key has to be removed while as example the intented unique key on a varchar added
The case is our MySQL Cloud Service, so many users are using our MySQL db for CMS or other programs. And it usually has no primary key or any unique keys, so it makes me headache.
i doubt you can solve social problems with tech
I think you know, if binlog_format=ROW, and tables have no any unique keys, what will happen. Now we just change binlog_forma=MIXED to avoid this problem. But our middleware need ROW format, so it still makes me headache. Because our users don't want to add PK by themselves, they don't know how to modify their application, they just download it and install in their web server.
So I want to add a implicit Primary Key for each tables that have no unique keys. Then we can use ROW binlog format, so many problems will be solved.
corner cases - this at least needs to be enabled via "my.cnf" and must not affect users with well database designs and shoot them in the leg due bulk inserts
On Fri, Jul 5, 2013 at 12:15 PM, Reindl Harald <h.reindl@thelounge.net<mailto: h.reindl@thelounge.net>> wrote:
it makes hardly sense to add a primary key not used in select statements this way and leads only in a lot of problems and wasted space / performance
it is common practice to remove keys before large bulk inserts and add the key *after* inserting the data which would not work with the expected benefit with your patch
Am 05.07.2013 06:08, schrieb Lixun Peng: > Hi all, > > I implement a demo patch, based on 5.5.18. > > > *1. CREATE TABLE* > root@localhost : plx 11:54:46> create table test_no_pk (col1 varchar(32)); > Query OK, 0 rows affected (0.01 sec) > > root@localhost : plx 11:55:05> desc test_no_pk; > +----------+-------------+------+-----+---------+----------------+ > | Field | Type | Null | Key | Default | Extra | > +----------+-------------+------+-----+---------+----------------+ > | col1 | varchar(32) | YES | | NULL | | > | __row_id | bigint(20) | NO | PRI | NULL | auto_increment | > +----------+-------------+------+-----+---------+----------------+ > 2 rows in set (0.01 sec) > > if users has not defined a PK, I will add it automatically. > > *2. ALTER TABLE* > root@localhost : plx 11:55:10> alter table test_no_pk add id int, add primary key(id); > Query OK, 0 rows affected (0.00 sec) > Records: 0 Duplicates: 0 Warnings: 0 > > root@localhost : plx 11:57:02> desc test_no_pk; > +-------+-------------+------+-----+---------+-------+ > | Field | Type | Null | Key | Default | Extra | > +-------+-------------+------+-----+---------+-------+ > | col1 | varchar(32) | YES | | NULL | | > | id | int(11) | NO | PRI | 0 | | > +-------+-------------+------+-----+---------+-------+ > 2 rows in set (0.01 sec) > > When users add a PK, I will remove implicit PK automatically. > > root@localhost : plx 11:57:07> alter table test_no_pk drop primary key; > Query OK, 0 rows affected (0.00 sec) > Records: 0 Duplicates: 0 Warnings: 0 > > root@localhost : plx 11:57:42> desc test_no_pk; > +----------+-------------+------+-----+---------+----------------+ > | Field | Type | Null | Key | Default | Extra | > +----------+-------------+------+-----+---------+----------------+ > | col1 | varchar(32) | YES | | NULL | | > | id | int(11) | NO | | 0 | | > | __row_id | bigint(20) | NO | PRI | NULL | auto_increment | > +----------+-------------+------+-----+---------+----------------+ > 3 rows in set (0.00 sec) > > When users dropped PK, I will add it automatically. > > *3. INSERT VALUES* > root@localhost : plx 11:59:22> insert into test_no_pk values('abc',2); > ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY' > root@localhost : plx 11:59:23> insert into test_no_pk values('abc',4); > ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY' > > it will report duplicate, *Sergei, can you help me to find why?* > > *4. SELECT ** > root@localhost : plx 12:07:23> select * from test_no_pk; > +------+----+ > | col1 | id | > +------+----+ > | abc | 6 | > +------+----+ > 1 row in set (0.00 sec) > > root@localhost : plx 12:07:30> select __row_id from test_no_pk; > +----------+ > | __row_id | > +----------+ > | 1 | > +----------+ > 1 row in set (0.00 sec) > > When users run "SELECT *", row_id will be filter. > > *5. SHOW CREATE* > > root@localhost : plx 12:07:35> show create table test_no_pk\G > *************************** 1. row *************************** > Table: test_no_pk > Create Table: CREATE TABLE `test_no_pk` ( > `col1` varchar(32) DEFAULT NULL, > `id` int(11) NOT NULL DEFAULT '0' > ) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 > 1 row in set (0.00 sec) > > row_id will be hidden. > > > Thanks, > Lixun > > > > > On Wed, Jun 26, 2013 at 10:59 AM, Lixun Peng <penglixun@gmail.com<mailto: penglixun@gmail.com> <mailto:penglixun@gmail.com <mailto:penglixun@gmail.com>>> wrote: > > Hi Jeremy, > > Thank you for your suggestion. > I also want to just add the PK field for custom automatically, but some of our customs can't accept it. > Because they are using "SELECT * FROM table .... " or " INSERT INTO table VALUES(...) ", if I add a visible PK > for them, "SELECT *" will show this value, then their applications will report errors. > So I have to set this field as an implicit filed. > > Thanks, > Lixun > > > > On Sat, Jun 22, 2013 at 4:36 AM, Jeremy Cole < jeremycole@google.com <mailto:jeremycole@google.com> <mailto:jeremycole@google.com <mailto:jeremycole@google.com>>> wrote: > > Lixun, > > I've thought about this a bit and I'm not sure this will be very simple to do (or rather it's more > complicated than it might seem). While I think it is not that hard to expose the __id field to replication > internals, I think in order for this to really work it would need to be exposed to other tools, such as > mysqldump. It is also unclear how to decide when it is safe to use this __id field (how to determine if it > is in sync between master and slave). > > As an alternate suggestion, what about ignoring the current implicit PK behavior, and instead automatically > adding a field using auto_increment when the user doesn't provide a PK: > > __id BIGINT UNSIGNED NOT NULL auto_increment, > PRIMARY KEY(__id) > > Regards, > > Jeremy > > > On Wed, Jun 19, 2013 at 2:40 AM, Lixun Peng < penglixun@gmail.com <mailto:penglixun@gmail.com> <mailto:penglixun@gmail.com <mailto:penglixun@gmail.com>>> wrote: > > Hi Sergei, > > You are right, let users add primary key is best. > But I can't let users who don't want to create primary key can't use our MySQL service. > Amazon RDS also allow users to create the tables without primary key, just change binlog_format to > MIXED to solve replication problem. > > I think this feature is very useful for any MySQL cloud service providers, all of them will face this > problem in one day. > I will try to do some analysis/research in this feature implement first, I will need your help :-) > > I will notice any new updates in this email. > > > Thanks, > Lixun > > > > > On Wed, Jun 19, 2013 at 5:14 PM, Sergei Golubchik < serg@mariadb.org <mailto:serg@mariadb.org> <mailto:serg@mariadb.org <mailto:serg@mariadb.org>>> wrote: > > Hi, Lixun! > > > On Jun 18, Lixun Peng wrote: > > Hi, > > > > As we know, InnoDB has implicit primary key if a table hasn't defined > > a primary key. However MySQL server doesn't know this primary key, so > > this primary key will not apear in binlog. > > > > When we are using ROW format binlog for replication, if a table has no any > > indexes, that's a disaster. If a table without indexes do a DML > > (UPDATE/DELETE), of course it will run a long time in master, but in slave, > > it still need a long time. It will cause serious slave replication delay. > ... > > > I think mysql server can do the same thing as InnoDB do, if user doesn't > > define the primary key, mysql can add the primary key automatically. > > > > How do you think? > > Well, that's doable. A much easier solution would be to require a user > to create a primary key. It's a one-line change: > > - Table_flags ha_table_flags() const { return cached_table_flags; } > + Table_flags ha_table_flags() const { return cached_table_flags | HA_REQUIRE_PRIMARY_KEY; } > > But what you suggest is possible too, I believe. > > Regards, > Sergei
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
-- Senior Database Engineer @ Taobao.com Mobile Phone: +86 18658156856 (Hangzhou) Gtalk: penglixun(at)gmail.com Twitter: http://www.twitter.com/plinux Blog: http://www.penglixun.com
participants (4)
-
Davi Arnaut
-
Jeremy Cole
-
Lixun Peng
-
Reindl Harald