Am 05.07.2013 06:32, schrieb Lixun Peng:
Yes, for normal user is un-meaningful
it is not only un-meaningful it is harmful for people knwoing what they are doing by massive overhead with no benefit - hence after a bulk insert your implicit key has to be removed while as example the intented unique key on a varchar added
The case is our MySQL Cloud Service, so many users are using our MySQL db for CMS or other programs. And it usually has no primary key or any unique keys, so it makes me headache.
i doubt you can solve social problems with tech
I think you know, if binlog_format=ROW, and tables have no any unique keys, what will happen. Now we just change binlog_forma=MIXED to avoid this problem. But our middleware need ROW format, so it still makes me headache. Because our users don't want to add PK by themselves, they don't know how to modify their application, they just download it and install in their web server.
So I want to add a implicit Primary Key for each tables that have no unique keys. Then we can use ROW binlog format, so many problems will be solved.
corner cases - this at least needs to be enabled via "my.cnf" and must not affect users with well database designs and shoot them in the leg due bulk inserts
On Fri, Jul 5, 2013 at 12:15 PM, Reindl Harald <h.reindl@thelounge.net <mailto:h.reindl@thelounge.net>> wrote:
it makes hardly sense to add a primary key not used in select statements this way and leads only in a lot of problems and wasted space / performance
it is common practice to remove keys before large bulk inserts and add the key *after* inserting the data which would not work with the expected benefit with your patch
Am 05.07.2013 06:08, schrieb Lixun Peng: > Hi all, > > I implement a demo patch, based on 5.5.18. > > > *1. CREATE TABLE* > root@localhost : plx 11:54:46> create table test_no_pk (col1 varchar(32)); > Query OK, 0 rows affected (0.01 sec) > > root@localhost : plx 11:55:05> desc test_no_pk; > +----------+-------------+------+-----+---------+----------------+ > | Field | Type | Null | Key | Default | Extra | > +----------+-------------+------+-----+---------+----------------+ > | col1 | varchar(32) | YES | | NULL | | > | __row_id | bigint(20) | NO | PRI | NULL | auto_increment | > +----------+-------------+------+-----+---------+----------------+ > 2 rows in set (0.01 sec) > > if users has not defined a PK, I will add it automatically. > > *2. ALTER TABLE* > root@localhost : plx 11:55:10> alter table test_no_pk add id int, add primary key(id); > Query OK, 0 rows affected (0.00 sec) > Records: 0 Duplicates: 0 Warnings: 0 > > root@localhost : plx 11:57:02> desc test_no_pk; > +-------+-------------+------+-----+---------+-------+ > | Field | Type | Null | Key | Default | Extra | > +-------+-------------+------+-----+---------+-------+ > | col1 | varchar(32) | YES | | NULL | | > | id | int(11) | NO | PRI | 0 | | > +-------+-------------+------+-----+---------+-------+ > 2 rows in set (0.01 sec) > > When users add a PK, I will remove implicit PK automatically. > > root@localhost : plx 11:57:07> alter table test_no_pk drop primary key; > Query OK, 0 rows affected (0.00 sec) > Records: 0 Duplicates: 0 Warnings: 0 > > root@localhost : plx 11:57:42> desc test_no_pk; > +----------+-------------+------+-----+---------+----------------+ > | Field | Type | Null | Key | Default | Extra | > +----------+-------------+------+-----+---------+----------------+ > | col1 | varchar(32) | YES | | NULL | | > | id | int(11) | NO | | 0 | | > | __row_id | bigint(20) | NO | PRI | NULL | auto_increment | > +----------+-------------+------+-----+---------+----------------+ > 3 rows in set (0.00 sec) > > When users dropped PK, I will add it automatically. > > *3. INSERT VALUES* > root@localhost : plx 11:59:22> insert into test_no_pk values('abc',2); > ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY' > root@localhost : plx 11:59:23> insert into test_no_pk values('abc',4); > ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY' > > it will report duplicate, *Sergei, can you help me to find why?* > > *4. SELECT ** > root@localhost : plx 12:07:23> select * from test_no_pk; > +------+----+ > | col1 | id | > +------+----+ > | abc | 6 | > +------+----+ > 1 row in set (0.00 sec) > > root@localhost : plx 12:07:30> select __row_id from test_no_pk; > +----------+ > | __row_id | > +----------+ > | 1 | > +----------+ > 1 row in set (0.00 sec) > > When users run "SELECT *", row_id will be filter. > > *5. SHOW CREATE* > > root@localhost : plx 12:07:35> show create table test_no_pk\G > *************************** 1. row *************************** > Table: test_no_pk > Create Table: CREATE TABLE `test_no_pk` ( > `col1` varchar(32) DEFAULT NULL, > `id` int(11) NOT NULL DEFAULT '0' > ) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 > 1 row in set (0.00 sec) > > row_id will be hidden. > > > Thanks, > Lixun > > > > > On Wed, Jun 26, 2013 at 10:59 AM, Lixun Peng <penglixun@gmail.com <mailto:penglixun@gmail.com> <mailto:penglixun@gmail.com <mailto:penglixun@gmail.com>>> wrote: > > Hi Jeremy, > > Thank you for your suggestion. > I also want to just add the PK field for custom automatically, but some of our customs can't accept it. > Because they are using "SELECT * FROM table .... " or " INSERT INTO table VALUES(...) ", if I add a visible PK > for them, "SELECT *" will show this value, then their applications will report errors. > So I have to set this field as an implicit filed. > > Thanks, > Lixun > > > > On Sat, Jun 22, 2013 at 4:36 AM, Jeremy Cole <jeremycole@google.com <mailto:jeremycole@google.com> <mailto:jeremycole@google.com <mailto:jeremycole@google.com>>> wrote: > > Lixun, > > I've thought about this a bit and I'm not sure this will be very simple to do (or rather it's more > complicated than it might seem). While I think it is not that hard to expose the __id field to replication > internals, I think in order for this to really work it would need to be exposed to other tools, such as > mysqldump. It is also unclear how to decide when it is safe to use this __id field (how to determine if it > is in sync between master and slave). > > As an alternate suggestion, what about ignoring the current implicit PK behavior, and instead automatically > adding a field using auto_increment when the user doesn't provide a PK: > > __id BIGINT UNSIGNED NOT NULL auto_increment, > PRIMARY KEY(__id) > > Regards, > > Jeremy > > > On Wed, Jun 19, 2013 at 2:40 AM, Lixun Peng <penglixun@gmail.com <mailto:penglixun@gmail.com> <mailto:penglixun@gmail.com <mailto:penglixun@gmail.com>>> wrote: > > Hi Sergei, > > You are right, let users add primary key is best. > But I can't let users who don't want to create primary key can't use our MySQL service. > Amazon RDS also allow users to create the tables without primary key, just change binlog_format to > MIXED to solve replication problem. > > I think this feature is very useful for any MySQL cloud service providers, all of them will face this > problem in one day. > I will try to do some analysis/research in this feature implement first, I will need your help :-) > > I will notice any new updates in this email. > > > Thanks, > Lixun > > > > > On Wed, Jun 19, 2013 at 5:14 PM, Sergei Golubchik <serg@mariadb.org <mailto:serg@mariadb.org> <mailto:serg@mariadb.org <mailto:serg@mariadb.org>>> wrote: > > Hi, Lixun! > > > On Jun 18, Lixun Peng wrote: > > Hi, > > > > As we know, InnoDB has implicit primary key if a table hasn't defined > > a primary key. However MySQL server doesn't know this primary key, so > > this primary key will not apear in binlog. > > > > When we are using ROW format binlog for replication, if a table has no any > > indexes, that's a disaster. If a table without indexes do a DML > > (UPDATE/DELETE), of course it will run a long time in master, but in slave, > > it still need a long time. It will cause serious slave replication delay. > ... > > > I think mysql server can do the same thing as InnoDB do, if user doesn't > > define the primary key, mysql can add the primary key automatically. > > > > How do you think? > > Well, that's doable. A much easier solution would be to require a user > to create a primary key. It's a one-line change: > > - Table_flags ha_table_flags() const { return cached_table_flags; } > + Table_flags ha_table_flags() const { return cached_table_flags | HA_REQUIRE_PRIMARY_KEY; } > > But what you suggest is possible too, I believe. > > Regards, > Sergei