That is really weird. Merely having a visible vs. an invisible PK should not have made any difference at all. In InnoDB there is always a PK; if you don't define one, a hidden 48-bit integer one is created for you.

The reasoning behind my approach was that secondary keys point at the PK, and the table is clustered on the PK. So what I proposed should have avoided the additional key dereference, which might have saved about half of the time.

So if explicit vs. implicit PK makes a 10x difference, and it is not due to the buffer pool being hot vs. cold, that is most definitely a critical performance bug you just stumbled upon.

On Sun, 29 Oct 2023, 21:08 Ivan Krylov via discuss, <discuss@lists.mariadb.org> wrote:
On Sun, 29 Oct 2023 18:12:53 +0200 Gordan Bobic <gordan.bobic@gmail.com> wrote:
ALTER TABLE mol_trans DROP INDEX spid_flag_wl, ADD COLUMN id int unsigned auto_increment, ADD PRIMARY KEY (species_id, flag, wl_vac, id);
I wasn't able to add a primary key like this. My version of MariaDB only allows the id column at the beginning of the compound primary key. Additionally, trying to add a primary key to the existing table results in an error message complaining that the index for the table is corrupted (and the implicit transaction is rolled back).
I tried recreating the table with PRIMARY KEY (species_id, flag, wl_vac, upper_id, lower_id). By itself, the combination of upper_id and lower_id must be unique, and it's our mistake that we didn't properly declare them as foreign keys into a different table. Unfortunately, this didn't improve the performance.
What _did_ improve performance was creating the id column with the type INT UNSIGNED AUTO_INCREMENT and setting it to be the primary key. With the index spid_flag_wl(species_id, flag, wl_vac) recreated, the EXPLAIN output now looks a bit different:
           id: 1
  select_type: SIMPLE
        table: mtr
         type: range
possible_keys: spid_flag_wl
          key: spid_flag_wl
      key_len: 16
          ref: NULL
         rows: 5487882
        Extra: Using index condition
...and I get my 3024559 rows in slightly more than 6 seconds.
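In case it helps anyone reproduce this, the layout that worked can be sketched roughly as follows. The column types are placeholders assumed for illustration; only the column names, the synthetic primary key and the spid_flag_wl index come from the description above.

```sql
-- Sketch only: column types are assumed, not copied from the real schema.
CREATE TABLE mol_trans (
  id         INT UNSIGNED NOT NULL AUTO_INCREMENT, -- synthetic primary key
  species_id INT UNSIGNED NOT NULL,                -- assumed type
  flag       INT          NOT NULL,                -- assumed type
  wl_vac     DOUBLE       NOT NULL,                -- assumed type
  upper_id   INT UNSIGNED NOT NULL,                -- assumed type
  lower_id   INT UNSIGNED NOT NULL,                -- assumed type
  PRIMARY KEY (id),
  -- Secondary index used by the range scan shown in the EXPLAIN output.
  KEY spid_flag_wl (species_id, flag, wl_vac)
) ENGINE=InnoDB;
```

With this layout the rows are clustered on id, so they are stored in insertion order, and each spid_flag_wl entry carries the id value needed to find the full row.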
So many thanks for giving me a pointer in the direction that eventually helped solve my problem, even if I don't fully understand why it works. Is the lesson here to always create a synthetic primary key for a table?
I've also tried recreating the table with the "natural" primary key of (upper_id, lower_id), but it takes much longer to reinsert the rows (it still isn't complete after tens of minutes, whereas previously the reinsert finished in just a few minutes), probably because the rows don't arrive sorted by upper_id and lower_id at all.
--
Best regards,
Ivan