Hi Pierre, there are quite a few MySQL bug reports in MySQL bug database that can affect your use case. Like: http://bugs.mysql.com/bug.php?id=5731 http://bugs.mysql.com/bug.php?id=29446 http://bugs.mysql.com/bug.php?id=59925 http://bugs.mysql.com/bug.php?id=62570 http://bugs.mysql.com/bug.php?id=62827 http://bugs.mysql.com/bug.php?id=45702 BUG#62827 is probably the most interesting. Even though some of them are fixed, it is very likely that they're not fixed completely. I tend to remember 32-bit variables on both key cache and "repair by sort" ways. Could you check actual value of myisam_sort_buffer_size and key_buffer_size just to make sure it wasn't cut at startup at least? I'd suggest to report a bug in jira: https://mariadb.atlassian.net I believe it is somthing worth checking at least. Regards, Sergey On Mon, Feb 10, 2014 at 01:45:34PM +0100, Pierre wrote:
Hello,
Mariadb is getting slower and slower when I'm inserting a massive amout of data. I'm trying to insert 166 507 066 rows (12go) using load data infile '' into an empty table. I splitted my file in 13 parts of the same size because it was too long to insert in one shot. When I inserted more than 100M rows, it starts to be extremely slow.
Here are the time for each load data infile :
LOAD DATA INFILE '/tmp/hash/xaa' : 2min 49sec LOAD DATA INFILE '/tmp/hash/xab' : 2min 49sec LOAD DATA INFILE '/tmp/hash/xac' : 3min 48sec LOAD DATA INFILE '/tmp/hash/xad' : 3min 48sec LOAD DATA INFILE '/tmp/hash/xae' : 3min 49sec LOAD DATA INFILE '/tmp/hash/xaf' : 5min 59sec LOAD DATA INFILE '/tmp/hash/xag' : 10min 50sec LOAD DATA INFILE '/tmp/hash/xah' : 20min 7sec LOAD DATA INFILE '/tmp/hash/xai' : 47min 7sec LOAD DATA INFILE '/tmp/hash/xaj' : 4 hours 1 min 9.34 sec ( ouch !) LOAD DATA INFILE '/tmp/hash/xak' : still running...
Now the task is running but mariadb is only using 2% cpu and read to the disk at 2500 Kb/sec (which is slow regarding the disk performance). There is a lot of disk space. If insertion speed continue to slow down at this rate, it will take weeks to insert the 3 remainings files ! It's only 166M rows so I think it's not normal. If I disable the "unique" constraint insertion speed is great but I need to be sure there is no duplicate and I also need an index.
Can someone explain me what is happening internaly and why it is so slow ? Do you think if I just create an index (not a unique index) it will be faster ? The problem is I specially delegated this task of unique checking to mysql. If no solution, do you know a database or a key/value store with better for performance for this use case ? (create an index and remove duplicate ?) I can rearrange my table structure (at a cost of more disk space usage) to match a key/value structure.
Here is my current table structure : CREATE TABLE `wallets` ( `p1` varbinary(20) DEFAULT NULL, `p2` varbinary(20) DEFAULT NULL, `data` varbinary(32) DEFAULT NULL, UNIQUE KEY `p1` (`p1`), UNIQUE KEY `p2` (`p2`) ) ENGINE=MyISAM DEFAULT CHARSET=utf8 ROW_FORMAT=FIXED;
Computer specs: 16go RAM and 4x3.1 ghz, 2x1To (with 90% unused disk space)
The configuration variables (I tweaked specially for this insertion job) :
key_buffer_size = 6G # MyISAM: used with insert to an empty table bulk_insert_buffer_size = 6G # MyISAM/Aria: insert to a non empty table; default 8M myisam_sort_buffer_size = 6G # MyISAM: used for sort but only when : "CREATE INDEX", "ALTER TABLE" or "REPAIR TABLE" read_buffer_size = 6G # MyISAM/Aria/MERGE: allocated for each table scan, for order by, nested query caching, bulk insert into partitions sort_buffer_size = 4M # ALL: allocated each time a session need to do a sort myisam_max_sort_file_size = 128G # MyISAM: tmp file max size
I'm using mariadb 5.5.31.
Best regards,
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp