We're running MariaDB 10.1.16 on Debian 7 (wheezy). We were running
v5.5.51, but we started having occasional unrecoverable semaphore waits
once or twice a month. Each time we had to shut down all Apache
processes running PHP code, shut down and restart MariaDB, and finally
restart the Apache servers.
That got us by until one day the problem recurred 5 minutes after
every restart. Almost exactly 5 minutes after the Apache services
started we'd see the first symptoms (connection counts surging, CPU
hitting the roof), and within two minutes of that the log would start
filling with semaphore wait messages.
So I performed an emergency upgrade to 10.1. It's been about a
month and I thought this had solved the problem until last Monday...
when it happened again. The usual restart process fixed it and I
haven't seen any more of those messages logged since.
There is a ton of information I could give but I'm not sure what will
be helpful. So I'll start with the basics:
Hardware/OS:
12-core Intel Xeon @ 1.8GHz
64GB of RAM
6x Samsung 850 SSDs in a software RAID6 array
Debian 7 (64-bit) Linux
General MariaDB info:
v10.1.16 installed from MariaDB repos.
12,000 connection limit
<1,000 typically used.
Using XtraDB on all DBs other than "mysql".
InnoDB buffer pool size: 15GB
Total DB file size: ~3GB
Adaptive hash index turned off
One InnoDB file per table (file-per-table)
Workload has heavy writes, lots of subqueries and temp tables.
I counted the semaphore wait messages, grouped them by source file and
line number, and came up with this (file:line: count):
lock0lock.cc:05075: 16
lock0lock.cc:06671: 2
lock0lock.cc:06822: 18602
lock0lock.cc:07078: 2
lock0lock.cc:07159: 2492
lock0lock.cc:07631: 16
lock0lock.cc:07721: 6
lock0wait.cc:00079: 34301
lock0wait.cc:00097: 9
lock0wait.cc:00247: 22996
lock0wait.cc:00291: 2
lock0wait.cc:00358: 22
lock0wait.cc:00485: 11
lock0wait.cc:00543: 14
row0ins.cc:01846: 60
row0mysql.cc:01772: 172
row0undo.cc:00298: 619
srv0srv.cc:02874: 349
srv0srv.cc:03573: 2
trx0rec.cc:01458: 97
This covers the period from when I noticed the problem up to the time
I shut down MariaDB. I'm sure there are plenty of dupes in there, but
I thought it might give someone an idea of where to look next.
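For anyone who wants to produce the same kind of tally from their own
error log, here is a minimal sketch of how it could be done. It assumes
the wait messages have the usual form "--Thread N has waited at FILE
line N for S seconds the semaphore:"; the function names and the log
path are hypothetical, and this is not necessarily how the counts above
were generated.

```python
import re
from collections import Counter

# Assumed message format (one per reported wait), e.g.:
#   --Thread 1401 has waited at lock0lock.cc line 6822 for 241.00 seconds the semaphore:
WAIT_RE = re.compile(r"has waited at (\S+) line (\d+) for")


def count_waits(lines):
    """Count semaphore wait messages, keyed by (source file, line number)."""
    counts = Counter()
    for line in lines:
        match = WAIT_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


def format_counts(counts):
    """Render counts as 'file:line: count' rows, sorted by file and line."""
    return [
        f"{src}:{lineno:05d}: {n}"
        for (src, lineno), n in sorted(counts.items())
    ]


def report(path):
    """Print the tally for one log file (the path is supplied by the caller)."""
    with open(path, errors="replace") as fh:
        for row in format_counts(count_waits(fh)):
            print(row)
```

Calling report() with the path to the MariaDB error log prints one row
per source location. The same wait is logged again on every monitor
pass, so the counts include repeats and are only a rough guide to where
the contention is.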
I'm not sure where to start looking so I was hoping to get pointed in
the right direction. I have config files, log output and probably
anything else someone might want.
Thanks in advance for any help!
THX - Jon
--
Jon Foster
JF Possibilities, Inc.
jon@jfpossibilities.com
541-410-2760
Making computers work for you!