Excessive memory usage 10.11.6
Hi, this seems like a memory leak issue. Take a look at https://jira.mariadb.org/browse/MDEV-30889

On Mon, May 6, 2024 at 1:06 AM Clint Todish via discuss <discuss@lists.mariadb.org> wrote:
We're seeing a sudden memory usage issue on one of our replication slaves: memory usage continues to grow until the OOM killer takes out the process - at least it did until we dropped the buffer pool size down substantially. This server has been running in production for a few years now without any issues, but it appears a recent update (from the Debian 12.5 repo) has broken something. The system is currently configured with a maximum InnoDB buffer pool of 100G, yet the process still chews up nearly 500G of RAM doing nothing but processing replication data (no clients).
Server version: 10.11.6-MariaDB-0+deb12u1-log Debian 12
MariaDB [(none)]> show processlist;
+-------+-------------+-----------+--------------+--------------+------+--------------------------------------------------------------+------------------+----------+
| Id    | User        | Host      | db           | Command      | Time | State                                                        | Info             | Progress |
+-------+-------------+-----------+--------------+--------------+------+--------------------------------------------------------------+------------------+----------+
|     5 | system user |           | NULL         | Slave_IO     | 2504 | Waiting for master to send event                             | NULL             |    0.000 |
|     7 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|     9 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    10 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    11 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    12 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    13 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    14 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    15 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    16 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    17 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    18 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    19 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    20 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    21 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    22 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|     6 | system user |           | NULL         | Slave_SQL    |  213 | Slave has read all relay log; waiting for more updates       | NULL             |    0.000 |
....
MariaDB [(none)]> SELECT ((@@innodb_buffer_pool_size + @@innodb_log_buffer_size + @@key_buffer_size + @@query_cache_size + 12 * (@@bulk_insert_buffer_size + @@join_buffer_size + @@read_buffer_size + @@read_rnd_buffer_size + @@sort_buffer_size + @@tmp_table_size)) / 1024 / 1024 / 1024) AS max_memory_GB;
+------------------+
| max_memory_GB    |
+------------------+
| 124.333496093750 |
+------------------+
1 row in set (0.000 sec)
Does anyone have a clue how to figure out what is actually consuming this memory internally and how we could mitigate it? The system is barely stable after lowering the pool size to 100G but this isn't ideal.
thanks -C
I suspect this is something more recent as this issue just showed up with 10.11.6 for us. Short of valgrind or dumping a (massive) core file, is there some other way to see current allocations in mariadb? -C
Not a DB expert here, but you can try the performance schema. A query like

  SELECT * FROM performance_schema.memory_summary_global_by_event_name;

or something more targeted like

  SELECT CURRENT_NUMBER_OF_BYTES_USED
  FROM performance_schema.memory_summary_global_by_event_name
  WHERE EVENT_NAME = 'memory/performance_schema/memory_summary_by_thread_by_event_name';

might pinpoint something specific (see the sketch after the quoted message below).

On Mon, May 6, 2024 at 1:32 AM clint--- via discuss <discuss@lists.mariadb.org> wrote:
I suspect this is something more recent as this issue just showed up with 10.11.6 for us. Short of valgrind or dumping a (massive) core file, is there some other way to see current allocations in mariadb?
-C
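Expanding on the performance_schema suggestion above, a query along these lines lists the biggest currently-accounted memory consumers (a sketch; the table and column names are the standard performance_schema ones, and the performance schema has to be enabled for the counters to be populated):

  SELECT EVENT_NAME,
         CURRENT_NUMBER_OF_BYTES_USED / 1024 / 1024 AS current_MB
  FROM performance_schema.memory_summary_global_by_event_name
  ORDER BY CURRENT_NUMBER_OF_BYTES_USED DESC
  LIMIT 20;

If none of the instrumented allocations come anywhere near the process's resident size, the excess is likely outside MariaDB's own accounting.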
Hi, clint

On May 05, clint--- via discuss wrote:
I suspect this is something more recent as this issue just showed up with 10.11.6 for us. Short of valgrind or dumping a (massive) core file, is there some other way to see current allocations in mariadb?
It's likely not MariaDB as such - see MEMORY_USED in INFORMATION_SCHEMA.PROCESSLIST and in SHOW GLOBAL STATUS, or check PERFORMANCE_SCHEMA.

In all the cases we've heard about, the memory growth was caused by memory fragmentation, that is, the memory is neither allocated by MariaDB nor available to the OS. Switching to a different memory allocator, like jemalloc or tcmalloc, has reportedly helped some users.

Also, try disabling transparent huge pages (THP). If you search for "transparent huge pages databases" you'll find a lot of links like

https://www.pingcap.com/blog/transparent-huge-pages-why-we-disable-it-for-da...
https://www.mongodb.com/docs/manual/tutorial/transparent-huge-pages/
https://docs.oracle.com/en/database/oracle/oracle-database/18/ladbi/disablin...
https://www.percona.com/blog/why-tokudb-hates-transparent-hugepages/

etc. Or upgrade to 10.11.7, where MariaDB does it for you automatically:
https://jira.mariadb.org/browse/MDEV-33279

Regards,
Sergei
Chief Architect, MariaDB Server
and security@mariadb.org
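For reference, checking and changing the THP setting at runtime looks roughly like this (a sketch for a typical Linux box; these are the standard sysfs paths, and the change is lost on reboot unless made persistent via a kernel boot parameter or a systemd unit, which is not shown here):

  # Show the current mode; the value in [brackets] is the active one
  cat /sys/kernel/mm/transparent_hugepage/enabled
  # Disable THP at runtime (run as root)
  echo never > /sys/kernel/mm/transparent_hugepage/enabled
  echo never > /sys/kernel/mm/transparent_hugepage/defrag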
Howdy Sergei,

I did try swapping to tcmalloc when the issue first popped up, as that was mentioned as a possible workaround for buggy allocators in the past, but it didn't help here. However, disabling THP as you suggested appears to have immediately resolved the problem. Thanks for your time!

-C
clint--- via discuss <discuss@lists.mariadb.org> writes:
I suspect this is something more recent as this issue just showed up with 10.11.6 for us. Short of valgrind or dumping a (massive) core file, is there some other way to see current allocations in mariadb?
i_s.processlist has memory usage per thread:

  SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST;

  SELECT command, memory_used, max_memory_used
  FROM INFORMATION_SCHEMA.PROCESSLIST
  WHERE command="Slave_worker";

And you can also try:

  SHOW GLOBAL STATUS LIKE "%memory_used%";

If one of these shows excessive memory usage, that's a place to start. If not, then maybe it's some non-accounted memory usage, or it could be malloc-internal fragmentation.

You can try the command:

  mysqladmin debug

It prints some internal memory statistics from the malloc implementation in the server error log, like this:

  Memory status:
  Non-mmapped space allocated from system: 11759616
  Number of free chunks: 127
  Number of fastbin blocks: 1
  Number of mmapped regions: 44
  Space in mmapped regions: 79798272
  Maximum total allocated space: 0
  Space available in freed fastbin blocks: 48
  Total allocated space: 9955376
  Total free space: 1804240
  Top-most, releasable space: 73968
  Estimated memory (with thread stack): 93351936
  Global memory allocated by server: 17005952
  Memory allocated by threads: 419216

There is some description of these fields in `man mallinfo`.

If the numbers reported by malloc are much smaller than what you see in actual system usage, then maybe internal malloc memory fragmentation is the issue.

 - Kristian.
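To compare those numbers with what the OS actually sees, something like the following can be run on the replica (a rough sketch; it assumes the server binary is named mariadbd, as it is on Debian 12, and that mysqladmin can connect with admin privileges):

  # Dump the malloc statistics above into the server error log
  mysqladmin debug
  # What the kernel currently has resident (and swapped) for the server process
  grep -E 'VmRSS|VmSwap' /proc/$(pidof mariadbd)/status

If VmRSS is far above both the malloc totals and the server's own accounting, that points at allocator fragmentation or THP rather than a leak inside MariaDB.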
participants (5)
- Ankesh Anand
- Clint Todish
- clint@todish.com
- Kristian Nielsen
- Sergei Golubchik