Excessive memory usage 10.11.6
Hi, this seems like a memory leak issue. Take a look at https://jira.mariadb.org/browse/MDEV-30889

On Mon, May 6, 2024 at 1:06 AM Clint Todish via discuss <discuss@lists.mariadb.org> wrote:
We're seeing a sudden memory usage issue on one of our replication slaves: memory usage continues to grow until the OOM killer takes out the process - at least it did until we dropped the buffer pool size down substantially. This server has been running in production for a few years now without any issues, but it appears a recent update (from the Debian 12.5 repo) has broken something. The system is currently configured with a maximum InnoDB buffer pool of 100G, yet the process still chews up nearly 500G of RAM doing nothing but processing replication data (no clients).
Server version: 10.11.6-MariaDB-0+deb12u1-log Debian 12
MariaDB [(none)]> show processlist;
+-------+-------------+-----------+--------------+--------------+------+--------------------------------------------------------------+------------------+----------+
| Id    | User        | Host      | db           | Command      | Time | State                                                        | Info             | Progress |
+-------+-------------+-----------+--------------+--------------+------+--------------------------------------------------------------+------------------+----------+
|     5 | system user |           | NULL         | Slave_IO     | 2504 | Waiting for master to send event                             | NULL             |    0.000 |
|     7 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|     9 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    10 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    11 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    12 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    13 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    14 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    15 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    16 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    17 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    18 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    19 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    20 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    21 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|    22 | system user |           | NULL         | Slave_worker |    0 | Waiting for prior transaction to start commit                | NULL             |    0.000 |
|     6 | system user |           | NULL         | Slave_SQL    |  213 | Slave has read all relay log; waiting for more updates       | NULL             |    0.000 |
....
MariaDB [(none)]> SELECT ((@@innodb_buffer_pool_size + @@innodb_log_buffer_size + @@key_buffer_size + @@query_cache_size + 12 * (@@bulk_insert_buffer_size + @@join_buffer_size + @@read_buffer_size + @@read_rnd_buffer_size + @@sort_buffer_size + @@tmp_table_size)) / 1024 / 1024 / 1024) AS max_memory_GB;
+------------------+
| max_memory_GB    |
+------------------+
| 124.333496093750 |
+------------------+
1 row in set (0.000 sec)
Does anyone have a clue how to figure out what is actually consuming this memory internally and how we could mitigate it? The system is barely stable after lowering the pool size to 100G but this isn't ideal.
thanks -C
I suspect this is something more recent as this issue just showed up with 10.11.6 for us. Short of valgrind or dumping a (massive) core file, is there some other way to see current allocations in mariadb? -C
Not a DB expert here, but you can try the performance schema. A query like

  SELECT * FROM performance_schema.memory_summary_global_by_event_name;

or something more targeted like

  SELECT CURRENT_NUMBER_OF_BYTES_USED
  FROM performance_schema.memory_summary_global_by_event_name
  WHERE EVENT_NAME = 'memory/performance_schema/memory_summary_by_thread_by_event_name';

might pinpoint something specific (see the sketch after the quoted message below).

On Mon, May 6, 2024 at 1:32 AM clint--- via discuss <discuss@lists.mariadb.org> wrote:
I suspect this is something more recent as this issue just showed up with 10.11.6 for us. Short of valgrind or dumping a (massive) core file, is there some other way to see current allocations in mariadb?
-C
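Expanding on the performance_schema suggestion above, a query along these lines lists the biggest currently-accounted memory consumers (a sketch; the table and column names are the standard performance_schema ones, and the performance schema has to be enabled for the counters to be populated):

  SELECT EVENT_NAME,
         CURRENT_NUMBER_OF_BYTES_USED / 1024 / 1024 AS current_MB
  FROM performance_schema.memory_summary_global_by_event_name
  ORDER BY CURRENT_NUMBER_OF_BYTES_USED DESC
  LIMIT 20;

If none of the instrumented allocations come anywhere near the process's resident size, the excess is likely outside MariaDB's own accounting.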
Hi, clint

On May 05, clint--- via discuss wrote:
I suspect this is something more recent as this issue just showed up with 10.11.6 for us. Short of valgrind or dumping a (massive) core file, is there some other way to see current allocations in mariadb?
It's likely not MariaDB as such - see MEMORY_USED in INFORMATION_SCHEMA.PROCESSLIST and in SHOW GLOBAL STATUS, or check PERFORMANCE_SCHEMA.

In all the cases we've heard about, the memory growth was caused by memory fragmentation, that is, the memory is neither allocated by MariaDB nor available to the OS. Switching to a different memory allocator, like jemalloc or tcmalloc, has reportedly helped some users.

Also, try disabling transparent huge pages (THP). If you search for "transparent huge pages databases" you'll find a lot of links like

https://www.pingcap.com/blog/transparent-huge-pages-why-we-disable-it-for-da...
https://www.mongodb.com/docs/manual/tutorial/transparent-huge-pages/
https://docs.oracle.com/en/database/oracle/oracle-database/18/ladbi/disablin...
https://www.percona.com/blog/why-tokudb-hates-transparent-hugepages/

etc. Or upgrade to 10.11.7, where MariaDB does it for you automatically:
https://jira.mariadb.org/browse/MDEV-33279

Regards,
Sergei
Chief Architect, MariaDB Server
and security@mariadb.org
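For reference, checking and changing the THP setting at runtime looks roughly like this (a sketch for a typical Linux box; these are the standard sysfs paths, and the change is lost on reboot unless made persistent via a kernel boot parameter or a systemd unit, which is not shown here):

  # Show the current mode; the value in [brackets] is the active one
  cat /sys/kernel/mm/transparent_hugepage/enabled
  # Disable THP at runtime (run as root)
  echo never > /sys/kernel/mm/transparent_hugepage/enabled
  echo never > /sys/kernel/mm/transparent_hugepage/defrag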
Howdy Sergei,

I did try swapping to tcmalloc when the issue first popped up, as that was mentioned as a possible workaround for buggy allocators in the past, but it didn't help here. However, disabling THP as you suggested appears to have immediately resolved the problem. Thanks for your time!

-C
clint--- via discuss <discuss@lists.mariadb.org> writes:
I suspect this is something more recent as this issue just showed up with 10.11.6 for us. Short of valgrind or dumping a (massive) core file, is there some other way to see current allocations in mariadb?
i_s.processlist has memory usage per thread:

  SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST;

  SELECT command, memory_used, max_memory_used
  FROM INFORMATION_SCHEMA.PROCESSLIST
  WHERE command="Slave_worker";

And you can also try:

  SHOW GLOBAL STATUS LIKE "%memory_used%";

If one of these shows excessive memory usage, that's a place to start. If not, then maybe it's some non-accounted memory usage, or it could be malloc-internal fragmentation.

You can try the command:

  mysqladmin debug

It prints some internal memory statistics from the malloc implementation in the server error log, like this:

  Memory status:
  Non-mmapped space allocated from system: 11759616
  Number of free chunks: 127
  Number of fastbin blocks: 1
  Number of mmapped regions: 44
  Space in mmapped regions: 79798272
  Maximum total allocated space: 0
  Space available in freed fastbin blocks: 48
  Total allocated space: 9955376
  Total free space: 1804240
  Top-most, releasable space: 73968
  Estimated memory (with thread stack): 93351936
  Global memory allocated by server: 17005952
  Memory allocated by threads: 419216

There is some description of these fields in `man mallinfo`.

If the numbers reported by malloc are much smaller than what you see in actual system usage, then maybe internal malloc memory fragmentation is the issue.

 - Kristian.
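To compare those numbers with what the OS actually sees, something like the following can be run on the replica (a rough sketch; it assumes the server binary is named mariadbd, as it is on Debian 12, and that mysqladmin can connect with admin privileges):

  # Dump the malloc statistics above into the server error log
  mysqladmin debug
  # What the kernel currently has resident (and swapped) for the server process
  grep -E 'VmRSS|VmSwap' /proc/$(pidof mariadbd)/status

If VmRSS is far above both the malloc totals and the server's own accounting, that points at allocator fragmentation or THP rather than a leak inside MariaDB.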
participants (5)
- Ankesh Anand
- Clint Todish
- clint@todish.com
- Kristian Nielsen
- Sergei Golubchik