Using systemtap I got:

[Tue Mar  3 18:48:46 2015] SIGKILL was sent to ps (pid:9763) by processes uid:0
[Tue Mar  3 18:49:46 2015] SIGKILL was sent to ps (pid:9824) by processes uid:0
[Tue Mar  3 18:50:44 2015] SIGKILL was sent to mysqld (pid:7597) by mysqld uid:498
[Tue Mar  3 18:50:46 2015] SIGKILL was sent to ps (pid:9890) by processes uid:0
[Tue Mar  3 18:51:46 2015] SIGKILL was sent to ps (pid:9997) by processes uid:0
[Tue Mar  3 18:52:35 2015] SIGKILL was sent to cdm (pid:10045) by cdm uid:0
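
As a reading aid for the trace above, here is a minimal Python sketch that splits one line into its fields. The regular expression and the name `parse_sigkill_line` are assumptions inferred only from the six lines shown, not from the systemtap script that produced them. It makes explicit that the number in parentheses is the PID of the target, while the trailing number is labelled `uid` and belongs to the sender.

```python
import re

# Pattern inferred from the sample lines above; an assumption, not the
# documented output format of the systemtap script.
LINE_RE = re.compile(
    r"^\[(?P<when>[^\]]+)\] SIGKILL was sent to (?P<target>\S+) "
    r"\(pid:(?P<target_pid>\d+)\) by (?P<sender>\S+) uid:(?P<sender_uid>\d+)$"
)


def parse_sigkill_line(line):
    """Return a dict of fields for one trace line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["target_pid"] = int(fields["target_pid"])
    fields["sender_uid"] = int(fields["sender_uid"])
    return fields


sample = "[Tue Mar  3 18:50:44 2015] SIGKILL was sent to mysqld (pid:7597) by mysqld uid:498"
event = parse_sigkill_line(sample)
print(event["target"], event["target_pid"], event["sender"], event["sender_uid"])
```

To map the uid to an account name on the server itself, `pwd.getpwuid(498).pw_name` from the Python standard library should work; the result depends on the machine, so it is not shown here.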


I saw this several times, but I wasn't able to find (in this case) which process is behind uid 498.

@jocelyn, you are right: no tweaks at all for tokudb (my bad here). I wasn't expecting such a naive approach, though. I mean, I could have not enabled the tokudb plugin and all the other tables using the other engines would work just fine (maybe just slow, though).

free -m returns about 47M free...which isn't much...
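
For context on that number: the "free" column does not count buffers or page cache, which the kernel can usually give back under pressure. A rough Python sketch of that sum, reading /proc/meminfo-style text, is below. The function names and the sample values are made up for illustration; they are not taken from this server.

```python
def parse_meminfo_kib(text):
    """Parse /proc/meminfo-style text into a {field: KiB} dict."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def reclaimable_estimate_mib(meminfo):
    """Free memory plus buffers and page cache, in MiB (a rough upper bound)."""
    kib = meminfo["MemFree"] + meminfo.get("Buffers", 0) + meminfo.get("Cached", 0)
    return kib // 1024


# Made-up sample values for illustration only; on the server, read the real
# file with open("/proc/meminfo").read() instead.
sample = """MemFree:   48128 kB
Buffers:   10240 kB
Cached:    204800 kB
"""
estimate_mib = reclaimable_estimate_mib(parse_meminfo_kib(sample))
print(estimate_mib)
```

If the estimate is also small, the box really is short on memory; if it is large, the low "free" figure alone says little.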

The issue is that if I need to give tokudb more RAM, I will need to take it from the InnoDB buffer pool, which we use a lot. The main reason to use tokudb was to "compress" some huge tables that are neither transactional nor read all the time.

  


On Tue, Mar 3, 2015 at 4:06 PM, Justin Swanhart <greenlion@gmail.com> wrote:
Hi,

Presumably, if the server is using too much RAM it will either a) crash and leave a stack trace (not happening) or b) invoke the OOMPK, which leaves a trace in syslog. That, presumably, is not happening either, since Gabriel can find no trace of OOMPK activity.

So while memory is a good guess (output of free -m would be useful), I doubt it is memory-related, due to the lack of evidence for it.

--Justin

On Tue, Mar 3, 2015 at 9:58 AM, jocelyn fournier <jocelyn.fournier@gmail.com> wrote:
Hi,

No variables to tweak the TokuDB memory allocation?
By default, tokudb_cache_size takes half of the total system memory.
With 6G allocated to the InnoDB buffer pool and 1G to the MyISAM key buffer, are you sure you have enough memory for TokuDB?
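
To make the arithmetic concrete, here is a rough Python sketch of that budget. The total RAM of the server is not given in this thread, so `total_ram_gib` is a placeholder to fill in, and the 8 GiB in the example call is purely hypothetical. The 6G and 1G defaults come from the my.cnf, and the 0.5 fraction models the tokudb_cache_size default mentioned above. The function name is my own.

```python
def fixed_cache_budget_gib(total_ram_gib, innodb_buffer_pool_gib=6.0,
                           myisam_key_buffer_gib=1.0, tokudb_fraction=0.5):
    """Sum the large fixed caches and report what is left over.

    tokudb_fraction=0.5 models the default tokudb_cache_size (half of RAM).
    Per-connection buffers, the query cache and the OS itself are NOT counted.
    """
    tokudb_cache_gib = total_ram_gib * tokudb_fraction
    committed_gib = innodb_buffer_pool_gib + myisam_key_buffer_gib + tokudb_cache_gib
    return {
        "tokudb_cache_gib": tokudb_cache_gib,
        "committed_gib": committed_gib,
        "headroom_gib": total_ram_gib - committed_gib,
    }


# Hypothetical 8 GiB server; replace with the real figure from `free -m`.
budget = fixed_cache_budget_gib(total_ram_gib=8.0)
print(budget)
```

A negative headroom means the three caches alone are configured for more memory than the machine has, before any per-thread buffers are counted.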

  Jocelyn


Le 03/03/2015 17:37, Gabriel Sosa a écrit :
@jocelyn Here is the my.cnf file

@justin I will check that post. Thanks

[mysqld]

# -------------------------------------------------------------------------------
# ++++ General
# -------------------------------------------------------------------------------
datadir                         = /var/lib/mysql/data
pid-file                        = /var/run/mysqld/mysqld.pid
socket                          = /var/lib/mysql/mysql.sock
tmpdir                          = /var/lib/mysql/tmp
port                            = 3306

# -------------------------------------------------------------------------------
# ++++ Logging
# -------------------------------------------------------------------------------
log-error                       = /var/lib/mysql/log/mysqld-error.log
long_query_time                 = 10
slow-query-log-file             = /var/lib/mysql/log/mysqld-queries_slow.log
log-slave-updates
log-bin                         = /var/lib/mysql/log/bin/slave-bin
log-warnings                    = 1

max_relay_log_size              = 200M 
relay_log_space_limit           = 25000M

# -------------------------------------------------------------------------------
# ++++ Network
# -------------------------------------------------------------------------------
max_connections                 = 3000
max_connect_errors              = 1000
wait_timeout                    = 120
connect_timeout                 = 30
interactive_timeout             = 3600
slave_net_timeout               = 120
back_log                        = 50
max_allowed_packet              = 128M


# -------------------------------------------------------------------------------
# ++++ Misc
# -------------------------------------------------------------------------------
# Text Searchs
ft_min_word_len                  = 2
plugin-load  = ha_tokudb

# -------------------------------------------------------------------------------
# ++++ Threads
# -------------------------------------------------------------------------------
thread_concurrency              = 8
thread_cache                    = 64

# -------------------------------------------------------------------------------
# ++++ Memory
# -------------------------------------------------------------------------------
# Tables
table_cache                     = 4096
tmp_table_size                  = 256M
# Memory per Thread
sort_buffer_size                = 8M
read_buffer_size                = 4M
read_rnd_buffer_size            = 16M
# Query Cache
query_cache_type                = 1
query_cache_limit               = 2M
query_cache_size                = 64M

# -------------------------------------------------------------------------------
# ++++ MyISAM Parameters
# -------------------------------------------------------------------------------
key_buffer_size                 = 1024M
myisam_sort_buffer_size         = 64M
myisam_recover                  = FORCE,BACKUP

# -------------------------------------------------------------------------------
# ++++ InnoDB Parameters
# -------------------------------------------------------------------------------
# General
innodb_data_home_dir            = /var/lib/mysql/innodb
innodb_log_group_home_dir       = /var/lib/mysql/innodblogs
innodb_file_per_table
innodb_data_file_path           = ibdata1:100M:autoextend
innodb_status_file              = ib_status
innodb_autoextend_increment     = 10M
innodb_support_xa               = 0
innodb_thread_concurrency       = 8
innodb_flush_method             = O_DIRECT
innodb_flush_log_at_trx_commit  = 2
# Memory
innodb_buffer_pool_size         = 6G
innodb_additional_mem_pool_size = 8M
innodb_open_files               = 512
# Logging
innodb_log_buffer_size          = 8M
innodb_log_file_size            = 256M
innodb_log_files_in_group       = 2


# -------------------------------------------------------------------------------
# ++++ Replication : SLAVE Profile
# -------------------------------------------------------------------------------
#skip-slave-start
server-id                       = 124388
relay-log                       = /var/lib/mysql/log/replication/slave-bin
relay-log-info-file             = /var/lib/mysql/log/replication/slave-log.info
master-info-file                = /var/lib/mysql/log/replication/master-log.info
max_binlog_size                 = 20971520

read-only                       = 1

[mysqldump]
quick
max_allowed_packet              = 128M

[isamchk]
key_buffer                      = 256M
sort_buffer_size                = 256M
read_buffer                     = 2M
write_buffer                    = 2M

[myisamchk]
key_buffer                      = 256M
sort_buffer_size                = 256M
read_buffer                     = 2M
write_buffer                    = 2M

[mysqlhotcopy]
interactive-timeout

On Tue, Mar 3, 2015 at 12:56 PM, Justin Swanhart <greenlion@gmail.com> wrote:
Hi,

You might try to find the source of the termination with this:

On Tue, Mar 3, 2015 at 8:45 AM, jocelyn fournier <jocelyn.fournier@gmail.com> wrote:
Hi,

Could you send your my.cnf ?

Thanks,
  Jocelyn

Le 03/03/2015 16:24, Gabriel Sosa a écrit :
Hello,

I've been a proud user of MariaDB 5.5.x for a long time now, but given the nice feature set I decided to give MariaDB 10.x a try.

I took one of our current slaves running MariaDB 5.5.x (on CentOS 6.5), followed the upgrade steps using yum, and ran *mysql_upgrade*. The upgrade ran without any trouble. Then I moved a huge table (about 1B records right now) from InnoDB to TokuDB.

Now, every couple of hours I find that replication is far behind the master, and the reason is that this server keeps checking tables marked as crashed.

I can't seem to find any indication of the OOM killer in the system logs, nor anything related to it in the MySQL log. The only clue I have is:


---------------
150303 10:08:57 mysqld_safe Number of processes running now: 0
150303 10:08:57 mysqld_safe mysqld restarted
150303 10:08:58 [Warning] 'THREAD_CONCURRENCY' is deprecated and will be removed in a future release.
150303 10:08:59 [Warning] option 'innodb-status-file': boolean value 'ib_status' wasn't recognized. Set to OFF.
150303 10:08:59 [Warning] option 'innodb-autoextend-increment': unsigned value 10485760 adjusted to 1000
150303 10:08:59 [Note] InnoDB: Using mutexes to ref count buffer pool pages
150303 10:08:59 [Note] InnoDB: The InnoDB memory heap is disabled
150303 10:08:59 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
150303 10:08:59 [Note] InnoDB: Memory barrier is not used
150303 10:08:59 [Note] InnoDB: Compressed tables use zlib 1.2.3
150303 10:08:59 [Note] InnoDB: Using Linux native AIO
150303 10:08:59 [Note] InnoDB: Using CPU crc32 instructions
150303 10:08:59 [Note] InnoDB: Initializing buffer pool, size = 6.0G
150303 10:08:59 [Note] InnoDB: Completed initialization of buffer pool
150303 10:08:59 [Note] InnoDB: Highest supported file format is Barracuda.
150303 10:08:59 [Note] InnoDB: Log scan progressed past the checkpoint lsn 14178608217047
150303 10:08:59 [Note] InnoDB: Database was not shutdown normally!
150303 10:08:59 [Note] InnoDB: Starting crash recovery.
150303 10:08:59 [Note] InnoDB: Reading tablespace information from the .ibd files...
150303 10:09:05 [Note] InnoDB: Restoring possible half-written data pages 
150303 10:09:05 [Note] InnoDB: from the doublewrite buffer...
InnoDB: Doing recovery: scanned up to log sequence number 14178613459456
InnoDB: Doing recovery: scanned up to log sequence number 14178618702336
InnoDB: Doing recovery: scanned up to log sequence number 14178623945216
InnoDB: Doing recovery: scanned up to log sequence number 14178629188096
InnoDB: Doing recovery: scanned up to log sequence number 14178634430976
InnoDB: Doing recovery: scanned up to log sequence number 14178639673856
InnoDB: Doing recovery: scanned up to log sequence number 14178644916736
......
......
......
......
150303 10:09:14 [Note] InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percent: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 
InnoDB: Apply batch completed
InnoDB: In a MySQL replication slave the last master binlog file
InnoDB: position 0 16433109, file name slave-bin.852141
InnoDB: Last MySQL binlog file position 0 16784214, file name /var/lib/mysql/log/bin/slave-bin.005211
150303 10:09:34 [Note] InnoDB: 128 rollback segment(s) are active.
150303 10:09:34 [Note] InnoDB: Waiting for purge to start
150303 10:09:34 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.22-72.0 started; log sequence number 14178834821658
Tue Mar  3 10:09:36 2015 TokuFT recovery starting in env /var/lib/mysql/data/
Tue Mar  3 10:09:36 2015 TokuFT recovery scanning backward from 1330297
Tue Mar  3 10:09:36 2015 TokuFT recovery bw_end_checkpoint at 1330297 timestamp 1425395277901416 xid 1330281 (bw_newer)
Tue Mar  3 10:09:36 2015 TokuFT recovery bw_begin_checkpoint at 1330281 timestamp 1425395273487081 (bw_between)
Tue Mar  3 10:09:36 2015 TokuFT recovery turning around at begin checkpoint 1330281 time 4414335
Tue Mar  3 10:09:36 2015 TokuFT recovery starts scanning forward to 1330297 from 1330281 left 16 (fw_between)
Tue Mar  3 10:09:36 2015 TokuFT recovery closing 14 dictionaries
Tue Mar  3 10:09:36 2015 TokuFT recovery making a checkpoint
Tue Mar  3 10:09:36 2015 TokuFT recovery done
150303 10:09:36 [Note] Recovering after a crash using /var/lib/mysql/log/bin/slave-bin

---------------
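
Since the restarts recur every couple of hours, it may help to pull every restart time out of the error log and compare them with other evidence. A minimal Python sketch follows, assuming each restart is marked by the "mysqld_safe mysqld restarted" line seen above and that the line starts with a "YYMMDD HH:MM:SS" timestamp. The helper name is hypothetical.

```python
from datetime import datetime

RESTART_MARKER = "mysqld_safe mysqld restarted"


def restart_times(log_lines):
    """Yield a datetime for every mysqld_safe restart line."""
    for line in log_lines:
        if RESTART_MARKER in line:
            # The first 15 characters hold the "YYMMDD HH:MM:SS" timestamp.
            yield datetime.strptime(line[:15], "%y%m%d %H:%M:%S")


# Two lines copied from the log excerpt above; for the real file, pass
# open("/var/lib/mysql/log/mysqld-error.log") instead.
sample_lines = [
    "150303 10:08:57 mysqld_safe Number of processes running now: 0",
    "150303 10:08:57 mysqld_safe mysqld restarted",
]
restarts = list(restart_times(sample_lines))
print(restarts)
```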

The odd thing is that the 5.5.x version had been working just fine (for about a year) on the same hardware; nothing changed on that front.

Any clue?

Thank you in advance



_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to     : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp







--
Gabriel Sosa
Sometimes the questions are complicated and the answers are simple. -- Dr. Seuss




