discuss
Threads by month
- ----- 2024 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2023 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2022 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2021 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2020 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2019 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2018 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2017 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2016 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2015 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2014 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2013 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2012 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2011 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2010 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2009 -----
- December
- November
- October
- September
- August
- July
September 2018
- 14 participants
- 14 discussions
[Maria-discuss] MariaDB + RocksDB: MemTables Compactation
by Juan Telleria Ruiz de Aguirre 24 Sep '18
by Juan Telleria Ruiz de Aguirre 24 Sep '18
24 Sep '18
Dear MariaDB Developers,
I have an issue with RocksDB, as it takes a lot of Memory after
performing a Bulk Load, taking almost 30GB of RAM Memory after data
ingestion.
My question is: ¿How can I flush RocksDB Compactation MemTables after
the Data Bulk Load? ¿Is there any FLUSH TABLES ROCKSDB command sort
of?
Thank you,
Juan
Here is my.ini Configuration:
########################################################################
########################################################################
########### MARIADB: TORNADO24 CONFIGURATION FILE - ROCKSDB ############
########################################################################
########################################################################
#########################################################################
############################# client ####################################
#########################################################################
# The following options will be passed to all MariaDB clients
[client]
port = 3305
#socket = /tmp/mysql.sock
socket = C:/Program Files/MariaDB 10.3/data/mysql.sock
#pid-file = C:/Program Files/MariaDB 10.3/data/mysql.pid
default-character-set = latin1
# Here follows entries for some specific programs
#########################################################################
############################ mariadb ####################################
#########################################################################
[mariadb]
#local_infile = 1
# ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| #
# |||||||||||||||||||||||||||||| GENERAL |||||||||||||||||||||||||||||| #
# ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| #
# The MariaDB server
[mysqld]
port = 3305
#socket = /tmp/mysql.sock
socket = C:/Program Files/MariaDB 10.3/data/mysql.sock
#pid-file = C:/Program Files/MariaDB 10.3/data/mysql.pid
tmpdir = F:/MariaDB 10.3/temp/
# Table type which is used by default when creating new tables, if not
# specified differently during the CREATE TABLE statement.
# The default storage engine that will be used when new tables are created.
default-storage-engine = rocksdb
# Default storage engine that will be used for tables created with
# CREATE TEMPORARY TABLE (Whichever this Engine, Aria Engine will always
# be used for internal temporary tables, as materialized tables in subqueries,
# or other internal tables). RocksDB does not perform well for Temporary Tables,
# which require fast INSERT INTO...VALUES in Loops (For example: Cursors).
default_tmp_storage_engine = aria
# The default character set that will be used when a new schema or table is
# created and no character set is defined
character-set-server = latin1_bin
collation-server = latin1_spanish_ci
# Set the SQL mode to strict
# Default: sql-mode="STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"
sql_mode = ""
# *************************** OTHER *********************************** #
group_concat_max_len = 1048576
event_scheduler = ON
net_write_timeout = 3600
net_read_timeout = 3600
#local_infile = 1
# ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| #
# ||||||||||||||||||||||||||| DATA STORAGE |||||||||||||||||||||||||||| #
# ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| #
# Path to the database root
datadir = F:/MariaDB 10.3/data
# ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| #
# |||||||||||||||||||||||||||||| SAFETY ||||||||||||||||||||||||||||||| #
# ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| #
# The maximum size of a query packet the server can handle as well as
# maximum query size server can process (Important when working with
# large BLOBs). enlarged dynamically, for each connection.
max_allowed_packet = 512M
# Maximum amount of errors allowed per host. If this limit is reached,
# the host will be blocked from connecting to the MariaDB server until
# "FLUSH HOSTS" has been run or the server was restarted. Invalid
# passwords and other errors during the connect phase result in
# increasing this value. See the "Aborted_connects" status variable for
# global counter.
max_connect_errors = 1000
# Secure File Priv.
# Disabled: secure-file-priv="C:/ProgramData/MySQL/MySQL Server 5.7/Uploads/"
# ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| #
# |||||||||||||||||||||||| CACHES AND LIMITS |||||||||||||||||||||||||| #
# ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| #
# Maximum size for internal (in-memory) temporary tables. If a table
# grows larger than this value, it is automatically converted to disk
# based table This limitation is for a single table. There can be many
# of them.
tmp_table_size = 512M
# Maximum allowed size for a single HEAP (in memory) table. This option
# is a protection against the accidential creation of a very large HEAP
# table which could otherwise use up all memory resources.
max_heap_table_size = 1024M
# Query cache is used to cache SELECT results and later return them
# without actual executing the same query once again. Having the query
# cache enabled may result in significant speed improvements, if your
# have a lot of identical queries and rarely changing tables. See the
# "Qcache_lowmem_prunes" status variable to check if the current value
# is high enough for your load.
# Note: In case your tables change very often or if your queries are
# textually different every time, the query cache may result in a
# slowdown instead of a performance improvement.
query_cache_size = 1M
# How many threads we should keep in a cache for reuse. When a client
# disconnects, the client's threads are put in the cache if there aren't
# more than thread_cache_size threads from before. This greatly reduces
# the amount of thread creations needed if you have a lot of new
# connections. (Normally this doesn't give a notable performance
# improvement if you have a good thread implementation.)
thread_cache_size = 256
# The maximum amount of concurrent sessions the MariaDB server will
# allow. One of these connections will be reserved for a user with
# SUPER privileges to allow the administrator to login even if the
# connection limit has been reached.
# Low value choosen is order to reduce RAM Memory Allocation.
max_connections = 50
# The number of open tables for all threads. Increasing this value
# increases the number of file descriptors that mysqld requires.
# Therefore you have to make sure to set the amount of open files
# allowed to at least 4096 in the variable "open-files-limit" in
# section [mysqld_safe]
table_open_cache = 4096
# ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| #
# ||||||||||||||||||||||||||||| LOGGING ||||||||||||||||||||||||||||||| #
# ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| #
# General and Slow logging.
log-output='FILE,TABLE'
general-log=0
general_log_file="NGALARRETA-P700.log"
# Error Logging.
log-error = C:/Program Files/MariaDB
10.3/data/NGALARRETA-P700-error.log
log_queries_not_using_indexes = 0
slow-query-log = 0
slow-query-log-file = C:/Program Files/MariaDB
10.3/data/NGALARRETA-P700-slow.log
long_query_time = 1800
# Binary Logging.
# log-bin
# ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| #
# ||||||||||||||||||||||||||| ALL ENGINES ||||||||||||||||||||||||||||| #
# ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| #
# Sort buffer is used to perform sorts for some ORDER BY and GROUP BY
# queries. If sorted data does not fit into the sort buffer, a disk
# based merge sort is used instead - See the "Sort_merge_passes"
# status variable. Allocated per thread if sort is needed.
# If you see many sort_merge_passes per second in SHOW GLOBAL STATUS
# output, you can consider increasing the sort_buffer_size value to
# speed up ORDER BY or GROUP BY operations that cannot be improved with
# query optimization or improved indexing.
# Commented for choosing default values.
sort_buffer_size = 256K
# This buffer is used for the optimization of full JOINs (JOINs without
# indexes). Such JOINs are very bad for performance in most cases
# anyway, but setting this variable to a large value reduces the
# performance impact. See the "Select_full_join" status variable for a
# count of full JOINs. Allocated per thread if full join is found
# The minimum size of the buffer that is used for plain index scans, range
# index scans, and joins that do not use indexes and thus perform full
# table scans.
# Commented for Default Value.
join_buffer_size = 256K
# ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| #
# |||||||||||||||||||||||||||||| OTHERS ||||||||||||||||||||||||||||||| #
# ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| #
# back_log is the number of connections the operating system can keep in
# the listen queue, before the MariaDB connection manager thread has
# processed them. If you have a very high connection rate and experience
# "connection refused" errors, you might need to increase this value.
# Check your OS documentation for the maximum value of this parameter.
# Attempting to set back_log higher than your operating system limit
# will have no effect.
# You need to increase this only if you expect a large number of
# connections in a short period of time.
back_log = 150
# Only cache result sets that are smaller than this limit. This is to
# protect the query cache of a very large result set overwriting all
# other query results.
query_cache_limit = 2M
# Minimum word length to be indexed by the full text search index.
# You might wish to decrease it if you need to search for shorter words.
# Note that you need to rebuild your FULLTEXT index, after you have
# modified this value.
ft_min_word_len = 3
# Thread stack size to use. This amount of memory is always reserved at
# connection time. MariaDB itself usually needs no more than 64K of
# memory, while if you use your own stack hungry UDF functions or your
# OS requires more stack for some operations, you might need to set this
# to a higher value.
thread_stack = 297K
# Set the default transaction isolation level. Levels available are:
# READ-UNCOMMITTED, READ-COMMITTED, REPEATABLE-READ, SERIALIZABLE
# This variable's name might change to tx_isolation in future versions
# of MariaDB
transaction-isolation = REPEATABLE-READ
# Set the query cache type. 0 for OFF, 1 for ON and 2 for DEMAND.
query_cache_type=0
# The number of table definitions (from .frm files) that can be stored
# in the definition cache. If you use a large number of tables, you can
# create a large table definition cache to speed up opening of tables.
# The table definition cache takes less space and does not use file
# descriptors, unlike the normal table cache.
# The minimum and default values are both 400.
table_definition_cache=1400
# If the value of this variable is greater than 0, a replication slave
# synchronizes its master.info file to disk.
# (using fdatasync()) after every sync_master_info events.
sync_master_info=10000
# If the value of this variable is greater than 0, the MySQL server
# synchronizes its relay log to disk.
# (using fdatasync()) after every sync_relay_log writes to the relay log.
sync_relay_log=10000
# If the value of this variable is greater than 0, a replication
# slave synchronizes its relay-log.info file to disk.
# (using fdatasync()) after every sync_relay_log_info transactions.
sync_relay_log_info=10000
#########################################################################
########################## REPLICATION ##################################
#########################################################################
# Server Id.
# Used to identify master and slave servers in replication. The server_id
# must be unique for each server in the replicating group. If left at 0,
# the default, a slave will not connect to a master, and a master will
# refuse all slave connections.
# server_id=0
# Replication Slave (comment out master section to use this)
#
# To configure this host as a replication slave, you can choose between
# two methods :
#
# 1) Use the CHANGE MASTER TO command (fully described in our manual) -
# the syntax is:
#
# CHANGE MASTER TO MASTER_HOST=<host>, MASTER_PORT=<port>,
# MASTER_USER=<user>, MASTER_PASSWORD=<password> ;
#
# where you replace <host>, <user>, <password> by quoted strings and
# <port> by the master's port number (3306 by default).
#
# Example:
#
# CHANGE MASTER TO MASTER_HOST='125.564.12.1', MASTER_PORT=3306,
# MASTER_USER='joe', MASTER_PASSWORD='secret';
#
# OR
#
# 2) Set the variables below. However, in case you choose this method, then
# start replication for the first time (even unsuccessfully, for example
# if you mistyped the password in master-password and the slave fails to
# connect), the slave will create a master.info file, and any later
# change in this file to the variables' values below will be ignored and
# overridden by the content of the master.info file, unless you shutdown
# the slave server, delete master.info and restart the slaver server.
# For that reason, you may want to leave the lines below untouched
# (commented) and instead use CHANGE MASTER TO (see above)
#
# required unique id between 2 and 2^32 - 1
# (and different from the master)
# defaults to 2 if master-host is set
# but will not function as a slave if omitted
#server-id = 2
#
# The replication master for this slave - required
#master-host = <hostname>
#
# The username the slave will use for authentication when connecting
# to the master - required
#master-user = <username>
#
# The password the slave will authenticate with when connecting to
# the master - required
#master-password = <password>
#
# The port the master is listening on.
# optional - defaults to 3306
#master-port = <port>
#
# binary logging - not required for slaves, but recommended
#log-bin=mysql-bin
#
# binary logging format - mixed recommended
#binlog_format=mixed
# *** Replication related settings
# Unique server identification number between 1 and 2^32-1. This value
# is required for both master and slave hosts. It defaults to 1 if
# "master-host" is not set, but will MariaDB will not function as a master
# if it is omitted.
# server-id = 1
#
# Replication Slave (comment out master section to use this)
#
# To configure this host as a replication slave, you can choose between
# two methods :
#
# 1) Use the CHANGE MASTER TO command (fully described in our manual) -
# the syntax is:
#
# CHANGE MASTER TO MASTER_HOST=<host>, MASTER_PORT=<port>,
# MASTER_USER=<user>, MASTER_PASSWORD=<password> ;
#
# where you replace <host>, <user>, <password> by quoted strings and
# <port> by the master's port number (3306 by default).
#
# Example:
#
# CHANGE MASTER TO MASTER_HOST='125.564.12.1', MASTER_PORT=3306,
# MASTER_USER='joe', MASTER_PASSWORD='secret';
#
# OR
#
# 2) Set the variables below. However, in case you choose this method, then
# start replication for the first time (even unsuccessfully, for example
# if you mistyped the password in master-password and the slave fails to
# connect), the slave will create a master.info file, and any later
# changes in this file to the variable values below will be ignored and
# overridden by the content of the master.info file, unless you shutdown
# the slave server, delete master.info and restart the slaver server.
# For that reason, you may want to leave the lines below untouched
# (commented) and instead use CHANGE MASTER TO (see above)
#
# required unique id between 2 and 2^32 - 1
# (and different from the master)
# defaults to 2 if master-host is set
# but will not function as a slave if omitted
#server-id = 2
#
# The replication master for this slave - required
#master-host = <hostname>
#
# The username the slave will use for authentication when connecting
# to the master - required
#master-user = <username>
#
# The password the slave will authenticate with when connecting to
# the master - required
#master-password = <password>
#
# The port the master is listening on.
# optional - defaults to 3306
#master-port = <port>
#
# Make the slave read-only. Only users with the SUPER privilege and the
# replication slave thread will be able to modify data on it. You can
# use this to ensure that no applications will accidently modify data on
# the slave instead of the master
#read_only
#########################################################################
######################### innodb (Skipped) ##############################
#########################################################################
skip-innodb
############################################################################################################################
############################################################################################################################
############################################################################################################################
# MariaDB ROCKSDB Engine Server Variables Performance-Tuning
(Optimized 50/50 Snappy)
############################################################################################################################
############################################################################################################################
############################################################################################################################
#############################################################################################################################
# rocksdb_block_size
#############################################################################################################################
# NOTES:
# a) I/O Unit (Not fully Aligned).
# b) Default is 4KB. 16 KB gives better space savings, but needs extra
CPU for decompression.
# c) Measure trade-offs between 4K, 8K, 16K and 32 K.
# d) Check the size of data of the block cache in DB_BLOCK_CACHE_USAGE
Column of the INFORMATION_SCHEMA.ROCKSDB_DBSTATS table.
rocksdb_block_size = 8192 # Default Value: 4096 (4K)
#############################################################################################################################
# rocksdb_block_cache_size
#############################################################################################################################
# DESCRIPTION: Block_cache size for RocksDB.
# NOTES: RocksDB’s Internal Cache. Similar to innodb_buffer_pool_size,
but less important since RocksDB relies on OS cache too.
# Set Variable depending on objective "Commit(kB)" to be set in RAM
Memory by the "mysqld" process.
rocksdb_block_cache_size = 2G # Default: 512 MB
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Index & Filter Blocks
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# IMPORTANT: Both Options MUST always be set to ON (1) in order to
limit resources to rocksdb_block_cache_size Memory to the
# objective "Commit" Load.
# rocksdb_cache_index_and_filter_blocks: If you set
cache_index_and_filter_blocks to true, index and filter blocks
# will be stored in block cache, together with all other data blocks.
#
# This allows to effectively limit allocated resources.
#
# rocksdb_pin_l0_filter_and_index_blocks_in_cache: Minimizes
performance impact of rocksdb_cache_index_and_filter_blocks = ON,
# in order to effectively limit resources without a decrease in performance.
rocksdb_cache_index_and_filter_blocks = 1
rocksdb_pin_l0_filter_and_index_blocks_in_cache = 1
#############################################################################################################################
# ROCKSDB WAL Configuration Settings
#############################################################################################################################
# NOTES:
# Control Maximum WAL Size.
# Setting as large as total InnoDB Log Size would be fine.
rocksdb_max_total_wal_size = 4G # Default: 0
rocksdb_wal_dir = F:/MariaDB 10.3/data
# rocksdb_wal_dir: Recommended to set the WAL path on a different SSD
(Whenever its possible).
#############################################################################################################################
# rocksdb_max_background_jobs
#############################################################################################################################
# DEFINITION: Maximum number of concurrent background jobs, including
flushes and compactions. RocksDB will automatically
# decide how to allocate the available job slots to flushes and compactions.
# VALUE RANGE: -1 to 64.
# DEFAULT: 2.
# WARNING: This option must be used with MariaDB 10.3.8 as
“rocksdb_max_background_compactions”
# and “rocksdb_max_background_flushes” variables have been DEPRECATED.
rocksdb_max_background_jobs = 36
#############################################################################################################################
# rocksdb_lock_wait_timeout
#############################################################################################################################
# DESCRIPTION: Number of seconds to wait for lock.
rocksdb_lock_wait_timeout = 2 # Default: 1
#############################################################################################################################
# rocksdb_max_open_files
#############################################################################################################################
# NOTE:
# a) Increase file descriptor limit for mysqld process (Increase
nofile in /etc/security/limits.conf)
# b) If setting greater than 0, RocksDB still uses table_cache, which
will lock a mutex every time you access the file.
# I think you'll see much greater benefit with -1 because then you
will not need to go through LRUCache to get the table you
# need.
#
# IMPORTANT: Keep all files Open! (-1)
#
rocksdb_max_open_files = -1 # Default Value: -1
#############################################################################################################################
# rocksdb_db_write_buffer_size
#############################################################################################################################
# DESCRIPTION: This is the maximum size of all Write Buffers across
all Collumn Families in the database.
# It represents the amount of data to build up in memory (backed by an
unsorted log on disk) across all
# column families before writing to a sorted on-disk file.
#
# "Total size of memtables across column families". This can be used
to manage the total memory used by memtables.
#
# This is distinct from write_buffer_size, which enforces a limit for
a single memtable.
#
# By default this feature is disabled (by being set to 0). You should
not need to change it.
rocksdb_db_write_buffer_size = 4G # Default Value: 0 (Disabled).
#############################################################################################################################
# Other Configurations (CF Options)
#############################################################################################################################
rocksdb_table_cache_numshardbits = 6 # Default: 6. Optimal 50/50 Snappy: 6.
# -----------------------------------------------------------------------------------
# Options to Enable or Disable
# -----------------------------------------------------------------------------------
# NOTE: "allow_mmap_reads" cannot be used with "use_direct_reads" or
"use_direct_io_for_flush_and_compaction". Otherwise Server
# CRASHES.
rocksdb_allow_mmap_reads = 1 # Default: 0 # Useful for SSD Disks. Do
not set with Spinning Disks.
rocksdb_allow_mmap_writes = 0 # Default: 0
rocksdb_use_fsync = 0 # Default: 0
rocksdb_use_adaptive_mutex = 0 # Default: 0
# -----------------------------------------------------------------------------------
# Rate Limiter
# -----------------------------------------------------------------------------------
rocksdb_bytes_per_sync = 512000 # Default: 0. Optimal: 512 KB.
rocksdb_wal_bytes_per_sync = 4194304 # Default: 0. Set to 4MB.
rocksdb_rate_limiter_bytes_per_sec = 419430400 # Default: 0. Set to
400 MB/s. Increase if you're running on higher.
# -----------------------------------------------------------------------------------
# Triggering Compaction if there are many Sequential Deletes
# -----------------------------------------------------------------------------------
# NOTE: DESCRIPTION: "RocksDB will trigger compaction for the file if
it has more than the specified number
# of sequential deletes per window".
rocksdb_compaction_sequential_deletes_count_sd=1 # Default: 0
rocksdb_compaction_sequential_deletes=199999 # Default: 0
rocksdb_compaction_sequential_deletes_window=200000 # Default: 0
# -----------------------------------------------------------------------------------
# Max Subcompactations
# -----------------------------------------------------------------------------------
# DESCRIPTION: This value represents the maximum number of threads
that will concurrently perform a compaction job by breaking it into
multiple,
# smaller ones that are run simultaneously.
# DEFAULT: 1 (i.e. no subcompactions)
#
# NOTES: L0 to L1 compaction cannot be parallelized. In some cases, it
may become a bottleneck that limit the total
# compaction speed. In this case, users can set max_subcompactions to
more than 1. In this case, we'll try to partition the
# range and use multiple threads to execute it.
#
# Set up compression more aggressively and allocate more threads for
flush and compaction.
rocksdb_max_subcompactions = 2 # Default: 1
#############################################################################################################################
# Override cf options for RocksDB.
#############################################################################################################################
rocksdb_default_cf_options="write_buffer_size=512m;max_bytes_for_level_base=2048m;target_file_size_base=256m;max_bytes_for_level_multiplier=10;max_write_buffer_number=12;min_write_buffer_number_to_merge=4;target_file_size_multiplier=1;level0_file_num_compaction_trigger=1;level0_slowdown_writes_trigger=36;level0_stop_writes_trigger=42;source_compaction_factor=2;max_grandparent_overlap_factor=10"
#-------------------------------------------------------------
# MOST IMPORTANT SETTINGS: Optimal 50/50 Snappy
#-------------------------------------------------------------
# write_buffer_size: Sets the size of a single memtable. Once
memtable exceeds this size, it is marked immutable and
# a new one is created.It represents the
amount of data to build up in memory (backed by an unsorted
# log on disk) before converting to a sorted
on-disk file. The default is 64 MB.
# You need to budget for 2 x your worst case
memory use. If you don't have enough memory for this,
# you should reduce this value.
#
# "You can think of memtables as in-memory
write buffers. Each new key-value pair is first written
# to the memtable. Memtable size is
controlled by the option write_buffer_size. It's usually not a
# big memory consumer. However, memtable
size is inversely proportional to write amplification
# -- the more memory you give to the
memtable, the less the write amplification is.
#
# If you increase your memtable size, be
sure to also increase your L1 size! L1 size is controlled by
# the option *"max_bytes_for_level_base"*.
#
# Baseline: 256 MB. Optimized 50/50 Snappy: 512MB
#
# max_bytes_for_level_base: Write amplification.
*"max_bytes_for_level_base"* is total size of level 1. As mentioned,
we recommend
# that this be around the size of level 0.
Each subsequent level is *"max_bytes_for_level_multiplier"*
# larger than previous one. The default is
10 and we do not recommend changing that.
# It could be Set Same Value than "Cache Size".
#
# target_file_size_base: "Default Value is 64MB". Increasing
"*target_file_size_base*" will reduce total number of database files,
# which is generally a good thing. We
recommend setting target_file_size_base to be
# *"max_bytes_for_level_base*" / 10, so that
there are 10 files in level 1".
#
# max_bytes_for_level_multiplier: Default: 10 (Optimal Value).
#
# target_file_size_multiplier: Optimal Value: 1
#
# compression: Default: Snappy (Optimal).
#
#-------------------------------------------------------------
# OTHER SETTINGS:
#-------------------------------------------------------------
# min_write_buffer_number_to_merge: "Minimum number of memtables to be
merged before flushing to storage.
#
# For example:
# * If this option is set to 2,
immutable memtables are only flushed when there are 2 of them
# - a single immutable
memtable will never be flushed.
# * If multiple memtables are merged together, less data may be
written to storage since two updates
# are merged to a single key.
#
# Recommended Value: 4 (Optimal
50/50 Snappy).
# level0_file_num_compaction_trigger: "Number of files to trigger
level-0 compaction. A value <0 means that level-0 compaction will
# not be triggered by number of files at all". Default: 4.
#
# For example, if level0_file_num_compaction_trigger = 8 and
every flushed file is 100MB.
# Then as soon as there is 8
files, they are compacted to one 800MB file. And after we have 8
# new 100MB files, they are
compacted in the second 800MB, and so on. Eventually we'll have a list
# of 800MB files and no more
than 8 100MB files.
#
# Optimal Value: 1.
# max_write_buffer_number: "When the number of memtables waiting to
flush is greater or equal to max_write_buffer_number, writes are
# fully stopped to wait for flush finishes.
#
# The number of in-memory memtables. Each
memtable is of size write_buffer_size.
#
# Default: 2. Recommended Value: Same as
"Total Physical CPU"
# level0_slowdown_writes_trigger: If level0_slowdown_writes_trigger, a
limited write rate to DB is activated.
#
# Recommended Value: Same as
"Total Physical CPU" x3
#
# level0_stop_writes_trigger: If level0_stop_writes_trigger, writes to
DB are stopped.
#
# Recommended Value: Same as
"Total Physical CPU" x3.5
#
# source_compaction_factor: Maximum number of bytes in all source
files to be compacted in a single compaction run.
#
# Default: 1. Optimal 50/50 Snappy: 2
#
# max_grandparent_overlap_factor: Only for level-based
compactions. Default: 10. Optimal 50/50 Snappy: 10
#############################################################################################################################
# Options to Use with Spinning Disks
#############################################################################################################################
# WARNING: Do not use the Options in this Section
("rocksdb_compaction_readahead_size", "rocksdb_use_direct_reads",
# "rocksdb_use_direct_io_for_flush_and_compaction"), as they do not
work with "rocksdb_allow_mmap_reads", resulting
# in Server's Data Corruption.
# Throughput gap between random read vs. sequential read is much
higher in spinning disks. Suggestions:
#---------------------------------------------------------
# Compaction Readahead Size
#---------------------------------------------------------
#If non-zero, we perform bigger reads when doing compaction. If you're
running RocksDB on spinning disks, you should set this to
# at least 2MB (e.g: 16MB). We enforce it to be 2MB if you don't set
it with direct I/O.
rocksdb_compaction_readahead_size=0 # Default: 0
#---------------------------------------------------------
# Direct Reads
#---------------------------------------------------------
# NOTE: * "allow_mmap_reads" cannot be used with "use_direct_reads" or
"use_direct_io_for_flush_and_compaction"
# * "allow_mmap_writes" cannot be used with
"use_direct_io_for_flush_and_compaction", i.e., they cannot be set to
true at
# the same time.
rocksdb_use_direct_reads=0 # Default: 0
rocksdb_use_direct_io_for_flush_and_compaction=0 # Default: 0
#############################################################################################################################
# Data Loading: Bulk Load
#############################################################################################################################
# DESCRIPTION: Commit rows implicitly every rocksdb_bulk_load_size, on
bulk load/insert, update and delete.
# NOTE: Session Variables.
rocksdb_commit_in_the_middle = 1 # Default: 0
# Maximum number of locks a transaction can have:
rocksdb_max_row_locks = 1048576
#############################################################################################################################
# MyRocks Crash Recovery
#############################################################################################################################
# MyRocks supports crash recovery. Crash recovery is done by replaying
transaction logs from WAL (Write Ahead Log) files.
# MyRocks has a system variable rocksdb_wal_recovery_mode to control
how to apply logs, if any of the WAL entries is corrupted.
# This variable can be any of the following options:
#
# 0: If a corrupted WAL entry is detected as the last entry in the
WAL, truncate the entry and start up normally; otherwise,
# refuse to start.
#
# 1 (default): If a corrupted WAL entry is detected, fail to start.
This is the most conservative recovery mode.
#
# 2: If a corrupted WAL entry is detected in the middle of the WAL,
truncate all of WAL entries after that (even though there
# may be uncorrupted entries) and then start up normally. For
Replication Slaves, this option is fine, since the slave
# instance can recover any lost data from the master without
breaking consistency. For Replication Masters, this option
# may end up losing data if you do not run failover. For example,
if the master crashed and was restarted (by mysqld_safe,
# or auto restart after OS reboot) with this mode and it silently
truncated several WAL entries, the master would lose some
# data which may be present on one or more slaves.
#
# 3: If a corrupted WAL entry is detected in the middle of the WAL,
skip the WAL entry and continue to apply as many healthy WAL
# entries as possible. This is the most dangerous recovery option
and it is not generally recommended.
rocksdb_wal_recovery_mode = 1 # Default: 1
#########################################################################
###### myisam (Performance_Schema Variables: Joins not necessary) #######
#########################################################################
# If set, external locking for MyISAM tables is disabled.
# skip_external_locking = 1
# Size of the Key Buffer, used to cache index blocks for MyISAM tables.
# Do not set it larger than 30% of your available memory, as some memory
# is also required by the OS to cache rows. Even if you're not using
# MyISAM tables, you should still set it to 8-64M as it will also be
# used for internal temporary disk tables.
# If you don't use MyISAM tables explicitly you can set key_buffer_size
# to a very low value, 64K for example.
# 32 [GB] (Tornado24 Half RAM)· 0,25 = 8[GB]
key_buffer_size = 64K
# This buffer is allocated when MariaDB needs to rebuild the index in
# REPAIR, OPTIMIZE, ALTER table statements as well as in LOAD DATA INFILE
# into an empty table. It is allocated per thread so be careful with
# large settings.
# myisam_sort_buffer_size = 128M
# The maximum size of the temporary file MariaDB is allowed to use while
# recreating the index (during REPAIR, ALTER TABLE or LOAD DATA INFILE.
# If the file-size would be bigger than this, the index will be created
# through the key cache (which is slower).
# myisam_max_sort_file_size = 50G
# If a table has more than one index, MyISAM can use more than one
# thread to repair them by sorting in parallel. This makes sense if you
# have multiple CPUs and plenty of memory.
# myisam_repair_threads = 6
# Automatically check and repair not properly closed MyISAM tables.
# myisam_recover_options = FORCE,BACKUP
# When reading rows in sorted order after a sort, the rows are read
# through this buffer to avoid disk seeks. You can improve ORDER BY
# performance a lot, if set this to a high value.
# Allocated per thread, when needed.
# read_rnd_buffer_size = 265K
#########################################################################
################# MyISAM & Aria System Variables ########################
#########################################################################
# Values in [MB] range turn out into a performance decrease
# Size of the buffer used for doing full table scans.
# Allocated per thread, if a full scan is needed.
read_buffer_size = 256K
# MyISAM uses special tree-like cache to make bulk inserts (that is,
# INSERT ... SELECT, INSERT ... VALUES (...), (...), ..., and LOAD DATA
# INFILE) faster. This variable limits the size of the cache tree in
# bytes per thread. Setting it to 0 will disable this optimisation. Do
# not set it larger than "key_buffer_size" for optimal performance.
# This buffer is allocated when a bulk insert is detected.
# Values of 1/4 key_buffer_size make sense (Percona). It is per connection,
# so a 1/16 ratio is choosen in order not to use excesive resources:
# 8[GB] (key_buffer_size) / 16 = 512[MB] (bulk_insert_buffer_size)
# Same value has been choosen as tmp_table_size (For those cases in which
# Temporary Tables exceed 512M and use MyISAM instead of Memory Engine).
# MyISAM uses a special tree-like cache to make bulk
# inserts faster for INSERT ... SELECT, INSERT ...
# VALUES (...), (...), ..., and LOAD DATA INFILE when
# adding data to NON-EMPTY TABLES.
# limits the size of the cache tree in bytes per thread.
bulk_insert_buffer_size=256M
#########################################################################
############# Aria System Variables (Disk Temporary Tables) #############
#########################################################################
# The maximum size of the temporary file MariaDB is allowed to use while
# recreating the index (during REPAIR, ALTER TABLE or LOAD DATA INFILE.
# If the file-size would be bigger than this, the index will be created
# through the key cache (which is slower).
aria_max_sort_file_size = 100G
# Size of the Key Buffer, used to cache index blocks for Aria tables.
# Do not set it larger than 30% of your available memory, as some memory
# is also required by the OS to cache rows. Even if you're not using
# Aria tables, you should still set it to 8-64M as it will also be
# used for internal temporary disk tables.
aria_pagecache_buffer_size = 4G
# If a table has more than one index, MyISAM can use more than one
# thread to repair them by sorting in parallel. This makes sense if you
# have multiple CPUs and plenty of memory.
aria_repair_threads = 49
# This buffer is allocated when MariaDB needs to rebuild the index in
# REPAIR, OPTIMIZE, ALTER table statements as well as in LOAD DATA INFILE
# into an empty table. It is allocated per thread so be careful with
# large settings.
aria_sort_buffer_size = 128M
# Automatically check and repair not properly closed MyISAM tables.
# In MariaDB 10.3.0 aria_recover is renamed to aria_recover_options.
aria_recover_options = FORCE,BACKUP
# Path to the directory where to store transactional log:
aria_log_dir_path=C:/Program Files/MariaDB 10.3/data
#########################################################################
########################### mysqldump ###################################
#########################################################################
[mysqldump]
# Do not buffer the whole result set in memory before writing it to
# file. Required for dumping very large tables quick
quick
# The maximum size of one packet or any generated or intermediate string,
# or any parameter sent by the mysql_stmt_send_long_data() C API function.
max_allowed_packet = 16M
#########################################################################
############################# mysql #####################################
#########################################################################
[mysql]
no-auto-rehash
# Remove the next comment character if you are not familiar with SQL
# sql_safe_updates = 0
#local_infile = 1
#########################################################################
############################## mysqld_safe ##############################
#########################################################################
# ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| #
# |||||||||||||||||||||||| CACHES AND LIMITS |||||||||||||||||||||||||| #
# ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| #
[mysqld_safe]
# Increase the amount of open files allowed per process. Warning: Make
# sure you have set the global system limit high enough! The high value
# is required for a large number of opened tables
# Changes the number of file descriptors available to mysqld.
# You should try increasing the value of this option if mysqld gives you
# the error "Too many open files".
# Default Value: Autosized
# open_files_limit = 65535
# Number of table definitions that can be cached.
table_definition_cache = 4096
#########################################################################
############################## myisamchk ################################
#########################################################################
[myisamchk]
key_buffer_size = 256M
sort_buffer_size = 256M
read_buffer = 2M
write_buffer = 2M
#########################################################################
############################## mysqlhotcopy #############################
#########################################################################
[mysqlhotcopy]
interactive-timeout
#########################################################################
############################ mysqld #####################################
#########################################################################
[mysqld]
character-set-server = latin1
collation-server = latin1_spanish_ci
#local_infile = 1
[client-server]
#local_infile = 1
2
4
17 Sep '18
Hello everyone,
We are evaluating various approaches for horizontal scalabilty of fulltext
searching, and we thought about trying out Galera. We set up a 3-node
cluster in Google Cloud Compute, and we tried loading a collection of 90k
synthetic documents into a table like this one:
CREATE TABLE `documents` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`filename` varchar(255) DEFAULT NULL,
`text` longtext DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=282232 DEFAULT CHARSET=latin1
We followed the recommendation in the MariaDB docs of inserting everything,
then creating the FULLTEXT index by running:
ALTER TABLE documents ADD FULLTEXT (text);
While this works fine with a regular single node configuration (without
Galera), it is crashing in our Galera setup with a message like this:
Exception in thread "main" java.sql.SQLException: (conn=41) Index for table
'documents' is corrupt; try to repair it
at
org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.get(ExceptionMapper.java:198)
at
org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.getException(ExceptionMapper.java:110)
at
org.mariadb.jdbc.MariaDbStatement.executeExceptionEpilogue(MariaDbStatement.java:228)
at
org.mariadb.jdbc.MariaDbStatement.executeInternal(MariaDbStatement.java:334)
at
org.mariadb.jdbc.MariaDbStatement.executeUpdate(MariaDbStatement.java:520)
at
uk.ac.aston.erdf.mariadb.MariaDBIndexer.indexFiles(MariaDBIndexer.java:211)
at
uk.ac.aston.erdf.mariadb.MariaDBIndexer.runIteration(MariaDBIndexer.java:142)
at
uk.ac.aston.erdf.mariadb.MariaDBIndexer.run(MariaDBIndexer.java:117)
at
uk.ac.aston.erdf.mariadb.MariaDBIndexer.main(MariaDBIndexer.java:106)
Caused by: java.sql.SQLException: Index for table 'documents' is corrupt;
try to repair it
Query is: ALTER TABLE documents ADD FULLTEXT (text);
at
org.mariadb.jdbc.internal.util.LogQueryTool.exceptionWithQuery(LogQueryTool.java:119)
at
org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.executeQuery(AbstractQueryProtocol.java:200)
at
org.mariadb.jdbc.MariaDbStatement.executeInternal(MariaDbStatement.java:328)
... 5 more
We have tried dropping and recreating the database, and SHOW STATUS lists
the 3 nodes just fine:
MariaDB [(none)]> show status like 'wsrep_cluster%';
+--------------------------+--------------------------------------+
| Variable_name | Value |
+--------------------------+--------------------------------------+
| wsrep_cluster_conf_id | 12 |
| wsrep_cluster_size | 3 |
| wsrep_cluster_state_uuid | 258a45dd-b9e0-11e8-bdd1-e260a7908d0f |
| wsrep_cluster_status | Primary |
+--------------------------+--------------------------------------+
4 rows in set (0.001 sec)
Our Galera section is pretty bog standard: we followed most of the defaults
in the Ubuntu MariaDB 10.3 packages, except for disabling bin logs (they
take up too much space for our 40GB cloud VMs). Here is our [galera]
section:
[galera]
# Mandatory settings
wsrep_on=ON
wsrep_provider=/usr/lib/libgalera_smm.so
wsrep_cluster_address=gcomm://mariadb-g1,mariadb-g2,mariadb-g3
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
#
# Allow server to accept connections on all interfaces.
#
bind-address=0.0.0.0
#
# Optional setting
#wsrep_slave_threads=1
innodb_flush_log_at_trx_commit=0
Are we missing something obvious, or is InnoDB FULLTEXT not yet fully
supported on Galera? Should we try Mroonga instead?
Thank you for your help,
Antonio
1
1
Hi everyone.
I have some questions regarding InnoDB deletion process.
As far as I know, when we issue a DELETE command to a row stored in a
InnoDB table, the row contents are stored in the UNDO Log, and the contents
of the row are only deleted when the last transaction after the DELETE
transaction has finished. But what actually happens to the row that was
deleted? Is any disk space ever released to the OS, or are the contents
simply erased? Defragmenting a table is only done through OPTIMIZE TABLE,
or are there any automatic threads responsible for rearranging the rows?
And if the are no such threads, how do we maintain tables which suffer from
many deletions?
I hope I made myself clear. Thanks in advance,
Arlindo Neto.
5
7
Hi guys, i'm checking a dump optimization, probably a problem:
EXPLAIN SELECT * FROM table WHERE pk1=1001 AND pk2='v' AND pk3=19808671980883
pk3 = bigint, pk1=integer, pkg2=enum('v','i','c')
this return "Impossible WHERE noticed after reading const tables", my
max(pk3) = 1923425, pk1 and pk2 exists
but:
EXPLAIN UPDATE table SET field1=1234 WHERE pk1=1001 AND pk2='v' AND
pk3=19808671980883
shows:
index=PRIMARY - rows = 1.055.901 - Using where
maybe update didn't check max value? it's a INNODB engine
this execute a full table read, and slowquery
# Query_time: 15.440127 Lock_time: 0.000079 Rows_sent: 0
Rows_examined: 1.914.183 (more than table rows)
-- thanks
1
0