In most database systems, like MySQL, PostgreSQL, and MongoDB, replication of some kind is used to create a highly available architecture. Valkey/Redis is no different in this regard. Replication is native functionality to Valkey, supporting multiple replicas, and even chains of replication. To clear up any confusion, understand that Valkey replication is a different concept compared […]
Valkey/Redis: Setting Up Replication
Valkey/Redis: Configuration Best Practices
In a few of our previous blog posts, we’ve seen how to get started with Valkey/Redis by running a simple docker container and how to persist some configuration settings to disk. In this post, let’s take a step back and go over a few configuration best practices. Connections: The Valkey/Redis config file is typically located at /etc/valkey/valkey.conf, but […]
How to Fix ProxySQL Configuration When it Won’t Start
With the exception of the three configuration variables described here, ProxySQL will only parse the configuration files the first time it is started, or if the proxysql.db file is missing for some other reason.
If we want to change any of this data, we need to do so via ProxySQL’s admin interface and then save the changes to disk. That’s fine if ProxySQL is running, but what if it won’t start because of these values?
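For reference, when ProxySQL is up, that change would normally go through the admin interface. The following is just a sketch, assuming the admin interface is reachable on 127.0.0.1:6032 with the default admin/admin credentials:
[root@centos7-pxc57-4 ~]# mysql -u admin -padmin -h 127.0.0.1 -P 6032
mysql> UPDATE global_variables SET variable_value='127.0.0.1:6033' WHERE variable_name='mysql-interfaces';
mysql> SAVE MYSQL VARIABLES TO DISK;
Note that mysql-interfaces in particular only takes effect after a restart; most other mysql-* variables can also be applied immediately with LOAD MYSQL VARIABLES TO RUNTIME.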
For example, perhaps we accidentally configured ProxySQL to run on port 3306 and restarted it, but there’s already a production MySQL instance running on this port. ProxySQL won’t start, so we can’t edit the value that way:
2018-10-02 09:18:33 network.cpp:53:listen_on_port(): [ERROR] bind(): Address already in use
We could delete proxysql.db and have it reload the configuration files, but that would mean any changes we didn’t mirror into the configuration files will be lost.
Another option is to edit ProxySQL’s database file using sqlite3:
[root@centos7-pxc57-4 ~]# cd /var/lib/proxysql/
[root@centos7-pxc57-4 proxysql]# sqlite3 proxysql.db
sqlite> SELECT * FROM global_variables WHERE variable_name='mysql-interfaces';
mysql-interfaces|127.0.0.1:3306
sqlite> UPDATE global_variables SET variable_value='127.0.0.1:6033' WHERE variable_name='mysql-interfaces';
sqlite> SELECT * FROM global_variables WHERE variable_name='mysql-interfaces';
mysql-interfaces|127.0.0.1:6033
Or if we have a few edits to make we may prefer to do so with a text editor:
[root@centos7-pxc57-4 ~]# cd /var/lib/proxysql/
[root@centos7-pxc57-4 proxysql]# sqlite3 proxysql.db
sqlite> .output /tmp/global_variables
sqlite> .dump global_variables
sqlite> .exit
The above commands will dump the global_variables table into a file in SQL format, which we can then edit:
[root@centos7-pxc57-4 proxysql]# grep mysql-interfaces /tmp/global_variables
INSERT INTO "global_variables" VALUES('mysql-interfaces','127.0.0.1:3306');
[root@centos7-pxc57-4 proxysql]# vim /tmp/global_variables
[root@centos7-pxc57-4 proxysql]# grep mysql-interfaces /tmp/global_variables
INSERT INTO "global_variables" VALUES('mysql-interfaces','127.0.0.1:6033');
Now we need to restore this data. We’ll use the .restore command to empty the table (as we’re restoring from a missing backup), and then read our edited dump back in:
[root@centos7-pxc57-4 proxysql]# sqlite3 proxysql.db
sqlite> .restore global_variables
sqlite> .read /tmp/global_variables
sqlite> .exit
Once we’ve made the change, we should be able to start ProxySQL again:
[root@centos7-pxc57-4 proxysql]# /etc/init.d/proxysql start
Starting ProxySQL: DONE!
[root@centos7-pxc57-4 proxysql]# lsof -i | grep proxysql
proxysql 15171 proxysql 19u IPv4 265881 0t0 TCP localhost:6033 (LISTEN)
proxysql 15171 proxysql 20u IPv4 265882 0t0 TCP localhost:6033 (LISTEN)
proxysql 15171 proxysql 21u IPv4 265883 0t0 TCP localhost:6033 (LISTEN)
proxysql 15171 proxysql 22u IPv4 265884 0t0 TCP localhost:6033 (LISTEN)
proxysql 15171 proxysql 23u IPv4 266635 0t0 TCP *:6032 (LISTEN)
While you are here
You might enjoy my recent post Using ProxySQL to connect to IPV6-only databases over IPV4
You can download ProxySQL from Percona repositories, and you might also want to check out our recorded webinars that feature ProxySQL too.
Scaling Percona Monitoring and Management (PMM)
Starting with PMM 1.13, PMM uses Prometheus 2 for metrics storage, which tends to be the heaviest consumer of CPU and RAM. With the Prometheus 2 performance improvements, PMM can scale to more than 1,000 monitored nodes per instance in the default configuration. In this blog post we will look into PMM scaling and capacity planning: how to estimate the resources required, and what drives resource consumption.
We have now tested PMM with up to 1000 nodes, using a virtualized system with 128GB of memory, 24 virtual cores, and SSD storage. We found PMM scales pretty linearly with the available memory and CPU cores, and we believe that a higher number of nodes could be supported with more powerful hardware.
What drives resource usage in PMM?
Depending on your system configuration and workload, a single node can generate very different loads on the PMM server. The main factors that impact the performance of PMM are:
- Number of samples (data points) injected into PMM per second
- Number of distinct time series they belong to (cardinality)
- Number of distinct query patterns your application uses
- Number of queries you run against PMM, through the user interface or the API, and their complexity
These specifically can be impacted by:
- Software version – modern database software versions expose more metrics
- Software configuration – some metrics are only exposed in certain configurations
- Workload – a large number of database objects and high concurrency will increase both the number of samples ingested and their cardinality.
- Exporter configuration – disabling collectors can reduce the amount of data collected
- Scrape frequency – controlled by METRICS_RESOLUTION
All these factors together may impact resource requirements by a factor of ten or more, so do your own testing to be sure. However, the numbers in this article should serve as good general guidance and a starting point for your research.
On the system supporting 1,000 instances, we observed more than 2,000 scrapes/sec, providing almost two million samples/sec ingested and more than eight million active time series. These are the main numbers that define the load placed on Prometheus.
Capacity planning to scale PMM
Both CPU and memory are very important resources for PMM capacity planning. Memory is the more important of the two, as Prometheus 2 does not have good options for limiting memory consumption. If you do not have enough memory to handle your workload, it will run out of memory and crash.
We recommend at least 2GB of memory for a production PMM Installation. A test installation with 1GB of memory is possible. However, it may not be able to monitor more than one or two nodes without running out of memory. With 2GB of memory you should be able to monitor at least five nodes without problem.
With powerful systems (8GB or more) you can have approximately eight systems per 1GB of memory, or about 15,000 samples ingested/sec per 1GB of memory.
To calculate the CPU usage resources required, allow for about 50 monitored systems per core (or 100K metrics/sec per CPU core).
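As a rough worked example using the rules of thumb above (an estimate, not a measurement), you can even let MySQL do the arithmetic for a hypothetical 400-node deployment:
SELECT
  CEIL(400 / 50) AS cpu_cores_estimate,  -- ~50 monitored systems per core
  CEIL(400 / 8)  AS memory_gb_estimate;  -- ~8 systems per 1GB of memory
That works out to roughly 8 cores and 50GB of memory as a starting point, to be validated against your own workload.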
One problem you’re likely to encounter if you’re running PMM with 100+ instances is the “Home Dashboard”. This becomes way too heavy with such a large number of servers. We plan to fix this issue in future releases of PMM, but for now you can work around it in two simple ways:
You can select a single host, for example "pmm-server", in your home dashboard and save it before adding a large number of hosts to the system.
Or you can make some other dashboard of your choice and set it as the home dashboard.
Summary
- More than 1,000 monitored systems is possible per single PMM server
- Your specific workload and configuration may significantly change the resources required
- If deploying with 8GB or more, plan 50 systems per core, and eight systems per 1GB of RAM
Webinar Tues, 5/29: MySQL, Percona XtraDB Cluster, ProxySQL, Kubernetes: How they work together
Please join Percona’s Principal Architect Alex Rubin as he presents MySQL, Percona XtraDB Cluster, ProxySQL, Kubernetes: How they work together to give you a highly available cluster database environment on Tuesday, May 29th at 7:00 AM PDT (UTC-7) / 10:00 AM EDT (UTC-4).
In this webinar, Alex will discuss how to deploy a highly available MySQL database environment on Kubernetes/Openshift using Percona XtraDB Cluster (PXC) together with MySQL Proxy to implement read/write splitting.
If you have never used Kubernetes and Openshift, or never used PXC / MySQL Proxy, Alex will do a quick introduction to these technologies. There will also be a demo where Alex sets up a PXC cluster with ProxySQL in Openshift Origin and tries to break it.
By the end of this webinar you will have a better understanding of:
- How to deploy Percona XtraDB Cluster with ProxySQL for HA solutions
- How to leverage Kubernetes/Openshift in your environments
- How to troubleshoot performance issues
Alexander Rubin, Principal Consultant
Alexander joined Percona in 2013. He has worked with MySQL since 2000 as a DBA and application developer. Before joining Percona he did MySQL consulting as a principal consultant for over 7 years (starting with MySQL AB in 2006, then Sun Microsystems, and then Oracle). He helped many customers design large, scalable, and highly available MySQL systems and optimize MySQL performance. Alexander has also helped customers design Big Data stores with Apache Hadoop and related technologies.
Binlog and Replication Improvements in Percona Server for MySQL
Thanks to continuous development and improvement, Percona Server for MySQL incorporates a number of improvements related to binary log handling and replication. As a result, its replication behavior differs from MySQL Server in several ways.
Temporary tables and mixed logging format
Summary of the fix:
As soon as a statement involving temporary tables was encountered when using the mixed binlog format, MySQL switched to row-based logging for all statements until the end of the session (or until all temporary tables used in the session were dropped). This is inconvenient when you have long-lasting connections, including replication-related ones. Percona Server for MySQL fixes the situation by switching between statement-based and row-based logging only when necessary.
Details:
The new mixed binary logging format, supported by Percona Server for MySQL, means that the server uses statement-based logging by default, but switches to row-based logging when replication would be unpredictable, for example in the case of a nondeterministic SQL statement that could cause data divergence if reproduced on a slave server. The switch is done when matching any condition from a long list, and one of these conditions is the use of temporary tables.
Temporary tables are never logged using row-based format, but any statement that touches a temporary table is logged in row mode. This way, we intercept all the side effects that temporary tables can produce on non-temporary ones.
There is no need to use the row logging format for any other statements solely because of the temp table presence. However, MySQL took this excessive precaution: once a statement with a temporary table had appeared and row-based logging was used, MySQL unconditionally logged all subsequent statements in row format.
Percona Server for MySQL has implemented more accurate behavior. Instead of switching to row-based logging until the last temporary table is closed, the usual rules of row vs. statement format apply, and we don’t consider the presence of currently opened temporary tables. This change was introduced with the fix of bug #151 (upstream #72475).
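To illustrate the difference, here is a hypothetical session running with the mixed format (the table names are made up for the example):
SET SESSION binlog_format = 'MIXED';
CREATE TEMPORARY TABLE tmp_recent_orders (id INT PRIMARY KEY, total DECIMAL(10,2));
INSERT INTO tmp_recent_orders
  SELECT id, total FROM orders WHERE created_at >= NOW() - INTERVAL 1 DAY;
-- A temporary table is involved, so this statement triggers row-based logging.
UPDATE customers SET status = 'active' WHERE id = 42;
-- Stock MySQL (before the upstream fix): still logged in row format, because a
-- temporary table appeared earlier in the session.
-- Percona Server for MySQL: logged in statement format again, since the statement
-- is deterministic and touches no temporary table.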
Temporary table drops and binlogging on a GTID-enabled server
Summary of the fix:
MySQL logs DROP statements for all temporary tables regardless of the logging mode under which these tables were created. This produces binlog writes and errant GTIDs on slaves with row and mixed logging. Percona Server for MySQL fixes this by tracking the binlog format at temporary table create time and using it to decide whether a DROP should be logged or not.
Details:
Even with read_only mode enabled, the server permits some operations, including ones with temporary tables. With the previous fix, temporary table operations are not binlogged in row or mixed mode. But MySQL server doesn’t track what the logging mode was when a temporary table was created, and therefore unconditionally logs DROP statements for all temporary tables. These DROP statements receive an IF EXISTS addition, which is intended to make them harmless.
Percona Server for MySQL has fixed this with the bug fixes #964, upstream #83003, and upstream #85258. Moreover, with all the binlogging fixes discussed so far, nothing involving temporary tables is logged to the binary log in row or mixed format. There is no need to consider CREATE/DROP TEMPORARY TABLE unsafe for use in stored functions, triggers, and multi-statement transactions in row/mixed format. Therefore, we introduced an additional fix to mark the creation and drop of temporary tables as unsafe inside transactions in statement-based replication only (the fixed bug is #1816, while the corresponding upstream one is #89467 and it is still open).
Safety of statements with a LIMIT clause
Summary of the fix:
MySQL Server considers all UPDATE/DELETE/INSERT ... SELECT statements with the LIMIT clause unsafe, no matter whether they really produce non-deterministic results or not. Percona Server for MySQL is more accurate: it acknowledges such statements as safe when they include an ORDER BY PK or WHERE condition.
Details:
MySQL Server treats UPDATE/DELETE/INSERT ... SELECT statements with the LIMIT clause as unsafe, considering that they produce an unpredictable number of rows. But some such statements can still produce an absolutely predictable result. One such deterministic case takes place when a statement with the LIMIT clause has an ORDER BY PK or WHERE condition.
The patch that treats updates and deletes with a LIMIT as safe when they have an ORDER BY pk_column clause was initially provided on the upstream bug report and later incorporated into Percona Server for MySQL with additional improvements. Bug fixed #44 (upstream #42415).
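As an illustration, with a hypothetical sessions table whose primary key is id, the first statement below is deterministic and treated as safe, while the second remains unsafe:
-- Safe: the affected rows are fully determined by the ORDER BY on the primary key.
DELETE FROM sessions
 WHERE expires_at < NOW()
 ORDER BY id
 LIMIT 1000;
-- Unsafe: without ORDER BY, which 1000 rows get deleted may differ between
-- master and slave under statement-based logging.
DELETE FROM sessions LIMIT 1000;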
Performance improvements
There are also two modifications in Percona Server related to multi-source replication that improve performance on slaves.
The first improvement is about relay log position, which was always updated in multi-source replication setups regardless of whether the committed transaction had already been executed or not. Percona Server omits relay log position updates for already logged GTIDs.
These unconditional relay log position updates caused additional fsync operations in the case of relay-log-info-repository=TABLE. With the higher number of channels transmitting such duplicate (already executed) transactions, the situation became proportionally worse. The problem was solved in Percona Server 5.7.18-14. Bug fixed #1786 (upstream #85141).
The second improvement decreases the load on slave nodes configured to update the master status and connection information only on log file rotation. MySQL additionally updated this information in the case of multi-source replication when a slave had to skip the already executed GTID event. This behavior was the cause of substantially higher write loads on slaves and lower replication throughput.
The configuration with master_info_repository=TABLE and sync_master_info=0 makes the slave update the master status and connection information in this table on log file rotation and not after each sync_master_info event, but it didn’t work on multi-source replication setups. Heartbeats sent to the slave to skip GTID events that it had already executed were evaluated as relay log rotation events and triggered a sync of the mysql.slave_master_info table. This inaccuracy could produce a huge (up to five times on some setups) increase in write load on the slave, before the problem was fixed in Percona Server for MySQL 5.7.20-19. Bug fixed #1812 (upstream #85158).
Current status of fixes
The three issues related to temporary tables were fixed in Percona Server 5.5 and contributed upstream, and the final fixes of bugs #72475, #83003, and #85258 have landed in MySQL Server 8.0.4.
MySQL and MariaDB Default Configuration Differences
In this blog post, I’ll discuss some of the MySQL and MariaDB default configuration differences, focusing on MySQL 5.7 and MariaDB 10.2.
MariaDB Server is a general purpose open source database, created by the founders of MySQL. MariaDB Server (referred to as MariaDB for brevity) has similar roots as Percona Server for MySQL, but is quickly diverging from MySQL compatibility and growing on its own. MariaDB has become the default installation for several operating systems (such as Red Hat Enterprise Linux/CentOS/Fedora). Changes in the default variables can make a large difference in the out-of-box performance of the database, so knowing what is different is important.
As MariaDB grows on its own and doesn’t remain 100% compatible with MySQL, the default configuration settings might not mean the same thing or behave the way they used to. MariaDB might use different variable names, or implement the same variables in new ways. You also need to take into account that MariaDB uses its own Aria storage engine, which has many configuration options that do not exist in MySQL.
Note: In this blog, I am looking at variables common to both MySQL and MariaDB that have different defaults, not variables that are specific to either MySQL or MariaDB (except for the different switches inside the optimizer_switch).
Binary Logs
Variable | MariaDB Default | MySQL Default |
sync_binlog | 0 | 1 |
binlog_format | Mixed | Row |
MySQL has taken a more conservative stance when it comes to the binary log. In the newest versions of MySQL 5.7, they have updated two variables to help ensure all committed data remains intact and identical. Binlog_format was updated to row in MySQL in order to prevent non-deterministic statements from having different results on the slave. Row-based replication also helps when performing a lot of smaller updates. MariaDB defaults to the Mixed format. Mixed uses statement-based format unless certain criteria are met. In that case, it uses the row format. You can see the detailed criteria for when the row format is used here: https://mariadb.com/kb/en/the-mariadb-library/binary-log-formats/.
The other difference that can cause a significant impact on performance is related to sync_binlog. Sync_binlog controls the number of commit groups to collect before synchronizing the binary log to disk. MySQL has changed this to 1, which means that every transaction is flushed to disk before it is committed. This guarantees that there can never be a committed transaction that is not recorded (even during a system failure). This can create a big impact to performance, as shown by Roel Van de Paar in his blog: https://www.percona.com/blog/2016/06/03/binary-logs-make-mysql-5-7-slower-than-5-6/
MariaDB utilizes a value of 0 for sync_binlog, which allows the operating system to determine when the binlog needs to be flushed. This provides better performance, but adds the risk that some data may be lost if MariaDB crashes (or power is lost).
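Both variables are dynamic, so you can check and change them at runtime, for example to move a MariaDB instance to the more conservative MySQL 5.7 defaults (remember to also persist the change in the configuration file, since SET GLOBAL does not survive a restart):
SHOW GLOBAL VARIABLES WHERE Variable_name IN ('sync_binlog', 'binlog_format');
SET GLOBAL sync_binlog   = 1;      -- sync the binlog to disk on every commit group
SET GLOBAL binlog_format = 'ROW';  -- only affects new sessions; existing connections keep their format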
MyISAM
Variable | MariaDB Default | MySQL Default |
myisam_recover_options | BACKUP,QUICK | OFF |
key_buffer_size | 134217728 | 8388608 |
InnoDB replaced MyISAM as the default storage engine some time ago, but MyISAM is still used for many system tables. MySQL has tuned down the MyISAM settings, since it is not heavily used.
When mysqld opens a table, it checks whether the table is marked as crashed, or was not closed properly, and runs a check on it based on the myisam_recover_options settings. MySQL disables this by default, preventing recovery. MariaDB has enabled the BACKUP and QUICK recovery options. BACKUP causes a table_name-datetime.bak file to be created whenever a data file is changed during recovery. QUICK causes mysqld to not check the rows in a table if there are no delete blocks, ensuring recovery can occur faster.
MariaDB 10.2 increased the key_buffer_size. This allows for more index blocks to be stored in memory. All threads use this buffer, so a small buffer can cause information to get moved in and out of it more quickly. MariaDB 10.2 uses a buffer 16 times the size of MySQL 5.7: 134217728 in MariaDB 10.2 vs. 8388608 in MySQL 5.7.
InnoDB
Variable | MariaDB Default | MySQL Default |
innodb_max_undo_log_size | 10485760(10 MiB) | 1073741824(1024 MiB) |
InnoDB variables have remained primarily unchanged between MariaDB 10.2 and MySQL 5.7. MariaDB has reduced the innodb_max_undo_log_size starting in 10.2.6. This was reduced from MySQL’s default of 1073741824(1024 MiB) to 10485760(10 MiB). These sizes reflect the maximum size an undo tablespace can become before it is marked for truncation. The tablespace doesn’t get truncated unless innodb_undo_log_truncate is enabled, and it is disabled in MySQL 5.7 and MariaDB 10.2 by default.
Logging
Variable | MariaDB Default | MySQL Default |
log_error | (empty) | /var/log/mysqld.log |
log_slow_admin_statements | ON | OFF |
log_slow_slave_statements | ON | OFF |
lc_messages_dir | (empty) | /usr/share/mysql |
Logs are extremely important for troubleshooting any issues so the different choices in logging for MySQL 5.7 and MariaDB 10.2 are very interesting.
The log_error variable allows you to control where errors get logged. MariaDB 10.2 leaves this variable blank, writing all errors to stderr. MySQL 5.7 uses an explicitly created file at: /var/log/mysqld.log.
MariaDB 10.2 has also enabled additional slow statement logging. Log_slow_admin_statements creates a record for any administrative statements that are not typically written to the binlog. Log_slow_slave_statements logs the replicated statements sent from the master, if they are slow to complete. MySQL 5.7 does not enable logging of these statements by default.
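If you want the MariaDB behavior on MySQL 5.7, both variables are dynamic and can be enabled at runtime (note that the slave statement setting may not take effect until the replication SQL thread is restarted):
SET GLOBAL log_slow_admin_statements = ON;
SET GLOBAL log_slow_slave_statements = ON;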
Lc_messages_dir is the directory that contains the error message files for various languages. The variable defaults might be a little misleading in MariaDB 10.2. Lc_messages_dir is left empty by default, although it still uses the same path as MySQL 5.7. The files are located in /usr/share/mysql by default for both databases.
Performance Schema
Variable | MariaDB Default | MySQL Default |
performance_schema | OFF | ON |
performance_schema_setup_actors_size | 100 | -1 (auto adjusted) |
performance_schema_setup_objects_size | 100 | -1 (auto adjusted) |
The performance schema is an instrumentation tool that is designed to help troubleshoot various performance concerns. MySQL 5.7 enables the performance schema, and many of its instruments, by default. MySQL even goes so far as to detect the appropriate value for many Performance Schema variables instead of setting a static default. The Performance Schema does come with some overhead, and there are many blogs regarding how much this can impact performance. I think Sveta Smirnova said it best in her blog Performance Schema Benchmarks OLTP RW: “…test on your system! No generic benchmark can exactly repeat a workload on your site.”
MariaDB has disabled the Performance Schema by default, as well as adjusted a couple of the dynamic variables. Note that if you wish to disable or enable the Performance Schema, it requires a restart of the server since these variables are not dynamic. Performance_schema_setup_actors_size and performance_schema_setup_objects_size have both been set to a static 100, instead of the dynamic -1 used in MySQL 5.7. These both limit the number of rows that can be stored in relative tables. This creates a hard limit to the size these tables can grow to, helping to reduce their data footprint.
SSL/TLS
Variable | MariaDB Default | MySQL Default |
ssl_ca | (empty) | ca.pem |
ssl_cert | (empty) | server-cert.pem |
ssl_key | (empty) | server-key.pem |
Secure Sockets Layer (SSL) and Transport Layer Security (TLS) are cryptographic protocols that allow for secure communication. SSL is actually the predecessor of TLS, although both are often referred to as SSL. MySQL 5.7 and MariaDB 10.2 support both yaSSL and OpenSSL. The default configurations for SSL/TLS differ only slightly between MySQL 5.7 and MariaDB 10.2. MySQL 5.7 sets a specific file name for ssl_ca, ssl_cert, and ssl_key. These files are created in the base directory, identified by the variable basedir. Each of these variables is left blank in MariaDB 10.2, so you need to set them before using secure connections. These variables are not dynamic, so be sure to set the values before starting your database.
Query Optimizer
MariaDB 10.2 | MySQL 5.7 | Optimization | Meaning | Switch |
N/A | OFF | Batched Key Access | Controls use of BKA join algorithm | batched_key_access |
N/A | ON | Block Nested-Loop | Controls use of BNL join algorithm | block_nested_loop |
N/A | ON | Condition Filtering | Controls use of condition filtering | condition_fanout_filter |
Deprecated | ON | Engine Condition Pushdown | Controls engine condition pushdown | engine_condition_pushdown |
ON | N/A | Engine Condition Pushdown | Controls ability to push conditions down into non-mergeable views and derived tables | condition_pushdown_for_derived |
ON | N/A | Exists Subquery | Allows conversion of exists statements to in statements | exists_to_in |
ON | N/A | Exists Subquery | Allows conversion of in statements to exists statements | in_to_exists |
N/A | ON | Index Extensions | Controls use of index extensions | use_index_extensions |
OFF | N/A | Index Merge | Allows index_merge for non-equality conditions | index_merge_sort_intersection |
ON | N/A | Join Algorithms | Perform index lookups for a batch of records from the join buffer | join_cache_bka |
ON | N/A | Join Algorithms | Controls use of BNLH and BKAH algorithms | join_cache_hashed |
ON | N/A | Join Algorithms | Controls use of incremental algorithms | join_cache_incremental |
ON | N/A | Join Algorithms | Controls use of block-based algorithms for outer joins | outer_join_with_cache |
ON | N/A | Join Algorithms | Controls block-based algorithms for use with semi-join operations | semijoin_with_cache |
OFF | N/A | Join Buffer | Creates the join buffer with an estimated size based on the estimated number of rows in the result | optimize_join_buffer_size |
ON | N/A | Materialized Temporary Tables | Allows index creation on derived temporary tables | derived_keys |
ON | N/A | Materialized Temporary Tables | Controls use of the rowid-merge strategy | partial_match_rowid_merge |
ON | N/A | Materialized Temporary Tables | Controls use of the partial_match_table-scan strategy | partial_match_table_scan |
OFF | ON | Multi-Range Read | Controls use of the multi-range read strategy | mrr |
OFF | ON | Multi-Range Read | Controls use of cost-based MRR, if mrr=on | mrr_cost_based |
OFF | N/A | Multi-Range Read | Enables key ordered scans if mrr=on | mrr_sort_keys |
ON | N/A | Order By | Considers multiple equalities when ordering results | orderby_uses_equalities |
ON | N/A | Query Plan | Allows the optimizer to use hidden components of InnoDB keys | extended_keys |
ON | N/A | Query Plan | Controls the removal of irrelevant tables from the execution plan | table_elimination |
ON | N/A | Subquery | Stores subquery results and correlation parameters for reuse | subquery_cache |
N/A | ON | Subquery Materialization | Controls use of cost-based materialization | subquery_materialization_cost_based |
N/A | ON | Subquery Materialization & Semi-join | Controls the semi-join duplicate weedout strategy | duplicateweedout |
The query optimizer has several variances that not only affect query performance but also how you write SQL statements. The query optimizer is substantially different between MariaDB and MySQL, so even with identical configurations you are likely to see varying performance.
The sql_mode puts restrictions on how you can write queries. MySQL 5.7 has several additional restrictions compared to MariaDB 10.2. Only_full_group_by requires that all fields in any select…group by statement are either aggregated or inside the group by clause. The optimizer doesn’t assume anything regarding the grouping, so you must specify it explicitly.
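For example, with a hypothetical orders table, the first query below is rejected under MySQL 5.7’s default sql_mode because of only_full_group_by, while the second is accepted:
-- Rejected: customer_name is neither aggregated nor listed in the GROUP BY clause.
SELECT customer_id, customer_name, COUNT(*) AS order_count
  FROM orders
 GROUP BY customer_id;
-- Accepted: every selected column is either grouped or aggregated.
SELECT customer_id, MAX(customer_name) AS customer_name, COUNT(*) AS order_count
  FROM orders
 GROUP BY customer_id;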
No_zero_date and no_zero_in_date both affect how the server interprets zeros in dates. When no_zero_date is enabled, values of '0000-00-00' are permitted but produce a warning. With strict mode enabled, the value is not permitted and produces an error. No_zero_in_date is similar, except it applies to any section of the date (month, day, or year). With this disabled, dates with zero parts, such as '2017-00-16', are allowed as is. When enabled, the date is changed to '0000-00-00' without warning. Strict mode prevents the date from being inserted, unless IGNORE is provided as well. INSERT IGNORE and UPDATE IGNORE insert the dates as '0000-00-00'. MySQL 5.7.4 changed this: no_zero_in_date was consolidated with strict mode, and the explicit option is deprecated.
The query_prealloc_size determines the size of the persistent buffer used for statement parsing and execution. If you regularly use complex queries, it can be useful to increase the size of this buffer, as it does not need to allocate additional memory during the query parsing. MySQL 5.7 has set this buffer to 8192, with a block size of 1024. MariaDB increased this value in 10.1.2 up to 24576.
Query_alloc_block_size dictates the size in bytes of any extra blocks allocated during query parsing. If memory fragmentation is a common problem, you might want to look at increasing this value. MySQL 5.7 uses 8192, while MariaDB 10.2 uses 16384 (twice that). Be careful when adjusting the block sizes: going too high consumes more than the needed amount of memory, and too low causes significant fragmentation.
The optimizer_switch variable contains many different switches that impact how the query optimizer plans and performs different queries. MariaDB 10.2 and MySQL 5.7 have many differences in their enabled options, and even in the available options. You can see a brief breakdown of each of the options in the table above. Any option marked N/A is not supported in that server.
Miscellaneous
Variable | MariaDB Default | MySQL Default |
default_tmp_storage_engine | NULL | InnoDB |
group_concat_max_len | 1048576(1M) | 1024(1K) |
Lock_wait_timeout | 86400 (1 DAY) | 31536000 (1 YEAR) |
Max_allowed_packet | 16777216 (16MB) | 4194304 (4MB) |
Max_write_lock_count | 4294967295 | 18446744073709551615 |
Old_passwords | OFF | 0 |
Open_files_limit | 0 | dependent on OS |
pid_file | /var/lib/mysql/ | /var/run/mysqld/ |
secure_file_priv | (empty) | Varies by installation |
sort_buffer_size | 2097152 | 262144 |
table_definition_cache | 400 | autosized |
table_open_cache_instances | 8 | 16 |
thread_cache_size | autosized | autosized |
thread_stack | 292KB | 192KB/256KB |
There are many variables that do not fit well into a group. I will go over those here.
When creating temporary tables, if you do not specify a storage engine then a default is used. In MySQL 5.7 this is set to InnoDB, the same as the default_storage_engine. MariaDB 10.2 also uses InnoDB, but it is not explicitly set. MariaDB sets the default_tmp_storage_engine to NULL, which causes it to use the default_storage_engine. This is important to remember if you change your default storage engine, as it would also change the default for temporary tables. An important note: in MariaDB this is only relevant to tables created with CREATE TEMPORARY TABLE. Internal in-memory temporary tables use the MEMORY storage engine, and internal on-disk temporary tables use the Aria engine by default.
The Group_concat function can cause some very large results if left unchecked. You can restrict the maximum size of results from this function with group_concat_max_len. MySQL 5.7 limits this to 1024(1K). MariaDB increased the value in 10.2.4 up to 1048576(1M).
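Because group_concat_max_len is dynamic, you can raise it for a single session when you expect a large result instead of changing the global default. A quick sketch with a hypothetical orders table:
SET SESSION group_concat_max_len = 1048576;  -- the MariaDB 10.2 default (1M)
SELECT customer_id,
       GROUP_CONCAT(order_id ORDER BY order_id) AS order_list
  FROM orders
 GROUP BY customer_id;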
Lock_wait_timeout controls how long a thread waits as it attempts to acquire a metadata lock. Several statements require a metadata lock, including DDL and DML operations, Lock Tables, Flush Tables with Read Lock and Handler statements. MySQL 5.7 defaults to the maximum possible value (one year), while MariaDB 10.2 has toned this down to one day.
Max_allowed_packet sets a limit to the maximum size of a packet, or a generated/intermediate string. This value is intentionally kept small (4MB) on MySQL 5.7 in order to detect the larger, intentionally incorrect packets. MariaDB has increased this value to 16MB. If using any large BLOB fields, you need to adjust this value to the size of the largest BLOB, in multiples of 1024, or you risk running into errors transferring the results.
Max_write_lock_count controls the number of write locks that can be given before some read lock requests are processed. In extremely heavy write loads your reads can pile up while waiting for the writes to complete. Modifying the max_write_lock_count allows you to tune how many writes can occur before some reads are allowed against the table. MySQL 5.7 keeps this value at the maximum (18446744073709551615), while MariaDB 10.2 lowered this to 4294967295. One thing to note is that 4294967295 is still the maximum allowed value on MariaDB 10.2.
Old_passwords controls the hashing method used by the password function, create user and grant statements. This variable has undergone several changes in MySQL 5.7. As of 5.7.4 the valid options were MySQL 4.1 native hashing, Pre-4.1 (“old”) hashing, and SHA-256 hashing. Version 5.7.5 removed the “old” Pre-4.1 method, and in 5.7.6 the variable has been deprecated with the intent of removing it entirely. MariaDB 10.2 uses a simple boolean value for this variable instead of the enumerated one in MySQL 5.7, though the intent is the same. Both default the old_passwords to OFF, or 0, and allow you to enable the older method if necessary.
Open_files_limit restricts the number of file descriptors mysqld can reserve. If set to 0 (the default in MariaDB 10.2), then mysqld reserves max_connections * 5 or max_connections + table_open_cache * 2, whichever is larger. It should be noted that mysqld cannot use an amount larger than the hard limit imposed by the operating system. MySQL 5.7 is also restricted by the operating system’s hard limit, but is set at runtime to the real value permitted by the system (not a calculated value).
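The MariaDB calculation can be checked against your own settings; the query below is just a sketch of the formula described above:
SELECT GREATEST(@@max_connections * 5,
                @@max_connections + @@table_open_cache * 2) AS open_files_reserved;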
The pid_file allows you to control where you store the process id file. This isn’t a file you typically need, but it is good to know where it is located in case some unusual errors occur. On MariaDB you can find this inside /var/lib/mysql/, while on MySQL 5.7 you will find it inside /var/run/mysqld/. You will also notice a difference in the actual name of the file. MariaDB 10.2 uses the hostname as the name of the pid, while MySQL 5.7 simply uses the process name (mysqld.pid).
Secure_file_priv is a security feature that allows you to restrict the location of files used in data import and export operations. When this variable is empty, which was the default in MySQL before 5.7.6, there is no restriction. If the value is set to NULL, import and export operations are not permitted. The only other valid value is the directory path where files can be imported from or exported to. MariaDB 10.2 defaults to empty. As of MySQL 5.7.6, the default will depend on the install_layout CMAKE option.
INSTALL_LAYOUT | DEFAULT VALUE |
STANDALONE,WIN | NULL (>= MySQL 5.7.16), empty (< MySQL 5.7.16) |
DEB,RPM,SLES,SVR4 | /var/lib/mysql-files |
Other | mysql-files under the CMAKE_INSTALL_PREFIX value |
Mysqld uses a sort buffer regardless of storage engine. Every session that must perform a sort allocates a buffer equal to the value of sort_buffer_size. This buffer should at minimum be large enough to contain 15 tuples. In MySQL 5.7, this defaults to 262144, while MariaDB 10.2 uses the larger value 2097152.
The table_definition_cache restricts the number of table definitions that can be cached. If you have a large number of tables, mysqld may have to read the .frm file to get this information. MySQL 5.7 auto detects the appropriate size to use, while MariaDB 10.2 defaults this value to 400. On my small test VM, MySQL 5.7 chose a value of 1400.
The table_open_cache_instances variable varies in implementation between MySQL and MariaDB. MySQL 5.7 creates multiple instances of the table_open_cache, each holding a portion of the tables. This helps reduce contention, as a session needs to lock only one instance of the cache for DML statements. In MySQL 5.7.7 the default was a single instance, but this was changed in MySQL 5.7.8 (increased to 16). MariaDB has a more dynamic approach to the table_open_cache. Initially there is only a single instance of the cache, and the table_open_cache_instances variable is the maximum number of instances that can be created. If contention is detected on the single cache, another instance is created and an error logged. MariaDB 10.2 assumes that the maximum of eight instances it sets by default should support up to 100 CPU cores.
The thread_cache_size controls when a new thread is created. When a client disconnects, the thread is stored in the cache, as long as the maximum number of cached threads has not been reached. Although this is not typically noticeable, if your server sees hundreds of connections per second you should increase this value so that new connections can use the cache. Thread_cache_size is an automatically detected variable in both MySQL 5.7 and MariaDB 10.2, but their methods to calculate the default vary significantly. MySQL uses a formula, with a maximum of 100: 8 + (max_connections / 100). MariaDB 10.2 uses the smaller value out of 256 or the max_connections size.
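For comparison, the two defaults can be expressed against your own max_connections; this is only a sketch of the formulas described above, not official documentation:
SELECT LEAST(100, 8 + @@max_connections DIV 100) AS mysql_57_default,
       LEAST(256, @@max_connections)             AS mariadb_102_default;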
The thread_stack is the stack size for each thread. If the stack size is too small, it limits the complexity of SQL statements, the recursion depth of stored procedures and other memory-consuming actions. MySQL 5.7 defaults the stack size to 192KB on 32-bit platforms and 256KB on 64-bit systems. MariaDB 10.2 adjusted this value several times. MariaDB 10.2.0 used 290KB, 10.2.1 used 291KB and 10.2.5 used 292KB.
Conclusion
Hopefully, this helps you with the configuration options between MySQL and MariaDB. Use the comments for any questions.
Webinar Thursday June 22, 2017: Deploying MySQL in Production
Join Percona’s Senior Operations Engineer, Daniel Kowalewski as he presents Deploying MySQL in Production on Thursday, June 22, 2017 at 11:00 am PDT / 2:00 pm EDT (UTC-7).
This webinar is a soup-to-nuts talk that will have you going from zero to hero in no time. It includes discussion of the best practices for installation, configuration, taking backups, monitoring, etc.
Register for the webinar here.
Daniel Kowalewski, Senior Technical Operations Engineer
Daniel has been designing and deploying solutions around MySQL for over ten years. He lives for those magic moments where response time drops by 90%, and loves adding more “nines” to everything.
Webinar Wednesday 12/14: MongoDB System Tuning Best Practices
Please join Percona Senior Technical Operations Architect Tim Vaillancourt on Wednesday December 14, at 10:00 am PST/ 1:00pm EST (UTC-8) as he presents MongoDB System Tuning Best Practices.
People give much love to optimizing document design, provisioning, and even selecting an engine in MongoDB, but give little attention to tuning Linux to handle databases efficiently. In this session we will talk about which schedulers you should use, which network, memory, cache, and filesystem settings to choose, whether you should use NUMA and Huge Pages, and more.
This will be a data-packed webinar for the advanced user, but still accessible by the budding systems admin type that wants to learn more about system internals.
Register for this webinar here.
Tim joined Percona in 2016 as Sr. Technical Operations Architect for MongoDB, with a goal to make the operations of MongoDB as smooth as possible. With experience operating infrastructures in industries such as government, online marketing/publishing, SaaS and gaming – combined with experience tuning systems from the hard disk all the way up to the end-user – Tim has spent time in nearly every area of the modern IT stack with many lessons learned.
Tim lives in Amsterdam, NL and enjoys traveling, coding and music. Prior to Percona Tim was the Lead MySQL DBA of Electronic Arts’ DICE studios, helping some of the largest games in the world (“Battlefield” series, “Mirrors Edge” series, “Star Wars: Battlefront”) launch and operate smoothly while also leading the automation of MongoDB deployments for EA systems. Before the role of DBA at EA’s DICE studio, Tim served as a subject matter expert in NoSQL databases, queues and search on the Online Operations team at EA SPORTS. Prior to moving to the gaming industry, Tim served as a Database/Systems Admin operating a large MySQL-based SaaS infrastructure at AbeBooks/Amazon Inc.
Blog Series: MySQL Configuration Management
MySQL configuration management remains a hot topic, as I’ve noticed on numerous occasions during my conversations with customers.
I thought it might be a good idea to start a blog series that goes deeper in detail into some of the different options, and what modules potentially might be used for managing your MySQL database infrastructure.
Configuration management has been around since well before the beginning of my professional career. I originally began working on integrating an infrastructure with my colleagues using Puppet.
Why is configuration management important?
- Reproducibility. It gives us the ability to provision any environment in an automated way, and to be sure that the new environment will contain the same configuration.
- Fast restoration. Thanks to reproducibility, you can quickly provision machines in case of disasters. This makes sure you can focus on restoring your actual data instead of worrying about the deployment and configuration of your machines.
- Integral part of continuous deployment. Continuous deployment is a terminology everyone loves: being able to deploy changes rapidly and automatically after automated regression testing requires a configuration management solution.
- Compliance and security. Solutions like Puppet and Chef maintain and enforce configuration parameters on your infrastructure. This can sound bothersome at first, but it’s essential for maintaining a well-configured environment.
- Documented environment. Although reading someone’s puppet code can potentially harm you beyond insanity, it provides you with the real truth about your infrastructure.
- Efficiency and manageability. Configuration management can automate repetitive tasks (for example, user grants, database creation, configuration variables), as well as security updates, service restarts, etc. These can potentially bring you less work and faster rollouts.
Which players are active in this field?
The most popular open source solutions are Puppet, Chef, Ansible, and CFEngine (among others). In this series, we will go deeper into the first three of them.
Let’s first start by giving you a quick, high-level introduction.
Puppet
Puppet is a language used to describe the desired state of an environment. The Puppet client reads the catalog of the expected state from the server and enforces these changes on the client. The system works based on a client/server principle.
Puppet has four essential components by default:
- Puppet Server: A Java virtual machine offering Puppet’s core services.
- Puppet Agent: A client library that requests configuration catalog info from the puppet-server.
- Hiera: A key-value lookup database, which can store and modify values for specific hosts.
- Facter: An application that keeps an inventory of the local node variables.
How can you integrate Puppet into your MySQL infrastructure?
- A ready-made puppet module: https://forge.puppet.com/puppetlabs/mysql
This will allow you and your team to create users, databases, install and configure MySQL
- Use the community Luke: https://forge.puppet.com/modules?utf-8=?&sort=rank&q=mysql
Probably my old “code from hell” module is still somewhere out there.
Chef
Chef also uses a declarative language based on Ruby (like Puppet), which allows you to write cookbooks for the technologies you want to integrate. Chef is also based on a server/client model: the clients are the Chef nodes, and the server manages the cookbooks, catalogs, and recipes.
In short, Chef consists of:
- Chef server: Manages the multiple cookbooks and the catalog
- Chef clients (nodes): The actual system requesting the catalog information from the chef server.
- Workstations: This is a system that is configured to run Chef command-line tools that synchronize with a Chef-repository or the Chef server. You could also describe this as a Chef development and tooling environment.
How can you integrate Chef in your MySQL infrastructure:
Ansible
Ansible originated with something different in mind. System engineers typically chose to use their own management scripts. This can be troublesome and hard to maintain. Why wouldn’t you use something easy and automated and standardized? Ansible fills in these gaps, and simplifies management of Ansible targets.
Ansible works by connecting to your nodes (by default over SSH) and pushing out Ansible modules to them. These modules represent the desired state of the node, and are used to execute commands to attain that desired state.
This approach differs from Puppet and Chef, which are primarily client/server solutions.
Some pre-made modules for MySQL are:
- http://docs.ansible.com/ansible/mysql_db_module.html
- http://docs.ansible.com/ansible/mysql_user_module.html
- http://docs.ansible.com/ansible/mysql_variables_module.html
Conclusion and Next Steps
Choose your poison (or magical medicine, you pick the wording); every solution has its perks.
Keep in mind that in some situations running a complicated Puppet or Chef infrastructure could be overkill. At this moment, a solution like Ansible might be a quick and easily integrable answer for you.
The next blog post will go over the Puppet Forge MySQL module, so stay tuned!