Webinar Wed 8/29: Databases in the Hosted Cloud


databases-in-the-cloudPlease join Percona’s Chief Evangelist, Colin Charles on Wednesday, August 29th, 2018, as he presents Databases in the Hosted Cloud at 7:00 AM PDT (UTC-7) / 10:00 AM EDT (UTC-4).


Nearly everyone today uses some form of database in the hosted cloud. You can use hosted MySQL, MariaDB, Percona Server, and PostgreSQL in several cloud providers as a database as a service (DBaaS).

In this webinar, Colin Charles explores how to efficiently deploy a cloud database configured for optimal performance, with a particular focus on MySQL.

You’ll learn the differences between the various public cloud offerings for Amazon RDS including Aurora, Google Cloud SQL, Rackspace OpenStack DBaaS, Microsoft Azure, and Alibaba Cloud, as well as the access methods and the level of control you have. Hosting in the cloud can be a challenge but after today’s webinar, we’ll make sure you walk away with a better understanding of how you can leverage the cloud for your business needs.

Topics include:

  • Backup strategies
  • Planning multiple data centers for availability
  • Where to host your application
  • How to get the most performance out of the solution
  • Cost
  • Monitoring
  • Moving from one DBaaS to another
  • Moving from a DBaaS to your own hosted platform

Register Now.

The post Webinar Wed 8/29: Databases in the Hosted Cloud appeared first on Percona Database Performance Blog.


Webinar Wed 7/18: MariaDB 10.3 vs. MySQL 8.0

MariaDB 10.3 vs MySQL 8.0

MariaDB 10.3 vs MySQL 8.0Please join Percona’s Chief Evangelist, Colin Charles as he presents as he presents MariaDB 10.3 vs. MySQL 8.0 on Wednesday, July 18th, 2018, at 9:00 AM PDT (UTC-7) / 12:00 PM EDT (UTC-4).


Technical considerations

Are they syntactically similar? Where do these two databases differ? Why would I use one over the other?

MariaDB 10.3 is on the path of gradually diverging from MySQL 8.0. One obvious example is the internal data dictionary currently under development for MySQL 8.0. This is a major change to the way metadata is stored and used within the server, and MariaDB doesn’t have an equivalent feature. Implementing this feature could mark the end of datafile-level compatibility between MySQL and MariaDB.

Non-technical considerations

There are also non-technical differences between MySQL 8.0 and MariaDB 10.3, including:

Licensing: MySQL offers their code as open-source under the GPL, and provides the option of non-GPL commercial distribution in the form of MySQL Enterprise. MariaDB can only use the GPL, because their work is derived from the MySQL source code under the terms of that license.

Support services: Oracle provides technical support, training, certification and consulting for MySQL, while MariaDB has their own support services. Some people will prefer working with smaller companies, as traditionally it affords them more leverage as a customer.

Community contributions: MariaDB touts the fact that they accept more community contributions than Oracle. Part of the reason for this disparity is that developers like to contribute features, bug fixes and other code without a lot of paperwork overhead (and they complain about the Oracle Contributor Agreement). However, MariaDB has its own MariaDB Contributor Agreement — which more or less serves the same purpose.

Colin will take a look at some of the differences between MariaDB 10.3 and MySQL 8.0 and help answer some of the common questions our Database Performance Experts get about the two databases.

Register Now

The post Webinar Wed 7/18: MariaDB 10.3 vs. MySQL 8.0 appeared first on Percona Database Performance Blog.


MySQL 8.0 General Tablespaces: File per Database (and no FRM files)

MySQL 8.0 General Tablespaces

MySQL 8.0 General TablespacesIn this blog post, we’ll look at MySQL 8.0 general tablespaces.


MySQL 8.0 (the DMR version is available now) has two great features (among others):

  1. The new data dictionary completely removed *.frm files, which is great
  2. The ability to create a tablespace and assign a group of tables to it (originally introduced in 5.7).

With those two options, we can use MySQL for creating multi-tenant environments with a “schema per customer” approach.

Schema per Customer with MySQL 8.0

Using schema per customer with older MySQL versions presents issues  … namely the number of files. (I’ve described schema per customer approach in MySQL in an older blog post.) Let’s say you are hosting a Drupal-based site for your customers, and you create a new database (AKA “schema”) per each customer. You do not want to create one schema for all because each customer wants to extend Drupal and use plugins that will create their own unique set of tables. With tablespace per table and an FRM file, 10K customers will end up with:

  • 65 tables per schema,
  • Two files per table, and
  • 10K schemas

. . . or a grand total of 1.3 million files!

With MySQL 8.0, we can create a tablespace file per each schema and place those tablespace files in a specific set of directories. For example, if we have demo, test and production accounts, we can create a set of directories (outside of the MySQL datadir) and place tablespaces inside them. With no FRM files, we will only have 10 thousands of files, evenly split across multiple locations.


mysql>  create database if not exists drupal_customer_name;
Query OK, 1 row affected (0.00 sec)
mysql> CREATE TABLESPACE drupal_customer_name
ADD DATAFILE '/var/lib/mysql_datafiles/drupal_demo/drupal_customer_name.ibd'
Query OK, 0 rows affected (0.16 sec)
mysql> use drupal_customer_name;
Database changed
mysql> create table t(i int) ENGINE=InnoDB
TABLESPACE drupal_customer_name;
Query OK, 0 rows affected (0.13 sec)
mysql> create table t1(i int) ENGINE=InnoDB
TABLESPACE drupal_customer_name;
Query OK, 0 rows affected (0.00 sec)

Now let’s look at the directory:

ls -lah /var/lib/mysql_datafiles/drupal_demo/
-rw-r----- 1 mysql mysql 144K Sep 26 00:58 drupal_customer_name.ibd

The downside of this approach is that the “create tables” command should have the tablespace name in it. I’ve created a sample “deploy” script to create a new schema for a customer:

mysql -f -vvv -e "create database if not exists $customer_name;
CREATE TABLESPACE $customer_name ADD DATAFILE '/var/lib/mysql_datafiles/drupal_demo/${customer_name}.ibd'
cat drupal.sql | sed -e "s/ENGINE=InnoDB/ENGINE=InnoDB TABLESPACE $customer_name/g"|mysql $customer_name

Size and Timing

In the next post, I plan to benchmark the performance of millions of tables with MySQL 8.0 and tablespace file per database. Here, I’ve compared the create table performance between MySQL 5.7 (with FRM and file per table), MySQL 8.0 with a file per table (FRMs are gone) and MySQL 8.0 with a file per database. Time to create 1000 databases for Drupal (no data), 65 tables in each database:

  • MySQL 5.7, file per table: 3m21.819s
  • MySQL 8.0,  file per table: 2m54.358s
  • MySQL 8.0,  file per database: 1m55.133s

What about the size on disk? It did not change much. Actually, the size on disk for the blank tables (no data) is more in MySQL 8.0:

  • 8.0: 10M (10485760 bytes) per 65 blank tables (Drupal)
  • 5.7: 9.2M (9280821 bytes) per 65 blank tables, including FRM files

With 10K schemas, it is 100G just to store tablespaces (schema overhead). At the same time, it is not 100% overhead: InnoDB creates 112K+ the tablespace file right away (it depends upon the table structure). When the data is loaded it will use this reserved space.


New Data Dictionary

MySQL 8.0 uses a new transactional data dictionary.

MySQL Server 8.0 now incorporates a global data dictionary containing information about database objects in transactional tables. In previous MySQL releases, dictionary data was stored in metadata files and nontransactional system tables.

That also means that all old metadata files are gone: no .frm, .par, .trn, .trg files. In addition tables inside a MySQL database are not using MyISAM tables anymore. The new installation has no single MyISAM table, although the MyISAM engine is supported:

Welcome to the MySQL monitor.  Commands end with ; or g.
Your MySQL connection id is 2299
Server version: 8.0.0-dmr MySQL Community Server (GPL)
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
Type 'help;' or 'h' for help. Type 'c' to clear the current input statement.
mysql> select count(*) from information_schema.tables where engine = 'MyISAM';
| count(*) |
|        0 |
1 row in set (0.03 sec)
mysql> select count(*), engine from information_schema.tables where table_schema = 'mysql' group by engine;
| count(*) | ENGINE |
|        2 | CSV    |
|       30 | InnoDB |
2 rows in set (0.00 sec)


FRM free installation looks great (performance testing of it is my next step). I would also love to see some additional features in MySQL 8.0:

  • Easier tablespace level manipulations, i.e. “optimize tablespace” to re-claim space in the general tablespace file; add “if exists / if not exists” to create/drop tablespace
  • Much smaller “reserved” space: if the data dictionary is stored elsewhere, we can create a one-page file (16K) + table structure.

Please note: MySQL 8.0 is not production ready and only available for preview.


InnoDB scalability issues due to tables without primary keys

Each day there is probably work done to improve performance of the InnoDB storage engine and remove bottlenecks and scalability issues. Hence there was another one I wanted to highlight:

Scalability issues due to tables without primary keys

This scalability issue is caused by the usage of tables without primary keys. This issue typically shows itself as contention on the InnoDB dict_sys mutex. Now the dict_sys mutex controls access to the data dictionary. This mutex is used at various places. I will only mention a few of them:

  • During operations such as opening and closing table handles, or
  • When accessing I_S tables, or
  • During undo of a freshly inserted row, or
  • During other data dictionary modification operations such as CREATE TABLE, or
  • Within the “Persistent Stats” subsystem, among other things.

Of course this list is not exhaustive but should give you a good picture of how heavily it is used.

But the thing is when you are mainly debugging contention related to a data dictionary control structure, you start to look off at something that is directly related to data dictionary modifications. You look for execution of CREATE TABLE, DROP TABLE, TRUNCATE TABLE, etc. But what if none of that is actually causing the contention on the dict_sys mutex? Are you aware when generating “row-id” values, for tables without explicit primary keys, or without non-nullable unique keys, dict_sys mutex is acquired. So INSERTs to tables with implicit primary keys is a InnoDB system-wide contention point.

Let’s also take a look at the relevant source code.

Firstly, below is the function that does the row-id allocation which is defined in the file storage/innobase/row/

3060 /***********************************************************//**
3061 Allocates a row id for row and inits the node->index field. */
3063 void
3064 row_ins_alloc_row_id_step(
3065 /*======================*/
3066         ins_node_t*     node)   /*!< in: row insert node */ 3067 { 3068         row_id_t        row_id; 3069  3070         ut_ad(node->state == INS_NODE_ALLOC_ROW_ID);
3072         if (dict_index_is_unique(dict_table_get_first_index(node->table))) {
3074                 /* No row id is stored if the clustered index is unique */
3076                 return;
3077         }
3079         /* Fill in row id value to row */
3081         row_id = dict_sys_get_new_row_id();
3083         dict_sys_write_row_id(node->row_id_buf, row_id);
3084 }

Secondly, below is the function that actually generates the row-id which is defined in the file storage/innobase/include/dict0boot.ic

26 /**********************************************************************//**
 27 Returns a new row id.
 28 @return the new id */
 30 row_id_t
 31 dict_sys_get_new_row_id(void)
 32 /*=========================*/
 33 {
 34         row_id_t        id;
 36         mutex_enter(&(dict_sys->mutex));
 38         id = dict_sys->row_id;
 40         if (0 == (id % DICT_HDR_ROW_ID_WRITE_MARGIN)) {
 42                 dict_hdr_flush_row_id();
 43         }
 45         dict_sys->row_id++;
 47         mutex_exit(&(dict_sys->mutex));
 49         return(id);
 50 }

Finally, I would like to share results of a few benchmarks that I conducted in order to show you how this affects performance.

Benchmarking affects of non-presence of primary keys

First off all, let me share information about the host that was used in the benchmarks. I will also share the MySQL version and InnoDB configuration used.


The host was a “hi1.4xlarge” Amazon EC2 instance. The instance comes with 16 vCPUs and 60.5GB of memory. The instance storage consists of 2×1024 SSD-backed storage volumes, and the instance is connected to a 10 Gigabit ethernet network. So the IO performance is very decent. I created a RAID 0 array from the 2 instance storage volumes and created XFS filesystem on the resultant software RAID 0 volume. This configuration would allows us to get the best possible IO performance out of the instance.


The MySQL version used was 5.5.34 MySQL Community Server, and the InnoDB configuration looked as follows:

innodb-flush-method            = O_DIRECT
innodb-log-files-in-group      = 2
innodb-log-file-size           = 512M
innodb-flush-log-at-trx-commit = 2
innodb-file-per-table          = 1
innodb-buffer-pool-size        = 42G
innodb-buffer-pool-instances   = 8
innodb-io-capacity             = 10000
innodb_adaptive_hash_index     = 1

I conducted two different types of benchmarks, and both of them were done by using sysbench.

First one involved benchmarking the performance of single-row INSERTs for tables with and without explicit primary keys. That’s what I would be showing first.

Single-row INSERTs

The tables were generated as follows for the benchmark involving tables with primary keys:

sysbench --test=/root/sysbench/sysbench/tests/db/insert.lua --oltp-tables-count=64 --oltp-table-size=1000000 --mysql-table-engine=innodb --mysql-user=root --mysql-host= --mysql-port=3306 --mysql-db=test prepare

This resulted in the following table being created:

CREATE TABLE `sbtest1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `k` int(10) unsigned NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)

While the tables without primary keys were generated as follows:

sysbench --test=/root/sysbench/sysbench/tests/db/insert.lua --oltp-tables-count=64 --oltp-table-size=1000000 --oltp-secondary --mysql-table-engine=innodb --mysql-user=root --mysql-host= --mysql-port=3306 --mysql-db=test prepare

This resulted in the tables being created with the following structure:

CREATE TABLE `sbtest1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `k` int(10) unsigned NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  KEY `xid` (`id`),
  KEY `k_1` (`k`)

The actual benchmark for the table with primary keys was run as follows:

sysbench --test=/root/sysbench/sysbench/tests/db/insert.lua --oltp-tables-count=64 --oltp-table-size=1000000 --oltp-dist-type=uniform --mysql-table-engine=innodb --mysql-user=root --mysql-host= --mysql-port=3306 --mysql-db=test --max-time=300 --num-threads=16 --max-requests=0 --report-interval=1 run

While the actual benchmark for the table without primary keys was run as follows:

sysbench --test=/root/sysbench/sysbench/tests/db/insert.lua --oltp-tables-count=64 --oltp-table-size=1000000 --oltp-secondary --oltp-dist-type=uniform --mysql-table-engine=innodb --mysql-user=root --mysql-host= --mysql-port=3306 --mysql-db=test --max-time=300 --num-threads=16 --max-requests=0 --report-interval=1 run

Note that the benchmarks were run with three variations in the number of concurrent threads used by sysbench: 16, 32 and 64.
Below are how the graphs look like for each of these benchmarks.

Writes per second 16 threads
Writes per second 32 threads
Writes per second 64 threads

Some key things to note from the graphs are that the throughput of the INSERTs to the tables without explicit primary keys never goes above 87% of the throughput of the INSERTs to the tables with primary keys defined. Furthermore, as we increase the concurrency downward spikes start appearing. These become more apparent when we move to a concurrency of 64 threads. This is expected, because the contention is supposed to increase as we increase the concurrency of operations that contend on the dict_sys mutex.

Now let’s take a look at how this impacts the bulk load performance.

Bulk Loads

The bulk loads to the tables with primary keys were performed as follows:

sysbench --test=/root/sysbench/sysbench/tests/db/parallel_prepare.lua --oltp-tables-count=64 --oltp-table-size=1000000 --mysql-table-engine=innodb --mysql-user=root --mysql-host= --mysql-port=3306 --mysql-db=test --num-threads=16 run

While the bulk loads to the tables without primary keys were performed as follows:

sysbench --test=/root/sysbench/sysbench/tests/db/parallel_prepare.lua --oltp-tables-count=64 --oltp-table-size=1000000 --oltp-secondary --mysql-table-engine=innodb --mysql-user=root --mysql-host= --mysql-port=3306 --mysql-db=test --num-threads=16 run

Note that the benchmarks were again run with three variations in the number of concurrent threads used by sysbench: 16, 32 and 64.
Below is what the picture is portrayed by the graph.

Parallel Bulk Loading of Tables

Here again, you can see how the bulk load time increases as we increase the number of concurrent threads. This against points to the increase in contention on the dict_sys mutex. With 16 threads the bulk load time for tables without primary keys is 107% more than the bulk load time for the tables with primary keys. This increases to 116% with 32 threads and finally 124% with 64 threads.


Tables without primary keys cause a wide range of contention because they rely on acquiring dict_sys mutex to generate row-id values. This mutex is used at critical places within InnoDB. Hence the affect of large amount of INSERTs to tables without primary keys is not only isolated to that table alone but can be seen very widely. There are a number of times I have seen tables without primary keys being used in many different scenarios that include simple INSERTs to these tables as well as multi-row INSERTs as a result of, for example, INSERT … SELECT into a table that is being temporarily created. The advice is always to have primary keys present in your tables. Hopefully I have been able to highlight the true impact non-presence of primary keys can have.

The post InnoDB scalability issues due to tables without primary keys appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by