Oct
29
2018
--

One Week Until Percona Live Open Source Database Conference Europe 2018

Percona Live Europe 2018

It’s almost here! One week until the Percona Live Europe Open Source Database Conference 2018 in Frankfurt, Germany! Are you ready?

This year’s theme is “Connect. Accelerate. Innovate.” We want to live these words by making sure that the conference allows you to connect with others in the open source community, accelerate your ideas and solutions and innovate when you get back to your projects and companies.

  • There is one day of tutorials (Monday) and two days of sessions (Tuesday and Wednesday). We have multiple tracks: MySQL 8.0, Using MySQL, MongoDB, PostgreSQL, Cloud, Database Security and Compliance, Monitoring and Ops, and Containers and Emerging Technologies. This year also includes a specialized “Business Track” aimed at how open source can solve critical enterprise issues.
  • Each of the session days begins with excellent keynote presentations in the main room by well-known people and players in the open source community. Don’t miss them!
  • Don’t forget to attend our Welcome Reception on Monday.
  • Want to meet with our Product Managers? Join them for Lunch on Wednesday, November 7, where you’ll have a chance to participate in the development of Percona Software!
  • On our community blog, we’ve been highlighting some of the sessions that will be occurring during the conference. You can check them out here.

The entire conference schedule is up and available here.

Percona Live Europe provides the community with an opportunity to discover and discuss the latest open source trends, technologies and innovations. The conference includes the best and brightest innovators and influencers in the open source database industry.

Our daily sessions, day-one tutorials, demonstrations, keynotes and events provide access to what is happening NOW in the world of open source databases. At the conference, you can mingle with all levels of the database community: DBAs, developers, C-level executives and the latest database technology trend-setters.

Network with peers and technology professionals and unite the open source database community! Share knowledge, experiences and use cases! Learn about how open source database technology can power your applications, improve your websites and solve your critical database issues.

Come to the conference.

Don’t miss out, buy your tickets here!

Connect. Accelerate. Innovate.

With a lot of focus on the benefits of open source over proprietary models of software delivery, you surely can’t afford to miss this opportunity to connect with leading figures of the open source database world. On Monday, November 5 you can opt to accelerate your knowledge with our in-depth tutorials, or choose to attend our business track geared towards open source innovation and adoption.

Tuesday and Wednesday’s sessions across eight different tracks provide something for all levels of experience and address a range of business challenges. See the full schedule.

Oct
26
2018
--

Announcing Keynotes for Percona Live Europe!

Percona Live Keynotes

There’s just over one week to go so it’s time to announce the keynote addresses for Percona Live Europe 2018! We’re excited to share our lineup of conference keynotes, featuring talks from Paddy Power Betfair, Amazon Web Services, Facebook, PingCap and more!

The speakers will address the current status of key open source database projects: MySQL®, PostgreSQL, MongoDB®, and MariaDB®. They’ll be sharing how organizations are shifting from a single database technology to a polyglot strategy, thereby avoiding vendor lock-in and enabling business growth.

Without further ado, here’s the full keynote line-up for 2018!

Tuesday, November 6

Maximizing the Power and Value of Open Source Databases

Open source database adoption continues to grow in enterprise organizations, as companies look to scale for growth, maintain performance, keep up with changing technologies, control risks and contain costs. In today’s environment, a single database technology or platform is no longer an option, as organizations shift to a best-of-breed, polyglot strategy to avoid vendor lock-in, increase agility and enable business growth. Percona’s CEO Peter Zaitsev shares his perspective.

Following this keynote, there will be a round of lightning talks featuring the latest releases from PostgreSQL, MongoDB and MariaDB.

Technology Lightning Talks

PostgreSQL 11

PostgreSQL benefits from over 20 years of open source development, and has become the preferred open source relational database for developers. PostgreSQL 11 was released on October 18. It provides users with improvements to the overall performance of the database system, with specific enhancements associated with very large databases and high computational workloads.

MongoDB 4.0

Do you love MongoDB? With version 4.0 you have a reason to love it even more! MongoDB 4.0 adds support for multi-document ACID transactions, combining the document model with ACID guarantees. Through snapshot isolation, transactions provide a consistent view of data and enforce all-or-nothing execution to maintain data integrity. And not only transactions – MongoDB 4.0 has more exciting features like non-blocking secondary reads, improved sharding, security improvements, and more.

MariaDB 10.3

MariaDB benefits from a thriving community of contributors. The latest release, MariaDB 10.3, provides several new features not found anywhere else, as well as back-ported and reimplemented features from MySQL.

Paddy Power Betfair, Percona, and MySQL

This keynote highlights the collaborative journey Paddy Power Betfair and Percona have taken through the adoption of MySQL within the PPB enterprise. The keynote focuses on how Percona has assisted PPB in adopting MySQL, and how PPB has used this partnership to deliver a full DBaaS for a MySQL solution on OpenStack.

Wednesday, November 7

State of the Dolphin

Geir Høydalsvik (Oracle) will talk about the focus, strategy, investments, and innovations evolving MySQL to power next-generation web, mobile, cloud, and embedded applications. He will also discuss MySQL 8.0, the latest and most significant MySQL database release in its history.

Amazon Relational Database Service (RDS)

Amazon RDS is a fully managed database service that allows you to launch an optimally configured, secure, and highly available database with just a few clicks. It manages time-consuming database administration tasks, freeing you to focus on your applications and business. This keynote features the latest news and announcements from RDS, including the launches of Aurora Serverless, Parallel Query, Backtrack, RDS MySQL 8.0, PostgreSQL 10.0, Performance Insights, and several other recent innovations.

TiDB 2.1, MySQL Compatibility, and Multi-Cloud Deployment

This keynote talk from PingCap will provide an architectural overview of TiDB, how and why it’s MySQL compatible, the latest features and improvements in TiDB 2.1 GA release, and how its multi-cloud fully-managed solution works.

MyRocks in the Real World

In this keynote, Yoshinori Matsunobu, Facebook, will share interesting lessons learned from Facebook’s production deployment and operations of MyRocks and future MyRocks development roadmaps. Vadim Tkachenko, Percona’s CTO, will discuss MyRocks in Percona Server for MySQL and share performance benchmark results from on-premise and cloud deployments.

Don’t miss out, buy your tickets here!

Connect. Accelerate. Innovate.

With a lot of focus on the benefits of open source over proprietary models of software delivery, you surely can’t afford to miss this opportunity to connect with leading figures of the open source database world. On Monday, November 5 you can opt to accelerate your knowledge with our in-depth tutorials, or choose to attend our business track geared towards open source innovation and adoption.

On Tuesday and Wednesday, with sessions across eight different tracks, there’s something for all levels of experience, addressing a range of business challenges. See the full schedule.

With thanks to our sponsors!

Platinum: AWS, Percona
Gold: Facebook
Silver: Altinity, PingCap, Shannon Systems, OlinData, MySQL
Startup: SeveralNines
Community: PostgreSQL, MariaDB Foundation
Contributing: Intel Optane, Idera, Studio3T

Media Sponsors: Datanami, Enterprise Tech, HPC Wire, ODBMS.org, Database Trends and Applications, Packt


Oct
25
2018
--

Percona Live Europe 2018: Our Sponsors

Sponsors PLE 2018

Without our sponsors, it would be almost out of reach to deliver a conference of the size and format that everyone has come to expect from Percona Live. As well as financial support, our sponsors contribute massively by supporting their teams in presenting at the conference, and adding to the quality and atmosphere of the event. Having their support means we can present excellent in-depth technical content for the tutorials and talks, and that’s highly valued by conference delegates. This year, too, Amazon Web Services (AWS) sponsors the cloud track on day two, with a superb lineup of cloud content.

Here’s a shout out to our sponsors; you’ll find more information on the Percona Live sponsors page:

Platinum

AWS

For over 12 years, Amazon Web Services has been the world’s most comprehensive and broadly adopted cloud platform. https://aws.amazon.com/

Gold

Facebook
Facebook offers a fantastic contribution to open source databases with MyRocks and is greatly appreciated for its ongoing support of Percona Live.
https://www.facebook.com

Silver

Altinity
Altinity is the leading service provider for ClickHouse.
https://www.altinity.com/

PingCAP
PingCAP is the company and core team building TiDB, a popular open-source MySQL-compatible NewSQL hybrid database.
https://www.pingcap.com/en/

Shannon Systems

Shannon Systems is a global leader in providing enterprise-grade Flash storage devices and system solutions.
http://en.shannon-sys.com/

OlinData

OlinData is an open source infrastructure management company providing services to help companies from small to large with their infrastructure.
https://www.olindata.com/en

MySQL

MySQL is the world’s most popular open source database, delivered by Oracle.
https://www.mysql.com/

Start Up

SeveralNines
SeveralNines provides automation and management software for MySQL, MariaDB and MongoDB clusters.
https://severalnines.com/

Community Sponsors

PostgreSQL
PostgreSQL is a powerful, open source object-relational database system.
https://www.postgresql.org/
MariaDB Foundation

MariaDB Server is one of the most popular database servers in the world.
https://mariadb.org

Branding

Intel OPTANE
Intel is the world’s leading technology company, powering the cloud and billions of smart, connected computing devices.
https://www.intel.com

Idera
IDERA designs powerful software with one goal in mind – to solve customers’ most complex challenges with elegant solutions.
https://www.idera.com/

Studio 3T

Studio 3T is a GUI and IDE for developers and data engineers who work with MongoDB.
https://studio3t.com/

Media

  • Datanami: an online portal for data science, AI and advanced analytics
  • Enterprise Tech: an online portal addressing high performance computing technologies at scale
  • HPC Wire: covering the fastest computers in the world and the people who run them
  • ODBMS.org: a resource portal for big data, new data management technologies, data science and AI
  • Packt: online technical publications and videos

Thanks again to all – appreciated!

Oct
23
2018
--

Reclaiming space on your Docker PMM server deployment

reclaiming space Docker PMM

Recently we had a customer that had issues with a filled disk on the server hosting their Docker pmm-server environment. They were not able to access the web UI, or even stop the pmm-server container, because they had filled the /var/ mount point.

Setting correct expectations

The best way to avoid these kinds of issues in the first place is to plan ahead, and to know exactly what you are dealing with in terms of disk space requirements. Michael Coburn has written a great blogpost on this matter:

https://www.percona.com/blog/2017/05/04/how-much-disk-space-should-i-allocate-for-percona-monitoring-and-management/

Note that we are now using Prometheus version 2 inside PMM Server, so you should take the numbers in that post with a pinch of salt. On the other hand, it shows how you should plan ahead and think about the “steady state” disk usage, so it’s a good read.

That’s the first step to make sure you won’t get into trouble down the line. But what happens if you are already in trouble? We’ll look at two quick ways that may help reclaim space.

Before anything else, you should stop any and all PMM clients running, so that you don’t have a race condition after recovering some space, in which metrics coming from the running clients will fill up whatever disk you had freed.

If pmm-admin stop --all won’t work, you can stop the services manually, or even kill the running processes as a last resort:

shell> systemctl list-unit-files | grep enabled | grep pmm | awk '{print $1}' | xargs -n 1 systemctl stop
shell> ps ax | egrep "exporter|qan-agent|pmm" | grep -v "ssh" | awk '{print $1}' | xargs kill

Removing unused containers

In order for the next steps to be as effective as possible, make sure there are no unused containers running, or stopped:

shell> docker ps -a

If you see any container that you know you don’t need anymore:

shell> docker stop <container_name>
shell> docker rm -v <container_name>

WARNING! Do not remove the pmm-data container!

Reclaiming space from unused Docker images

After you are done cleaning unused containers, we can move forward with removing unused images. Unless you are manually building your own Docker images, it’s really easy to get them again if needed, so you shouldn’t be afraid of deleting the ones that are not being used. In fact, you don’t even need to explicitly download the images: if you simply run docker run … image_name, Docker will automatically pull the image for you if it’s not found locally.

shell> docker image prune -a
WARNING! This will remove all images without at least one container associated to them.
Are you sure you want to continue? [y/N] y
Deleted Images:
...
Total reclaimed space: 3.97GB

Not too bad: we just reclaimed about 4GB of disk space. This alone should be enough to restart the Docker service and bring the pmm-server container back up. But we want more, just because we can.

Reclaiming space from orphaned Docker volumes

By default, when removing a container with docker rm, Docker will not delete the associated volumes unless you use the -v switch as we did above. This means that, unless you were aware of this fact, you probably have some other gigabytes’ worth of data occupying disk space. We can easily reclaim it with the volume prune command:

shell> docker volume prune
WARNING! This will remove all local volumes not used by at least one container.
Are you sure you want to continue? [y/N] y
Deleted Volumes:
...
Total reclaimed space: 115GB

Yeah… that’s a significant amount of disk space we just reclaimed! Again, before doing this make sure you don’t care about any of the volumes from your past containers, since there is no turning back from it, obviously.

For earlier versions of Docker where this command is not available, you can check this link.
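As a rough sketch for those older versions (assuming they at least support the dangling filter), something like the following achieves a similar cleanup; review the listed volumes before piping them into the removal command:

shell> docker volume ls -qf dangling=true                            # list volumes not referenced by any container
shell> docker volume ls -qf dangling=true | xargs -r docker volume rm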

Planning ahead

As mentioned before, you should now revisit Michael’s blogpost, and set the metrics retention and queries retention variables to whatever makes sense for your environment. Even if you plan ahead, you may not be counting on the additional variable overhead of images and orphaned volumes, so you may want to (warning: shameless plug for my own blogpost ahead) use different mount points for your PMM deployment, and avoid using the shared /var/lib/docker/ mount point for it.
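As a loose illustration only (the image tag, retention values, and host path below are assumptions, not recommendations), both concerns can be handled when the container is created: retention through environment variables, and the data location through a bind mount to a dedicated mount point instead of the shared /var/lib/docker/:

shell> docker run -d --name pmm-server --restart always -p 80:80 \
         -e METRICS_RETENTION=720h \
         -e QUERIES_RETENTION=8 \
         -v /pmm-data/prometheus:/opt/prometheus/data \
         percona/pmm-server:latest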

PMM also includes a Disk Space usage dashboard that you can use to monitor this.

Don’t forget to start back up your PMM clients, and continue to monitor them 24×7!
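If everything was stopped as part of the cleanup, here is a minimal sketch of bringing it all back (assuming systemd on the host and the standard container name used above):

shell> systemctl start docker        # only if the Docker service itself was stopped
shell> docker start pmm-server
shell> pmm-admin start --all         # on each client host where the agents were stopped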

Photo by Andrew Wulf on Unsplash

Oct
22
2018
--

One Billion Tables in MySQL 8.0 with ZFS

one billion tables MySQL

The short version

I created > one billion InnoDB tables in MySQL 8.0 (tables, not rows) just for fun. Here is the proof:

$ mysql -A
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1425329
Server version: 8.0.12 MySQL Community Server - GPL
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> select count(*) from information_schema.tables;
+------------+
| count(*)   |
+------------+
| 1011570298 |
+------------+
1 row in set (6 hours 57 min 6.31 sec)

Yes, it took 6 hours and 57 minutes to count them all!

Why does anyone need one billion tables?

In my previous blog post, I created and tested MySQL 8.0 with 40 million tables (that was a real case study). The One Billion Tables project is not a real world scenario, however. I was challenged by Billion Tables Project (BTP) in PostgreSQL, and decided to repeat it with MySQL, creating 1 billion InnoDB tables.

As an aside: I think MySQL 8.0 is the first MySQL version where creating 1 billion InnoDB tables is even practically possible.

Challenges with one billion InnoDB tables

Disk space

The first and one of the most important challenges is disk space. InnoDB allocates data pages on disk when creating .ibd files. Without disk-level compression we need > 25TB of disk space. The good news: we have ZFS, which provides transparent disk compression. Here’s how the disk utilization looks:

Actual data (apparent-size):

# du -sh --apparent-size /mysqldata/
26T     /mysqldata/

Compressed data:

# du -sh /mysqldata/
2.4T    /mysqldata/

Compression ratio:

# zfs get compression,compressratio
...
mysqldata/mysql/data             compressratio         7.14x                      -
mysqldata/mysql/data             compression           gzip                       inherited from mysqldata/mysql

(It looks like the reported compression ratio is not 100% correct; based on the numbers above we would expect a ~10x compression ratio.)
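For reference, a minimal sketch of how such a compressed dataset can be set up; the pool, dataset, and device names follow the ones used in this post (see the zpool status output further down), and the child dataset inherits the gzip setting, matching the “inherited from” column above:

# zpool create mysqldata nvme0n1 nvme1n1 nvme2n1 nvme3n1
# zfs create mysqldata/mysql
# zfs set compression=gzip mysqldata/mysql
# zfs create mysqldata/mysql/data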

Too many tiny files

This is usually the big issue with databases that create a file per table. With MySQL 8.0 we can create a shared tablespace and “assign” a table to it. I created a tablespace per database, and created 1000 tables in each database.

The result:

mysql> select count(*) from information_schema.schemata;
+----------+
| count(*) |
+----------+
|  1011575 |
+----------+
1 row in set (1.31 sec)
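Boiled down to a single database, the pattern behind those numbers looks like this (a minimal sketch; the deploy script in the next section does the same thing with 1,000 sysbench-style tables per database):

mysql> CREATE DATABASE sbtest_1;
mysql> CREATE TABLESPACE sbtest_1 ADD DATAFILE 'sbtest_1.ibd' ENGINE=InnoDB;
mysql> CREATE TABLE sbtest_1.t1 (id INT UNSIGNED NOT NULL PRIMARY KEY) TABLESPACE sbtest_1;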

Creating tables

Another big challenge is how to create tables fast enough so it will not take months. I have used three approaches:

  1. Disabled all possible consistency checks in MySQL, and decreased the innodb page size to 4K (these config options are NOT for production use)
  2. Created tables in parallel: as the mutex contention bug in MySQL 8.0 has been fixed, creating tables in parallel works fine.
  3. Use local NVMe cards on top of an AWS ec2 i3.8xlarge instance

my.cnf config file (I repeat: do not use this in production):

[mysqld]
default-authentication-plugin = mysql_native_password
performance_schema=0
datadir=/mysqldata/mysql/data
socket=/mysqldata/mysql/data/mysql.sock
log-error = /mysqldata/mysql/log/error.log
skip-log-bin=1
innodb_log_group_home_dir = /mysqldata/mysql/log/
innodb_doublewrite = 0
innodb_checksum_algorithm=none
innodb_log_checksums=0
innodb_flush_log_at_trx_commit=0
innodb_log_file_size=2G
innodb_buffer_pool_size=100G
innodb_page_size=4k
innodb_flush_method=nosync
innodb_io_capacity_max=20000
innodb_io_capacity=5000
innodb_buffer_pool_instances=32
innodb_stats_persistent = 0
tablespace_definition_cache = 524288
schema_definition_cache = 524288
table_definition_cache = 524288
table_open_cache=524288
table_open_cache_instances=32
open-files-limit=1000000

ZFS pool:

# zpool status
  pool: mysqldata
 state: ONLINE
  scan: scrub repaired 0B in 1h49m with 0 errors on Sun Oct 14 02:13:17 2018
config:
        NAME        STATE     READ WRITE CKSUM
        mysqldata   ONLINE       0     0     0
          nvme0n1   ONLINE       0     0     0
          nvme1n1   ONLINE       0     0     0
          nvme2n1   ONLINE       0     0     0
          nvme3n1   ONLINE       0     0     0
errors: No known data errors

A simple “deploy” script to create tables in parallel (includes the sysbench table structure):

#!/bin/bash
function do_db {
        db_exist=$(mysql -A -s -Nbe "select 1 from information_schema.schemata where schema_name = '$db'")
        if [ "$db_exist" == "1" ]; then echo "Already exists: $db"; return 0; fi;
        tbspace="create database $db; use $db; CREATE TABLESPACE $db ADD DATAFILE '$db.ibd' engine=InnoDB";
        #echo "Tablespace $db.ibd created!"
        tables=""
        for i in {1..1000}
        do
                table="CREATE TABLE sbtest$i ( id int(10) unsigned NOT NULL AUTO_INCREMENT, k int(10) unsigned NOT NULL DEFAULT '0', c varchar(120) NOT NULL DEFAULT '', pad varchar(60) NOT NULL DEFAULT '', PRIMARY KEY (id), KEY k_1 (k) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 tablespace $db;"
                tables="$tables; $table;"
        done
        echo "$tbspace;$tables" | mysql
}
c=0
echo "starting..."
c=$(mysql -A -s -Nbe "select max(cast(SUBSTRING_INDEX(schema_name, '_', -1) as unsigned)) from information_schema.schemata where schema_name like 'sbtest_%'")
for m in {1..100000}
do
        echo "m=$m"
        for i in {1..30}
        do
                let c=$c+1
                echo $c
                db="sbtest_$c"
                do_db &
        done
        wait
done

How fast did we create tables? Here are some stats:

# mysqladmin -i 10 -r ex|grep Com_create_table
...
| Com_create_table                                      | 6497 |
| Com_create_table                                      | 6449 |

So we created ~650 tables per second. The numbers above are deltas over 10-second intervals (mysqladmin -i 10 -r reports relative values).
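As a rough sanity check on that rate: a billion tables at ~650 tables per second works out to roughly 18 days of creation time, which is in the same ballpark as the ~2 weeks mentioned in the conclusions below.

# echo $(( 1011570298 / 650 / 86400 ))
18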

Counting the tables

It took > 6 hours to do “count(*) from information_schema.tables”! Here is why:

  1. MySQL 8.0 uses a new data dictionary (this is great as it avoids creating 1 billion frm files). Everything is stored in this file:
    # ls -lah /mysqldata/mysql/data/mysql.ibd
    -rw-r----- 1 mysql mysql 6.1T Oct 18 15:02 /mysqldata/mysql/data/mysql.ibd
  2. The information_schema.tables is actually a view:
mysql> show create table information_schema.tables\G
*************************** 1. row ***************************
                View: TABLES
         Create View: CREATE ALGORITHM=UNDEFINED DEFINER=`mysql.infoschema`@`localhost` SQL SECURITY DEFINER VIEW `information_schema`.`TABLES` AS select `cat`.`name` AS `TABLE_CATALOG`,`sch`.`name` AS `TABLE_SCHEMA`,`tbl`.`name` AS `TABLE_NAME`,`tbl`.`type` AS `TABLE_TYPE`,if((`tbl`.`type` = 'BASE TABLE'),`tbl`.`engine`,NULL) AS `ENGINE`,if((`tbl`.`type` = 'VIEW'),NULL,10) AS `VERSION`,`tbl`.`row_format` AS `ROW_FORMAT`,internal_table_rows(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`table_rows`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `TABLE_ROWS`,internal_avg_row_length(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`avg_row_length`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `AVG_ROW_LENGTH`,internal_data_length(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`data_length`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `DATA_LENGTH`,internal_max_data_length(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`max_data_length`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `MAX_DATA_LENGTH`,internal_index_length(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`index_length`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `INDEX_LENGTH`,internal_data_free(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`data_free`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `DATA_FREE`,internal_auto_increment(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`auto_increment`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0),`tbl`.`se_private_data`) AS `AUTO_INCREMENT`,`tbl`.`created` AS `CREATE_TIME`,internal_update_time(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(cast(`stat`.`update_time` as unsigned),0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `UPDATE_TIME`,internal_check_time(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(cast(`stat`.`check_time` as unsigned),0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `CHECK_TIME`,`col`.`name` AS `TABLE_COLLATION`,internal_checksum(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`checksum`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `CHECKSUM`,if((`tbl`.`type` = 'VIEW'),NULL,get_dd_create_options(`tbl`.`options`,if((ifnull(`tbl`.`partition_expression`,'NOT_PART_TBL') = 'NOT_PART_TBL'),0,1))) AS 
`CREATE_OPTIONS`,internal_get_comment_or_error(`sch`.`name`,`tbl`.`name`,`tbl`.`type`,`tbl`.`options`,`tbl`.`comment`) AS `TABLE_COMMENT` from (((((`mysql`.`tables` `tbl` join `mysql`.`schemata` `sch` on((`tbl`.`schema_id` = `sch`.`id`))) join `mysql`.`catalogs` `cat` on((`cat`.`id` = `sch`.`catalog_id`))) left join `mysql`.`collations` `col` on((`tbl`.`collation_id` = `col`.`id`))) left join `mysql`.`tablespaces` `ts` on((`tbl`.`tablespace_id` = `ts`.`id`))) left join `mysql`.`table_stats` `stat` on(((`tbl`.`name` = `stat`.`table_name`) and (`sch`.`name` = `stat`.`schema_name`)))) where (can_access_table(`sch`.`name`,`tbl`.`name`) and is_visible_dd_object(`tbl`.`hidden`))
character_set_client: utf8
collation_connection: utf8_general_ci

and the explain plan looks like this:

mysql> explain select count(*) from information_schema.tables \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: cat
   partitions: NULL
         type: index
possible_keys: PRIMARY
          key: name
      key_len: 194
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: tbl
   partitions: NULL
         type: ALL
possible_keys: schema_id
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1023387060
     filtered: 100.00
        Extra: Using where; Using join buffer (Block Nested Loop)
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: sch
   partitions: NULL
         type: eq_ref
possible_keys: PRIMARY,catalog_id
          key: PRIMARY
      key_len: 8
          ref: mysql.tbl.schema_id
         rows: 1
     filtered: 11.11
        Extra: Using where
*************************** 4. row ***************************
           id: 1
  select_type: SIMPLE
        table: stat
   partitions: NULL
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 388
          ref: mysql.sch.name,mysql.tbl.name
         rows: 1
     filtered: 100.00
        Extra: Using index
*************************** 5. row ***************************
           id: 1
  select_type: SIMPLE
        table: ts
   partitions: NULL
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 8
          ref: mysql.tbl.tablespace_id
         rows: 1
     filtered: 100.00
        Extra: Using index
*************************** 6. row ***************************
           id: 1
  select_type: SIMPLE
        table: col
   partitions: NULL
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 8
          ref: mysql.tbl.collation_id
         rows: 1
     filtered: 100.00
        Extra: Using index

Conclusions

  1. I have created more than 1 billion real InnoDB tables with indexes in MySQL 8.0, just for fun, and it worked. It took ~2 weeks to create.
  2. MySQL 8.0 is probably the first version where it is even practically possible to create a billion InnoDB tables.
  3. ZFS compression together with NVMe cards makes it reasonably cheap to do, for example, by using i3.4xlarge or i3.8xlarge instances on AWS.

Oct
22
2018
--

Percona Live Europe Presents … In Their Own Words

Percona Live Europe 2018 two weeks to go

For those who are looking forward to Percona Live Europe in just two weeks’ time—and for those yet to make up their minds—some of our presenters have shared some insight into their talks and what they’re most looking forward to themselves. Make no mistake, this is one of the most exciting events in the conference calendar for those of us who work with open source databases.

This year, our conference previews are being hosted over on the Percona community blog and the posts have been written by the presenters.

Percona Live Europe presents…

Here are the first six posts in this series of Percona Live Europe presents. There are more to come, so do come back over the next few days to see if any of the writers can help you pinpoint the talks that you are most interested in attending this year:

  • Dinesh Joshi will be taking a look at boosting Apache Cassandra’s performance using Netty
  • Federico Razzoli on why he’s investigating MariaDB system versioned tables
  • Jaime Crespo of Wikimedia Foundation will be presenting an entry-level (but detailed) tutorial on query optimization, and a breakout talk on TLS security; you can find out more in his blog post
  • Tiago Jorge of Oracle on his talk about MySQL 8.0 replication
  • There’s going to be an ElasticSearch 101 tutorial presented by three of the team from ObjectRocket—Antonios Giannopoulos tells you more about that stellar opportunity—while last but by no means least…
  • Arjen Lentz, new CEO of MariaDB Foundation, is keen to share with you the latest information on MariaDB 10.3

Tantalized? Keep meaning to book your seat? There’s not long left now, so head straight to the registration page and book your place. Percona Live Europe will be in Frankfurt from November 5-7, 2018.

About the community blog

We’re really pleased that the community blog is gaining some great support. It offers a platform for all to write on the general topic of open source databases. Commercial and non-commercial. Those who are already prolific bloggers, and those who maybe only want to write a blog or two on a topic that they feel strongly about. If you’d like to join us and write for the community blog, please get in touch! You can email me.

Oct
18
2018
--

ProxySQL 1.4.11 and Updated proxysql-admin Tool Now in the Percona Repository

ProxySQL 1.4.11

ProxySQL 1.4.11, released by ProxySQL, is now available for download in the Percona Repository along with an updated version of Percona’s proxysql-admin tool.

ProxySQL is a high-performance proxy, currently for MySQL and its forks (like Percona Server for MySQL and MariaDB). It acts as an intermediary for client requests seeking resources from the database. René Cannaò created ProxySQL for DBAs as a means of solving complex replication topology issues.

The ProxySQL 1.4.11 source and binary packages available at https://percona.com/downloads/proxysql include ProxySQL Admin – a tool, developed by Percona to configure Percona XtraDB Cluster nodes into ProxySQL. Docker images for release 1.4.11 are available as well: https://hub.docker.com/r/percona/proxysql/. You can download the original ProxySQL from https://github.com/sysown/proxysql/releases. The documentation is hosted on GitHub in the wiki format.
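As a hedged illustration of what using the admin tool looks like (the flag names and the default config file path here are taken from the tool’s documentation as we understand it; check proxysql-admin --help on your version), configuring a Percona XtraDB Cluster into ProxySQL comes down to a single command once the credentials in the config file are filled in:

shell> proxysql-admin --config-file=/etc/proxysql-admin.cnf --enable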

Improvements

  • mysql_query_rules_fast_routing is enabled in ProxySQL Cluster. For more information, see #1674 at GitHub.
  • In this release, the rpmdb checksum error is ignored when building ProxySQL in Docker.
  • By default, the permissions for proxysql.cnf are set to 600 (only the owner of the file can read it or make changes to it).

Bugs Fixed

  • Fixed the bug that could cause crashing of ProxySQL if IPv6 listening was enabled. For more information, see #1646 at GitHub.

ProxySQL is available under Open Source license GPLv3.

Oct
17
2018
--

Can MySQL Parallel Replication Help My Slave?

InnoDB Row Operations per Hour graph from Percona Monitoring and Management performance monitoring tool

Parallel replication has been around for a few years now but is still not that commonly used. I had a customer whose master had a very large write workload. The slave could not keep up, so I recommended using parallel slave threads. But how can I measure whether it really helps and is working?

At my customer, slave_parallel_workers was 0. But how high should I set it? Maybe to 1? Maybe to 10? There is a blog post about how we can see how many threads are actually used, which is a great help.

We changed the following variables on the slave:

slave_parallel_type = LOGICAL_CLOCK;
slave_parallel_workers = 40;
slave_preserve_commit_order = ON;
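For reference, here is a sketch of applying those settings dynamically on a MySQL 5.7-era replica (the SQL thread has to be stopped for the changes, and slave_preserve_commit_order also requires log_slave_updates to be enabled on versions of that era); add the same values to my.cnf to make them persistent:

mysql> STOP SLAVE SQL_THREAD;
mysql> SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';
mysql> SET GLOBAL slave_parallel_workers = 40;
mysql> SET GLOBAL slave_preserve_commit_order = ON;
mysql> START SLAVE SQL_THREAD;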

40 threads sounds like a lot, right? Of course, this is workload specific: if the transactions are independent, it might be useful.

Let’s have a look, how many threads are working:

mysql> SELECT performance_schema.events_transactions_summary_by_thread_by_event_name.THREAD_ID AS THREAD_ID
, performance_schema.events_transactions_summary_by_thread_by_event_name.COUNT_STAR AS COUNT_STAR
FROM performance_schema.events_transactions_summary_by_thread_by_event_name
WHERE performance_schema.events_transactions_summary_by_thread_by_event_name.THREAD_ID IN
     (SELECT performance_schema.replication_applier_status_by_worker.THREAD_ID
      FROM performance_schema.replication_applier_status_by_worker);
+-----------+------------+
| THREAD_ID | COUNT_STAR |
+-----------+------------+
|     25882 |     442481 |
|     25883 |     433200 |
|     25884 |     426460 |
|     25885 |     419772 |
|     25886 |     413751 |
|     25887 |     407511 |
|     25888 |     401592 |
|     25889 |     395169 |
|     25890 |     388861 |
|     25891 |     380657 |
|     25892 |     371923 |
|     25893 |     362482 |
|     25894 |     351601 |
|     25895 |     339282 |
|     25896 |     325148 |
|     25897 |     310051 |
|     25898 |     292187 |
|     25899 |     272990 |
|     25900 |     252843 |
|     25901 |     232424 |
+-----------+------------+

You can see that all the threads are working, which is great.

But did this really speed up the replication? Could the slave write more in the same period of time?

Let’s see the replication lag:

MySQL Replication Delay graph from PMM

As we can see, lag goes down quite quickly. Is this because of the increased thread count? Or because the job which generated the many inserts finished, and there are no more writes coming? (The replication delay did not go to 0 because this slave is deliberately delayed by an hour.)

Luckily in PMM we have other graphs as well that can help us. Like this one showing InnoDB row operations:

InnoDB Row Operations graph from PMM

That looks promising: the slave now inserts many more rows than usual. But how many rows were inserted, actually? Let’s create a new graph to see how many rows were inserted per hour. In PMM we already have all this information; we just have to create a new graph using the following query:

increase(mysql_global_status_innodb_row_ops_total{instance="$host",operation!="read"}[1h])

And this is the result:

InnoDB Row Operations per Hour graph from Percona Monitoring and Management performance monitoring tool

We can see a huge jump in the number of inserted rows per hour: it went from ~50 million to 200-400 million per hour. We can say that increasing the number of slave_parallel_workers really helped.

Conclusion

In this case, parallel replication was extremely useful, and we could confirm that using PMM and Performance Schema. If you tune slave_parallel_workers, check the graphs. You can show the impact to your boss.

Oct
15
2018
--

Identifying High Load Spots in MySQL Using Slow Query Log and pt-query-digest

pt-query-digest MySQL slow queries

pt-query-digest is one of the most commonly used tools when it comes to query auditing in MySQL®. By default, pt-query-digest reports the top ten queries consuming the most time inside MySQL. A query that takes longer than the set threshold to complete is considered slow, but it’s not always true that tuning such queries makes them faster. Sometimes, when resources on the server are busy, it will impact every other operation on the server, queries included. In such cases, you will see the proportion of slow queries go up, and that can also include queries that work fine in general.

This article explains a small trick to identify such spots using pt-query-digest and the slow query log. pt-query-digest is a component of Percona Toolkit, open source software that is free to download and use.
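For anyone following along, a typical invocation just points the tool at the slow log and redirects the report to a file (the log file name below matches the one used later in this post; substitute your own slow_query_log_file path):

# pt-query-digest mysql-slow_Oct2nd_4th.log > slow_digest.txt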

Some sample data

Let’s have a look at sample data in Percona Server 5.7. The slow query log is configured to capture queries longer than ten seconds, with no limit on the rate of logging (rate limiting is generally used to throttle the IO incurred by writing slow queries to the log file).

mysql> show variables like 'log_slow_rate%' ;
+---------------------+---------+
| Variable_name       | Value    |
+---------------------+---------+
| log_slow_rate_limit | 1       |  --> Log all queries
| log_slow_rate_type  | session |
+---------------------+---------+
2 rows in set (0.00 sec)
mysql> show variables like 'long_query_time' ;
+-----------------+-----------+
| Variable_name   | Value     |
+-----------------+-----------+
| long_query_time | 10.000000 |  --> 10 seconds
+-----------------+-----------+
1 row in set (0.01 sec)

When I run pt-query-digest, I see in the summary report that 80% of the queries have come from just three query patterns.

# Profile
# Rank Query ID                      Response time    Calls R/Call   V/M
# ==== ============================= ================ ===== ======== =====
#    1 0x7B92A64478A4499516F46891... 13446.3083 56.1%   102 131.8266  3.83 SELECT performance_schema.events_statements_history
#    2 0x752E6264A9E73B741D3DC04F...  4185.0857 17.5%    30 139.5029  0.00 SELECT table1
#    3 0xAFB5110D2C576F3700EE3F7B...  1688.7549  7.0%    13 129.9042  8.20 SELECT table2
#    4 0x6CE1C4E763245AF56911E983...  1401.7309  5.8%    12 116.8109 13.45 SELECT table4
#    5 0x85325FDF75CD6F1C91DFBB85...   989.5446  4.1%    15  65.9696 55.42 SELECT tbl1 tbl2 tbl3 tbl4
#    6 0xB30E9CB844F2F14648B182D0...   420.2127  1.8%     4 105.0532 12.91 SELECT tbl5
#    7 0x7F7C6EE1D23493B5D6234382...   382.1407  1.6%    12  31.8451 70.36 INSERT UPDATE tbl6
#    8 0xBC1EE70ABAE1D17CD8F177D7...   320.5010  1.3%     6  53.4168 67.01 REPLACE tbl7
#   10 0xA2A385D3A76D492144DD219B...   183.9891  0.8%    18  10.2216  0.00 UPDATE tbl8
#      MISC 0xMISC                     948.6902  4.0%    14  67.7636   0.0 <10 ITEMS>

Query #1 is generated by the qan-agent from PMM and runs approximately once a minute. These results will be handed over to PMM Server. Similarly, queries #2 and #3 are pretty simple: they scan just one row and will return either zero or one rows. They also use indexes, which makes me think that this is not caused by something within MySQL itself. I wanted to know if I could find any common aspect of all these occurrences.

Let’s take a closer look at the queries recorded in slow query log.

# grep -B3 DIGEST mysql-slow_Oct2nd_4th.log
....
....
# User@Host: ztrend[ztrend] @ localhost []  Id: 6431601021
# Query_time: 139.279651  Lock_time: 64.502959 Rows_sent: 0  Rows_examined: 0
SET timestamp=1538524947;
SELECT DIGEST, CURRENT_SCHEMA, SQL_TEXT FROM performance_schema.events_statements_history;
# User@Host: ztrend[ztrend] @ localhost []  Id: 6431601029
# Query_time: 139.282594  Lock_time: 83.140413 Rows_sent: 0  Rows_examined: 0
SET timestamp=1538524947;
SELECT DIGEST, CURRENT_SCHEMA, SQL_TEXT FROM performance_schema.events_statements_history;
# User@Host: ztrend[ztrend] @ localhost []  Id: 6431601031
# Query_time: 139.314228  Lock_time: 96.679563 Rows_sent: 0  Rows_examined: 0
SET timestamp=1538524947;
SELECT DIGEST, CURRENT_SCHEMA, SQL_TEXT FROM performance_schema.events_statements_history;
....
....

Now you can see two things.

  • All of them have the same Unix timestamp.
  • All of them spent a large portion of their execution time (up to ~70%) waiting for some lock.

Analyzing the data from pt-query-digest

Now I want to check if I can group the count of queries based on their time of execution. If multiple queries are captured into the slow query log at a given time, the human-readable time is printed only for the first of them, but the Unix timestamp (the SET timestamp line) is captured for every query, so I can rely on it to compute the counts. Luckily, without a long struggle, a combination of the grep and awk utilities displayed what I wanted.

# grep -A1 Query_time mysql-slow_Oct2nd_4th.log | grep SET | awk -F "=" '{ print $2 }' | uniq -c
2   1538450797;
1   1538524822;
3   1538524846;
7   1538524857;
167 1538524947;   ---> 72% of queries have happened at this timestamp.
1   1538551813;
3   1538551815;
6   1538602215;
1   1538617599;
33  1538631015;
1   1538631016;
1   1538631017;

You can use the command below to check the regular date time format of a given timestamp. So, Oct 3, 05:32 is when there was something wrong on the server:

# date -d @1538524947
Wed Oct 3 05:32:27 IST 2018

Query tuning can be carried out alongside this, but identifying such spots helps you avoid spending time on query tuning where badly written queries are not the problem. Having said that, from this point further troubleshooting may take different sub-paths, such as checking log files from that particular time, looking at CPU reports, reviewing past pt-stalk reports if they are set up to run in the background, checking dmesg, and so on. This approach is useful for identifying at what time (or time range) MySQL was most stressed, using just the slow query log, when no robust monitoring tool, like Percona Monitoring and Management (PMM), is deployed.
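For example, once the window is known, the matching slices of other logs can be pulled out with the usual tools (the paths and the exact syslog date format below are assumptions for illustration):

# grep 'Oct  3 05:3' /var/log/messages
# journalctl --since "2018-10-03 05:30" --until "2018-10-03 05:40"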

Using PMM to monitor queries

If you have PMM, you can review Query Analytics to see the top slow queries, along with details like execution counts, load, etc. Below is a sample screenshot for your reference:

Slow query log from PMM dashboard

NOTE: If you use Percona Server for MySQL, the slow query log can report query time in microseconds. It also supports extended logging of other statistics about query execution, which gives you extra insight into query processing. You can see more information about these options here.
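As a rough sketch of what enabling those options can look like in my.cnf on Percona Server (the values shown are examples only, not recommendations):

[mysqld]
slow_query_log       = ON
long_query_time      = 10
log_slow_verbosity   = full    # include extended statistics such as InnoDB and query plan details
log_slow_rate_limit  = 100     # if logging IO becomes a concern, log only 1 in every 100
log_slow_rate_type   = query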

Oct
11
2018
--

How to Fix ProxySQL Configuration When it Won’t Start

restart ProxySQL config

With the exception of the three configuration variables described here, ProxySQL will only parse the configuration files the first time it is started, or if the proxysql.db file is missing for some other reason.

If we want to change any of this data, we need to do so via ProxySQL’s admin interface and then save it to disk. That’s fine if ProxySQL is running, but what if it won’t start because of these values?

For example, perhaps we accidentally configured ProxySQL to run on port 3306 and restarted it, but there’s already a production MySQL instance running on this port. ProxySQL won’t start, so we can’t edit the value that way:

2018-10-02 09:18:33 network.cpp:53:listen_on_port(): [ERROR] bind(): Address already in use

We could delete proxysql.db and have it reload the configuration files, but that would mean any changes we didn’t mirror into the configuration files would be lost.

Another option is to edit ProxySQL’s database file using sqlite3:

[root@centos7-pxc57-4 ~]# cd /var/lib/proxysql/
[root@centos7-pxc57-4 proxysql]# sqlite3 proxysql.db
sqlite> SELECT * FROM global_variables WHERE variable_name='mysql-interfaces';
mysql-interfaces|127.0.0.1:3306
sqlite> UPDATE global_variables SET variable_value='127.0.0.1:6033' WHERE variable_name='mysql-interfaces';
sqlite> SELECT * FROM global_variables WHERE variable_name='mysql-interfaces';
mysql-interfaces|127.0.0.1:6033

Or if we have a few edits to make we may prefer to do so with a text editor:

[root@centos7-pxc57-4 ~]# cd /var/lib/proxysql/
[root@centos7-pxc57-4 proxysql]# sqlite3 proxysql.db
sqlite> .output /tmp/global_variables
sqlite> .dump global_variables
sqlite> .exit

The above commands will dump the global_variables table into a file in SQL format, which we can then edit:

[root@centos7-pxc57-4 proxysql]# grep mysql-interfaces /tmp/global_variables
INSERT INTO "global_variables" VALUES('mysql-interfaces','127.0.0.1:3306');
[root@centos7-pxc57-4 proxysql]# vim /tmp/global_variables
[root@centos7-pxc57-4 proxysql]# grep mysql-interfaces /tmp/global_variables
INSERT INTO "global_variables" VALUES('mysql-interfaces','127.0.0.1:6033');

Now we need to restore this data. We’ll use the restore command to empty the table (as we’re restoring from a missing backup):

[root@centos7-pxc57-4 proxysql]# sqlite3 proxysql.db
sqlite> .restore global_variables
sqlite> .read /tmp/global_variables
sqlite> .exit

Once we’ve made the change, we should be able to start ProxySQL again:

[root@centos7-pxc57-4 proxysql]# /etc/init.d/proxysql start
Starting ProxySQL: DONE!
[root@centos7-pxc57-4 proxysql]# lsof -i | grep proxysql
proxysql 15171 proxysql 19u IPv4 265881 0t0 TCP localhost:6033 (LISTEN)
proxysql 15171 proxysql 20u IPv4 265882 0t0 TCP localhost:6033 (LISTEN)
proxysql 15171 proxysql 21u IPv4 265883 0t0 TCP localhost:6033 (LISTEN)
proxysql 15171 proxysql 22u IPv4 265884 0t0 TCP localhost:6033 (LISTEN)
proxysql 15171 proxysql 23u IPv4 266635 0t0 TCP *:6032 (LISTEN)

While you are here

You might enjoy my recent post Using ProxySQL to connect to IPV6-only databases over IPV4

You can download ProxySQL from Percona repositories, and you might also want to check out our recorded webinars that feature ProxySQL too.
