Oct
26
2015
--

State of Percona Server 5.6, MySQL 5.6 and MySQL 5.7 RC

This week Oracle will release MySQL 5.7 GA, so it’s a perfect time to do a quick review of the current state of Percona Server 5.6.26, MySQL 5.6.26 and MySQL 5.7.8 RC. We used two boxes from our benchmark lab for this:
– Box 1: 16 Cores+HT (32 virt cores)/fast PCIe SSD card/RAM: 192GB
– Box 2: 24 Cores+HT (48 virt cores)/fast PCIe SSD card/RAM: 128GB

Dataset: sysbench/uniform, 32 tables with 12M rows each, ~95GB
Tests: sysbench – point select, oltp read only, oltp read/write
Test sequence: start server, warm up, then run a series of tests (each lasting 5 minutes) from 1 to 4096 threads (a sketch of the sysbench invocation follows the setup list below)

Tests were run for two setups:
– CPU bound (in memory) – innodb_buffer_pool_size=100GB
– IO bound – innodb_buffer_pool_size=25GB
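
For reference, here is a minimal sketch of the kind of sysbench (0.5, Lua-based) invocation used for a single test point; the script path, credentials and exact options here are assumptions, not the benchmark's actual command line:

# hypothetical example: one 5-minute OLTP read/write run at 64 threads
sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua \
  --oltp-tables-count=32 --oltp-table-size=12000000 \
  --mysql-user=root --mysql-password=password \
  --num-threads=64 --max-time=300 --max-requests=0 \
  --rand-type=uniform run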

[Chart: CPU-bound (in-memory) benchmark results]

[Chart: IO-bound benchmark results]

Observations:

* CPU bound
– It’s clear that MySQL 5.7 RC, in both read-only scenarios (ad-hoc and transactional), outperforms MySQL 5.6/Percona Server 5.6 and scales very well up to 4k threads, especially on Box 2 with 48 cores. It shows great improvements over 5.6 in read-only scalability. In the read-write scenario there are still some problems with 5.7 RC: it shows a stable result on the 16-core box, but degrades notably at high thread counts on Box 2 with 48 cores. Percona Server 5.6 holds up to 1024/2048 threads on both boxes, and then tps drops as well. MySQL 5.6 in this test scales only up to 512 threads, after which tps decreases dramatically.

In general, in the CPU-bound scenario, 5.7 RC on Box 1 with 16 cores showed a bit worse results than 5.6. It looks like it is limited by something, and this may require additional analysis. We will recheck that after GA.

* IO bound
– Again, 5.7 RC shines in read-only scenarios. On Box 1, Percona Server 5.6 competes with 5.7 RC, but on Box 2 with 48 cores the difference is quite notable at higher thread counts. The read/write workload in the IO-bound scenario is the most problematic case for 5.7 – it shows a pattern very similar to MySQL 5.6 on Box 1 and is slightly better on Box 2. We checked that case with Performance Schema for all three servers on each box, and according to it (see the charts below) the most notable waits for 5.7 are caused by the doublewrite mutex. MySQL 5.6 is affected by contention on the buffer pool mutex, and for Percona Server 5.6 the log_sys mutex is the hottest one.

[Charts: Performance Schema synchronization waits during OLTP_RW – Percona Server 5.6.26 / MySQL 5.7.8 / MySQL 5.6.26, one chart per box]

The mutex charts above are from the OLTP_RW test runs with 64 and 1024 threads for Percona Server 5.6.26 / MySQL 5.7.8 / MySQL 5.6.26.
MySQL server settings:

innodb_log_file_size=10G
innodb_doublewrite=1
innodb_flush_log_at_trx_commit=1
innodb_buffer_pool_instances=8
innodb_change_buffering=none
innodb_adaptive_hash_index=OFF
innodb_flush_method=O_DIRECT
innodb_flush_neighbors=0
innodb_read_io_threads=8
innodb_write_io_threads=8
innodb_lru_scan_depth=8192
innodb_io_capacity=15000
innodb_io_capacity_max=25000
loose-innodb-page-cleaners=4
table_open_cache_instances=64
table_open_cache=5000
loose-innodb-log_checksum-algorithm=crc32
loose-innodb-checksum-algorithm=strict_crc32
max_connections=50000
skip_name_resolve=ON
loose-performance_schema=ON
loose-performance-schema-instrument='wait/synch/%=ON'

The post State of Percona Server 5.6, MySQL 5.6 and MySQL 5.7 RC appeared first on MySQL Performance Blog.

Aug
24
2015
--

Advanced Query Tuning in MySQL 5.6 and MySQL 5.7 Webinar: Q&A

Thank you for attending my July 22 webinar titled “Advanced Query Tuning in MySQL 5.6 and 5.7” (my slides and a replay are available here). As promised, here is the list of questions and my answers (thank you for your great questions).

Q: Here is the explain example:

mysql> explain extended select id, site_id from test_index_id where site_id=1
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test_index_id
         type: ref
possible_keys: key_site_id
          key: key_site_id
      key_len: 5
          ref: const
         rows: 1
     filtered: 100.00
        Extra: Using where; Using index

Why is key_site_id a covering index for the query, given that a) we are selecting “id”, and b) key_site_id only contains site_id?

As the table is InnoDB, all secondary keys will always contain the primary key (“id”); in this case the secondary index contains all the information needed to satisfy the above query, and key_site_id will be a “covering index”.
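
To make that concrete, here is a minimal sketch (the CREATE TABLE is assumed from the question, not taken from the attendee's actual schema) of why the secondary index covers the query:

-- InnoDB secondary indexes implicitly store the primary key columns,
-- so the key_site_id entries are effectively (site_id, id)
CREATE TABLE test_index_id (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  site_id INT,
  KEY key_site_id (site_id)
) ENGINE=InnoDB;

-- "select id, site_id ... where site_id=1" can be answered from the
-- index alone, hence "Using index" in the EXPLAIN output above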

Q: Applications change over time. Do you suggest doing a periodic analysis of indexes that are being used and drop the ones that are not? If yes, any suggestions as to tackle that?

Yes, that is a good idea. Usually it can be done easily with Percona Toolkit or Performance Schema in MySQL 5.6:

  1. Enable the slow query log, log every query, then use the pt-index-usage tool (a sketch follows the query below)
  2. Or use the following query (as suggested by a FromDual blog post):
SELECT object_schema, object_name, index_name
  FROM performance_schema.table_io_waits_summary_by_index_usage
 WHERE index_name IS NOT NULL
   AND count_star = 0
 ORDER BY object_schema, object_name;
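
And for option 1, a minimal sketch of running pt-index-usage against a collected slow log (the paths and connection options here are assumptions; adjust them to your environment):

# hypothetical: parse the slow log and print ALTER/DROP suggestions
# for indexes that the logged queries never used
pt-index-usage /var/lib/mysql/slow.log --host localhost --user root --ask-pass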

Q: If a duplicate index is found on 5.6/5.7, will that cause a performance impact on the DB while querying?

Duplicate keys can have negative impact on selects:

  1. MySQL can get confused and choose a wrong index
  2. Total index size can grow, which can cause MySQL to run out of RAM

Q: What is the suggested method to measure performance on queries (other than the slow query log) so as to know where to create indexes?

The slow query log is the most common method. In MySQL 5.6 you can also use Performance Schema, specifically the events_statements_summary_by_digest table.
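
For example, a query along these lines (a sketch, not the only way to slice the table) pulls out the statement digests that consumed the most time:

-- timer columns are in picoseconds, so divide to get seconds
SELECT schema_name, digest_text, count_star,
       sum_timer_wait/1000000000000 AS total_latency_sec
  FROM performance_schema.events_statements_summary_by_digest
 ORDER BY sum_timer_wait DESC
 LIMIT 10;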

Q: I’m not sure if this was covered in the webinar but… are there any best-practices for fulltext indexes?

That was not covered in this webinar, however, I’ve done a number of presentations regarding Full Text Indexes. For example: Creating Geo Enabled Applications with MySQL 5.6

Q: What would be the limit on index size or the number of indexes you can define per table?

There are no limits on index size on disk; however, it is good (performance-wise) to have the active indexes fit in RAM.

In InnoDB there are a number of index limitations, i.e. a table can contain a maximum of 64 secondary indexes.

Q:  If a table has two columns you would like to sum, can you have that sum indexed as a calculated index? To add to that, can that calculated index have “case when”?

Just to clarify, this is only a feature of MySQL 5.7 (not released yet).

Yes, it is documented now:

CREATE TABLE triangle (
  sidea DOUBLE,
  sideb DOUBLE,
  sidec DOUBLE AS (SQRT(sidea * sidea + sideb * sideb))
);
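
For illustration, here is a hedged sketch of what the sum-plus-CASE case could look like in MySQL 5.7 (table and column names are made up, and since 5.7 is not GA yet the exact syntax and indexing support for generated columns may still change):

CREATE TABLE orders (
  price  DOUBLE,
  tax    DOUBLE,
  -- calculated sum column
  total  DOUBLE AS (price + tax),
  -- a CASE expression is allowed in the generated column definition
  bucket VARCHAR(10) AS (CASE WHEN price + tax > 100 THEN 'large' ELSE 'small' END),
  -- secondary index on the calculated column
  KEY idx_total (total)
);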

Q: I have noticed that you created indexes on columns like DayOfTheWeek with very low cardinality. Shouldn’t that be a bad practice normally?

Yes, you are right! Unless you are doing queries like “select count(*) from … where DayOfTheWeek = 7”, those indexes may not be very useful.

Q: I saw an article saying that if you don’t specify a primary key upfront, MySQL/InnoDB creates one in the background (hidden). Is it different from a primary key itself, especially if most of the fields used in WHERE clauses are not in the primary/semi-primary key? And is there a way to identify the tables with the hidden primary key indexes?

The “hidden” primary key will be 6 bytes, which will also be appended (duplicated) to all secondary keys. You can create an INT primary key auto_increment, which will be smaller (if you do not plan to store more than 4 billion rows). In addition, you will not be able to use the hidden primary key in your queries.

The following query (against information_schema) can be used to find all tables without declared primary key (with “hidden” primary key):

SELECT tables.table_schema, tables.table_name, tables.table_rows
FROM information_schema.tables
LEFT JOIN (
  SELECT table_schema, table_name
  FROM information_schema.statistics
  GROUP BY table_schema, table_name, index_name
  HAVING
    SUM(
      CASE WHEN non_unique = 0 AND nullable != 'YES' THEN 1 ELSE 0 END
    ) = COUNT(*)
) puks
ON tables.table_schema = puks.table_schema AND tables.table_name = puks.table_name
WHERE puks.table_name IS NULL
AND tables.table_type = 'BASE TABLE' AND engine='InnoDB'

You may also use the mysql.innodb_index_stats table to find tables with the hidden primary key (GEN_CLUST_INDEX):

Example:

mysql> select * from mysql.innodb_index_stats;
+---------------+------------+-----------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name      | last_update         | stat_name    | stat_value | sample_size | stat_description                  |
+---------------+------------+-----------------+---------------------+--------------+------------+-------------+-----------------------------------+
| test          | t1         | GEN_CLUST_INDEX | 2015-08-08 20:48:23 | n_diff_pfx01 | 96         | 1           | DB_ROW_ID                         |
| test          | t1         | GEN_CLUST_INDEX | 2015-08-08 20:48:23 | n_leaf_pages | 1          | NULL        | Number of leaf pages in the index |
| test          | t1         | GEN_CLUST_INDEX | 2015-08-08 20:48:23 | size         | 1          | NULL        | Number of pages in the index      |
+---------------+------------+-----------------+---------------------+--------------+------------+-------------+-----------------------------------+

Q: You are using ALTER TABLE to create an index, but how does MySQL sort the data for creating the index? Doesn’t it use a temp table for that?

That is a very good question: the behavior of the “alter table … add index” has changed over time. As documented in Overview of Online DDL:

Historically, many DDL operations on InnoDB tables were expensive. Many ALTER TABLE operations worked by creating a new, empty table defined with the requested table options and indexes, then copying the existing rows to the new table one-by-one, updating the indexes as the rows were inserted. After all rows from the original table were copied, the old table was dropped and the copy was renamed with the name of the original table.

MySQL 5.5, and MySQL 5.1 with the InnoDB Plugin, optimized CREATE INDEX and DROP INDEX to avoid the table-copying behavior. That feature was known as Fast Index Creation.

When MySQL uses the “Fast Index Creation” operation, it creates a set of temporary files in MySQL’s tmpdir:

To add a secondary index to an existing table, InnoDB scans the table, and sorts the rows using memory buffers and temporary files in order by the values of the secondary index key columns. The B-tree is then built in key-value order, which is more efficient than inserting rows into an index in random order.
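
In MySQL 5.6 you can also request the in-place, online behavior explicitly, so the statement fails instead of silently falling back to a table copy. A sketch with hypothetical table and column names:

-- build the secondary index in place, without blocking concurrent DML
ALTER TABLE t1 ADD INDEX idx_col1 (col1), ALGORITHM=INPLACE, LOCK=NONE;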

Q: How good is InnoDB deadlock handling in 5.7 compared to 5.6? Is that dependent on parameter setup?

A discussion of InnoDB deadlocks is outside the scope of this presentation. Valerii Kravchuk and Nilnandan Joshi did an excellent talk at Percona Live 2015 (slides available): Understanding InnoDB Locks and Deadlocks

Q: What is the performance impact of generating a virtual column for a table with 66 million records and generating the index? And how would you go about it? Do you have any suggestions on how to reorganize indexes on the physical disk?

As MySQL 5.7 is not released yet, the behavior of virtual columns may change. The main question here is whether these will be online operations: a) adding a virtual column (as this is only a metadata change, it should be a very light operation anyway) and b) adding an index on that virtual column. In the labs release it was not online; however, this can change.

Thank you again for attending.

The post Advanced Query Tuning in MySQL 5.6 and MySQL 5.7 Webinar: Q&A appeared first on Percona Data Performance Blog.

Aug
19
2015
--

How much could you benefit from MySQL 5.6 parallel replication?

I have heard this question quite often: “At busy times, our replicas start lagging quite frequently. We are using N schemas, so what performance boost could we expect from MySQL 5.6 parallel replication?” Here is a quick way to give you a rough estimate of the potential benefit.

General idea

In MySQL 5.6, parallelism is added at the schema level. So in theory, if you have N schemas and if you use N parallel threads, replication could be up to N times faster. This assumes at least 2 things:

  • Replication throughput scales linearly with the number of parallel threads.
  • Writes are evenly distributed across schemas.

Both assumptions are of course not realistic. But it is easy to know the distribution of writes, and that can already give you an idea about how much you could benefit from parallel replication.

Writes are stored in binary logs but it is much easier to work with the slow query log, so we can enable full slow query logging for some time with long_query_time = 0 and then use pt-query-digest to analyze the resulting log file.
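
For example (a sketch; remember that full logging adds overhead, and the global long_query_time only applies to connections opened after the change):

-- capture every statement for a while, then restore your previous settings
SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 0;
-- ... let it run through a busy period, then revert, e.g.:
SET GLOBAL long_query_time = 10;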

An example

I have a test server with 3 schemas, and I’ve run some sysbench load on it to get a decent slow query log file. Once done, I can run this command:

pt-query-digest --filter '$event->{arg} !~ m/^select|^set|^commit|^show|^admin|^rollback|^begin/i' --group-by db --report-format profile slow_query.log > digest.out

and here is the result I get:

# Profile
# Rank Query ID Response time  Calls  R/Call V/M   Item
# ==== ======== ============== ====== ====== ===== ====
#    1 0x       791.6195 52.1% 100028 0.0079  0.70 db3
#    2 0x       525.1231 34.5% 100022 0.0053  0.68 db1
#    3 0x       203.4649 13.4% 100000 0.0020  0.64 db2

In a perfect world, with 3 parallel threads and if each schema would handle 33% of the total write workload, I could expect a 3x performance improvement.

However here we can see in the report that the 3 replication threads will only work simultaneously 25% of the time in the best case (13.4/52.1 = 0.25). We can also expect 2 replication threads to work simultaneously for some part of the workload, but let’s ignore that for clarity.

It means that instead of the theoretical 200% performance improvement (3 parallel threads 100% of the time), we can hardly expect more than a 50% performance improvement (3 parallel threads 25% of the time). And the reality is that the benefit will be much lower than that.

Conclusion

Parallel replication in MySQL 5.6 is a great step forward, however don’t expect too much if your writes are not evenly distributed across all your schemas. The pt-query-digest trick I shared can give you a rough idea whether your workload is a good fit for multi-threaded slaves in 5.6.

I’m expecting much better results for 5.7, partly because parallelism is handled differently, but also because you can tune how efficient parallel replication will be by adjusting the binlog group commit settings.

The post How much could you benefit from MySQL 5.6 parallel replication? appeared first on MySQL Performance Blog.

Aug
18
2015
--

Featured Talk: The Future of Replication is Today: New Features in Practice

In the past few years, MySQL 5.6, MySQL 5.7 and MariaDB 10 have all been successful in implementing new features. For many DBAs, the “old way” of replicating data is comfortable, so taking action to implement these new features seems like a momentous leap rather than a simple step. But perhaps it isn’t that complicated…

Giuseppe Maxia, a Quality Assurance Architect at VMware and loyal member of the Percona Live Conference Committee, will be presenting “The Future of Replication is Today: New Features in Practice” at the Percona Live Data Performance Conference this September in Amsterdam.
Percona’s Community Manager, Tom Diederich had an opportunity to catch up with Giuseppe last week and get an in-depth look at some of the items Giuseppe will be covering in his talk in addition to getting his take on some of the hot sessions to hit while at the conference.  This is how it went:

(Hint: Read to the end to find a special discount code) 

 

Tom: Your talk is titled, “The Future of Replication is today: new features in practice.” What are the top 3 areas in which replication options have improved in MySQL 5.6, MySQL 5.7, and MariaDB 10?
Giuseppe: Replication has been stagnant for over 10 years. Before MySQL 5.6, the only important change in the technology was the introduction of row-based replication in 2008. After that, we had to wait until 2013 to see global transaction identifiers in MySQL 5.6, followed by the same feature, with a different implementation, in 2014 with MariaDB 10. GTID has been complemented, in both flavors, with crash-safe replication tables, a feature that guarantees a reliable resumption of replication after a server failure. There is also the parallel applier, a minor feature that has been implemented in both MySQL 5.6 and MariaDB and improved in the latest versions, although it seems to lack proper support for monitoring. The last feature that was introduced in MySQL 5.6 and MariaDB 10 is multi-source replication, i.e. the ability to replicate from multiple masters to a single slave. In both editions the implementation is quite simple, and not so different from what DBAs are used to doing for regular replication.
Tom: For DBAs, how difficult will it be to make the change from the “old way” of replicating data — to stop using the same comfortable features that have been around for several years — and put into practice some of the latest features?
Giuseppe: The adoption of new features can be deceptively simple. For example, GTID in MariaDB comes out of the box and its adoption could be as easy as running a backup followed by a restore, but it can produce unpleasant results if you try to combine this feature with multi-source replication without planning ahead. That said, the transition could be simpler than its counterpart in MySQL.
MySQL 5.6 and 5.7 require some reconfiguration to run GTID, and users can face unpleasant failures due to the complexity of the rules applying to this feature. They will need to read the manual thoroughly and test the deployment extensively before trusting an upgrade in production.
For multi-source replication, the difficulties are, in my experience, hidden in users’ expectations. When speaking about multi-source (or multi-master, as it is commonly referred to), many users have the mistaken expectation that they can easily insert anything into multiple masters as if they were doing it on a single server. However, the nature of asynchronous replication and the current implementation of multi-source topologies do not handle conflicts, and this fact will probably surprise and anger early adopters.
Tom: What is still missing in replication technology? How can MySQL improve?
Giuseppe: There are two areas where the current implementation is lacking. The first one is monitoring data: while new features have been added to replication, not enough effort has been made to cover the monitoring needs. The current way of monitoring replication is hard-wired around the original replication feature, and little has been done to give users a deeper view of what is going on. With the latest releases at our disposal, we can run parallel replication using multiple masters, and yet we have very little visibility into what goes on inside the dozens of threads that the new features can unchain inside a single slave. It’s like driving an F1 racing car with the dashboard of a Ford Model T. MySQL 5.7 has moved a few steps in that direction, with the new replication tables in performance_schema, but it is still a drop in the ocean compared to what we need.
The second area where replication is still too much tied with its past is in heterogeneous replication. While relational databases are still dominating the front-end of the web economy, its back-end is largely being run by different structures, such as Hadoop, MongoDB, Cassandra. Moving data back and forth between the relational storage and its growing siblings has become an urgent need. There have been a few sparks of change in this direction, but nothing that can qualify as promising changes.
Tom: Which other session(s) are you most looking forward to besides your own?
Giuseppe: I am always interested in the sessions that explain and discuss new features. I am most interested in the talks by Oracle engineers, who have been piling up many features in the latest years, and I am sure they have something more up their sleeve that will appear at the conference. I also eagerly attend sessions about complementary tools, which are usually highly educational and often give me more ideas.

Want to read more on the topic? Visit Giuseppe’s blog:

 MySQL Replication Monitoring 101

The Percona Live Data Performance Conference is the premier event for the rich and diverse MySQL, NoSQL and data in the cloud ecosystems in Europe. It is the place to be for the open source community as well as businesses that thrive in the MySQL, NoSQL, cloud, big data and IoT (Internet of Things) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

This year’s conference will feature one day of tutorials and two days of keynote talks and breakout sessions related to MySQL, NoSQL and Data in the Cloud. Attendees will get briefed on the hottest topics, learn about building and maintaining high-performing deployments and hear from top industry leaders.

The Percona Live Europe Data Performance Conference will be September 21-23 at the Mövenpick Hotel Amsterdam City Centre.

Register using code “FeaturedTalk” and save 20 euros off of registration!

Hope to see you in Amsterdam!

The post Featured Talk: The Future of Replication is Today: New Features in Practice appeared first on MySQL Performance Blog.

Dec
24
2014
--

InnoDB crash recovery speed in MySQL 5.6

It has been a while since I have looked at InnoDB crash recovery. A lot has changed in the last few years – we have serious crash recovery performance improvements in MySQL 5.5 and MySQL 5.6, we have solid state drives rising as the typical high-performance IO subsystem, and we also have the ability to set much larger log files and often have a much larger InnoDB buffer pool to work with.

First let me revisit the challenge we have with InnoDB configuration. For write-intensive workloads it is extremely important to size innodb_log_file_size for good performance; however, the larger the log file size, the longer you might have to wait for InnoDB to complete crash recovery, which impacts your recovery strategy.

How much can innodb_log_file_size impact performance? Massively! Doing intensive writes to a database that fits well in memory, I’d say there’s a 10x difference between a combined log file size of 32GB vs 64MB.

Before we look at some performance numbers let’s talk about what InnoDB Crash Recovery time depends on:

Combined InnoDB log file size – innodb_log_file_size * innodb_log_files_in_group is what really matters. It does not really matter which of those two you change. I prefer to keep innodb_log_files_in_group at its default and only work with innodb_log_file_size. The larger the size you have allocated, the longer recovery will take.

innodb_checkpoint_age – The combined size of the InnoDB log files defines how many changes not yet reflected in the tablespace we may have, while innodb_checkpoint_age shows how many changes we actually have at the current moment, so it is the actual driving factor of recovery time. If you have very large log files allocated but innodb_checkpoint_age stays low for your workload, chances are recovery will be quick. Be careful, however – intensive writes can cause innodb_checkpoint_age to go much higher than the average for your workload, causing recovery from a crash at that time to take much longer.
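
If you want to check where your workload sits, here is a quick sketch (Percona Server exposes the value as a status counter; on stock MySQL you have to compute it from SHOW ENGINE INNODB STATUS):

-- Percona Server:
SHOW GLOBAL STATUS LIKE 'Innodb_checkpoint_age';
-- Stock MySQL: checkpoint age = "Log sequence number" - "Last checkpoint at"
SHOW ENGINE INNODB STATUS\G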

InnoDB buffer pool size – This is another very important factor. During recovery, InnoDB has to redo changes to the unflushed/dirty pages from the buffer pool, which is obviously limited by the buffer pool size. This also means innodb_max_dirty_pages_pct can be used to impact recovery speed, as the number of dirty pages is the true driving factor. With a small buffer pool (and hence a limited number of dirty pages for your workload), innodb_checkpoint_age might never go high even if you have allowed for a large log space.

Data structure matters a lot for recovery speed. Generally, updates to shorter rows mean longer recovery time for the same log file size. This should make sense, as shorter row changes mean less log space is produced for the same number of page changes. If you do a lot of blob writes, InnoDB crash recovery can be short even with relatively large log files.

Access pattern is another key factor – the more “random” the access is, the more distinct pages you will have touched for the same innodb_checkpoint_age, and the longer recovery can take.

Hardware – Better hardware means recovery goes faster, as is obvious. More specifically, you will be looking for storage performance at low concurrency (both reads and writes are important) as well as fast CPU cores – crash recovery at this point is not able to use multiple cores effectively.

Let’s now look at the test.
I am running sysbench on an 11GB table, designed to fit in the 12GB buffer pool. Here is the exact command:

sysbench  --tx-rate=4000 --num-threads=64 --report-interval=10 --max-time=0 --max-requests=0 --rand-type=uniform --oltp-table-size=40000000 --mysql-user=root --mysql-password=password  --test=/usr/share/doc/sysbench/tests/db/update_index.lua run

The box is a rather low-end i3-4010U (4 threads) CPU with a Samsung EVO 840GB SSD, so numbers are expected to be higher on real server hardware.

In my first test I’m injecting 4,000 updates/sec, which is about half of what the box can do at a sustained load. I do this to illustrate a more common load scenario, as we rarely run systems at their saturation point in the real world. The uniform distribution should mean a worst-case scenario for in-memory workloads, though I think recovery would be slower if you use random writes to a database much larger than the amount of memory.

At this workload I’m getting an innodb_checkpoint_age of 15GB, even though the total log file size is 32GB. Crashing the system results in about 40 minutes of recovery time, so the log was processed at about 6.25MB/sec.

Here are some interesting graphs:

As you can see, recovery is essentially single-core. It can also be CPU bound at times (and will be more so with faster storage) – at certain points of recovery, when logs are being scanned, it can be completely CPU bound (see how IO wait essentially goes to zero at some points).

Over time as recovery progresses more and more blocks become cached, so they do not have to be read from the disk for log records to be applied, meaning the workload becomes more and more write bound.

This is an unweighted IO utilization graph, where 1000 corresponds to 100% of the time with at least one IO request outstanding. As you can see from this and the previous graph, InnoDB does not keep the IO well saturated all the time during crash recovery.

In addition to the first sysbench crash test I did two more. One ran the system completely saturated with updates. This caused innodb_checkpoint_age to go as high as 23.2GB, and crash recovery took 1 hour 20 minutes, showing about 4.8MB/sec of log processing speed. The thing to consider in this case is that MySQL was not able to keep up with purging the history, so it was growing quickly, meaning crash recovery had to cover a lot of undo space modifications.

Finally, I also did a run with a more skewed (pareto) distribution, which resulted in a 9.8GB innodb_checkpoint_age, a 33-minute crash recovery time, and 4.94MB/sec of log processing speed.

As I explained above there are a lot of moving parts so your numbers are likely to be quite different, yet I hope this can provide some reasonable baseline you can use for calculation.

Note also that waiting for the server to recover from the crash is only one way to deal with recovery. Even if you size log files to be very small, you will likely need to deal with operating system boot and buffer pool warmup, which will take a few minutes. It is often much better to use a different primary method of crash recovery, such as failover to a MySQL replication slave or using Percona XtraDB Cluster. If you use these methods you can often use quite a large combined InnoDB log file size to optimize for performance.

Final thoughts: Even though InnoDB crash recovery has improved in MySQL 5.5 and MySQL 5.6, there is still room to improve it even more. As we can see from the resource usage graphs during recovery, there is an opportunity to both use multiple CPU cores more effectively and drive the IO subsystem with higher concurrency and in a more sustained fashion.

The post InnoDB crash recovery speed in MySQL 5.6 appeared first on MySQL Performance Blog.

Nov
25
2014
--

(More) Secure local passwords in MySQL 5.6 and up

I log into a lot of different servers running MySQL and one of the first things I do is create a file in my home directory called ‘.my.cnf’ with my credentials to that local mysql instance:

[client]
user=root
password=secret

This means I don’t have to type my password in every time, nor am I tempted to include it on the command line with -p and get the dreaded (but completely accurate):

Warning: Using a password on the command line interface can be insecure.

MySQL 5.6 introduces a utility to make this easier and more secure. First, let’s start with a new mysqld instance with a blank root password and make it more secure:

[vagrant@localhost ~]$ mysqladmin -u root password
New password:secret
Confirm new password:secret
[vagrant@localhost ~]$ mysql -u root
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)

Ok, so I’ve added a password, now I want to create my .my.cnf file:

[vagrant@localhost ~]$ mysql_config_editor set --user=root --password
Enter password:secret
[vagrant@localhost ~]$ mysql
Welcome to the MySQL monitor.  Commands end with ; or g.
Your MySQL connection id is 10
Server version: 5.6.21-70.0 Percona Server (GPL), Release 70.0, Revision 688
Copyright (c) 2009-2014 Percona LLC and/or its affiliates
Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or 'h' for help. Type 'c' to clear the current input statement.
mysql>

What did ‘mysql_config_editor set’ actually do? It creates a .mylogin.cnf file (which stands in for a .my.cnf) in my home directory that contains my credentials, just in encrypted form:

[vagrant@localhost ~]$ ls -alh .my*
-rw-------. 1 vagrant vagrant 100 Nov 19 16:20 .mylogin.cnf
-rw-------. 1 vagrant vagrant  29 Nov 19 16:20 .mysql_history
[vagrant@localhost ~]$ cat .mylogin.cnf
??>NTv?&?S???/?,	>?$%KZ 9i?V?jK?H[???
           k.
[vagrant@localhost ~]$ mysql_config_editor print
[client]
user = root
password = *****

The mysql client picks this up right away and will use it by default. This file has good default filesystem permissions, is local to my homedir, and is a whole lot better than specifying it on the command line or typing it in every time.

This utility also supports a feature called ‘login-path’ wherein you can add multiple MySQL logins (perhaps to different servers) and refer to them with the --login-path option in the mysql client:

[vagrant@localhost ~]$ mysql_config_editor set --login-path=remote --host=remote --user=remote --password
Enter password:secure
[vagrant@localhost ~]$ mysql_config_editor print --all
[client]
user = root
password = *****
[remote]
user = remote
password = *****
host = remote
[vagrant@localhost ~]$ mysql --login-path=remote
ERROR 2005 (HY000): Unknown MySQL server host 'remote' (0)

The ‘remote’ host doesn’t exist here, but you get the idea. You can create as many login-paths as you want with varied hostnames, credentials and other login parameters and quickly access them with any client supporting login-path.
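
If you later want to drop one of the stored credentials, or wipe the whole .mylogin.cnf file, mysql_config_editor handles that too. For example:

[vagrant@localhost ~]$ mysql_config_editor remove --login-path=remote
[vagrant@localhost ~]$ mysql_config_editor reset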

Now, how secure is this really?  This isn’t secure from anyone who roots your DB server.  I would say the benefits are more about reducing careless password storage and tidier management of local credentials.

The post (More) Secure local passwords in MySQL 5.6 and up appeared first on MySQL Performance Blog.

Nov
20
2014
--

Sys Schema for MySQL 5.6 and MySQL 5.7

Performance Schema (P_S) has been available since MySQL 5.5, more than 4 years ago. It is still difficult to see production servers with P_S enabled, even with MySQL 5.6, where the feature is enabled by default. There have been several complaints, for example about the overhead (fixes are a work in progress) and about ease of use. 52 tables to query and 31 configuration variables is enough to scare people.

There is a solution for the second problem, usability. Its name is “sys schema”. It is a collection of views, functions and procedures to make P_S more user-friendly.

Installation

If you are a MySQL Workbench user, the installation is pretty easy because sys schema is already included. You just need to install it: click on “Performance – Performance Reports” and you will find the “Install Helper” button that will install sys schema.

[Screenshot: sys schema Install Helper in MySQL Workbench]

If you don’t use MySQL Workbench, you need to download sys_56.sql or sys_57.sql (depending on whether you use 5.6 or 5.7) from the GitHub repository. Then just import the SQL file as usual:

mysql -u root -p < ./sys_56.sql

Usage

After the import, you will have a new “sys” schema with some very descriptively named views. Let’s see an example. Do you want to know which tables are using most of the InnoDB buffer memory? Easy:

mysql> select * from sys.innodb_buffer_stats_by_table;
+---------------+--------------------+-----------+-----------+-------+--------------+-----------+-------------+
| object_schema | object_name        | allocated | data      | pages | pages_hashed | pages_old | rows_cached |
+---------------+--------------------+-----------+-----------+-------+--------------+-----------+-------------+
| test          | t                  | 63.61 MiB | 58.06 MiB |  4071 |         4071 |      4071 |     2101222 |
| InnoDB System | SYS_FOREIGN        | 32.00 KiB | 0 bytes   |     2 |            2 |         2 |           0 |
| InnoDB System | SYS_TABLES         | 32.00 KiB | 1.11 KiB  |     2 |            2 |         2 |          10 |
| InnoDB System | SYS_COLUMNS        | 16.00 KiB | 4.68 KiB  |     1 |            1 |         1 |          71 |
| InnoDB System | SYS_DATAFILES      | 16.00 KiB | 324 bytes |     1 |            1 |         1 |           6 |
| InnoDB System | SYS_FIELDS         | 16.00 KiB | 722 bytes |     1 |            1 |         1 |          17 |
| InnoDB System | SYS_INDEXES        | 16.00 KiB | 836 bytes |     1 |            1 |         1 |          12 |
| InnoDB System | SYS_TABLESPACES    | 16.00 KiB | 318 bytes |     1 |            1 |         1 |           6 |
| mysql         | innodb_index_stats | 16.00 KiB | 274 bytes |     1 |            1 |         1 |           3 |
| mysql         | innodb_table_stats | 16.00 KiB | 53 bytes  |     1 |            1 |         1 |           1 |
+---------------+--------------------+-----------+-----------+-------+--------------+-----------+-------------+

Pretty easy and useful, right? You can also find out which database is using the most memory in the buffer pool by querying innodb_buffer_stats_by_schema.
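
For example (a sketch; the exact columns may differ slightly between sys schema versions):

mysql> select * from sys.innodb_buffer_stats_by_schema;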

For each view there is a similar one whose name starts with x$. For example, you have user_summary_by_file_io and x$user_summary_by_file_io. The difference is that the first one has human-readable output values while the second one has the raw values. Let’s see an example:

mysql> select * from sys.user_summary_by_file_io;
+------------+-------+------------+
| user       | ios   | io_latency |
+------------+-------+------------+
| root       | 19514 | 2.87 s     |
| background |  5916 | 1.91 s     |
+------------+-------+------------+
2 rows in set (0.00 sec)
mysql> select * from sys.x$user_summary_by_file_io;
+------------+-------+---------------+
| user       | ios   | io_latency    |
+------------+-------+---------------+
| root       | 19514 | 2871847094292 |
| background |  5916 | 1905079715132 |
+------------+-------+---------------+

For humans, at least for me, it is easier to read seconds rather than picoseconds :)

There are multiple views with very descriptive names, for example:

– io_by_thread_by_latency
– schema_unused_indexes
– statements_with_temp_tables
– statements_with_errors_or_warnings
– user_summary_by_statement_type
– waits_by_host_by_latency

There are a lot more, and they are explained with examples in the project’s README file.

Configuration

On the MySQL side nothing special is needed. Just enable performance_schema:

performance_schema="on"

sys schema also provides some procedures to enable/disable some features. For example:

– ps_setup_enable_background_threads
– ps_setup_enable_consumers
– ps_setup_enable_instrument
and so on…

We also have the same procedures with “disable”. After you have made the changes you can save them by calling ps_setup_save() and reload them later, if you want, by calling ps_setup_reload_saved(). If you want to reset the configuration to default values, just call ps_setup_reset_to_default().

For example, we can check that some consumers are disabled. It is possible to enable them and save the configuration:

mysql> CALL sys.ps_setup_show_disabled_consumers();
+--------------------------------+
| disabled_consumers             |
+--------------------------------+
| events_stages_current          |
| events_stages_history          |
| events_stages_history_long     |
| events_statements_history      |
| events_statements_history_long |
| events_waits_current           |
| events_waits_history           |
| events_waits_history_long      |
+--------------------------------+
mysql> CALL sys.ps_setup_enable_consumers('events');
+---------------------+
| summary             |
+---------------------+
| Enabled 8 consumers |
+---------------------+
mysql> CALL sys.ps_setup_show_disabled_consumers();
Empty set (0.00 sec)
mysql> CALL sys.ps_setup_save(TRUE);

Conclusion

Performance Schema is very useful when we want to know what is happening inside the database. It is getting more features with each new GA release and will probably be the single point of information in the near future. Now, thanks to sys schema, it is also easy to use.

The post Sys Schema for MySQL 5.6 and MySQL 5.7 appeared first on MySQL Performance Blog.

Oct
29
2014
--

Facebook MySQL database engineers ready for Percona Live London 2014

With 1.28 billion active users, Facebook MySQL database engineers are active and extremely valuable contributors to the global MySQL community. So naturally they are also active participants of Percona Live MySQL conferences! And next week’s Percona Live London 2014 (Nov. 3-4) is no exception. (Register now and use the promotional code “Facebook” to save £30!)

I spoke with Facebook database engineers Yoshinori “Yoshi” Matsunobu and Shlomo Priymak about their upcoming sessions along with what’s new at Facebook since our last conversation back in April.


Tom: Yoshi, last year Facebook deployed MySQL 5.6 on all production environments – what have you and your team learned since doing that? And do you have a few best practices you could share? I realize you’ll be going into detail during your session in London (MySQL 5.6 and WebScaleSQL at Facebook), but maybe a few words on a couple of the bigger ones?

Yoshi: MySQL 5.6 has excellent replication enhancements to use in large-scale deployments. For example, crash safe slave makes it possible to recover without rebuilding a slave instance on server crash. This can greatly minimize slave downtime, especially if your database size is large. There are many other new features such as GTID, multi-threaded slave, streaming mysqlbinlog and we actively use them in production.

For InnoDB, Online DDL is a good example to ease operations. Many MySQL users are doing schema changes by switching masters. This can minimize downtime but requires operational efforts. Online DDL made things much easier.

Tom: Facebook is an active and extremely valuable part of the overall MySQL community and ecosystem – what are some of the key features and improvements you’ve contributed in the past year since moving to MySQL 5.6?

Yoshi: For InnoDB, I think online defragmentation and faster full table scan are the most valuable contributions from Facebook in 5.6. I have received very positive feedback about faster InnoDB full table scan (Logical ReadAhead). My colleague Rongrong will speak about something interesting regarding online defragmentation at Percona Live London. For Replication, we have done many optimizations to make GTID and MTS work without pain. Semi-Synchronous mysqlbinlog and backported Loss-Less semisync from MySQL 5.7 are very useful when you use Semi-Synchronous replication.

Tom: Shlomo, your sesson, “MySQL Automation at Facebook Scale,” will be of great interest to DBAs at large and growing organizations considering that Facebook has one of the world’s largest MySQL database clusters. What are the two or three most significant things that you’ve learned as a database engineer operating a cluster of this size? And has anything surprised you along the way (so far)?

Shlomo: This is a great question! We like to speak of “10x” at Facebook when thinking of scaling. For example, what would you do differently if the number of servers you had was 10x more than what it is? This type of mental exercise is surprisingly useful when working with systems at scale. If you, or any of the readers, try to extrapolate this about systems you manage, there will be things you’ll be imagining about how a system like this would be – and you won’t be too far from our reality in many aspects.

You’d imagine that we automate much of the single units of work, like master/slave failover, upgrades and schema changes. You’d suspect we have automated fault detection, self managing systems, good alarming and self remediation. You’d presume that if you’re used to running a command on 100 machines, you’ll now be running it on 1000. At least that’s what I thought to myself, so these are not the things that surprised me. There are a few fundamental shifts in one’s thinking when you get to these sizes, which I didn’t foresee.

The first one is that there is absolutely no such thing as “one-off.” If there is a server somewhere that hits a problem every three years, and you have 1000 servers, this will be happening daily! Take it to 10,000 servers, and you can see absolutely nothing is a “one-off”. We can’t write things off as “worst case, I’ll get an SMS.” Whatever it is, we have to chase it down and fix it. Not just that – to deploy a fix at scale can require writing fairly large amounts of code, a fix that could be deployed manually by a DBA in smaller environments.

The second one is adapting to constraints which are very pragmatic and tangible. If you’re on AWS, you’re pretty much isolated from things like worrying where your servers are physically located, when they go over their lifetime, and if the firmware on the switch in the rack needs to be upgraded. If you’re a small shop and have a few racks up in a co-lo, hardware maintenance is just not as frequent, but it becomes more painful as you grow.

At Facebook, we run our own datacenters! We need to work around interesting challenges, such as running datacenters that have highly variable compositions of server hardware. Since we have so many servers, something is always going on. Racks of servers need to be moved. Whole clusters need to be rebuilt or refreshed, to be made better, faster, stronger.  New datacenters are constructed, others decommissioned.

Tom: And this is where automation comes into the picture, right?

Shlomo: We have had to build a lot of automation to make these operations seamless, and we work closely with the Site Ops teams on the ground to coordinate these logistically complicated processes.

Another thing my team does in this space is planning capacity and hardware purchases. Since we build our own servers, the turnaround time between ordering and getting machines is quite long, so proper planning is paramount. Buy too much, and you’ve wasted millions of dollars. Buy too few servers, and there won’t be space for user growth and upcoming projects. The sheer scale makes these decisions more complicated and involved.

These things have actually made my job much more interesting, and I think I’d find it hard to adjust to a smaller environment.

Tom: Last April Facebook announced a move to the newly created WebScaleSQL. Yoshi, do you have an update on where WebScaleSQL is today? And I know it’s early, but has there been any impact on Facebook yet?

Yoshi: WebscaleSQL is a collaboration among engineers from several companies that face similar challenges in running MySQL at scale. Collaboration is nothing new to the MySQL community. The intent is to make this collaboration more efficient.

We are based on the latest upstream (currently MySQL-5.6.21), and added many features. We added patches to improve InnoDB performance around compression LRU flushing, locking, NUMA Support, and doublewrite. We statically link Semi-Sync based on lessons learned at Facebook environments (plugin-lock caused hot mutex contentions). We have many upcoming features such as async clients.

We will continue to track the upstream branch that is the latest, production-ready release (currently MySQL 5.6). We are continuing to push the generally useful changes we have from all of the participants.  If you think you have something to contribute, get in touch!

Tom: I remember being surprised earlier this year when you told me there was usually just one MySQL Operations team member on call at any given time thanks to “robots.” How many robots did your team build and what do they do? Oh, and should rank-and-file DBAs around the world be worried about losing their day jobs to these robots? ;-)

Shlomo: Instead of becoming obsolete as some fear, our team is shifting its focus from smaller to larger problems, as we rise higher in the levels of abstraction. Our team has progressed with the requirements of the role. From being a team of DBAs that automate some of their work, we have become more like Production Engineers. We design, write and maintain MySQL/Facebook-specific automation that does our work for us.

While we build these software “robots” to do our work, we also have to maintain them. The job of the oncall is to fix these robots when they malfunction, and that can sometimes be difficult due to the size of our codebase.

In regards to employment concerns, I’d say our work has become more interesting, and the amount has increased. It definitely did not decrease, so if Facebook is indicative of other companies, jobs are not at risk just yet. Speaking of jobs – if what we’re doing sounds interesting, we’re hiring!

Oh, and as for details about these “robots” – that’s the topic of my talk next week in London. Come and hear me speak if you want to know more!

Tom: Yoshi, you also will host a session titled “Fast Master Failover without Data Loss.” I don’t want to give too much away, but how did you get failover to work at scale – across vast datacenters?

Yoshi: Master failure is a norm at Facebook, because of the large amount of servers. Without automation, it is not realistic for a limited number of people to manage. We have a very interesting infrastructure to automate failure handling at Facebook scale. To automate stuff, reliability is important. Unreliable automation makes engineers spend lots of time fixing things manually, and that increases downtime. It is also important to define what to automate and what we shouldn’t automate. Define failure scenarios and write good test cases and continuously integrate. There are multiple failure scenarios like the ones below and you’ll hear about each in detail at my session:

– mysqld crash
– mysqld stalls
– kernel panic and reboot
– error spikes caused by H/W failure
– error spikes caused by bad application logic
– rack switch down
– multiple rack switches down
– datacenter down

Tom: What other sessions, keynotes or events are you looking forward to at Percona Live London 2014? And are you guys planning on attending the MySQL Community Dinner?

Yoshi: “MySQL 5.7: Performance and Scalability Benchmark” (led by Oracle MySQL performance architect Dimitri Kravtchuk). And yes, we’re looking forward to meeting people at the MySQL Community Dinner!

Tom:  Thanks again Yoshi and Shlomo for taking the time to speak with me and I look forward to seeing you both in London next week!

And readers, I invite you to register now for Percona Live London using the promotional code “Facebook” to save £30. I also hope to see you at the MySQL Community Dinner next Monday (Nov. 3). Space is limited so be sure to reserve your spot now and join us aboard our private double-decker bus to the restaurant.

I’d also like to thank the Percona Live London 2014 Conference Committee for putting together a terrific event this year! The conference committee includes:

  • Dailymotion’s Cédric Peintre, conference chairman
  • Percona’s David Busby
  • MariaDB’s Colin Charles
  • ebay Classifieds Group’s Luis Motta Campos
  • Booking.com’s Nicolai Plum
  • Oracle’s Morgan Tocker
  • Spil Games’ Art van Scheppingen

The post Facebook MySQL database engineers ready for Percona Live London 2014 appeared first on MySQL Performance Blog.

Oct
27
2014
--

How to deal with MySQL deadlocks

A deadlock in MySQL happens when two or more transactions mutually hold and request locks, creating a cycle of dependencies. In a transactional system, deadlocks are a fact of life and not completely avoidable. InnoDB automatically detects transaction deadlocks, rolls back a transaction immediately and returns an error. It uses a metric to pick the easiest transaction to roll back. Though an occasional deadlock is not something to worry about, frequent occurrences call for attention.

Before MySQL 5.6, only the latest deadlock could be reviewed, using the SHOW ENGINE INNODB STATUS command. But with Percona Toolkit’s pt-deadlock-logger you can have deadlock information retrieved from SHOW ENGINE INNODB STATUS at a given interval and saved to a file or table for later diagnosis. For more information on using pt-deadlock-logger, see this post. With MySQL 5.6, you can enable the new variable innodb_print_all_deadlocks to have all deadlocks in InnoDB recorded in the mysqld error log.
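
For example, a quick sketch of enabling it (innodb_print_all_deadlocks is dynamic in 5.6, so no restart is needed):

mysql> SET GLOBAL innodb_print_all_deadlocks = ON;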

Before and above all diagnosis, it is always an important practice to have applications catch the deadlock error (MySQL error no. 1213) and handle it by retrying the transaction.

How to diagnose a MySQL deadlock

A MySQL deadlock could involve more than two transactions, but the LATEST DETECTED DEADLOCK section only shows the last two transactions. It also only shows the last statement executed in each of the two transactions, and the locks from the two transactions that created the cycle. What is missing are the earlier statements that might have actually acquired the locks. I will show some tips on how to collect the missing statements.

Let’s look at two examples to see what information is given. Example 1:

1 141013 6:06:22
2 *** (1) TRANSACTION:
3 TRANSACTION 876726B90, ACTIVE 7 sec setting auto-inc lock
4 mysql tables in use 1, locked 1
5 LOCK WAIT 9 lock struct(s), heap size 1248, 4 row lock(s), undo log entries 4
6 MySQL thread id 155118366, OS thread handle 0x7f59e638a700, query id 87987781416 localhost msandbox update
7 INSERT INTO t1 (col1, col2, col3, col4) values (10, 20, 30, 'hello')
8 *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
9 TABLE LOCK table `mydb`.`t1` trx id 876726B90 lock mode AUTO-INC waiting
10 *** (2) TRANSACTION:
11 TRANSACTION 876725B2D, ACTIVE 9 sec inserting
12 mysql tables in use 1, locked 1
13 876 lock struct(s), heap size 80312, 1022 row lock(s), undo log entries 1002
14 MySQL thread id 155097580, OS thread handle 0x7f585be79700, query id 87987761732 localhost msandbox update
15 INSERT INTO t1 (col1, col2, col3, col4) values (7, 86, 62, "a lot of things"), (7, 76, 62, "many more")
16 *** (2) HOLDS THE LOCK(S):
17 TABLE LOCK table `mydb`.`t1` trx id 876725B2D lock mode AUTO-INC
18 *** (2) WAITING FOR THIS LOCK TO BE GRANTED:
19 RECORD LOCKS space id 44917 page no 529635 n bits 112 index `PRIMARY` of table `mydb`.`t2` trx id 876725B2D lock mode S locks rec but not gap waiting
20 *** WE ROLL BACK TRANSACTION (1)

Line 1 gives the time when the deadlock happened. If your application code catches and logs deadlock errors, which it should, then you can match this timestamp with the timestamps of deadlock errors in the application log. You would then have the transaction that got rolled back. From there, retrieve all statements from that transaction.

Lines 3 & 11: take note of the transaction number and ACTIVE time. If you log SHOW ENGINE INNODB STATUS output periodically (which is a good practice), then you can search previous outputs for the transaction number to hopefully see more statements from the same transaction. The ACTIVE seconds give a hint on whether the transaction is a single-statement or multi-statement one.

Lines 4 & 12: the tables in use and locked are only with respect to the current statement. So having 1 table in use does not necessarily mean that the transaction involves 1 table only.

Lines 5 & 13: these are worth attention, as they tell how many changes the transaction had made (the “undo log entries”) and how many row locks it held (the “row lock(s)”). This info hints at the complexity of the transaction.

Lines 6 & 14: take note of the thread id, connecting host and connecting user. If you use different MySQL users for different application functions, which is another good practice, then you can tell which application area the transaction comes from based on the connecting host and user.

Line 9: for the first transaction, it only shows the lock it was waiting for, in this case the AUTO-INC lock on table t1. Other possible values are S for a shared lock and X for an exclusive lock, with or without gap locks.

Lines 16 & 17: for the second transaction, they show the lock(s) it held, in this case the AUTO-INC lock, which was what TRANSACTION (1) was waiting for.

Lines 18 & 19 show which lock TRANSACTION (2) was waiting for. In this case, it was a shared record lock (not a gap lock) on another table’s primary key. There are only a few sources for a shared record lock in InnoDB:
1) use of SELECT … LOCK IN SHARE MODE
2) on foreign key referenced record(s)
3) with INSERT INTO … SELECT, shared locks on the source table
The current statement of trx (2) is a simple insert to table t1, so 1 and 3 are eliminated. By checking SHOW CREATE TABLE t1, you could confirm that the S lock was due to a foreign key constraint to the parent table t2.

Example 2: With MySQL community version, each record lock has the record content printed:

1 2014-10-11 10:41:12 7f6f912d7700
2 *** (1) TRANSACTION:
3 TRANSACTION 2164000, ACTIVE 27 sec starting index read
4 mysql tables in use 1, locked 1
5 LOCK WAIT 3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1
6 MySQL thread id 9, OS thread handle 0x7f6f91296700, query id 87 localhost root updating
7 update t1 set name = 'b' where id = 3
8 *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
9 RECORD LOCKS space id 1704 page no 3 n bits 72 index `PRIMARY` of table `test`.`t1` trx id 2164000 lock_mode X locks rec but not gap waiting
10 Record lock, heap no 4 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
11 0: len 4; hex 80000003; asc ;;
12 1: len 6; hex 000000210521; asc ! !;;
13 2: len 7; hex 180000122117cb; asc ! ;;
14 3: len 4; hex 80000008; asc ;;
15 4: len 1; hex 63; asc c;;
16
17 *** (2) TRANSACTION:
18 TRANSACTION 2164001, ACTIVE 18 sec starting index read
19 mysql tables in use 1, locked 1
20 3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1
21 MySQL thread id 10, OS thread handle 0x7f6f912d7700, query id 88 localhost root updating
22 update t1 set name = 'c' where id = 2
23 *** (2) HOLDS THE LOCK(S):
24 RECORD LOCKS space id 1704 page no 3 n bits 72 index `PRIMARY` of table `test`.`t1` trx id 2164001 lock_mode X locks rec but not gap
25 Record lock, heap no 4 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
26 0: len 4; hex 80000003; asc ;;
27 1: len 6; hex 000000210521; asc ! !;;
28 2: len 7; hex 180000122117cb; asc ! ;;
29 3: len 4; hex 80000008; asc ;;
30 4: len 1; hex 63; asc c;;
31
32 *** (2) WAITING FOR THIS LOCK TO BE GRANTED:
33 RECORD LOCKS space id 1704 page no 3 n bits 72 index `PRIMARY` of table `test`.`t1` trx id 2164001 lock_mode X locks rec but not gap waiting
34 Record lock, heap no 3 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
35 0: len 4; hex 80000002; asc ;;
36 1: len 6; hex 000000210520; asc ! ;;
37 2: len 7; hex 17000001c510f5; asc ;;
38 3: len 4; hex 80000009; asc ;;
39 4: len 1; hex 62; asc b;;

Line 9 & 10: The ‘space id’ is the tablespace id, and ‘page no’ gives which page inside the tablespace the record lock is on. The ‘n bits’ value is not the page offset; it is the number of bits in the lock bitmap. The page offset is the ‘heap no’ on line 10.

Line 11~15: These show the record data in hex numbers. Field 0 is the clustered index (primary key); ignoring the highest bit, the value is 3. Field 1 is the id of the transaction which last modified this record; its decimal value is 2164001, which is TRANSACTION (2). Field 2 is the rollback pointer. The rest of the row data starts at field 3: field 3 is an integer column with value 8, and field 4 is a string column containing the character ‘c’. By reading the data, we know exactly which row is locked and what its current values are.
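
Once the primary key value is decoded (3 in this case), a trivial check against the table from the output above confirms the row’s current content:

SELECT * FROM test.t1 WHERE id = 3;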

What else can we learn from the analysis? Since most MySQL deadlocks happen between two transactions, we can start the analysis based on that assumption. In Example 1, trx (2) was waiting on a shared lock, so trx (1) either held a shared or an exclusive lock on that primary key record of table t2. Let’s say col2 is the foreign key column; by checking the current statement of trx (1), we know it did not require the same record lock, so it must have been some previous statement in trx (1) that required S or X lock(s) on t2’s PK record(s). Trx (1) made only 4 row changes in 7 seconds. From that you learn a few characteristics of trx (1): it does a lot of processing but makes few changes; its changes involve tables t1 and t2, including a single record insertion into t2. This information, combined with other data, can help developers locate the transaction.
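
If the surviving transaction (or a similar one) is still open while you investigate, INFORMATION_SCHEMA can complement the deadlock output. A minimal sketch, assuming MySQL 5.5 or later with InnoDB (the deadlock victim itself is rolled back immediately, so this only helps for transactions that are still running):

SELECT trx_id, trx_state, trx_started,
       trx_rows_modified, trx_rows_locked,
       trx_mysql_thread_id, trx_query
FROM   information_schema.INNODB_TRX
ORDER  BY trx_started;

The trx_started and trx_rows_modified columns give the same kind of “long-running, few changes” profile discussed above, and trx_mysql_thread_id ties back to the thread id seen in SHOW ENGINE INNODB STATUS.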

Where else can we find previous statements of the transactions? Besides the application log and previous SHOW ENGINE INNODB STATUS output, you may also leverage the binlog, the slow log and/or the general query log. With the binlog, if binlog_format=STATEMENT, each binlog event carries the thread_id. Only committed transactions are logged into the binlog, so we can only look for Trx (2) there. In the case of Example 1, we know when the deadlock happened and that Trx (2) had started 9 seconds earlier, so we can run mysqlbinlog on the right binlog file and look for statements with thread_id = 155097580. It is always good to then cross-reference the statements with the application code to confirm.

$ mysqlbinlog -vvv --start-datetime="2014-10-13 6:06:12" --stop-datetime="2014-10-13 6:06:22" mysql-bin.000010 > binlog_1013_0606.out

With Percona Server 5.5 and above, you can set log_slow_verbosity to include the InnoDB transaction id in the slow log. Then, if you also set long_query_time = 0, you can catch all statements in the slow log file, including those that were rolled back. With the general query log, the thread id is included and can be used to look for related statements.
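
As a sketch of the Percona Server settings involved (the values are examples, and long_query_time = 0 logs every statement, which can be heavy on a busy server, so treat it as a temporary troubleshooting setting):

-- add InnoDB details, including the transaction id, to slow log entries
SET GLOBAL log_slow_verbosity = 'microtime,query_plan,innodb';
-- temporarily log all statements so rolled-back ones are captured too
SET GLOBAL long_query_time = 0;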

How to avoid a MySQL deadlock

Once we understand a deadlock, there are things we can do to eliminate it.

– Make changes to the application. In some cases, you can greatly reduce the frequency of deadlocks by splitting a long transaction into smaller ones, so locks are released sooner. In other cases, the deadlock arises because two transactions touch the same sets of data, in one or more tables, in different orders. Change them to access the data in the same order; in other words, serialize the access. That way you get a lock wait instead of a deadlock when the transactions run concurrently.

– Make changes to the table schema, such as removing a foreign key constraint to decouple two tables, or adding indexes to minimize the rows scanned and locked.

– In the case of gap locking, you may change the transaction isolation level to READ COMMITTED for the session or transaction to avoid it. The binlog format for that session or transaction will then have to be ROW or MIXED.
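
A minimal sketch of that last option for a single session (binlog_format is set first so statement-based binary logging does not reject the DML; changing binlog_format at the session level requires the SUPER privilege):

SET SESSION binlog_format = 'ROW';
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;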

The post How to deal with MySQL deadlocks appeared first on MySQL Performance Blog.

Sep
05
2014
--

Using MySQL 5.6 Global Transaction IDs (GTIDs) in production: Q&A

Thank you to all of you who attended my webinar last week about Global Transaction IDs (GTIDs), which were introduced in MySQL 5.6 to make the reconfiguration of replication straightforward. If you missed my webinar, you can still listen to the recording and download the slides (free). We had a lot of questions during the webinar, so let me try to answer them here. Please let me know in the comments if additional clarification is needed.

Q: Does GTID provide any benefit to master-master replication? If yes, how?
Q: Is ACTIVE ACTIVE MASTER MASTER successful in MySQL with GTID?

A: GTIDs don’t change the nature of MySQL replication: for instance, it is still asynchronous, and if you write on both masters in a master-master setup, there is still no write conflict detection mechanism. The main benefit of GTIDs is that any change of the replication topology is easy because you no longer need to run complex calculations to find the right binlog positions when connecting a slave to a new master.
So while master-master replication can be configured with GTIDs, it does not provide many benefits compared to position-based replication, as you will never change the replication topology.
And having a setup where both masters receive writes is still not recommended with GTIDs.

Q: Will GTIDs work well with master:standby master? How quick would the failover be?
A: Yes, GTIDs work well with this kind of setup (which is one master and one slave). However, GTIDs do not provide failover by themselves; you will have to use an external tool. The speed of failover then depends on the tool you use.

Q: For already set up MASTER-MASTER/MASTER-SLAVE Replication, after getting GTID set up, we need to rebuild replication again using AUTO POS=1, correct?
A: Yes, using MASTER_AUTO_POSITION=1 is necessary to indicate that you will use GTID replication. So you will have to run: STOP SLAVE; CHANGE MASTER TO … MASTER_AUTO_POSITION = 1; START SLAVE;
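
Spelled out as a runnable sketch (the host and credentials are placeholders, not values from the webinar):

STOP SLAVE;
CHANGE MASTER TO
  MASTER_HOST = 'master.example.com',
  MASTER_USER = 'repl',
  MASTER_PASSWORD = 'repl_password',
  MASTER_AUTO_POSITION = 1;
START SLAVE;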

Q: For an application having tables from different engines (InnoDB and MyISAM), how will that be handled with GTID?
A: Transactions using both MyISAM and InnoDB tables are not allowed; please refer to the documentation.

Q: In a master-slave replication topology (with GTID enabled), how does slave get data from the master if the master’s binary logs are purged given that AUTO_POSITION=1 is used as part of the change master command?
A: This will break replication with error 1236.

Q: What’s the value in SHOW SLAVE STATUS that determines if there is a lag on the slave?
A: This is Seconds_Behind_Master. It’s not always reliable though. For instance, if you have a replication setup like A -> B -> C, Seconds_Behind_Master on C will show the lag relative to B, not A.

Q: What is the value of keeping the history of the GTIDs executed under previous masters in SHOW SLAVE STATUS -> Executed_Gtid_Set?
A: The new replication protocol makes sure that when the slave connects to its master, it sends the range of GTIDs it has already executed. Then the master sends back all other transactions. That’s why Executed_Gtid_Set contains the history of all executed transactions.

Q: We use DB Master and Slave VIPs on our servers; can the mysqlfailover tool also switch the VIP to the new master? Is it scriptable in the event of a failover?
A: Yes, you can use extension points to add your own custom scripts with mysqlfailover. See the documentation for --exec-before and --exec-after.

Q: How does mysqlfailover handle brief network instability between the Master and Slaves?
A: mysqlfailover only triggers failover when it suspects the master is no longer alive. So network instability between the master and its slaves won’t affect its master crash detection. However, it can prevent the tool from reconfiguring replication correctly during a failover/switchover if one or several slaves are not reachable.

Q: Does Facebook use MySQL with GTID? if yes, which module or all together for everything?
A: I can’t speak for Facebook, but this talk at the MySQL Conference this year suggests that they’re using GTIDs in production. They have added custom code to make GTIDs easier to use.

Q: Is the GTID_SUBSET function part of MySQL Utilities, or should we set up a script to detect it regularly? Is GTID_SUBSET beneficial in the case of an ACTIVE ACTIVE MASTER MASTER setup?
Q: Can you please confirm how to get a GTID set in order to use the GTID functions?

A: GTID_SUBSET() is a built-in function in MySQL 5.6, you don’t need to install MySQL Utilities to use it. It can be used to easily know whether Executed_Gtid_Set on a given server is a subset of Executed_Gtid_Set on another server, so it can be beneficial to use it in any replication topology.
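
For example, run on the master with the slave’s Executed_Gtid_Set pasted in as the first argument (the UUID and range below are placeholders):

SELECT GTID_SUBSET(
  '4e659069-3cd8-11e4-9d9d-0800272b0a1f:1-177',   -- slave's Executed_Gtid_Set
  @@GLOBAL.gtid_executed                          -- this master's executed set
) AS slave_is_subset;

A result of 1 means every transaction executed on the slave is also known to the master; a result of 0 hints at errant transactions on the slave.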

Q: What is the difference between HOLES and BUGS?
A: Holes are not allowed in MySQL 5.6 implementation of GTIDs. So if you see a hole in a GTID sequence, you’re hitting a bug!

Q: Using MySQL Utilities, we can also set up replication using a Python script with just one command. Does it automatically take a dump from master to slave and start replication? If yes, for a 300 GB data directory, will it run in the background if executed from a shell script? Or does it just start replication from the current position and not take the dump?
A: I think you are talking about mysqlreplicate. This tool only runs CHANGE MASTER TO for you so it doesn’t take a backup of any kind.

Q: Is it possible to use the mysqlfailover script on any node (like a slave) at any time to know which is its MASTER and what the other SLAVE options are? If not, is this available by some other means?
A: You should probably use mysqlrplshow instead.

Q: Regarding the limitations of MySQL Utilities on automatic failover mentioned in the talk, how can I achieve AUTOMATIC failover if I want this as my primary option?
A: The node running mysqlfailover is not highly available, so if it is down you lose the ability to do automatic failover. There are several options if you want to achieve automatic failover: carefully monitor the monitoring node, or use solutions like Percona Replication Manager, which relies on Pacemaker, or Percona XtraDB Cluster, which relies on Galera replication.

Q: mysqlrpladmin failover/switchover: can we execute this command on a slave or any other monitoring node?
A: Yes, as long as mysqlrpladmin is installed on a server and it can connect to the database servers, the command can be executed from anywhere.

Q: If we set slaves to read_only…is that recommended to eliminate the errant transaction?
A: Yes, it is recommended. However it doesn’t prevent users with the SUPER privilege from accidentally writing on a slave.
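
A minimal sketch of the setting (remember to also put read_only = 1 in the slave’s my.cnf so it survives a restart; as noted above, accounts with SUPER can still write):

SET GLOBAL read_only = ON;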

Q: Can this errant transaction issue be prevented with (active-passive) master-master replication?
A: Master-master replication will make sure that any transaction written on one server automatically ends up being written on the other master as well, so in that sense it avoids errant transactions. However, writing on both masters is not recommended as you can have write conflicts.

Q: Is there any practical advantage of using GTID with MHA?
A: GTIDs provide no way to perform failover; they only simplify how you can reconfigure replication. So using MHA together with GTIDs makes sense.

Q: What is the difference between a GUID and a GTID? When do we have to use a GUID and when a GTID?
A: I’m not sure I correctly understand the question. A GTID is made of a source id and a transaction id. The source id is the master’s server_uuid, which is a GUID that is automatically generated when MySQL is started for the first time.

Q: How can we check the slave database’s tables and records? Do we have to sync data from the master to the slave database, or will it happen automatically?
A: You can use pt-table-checksum and pt-table-sync from Percona Toolkit.

Thanks again for attending the webinar! You can replay it, download the slides, and also access Percona’s vast library of other MySQL webinar recordings here.

The post Using MySQL 5.6 Global Transaction IDs (GTIDs) in production: Q&A appeared first on MySQL Performance Blog.
