Sep
16
2015
--

Webinar: Introduction to MySQL SYS Schema follow up questions

Thanks to all who attended my webinar Introduction to MySQL SYS Schema. This blog is for me to address the extra questions I didn’t have time to answer on the stream.

Can i have the performance_schema enabled in 5.6 and then install the sys schema? Or they are one and the same?

You need to have enabled the performance_schema in order to use it through the sys schema. They are different entities. In general, performance_schema collects and stores the data, and sys schema reads and presents the data.

The installation of sys schema on primary database will be replicated to the slaves?

By default, no. If you wish that the Sys Schema replicates to the slaves, you can modify the before_setup.sql (https://github.com/MarkLeith/mysql-sys/blob/master/before_setup.sql#L18) to skip the “SET sql_log_bin = 0;

Can MySQL save the slow running query in any table?

Yes it does: https://dev.mysql.com/doc/refman/5.6/en/log-destinations.html

How to see the query execution date & time from events_statements_current/history views in performance_schema?

You can check the performance_schema.events_statements_summary_by_digest table, that have the fields FIRST_SEEN and LAST_SEEN which are both a datetime values.

When the Sys Schema views show certain stats for the queries, is there a execution time range for queries under evaluation or is it like all the queries executed until date?

It’s all the queries executed until date, except when using some of the stored procedures that receive as a parameter a run time value, like “diagnostics” or “ps_trace_statement_digest”

I want to write the automated script to rebuild table or index. How to determine which table(s) or index(es) need to be rebuilt because of high fragmentation ratio?

For this you need to use something completely different. To know the fragmentation inside an InnoDB table i’ll recommend you to use XtraBackup with the –stats parameter https://www.percona.com/doc/percona-xtrabackup/2.2/xtrabackup_bin/analyzing_table_statistics.html

Downside to using? Overhead?

The overhead is the one that comes with using Performance Schema. I like the perspective of this presentation (https://www.percona.com/live/mysql-conference-2014/sessions/performanceschema-worth-overhead): Overhead is dynamic. Do  not  rely  on  other  people’s  benchmarks. Benchmark your application and find out what your overhead is.

What is the performance cost with regards to memory and io when using sys schema? Are there any tweaks or server variables with help the sys schema performing better?

Use only the instrumentation needed. This blog post have extensive info about the topic http://marcalff.blogspot.com.co/2011/06/performance-schema-overhead-tuning.html

For replicate how does sys schema record data?

It doesn’t until MySQL 5.7.2 where the Performance Schema provides tables that expose replication information: https://dev.mysql.com/doc/refman/5.7/en/performance-schema-replication-tables.html and talking about the sys schema, currently the only place where you can find info about replication is in the “diagnostics” procedure, but as you can imagine, it only get’s data if the MySQL version is 5.7.2 or higher.

Is sys schema built into any o the Percona releases?

At the moment, no.

Is it possible to use SYS schema in Galera 3 nodes cluster?

Yes, since the only requirement is to have performance_schema, which is also available on PXC / Galera Cluster

Can you create trending off information pulled from the Sys Schema? Full table scans over time, latency over time, that kind of thing?

Yes, you can use procedures like, for example, “diagnostics

How do I reset the performance data to start collecting from scratch?

By calling the ps_truncate_all_tables procedure. Truncate a performance_schema table equals to “clear collected events”. TRUNCATE TABLE can also be used with summary tables, but the effect is to reset the summary columns to 0 or NULL, not to remove rows.

Can we install SYS schema before 5.6?

You can use the ps_helper on 5.5 https://github.com/MarkLeith/dbahelper

Does sys support performance_schema from 5.0?

Unfortunately, MySQL 5.0 doesn’t have performance_schema. P_S is available since MySQL 5.5.3

If you install the sys schema on one node of a Galera cluster will all the nodes get the Sys schema? Also, is the Sys schema cluster aware or does it only track the local node?

For PXC 5.6 with Galera 3, the answer is: yes, it will be replicated to all the nodes. And the performance schema will always only collect data of the local node.

The post Webinar: Introduction to MySQL SYS Schema follow up questions appeared first on MySQL Performance Blog.

Jan
19
2015
--

Looking deeper into InnoDB’s problem with many row versions

A few days ago I wrote about MySQL performance implications of InnoDB isolation modes and I touched briefly upon the bizarre performance regression I found with InnoDB handling a large amount of versions for a single row. Today I wanted to look a bit deeper into the problem, which I also filed as a bug.

First I validated in which conditions the problem happens. It seems to happen only in REPEATABLE-READ isolation mode and only in case there is some hot rows which get many row versions during a benchmark run. For example the problem does NOT happen if I run sysbench with “uniform” distribution.

In terms of concurrent selects it also seems to require some very special conditions – you need to have the connection to let some history accumulate by having read snapshot open and then do it again with high history. The exact queries to do that seems not to be that important.

Contrary to what I expected this problem also does not require multiple connections – I can repeat it with only one read and one write connection, which means it can happen in a lot more workloads.

Here is the simplified case to repeat it:

sysbench  --num-threads=1 --report-interval=10 --max-time=0 --max-requests=0 --rand-type=uniform --oltp-table-size=1 --mysql-user=root --mysql-password= --mysql-db=sbinnodb  --test=/usr/share/doc/sysbench/tests/db/update_index.lua run

It is repeatable on the table with just 1 row where this row is very hot!

Select query sequence:

begin;
select c  from sbtest1 limit 1;
select sleep(20);
commit;
begin;
select c  from sbtest1 limit 1;
select sleep(300);
commit;

It is interesting though, in this case it does not look like it is enough to open a consistent snapshot – it is only when I run the query on the same table as the update workload is running (I assume touches the same data) when the issue happens.

Let’s look at some graphs:

The transaction rate indeed suffers dramatically – in this case going down more than 100x from close to 15K to around 60 in the lowest point.

It is interesting that CPU consumption is reduced but not as much as the throughput.

This is something I find the most enlightening about what seems to happen. The number of buffer pool read requests increases dramatically where the number of transactions goes down. If we do the math we have 940K buffer pool reads per 14.5K updates in the normal case giving us 65 buffer reads per update, which goes up to 37000 per update when we have regression happening. Which is a dramatic change and it points to something that goes badly wrong.

History size is of interest. We can see it grows quickly first – this is when we have the first transaction and sleep(20) and when it grows very slowly when the second transaction is open as update rate is low. Finally it goes to zero very quickly once the transaction is committed. What is worth noting is the history length here peaks at just 280K which is really quite trivial size.

The InnoDB contention graph is interesting from 2 view points. First it shows that the contention picture changes at the time when the transaction is held open and also right afterward when history is being worked out.

If you look at this graph focused only on the os_wait metrics we can see that mutex_os_waits goes down during the problem while x-locks os waits increases significantly. This means there are some very long exclusive lock os-waits happening which is indicative of the long waits. We can’t actually lock times from status variables – we would need to get data from performance schema for that which unfortunately does not cover key mutexes, and which I unfortunately can’t graph easily yet.

Lets look at some more data.

From the great processlist from sys-schema we can get:

*************************** 4. row ***************************
                thd_id: 536
               conn_id: 516
                  user: root@localhost
                    db: sbinnodb
               command: Query
                 state: updating
                  time: 0
     current_statement: UPDATE sbtest1 SET k=k+1 WHERE id=1
          lock_latency: 36.00 us
         rows_examined: 1
             rows_sent: 0
         rows_affected: 0
            tmp_tables: 0
       tmp_disk_tables: 0
             full_scan: NO
        last_statement: NULL
last_statement_latency: NULL
             last_wait: wait/io/table/sql/handler
     last_wait_latency: Still Waiting
                source: handler.cc:7692

Which is unfortunately not very helpful as it does not show the actual lock which is being waited on – just saying that it is waiting somewhere inside storage engine call which is obvious.

We can get some good summary information in the other table:

mysql> select * from waits_global_by_latency;
+------------------------------------------------------+-----------+---------------+-------------+-------------+
| events                                               | total     | total_latency | avg_latency | max_latency |
+------------------------------------------------------+-----------+---------------+-------------+-------------+
| wait/synch/cond/sql/Item_func_sleep::cond            |       384 | 00:31:27.18   | 4.91 s      | 5.00 s      |
| wait/synch/rwlock/innodb/index_tree_rw_lock          | 239460870 | 00:12:13.82   | 3.06 us     | 495.27 ms   |
| wait/synch/mutex/sql/THD::LOCK_thd_data              | 601160083 | 36.48 s       | 60.41 ns    | 643.18 us   |
| wait/io/table/sql/handler                            |      1838 | 29.12 s       | 15.84 ms    | 211.21 ms   |
| wait/synch/mutex/mysys/THR_LOCK::mutex               | 240456674 | 16.57 s       | 68.86 ns    | 655.16 us   |
| wait/synch/cond/sql/MDL_context::COND_wait_status    |        15 | 14.11 s       | 940.37 ms   | 999.36 ms   |
| wait/synch/mutex/innodb/trx_mutex                    | 360685304 | 10.52 s       | 29.11 ns    | 4.01 ms     |
| wait/synch/mutex/innodb/trx_undo_mutex               | 120228229 | 3.87 s        | 31.93 ns    | 426.16 us   |
| wait/io/file/innodb/innodb_data_file                 |       399 | 1.46 s        | 3.66 ms     | 11.21 ms    |
| wait/synch/mutex/innodb/buf_pool_flush_state_mutex   |     32317 | 587.10 ms     | 18.17 us    | 11.27 ms    |
| wait/synch/rwlock/innodb/fil_space_latch             |   4154178 | 386.96 ms     | 92.96 ns    | 74.57 us    |
| wait/io/file/innodb/innodb_log_file                  |       987 | 271.76 ms     | 275.34 us   | 14.23 ms    |
| wait/synch/rwlock/innodb/hash_table_locks            |   8509138 | 255.76 ms     | 30.05 ns    | 53.41 us    |
| wait/synch/mutex/innodb/srv_threads_mutex            |  11938747 | 249.49 ms     | 20.66 ns    | 15.06 us    |
| wait/synch/rwlock/innodb/trx_purge_latch             |   8488041 | 181.01 ms     | 21.28 ns    | 23.51 us    |
| wait/synch/rwlock/innodb/dict_operation_lock         |       232 | 100.48 ms     | 433.09 us   | 61.88 ms    |
| wait/synch/rwlock/innodb/checkpoint_lock             |        10 | 32.34 ms      | 3.23 ms     | 8.38 ms     |
| wait/io/socket/sql/client_connection                 |      2171 | 30.87 ms      | 14.22 us    | 3.95 ms     |
| wait/synch/mutex/innodb/log_sys_mutex                |    260485 | 6.57 ms       | 25.04 ns    | 52.00 us    |
| wait/synch/rwlock/innodb/btr_search_latch            |      4982 | 1.71 ms       | 343.05 ns   | 17.83 us    |
| wait/io/file/myisam/kfile                            |       647 | 1.70 ms       | 2.63 us     | 39.28 us    |
| wait/synch/rwlock/innodb/dict_table_stats            |     46323 | 1.63 ms       | 35.06 ns    | 6.55 us     |
| wait/synch/mutex/innodb/os_mutex                     |     11410 | 904.63 us     | 79.19 ns    | 26.67 us    |
| wait/lock/table/sql/handler                          |      1838 | 794.21 us     | 431.94 ns   | 9.70 us     |
| wait/synch/rwlock/sql/MDL_lock::rwlock               |      7056 | 672.34 us     | 95.15 ns    | 53.29 us    |
| wait/io/file/myisam/dfile                            |       139 | 518.50 us     | 3.73 us     | 31.61 us    |
| wait/synch/mutex/sql/LOCK_global_system_variables    |     11692 | 462.58 us     | 39.44 ns    | 363.39 ns   |
| wait/synch/mutex/innodb/rseg_mutex                   |      7238 | 348.25 us     | 47.89 ns    | 10.38 us    |
| wait/synch/mutex/innodb/lock_mutex                   |      5747 | 222.13 us     | 38.50 ns    | 9.84 us     |
| wait/synch/mutex/innodb/trx_sys_mutex                |      4601 | 187.43 us     | 40.69 ns    | 326.15 ns   |
| wait/synch/mutex/sql/LOCK_table_cache                |      2173 | 185.14 us     | 85.14 ns    | 10.02 us    |
| wait/synch/mutex/innodb/fil_system_mutex             |      3481 | 144.60 us     | 41.32 ns    | 375.91 ns   |
| wait/synch/rwlock/sql/LOCK_grant                     |      1217 | 121.86 us     | 99.85 ns    | 12.07 us    |
| wait/synch/mutex/innodb/flush_list_mutex             |      3191 | 108.82 us     | 33.80 ns    | 8.81 us     |
| wait/synch/mutex/innodb/purge_sys_bh_mutex           |      3600 | 83.52 us      | 23.16 ns    | 123.95 ns   |
| wait/synch/mutex/sql/MDL_map::mutex                  |      1456 | 75.02 us      | 51.33 ns    | 8.78 us     |
| wait/synch/mutex/myisam/MYISAM_SHARE::intern_lock    |      1357 | 55.55 us      | 40.69 ns    | 320.51 ns   |
| wait/synch/mutex/innodb/log_flush_order_mutex        |       204 | 40.84 us      | 200.01 ns   | 9.42 us     |
| wait/io/file/sql/FRM                                 |        16 | 31.16 us      | 1.95 us     | 16.40 us    |
| wait/synch/mutex/mysys/BITMAP::mutex                 |       378 | 30.52 us      | 80.44 ns    | 9.16 us     |
| wait/synch/mutex/innodb/dict_sys_mutex               |       463 | 24.15 us      | 51.96 ns    | 146.17 ns   |
| wait/synch/mutex/innodb/buf_pool_LRU_list_mutex      |       293 | 23.37 us      | 79.50 ns    | 313.94 ns   |
| wait/synch/mutex/innodb/buf_dblwr_mutex              |       398 | 22.60 us      | 56.65 ns    | 380.61 ns   |
| wait/synch/rwlock/myisam/MYISAM_SHARE::key_root_lock |       388 | 14.86 us      | 38.19 ns    | 254.16 ns   |
| wait/synch/mutex/sql/LOCK_plugin                     |       250 | 14.30 us      | 56.97 ns    | 137.41 ns   |
| wait/io/socket/sql/server_unix_socket                |         2 | 9.35 us       | 4.67 us     | 4.72 us     |
| wait/synch/mutex/sql/THD::LOCK_temporary_tables      |       216 | 8.97 us       | 41.32 ns    | 465.74 ns   |
| wait/synch/mutex/sql/hash_filo::lock                 |       151 | 8.35 us       | 55.09 ns    | 150.87 ns   |
| wait/synch/mutex/mysys/THR_LOCK_open                 |       196 | 7.84 us       | 39.75 ns    | 110.80 ns   |
| wait/synch/mutex/sql/TABLE_SHARE::LOCK_ha_data       |       159 | 6.23 us       | 39.13 ns    | 108.92 ns   |
| wait/synch/mutex/innodb/lock_wait_mutex              |        29 | 5.38 us       | 185.30 ns   | 240.38 ns   |
| wait/synch/mutex/innodb/ibuf_mutex                   |        29 | 5.26 us       | 181.23 ns   | 249.46 ns   |
| wait/synch/mutex/sql/LOCK_status                     |        62 | 4.12 us       | 66.36 ns    | 127.39 ns   |
| wait/synch/mutex/sql/LOCK_open                       |       109 | 4.05 us       | 36.93 ns    | 105.79 ns   |
| wait/io/file/sql/dbopt                               |         3 | 3.34 us       | 1.11 us     | 1.99 us     |
| wait/synch/mutex/innodb/buf_pool_free_list_mutex     |        30 | 1.58 us       | 52.58 ns    | 106.73 ns   |
| wait/synch/mutex/innodb/srv_innodb_monitor_mutex     |        31 | 1.38 us       | 44.45 ns    | 89.52 ns    |
| wait/synch/mutex/sql/LOCK_thread_count               |        33 | 1.10 us       | 33.18 ns    | 50.71 ns    |
| wait/synch/mutex/sql/LOCK_prepared_stmt_count        |        32 | 1.10 us       | 34.12 ns    | 67.92 ns    |
| wait/synch/mutex/innodb/srv_sys_mutex                |        42 | 1.07 us       | 25.35 ns    | 49.14 ns    |
| wait/synch/mutex/mysys/KEY_CACHE::cache_lock         |        24 | 1.02 us       | 42.26 ns    | 103.60 ns   |
| wait/synch/mutex/sql/MDL_wait::LOCK_wait_status      |        18 | 544.62 ns     | 30.05 ns    | 42.57 ns    |
| wait/synch/rwlock/sql/CRYPTO_dynlock_value::lock     |         2 | 509.25 ns     | 254.47 ns   | 287.02 ns   |
| wait/synch/rwlock/sql/LOCK_dboptions                 |         6 | 262.92 ns     | 43.82 ns    | 84.82 ns    |
| wait/synch/rwlock/sql/MDL_context::LOCK_waiting_for  |         4 | 262.92 ns     | 65.73 ns    | 103.60 ns   |
| wait/synch/mutex/sql/LOCK_connection_count           |         4 | 238.51 ns     | 59.47 ns    | 123.01 ns   |
| wait/synch/mutex/sql/LOCK_sql_rand                   |         2 | 172.15 ns     | 86.08 ns    | 94.84 ns    |
| wait/synch/rwlock/sql/LOCK_system_variables_hash     |         2 | 141.16 ns     | 70.43 ns    | 78.25 ns    |
| wait/synch/rwlock/sql/gtid_commit_rollback           |         2 | 95.47 ns      | 47.58 ns    | 51.65 ns    |
| wait/synch/mutex/innodb/autoinc_mutex                |         2 | 90.14 ns      | 45.07 ns    | 45.07 ns    |
| wait/synch/mutex/sql/LOCK_thd_remove                 |         2 | 85.14 ns      | 42.57 ns    | 52.58 ns    |
| wait/synch/mutex/innodb/ibuf_bitmap_mutex            |         1 | 76.37 ns      | 76.37 ns    | 76.37 ns    |
| wait/synch/mutex/sql/LOCK_xid_cache                  |         2 | 62.91 ns      | 31.30 ns    | 34.43 ns    |
+------------------------------------------------------+-----------+---------------+-------------+-------------+
73 rows in set (0.00 sec)

Which shows a couple of hotspots – the well-known index_tree_rw_lock hotspot which should be gone in MySQL 5.7 as well as trx_mutex and trx_undo_mutexes. It is especially worth noting the almost 500ms which this index tree lw_lock was waited at max (this is the data for a couple of minutes only, so it it is unlikely to be a very rare outlier). This hints to me that in the case of many versions our index_tree_rw_lock might be held for a prolonged period of time.

What is interesting is that the reliable “SHOW ENGINE INNODB STATUS” remains quite handy:

----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 58847289
--Thread 140386357991168 has waited at btr0cur.cc line 304 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fb0bea2ca40 '&block->lock'
a writer (thread id 140386454755072) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file buf0flu.cc line 1082
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/row /row0row.cc line 815
--Thread 140386374776576 has waited at btr0cur.cc line 577 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fade8077f98 '&new_index->lock'
a writer (thread id 140386357991168) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.cc line 586
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/btr /btr0cur.cc line 577
--Thread 140386324420352 has waited at btr0cur.cc line 577 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fade8077f98 '&new_index->lock'
a writer (thread id 140386357991168) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.cc line 586
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/btr /btr0cur.cc line 577
--Thread 140386332813056 has waited at btr0cur.cc line 577 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fade8077f98 '&new_index->lock'
a writer (thread id 140386357991168) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.cc line 586
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/btr /btr0cur.cc line 577
--Thread 140386445960960 has waited at btr0cur.cc line 586 for 0.0000 seconds the semaphore:
S-lock on RW-latch at 0x7fade8077f98 '&new_index->lock'
a writer (thread id 140386357991168) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.cc line 586
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/btr /btr0cur.cc line 577
--Thread 140386366383872 has waited at btr0cur.cc line 577 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fade8077f98 '&new_index->lock'
a writer (thread id 140386357991168) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.cc line 586
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/btr /btr0cur.cc line 577
--Thread 140386341205760 has waited at btr0cur.cc line 577 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fade8077f98 '&new_index->lock'
a writer (thread id 140386357991168) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.cc line 586
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/btr /btr0cur.cc line 577
--Thread 140386349598464 has waited at btr0cur.cc line 577 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fade8077f98 '&new_index->lock'
a writer (thread id 140386357991168) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.cc line 586
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/btr /btr0cur.cc line 577

Another sample….

----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 61753307
--Thread 140386332813056 has waited at row0row.cc line 815 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fb0bea2cd40 '&block->lock'
a writer (thread id 140386454755072) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file buf0flu.cc line 1082
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/row/row0row.cc line 815
--Thread 140386366383872 has waited at row0row.cc line 815 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fb0bea2cd40 '&block->lock'
a writer (thread id 140386454755072) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file buf0flu.cc line 1082
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/row/row0row.cc line 815
--Thread 140386341205760 has waited at row0row.cc line 815 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fb0bea2cd40 '&block->lock'
a writer (thread id 140386454755072) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file buf0flu.cc line 1082
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/row/row0row.cc line 815
--Thread 140386374776576 has waited at row0row.cc line 815 for 0.0000 seconds the semaphore:
X-lock on RW-latch at 0x7fb0bea2cd40 '&block->lock'
a writer (thread id 140386454755072) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file buf0flu.cc line 1082
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/row/row0row.cc line 815
--Thread 140386445960960 has waited at btr0cur.cc line 257 for 0.0000 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x7fb0bea2f2c0 '&block->lock'
a writer (thread id 140386445960960) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 0, lock_word: ffffffffffffffff
Last time read locked in file btr0cur.cc line 257
Last time write locked in file /mnt/workspace/percona-server-5.6-debian-binary/label_exp/ubuntu-trusty-64bit/percona-server-5.6-5.6.21-70.1/storage/innobase/btr/btr0cur.cc line 257
OS WAIT ARRAY INFO: signal count 45260762
Mutex spin waits 1524575466, rounds 5092199292, OS waits 43431355
RW-shared spins 14500456, rounds 258217222, OS waits 3721069
RW-excl spins 37764438, rounds 730059390, OS waits 12592324
Spin rounds per wait: 3.34 mutex, 17.81 RW-shared, 19.33 RW-excl

The first one shows mostly index lock contention which we have already seen. The second, though, shows a lot of contention on the block lock, which is something that is not covered by performance schema instrumentation in MySQL 5.6.

Now to tell the truth in this 2nd thread test case it is not your typical case of contention – unlike the case with 64 threads we do not see that drastic CPU drop and it still stays above 25%, the single thread fully saturated should show on this system. This is however where DBA level profiling gets tough with MySQL – as I mentioned the only indication of excessive activity going on is a high number of buffer pool requests – there is no instrumentation which I’m aware of that tells us anything around scanning many old row versions in undo space. Perhaps some more metrics around this syssystem need to be added to INNODB_METRICS?

A good tool to see CPU consumption which both developers and DBAs can use is oprofile, which is of low enough overhead to use in production. Note that to get any meaningful information about the CPU usage inside the program you need to have debug systems installed. For Percona Server 5.6 On Ubuntu you can do apt-get install percona-server-5.6-dbg

To profile just the MySQL Process you can do:

root@ts140i:~# operf --pid 8970
operf: Press Ctl-c or 'kill -SIGINT 9113' to stop profiling
operf: Profiler started
^C
Profiling done.

To get top symbols (functions) called in mysqld I can do:

root@ts140i:~# opreport -l | more
Using /root/oprofile_data/samples/ for samples directory.
warning: /no-vmlinux could not be found.
warning: [vdso] (tgid:8970 range:0x7fffba5fe000-0x7fffba5fffff) could not be found.
CPU: Intel Haswell microarchitecture, speed 3.201e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        image name               symbol name
212162   12.4324  mysqld                   buf_page_get_gen(unsigned long, unsigned long, unsigned long, unsigned long, buf_block_t*, unsigned long, char const*, unsigned long, mtr_t*)
204230   11.9676  mysqld                   dict_table_copy_types(dtuple_t*, dict_table_t const*)
179783   10.5351  no-vmlinux               /no-vmlinux
144335    8.4579  mysqld                   trx_undo_update_rec_get_update(unsigned char*, dict_index_t*, unsigned long, unsigned long, unsigned long, unsigned long, trx_t*, mem_block_info_t*, upd_t
**)
95984     5.6245  mysqld                   trx_undo_prev_version_build(unsigned char const*, mtr_t*, unsigned char const*, dict_index_t*, unsigned long*, mem_block_info_t*, unsigned char**)
78785     4.6167  mysqld                   row_build(unsigned long, dict_index_t const*, unsigned char const*, unsigned long const*, dict_table_t const*, dtuple_t const*, unsigned long const*, row_
ext_t**, mem_block_info_t*)
76656     4.4919  libc-2.19.so             _int_free
63040     3.6941  mysqld                   rec_get_offsets_func(unsigned char const*, dict_index_t const*, unsigned long*, unsigned long, mem_block_info_t**)
52794     3.0937  mysqld                   mtr_commit(mtr_t*)

Which points to some hot functions – quite as expected we see some functions working with undo space here. In some cases it is also helpful to look at the call graph to understand where functions are being called if it is not clear.

Now in many cases looking at oprofile report it is hard to figure out what is “wrong” without having the baseline. In this case we can look at the oprofile run for sysbench alone:

root@ts140i:~# opreport -l | more
Using /root/oprofile_data/samples/ for samples directory.
warning: /no-vmlinux could not be found.
warning: [vdso] (tgid:8970 range:0x7fffba5fe000-0x7fffba5fffff) could not be found.
CPU: Intel Haswell microarchitecture, speed 3.201e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        image name               symbol name
1331386  28.4544  no-vmlinux               /no-vmlinux
603873   12.9060  mysqld                   ut_delay(unsigned long)
158428    3.3859  mysqld                   buf_page_get_gen(unsigned long, unsigned long, unsigned long, unsigned long, buf_block_t*, unsigned long, char const*,
 unsigned long, mtr_t*)
88172     1.8844  mysqld                   my_timer_cycles
75948     1.6232  mysqld                   srv_worker_thread
74350     1.5890  mysqld                   MYSQLparse(THD*)
68075     1.4549  mysqld                   mutex_spin_wait(void*, bool, char const*, unsigned long)
59321     1.2678  mysqld                   que_run_threads(que_thr_t*)
59152     1.2642  mysqld                   start_mutex_wait_v1
57495     1.2288  mysqld                   rw_lock_x_lock_func(rw_lock_t*, unsigned long, char const*, unsigned long, bool, bool)
53133     1.1356  mysqld                   mtr_commit(mtr_t*)
44404     0.9490  mysqld                   end_mutex_wait_v1
40226     0.8597  libpthread-2.19.so       pthread_mutex_lock
38889     0.8311  mysqld                   log_block_calc_checksum_innodb(unsigned char const*)

As you see this report is very different in terms of top functions.

Final thoughts: I had 2 purposes with this blog post. First, I hope it will contain some helpful information for developers so this bug can be fixed at least in MySQL 5.7. It is quite a nasty one as it can come out of nowhere if the application unexpectedly develops hot rows. Second, I hope it will show some of the methods you can use to capture and analyze data about MySQL performance.

The post Looking deeper into InnoDB’s problem with many row versions appeared first on MySQL Performance Blog.

Jun
09
2014
--

Using MySQL 5.6 Performance Schema in multi-tenant environments

Hosting a shared MySQL instance for your internal or external clients (“multi-tenant”) was always a challenge. Multi-tenants approach or a “schema-per-customer” approach is pretty common nowadays to host multiple clients on the same MySQL sever. One of issues of this approach, however, is the lack of visibility: it is hard to tell how many resources (queries, disk, cpu, etc) each user will use.

Percona Server contains userstats Google patch, which will allow you to get the resource utilization per user. The new MySQL 5.6 performance_schema has even more instrumentation which can give you a better visibility on per-user or per-schema/per-database level. And if you are running MySQL 5.6.6 or higher, Performance Schema is already enabled (with minimum set of “instrumentation” thou) and ready to use. In this post I will share examples of using Performance Schema for multi-tenant environments.

Overview

If you want to use Performance Schema, make sure it is enabled (enabled by default starting with MySQL 5.6.6):

mysql> show global variables like 'performance_schema';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| performance_schema | ON    |
+--------------------+-------+
1 row in set (0.00 sec)

Performance_schema provides you with the “raw” metrics and it may be difficult to select data from it. The good news is that you can use the “sys” schema project by Mark Leith. The “sys” schema (old name: ps_helper) is a collection of views and stored procedures which will provide you with reports in human readable format. The installation is easy, download it from github and run:

$ mysql -u root -p < ./sys_56.sql

(it will only create database “sys” and a set of views and stored procedures/stored functions in it)

Examples

For the multi-tenant environment the most interesting is resource utilization breakdown. Lets say you want to “charge per IO” similar to Amazon RDS for MySQL model. You can now run this simple query against “sys” schema, which will use Performance Schema in MySQL 5.6 to generate report (in my example all users starts with “a”):

mysql sys> select * from user_summary_by_file_io where user like 'a%';
+-------+------------+-------------+
| user  | ios        | io_latency  |
+-------+------------+-------------+
| api01 | 3061469563 | 4.04h       |
| api03 |  912296937 | 1.36h       |
| api02 |  815473183 | 1.22h       |
| app01 |    9704381 | 00:01:06.33 |
| app02 |    1160149 | 8.18 s      |
| app03 |    1424065 | 7.23 s      |
+-------+------------+-------------+
6 rows in set (0.06 sec)

If you need more extensive metrics you can use this report:

mysql sys> select * from user_summary where user like 'a%' limit 10;
+-------+------------+-------------------+-----------------------+-------------+------------+-----------------+---------------------+-------------------+--------------+
| user  | statements | statement_latency | statement_avg_latency | table_scans | file_ios   | file_io_latency | current_connections | total_connections | unique_hosts |
+-------+------------+-------------------+-----------------------+-------------+------------+-----------------+---------------------+-------------------+--------------+
| api01 | 1837898366 | 192.29h           | 376.65 us             |     2221018 | 3064724488 | 4.05h           |                  22 |          73748519 |            1 |
| api02 |  833712609 | 206.83h           | 893.11 us             |        1164 |  836753563 | 1.26h           |                   7 |            246730 |            1 |
| api03 |  988875004 | 222.03h           | 808.29 us             |        1156 |  943363030 | 1.43h           |                   8 |            247639 |            1 |
| app01 |    1864159 | 4.01h             | 7.75 ms               |       46756 |    9704623 | 00:01:06.33     |                   0 |            300004 |            1 |
| app02 |     812520 | 00:20:51.69       | 1.54 ms               |       42513 |    1164954 | 8.37 s          |                   0 |            122994 |            1 |
| app03 |     811558 | 00:21:59.57       | 1.63 ms               |       42310 |    1429373 | 7.40 s          |                   0 |            123110 |            1 |
+-------+------------+-------------------+-----------------------+-------------+------------+-----------------+---------------------+-------------------+--------------+
6 rows in set

Or a breakdown per user per statement:

mysql sys> select user, statement, total, rows_sent, rows_examined, rows_affected, full_scans from user_summary_by_statement_type   where user =  'api01' limit 100;
+-------+-------------+-----------+-----------+---------------+---------------+------------+
| user  | statement   | total     | rows_sent | rows_examined | rows_affected | full_scans |
+-------+-------------+-----------+-----------+---------------+---------------+------------+
| api01 | insert      | 368405774 |         0 |             0 |     494179199 |          0 |
| api01 | commit      | 376276686 |         0 |             0 |             0 |          0 |
| api01 | select      | 476129532 | 761038004 |    1067363348 |             0 |          0 |
| api01 | update      | 142896430 |         0 |     142896437 |     141978066 |          0 |
| api01 | begin       | 376280424 |         0 |             0 |             0 |          0 |
| api01 | show_fields |   2215016 | 310102240 |     310102240 |             0 |    2215016 |
| api01 | Quit        |  73421361 |         0 |             0 |             0 |          0 |
| api01 | change_db   |  17258767 |         0 |             0 |             0 |          0 |
| api01 | delete      |      7896 |         0 |          7896 |          7896 |          0 |
| api01 | rollback    |      3711 |         0 |             0 |             0 |          0 |
+-------+-------------+-----------+-----------+---------------+---------------+------------+
10 rows in set (0.04 sec)

If you are using  “schema per customer” approach you get get the per-schema resource utilization using sys.schema_table_statistics. Example (from Peter Zaitsev’s webinar):

mysql [localhost] {msandbox} > select * from schema_table_statistics where table_schema='sbtest' limit 1 G
*************************** 1. row ***************************
     table_schema: sbtest
       table_name: sbtest
     rows_fetched: 158764154
    fetch_latency: 1.37h
    rows_inserted: 378901
   insert_latency: 00:07:17.38
     rows_updated: 1136714
   update_latency: 00:45:40.08
     rows_deleted: 378902
   delete_latency: 00:03:00.34
 io_read_requests: 636003
          io_read: 9.70 GiB
  io_read_latency: 00:28:12.01
io_write_requests: 203925
         io_write: 3.11 GiB
 io_write_latency: 17.26 s
 io_misc_requests: 2449
  io_misc_latency: 3.87 s

This report may be really slow if you have lots of tables. If you are only interested in disk utilization per database you can directly query the performance_schema:

SELECT extract_schema_from_file_name(fsbi.file_name) AS table_schema,
       SUM(fsbi.count_read) AS io_read_requests,
       sys.format_bytes(SUM(fsbi.sum_number_of_bytes_read)) AS io_read,
       sys.format_time(SUM(fsbi.sum_timer_read)) AS io_read_latency,
       SUM(fsbi.count_write) AS io_write_requests,
       sys.format_bytes(SUM(fsbi.sum_number_of_bytes_write)) AS io_write,
       sys.format_time(SUM(fsbi.sum_timer_write)) AS io_write_latency,
       SUM(fsbi.count_misc) AS io_misc_requests,
       sys.format_time(SUM(fsbi.sum_timer_misc)) AS io_misc_latency
  FROM  performance_schema.file_summary_by_instance AS fsbi
 GROUP BY table_schema
 ORDER BY io_write_requests DESC limit 10;
 +--------------------+------------------+------------+-----------------+-------------------+------------+------------------+------------------+-----------------+
 | table_schema       | io_read_requests | io_read    | io_read_latency | io_write_requests | io_write   | io_write_latency | io_misc_requests | io_misc_latency |
 +--------------------+------------------+------------+-----------------+-------------------+------------+------------------+------------------+-----------------+
 | binlogs            |       6943921992 | 19.57 TiB  | 2.46h           |        1124408959 | 6.60 TiB   | 2.18h            |           914411 | 00:03:55.75     |
 | d1                 |           146448 | 2.24 GiB   | 1.97 s          |          22768241 | 241.44 GiB | 00:12:57.97      |          5095817 | 00:12:50.16     |
 | stats1             |              853 | 1.81 MiB   | 97.12 ms        |           1042768 | 15.93 GiB  | 10.27 s          |           292303 | 22.74 s         |
 | stats2             |             3024 | 11.22 MiB  | 781.96 ms       |            750182 | 11.45 GiB  | 9.32 s           |           267780 | 50.84 s         |
 | relaylogs          |           105065 | 925.71 MiB | 312.82 ms       |            447607 | 925.71 MiB | 1.19 s           |            72169 | 530.70 ms       |
 | o1268              |             2078 | 3.02 MiB   | 7.40 ms         |              1591 | 203.97 KiB | 1.17 ms          |             2693 | 92.34 ms        |
 +--------------------+------------------+------------+-----------------+-------------------+------------+------------------+------------------+-----------------+

Edit: if you also need to get a quick “hot tables” report with focus on disk utilization (io requests) you can use this query:

SELECT concat(extract_schema_from_file_name(fsbi.file_name), '.',
	   extract_table_from_file_name(fsbi.file_name) ) as unique_table_name,
       SUM(fsbi.count_read) AS io_read_requests,
       sys.format_bytes(SUM(fsbi.sum_number_of_bytes_read)) AS io_read,
       sys.format_time(SUM(fsbi.sum_timer_read)) AS io_read_latency,
       SUM(fsbi.count_write) AS io_write_requests,
       sys.format_bytes(SUM(fsbi.sum_number_of_bytes_write)) AS io_write,
       sys.format_time(SUM(fsbi.sum_timer_write)) AS io_write_latency,
       SUM(fsbi.count_misc) AS io_misc_requests,
       sys.format_time(SUM(fsbi.sum_timer_misc)) AS io_misc_latency
  FROM  performance_schema.file_summary_by_instance AS fsbi
 GROUP BY unique_table_name
 ORDER BY io_write_requests DESC limit 10;

This will give you a breakdown per table (and will include system files as well).

Conclusion

Performance_schema in MySQL 5.6 is very useful tool and can help you to get better visualization in MySQL multi-user (multi-tenant) installations. It can also do much more, i.e. find all queries with temporary tables or profile locks and mutex. “sys” schema provide you with the useful pre-defined reports, the table (view) names are almost self explanatory:

+-----------------------------------------------+
| Tables_in_sys                                 |
+-----------------------------------------------+
| innodb_buffer_stats_by_schema                 |
| innodb_buffer_stats_by_table                  |
...
| processlist                                   |
...
| schema_tables_with_full_table_scans           |
| schema_unused_indexes                         |
| statement_analysis                            |
| statements_with_errors_or_warnings            |
| statements_with_full_table_scans              |
| statements_with_runtimes_in_95th_percentile   |
| statements_with_sorting                       |
| statements_with_temp_tables                   |
...

If you are using MySQL 5.6, performance_schema will give a additional valuable inside.

The post Using MySQL 5.6 Performance Schema in multi-tenant environments appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com