Jul
20
2015
--

Fractal Tree library as a Key-Value store

As you may know, Tokutek is now part of Percona and I would like to explain some internals of TokuDB and TokuMX – what performance benefits they bring, along with further optimizations we are working on.

However, before going into deep details, I feel it is needed to explain the fundamentals of Key-Value store, and how Fractal Tree handles it.

Before that, allow me to say that I hear opinions that the “Fractal Tree” name does not reflect an internal structure and looks more like a marketing term than a technical one. I will not go into this discussion and will keep using name “Fractal Tree” just out of the respect to inventors. I think they are in a position to name their invention with any name they want.

So with that said, the Fractal Tree library implements a new data structure for a more efficient handling (with main focus on insertion, but more on this later) of Key-Value store.

You may question how Key-Value is related to SQL Transactional databases – this is more from the NOSQL world. Partially this is true, and Fractal Tree Key-Value library is successfully used in Percona TokuMX (based on MongoDB 2.4) and Percona TokuMXse (storage engine for MongoDB 3.0) products.

But if we look on a Key-Value store in general, actually it maybe a good fit to use in structural databases. To explain this, let’s take a look in Key-Value details.

So what is Key-Value data structure?

We will use a notation (k,v), or key=>val, which basically mean we associate some value “v” with a key “k”. For software developers following analogies may be close:
key-value access is implemented as dictionary in Python, associative array in PHP or map in C++.
(More details in Wikipedia)

I will define key-value structure as a list of pairs (k,v).

It is important to note that both key and value cannot be just scalars (single value), but to be compound.
That is "k1, k2, k3 => v1, v2", which we can read as (give me two values by a 3-part key).

This brings us closer to a database table structure.
If we apply additional requirement that all (k) in list (k,v) must be unique, this will represent
a PRIMARY KEY for a traditional database table.
To understand this better, let’s take a look on following table:
CREATE TABLE metrics (
ts timestamp,
device_id int,
metric_id int,
cnt int,
val double,
PRIMARY KEY (ts, device_id, metric_id),
KEY metric_id (metric_id, ts),
KEY device_id (device_id, ts)
)

We can state that Key-Value structure (ts, device_id, metric_id => cnt, val), with a requirement
"ts, device_id, metric_id" to be unique, represents PRIMARY KEY for this table, actually this is how InnoDB (and TokuDB for this matter) stores data internally.

Secondary indexes also can be represented in Key=>Value notion, for example, again, how it is used in TokuDB and InnoDB:
(seconday_index_key=>primary_key), where a key for a secondary index points to a primary key (so later we can get values by looking up primary key). Please note that that seconday_index_key may not be unique (unless we add an UNIQUE constraint to a secondary index).

Or if we take again our table, the secondary keys are defined as
(metric_id, ts => ts, device_id, metric_id)
and
(device_id, ts => ts, device_id, metric_id)

It is expected from a Key-Value storage to support basic data manipulation and extraction operations, such as:

        – Add or Insert: add

(key => value)

        pair to a collection
        – Update: from

(key => value2)

        to

(key => value2)

        , that is update

"value"

        assigned to

"key"

        .
        – Delete: remove(key): delete a pair

(key => value)

        from a collection
        – Lookup (select): give a

"value"

        assigned to

"key"

and I want to add fifth operation:

        – Range lookup: give all values for keys defined by a range, such as

"key > 5"

        or

"key >= 10 and key < 15"

They way software implements an internal structure of Key-Value store defines the performance of mentioned operations, and especially if datasize of a store grows over a memory capacity.

For the decades, the most popular data structure to represent Key-Value store on disk is B-Tree, and within the reason. I won’t go into B-Tree details (see for example https://en.wikipedia.org/wiki/B-tree), but it provides probably the best possible time for Lookup operations. However it has challenges when it comes to Insert operations.

And this is an area where newcomers to Fractal Tree and LSM-tree (https://en.wikipedia.org/wiki/Log-structured_merge-tree) propose structures which provide a better performance for Insert operations (often at the expense of Lookup/Select operation, which may become slower).

To get familiar with LSM-tree (this is a structure used by RocksDB) I recommend http://www.benstopford.com/2015/02/14/log-structured-merge-trees/. And as for Fractal Tree I am going to cover details in following posts.

The post Fractal Tree library as a Key-Value store appeared first on MySQL Performance Blog.

Jul
14
2015
--

MySQL QA Episode 6: Analyzing and Filtering

Welcome to MySQL QA Episode #6!

Today we will look into analyzing and filtering our QA run. We’ll use tools like pquery-prep-red.sh, pquery-clean-known.sh & pquery-results.sh

1. Quick re-cap and QA run setup
2. pquery-prep-red.sh
3. pquery-clean-known.sh
4. pquery-results.sh
5. Bonus: pquery reach – pquery-reach++.sh

We’ll also introduce the text_string.sh tool which extracts a most-specific text string from the error log in order to classify a bug/issue.

Full-screen viewing @ 720p resolution recommended

The post MySQL QA Episode 6: Analyzing and Filtering appeared first on MySQL Performance Blog.

Jul
12
2015
--

MySQL QA Episode 5: Preparing Your QA Run with pquery

Welcome to MySQL QA Episode #5! In this episode we’ll be setting up and running pquery for the first time… and I’ll also trigger some actual bugs (fun guaranteed)! I’ll also introduce you to mtr_to_sql.sh and pquery-run.sh.

pquery-run.sh (the main wrapper around pquery) is capable of generating 80-120 MySQL Server crashes – per hour! See how it all works in this episode…

Full-screen viewing @ 720p resolution recommended

The post MySQL QA Episode 5: Preparing Your QA Run with pquery appeared first on MySQL Performance Blog.

Jul
09
2015
--

Percona Server 5.6.25-73.1 is now available

Percona ServerPercona is glad to announce the release of Percona Server 5.6.25-73.1 on July 9, 2015. Download the latest version from the Percona web site or from the Percona Software Repositories.

Based on MySQL 5.6.25, including all the bug fixes in it, Percona Server 5.6.25-73.1 is the current GA release in the Percona Server 5.6 series. Percona Server is open-source and free – and this is the latest release of our enhanced, drop-in replacement for MySQL. Complete details of this release can be found in the 5.6.25-73.1 milestone on Launchpad.

New Features:

  • TokuDB storage engine package has been updated to version 7.5.8

New TokuDB Features:

  • Exposed ft-index fanout as TokuDB option tokudb_fanout, (default=16 range=2-16384).
  • Tokuftdump can now provide a summary info with new --summary option.
  • Fanout has been serialized in the ft-header and ft_header.fanout value has been exposed in tokuftdump.
  • New checkpoint status variables have been implemented:
    • CP_END_TIME – checkpoint end time, time spend in checkpoint end operation in seconds,
    • CP_LONG_END_COUNT – long checkpoint end count, count of end_checkpoint operations that exceeded 1 minute,
    • CP_LONG_END_TIME – long checkpoint end time, total time of long checkpoints in seconds.
  • “Engine” status variables are now visible as “GLOBAL” status variables.

TokuDB Bugs Fixed:

  • Fixed assertion with big transaction in toku_txn_complete_txn.
  • Fixed assertion that was caused when a transaction had rollback log nodes orphaned in the blocktable.
  • Fixed ftcxx tests failures that were happening when it was run in parallel.
  • Fixed multiple test failures for Debian/Ubuntu caused by assertion on setlocale().
  • Status has been refactored to its own file/subsystem within ft-index code to make the it more accessible.

Release notes for Percona Server 5.6.25-73.1 are available in the online documentation. Please report any bugs on the launchpad bug tracker .

The post Percona Server 5.6.25-73.1 is now available appeared first on MySQL Performance Blog.

Jul
07
2015
--

MySQL QA Episode 4: QA Framework Setup Time!

Welcome to MySQL QA Episode 4! In this episode we’ll look into setting up our QA Framework: percona-qa, pquery, reducer & more.

1. All about percona-qa
2. pquery

$ cd ~; bzr branch lp:percona-qa

3. reducer.sh

$ cd ~; bzr branch lp:randgen
$ vi ~/randgen/util/reducer/reducer.sh

4. Short introduction to pquery framework tools

The tools introduced in this episode will be covered further in next two episodes.

Full-screen viewing @ 720p resolution recommended

To view the other episodes, you can watch the full series on YouTube:

Or checkout the MySQL QA Series Introduction which has an index & links to each episode!

The post MySQL QA Episode 4: QA Framework Setup Time! appeared first on MySQL Performance Blog.

Jul
01
2015
--

Percona Server 5.5.44-37.3 is now available

Percona Server
Percona is glad to announce the release of Percona Server 5.5.44-37.3 on July 1, 2015. Based on MySQL 5.5.44, including all the bug fixes in it, Percona Server 5.5.44-37.3 is now the current stable release in the 5.5 series.

Percona Server is open-source and free. Details of the release can be found in the 5.5.44-37.3 milestone on Launchpad. Downloads are available here and from the Percona Software Repositories.

Bugs Fixed:

  • Symlinks to libmysqlclient libraries were missing on CentOS 6. Bug fixed #1408500.
  • RHEL/CentOS 6.6 OpenSSL package (1.0.1e-30.el6_6.9), containing a fix for CVE-2015-4000, changed the DH key sizes to a minimum of 768 bits. This caused an issue for MySQL as it uses 512 bit keys. Fixed by backporting an upstream 5.7 fix that increases the key size to 2048 bits. Bug fixed #1462856 (upstream #77275).
  • innochecksum would fail to check tablespaces in compressed format. The fix for this bug has been ported from Facebook MySQL 5.1 patch. Bug fixed #1100652 (upstream #66779).
  • Issuing SHOW BINLOG EVENTS with an invalid starting binlog position would cause a potentially misleading message in the server error log. Bug fixed #1409652 (upstream #75480).
  • While using max_slowlog_size, the slow query log was rotated every time slow query log was enabled, not really checking if the current slow log is indeed bigger than max_slowlog_size or not. Bug fixed #1416582.
  • If query_response_time_range_base variable was set as a command line option or in a configuration file, its value would not take effect until the first flush was made. Bug fixed #1453277 (Preston Bennes).
  • Prepared XA transactions with update undo logs were not properly recovered. Bug fixed #1468301.
  • Variable log_slow_sp_statements now supports skipping the logging of stored procedures into the slow log entirely with new OFF_NO_CALLS option. Bug fixed #1432846.

Other bugs fixed: #1380895 (upstream #72322).

(Please also note that Percona Server 5.6 series is the latest General Availability series and current GA release is 5.6.25-73.0.)

Release notes for Percona Server 5.5.44-37.3 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

The post Percona Server 5.5.44-37.3 is now available appeared first on MySQL Performance Blog.

Jul
01
2015
--

Percona Server 5.6.25-73.0 is now available

Percona ServerPercona is glad to announce the release of Percona Server 5.6.25-73.0 on July 1, 2015. Download the latest version from the Percona web site or from the Percona Software Repositories.

Based on MySQL 5.6.25, including all the bug fixes in it, Percona Server 5.6.25-73.0 is the current GA release in the Percona Server 5.6 series. Percona Server is open-source and free – and this is the latest release of our enhanced, drop-in replacement for MySQL. Complete details of this release can be found in the 5.6.25-73.0 milestone on Launchpad.

New Features:

  • Percona Server has implemented support for PROXY protocol. The implementation is based on a patch developed by Thierry Fournier.

Bugs Fixed:

  • Symlinks to libmysqlclient libraries were missing on CentOS 6. Bug fixed #1408500.
  • RHEL/CentOS 6.6 OpenSSL package (1.0.1e-30.el6_6.9), containing a fix for CVE-2015-4000, changed the DH key sizes to a minimum of 768 bits. This caused an issue for MySQL as it uses 512 bit keys. Fixed by backporting an upstream 5.7 fix that increases the key size to 2048 bits. Bug fixed #1462856 (upstream #77275).
  • Some compressed InnoDB data pages could be mistakenly considered corrupted, crashing the server. Bug fixed #1467760 (upstream #73689) Justin Tolmer.
  • innochecksum would fail to check tablespaces in compressed format. The fix for this bug has been ported from Facebook MySQL 5.6 patch. Bug fixed #1100652 (upstream #66779).
  • Using concurrent REPLACE, LOAD DATA REPLACE or INSERT ON DUPLICATE KEY UPDATE statements in the READ COMMITTED isolation level or with the innodb_locks_unsafe_for_binlog option enabled could lead to a unique-key constraint violation. Bug fixed #1308016 (upstream #76927).
  • Issuing SHOW BINLOG EVENTS with an invalid starting binlog position would cause a potentially misleading message in the server error log. Bug fixed #1409652 (upstream #75480).
  • While using max_slowlog_size, the slow query log was rotated every time slow query log was enabled, not really checking if the current slow log is indeed bigger than max_slowlog_size or not. Bug fixed #1416582.
  • Fixed possible server assertions when Backup Locks are used. Bug fixed #1432494.
  • If query_response_time_range_base variable was set as a command line option or in a configuration file, its value would not take effect until the first flush was made. Bug fixed #1453277 (Preston Bennes).
  • mysqld_safe script is now searching for libjemalloc.so.1 library, needed by TokuDB, in the basedir directory as well. Bug fixed #1462338.
  • Prepared XA transactions could cause a debug assertion failure during the shutdown. Bug fixed #1468326.
  • Variable log_slow_sp_statements now supports skipping the logging of stored procedures into the slow log entirely with new OFF_NO_CALLS option. Bug fixed #1432846.
  • TokuDB HotBackup library is now automatically loaded with mysqld_safe script. Bug fixed #1467443.

Other bugs fixed: #1457113, #1380895, and #1413836.

Release notes for Percona Server 5.6.25-73.0 are available in the online documentation. Please report any bugs on the launchpad bug tracker .

The post Percona Server 5.6.25-73.0 is now available appeared first on MySQL Performance Blog.

Jun
03
2015
--

Percona XtraDB Cluster 5.6.24-25.11 is now available

Percona XtraDB Cluster 5.6.24-25.11Percona is glad to announce the new release of Percona XtraDB Cluster 5.6 on June 3rd 2015. Binaries are available from downloads area or from our software repositories.

Based on Percona Server 5.6.24-72.2 including all the bug fixes in it, Galera Replicator 3.11, and on Codership wsrep API 25.11, Percona XtraDB Cluster 5.6.24-25.11 is now the current General Availability release. All of Percona’s software is open-source and free, and all the details of the release can be found in the 5.6.24-25.11 milestone at Launchpad.

New Features:

  • Percona XtraDB Cluster now allows reads in non-primary state by introducing a new session variable wsrep_dirty_reads. This variable is boolean and is OFF by default. When set to ON, a Percona XtraDB Cluster node accepts queries that only read, but not modify data even if the node is in the non-PRIM state (#1407770).
  • Percona XtraDB Cluster now allows queries against INFORMATION_SCHEMA and PERFORMANCE_SCHEMA even with variables wsrep_ready and wsrep_dirty_reads set to OFF. This allows monitoring applications to monitor the node when it is even in non-PRIM state (#1409618).
  • wsrep_replicate_myisam variable is now both global and session variable (#1280280).
  • Percona XtraDB Cluster now uses getifaddrs for node address detection (#1252700).
  • Percona XtraDB Cluster has implemented two new status variables: wsrep_cert_bucket_count and wsrep_gcache_pool_size for better instrumentation of galera memory usage. Variable wsrep_cert_bucket_count shows the number of cells in the certification index hash-table and variable wsrep_gcache_pool_size shows the size of the page pool and/or dynamic memory allocated for gcache (in bytes).

Bugs Fixed:

  • Using concurrent REPLACE, LOAD DATA REPLACE or INSERT ON DUPLICATE KEY UPDATE statements in the READ COMMITTED isolation level or with the innodb_locks_unsafe_for_binlog option enabled could lead to a unique-key constraint violation. Bug fixed #1308016.
  • Using the Rolling Schema Upgrade as a schema upgrade method due to conflict with wsrep_desync would allows only one ALTER TABLE to run concurrently. Bugs fixed #1330944 and #1330941.
  • SST would resume even when the donor was already detected as being in SYNCED state. This was caused when wsrep_desync was manually set to OFF which caused the conflict and resumed the donor sooner. Bug fixed #1288528.
  • DDL would fail on a node when running a TOI DDL, if one of the nodes has the table locked. Bug fixed #1376747.
  • xinet.d mysqlchk file was missing type = UNLISTED to work out of the box. Bug fixed #1418614.
  • Conflict between enforce_storage_engine and wsrep_replicate_myisam for CREATE TABLE has been fixed. Bug fixed #1435482.
  • A specific trigger execution on the master server could cause a slave assertion error under row-based replication. The trigger would satisfy the following conditions: 1) it sets a savepoint; 2) it declares a condition handler which releases this savepoint; 3) the trigger execution passes through the condition handler. Bug fixed #1438990.
  • Percona XtraDB Cluster Debian init script was testing connection with wrong credentials. Bug fixed #1439673.
  • Race condition between IST and SST in xtrabackup-v2 SST has been fixed. Bugs fixed #1441762, #1443881, and #1451524.
  • SST will now fail when move-back fails instead of continuing and failing at the next step. Bug fixed #1451670.
  • Percona XtraDB Cluster .deb binaries were built without fast mutexes. Bug fixed #1457118.
  • The error message text returned to the client in the non-primary mode is now more descriptive ("WSREP has not yet prepared node for application use"), instead of "Unknown command" returned previously. Bug fixed #1426378.
  • Out-of-bount memory access issue in seqno_reset() function has been fixed.
  • wsrep_local_cached_downto would underflow when the node on which it is queried has no writesets in gcache.

Other bugs fixed: #1290526.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

The post Percona XtraDB Cluster 5.6.24-25.11 is now available appeared first on MySQL Performance Blog.

Jun
01
2015
--

Percona TokuMXse 3.0.3-1.0-rc6 is now available

Percona TokuMXse Percona is glad to announce the release of Percona TokuMXse 3.0.3-1.0rc6 on June 1st 2015. TokuMXse is a TokuMX storage engine for MongoDB 3.0.3. Downloads are available from our download site here. Packages for this Release Candidate release, will be available in our Ubuntu Trusty, Utopic, Vivid, and Debian Jessie apt testing and CentOS 7 yum testing repositories.

Based on MongoDB 3.0.3, including all the bug fixes in it, Percona TokuMXse 3.0.3-1.0 is the current release candidate.

This release contains minor changes to the Fractal Tree, including:

  • Improved tokuftdump information.
  • Fixed sporadic recovery issue due to rare race between transaction rollback and logging.
  • Report capped boolean for uncapped collections when using tokuft storage engine.
Getting started

After installation, you can start mongod with tokuft as the storage engine, with:

$ mongod --storageEngine tokuft

Note: Transparent huge pages must be turned off for the fractal tree engine to work properly. If you attempt to run mongod with that option enabled, an error informing you of this will be printed to the output of the mongod process and it will fail to start.

You can check if the Transparent huge pages are enabled with:

$ cat /sys/kernel/mm/transparent_hugepage/enabled

[always] madvise never

You can disable the transparent huge pages by running the following command as root (NOTE: Setting this will last only until the server is rebooted):

echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

Percona TokuMXse currently supports various tuning parameters to pass to the mongod instance. Full list can be read by running mongod --help (this will print options for all available storage engines, including tokuft).

The post Percona TokuMXse 3.0.3-1.0-rc6 is now available appeared first on MySQL Performance Blog.

May
27
2015
--

Percona XtraBackup 2.2.11 is now available

Percona XtraBackup for MySQL Percona is glad to announce the release of Percona XtraBackup 2.2.11 on May 28, 2015. Downloads are available from our download site or Percona Software Repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

New Features:

  • Percona XtraBackup has been rebased on MySQL 5.6.24.

Bugs Fixed:

  • Version check would crash innobackupex and abort the backup on CentOS 5. Bug fixed #1255451.
  • Percona XtraBackup could crash when preparing the backup taken on MySQL/Percona Server 5.5 if there were open temporary tables during the backup. Bug fixed #1399471 (Fungo Wang).
  • Percona XtraBackup would fail to prepare the backup if the xtrabackup_logfile was lager than 512GB. Bug fixed #1425269.
  • Fix for bug #1403237 was incomplete, due to setting wrong offset last copied batch of log records was copied from wrong location. Bug fixed #1448447.
  • Percona XtraBackup now executes an extra FLUSH TABLES before executing FLUSH TABLES WITH READ LOCK to potentially lower the impact from FLUSH TABLES WITH READ LOCK. Bug fixed #1277403.
  • Regression introduced by fixing #1436793 in Percona XtraBackup 2.2.10 caused an error when taking an incremental backup from MariaDB 10. Bug fixed #1444541.
  • Percona XtraBackup now prints and stores the file based binlog coordinates in xtrabackup_binlog_info even though GTID is enabled. Bug fixed #1449834.
  • Percona XtraBackup doesn’t print warnings anymore during the prepare phase about missing tables when a filtering option (--databases, --tables, etc.) is provided. Bug fixed #1454815 (Davi Arnaut).

Other bugs fixed: #1415191.

Release notes with all the bugfixes for Percona XtraBackup 2.2.11 are available in our online documentation. Bugs can be reported on the launchpad bug tracker. Percona XtraBackup is an open source, free MySQL hot backup software that performs non-blocking backups for InnoDB and XtraDB databases.

The post Percona XtraBackup 2.2.11 is now available appeared first on MySQL Performance Blog.