Dec
16
2015
--

Using dbsake to recover table structure from .frm files and process mysqldump output

We work on data recoveries quite often. In many cases, we recover table structures from the .frm files because there was no backup available. There is already a great blog post by my colleague Miguel Ángel Nieto about how we can recover structures from .frm files using MySQL utilities.

This works pretty well, and we prefer to run mysqlfrm with the “--server” option to get all possible information from a .frm file. However, this option expects MySQL to be up and running so that mysqlfrm can spawn a new MySQL instance and run the structure recovery there.

Recently I came across a tool that makes this job easier. The tool is called dbsake; it’s a collection of command-line tools that perform various DBA-related tasks for MySQL. In this blog post, we will look at two very useful dbsake commands.

Installation is very easy and straightforward: dbsake ships as an executable Python zip archive with all dependencies included.

# curl -s http://get.dbsake.net > dbsake
# chmod u+x dbsake
# ./dbsake --version
dbsake, version 2.1.0 9525896

Recovering table structures from MySQL .frm files with dbsake

To recover table structures using dbsake, you need to use the “dbsake frmdump” command, followed by the .frm file path. The frmdump command decodes the MySQL .frm file and provides a “CREATE TABLE” or “CREATE VIEW” statement in the output. The good thing is that it doesn’t require a running MySQL server instance, and it interprets the .frm file according to rules similar to those the MySQL server uses.

Let’s see an example:

# ./dbsake frmdump /var/lib/mysql/sakila/staff.frm
--
-- Table structure for table `staff`
-- Created with MySQL Version 5.6.27
--
CREATE TABLE `staff` (
`staff_id` tinyint(3) unsigned NOT NULL AUTO_INCREMENT,
`first_name` varchar(45) NOT NULL,
`last_name` varchar(45) NOT NULL,
`address_id` smallint(5) unsigned NOT NULL,
`picture` blob,
`email` varchar(50) DEFAULT NULL,
`store_id` tinyint(3) unsigned NOT NULL,
`active` tinyint(1) NOT NULL DEFAULT '1',
`username` varchar(16) NOT NULL,
`password` varchar(40) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
`last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`staff_id`),
KEY `idx_fk_store_id` (`store_id`),
KEY `idx_fk_address_id` (`address_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

The result looks pretty good and has recovered the character set and collation information as well. We can also see the MySQL version (5.6.27) retrieved from the .frm file. It is important to mention that the command only decodes the information available in the .frm file, which means that it cannot recover InnoDB foreign-key references or AUTO_INCREMENT values; these items are stored outside of the .frm file.

The frmdump command makes the recovery process easier and faster. We can easily script this and recover the structure of a large number of tables. For example, if we need to recover the structure of all tables from the world database, we can do the following:

mysql> create database world_recover;
# for tbl in `ls -1 /var/lib/mysql/world/*.frm`; do ./dbsake frmdump $tbl | mysql world_recover; done;
mysql> show tables from world_recover;
+-------------------------+
| Tables_in_world_recover |
+-------------------------+
| city                    |
| country                 |
| countrylanguage         |
+-------------------------+
3 rows in set (0.00 sec)

Filter and transform a mysqldump stream with dbsake

It’s a very common requirement to filter one or more tables from a mysqldump full database backup. The “dbsake sieve [options]” command helps us to filter or transform mysqldump output.

Let’s see how to extract a single table from a mysqldump file.

# mysqldump world > world.sql
# cat world.sql | ./dbsake sieve -t world.city > world.city.sql
Processed . Output: 1 database(s) 1 table(s) and 0 view(s)

The “-t” or “--table” option tells the command to only output the table matching the given pattern. It also reports the number of databases, tables and views processed in the output.

To extract multiple tables, you can pass “-t db.tbl” multiple times.

# cat world.sql | ./dbsake sieve -t world.city -t world.country > world.city_country.sql
Processed . Output: 1 database(s) 2 table(s) and 0 view(s)
# cat world.city_country.sql | grep -i 'create table'
CREATE TABLE `city` (
CREATE TABLE `country` (

The latest Percona Server added a new mysqldump option, “--innodb-optimize-keys”. It changes the way InnoDB tables are dumped so that secondary keys are created after the data is loaded, taking advantage of InnoDB fast index creation. This is a really great feature, as it helps us restore data more efficiently than the default dump format, where every secondary index is maintained incrementally as the rows are inserted.

Using the “dbsake sieve [options]” command, we can transform regular mysqldump output to take advantage of fast index creation. The “--defer-indexes” option rewrites the output of CREATE TABLE statements and arranges for secondary indexes to be created after the table data is loaded. Similarly, the “--defer-foreign-keys” option can be added to create foreign key constraints after loading the table data.

Let’s see an example:

# cat world.sql | ./dbsake sieve --defer-indexes --defer-foreign-keys -t world.city  > world.city.sql
Processed . Output: 1 database(s) 1 table(s) and 0 view(s)

This means that world.city.sql will contain the table structure with only the primary key, then the INSERT statements to load the data, and an additional ALTER TABLE statement that creates the secondary keys (when there is at least one secondary index to add). Foreign keys are also created together with the secondary indexes.

The original structure of table world.city:

CREATE TABLE `city` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `Name` char(35) NOT NULL DEFAULT '',
  `CountryCode` char(3) NOT NULL DEFAULT '',
  `District` char(20) NOT NULL DEFAULT '',
  `Population` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`ID`),
  KEY `CountryCode` (`CountryCode`),
  CONSTRAINT `city_ibfk_1` FOREIGN KEY (`CountryCode`) REFERENCES `country` (`Code`)
) ENGINE=InnoDB AUTO_INCREMENT=4080 DEFAULT CHARSET=latin1;

And the transformation done by dbsake:

...........
CREATE TABLE `city` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `Name` char(35) NOT NULL DEFAULT '',
  `CountryCode` char(3) NOT NULL DEFAULT '',
  `District` char(20) NOT NULL DEFAULT '',
  `Population` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=4080 DEFAULT CHARSET=latin1;
...........
LOCK TABLES `city` WRITE;
...........
INSERT INTO `city` VALUES .....................
...........
UNLOCK TABLES;
--
-- InnoDB Fast Index Creation (generated by dbsake)
--
ALTER TABLE `city`
  ADD KEY `CountryCode` (`CountryCode`),
  ADD CONSTRAINT `city_ibfk_1` FOREIGN KEY (`CountryCode`) REFERENCES `country` (`Code`);
...........

For more sieve command options, please read the online manual.

Conclusion

The dbsake command makes it easier to recover table structures from .frm files. We can also filter and transform the mysqldump output easily without writing a complex awk or sed script. There are some more useful features of this tool that you can read about in the manual.


Jul
17
2014
--

Q&A: Even More Deadly Mistakes of MySQL Development

On Wednesday I gave a presentation on “How to Avoid Even More Common (but Deadly) MySQL Development Mistakes” for Percona MySQL Webinars. If you missed it, you can still register to view the recording and my slides.

Thanks to everyone who attended, and especially to folks who asked the great questions.  I answered as many as we had time for  during the session, but here are all the questions with my complete answers:

Q: Disk bandwidth also not infinite ;-)

Indeed, you’re right!

We discussed in the webinar the impact on network bandwidth from using column wildcards in queries like SELECT *, but it’s also possible that using SELECT * can impact disk operations. Varchar, Blob, or Text columns can be stored on extra pages in the database, and if you include those columns in your query needlessly, it can cause the storage engine to do a lot of seeks and page reads unnecessarily.

For more details on string storage in InnoDB, see Peter Zaitsev’s blog on Blob Storage in Innodb.

Q: How many tables can be joined in a single query? What is the optimal number of joins?

MySQL has a limit of 63 table references in a given query. This limits how many JOIN operations you can do, and also limits the number of UNIONs. Actually, you can go over this limit if your JOIN or UNION doesn’t reference any tables, that is, if it creates a derived table of one row of expressions.

If you do join a lot of tables (or even self-join the same table many times), you’re likely to hit a practical scaling limit long before you reach 63 table references. The practical limit in your case depends on many factors, including the size of the tables, the data types, the types of join expressions in your queries, and your physical server’s capabilities. It’s not a fixed limit I can cite for you.

If you think you need dozens of table references in a single query, you should probably step back and reconsider your database design or your query design.

I often see this type of question (“what is the limit on the number of joins?”) when people try to use key/value tables, also called Entity-Attribute-Value, and they’re trying to pivot attributes from rows into columns, as if the table were stored in a conventional way with one column per attribute. This is a broken design for many reasons, and the scalability of many-way joins is just one problem with it.

Q: How many indexes can be created in a single table? Any limitation? What is the optimal number of indexes?

All MySQL storage engines support at least 16 indexes per table.

As far as the optimal number of indexes, I don’t pay attention to the number of indexes (as long as it remains lower than the max of 16). I try to make sure I have the right indexes for my queries. If you put an arbitrary cap of for example 8 or 10 indexes on a given table, then you might be running queries that lack a needed index, and the unnecessary extra cost of running that query is probably greater than the cost of maintaining the one extra index it needs.

That said, there are cases where you have such variation in query types that there’s no way to have optimal indexes to cover every possible case. Given that you can have multi-column indexes, and multi-column indexes with columns in different orders, there are n-factorial possible indexes on a table with n columns.

Q: There is a table with 3 columns: id (int), user_id (int), day (date). There is a high chance the same user_id will ‘exist’ for every day. I read data by “where user_id = some_id” (very high throughput) and delete all entries once a day by cron using “where sent_date = ’2014-01-01′”. With approx 6M rows per day, deletion is pretty painful. Will partitioning by column ‘day’ help me delete those bulks faster? If yes, how much faster? How much will it slow down SELECTs? Not all entries are deleted, only entries for some specific old day, e.g. ‘WHERE day = ’1 week ago’’

Range partitioning by date would give you the opportunity to ALTER TABLE…DROP PARTITION, so you could remove all data for a given date very quickly, much faster than deleting millions of rows. The performance of DROP PARTITION is like that of DROP TABLE, because each partition is physically stored like a separate table.

Searching for “where user_id = ?” would not be able to take advantage of partition pruning, but it would still be able to use an index on user_id. And if you drop old partitions, the benefit of searching a smaller table could be a good tradeoff.
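As a rough sketch of what that could look like for the table in the question (the table and database names and the exact partition layout are assumptions, and note that the partitioning column has to be part of every unique key, which is why the date is added to the primary key):

# A sketch only: mysql client plus a heredoc, with made-up names.
mysql mydb <<'SQL'
CREATE TABLE user_events (
  id      INT UNSIGNED NOT NULL,
  user_id INT UNSIGNED NOT NULL,
  day     DATE NOT NULL,
  PRIMARY KEY (id, day),        -- partitioning column must appear in every unique key
  KEY idx_user_id (user_id)     -- still used for "WHERE user_id = ?" lookups
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(day)) (
  PARTITION p20140101 VALUES LESS THAN (TO_DAYS('2014-01-02')),
  PARTITION p20140102 VALUES LESS THAN (TO_DAYS('2014-01-03')),
  PARTITION pmax      VALUES LESS THAN MAXVALUE
);
-- Purging one day is a quick metadata operation instead of a 6M-row DELETE:
ALTER TABLE user_events DROP PARTITION p20140101;
SQL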

Q: Regarding 20% selectivity as a threshold for the optimizer preferring a table-scan to an index lookup – is that a tunable?

No, it’s not tunable, it’s a fixed behavior of the query optimizer. If you search for a value and the optimizer estimates that > 20% of rows contain the value you search for, it will bypass the index and just do a table-scan.

It’s for the same reason that the index of a book doesn’t include very common words: the list of pages that the word appears on would be too long, and flipping back and forth from the back of the book to each listed page would actually be more work than just reading the book.

Also keep in mind my figure of 20% is approximate. Your results may vary. This is not a magic threshold in the source code, it’s just a tendency I have observed.

Q: Regarding generating synthetic test data, it sounds like a pretty easy perl script to write.

Yes, it might be easy to do that for one given table. But every table is different, and you might have hundreds of tables in dozens of applications to generate test data for. You might also want to vary the distribution of data values from one test to another.

Writing a test-data generator for one particular case is easy, so you might reasonably do it as a one-off task. Writing a general-purpose test-data generator that you can use for many cases is more work.

Q: Would love to have the set of URLs cited in the presentation without having to go back and mine them out of the presentation.

Open source message queues:

MySQL Performance Blog articles:

Open source test-data generator:

Load-testing tools for web applications:

Load-testing tools to replay query logs:

Further reading for implementing business rules:

Q: How to best use mysql query cache?

Any cache is best used if you read from it many times for each time you write to it. So we’d like to estimate the average ratio of query cache reads to writes, to estimate how much leverage it’s giving us.

mysql> SHOW GLOBAL STATUS LIKE 'Qcache%';

Check the value of Qcache_hits (cases when a query result was read from the query cache) versus Qcache_inserts (cases when the desired query result was not in the cache, so the query had to be run and its result stored in the cache). I like to see a ratio of 1000% or more (i.e. 10:1 hits to inserts).
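For example, one way to compute that ratio from the shell (a sketch; point it at your own server and credentials):

mysql -B -N -e "SHOW GLOBAL STATUS LIKE 'Qcache%'" |
awk '$1 == "Qcache_hits"    { hits = $2 }
     $1 == "Qcache_inserts" { ins  = $2 }
     END { if (ins > 0) printf "hit/insert ratio: %.1f:1\n", hits/ins }'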

If you have a poor ratio, for example less than 1:1 or less than 100%, then consider disabling the query cache, because it may be costing more to maintain it than the performance benefit it’s giving you.

Keep in mind that this is only a guideline, because the calculation I described is only an average. It could be that the queries served by the query cache are very expensive, so using the cached result is a great benefit even if it accounts for a small number of hits. The only way to be certain is to load-test your application under your load, and compare overall performance results with the query cache enabled or disabled, and at different sizes.

Q: How to detect when too much indexes start to affect performance?

Some people are reluctant to create indexes because they have been warned that indexes require synchronous updates when you INSERT, UPDATE, or DELETE rows. Some people also make the generalization that indexes harm writes but benefit reads. Both of these are not true.

Your DML operations aren’t really updating indexes in real time. InnoDB includes a feature called change buffering, which defers index updates. The change buffer is gradually merged into the index over time. That way, InnoDB can handle a big spike in traffic without it hurting throughput as much. You can monitor how much content in the change buffer remains to be merged:
mysql> SHOW GLOBAL STATUS LIKE 'Innodb_ibuf_size';

It’s also not accurate that indexes hurt writes. UPDATE and DELETE statements usually have a WHERE clause, to apply the changes to particular rows. These conditions use indexes to reduce the examined rows, just like in SELECT statements. But in UPDATE and DELETE statements, it’s even more important to use indexes, because otherwise the statement has to lock a lot of rows to ensure it locks the rows you’re changing.

So I generally say, don’t avoid indexes based only on the number of indexes you have, just make sure your indexes are being employed by the queries you run, and drop indexes that aren’t used. Here are a couple of past blog posts that show how to do this:
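As a quick sketch (not from those posts), on MySQL 5.6 you can ask performance_schema which indexes have not been touched since the last restart; pt-index-usage from Percona Toolkit can produce a similar report from a slow query log. The query below assumes performance_schema is enabled, and its result is only a heuristic since it reflects activity since the last server start:

# List secondary indexes with no recorded I/O since server start
mysql -e "
  SELECT object_schema, object_name, index_name
  FROM performance_schema.table_io_waits_summary_by_index_usage
  WHERE index_name IS NOT NULL
    AND index_name <> 'PRIMARY'
    AND count_star = 0
    AND object_schema NOT IN ('mysql', 'performance_schema')
  ORDER BY object_schema, object_name;
"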

Thanks again for attending my webinar!  Here are some more tips:

  • Check out upcoming Percona Training classes in North America and Europe.
  • Join Percona and the MySQL community at our Percona Live conferences.
  • Watch more webinars from Percona in the future!


May
23
2014
--

How to improve InnoDB performance by 55% for write-bound loads

During April’s Percona Live MySQL Conference and Expo 2014, I attended a talk on MySQL 5.7 performance and scalability given by Dimitri Kravtchuk, the Oracle MySQL benchmark specialist. He mentioned at some point that the InnoDB double write buffer was a real performance killer. For those who don’t know what the InnoDB double write buffer is, it is a disk buffer where pages are written before being written to the actual data file. Upon restart, complete pages found in the double write buffer are rewritten to their data files; this avoids data file corruption from half-written pages. I knew it had an impact on performance (on ZFS, since it is transactional, I always disable it), but I never realized how important the performance impact could be. Back from PLMCE, a friend had dropped off a Dell R320 server at my home, asking me to set up the OS and test it. What better way to test a new server than to run benchmarks on it, so here we go!

ZFS is not the only transactional filesystem; ext4, with the option “data=journal”, can also be transactional. So the question is: is it better to have the InnoDB double write buffer enabled, or to rely on the ext4 transaction log? And if the latter is better, how does it compare with xfs, the filesystem I used to propose to customers but which does not support transactions?

Methodology

The goal is to stress the double write buffer, so the load has to be write intensive. The server has a simple mirror of two 7.2k rpm drives. There is no controller write cache and the drives’ write caches are disabled. I decided to use the Percona tpcc-mysql benchmark tool with 200 warehouses; the total dataset size was around 18G, fitting entirely within the InnoDB buffer pool (the server has 24GB of RAM). Here’s the relevant part of the my.cnf:

innodb_read_io_threads=4
innodb_write_io_threads=8  #To stress the double write buffer
innodb_buffer_pool_size=20G
innodb_buffer_pool_load_at_startup=ON
innodb_log_file_size = 32M #Small log files, more page flush
innodb_log_files_in_group=2
innodb_file_per_table=1
innodb_log_buffer_size=8M
innodb_flush_method=O_DIRECT
innodb_flush_log_at_trx_commit=0
skip-innodb_doublewrite  #commented or not depending on test

So, I generated the dataset for 200 warehouses, added the keys but not the foreign key constraints, loaded all of that into the buffer pool with a few queries and dumped the buffer pool. Then, with MySQL stopped, I did a file-level backup to a different partition. I used the MySQL 5.6.16 version that comes with Ubuntu 14.04; at the time, Percona Server was not available for 14.04. Each benchmark followed this procedure:

  1. Stop mysql
  2. umount /var/lib/mysql
  3. comment or uncomment skip-innodb_doublewrite in my.cnf
  4. mount /var/lib/mysql with specific options
  5. copy the reference backup to /var/lib/mysql
  6. Start mysql and wait for the buffer pool load to complete
  7. start tpcc from another server

The tpcc_start command I used is the following:

./tpcc_start -h10.2.2.247 -P3306 -dtpcc -utpcc -ptpcc -w200 -c32 -r300 -l3600 -i60

I used 32 connections, let the tool run for 300s of warm up, enough to reach a steady level of dirty pages, and then, I let the benchmark run for one hour, reporting results every minute.
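For reference, each iteration can be scripted roughly like this (a sketch only; the device name, backup path and the way my.cnf is toggled are assumptions from my setup):

#!/bin/bash
# One benchmark iteration; edit my.cnf (skip-innodb_doublewrite) before running.
MOUNT_OPTS="rw,data=journal"      # or "rw", or "rw,dioread_nolock", depending on the test

service mysql stop
umount /var/lib/mysql
mount -o "$MOUNT_OPTS" /dev/sda2 /var/lib/mysql
rsync -a --delete /backup/mysql-reference/ /var/lib/mysql/
chown -R mysql:mysql /var/lib/mysql
service mysql start                # then wait for the buffer pool load to complete
# finally, launch ./tpcc_start from the other server (command shown above)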

Results

Test                 Double write buffer   File system options   Average NOTPM over 1h
ext4_dw              Yes                   rw                     690
ext4_dionolock_dw    Yes                   rw,dioread_nolock      668
ext4_nodw            No                    rw                    1107
ext4trx_nodw         No                    rw,data=journal       1066
xfs_dw               Yes                   xfs rw,noatime         754

 

So, from the above table, the first test I did was the common case: ext4 with the InnoDB double write buffer enabled, and it yielded 690 new order transactions per minute (NOTPM). Reading the ext4 documentation, I also wanted to try the “dioread_nolock” setting that is supposed to reduce mutex contention, and this time I got slightly less: 668 NOTPM. The difference is within the measurement error and isn’t significant. Removing the InnoDB double write buffer, although unsafe, boosted the throughput to 1107 NOTPM, a 60% increase! Wow, indeed the double write buffer has a huge impact. But what is the impact of asking the file system to replace the InnoDB double write buffer? Surprisingly, the performance level is only slightly lower, at 1066 NOTPM, and vmstat did report twice the amount of writes. I needed to redo the tests a few times to convince myself. Getting a 55% increase in performance with the same hardware is not common, except when some trivial configuration error is being corrected. Finally, I used to propose xfs with the InnoDB double write buffer enabled to customers; that’s about 10% higher than ext4 with the InnoDB double write buffer, close to what I was expecting. The graphic below presents the numbers in a more visual form.

TPCC NOTPM for various configurations

In terms of performance stability, you’ll find below a graphic of the per-minute NOTPM output for three of the tests: ext4 non-transactional with the double write buffer, ext4 transactional without the double write buffer, and xfs with the double write buffer. The dispersion is qualitatively similar for all three. The values presented above are just the averages of those data sets.

TPCC NOTPM evolution over time

Safety

InnoDB data corruption is not fun, and removing the InnoDB double write buffer is a bit scary. In order to be sure it is safe, I executed the following procedure ten times:

  1. Start mysql and wait for recovery and for the buffer pool load to complete
  2. Check the error log for no corruption
  3. start tpcc from another server
  4. After about 10 minutes, physically unplug the server
  5. Plug back and restart the server

I observed no corruption. I was still a bit worried: what if the test was wrong? I removed the “data=journal” mount option and did a new run. I got corruption the first time. So, given the procedure I followed and the number of crash tests, I think it is reasonable to assume it is safe to replace the InnoDB double write buffer with the ext4 transactional journal.

I also looked at the kernel ext4 sources and changelog. Until recently (before kernel 3.2), O_DIRECT wasn’t supported with data=journal, and MySQL would have issued a warning in the error log. Now, with recent kernels, O_DIRECT is mapped to O_DSYNC and the O_DIRECT behavior is faked when data=journal is used, which is exactly what is needed. Indeed, I tried “innodb_flush_method = O_DSYNC” and found the same results. With older kernels, I strongly advise using the “innodb_flush_method = O_DSYNC” setting to make sure files are opened in a way that keeps them transactional on ext4. As always, test thoroughly; I only tested on Ubuntu 14.04.

Impacts on MyISAM

Since we are no longer really using O_DIRECT, even if set in my.cnf, the OS file cache will be used for InnoDB data. If the database is only using InnoDB that’s not a big deal but if MyISAM is significantly used, that may cause performance issues since MyISAM relies on the OS file cache so be warned.

Fast SSDs

If you have a SSD setup that doesn’t offer a transactional file system like the FusionIO directFS, a very interesting setup would be to mix spinning drives and SSDs. For example, let’s suppose we have a mirror of spinning drives handled by a raid controller with a write cache (and a BBU) and an SSD storage on a PCIe card. To reduce the write load to the SSD, we could send the file system journal to the spinning drives using the “journal_path=path” or “journal_dev=devnum” options of ext4. The raid controller write cache would do an awesome job at merging the write operations for the file system journal and the amount of write operations going to the SSD would be cut by half. I don’t have access to such a setup but it seems very promising performance wise.
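If you want to experiment with that layout, ext4 supports an external journal device; a rough sketch (device names are made up, and these commands destroy existing data on the devices):

# Create a dedicated external journal on the spinning-drive array ...
mke2fs -O journal_dev /dev/md0
# ... build the data filesystem on the SSD and point it at that journal ...
mkfs.ext4 -J device=/dev/md0 /dev/fioa1
# ... then mount it in transactional mode as before
mount -o rw,data=journal /dev/fioa1 /var/lib/mysql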

Conclusion

Like ZFS, ext4 can be transactional, and replacing the InnoDB double write buffer with the file system transaction journal yields a 55% increase in performance for a write-intensive workload. Performance gains are also expected for SSD and mixed spinning/SSD configurations.


Feb
12
2014
--

8 common (but deadly) MySQL operations mistakes and how to avoid them

On January 22 I gave a presentation on “How to Avoid Common (but Deadly) MySQL Operations Mistakes” for Percona MySQL Webinars. If you missed it, you can still register to view the recording and my slides.

Thanks to everyone who attended, and especially to folks who asked the great questions. I answered as many as we had time for during the session, but here are all the questions with my complete answers:

Q: What if I use rsync to sync the mysql dir to another server as a backup?

You can do this only if you shut down the mysqld daemon first. Otherwise, you have a high risk that your copy will be incomplete and impossible to restore.

If you need to create a physical backup on a running instance of MySQL, use Percona XtraBackup. This is safe because this tool carefully synchronizes copying the tablespace with the transaction log, so it assures against getting a partial copy.
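For reference, a minimal innobackupex run looks roughly like this (user, password and paths are placeholders; check the XtraBackup documentation for your version):

# Take a consistent physical backup from a running server ...
innobackupex --user=backupuser --password=secret /backups/
# ... then "prepare" it (apply the copied transaction log) before it can be restored
innobackupex --apply-log /backups/2014-02-12_10-00-00/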

Q: Is MIXED binlog-format useful as well as ROW?

The MIXED format defaults to STATEMENT, and switches to ROW for individual events that it detects are non-deterministic and therefore unsafe for statement-based replication. In theory, this should give you the best of both worlds. But there could still be cases where MySQL replicates an event in statement format because it fails to detect a non-deterministic case.

Q: Percona Server version 5.1 — is this version compatible with pt-mext tool?

Yes, pt-mext works fine with Percona Server 5.1.

Q: We have multiple slaves and slaves to slaves and one slave sends changes to the master. Checksum is breaking the replication.

I am not sure I follow your description of your topology, or how the slave would send changes to the master.

I suggest you contact Percona Oncall and they can help you solve any issue with running checksums.

Q: To verify a restore can the check table extended command be used? This supposedly makes sure the table is consistent. I was wondering if this is still useful in to verify a restore in the latest versions of MySQL.

CHECK TABLE analyzes a table for errors, but it can’t tell if the data is different from the data in the same table on a different instance (e.g. a slave). This can check for physical corruption after a restore, but it can’t verify that the data is correct.

Q: Query optimize really helps? And for large databases like 100GB +, how that will be affected?

By choosing query optimization techniques carefully, some queries can be made to run thousands of times faster.

The larger the table, the more important it is to make sure your queries run efficiently.

Q: Is pt-online-schema-change available in 5.1? All my servers are Percona 5.1.

Yes, pt-online-schema-change works fine with Percona Server 5.1.

Q: What is the best way to perform backup?

Choosing the right tool, schedule, and method for performing backups is a complex problem. I suggest you read a white paper Percona has published on “MySQL Backup and Recovery Best Practices.”

Q: Is there a list of measurable indicators of performance somewhere?

A good tool to help you monitor many performance indicators is Percona Monitoring Plugins. This works with popular open-source monitoring frameworks like Cacti and Zabbix.

Q: How does most of this apply to Amazon’s RDS? Not having direct root access seems like a problem.

You’re right, Amazon RDS is a convenient way to deploy a preconfigured MySQL appliance, but you don’t get to log into the server, and many MySQL tuning variables are not accessible. We can still use some of the tools we are accustomed to using with MySQL, because the tools can access a remote MySQL server. But other tools require local access to the data directory.

Amazon RDS has a management console that allows you to do backups and restores, but if you want to choose specific tools, you may need to migrate to another environment such as Amazon EC2.

Q: A sales person told me that Percona XtraDB Cluster was not fully baked yet about half a year ago, is it ready to go for production now? (we have cluster instances in Amazon for low latency)

PXC is fully baked, has a beautiful golden crust, and smells delicious.

But seriously, we’ve helped many customers deploy PXC over the past year, and it’s working in many production environments.

Q: What buffer size and max_heap_size would you recommend for small 512 Mb RAM server (runs php5-fpm+nginx)?

I suggest you try the Percona Configuration Wizard for MySQL to get you started. It will suggest configuration values appropriate for your server hardware. This won’t be optimized specifically for your site’s workload, but it will be a good start.

Q: Is there any harm in in running pt-table-sync without running pt-table-checksum?

No harm. You can optionally use pt-table-sync to calculate its own checks to find out which rows need to be synchronized. You can even synchronize MySQL instances that aren’t replication master and slave.

Q: Is Percona XtraDB Cluster a viable option when MySQL servers are located in different data centers and connected via shared Internet connections?

Yes, this is a great use case for PXC. Refer to an early proof of concept test we ran to prove that multi-datacenter clusters work, and our white paper on High Level Multi-Datacenter MySQL High Availability.

Q: Can Percona XtraBackup be used to take a backup of a single table?

Yes, you can use partial backup options to make innobackupex back up only specific databases or specific tables.
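For example, something along these lines (table names and paths are placeholders; partial backups require innodb_file_per_table, and restoring individual tables afterwards still involves the per-table import steps):

# Back up a single table (regex matched against database.table) ...
innobackupex --include='^world[.]city$' /backups/partial/
# ... or one or more whole databases
innobackupex --databases="world sakila" /backups/partial/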

Q: What methods do you recommend to replicate the binlogs outside of replication? We are working with DRBD any other recommendations?

MySQL 5.6 adds an option to the mysqlbinlog tool to back up binary log files continuously. So you can effectively keep your binlogs backed up on a separate server for safety.
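A sketch of what that looks like with the 5.6 client (host, credentials and the starting binlog name are placeholders):

# Connect to the master like a replication slave and mirror its binlogs to local files,
# staying connected and waiting for new events (--stop-never)
mysqlbinlog --read-from-remote-server --host=master.example.com \
            --user=repl --password \
            --raw --stop-never mysql-bin.000001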

Q: How will pt-table-checksum tolerate binlog-format=MIXED with GTID replication?

pt-table-checksum must use statement-based binlog events for the checksums to work, so it overrides any default binlog row format you have defined on your system.

Q: What are your thoughts on SymmetricDS for db replication over standard MySQL replication?

I have not evaluated SymmetricDS, so I can’t offer a specific opinion about it.

Most alternative solutions fit a specific type of project, and no single solution works for everyone.
So if this one works for your needs, it’s worth taking a look at it.

You should compare it with Tungsten Replicator, which is designed for a similar use case, as a highly-available solution for multi-master and multi-site replication.

Q: A question about indexes: in a table with persons, should I add an index on the column gender?

The best indexes depend on what queries you need to run, and the selectivity of data. If you never run a query that uses the gender column for searching or sorting, there would be no benefit to adding an index on it.

Furthermore, MySQL may still not use an index even if you do search on that column, if the value you search for occurs in a large portion (typically 20%+) of the rows of the table.

Q: I have tried Percona XtraBackup but I’m not sure about the best way to backup full server and restore only a single database from that full backup. I’m using mysqldump to backup and restore now.

Percona XtraBackup does support a method to restore individual tablespaces, but the steps to do it are laborious, and must be done one table at a time. Restoring all tables from a given database this way is possible, but involves more work that so far is mostly manual.

To be honest, using mysqldump is probably still the easier solution for this.

Q: Does Percona Xtradb Cluster have any replication drift? How can one minimize it?

PXC uses a different method of replication, not the built-in replication of standard MySQL. For purposes of replication drift and lag, you can think of it as similar to ROW based replication with semi-synchronous slaves. PXC should therefore have minimal chance of replication drift all by itself.

Q: How reliable are Percona XtraBackup incremental backups, in combination with binary logs for point in time recovery?

The incremental backups in Percona XtraBackup work very reliably; the most common problem is when you make a mistake and apply the incremental backups in an incorrect order.

Likewise, binary logs are reliable, but you must apply all the binary logs after the last incremental backup has been applied, and you must have a contiguous set of binary logs. For maximum safety, use sync_binlog=1 to assure the last events in the binlog are written to disk.
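Roughly, the replay step after restoring the last incremental looks like this (file names, position and stop time are placeholders):

# Replay binlog events from the backup's recorded position up to just before the failure
mysqlbinlog --start-position=107 \
            --stop-datetime="2014-02-12 10:30:00" \
            mysql-bin.000042 mysql-bin.000043 | mysql -u root -p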

. . .

Thanks again for attending my webinar! Here are some more tips:


Jan
13
2014
--

Webinars: Common (but deadly) MySQL mistakes

On January 22 I’ll be presenting the first of a series of two new webinars focusing on avoiding common MySQL mistakes:

How to Avoid Common (but Deadly) MySQL Operations Mistakes.

“Don’t bother me with DBA ‘best practices,’ we have real work to get done.”

So go the famous last words from overworked IT managers (we’ve all been there at one time or another).

Best practices don’t have to interrupt your work or make it harder. Ideally, they should make database operations more predictable and less fraught with emergencies or surprises. This presentation provides a short list of common DBA failures, and makes the case that you can save yourself time (and grey hair) by adopting better habits.

Some of the tips include:

  • How to know when and why your server changed configuration values.
  • How to have bulletproof backups.
  • How to save up to 20% of your database size at the same time as reducing the query optimizer’s work.

Then on March 5 I’ll present the second webinar in this series:

How to Avoid Common (but Deadly) MySQL Development Mistakes

“We need a database that ‘just works’ and runs at ‘web scale.’”

MySQL software developers are under pressure to do more in less time, and create applications that adapt to ever-changing requirements.

But it’s true that some of a developer’s time is wasted when their method of optimizing involves trying every combination of code and index definition. There must be a more direct path to the best database code. This talk shows you some of these methods, including:

  • How to find out which indexes are the best fit for your MySQL application
  • How to protect yourself from the number one MySQL database security vulnerability on the web
  • How to decide when to optimize a MySQL database application with denormalization, indexes, caching, partitioning, sharding

At the end of this webinar, you’ll be more productive and confident as you develop MySQL database-driven applications.

Please join me!  Register for the first webinar or register for the second webinar!  Or register for both and get two for the price of one (just kidding; they’re both free)!


Dec
03
2013
--

innodb_stats_on_metadata and slow queries on INFORMATION_SCHEMA

INFORMATION_SCHEMA is usually the place to go when you want to get facts about a system (how many tables do we have? What are the 10 largest tables? What are the data size and index size for table t? etc.). However, it is also quite common that such queries are very slow and create lots of I/O load. Here is a tip to avoid these hassles: set innodb_stats_on_metadata to OFF.

This is a topic we already talked about, but given the number of systems suffering from INFORMATION_SCHEMA slowness, I think it is good to bring innodb_stats_on_metadata back on the table.

The problem

Let’s look at a system I’ve seen recently: MySQL 5.5, working set fitting in memory but not the whole dataset, around 4000 InnoDB tables.

The I/O load is very light as the server is an idle replica. You can see the I/O load from this partial pt-diskstats output:

#ts device    rd_s rd_avkb    wr_s wr_avkb  busy in_prg
1.0 sda2       0.0     0.0     0.0     0.0    0%      0
1.0 sda2       0.0     0.0    16.0     9.2    0%      0
1.0 sda2       0.0     0.0     0.0     0.0    0%      0
1.0 sda2       0.0     0.0     0.0     0.0    0%      0
1.0 sda2       0.0     0.0     2.0     4.0    0%      0

The customer wanted to know what could be improved from the schema so we started by finding the 10 largest tables:


mysql> SELECT table_schema as 'DB',
table_name as 'TABLE',
CONCAT(ROUND(( data_length + index_length ) / ( 1024 * 1024 * 1024 ), 2), 'G') 'TOTAL'
FROM information_schema.TABLES
ORDER BY data_length + index_length DESC
LIMIT 10;
[...]
10 rows in set (1 min 32.23 sec)

1m32s is slow, but it’s not really a problem. The I/O load triggered by this query, however, IS a problem:

#ts device    rd_s rd_avkb    wr_s wr_avkb busy in_prg
1.0 sda2       0.0     0.0    18.0     8.7   0%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2     100.0    16.0     0.0     0.0  79%      1
1.0 sda2     184.0    16.0     5.0     4.8  96%      1
1.0 sda2      97.0    16.0     0.0     0.0  98%      1
1.0 sda2     140.0    16.0     0.0     0.0  98%      1
1.0 sda2     122.0    16.0    17.0     4.0  98%      1
1.0 sda2     147.0    16.0     0.0     0.0  98%      1
[...]
1.0 sda2     136.0    16.0     0.0     0.0  98%      1
1.0 sda2     139.0    16.0     0.0     0.0  98%      1
1.0 sda2     149.0    16.0     0.0     0.0  98%      1
1.0 sda2     114.0    16.0     0.0     0.0  98%      1
1.0 sda2     147.0    16.0     8.0     4.0  96%      1
1.0 sda2     192.0    16.0     0.0     0.0  97%      1
1.0 sda2     141.0    16.0     0.0     0.0  98%      1
1.0 sda2     167.0    16.0     0.0     0.0  98%      1
1.0 sda2      15.0    16.0     0.0     0.0   6%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2       0.0     0.0    16.0     4.0   0%      0

The disks are 100% busy reading InnoDB pages for our query. No doubt that if the server was running queries from the application, they would have been negatively impacted.

Now let’s execute the same query with innodb_stats_on_metadata = OFF:


mysql> SET GLOBAL innodb_stats_on_metadata = OFF;

 

mysql> SELECT [...]
10 rows in set (0.45 sec)

And let’s look at pt-diskstats:

#ts device    rd_s rd_avkb    wr_s wr_avkb busy in_prg
1.0 sda2       0.0     0.0    16.0     9.2   0%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2       0.0     0.0    16.0     4.0   1%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0

No read load this time (and a much faster query!).

What is innodb_stats_on_metadata?

When the option is set to ON, InnoDB index statistics are updated when running SHOW TABLE STATUS or SHOW INDEX, or when querying INFORMATION_SCHEMA.TABLES or INFORMATION_SCHEMA.STATISTICS. These statistics include the cardinality and the number of entries; they are used by the optimizer to find an optimal execution plan.

So even though a SELECT statement cannot change any data, querying these metadata tables makes MySQL update the statistics for InnoDB tables. This is counterintuitive.

Is it useful? Not really, because InnoDB will always compute statistics when you open a table for the first time and when significant portions of the table have been modified (and when you run ANALYZE TABLE).

Now why did we have such a high read load when innodb_stats_on_metadata was set to ON? For InnoDB, statistics are estimated from random index dives, which translates to random reads.

The problem was magnified in the example because the whole dataset did not fit in memory, the number of tables was high and the I/O subsystem was not very powerful.

Conclusion

It’s worth mentioning that the default value is now OFF with MySQL 5.6. So if you’re using MySQL 5.6, there’s no need to change anything. If you’re using MySQL 5.1 or 5.5, set innodb_stats_on_metadata to OFF and show your boss how you were able to get a 200x performance boost on some queries! And if you’re using MySQL 5.0 or below, you’ve just found another reason to upgrade!


May
20
2013
--

Webinar: SQL Query Patterns, Optimized

Next Friday, May 31 at 10 a.m. Pacific, I’ll present Percona’s next webinar, “SQL Query Patterns, Optimized.”

Based on my experiences solving tough SQL problems for Percona training and consulting, I’ll classify several common types of queries with which developers struggle. I’ll test several SQL solutions for each type of query objective, and show how you can use MySQL 5.6 built-in methods to analyze them for optimal query efficiency.  The discussion will cover optimizer reports, query profiling, and session status to measure performance.

The query patterns will include:

  • Exclusion Join
  • Random Selection
  • Greatest-Per-Group
  • Dynamic Pivot
  • Relational Division

Please register for this webinar and join me next Friday!


Feb
18
2013
--

Analyzing Slow Query Table in MySQL 5.6

Analyzing SQL Queries with Percona Toolkit, February 25-28, 2013, 9-11 a.m. Pacific

Next week I’m teaching an online Percona Training class, called Analyzing SQL Queries with Percona Toolkit.  This is a guided tour of best practices for pt-query-digest, the best tool for evaluating where your database response time is being spent.

This month we saw the GA release of MySQL 5.6, and I wanted to check if any improvement has been made to the slow query log in table format.  Users of some MySQL appliances like Amazon RDS must use table logs, since they can’t access the file-based logs on the server.

I read the logging code in the MySQL 5.6.10 source, in file sql/log.cc. I discovered that they have refactored the way they write to file-based logs versus table-based logs, but one particular feature I was looking for has not changed.

When writing slow query information to the log file, it outputs the query time and lock time as floating-point numerics to microsecond scale:

sql/log.cc, MYSQL_QUERY_LOG::write():

1945   /* For slow query log */
1946   sprintf(query_time_buff, "%.6f", ulonglong2double(query_utime)/1000000.0);
1947   sprintf(lock_time_buff, "%.6f", ulonglong2double(lock_utime)/1000000.0);

However, just like in prior releases of MySQL, when writing slow query information to the log table, query times and lock times are truncated to integers:

sql/log.cc, Log_to_csv_event_handler::log_slow():

610   longlong query_time= (longlong) (query_utime/1000000);
611   longlong lock_time= (longlong) (lock_utime/1000000);

All the times recorded in slow query log tables are rounded down to the nearest whole second, meaning any query that executes in less than 1.0 seconds counts as 0 seconds.  This will skew your average query time report.

Still, if you use Amazon RDS or a similar instance without filesystem access, you may want to do a report on the query log in spite of this.  You could still get a report of the most frequent queries, even if the response time is so imprecise.  But pt-query-digest and similar tools don’t read the query log table—they only read the query log file.

To account for this, you can use a client script to dump the contents of the query log table into a flat file in the format of a conventional query log file, which you can then use as input to pt-query-digest.  I’ve seen several implementations of this type of script in the past, but here’s mine:

cat <<'GO' | mysql --raw --skip-column-names --quick --silent --no-auto-rehash --compress $*
SELECT CONCAT(
'# Time: ', DATE_FORMAT(start_time, '%y%m%d %H:%i:%s'), CHAR(10),
'# User@Host: ', user_host, CHAR(10),
'# Query_time: ', TIME_TO_SEC(query_time),
' Lock_time: ', TIME_TO_SEC(lock_time),
' Rows_sent: ', rows_sent,
' Rows_examined: ', rows_examined, CHAR(10),
'SET timestamp=', UNIX_TIMESTAMP(start_time), ';', CHAR(10),
IF(FIND_IN_SET(sql_text, 'Sleep,Quit,Init DB,Query,Field List,Create DB,Drop DB,Refresh,Shutdown,Statistics,Processlist,Connect,Kill,Debug,Ping,Time,Delayed insert,Change user,Binlog Dump,Table Dump,Connect Out,Register Slave,Prepare,Execute,Long Data,Close stmt,Reset stmt,Set option,Fetch,Daemon,Error'),
CONCAT('# administrator command: ', sql_text), sql_text),
';'
) AS `# slow-log`
FROM `mysql`.`slow_log`;
GO
echo "#"

I host the script above in my bk-tools github project. It’s called export-slow-log-table.  I distribute it under the terms of the GNU Public License v3.
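Since the script simply passes its arguments through to the mysql client, usage looks like this (host and credentials are placeholders; the server must be logging with log_output=TABLE for mysql.slow_log to be populated):

# Dump mysql.slow_log in slow-log-file format, then report on it with pt-query-digest
./export-slow-log-table -h myinstance.example.com -u admin -p > slow-from-table.log
pt-query-digest slow-from-table.log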

Do you want to learn more about best practices for manipulating query logs and running reports on them?  Register to join me February 25-28, 9AM-11AM Pacific Time, for my online class Analyzing SQL Queries with Percona Toolkit.


Jan
16
2013
--

Sphinx search performance optimization: multi-threaded search

Queries in MySQL, Sphinx and many other database or search engines are typically single-threaded. That is, when you issue a single query on your brand new r910 with 32 CPU cores and 16 disks, the maximum that is going to be used to process this query at any given point is 1 CPU core and 1 disk. In fact, only one or the other.

Seriously, if the query is CPU intensive, it is only going to use 3% of the available CPU capacity (for the same 32-core machine). If it is disk IO intensive, it will use 6% of the available IO capacity (for the 16-disk RAID10, or RAID0 for that matter).

Let me put it another way. If your MySQL or Sphinx query takes 10s to run on a machine with a single CPU core and single disk, putting it on a machine with 32 such cores and 16 such disks will not make it any better.

But you knew this already. Question is – can you do something about it?

In the case of Sphinx, indeed you can! And with very little effort. In fact, it does not require any changes to your application or database whatsoever; it is only a matter of small changes to the Sphinx configuration.

The Plan

First of all, let me explain what we are trying to achieve here.

Sphinx has the ability to do distributed search out of the box – it was designed to scale out that way very early on. If your Sphinx index does not fit on one machine, you would index different parts of it from different machines, and then you would have an aggregator node that receives the request from the application, issues search requests to all data nodes in parallel, merges results from all of the data nodes and returns results back to the application as if a single server had served the request in the first place.

Well, guess what – you can actually utilize this feature to your advantage even if your data can easily fit into one machine and all you want is your queries to be many times faster. Even more so, Sphinx now supports this out of the box, so you don’t need to pretend you are querying remote nodes.

Also, you get a bonus: once you configure server for distributed search, you can do indexing in parallel too!

Word of caution: while this technique will improve most types of search queries, there are some that aren’t going to benefit greatly from parallel execution. The reason is that while search is done on data nodes (even if local ones) and in parallel, merging of results is done by the aggregator and therefore it is single-threaded. Merging includes some CPU-intensive operations such as ranking, ordering or even COUNT with GROUP BY and if data nodes return large amounts of data to post-process, aggregator may well become a bottle-neck due to its single-threaded nature.

This is however easy to check – look at your Sphinx query log and look at the number of results matched per query – this should give you a clue.

Execution

Let us assume you have this one index on one server with the following basic configuration (many irrelevant details omitted):

source src1
{
	type = mysql
	sql_query = SELECT id, text FROM table
}

index idx1
{
	type = plain
	source = src1
}

searchd
{
	dist_threads = 0 # default
}

And now we want it to utilize 3 CPU cores and/or disks on a local machine for this index of ours – idx1. Here’s how we would change the configuration:

source src1
{
	type = mysql
	sql_query = SELECT id, text FROM table
}

source src1p0 : src1
{
	sql_query = SELECT id, text FROM table WHERE id % 3 = 0;
}

source src1p1 : src1
{
	sql_query = SELECT id, text FROM table WHERE id % 3 = 1;
}

source src1p2 : src1
{
	sql_query = SELECT id, text FROM table WHERE id % 3 = 2;
}

index idx1_template
{
	type = plain
	source = src1
}

index idx1p0 : idx1_template
{
	source = src1p0
}

index idx1p1 : idx1_template
{
	source = src1p1
}

index idx1p2 : idx1_template
{
	source = src1p2
}

index idx1
{
	type = distributed
	local = idx1p0
	local = idx1p1
	local = idx1p2
}

searchd
{
	dist_threads = 3
}

And you’re done. Of course, you need to reindex all of the indexes, but you can now do it in parallel – just run a separate indexer for every index idx1p0 through idx1p2.
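For example, a rough way to kick off the three rebuilds in parallel from the shell (assuming searchd is already running, hence --rotate, and that sphinx.conf is in its default location):

# Rebuild each shard in its own indexer process, then wait for all of them to finish
for i in 0 1 2; do
    indexer --rotate idx1p$i &
done
wait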

BTW, using the modulo (%) operator is not necessarily the best way to split the data; you may want to use ranges instead, with a helper table in MySQL to define those ranges, mixing it with sql_query_range or something else, depending on what your data looks like.

Finishing line

I have always loved how Sphinx scales out easily with as many machines as you need, and I have been running it this way for many years now. However, I think I don’t utilize this feature to make queries even faster on a one-machine setup nearly as often as I should. Well, it’s not like it is slow or anything, but queries are never too fast, are they? :)


Jan
15
2013
--

Sphinx search performance optimization: attribute-based filters

One of the most common causes of a poor Sphinx search performance I find our customers face is misuse of search filters. In this article I will cover how Sphinx attributes (which are normally used for filtering) work, when they are a good idea to use and what to do when they are not, but you still want to take advantage of otherwise superb Sphinx performance.

The Problem

While Sphinx is great for full text search, you can certainly go beyond full text search, but before you go there, it is a good idea to make sure you’re doing it the right way.

In Sphinx, columns are basically one of two kinds:

a) full text
b) attributes

Speaking in MySQL terms, full text columns are always indexed, and using the very powerful extended query syntax you can search against certain columns only, or against all of them – the fewer results your full text query matches, the faster the query will be. That’s self-evident, I guess. Every time a keyword matches a document, Sphinx needs to resolve the appropriate document and evaluate the result. If your keywords match all of your 100M records, that is going to be a lot of work. However, with just a few hundred thousand records it is going to be much, much faster.

Attributes, on the other hand, are sort of like unindexed MySQL columns. They are the extra details you may want to filter by, which is usually things like gender, status, age, group_id, etc. The effect of them being unindexed is that whenever you use attributes, Sphinx does a full scan of that attribute for all the records that matched the full text search query. For a few hundred thousand matched records, checking attributes is not going to slow down query performance significantly. But – and this is the misuse that I see a lot – when you are NOT doing a full text search, using only attribute-based filters means a full scan of all records for that attribute.

Because attributes are not B-tree structured (and therefore are slow to work with), by default Sphinx actually stores them in memory. In fact, it requires that all attributes fit in memory or the server performance will be simply unbearable. However, that still does not mean that you can use attributes to find all the records that match group_id=10 – that query will have to check all 100M of records.

BTW, internally there are some differences between numeric attributes, character-based attributes and multi-value attributes (MVAs), but for the purpose of our discussion it’s enough to know that attributes are not indexed.

For example..

Now let me give you a few examples so it’s not just empty talk. For the examples below I will be using the SphinxQL protocol, which looks like talking to a MySQL server, but it’s not – it is me talking to the Sphinx server.

First of all, let us see how many records we have in this index and how long it takes to do a full scan:

sphinx> select * from catalog;
...
20 rows in set (0.70 sec)

sphinx> show meta;
+---------------+----------+
| Variable_name | Value    |
+---------------+----------+
| total         | 1000     |
| total_found   | 10309972 |
| time          | 0.700    |
+---------------+----------+
3 rows in set (0.00 sec)

Note this is a real index used in production – a catalog of books – so if the same query happens to give slightly different results, it could be because indexing occurred between different iterations.

If you are seeing this SphinxQL output for the first time, it may be a little confusing, but let me explain. The query returned 20 rows because unless you specify an explicit LIMIT, it defaults to 20. Total says 1000 because by default a query is limited to processing the 1000 best results (it still searches the entire index though).

Otherwise, the takeaway is that this index has 10M records and it takes 700ms to do a full scan.

Now, let us find all records that match user_id = 50:

sphinx> select * from catalog where user_id = 50;
...
20 rows in set (0.16 sec)

sphinx> show meta;
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| total         | 287   |
| total_found   | 287   |
| time          | 0.155 |
+---------------+-------+
3 rows in set (0.00 sec)

Pretty bad, isn’t it? 287 records returned in 155ms. Doing the same thing in MySQL, assuming user_id is indexed and in cache, would take less than a millisecond, so it is definitely not the best use case for Sphinx.

When you have full text search keywords that match many documents (and therefore are slow), using attributes may reduce the number of results matched significantly. But not the amount of time it takes to do that:

sphinx> select * from catalog where match('Great') and user_id = 50;
...
5 rows in set (0.10 sec)

sphinx> show meta;
+---------------+--------+
| Variable_name | Value  |
+---------------+--------+
| total         | 5      |
| total_found   | 5      |
| time          | 0.096  |
| keyword[0]    | great  |
| docs[0]       | 200084 |
| hits[0]       | 216948 |
+---------------+--------+
6 rows in set (0.00 sec)

Solution

The solution may not be obvious at first, but you will see that it makes sense. The strength of Sphinx is full text search, so I suggest we exploit that to get good performance on attributes that are highly selective, such as the user_id example above. In fact, I’ve been using this technique with great success for many years now.

First, I would add the following extra item to fetch when indexing the catalog:

CONCAT('userkey_', user_id) userkey

And now I have an extra column in a full text index that I can use for filtering:

sphinx> select * from catalog where match('@userkey userkey_50');
...
20 rows in set (0.00 sec)

sphinx> show meta;
+---------------+------------+
| Variable_name | Value      |
+---------------+------------+
| total         | 287        |
| total_found   | 287        |
| time          | 0.000      |
| keyword[0]    | userkey_50 |
| docs[0]       | 287        |
| hits[0]       | 287        |
+---------------+------------+
6 rows in set (0.00 sec)

That looks much better and I can mix it with other search keywords:

sphinx> select * from catalog where match('Great @userkey userkey_50');
...
5 rows in set (0.01 sec)

sphinx> show meta;
+---------------+------------+
| Variable_name | Value      |
+---------------+------------+
| total         | 5          |
| total_found   | 5          |
| time          | 0.013      |
| keyword[0]    | great      |
| docs[0]       | 200088     |
| hits[0]       | 216952     |
| keyword[1]    | userkey_50 |
| docs[1]       | 287        |
| hits[1]       | 287        |
+---------------+------------+
9 rows in set (0.00 sec)

Highly selective columns only

I thought I would emphasize – while this is a neat performance optimization for highly selective attributes, it is certainly not something you would want to use for every attribute. There are a few reasons for that:

  • it does use more disk space for the index (although it’s not as bad as you might think)
  • attributes are still a good way to filter out data when your search queries don’t match many records
  • in fact, it could reduce performance of queries that otherwise match few records

Let me illustrate that. I have created another full text indexed column for a skewed boolean attribute, “ancient”, which identifies whether a book was published before or after the year 1500, and now I will run some different queries against the two:


sphinx> select * from catalog where ancient = 1;
| total_found   | 1499266 | 14%
| time          | 0.552   |

sphinx> select * from catalog where ancient = 0;
| total_found   | 8852086 | 86%
| time          | 0.662   |

sphinx> select * from catalog where match('ancientkey_1');
| total_found   | 1499266 |
| time          | 0.227   |

sphinx> select * from catalog where match('ancientkey_0');
| total_found   | 8852086 |
| time          | 1.309   |

sphinx> select * from catalog where match('solar');
| total_found   | 2510  |
| time          | 0.001 |

sphinx> select * from catalog where match('solar @ancientkey ancientkey_0');
| total_found   | 2176  |
| time          | 0.434 |

sphinx> select * from catalog where match('solar @ancientkey ancientkey_1');
| total_found   | 334   |
| time          | 0.077 |

sphinx> select * from catalog where match('solar') and ancient = 1;
| total_found   | 334   |
| time          | 0.002 |

sphinx> select * from catalog where match('solar') and ancient = 0;
| total_found   | 2176  |
| time          | 0.003 |

What you can see here is that while there’s very little difference when filtering only on “ancient” – it is very slow in both cases – when doing a selective search and then filtering, using the attribute is orders of magnitude better than using a full text key.

That being said, for highly selective columns – even more so if they will be used in queries without full text search keywords – having them full text indexed is a very good way to improve the performance of such Sphinx search queries.

