MySQL super_read_only Bugs


super_read_onlyThis blog we describe an issue with MySQL 5.7’s super_read_only feature when used alongside with GTID in chained slave instances.


In MySQL 5.7.5 and onward introduced the gtid_executed table in the MySQL database to store every GTID. This allows slave instances to use the GTID feature regardless whether the binlog option is set or not. Here is an example of the rows in the gtid_executed table:

mysql> SELECT * FROM mysql.gtid_executed;
| source_uuid                          | interval_start | interval_end |
| 00005730-1111-1111-1111-111111111111 |              1 |            1 |
| 00005730-1111-1111-1111-111111111111 |              2 |            2 |
| 00005730-1111-1111-1111-111111111111 |              3 |            3 |
| 00005730-1111-1111-1111-111111111111 |              4 |            4 |
| 00005730-1111-1111-1111-111111111111 |              5 |            5 |
| 00005730-1111-1111-1111-111111111111 |              6 |            6 |
| 00005730-1111-1111-1111-111111111111 |              7 |            7 |
| 00005730-1111-1111-1111-111111111111 |              8 |            8 |
| 00005730-1111-1111-1111-111111111111 |              9 |            9 |
| 00005730-1111-1111-1111-111111111111 |             10 |           10 |

To save space, this table needs to be compressed periodically by replacing GTIDs rows with a single row that represents that interval of identifiers. For example, the above GTIDs can be represented with the following row:

mysql> SELECT * FROM mysql.gtid_executed;
| source_uuid                          | interval_start | interval_end |
| 00005730-1111-1111-1111-111111111111 |              1 |           10 |

On the other hand, we have the super_read_only feature, if this option is set to ON, MySQL won’t allow any updates – even from users that have SUPER privileges. It was first implemented on WebscaleSQL and later ported to Percona Server 5.6. MySQL mainstream code implemented a similar feature in version 5.7.8.

The Issue [1]

MySQL’s super_read_only feature won’t allow the compression of the mysql.gtid_executed table. If a high number of transactions run on the master instance, it causes the gtid_executed table to grow to a considerable size. Let’s see an example.

I’m going to use the MySQL Sandbox to quickly setup a Master/Slave configuration, and sysbench to simulate a high number of transactions on master instance.

First, set up replication using GTID:

make_replication_sandbox --sandbox_base_port=5730 /opt/mysql/5.7.17 --how_many_nodes=1 --gtid

Next, set up the variables for a chained slave instance:

echo "super_read_only=ON" >> node1/my.sandbox.cnf
echo "log_slave_updates=ON" >> node1/my.sandbox.cnf

Now, generate a high number of transactions:

sysbench --test=oltp.lua --mysql-socket=/tmp/mysql_sandbox5730.sock --report-interval=1 --oltp-tables-count=100000 --oltp-table-size=100 --max-time=1800 --oltp-read-only=off --max-requests=0 --num-threads=8 --rand-type=uniform --db-driver=mysql --mysql-user=msandbox --mysql-password=msandbox --mysql-db=test prepare

After running sysbench for awhile, we check that the number of rows in the gtid_executed table is increasing faster:

slave1 [localhost] {msandbox} ((none)) > select count(*) from mysql.gtid_executed ;
| count(*) |
|   300038 |
1 row in set (0.00 sec)

By reviewing SHOW ENGINE INNODB STATUS, we can find a compression thread running and trying to compress the gtid_executed table.

---TRANSACTION 4192571, ACTIVE 0 sec fetching rows
mysql tables in use 1, locked 1
9 lock struct(s), heap size 1136, 1533 row lock(s), undo log entries 1525
MySQL thread id 4, OS thread handle 139671027824384, query id 0 Compressing gtid_executed table

This thread runs and takes ages to complete (or may never complete). It has been reported as #84332.

The Issue [2]

What happens if you have to stop MySQL while the thread compressing the gtid_executed table is running? In this special case, if you run the flush-logs command before or at the same time as mysqladmin shutdown, MySQL will actually stop accepting connections (all new connections hang waiting for the server) and will start to wait for the thread compressing the gtid_executed table to complete its work. Below is an example.

First, execute the flush logs command and obtain ERROR 1290:

$ mysql -h -P 5731 -u msandbox -pmsandbox -e "flush logs ;"
ERROR 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement

We’ve tried to shutdown the instance, but it hangs:

$ mysqladmin -h -P 5731 -u msandbox -pmsandbox shutdown
^CWarning;  Aborted waiting on pid file: 'mysql_sandbox5731.pid' after 175 seconds

This bug has been reported and verified as #84597.

The Workaround

If you already have an established connection to your database with SUPER privileges, you can disable the super_read_only feature dynamically. Once that is done, the pending thread compressing the gtid_executed table completes its work and the shutdown finishes successfully. Below is an example.

We check rows in the gtid_executed table:

$ mysql -h -P 5731 -u msandbox -pmsandbox -e "select count(*) from mysql.gtid_executed ;"
| count(*) |
|   300038 |

We disable the super_read_only feature on an already established connection:

$ mysql> set global super_read_only=OFF ;

We check the rows in the gtid_executed table again, verifying that the compress thread ran successfully.

$ mysql -h -P 5731 -u msandbox -pmsandbox -e "select count(*) from mysql.gtid_executed ;"
| count(*) |
|        1 |

Now we can shutdown the instance without issues:

$ mysqladmin -h -P 5731 -u msandbox -pmsandbox shutdown

You can disable the super_read_only feature before you shutdown the instance to compress the gtid_executed table. If you ran into bug above, and don’t have any established connections to your database, the only way to shutdown the server is by issuing a kill -9 on the mysqld process.


As shown in this blog post, some of the mechanics of MySQL 5.7’s super_read_only command are not working as expected. This can prevent some administrative operations, like shutdown, from happening.

If you are using the super_read_only feature on MySQL 5.7.17 or older, including Percona Server 5.7.16 or older (which ports the mainstream implementation – unlike Percona Server 5.6, which ported Webscale’s super_read_only implementation) don’t use FLUSH LOGS.


When MySQL Lies: Wrong seconds_behind_master with slave_parallel_workers > 0


seconds_behind_masterIn today’s blog, I will show an issue with seconds_behind_master that one of our clients faced when running slave_parallel_works > 0. We found out that the reported seconds_behind_master from SHOW SLAVE STATUS was lying. To be more specific, I’m talking about bugs #84415 and #1654091.

The Issue

MySQL will not report the correct slave lag if you have slave_parallel_workers> 0. Let’s show it in practice.

I’ll use MySQL Sandbox to speed up one master and two slaves on MySQL version 5.7.17, and sysbench to populate the database:

# Create sandboxes
make_replication_sandbox /path/to/mysql/5.7.17
# Create table with 1.5M rows on it
sysbench --test=/usr/share/sysbench/tests/db/oltp.lua --mysql-host=localhost --mysql-user=msandbox --mysql-password=msandbox --mysql-socket=/tmp/mysql_sandbox20192.sock --mysql-db=test --oltp-table-size=1500000 prepare
# Add slave_parallel_workers=5 and slave_pending_jobs_size_max=1G" on node1
echo "slave_parallel_workers=5" >> node1/my.sandbox.cnf
echo "slave_pending_jobs_size_max=1G" >> node1/my.sandbox.cnf

Monitor Replication lag via SHOW SLAVE STATUS:

for i in {1..1000};
        node1/use -e "SHOW SLAVE STATUSG" | grep "Seconds_Behind_Master" | awk '{print "Node1: " $2}' &
        sleep 0.1 ;
        node2/use -e "SHOW SLAVE STATUSG" | grep "Seconds_Behind_Master" | awk '{print "Node2: " $2}' &
    sleep 1;

On a separate terminal, DELETE some rows in the test.sbtest1 table on the master, and monitor the above output once the master completes the delete command:

DELETE FROM test.sbtest1 WHERE id > 100;

Here is a sample output:

master [localhost] {msandbox} (test) > DELETE FROM test.sbtest1 WHERE id > 100;
Query OK, 1499900 rows affected (46.42 sec)
. . .
Node1: 0
Node2: 0
Node1: 0
Node2: 48
Node1: 0
Node2: 48
Node1: 0
Node2: 49
Node1: 0
Node2: 50
. . .
Node1: 0
Node2: 90
Node1: 0
Node2: 91
Node1: 0
Node2: 0
Node1: 0
Node2: 0

As you can see, node1 (which is running with slave_parallel_workers = 5) doesn’t report any lag.

The Workaround

We can workaround this issue by querying performance_schema.threads:

SELECT PROCESSLIST_TIME FROM performance_schema.threads WHERE NAME = 'thread/sql/slave_worker' AND (PROCESSLIST_STATE IS NULL  or PROCESSLIST_STATE != 'Waiting for an event from Coordinator') ORDER BY PROCESSLIST_TIME DESC LIMIT 1;

Let’s modify our monitoring script, and use the above query to monitor the lag on node1:

for i in {1..1000};
        node1/use -BNe "SELECT PROCESSLIST_TIME FROM performance_schema.threads WHERE NAME = 'thread/sql/slave_worker' AND (PROCESSLIST_STATE IS NULL  or PROCESSLIST_STATE != 'Waiting for an event from Coordinator') ORDER BY PROCESSLIST_TIME DESC LIMIT 1 INTO @delay; SELECT IFNULL(@delay, 0) AS 'lag';" | awk '{print "Node1: " $1}' &
        sleep 0.1 ;
        node2/use -e "SHOW SLAVE STATUSG" | grep "Seconds_Behind_Master" | awk '{print "Node2: " $2}' &
    sleep 1;

master [localhost] {msandbox} (test) > DELETE FROM test.sbtest1 WHERE id > 100;
Query OK, 1499900 rows affected (45.21 sec)
Node1: 0
Node2: 0
Node1: 0
Node2: 0
Node1: 45
Node2: 45
Node1: 46
Node2: 46
. . .
Node1: 77
Node2: 77
Node1: 78
Node2: 79
Node1: 0
Node2: 80
Node1: 0
Node2: 81
Node1: 0
Node2: 0
Node1: 0
Node2: 0

Please note that in our query to performance_schema.threads, we are filtering PROCESSLIST_STATE “NULL” and “!= Waiting for an event from Coordinator”. The correct state is “Executing Event”, but it seems like it doesn’t correctly report that state (#84655).


MySQL parallel replication is a nice feature, but we still need to make sure we are aware of any potential issues it might bring. Most monitoring systems use the output of SHOW SLAVE STATUS to verify whether or not the slave is lagging behind the master. As shown above, it has its caveats.

As always, we should test, test and test again before implementing any change like this in production!


How Percona Support handles bugs

How Percona Support handles bugsOne of the great values of a Percona Support contract is that we provide bug fixes for covered software, and not just support in terms of advice on how to use it. This is the skill which is most likely missing from in-house for most customers, as it requires a team with code knowledge to build and test infrastructure – something only a few companies can afford to invest in.

There is a lot of misunderstanding about bugs. What is a bug? What is a feature? What is a repeatable bug? How will Percona troubleshoot the bug? In this post I will answer some of the questions about this.

Bugs vs. Features ? One thing a lot of people have a hard time understanding is the difference between a bug and a feature, or when software was designed to work a certain way which might be unwelcome. There is a gray line here, but you need to expect that some of the things you consider to be bugs will be seen as behavior-change features and will be considered as such.

Unfixable Bugs ? There are some behaviors that any sane person would call a bug, but which arise from design limitations or oversight that are impossible to fix in the current GA version without introducing changes that would destabilize it. Such bugs will need to be fixed in the next major GA release or sometimes even further in the future. Some bugs are not bugs at all but rather design tradeoffs made. These can’t be “fixed” unless different design tradeoffs are chosen.

Workaround ? There are going to be unexpected behaviors, unfixable bugs and bugs that take awhile to fix, so your first practical response to running into the bug is often finding a workaround which does not expose it. The Percona Support team will help find a workaround that causes minimal impact to your business, but be prepared: changes to the application, deployed version, schema or configuration will often be required.

Emergencies ? When you have an emergency, our focus is to restore the system to working order. In a complex system a bug fix can often not be delivered in a short period of time, which typically means finding a workaround.

Bug Turnaround ? It is not possible to guarantee the turnaround on a bug fix, as all bugs are different. Some bugs are rather trivial and we might be able to provide a hotfix 24 hours after we have a repeatable test case. In other cases the bug might be complicated and take weeks of engineering to fix or even might be impossible to fix in the current GA version.

Verified Bug Fixes ? When you submit the bug we have to verify if it is actually being a bug. In many cases it might be intended behavior; in others, a user mistake. It is also possible that the behavior has happened once and can’t be repeated. Having a repeatable test case that reveals the bug is the best way to have a bug fixed quickly. You might be able to create a repeatable test case, or our support team might be able to help you create the test case.

Sporadic Bugs ? These are very hard bug types that happen sporadically over a period of time. For example, you might have a system crash once every 3 months with no way to repeat it. The cause of such bugs can be very complicated; for example, a buffer overrun in one piece of code can cause corruption and crash in another place hours later. There are a number of diagnostic tools that exist for such bugs, but generally they take quite awhile to resolve. In addition, without a repeatable test case, it is often impossible to verify that the proposed fix actually resolves the bug.

Environmental Bugs ? Some bugs are caused by what can be called your environment. It could be some hardware bugs or incompatibilities, a build not quite compatible with your version of  operating system, operating system bugs, etc. In some cases we can very clearly point to the environment problems. In others we can suspect the environment is an issue and we may ask you to see if the bug also happens in another environment, such as different hardware or OS installation.

Hot Fixes ? As our default policy we fix bugs in the next release of our software so it can go through the full QA cycle, be properly documented, etc. If you have implemented a workaround and you can wait until the next release, this is the best choice. If not, with the Percona Platinum Support contract, we can provide you with a hotfix that is a special build containing the version of the software you’re running, and with only the bug fix of interest applied. Hotfixes are especially helpful if you’re not looking to do a full software upgrade – requiring several revisions – but want to validate the fix with the minimum possible changes. Hotfixes might also be different from the final bug fix that goes into the GA release. With hotfixes, our goal is to provide a working solution for you faster. Afterward we may optimize or re-architect the code, come up with better option names, etc.

Bug Diagnostics ? Depending on the nature of the bug there are multiple tools that our support team will use for diagnostics – finding a way to fix the bug. To set expectations correctly, it can be a very involved process, where you might need to provide a lot of information or try things on your system, such as:

  • Test case. If you have a test case that can be repeated by the Percona team to trigger the bug, the diagnostic problem is solved from the customer side. Internal debugging starts at this point. It might not be easy to get to that.
  • If we have a crash that we can’t repeat on our system we often will ask you to enable “core” file, or run the program under a debugger so we can get more information when the crash happens.
  • If the problem is related to performance, you should be ready to gather both MySQL information such as EXPLAIN, status counters, information from performance schema, etc., along with system level information such as pt-pmp output,  pt-stalk,  oprofile, perf, etc.
  • If the problem is a “deadlock,” we often need information from gdb about the full state of the system. Information from processlist, performance_schema, SHOW ENGINE INNODB STATUS can also be helpful.
  • It can be very helpful when you have a test system on which you can repeat the problem in your environment and where you can experiment without impacting production. It is not possible in all cases, but is very helpful.
  • Sometimes, for hard-to-repeat bugs, we will need to run a special diagnostics build that provides us with additional debug information. In others, we might need to run a debug build or do a run under valgrind or other software designed to catch bugs. It often has a large performance impact, so it is good to see how your workload can be scaled down in order for this to be feasible.
  • Depending on your environment we might need to login to troubleshoot your bug or might request that you upload the data needed to repeat the bug in our lab (assuming it is not too sensitive). In cases where direct login is not possible, we can help you to come to a repeatable test case via phone, chat, or email. Using screen sharing can also be very helpful.

Bugs and Non-Percona Software ? Percona Support covers some software not produced by Percona. For open source software, if it is not exempt from bug fix support, we will provide the custom build with a bug fix as well as provide the suggested fix to the software maintainer for its possible inclusion in the next release. For example, if we find a bug in the MySQL Community Edition, we will pass our suggested fix to the MySQL Engineering team at Oracle. For other software that is not open source, such as Amazon RDS, we can help to facilitate creation and submission of a repeatable test case and workaround, but we can’t provide a fix as we do not have access to the source code.

In Conclusion ? When I think about software bugs, I find some good parallels with human “bugs” (diseases).  Some issues are trivial to diagnose and the fix is obvious. Others might be very hard to diagnose ? I guess many of us have been in a situation where you visit doctor after doctor and tell them your symptoms, and they run tests but still can’t figure out what’s wrong. Once a diagnosis is done, though, it is not always given the “fix” available or feasible, and while a complete solution is preferred, sometimes we have to settle for “managing” the disease, which is our parallel to implementing changes and settling for a workaround. So in the same way as human doctors, we can’t guarantee we will get to the root of every problem, or if we do, that we will be able to fix every one of them. However, as with having good doctors – having us on your team will maximize the chance of successful resolution.

The post How Percona Support handles bugs appeared first on MySQL Performance Blog.


WITHer Recursive Queries?

Over the past few years, we’ve seen MySQL technology advance in leaps and bounds, especially when it comes to scalability. But by focusing on the internals of the storage engine for so long, MySQL has fallen behind regarding support for advanced SQL features.

SQLite, another popular open-source SQL database, just released version 3.8.3, including support for recursive SQL queries using the WITH RECURSIVE syntax, in compliance with SQL:1999.

Why is this significant? It means that MySQL is now the only widely-used SQL implementation that does not support recursive queries. Fifteen years after it was defined in the SQL standard, almost every other SQL database of note has supported this feature:

Only Informix among common RDBMS brands lacks support for WITH RECURSIVE, though Informix still supports recursive queries with the non-standard CONNECT BY syntax.

MySQL has been requested to support common table expressions using the WITH syntax for a long time:

The CTE-style queries would allow us to share more advanced SQL queries with those that are being used by other brands of database, and do it with standard SQL instead of proprietary functions or tricks.

The most common example of a query solved with a recursive CTE is to query a tree of unknown depth. But there are quite a few other useful applications of this form of query, all the way up to fancy stunts like a query that generates a fractal design like the Mandelbrot Set. Recursion is powerful.

Is it time for the MySQL community to raise the priority of the CTE feature requests for MySQL? Visit the links I gave above at bugs.mysql.com, and add your voice by clicking the Affects Me button.

The post WITHer Recursive Queries? appeared first on MySQL Performance Blog.


Quick link to Percona Toolkit bugs

In this post I wanted to highlight


 , for example: percona.com/bugs/pt-stalk.  This only works for Percona Toolkit.  We often advise people to check a tool’s current bugs, but we don’t always say how.

Percona ToolkitThe official link for Percona Toolkit bugs on Launchpad is https://bugs.launchpad.net/percona-toolkit/+bugs, but then you still have search from there.  So this quick link is a lot easier if, for example, pt-stalk fails to parse df with NFS and you wonder if that’s a known bug (yes it is).  Please never hesitate to file bugs.  We’re always eager to improve our software.


The post Quick link to Percona Toolkit bugs appeared first on MySQL Performance Blog.


Maatkit’s mk-query-digest filters

Have you ever seen BIG weird numbers in mk-query-digest report that just seem wrong? I have! Here’s one report I got today:

# Attribute          total     min     max     avg     95%  stddev  median
# ============     ======= ======= ======= ======= ======= ======= =======
# Exec time          5088s     1us    171s     2ms   467us   104ms    28us
# Lock time            76s       0      3s    26us    69us     3ms       0
# Rows sent          9.80M       0   1.05M    3.50    0.99  642.32       0
# Rows examine       5.59G       0  82.56M   2.00k    0.99  97.41k       0
# Rows affecte     457.30k       0   2.62k    0.16    0.99    1.68       0
# Rows read          2.16G       0  82.56M  788.53   21.45  82.91k    0.99
# Bytes sent         2.14T       0   4.00G 781.27k   3.52k  47.84M   84.10
# Merge passes     273.47G       0   4.00G  97.69k       0  10.35M       0
# Tmp tables       225.85G       0   4.00G  80.67k       0   7.89M       0
# Tmp disk tbl     381.88G       0   4.00G 136.41k       0  14.08M       0
# Tmp tbl size     255.54G       0   2.39G  91.28k       0   8.59M       0
# Query size       418.38M       6 257.39k  149.45  563.87   1.49k   42.48
# InnoDB:
# IO r bytes       272.92G       0   1.58G 479.74k       0  18.19M       0
# IO r ops         328.05G       0   2.00G 576.65k       0  23.08M       0
# IO r wait        237964272912s       0 3377771733s 398921s       0 22888946s       0
# pages distin     156.50G       0   2.39G 275.11k   27.38  14.48M    4.96
# queue wait       143150489533s       0 3377769328s 239976s       0 16014027s       0
# rec lock wai     216352062699s       0 4085510331s 362690s       0 27625029s       0

That can’t be right! Apparently there are couple bugs in the slowlog patch that aren’t really critical as the numbers are only wrong for the administrator commands. Yet when aggregated it kind of messes up the whole mk-query-digest output.

There’s a simple solution to that though — mk-query-digest –filter. With just a small change it will completely ignore administrator commands:

mk-query-digest --filter='$event->{arg} !~ m/administrator command/' in > out

If these bugs are really bugging you, feel free to sponsor the fix. Otherwise enjoy the power of the almighty mk-query-digest!


Data Corruption, DRBD and story of bug

Working with customer, I faced pretty nasty bug, which is actually not rare situation , but in this particular there are some lessons I would like to share.

The case is pretty much described in bug 55981, or
in pastebin.

Everything below is related to InnoDB-plugin/XtraDB, but not to regular InnoDB ( i.e in MySQL 5.0)

In short, if you use big BLOBS ( TEXT, MEDIUMBLOB, etc) (that allocated in external segment in InnoDB), you can get your database in trash state just executing update on row with blob and rolling back transaction twice ( on the same row)

The keywords that diagnose you hit this bug is

InnoDB: Serious error! InnoDB is trying to free page N
InnoDB: though it is already marked as free in the tablespace!
InnoDB: The tablespace free space info is corrupt.
InnoDB: You may need to dump your InnoDB tables and recreate the whole
InnoDB: database!

Trash state means that InnoDB won’t start, and you need to use innodb_force_recovery=3 and mysqldump your data. What makes problem even worse is that InnoDB does not report tablename, so you are pretty much blind and need to dump whole dataset, which can be long process.

The moment where DRBD come in play, is if you use DRBD for HA purposes ( as is in the case I worked with), you screwed,
as DRBD mirroring physical data, and MySQL keeping crashing on both instances – active and passive.

So DRBD can’t be considered fully reliable HA solution if there is risk that application corrupts data by itself

Now to bug 55981. It has MySQL version 5.6, but the problem exists in MySQL 5.1.50 or below and in MySQL 5.5, and
corresponding bug is 55543, which you actually can’t see, as
“You do not have access to bug #55543.”, because it is marked as “Security problem”.

And I actually tend to agree that bug can be considered as “security”.
If you running public hosting or your public users can execute direct SQL statements, I strongly recommend to upgrade to
MySQL version 5.1.51+ .

Now another interesting point – how can you be sure that 5.1.51 works.

The bug 55543 is not mentioned in ChangeLog for 5.1.51 nor 5.1.52. However if you look into source code and revision history, you can see that bug 55543 is fixed in MySQL 5.1.51. I assume it is technical problem with ChangeLog process and it will be fixed soon, but I reported it so it is not lost

At the end let me reiterate my points:
– if you have BLOB/TEXT fields in your InnoDB-plugin schema, it is recommended to upgrade to 5.1.51+
– if you provide public access to MySQL instance ( hosting provider, etc) with InnoDB-plugin installed – it is STRONGLY recommended to upgrade to 5.1.51+
– Review your HA schema. DRBD by itself (without additional solutions) can’t guaranty decent level of High Availability. And just to be clear, it is not DRBD problem, DRBD basically can’t help if there is possibility that application corrupts data by itself. For this case regular Master-Slave setup (in addition to DRBD) would protect from such problem.

Entry posted by Vadim |

Add to: delicious | digg | reddit | netscape | Google Bookmarks


Recall of Percona Server 5.1.47-11.0

Percona Server release 11.0 which we announced few days ago unfortunately was released with a bug introduced while implementing stripping comments in query cache which could cause server crash with certain types of queries if query cache is enabled. We have released Percona Server release 11.1 which includes a fix for this issue. If you can’t perform upgrade promptly you can disable query cache until you can do this.

We’re sorry for shipping release with such bug.

The issue with this bug in more details is as follows:
MySQL Query Cache works by checking incoming queries if they may be found in query cache – this is done by performing simple check if query is starting from SEL or comment. After Query is parsed and executed MySQL stores results for SELECT queries in Query Cache. There are cases when queries will not be considered as cachable query during first check, however result for them will be stored in the query cache. This rare case was not handled properly in the new feature added and it was causing MySQL to crash.

The most typical case for queries causing issues would be queries starting with brackets and having space before select such as ( select * from t1 ) Queries starting with brackets are often used with UNION queries.

Entry posted by peter |

Add to: delicious | digg | reddit | netscape | Google Bookmarks


Missleading Innodb message on recovery

As I wrote about 2 years ago the feature of Innodb to store copy of master’s position in Slave’s Innodb tablespace got broken. There is a lot of discussions at the corresponding bug report while outcome of the fix remained uncertain for me (the bug is market duplicate while the bugs it seems to be duplicate for describe different issues).
Anyway. The customer came to me today having the following message in the error log after Slave crash while running MySQL 5.1.41 (extra virgin version without XtraDB or Percona Patches)

InnoDB: In a MySQL replication slave the last master binlog file
InnoDB: position 0 10000000, file name mysql-bin.005000
InnoDB: Last MySQL binlog file position 0 20000000, file name ./mysql-bin.003000

He has tried to restart replication from mysql-bin.005000 position 10000000 which failed with “Could not find first log file name in binary log index file” error message

Looking at the Master for this slave I could see its binary log files going only to about 2000, so the binary log file mentioned did not ever exist on this master. What is going on ?
The thing is Innodb does not update this information any more however if it is stored in the tablespace the code is still there to print it. This database was running older MySQL version a while back which was updated to MySQL 5.1 months ago, moved to the new hardware by physical copy and the log file numbers restarted back from 1 but tablespace still contained the ancient data.

I reported the bug on this which should be easy to fix. Otherwise it is easy mistake to make. We also have a patch which restores this information and uses it on slave crash recovery.

Entry posted by peter |

Add to: delicious | digg | reddit | netscape | Google Bookmarks


Getting around optimizer limitations with an IN() list

There was a discussion on LinkedIn one month ago that caught my eye:

Database search by “within x number of miles” radius?

Anyone out there created a zipcode database and created a “search within x numer of miles” function ?
Thankful for any tips you can throw my way..


A few people commented that some solutions wouldn’t scale. To understand why these sorts of geographic search queries are problematic in MySQL, it’s best to show some execution plans on dummy data:


  1. EXPLAIN SELECT * FROM coordinates FORCE INDEX (x_y_col_a) WHERE x BETWEEN 30 AND 40
  2. AND y BETWEEN 50 AND 60 AND col_a = ‘set1’;
  3. +—-+————-+————-+——-+—————+———–+———+——+——+————-+
  4. | id | select_type | TABLE       | type  | possible_keys | KEY       | key_len | ref  | rows | Extra       |
  5. +—-+————-+————-+——-+—————+———–+———+——+——+————-+
  6. 1 | SIMPLE      | coordinates | range | x_y_col_a     | x_y_col_a | 38      | NULL | 4032 | USING WHERE |
  7. +—-+————-+————-+——-+—————+———–+———+——+——+————-+
  8. 1 row IN SET (0.00 sec)
  10. EXPLAIN SELECT * FROM coordinates FORCE INDEX (x_y_col_a) WHERE x BETWEEN 30 AND 40;
  11. +—-+————-+————-+——-+—————+———–+———+——+——+————-+
  12. | id | select_type | TABLE       | type  | possible_keys | KEY       | key_len | ref  | rows | Extra       |
  13. +—-+————-+————-+——-+—————+———–+———+——+——+————-+
  14. 1 | SIMPLE      | coordinates | range | x_y_col_a     | x_y_col_a | 3       | NULL | 4032 | USING WHERE |
  15. +—-+————-+————-+——-+—————+———–+———+——+——+————-+
  16. 1 row IN SET (0.01 sec)
  18. SELECT count(*) FROM coordinates FORCE INDEX (x_y_col_a) WHERE x BETWEEN 30 AND 40
  19. AND y BETWEEN 50 AND 60 AND col_a = ‘set1’;
  20. +———-+
  21. | count(*) |
  22. +———-+
  23. |        1 |
  24. +———-+
  25. 1 row IN SET (0.00 sec)
  27. SELECT count(*) FROM coordinates FORCE INDEX (x_y_col_a) WHERE x BETWEEN 30 AND 40;
  28. +———-+
  29. | count(*) |
  30. +———-+
  31. |     1664 |
  32. +———-+
  33. 1 row IN SET (0.01 sec)

Did you notice that we estimate just as many rows on the first EXPLAIN as the second one? That doesn’t make any sense! The index covers x,y and col_a and should be eliminating a lot of searching, since there is only one row which meets this condition!

The reason for this is simply a missing feature of the MySQL optimizer – and it has to do with using x BETWEEN 30 and 40 (and it’s also true with x >= 30 AND x <= 40). Using a range like this prevents us from using the rest of the index. There is a workaround, but it’s not pretty:


  1. EXPLAIN SELECT * FROM coordinates WHERE x IN
  2. (30.30,30.61,31.18,31.26,31.72,32.11,32.25,32.30,32.42,32.91,33.27,
  3. 33.69,33.79,33.93,34.62,34.78,35.10,35.41,36.62,36.93,37.17,38.93,39.20,
  4. 39.56,39.84,39.87) AND y IN (59.58,56.81,57.27,54.14,56.43,51.87,54.59,
  5. 59.56,57.42,54.13,56.79,59.45) AND col_a = ‘set1’;
  6. +—-+————-+————-+——-+—————+———–+———+——+——+————-+
  7. | id | select_type | TABLE       | type  | possible_keys | KEY       | key_len | ref  | rows | Extra       |
  8. +—-+————-+————-+——-+—————+———–+———+——+——+————-+
  9. 1 | SIMPLE      | coordinates | range | x_y_col_a     | x_y_col_a | 38      | NULL312 | USING WHERE |
  10. +—-+————-+————-+——-+—————+———–+———+——+——+————-+
  11. 1 row IN SET (0.00 sec)

The ugliest thing about this, is that in real life you wouldn’t know all your possible values of X or Y, and so you may end up with a very big IN list. The workaround to this, is to create steppings of the value X and Y that we can use for indexes:


  1. ALTER TABLE coordinates ADD x_floor INT NOT NULL, ADD y_floor INT NOT NULL, DROP INDEX x_y_col_a,
  2. ADD INDEX x_floor_y_floor_col_a (x_floor, y_floor, col_a);
  4. UPDATE coordinates SET x_floor = FLOOR(x), y_floor = FLOOR(y);
  6. EXPLAIN SELECT * FROM coordinates WHERE x_floor IN (30,31,32,33,34,35,36,37,38,39,40)
  7. AND y_floor IN (50,51,52,53,54,55,56,57,58,59,60) AND col_a = ‘set1’\G
  8. *************************** 1. row ***************************
  9.            id: 1
  10.   select_type: SIMPLE
  11.         TABLE: coordinates
  12.          type: range
  13. possible_keys: x_floor_y_floor_col_a
  14.           KEY: x_floor_y_floor_col_a
  15.       key_len: 40
  16.           ref: NULL
  17.          rows: 121
  18.         Extra: USING WHERE
  19. 1 row IN SET (0.00 sec)

Fantastic! The only remaining problem with this query is that it’s not quite identical to our original. In this query 60.79 will be floored to 60 (and erroneously included in our results):


  1. SELECT * FROM coordinates WHERE x_floor IN (30,31,32,33,34,35,36,37,38,39,40)
  2. AND y_floor IN (50,51,52,53,54,55,56,57,58,59,60) AND col_a = ‘set1’;
  3. +—–+——-+——-+——-+——-+———+———+
  4. | id  | x     | y     | col_a | col_b | x_floor | y_floor |
  5. +—–+——-+——-+——-+——-+———+———+
  6. | 144 | 33.79 | 54.59 | set1  | NULL  |      33 |      54 |
  7. 38 | 39.20 | 60.79 | set1  | NULL  |      39 |      60 |
  8. +—–+——-+——-+——-+——-+———+———+
  9. 2 rows IN SET (0.00 sec)

However, this is a quick fix by re-including the original WHERE conditions (we are just no longer using an index on them):


  1. EXPLAIN SELECT * FROM coordinates WHERE x_floor IN (30,31,32,33,34,35,36,37,38,39,40)
  2. AND y_floor IN (50,51,52,53,54,55,56,57,58,59,60)
  3. AND col_a = ‘set1’ AND x BETWEEN 30 AND 40 AND y BETWEEN 50 AND 60\G
  4. *************************** 1. row ***************************
  5.            id: 1
  6.   select_type: SIMPLE
  7.         TABLE: coordinates
  8.          type: range
  9. possible_keys: x_floor_y_floor_col_a
  10.           KEY: x_floor_y_floor_col_a
  11.       key_len: 40
  12.           ref: NULL
  13.          rows: 121
  14.         Extra: USING WHERE
  15. 1 row IN SET (0.00 sec)
  17. SELECT * FROM coordinates WHERE x_floor IN (30,31,32,33,34,35,36,37,38,39,40)
  18. AND y_floor IN (50,51,52,53,54,55,56,57,58,59,60)
  19. AND col_a = ‘set1’ AND x BETWEEN 30 AND 40 AND y BETWEEN 50 AND 60;
  20. +—–+——-+——-+——-+——-+———+———+
  21. | id  | x     | y     | col_a | col_b | x_floor | y_floor |
  22. +—–+——-+——-+——-+——-+———+———+
  23. | 144 | 33.79 | 54.59 | set1  | NULL  |      33 |      54 |
  24. +—–+——-+——-+——-+——-+———+———+
  25. 1 row IN SET (0.00 sec)

My examples were only on a small amount of data (16 000 rows) that fitted in memory, but the original query would have full table scanned if I didn’t use a FORCE INDEX hint. Add more data, and if X can’t filter out enough rows by itself this can create a real problem.

Workarounds are all very good, but they also make applications more difficult to maintain. If you really want to do these types of queries, you should give Sphinx a try.

Entry posted by Morgan Tocker |

Add to: delicious | digg | reddit | netscape | Google Bookmarks

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com