Aug
22
2012
--

Recovery after DROP & CREATE

In a very popular data loss scenario a table is dropped and an empty one is created with the same name. This happens because mysqldump in many cases generates a “DROP TABLE” instruction before the “CREATE TABLE”:

DROP TABLE IF EXISTS `actor`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `actor` (
  `actor_id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
  `first_name` varchar(45) NOT NULL,
  `last_name` varchar(45) NOT NULL,
  `last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`actor_id`),
  KEY `idx_actor_last_name` (`last_name`)
) ENGINE=InnoDB AUTO_INCREMENT=201 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

If there were no subsequent CREATE TABLE the recovery would be trivial: the index_id of the PRIMARY index of table sakila/actor still resides in the InnoDB dictionary, although marked as deleted.

Knowing the index_id, it’s easy to get the table back:

# ./bin/constraints_parser.SYS_TABLES -4Df pages-1345219305/FIL_PAGE_INDEX/0-1
LOAD DATA INFILE '/root/src/recovery-tool/s_tools/dumps/default/SYS_TABLES' REPLACE INTO TABLE `SYS_TABLES` FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' LINES STARTING BY 'SYS_TABLES\t' (`NAME`, `ID`, `N_COLS`, `TYPE`, `MIX_ID`, `MIX_LEN`, `CLUSTER_NAME`, `SPACE`);
SYS_TABLES      "sakila/actor"  15      4       1       0       0       ""      0
SYS_TABLES      "sakila/actor"  15      4       1       0       0       ""      0

# ./bin/constraints_parser.SYS_INDEXES -4Df pages-1345219305/FIL_PAGE_INDEX/0-3| grep 15 | grep PRIMARY
LOAD DATA INFILE '/root/src/recovery-tool/s_tools/dumps/default/SYS_INDEXES' REPLACE INTO TABLE `SYS_INDEXES` FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' LINES STARTING BY 'SYS_INDEXES\t' (`TABLE_ID`, `ID`, `NAME`, `N_FIELDS`, `TYPE`, `SPACE`, `PAGE_NO`);
SYS_INDEXES     15      18      "PRIMARY"       1       3       0       4294967295
SYS_INDEXES     15      18      "PRIMARY"       1       3       0       4294967295
SYS_INDEXES     15      18      "PRIMARY"       1       3       0       4294967295

# ln -fs table_defs.h.actor include/table_defs.h
# make constraints_parser
# ./constraints_parser -5Uf pages-1345219305/FIL_PAGE_INDEX/0-18
actor   1       "PENELOPE"      "GUINESS"       "2006-02-15 04:34:33"
actor   2       "NICK"  "WAHLBERG"      "2006-02-15 04:34:33"
actor   3       "ED"    "CHASE" "2006-02-15 04:34:33"
actor   4       "JENNIFER"      "DAVIS" "2006-02-15 04:34:33"
actor   5       "JOHNNY"        "LOLLOBRIGIDA"  "2006-02-15 04:34:33"
...
actor   199     "JULIA" "FAWCETT"       "2006-02-15 04:34:33"
actor   200     "THORA" "TEMPLE"        "2006-02-15 04:34:33"

How does the following “CREATE TABLE” make the recovery any harder?

When a table gets dropped InnoDB deletes the respective records from the dictionary tables SYS_TABLES, SYS_INDEXES and others. Physically the records remain in place; they are just marked as deleted. That is why it was possible to recover them with the -D option.

When a user immediately creates a table with the same name, InnoDB adds new records to the dictionary. The new records are the same size as the old ones: the only variable-length field in the SYS_* tables is NAME. Hence, InnoDB puts the new records into the same positions in the pages, and the table_id and index_id of the dropped table get overwritten.

mysql> CREATE TABLE actor (
->   actor_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
->   first_name VARCHAR(45) NOT NULL,
->   last_name VARCHAR(45) NOT NULL,
->   last_update TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
->   PRIMARY KEY  (actor_id),
->   KEY idx_actor_last_name (last_name)
-> )ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.56 sec)

# ./page_parser -f /var/lib/mysql/ibdata1
# ./bin/constraints_parser.SYS_TABLES -4Df pages-1345220873/FIL_PAGE_INDEX/0-1
LOAD DATA INFILE '/root/src/recovery-tool/s_tools/dumps/default/SYS_TABLES' REPLACE INTO TABLE `SYS_TABLES` FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' LINES STARTING BY 'SYS_TABLES\t' (`NAME`, `ID`, `N_COLS`, `TYPE`, `MIX_ID`, `MIX_LEN`, `CLUSTER_NAME`, `SPACE`);
#

If the index_id is unknown, there are several ways to learn it.

Grepping for a known string

In this particular example we know that PENELOPE GUINESS was in the table. We can concatenate neighboring strings and search for them in the pages:

# grep -r "PENELOPEGUINESS" pages-1345220873/FIL_PAGE_INDEX/
Binary file pages-1345220873/FIL_PAGE_INDEX/0-18/9733-00009792.page matches

Earlier Aurimas explained how to deal with tables that contain only binary values.

This method has drawbacks:

  • It involves a lot of manual work and is thus slow
  • There is no way to differentiate tables based on values only. For example, if you dropped table actor from both db1 and db2 you’ll get two different index_id values. Both are valid, but there is no general way to know which index_id belongs to which database/table.
  • Sometimes it is not possible to find the index_id at all, either because there are too many matches or because there is no suitable string to grep for.

Using information from UNDO segment

SYS_TABLES and SYS_INDEXES are normal InnoDB tables. They’re also ruled by MVCC.

When a record is deleted from SYS_TABLES it is copied to the undo segment. A pointer to the old version is stored in the internal field DB_ROLL_PTR.

Unfortunately this pointer is lost when the new record is inserted. But the actual old values remain in the undo segment for some time.

Let’s review some headers of an undo slot:

PK len (compressed ulint, 1-5 bytes)
PK data
  old values:
    n_fields (compressed ulint, 1-5 bytes)
    field_no (compressed ulint, 1-5 bytes)
    field_len (compressed ulint, 1-5 bytes)
    field data

We know that sakila/actor is field #0 and its length is 12 bytes. So, whenever we find the byte sequence

0x00 0x0C sakila/actor 0x01 0x08

the next eight bytes will be the table_id.

The same approach works for SYS_INDEXES. We can search for the sequence

0x00 0x08 <table_id> 0x01 0x08

and the next 8 bytes will be the index_id.

In revision 69 of the Percona Data Recovery Tool for InnoDB two new tools were added: s_tables and s_indexes. They scan a file for the given pattern and output the next 8 bytes, either the table_id or the index_id.
Let’s review an example:

# ./s_tables /var/lib/mysql/ibdata1 sakila/actor
sakila/actor 15
# ./s_indexes /var/lib/mysql/ibdata1 15
0-18
0-19
#

Apparently 0-18 is the PRIMARY index and 0-19 is the idx_actor_last_name index.

It is not guaranteed, though, that an undo slot will be flushed to disk. InnoDB pages are modified in memory first, and only after some time does InnoDB write them permanently to disk. But the changes a transaction makes are written to the redo log right before a successful response to COMMIT (or within a second if innodb_flush_log_at_trx_commit != 1).
In that case we can scan the redo log:

# ./s_tables /var/lib/mysql/ib_logfile1 sakila/actor
sakila/actor 15
# ./s_indexes /var/lib/mysql/ib_logfile1 15
0-18
0-19

Now the recovery of sakila/actor becomes trivial again, just as it was before the “CREATE TABLE”.

Aug
21
2012
--

Webinar for Full Text Search Throwdown

Tomorrow, August 22 at 10:00am PDT, I’ll present a webinar called Full Text Search Throwdown.  This is a no-nonsense performance comparison of solutions for full text indexing for MySQL applications, including:

  • LIKE predicates and regular expressions
  • MyISAM FULLTEXT indexes
  • InnoDB FULLTEXT indexes
  • Apache Solr
  • Sphinx Search
  • Trigraphs

I’ll compare the performance for building indexes and querying indexes.

If you’re developing an application with text search features, this will be a very practical and informative overview of your technology options!

Register for this free webinar at http://www.percona.com/webinars/2012-08-22-full-text-search-throwdown

Aug
17
2012
--

Percona XtraDB Cluster: Multi-node writing and Unexpected deadlocks

Percona XtraDB Cluster (PXC) and the technology it uses (Galera) is an exciting alternative to traditional MySQL replication.  For those who don’t know, it gives you:

  • Fully Synchronous replication with a write latency increase equivalent to a ping RTT to the furthest node
  • Automatic cluster synchronization, both incremental and full restores
  • The ability to read and write on every node without needing to worry about replication delay

However, good engineers know there is always a trade-off.  If someone tries to sell you on a technology that sounds amazing without telling you the tradeoffs, be careful.

 

One of the tradeoffs in Galera is how multi-node writing is implemented.  This is highly simplified, but locking inside of an RDBMS can either be optimistic or pessimistic.

Pessimistic locking is what InnoDB does as a transaction executes. All the locks it needs (or thinks it needs) are acquired throughout the course of the transaction. If a given statement in the transaction needs a lock, it will wait for that lock before proceeding (this is where deadlocks and lock wait timeouts can happen). By the time the transaction reaches the COMMIT, it should be guaranteed to succeed (unless something unusual happens, like the disk failing or the server crashing) because it acquired all the locks it needed in advance.

Now, think about pessimistic locking in terms of a cluster.  It would be incredibly expensive to acquire every lock you need on every node for all running transactions.  How would deadlock detection work?  On a single node, a deadlock is fairly easy to see by looking at all transactions waiting for locks and checking for circular dependencies, but that would be much more complicated in a cluster.  It’s not to say that pessimistic locking isn’t possible in a cluster, but it is pretty tricky, and all that extra network traffic would really slow down transactions.

 

You may have already guessed it: Galera does not use pessimistic locking cluster-wide, it uses optimistic locking.  Initially a transaction on PXC behaves much like a regular InnoDB transaction.  It does pessimistic locking on the node it is started from, all the way up to the point of the commit.  At that point it knows it has gained all the locks necessary on the local node, but it has no idea about the rest of the cluster.  So, it optimistically ships the transaction out to the rest of the nodes in the cluster to see if they will agree to it.  If they do, then the commit succeeds at the simple cost of one network round trip per node (which is done in parallel to all nodes at once).

This sounds all well and good, but what is the tradeoff?

Not locking resources cluster-wide means that there can be locking conflicts.  In short, these take the form of deadlock errors to the clients of affected transactions.  Technically these aren’t necessarily deadlocks, but instead a cross-node locking conflict of some kind.  These deadlocks can happen in places where you wouldn’t necessarily expect one.

So, enough talking, let’s see what one looks like.  Suppose we have a 3 node PXC cluster.  Let’s create a table to test on and insert some data:

node2 mysql> create table autoinc ( i int unsigned not null auto_increment primary key, j varchar(32) );
Query OK, 0 rows affected (0.02 sec)

node2 mysql> show create table autoinc\G
*************************** 1. row ***************************
       Table: autoinc
Create Table: CREATE TABLE `autoinc` (
  `i` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `j` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`i`)
node2 mysql> insert into autoinc (j) values ('node2' );
Query OK, 1 row affected (0.00 sec)

node2 mysql> insert into autoinc (j) values ('node2' );
Query OK, 1 row affected (0.01 sec)

node2 mysql> insert into autoinc (j) values ('node2' );
Query OK, 1 row affected (0.00 sec)

Now, on two different nodes we will open transactions and attempt to modify the same row:

node1 mysql> set autocommit=off;
Query OK, 0 rows affected (0.00 sec)

node1 mysql> select * from autoinc;
+---+-------+
| i | j     |
+---+-------+
| 1 | node2 |
| 4 | node2 |
| 7 | node2 |
+---+-------+
3 rows in set (0.00 sec)

node1 mysql> update autoinc set j="node1" where i = 1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
node1 mysql> SHOW ENGINE INNODB STATUS\G
...
------------
TRANSACTIONS
------------
...
---TRANSACTION 83B, ACTIVE 50 sec
2 lock struct(s), heap size 376, 1 row lock(s), undo log entries 1
MySQL thread id 3972, OS thread handle 0x7fddb84e0700, query id 16408 localhost root sleeping
show engine innodb status
Trx read view will not see trx with id >= 83C, sees < 83C
TABLE LOCK table `test`.`autoinc` trx id 83B lock mode IX
RECORD LOCKS space id 0 page no 823 n bits 72 index `PRIMARY` of table `test`.`autoinc` trx id 83B lock_mode X locks rec but not gap

So far, so good.  We have an open transaction on node1 updating row 1 in our table. Note we haven’t committed yet.  Now, let’s go to node3:

node3 mysql> set autocommit=off;
Query OK, 0 rows affected (0.00 sec)

node3 mysql> select * from autoinc;
+---+-------+
| i | j     |
+---+-------+
| 1 | node2 |
| 4 | node2 |
| 7 | node2 |
+---+-------+
3 rows in set (0.00 sec)

node3 mysql> update autoinc set j="node3" where i=1;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

node3 mysql> commit;
Query OK, 0 rows affected (0.00 sec)

node3 mysql> select * from autoinc;
+---+-------+
| i | j     |
+---+-------+
| 1 | node3 |
| 4 | node2 |
| 7 | node2 |
+---+-------+
3 rows in set (0.00 sec)

Our commit succeeded!  On a single node, this update would have blocked waiting for our first transaction, but here it did not.  Let’s go back to node1 and see what happened:

node1 mysql> commit;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

node1 mysql> select * from autoinc;
+---+-------+
| i | j     |
+---+-------+
| 1 | node3 |
| 4 | node2 |
| 7 | node2 |
+---+-------+
3 rows in set (0.00 sec)

Whoops!  We got a deadlock error on commit; that doesn’t happen with normal InnoDB.

Galera allows the transaction that commits first to “win”.  Any other transactions on other nodes holding competing locks are automatically rolled back and will receive a deadlock error no matter what they try to do next (even a SELECT or something else seemingly innocuous).

Dealing with this problem, quite simply, is the tradeoff to multi-node writing in a synchronous replication system with optimistic locking.

So, when is this really a problem?  Essentially, whenever transactions across multiple nodes can take the same write locks.  Whether or not this is a problem for your system is purely application-dependent, but if you’re not sure, I’d say you should assume it will be a problem.

What can you do to work with/around this problem?  Here are some ideas from simplest to most complex:

  • Only write to a single node in the cluster (at a time): all the pessimistic locking will happen as usual on that node only, and there should be no conflicts from other nodes.  This is pretty easy to rig up with an HAProxy configuration or similar, so if your main writer node fails, clients fail over gracefully.
  • Update your application DB library to handle sudden deadlocks: if you end up with one of these deadlock errors, simply retry the transaction (see the sketch after this list).  Any software using transactions should have some facility to do this anyway (IMO).
  • Restructure your application/schema to avoid data hot spots: this is probably easier said than done in most cases, but in principle the less likely it is for a single piece of data to be written on multiple nodes at the same time, the less often this problem will happen.  For some databases this may be perfectly natural, but I believe most conventional relational databases will naturally have these kinds of problems.
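
The retry itself normally belongs in the application’s database layer, but as an illustration here is a minimal sketch of the same idea expressed entirely in SQL, as a stored procedure with a handler for SQLSTATE '40001' (the deadlock error shown above). The procedure name, the retry limit of 3, and the use of the autoinc table are assumptions made up for this example:

DELIMITER //
CREATE PROCEDURE update_autoinc_with_retry(IN p_i INT, IN p_j VARCHAR(32))
BEGIN
  DECLARE attempts INT DEFAULT 0;
  DECLARE done BOOL DEFAULT FALSE;
  WHILE NOT done AND attempts < 3 DO
    BEGIN
      -- ERROR 1213 (deadlock) maps to SQLSTATE '40001': roll back and try again
      DECLARE EXIT HANDLER FOR SQLSTATE '40001'
      BEGIN
        ROLLBACK;
        SET attempts = attempts + 1;
      END;
      START TRANSACTION;
      UPDATE autoinc SET j = p_j WHERE i = p_i;
      COMMIT;
      SET done = TRUE;
    END;
  END WHILE;
END//
DELIMITER ;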

Want to learn more about PXC and get hands-on experience?  I will walk you through real examples of this and a lot more at my Percona XtraDB Cluster in Practice tutorial at Percona Live NYC on Oct 1-2, 2012.

Aug
16
2012
--

Where to get a BZR tree of the latest MySQL releases

I just posted this to the MySQL Internals mailing list:

Hi all,

Like many of you, I’m disappointed that the bzr trees for MySQL are out
of sync with the tarball source and binary releases from Oracle.

Since Oracle has been silent on this, and this is a recurring problem,
I’ve decided to attempt to help the situation.

I’ve created the following bzr trees:
lp:percona-server/upstream-5.0
lp:percona-server/upstream-5.1
lp:percona-server/upstream-5.5
lp:percona-server/upstream-5.6

These are going to be exact mirrors of the lp:mysql-server trees, except
where the lp:mysql-server trees lag behind or differ in some
indeterminate way from what has been released as Oracle MySQL – in which
case we’ll import the latest released source tarball.

These trees have *NOTHING AT ALL* from Percona Server, they are simply
straight MySQL, as you would download from Oracle or (at least
previously) from the bzr trees hosted on launchpad.

For example, right now, the 5.1 tree differs as MySQL 5.1.65 has been
released but the last tagged release in the lp:mysql-server/5.1 tree is
MySQL 5.1.63.

For cases such as this, I have imported the MySQL source tarball into
the appropriate lp:percona-server/upstream-X tree and will reconcile it
with the Oracle lp:mysql-server/X tree if/when that becomes available.

The 5.6 tree also differs, as MySQL 5.6.6-m9 has been
released but the BZR tree is somewhere after MySQL 5.6.5 (and since I
cannot be sure that lp:mysql-server is actually still pointing to 5.6
and not an internal 5.7 tree, I’ve created it based off of the tagged
mysql 5.6.5 release).

Basically, for those who make use of BZR trees of MySQL be it for
building, making binary packages for distributions, working on your
internal branch or fork, these lp:percona-server/upstream-X trees will
always be usable – and quickly.

I’ll update them to the latest MySQL release as soon as humanly
possible – hopefully within a day or two of Oracle making a new release
announcement. Percona will have paid resources ensuring that this is
kept up to date for at least as long as it is useful for us.

The only reason they sit under lp:percona-server/upstream-X instead of
some other project on Launchpad is that I couldn’t think of any better
place to put them or a project name that I wouldn’t have to talk to a
trademark lawyer about.

It is well within the rights of Oracle to *not* publish a BZR tree at
all. What stings is the lack of communication.

I hope this helps others, and I’m quite sad that I’ve had to do this.

Aug
16
2012
--

MySQL Indexing Best Practices: Webinar Questions Followup

I had a lot of questions during my MySQL Indexing: Best Practices webinar (both the recording and slides are available now). I did not have time to answer some of them, and others are better answered in writing anyway.

Q: One developer on our team wants to replace longish (25-30) indexed varchars with an additional bigint column containing the crc64, and change the indexing to be on that column. That would clearly save indexing space. Is this a reasonable performance optimization? (Keep in mind that the prefix adaptive hashing would fail here, because the first 10 or so characters usually are the same.) Of course UNIQUE index optimizations can no longer be applied either.

A: This is a good optimization in many cases. When you apply it, though, remember that a hash can have collisions, so you will need your queries to do something like SELECT * FROM TBL WHERE hash=crc32(‘string’) AND string=’string’. The other thing to consider is that string comparison in MySQL is case insensitive by default, while the hash comparison will be case sensitive unless you lowercase the string before hashing. I would also note that 25-30 bytes is rather short for such a hack, as BIGINT itself is 8 bytes and the difference in index length, with all the overhead, is not going to be huge. I think this technique is best when you’re working with strings of 100+ bytes. I say bytes because the string length at which it makes sense is collation specific.
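
To make the collision handling concrete, here is a minimal sketch of the technique using MySQL’s built-in CRC32() and a hypothetical messages table; a 64-bit hash, as proposed in the question, would have to be computed on the application side since MySQL has no built-in CRC64 function:

CREATE TABLE messages (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  subject VARCHAR(255) NOT NULL,
  subject_crc INT UNSIGNED NOT NULL,
  KEY (subject_crc)
) ENGINE=InnoDB;

INSERT INTO messages (subject, subject_crc)
VALUES ('A rather long subject line', CRC32('A rather long subject line'));

-- the hash narrows the lookup; the exact string comparison handles collisions
SELECT * FROM messages
WHERE subject_crc = CRC32('A rather long subject line')
  AND subject = 'A rather long subject line';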

Q: ORDER BY optimization issues: select * from table where A=xxx and B between 100 and 200 order by B. It is very common for a date range to also need to be ordered. The question is how one can have optimized indexes and sorting in such a scenario, since the inequality ends index usage.

A: Actually in this case an index on (A,B) would work well. If, however, you needed to sort by some third column, say C, index (A,B,C) would not work, as the range prevents the sorting from using the index. In this case you can use the trick mentioned in the presentation and convert the sort into a union for small ranges.
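
As a rough illustration of that trick, assuming an index on (A, B, C) and a small, discrete range of B values, each UNION branch can use the index for both filtering and ordering, leaving only a small final merge to sort (the table and values here are made up):

-- instead of: SELECT * FROM t WHERE A = 1 AND B BETWEEN 100 AND 102 ORDER BY C LIMIT 10;
(SELECT * FROM t WHERE A = 1 AND B = 100 ORDER BY C LIMIT 10)
UNION ALL
(SELECT * FROM t WHERE A = 1 AND B = 101 ORDER BY C LIMIT 10)
UNION ALL
(SELECT * FROM t WHERE A = 1 AND B = 102 ORDER BY C LIMIT 10)
ORDER BY C LIMIT 10;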

Q: In the case of a junction table, would indexing on (foreignkey1,foreignkey2) AND on (foreignkey2,foreignkey1) be a good idea?

A: Yes. This is a good practice. Normally I’d do something like CREATE TABLE LINK (id1 int unsigned not null, id2 int unsigned not null, PRIMARY KEY(id1,id2), KEY K(id2)) engine=INNODB; when the table has to be traversed in both directions for different queries. This will use the fast primary key for some queries and use key K as a covering index for lookups in the other direction.
For an InnoDB table, id1 is not needed as part of the second key because the PRIMARY KEY is appended to it internally anyway. For a MyISAM table you should use K(id2,id1) in the same case. Some people prefer to define the second key as UNIQUE; this has benefits and drawbacks. The benefit is that the optimizer can apply extra optimizations knowing the index is UNIQUE; the drawback is that the insert buffer cannot be used, which can be important for large, heavily written tables.
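
To make the two access paths concrete, here is a short sketch (the lookup values are made up) showing how such a link table would be queried in both directions:

CREATE TABLE LINK (
  id1 INT UNSIGNED NOT NULL,
  id2 INT UNSIGNED NOT NULL,
  PRIMARY KEY (id1, id2),
  KEY K (id2)
) ENGINE=InnoDB;

-- traverses the clustered PRIMARY KEY (id1, id2)
SELECT id2 FROM LINK WHERE id1 = 10;

-- uses KEY K; in InnoDB the PRIMARY KEY is appended to it internally,
-- so this is a covering index lookup in the other direction
SELECT id1 FROM LINK WHERE id2 = 20;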

Q: In trick #1, will “WHERE a IN (2-4)” be worse than “WHERE a IN (2,3,4)”? In other words, is a range in an IN clause better than BETWEEN?

A: IN(2-4) will not do what you’re implying here. 2-4 will be evaluated as a math expression and the result will be IN(-2), which is not what you’re looking for.

Q: I have a primary index on an int (ID) and other indexes on columns idx1(X,A,B,C) idx2(Y,A,B,C) etc (there are 5) would I be better off making the primary A,B,C,ID and Having other indexes on one column, idx1(X) idx2(Y) etc?

A: I would wonder whether having 5 indexes that differ only by the first column is really the best setup. Regarding changing the primary key to include such a column prefix, it depends a lot on what you’re looking for. This will cluster the data by these columns, which can be helpful if you’re doing a lot of range scans on what would become the primary key, but it can also slow down your inserts and make the primary key significantly fragmented. I would also note there are some MySQL optimizer restrictions on how well it can deal with the primary key appended to an index column, especially in a case such as you’re suggesting. In the end, I would look for significant performance gains before moving to such an unusual setup.

Q: Table1 has a primary key. Table2 joins to Table1 using Table1’s primary key. Should Table2 have an index on the field that is used to join the two tables?

A: The question in this case is how MySQL will execute the join. If it first looks up rows in Table2 using some other index and then goes to Table1 to look up rows by primary key, then you do not need an index on the field in Table2 that is used to join the tables.

Q: In regards to extending an index being better than adding a new one: Let’s say I have a table named PO that has a primary key of PO # and 2 additional fields for vendor_id and order_id. If I have an index on vendor_id, order_id but my query is only selecting on vendor, will the index have any impact on the speed of the query?

A: If you extend the index from (vendor_id) to (vendor_id,order_id) you will make it 4 bytes longer (assuming order_id is an int), which will impact the queries that only use vendor_id, but probably not significantly. It is likely to be a lot less expensive than having another index on (vendor_id,order_id) in addition to the index on (vendor_id) alone. The case when you really should worry about the performance impact of extending an index is when you increase its length dramatically, for example by adding a long varchar column. In such cases it indeed might be better to add another index.

Q: We have a database that has about 400GB of indexes. The indexes can’t fit in memory anymore. How does this affect performance?

A: Typically you do not need all your indexes to be in memory, only those portions of them which are accessed frequently. The size of this “working set” depends greatly on the application and can range from 5% of the total size, or less, to almost 100%. When you go from a working set that fits in memory to one which no longer does, performance can degrade 10 or more times.

Q: In which cases should auto-increment be used as primary key?

A: Auto-increment is a good default primary key. You should pick something else only if you have a good reason to do so, for example if you would benefit from clustering the data in a different way, or if you have some other natural candidate for the primary key which gets a lot of lookups.

Q: How many indexes is too many?

A: There is a hard limit on the number of indexes you can have, which is 64 per table in recent MySQL versions. However, that is usually already too many. Instead of thinking about a hard limit on indexes, I prefer to add indexes only where they provide a positive impact on performance. At some point the gains from the indexes you add will be less than the performance loss caused by having too many indexes.

Q: Is there a difference between “`id` = 5″ and “`id` IN (5)” regarding indexes and performance?

A: Recent MySQL versions are smart enough to convert id IN (5) to id=5 (for single-item IN lists). There were times, though, when it would make a difference.

Q: Would you recommend creating an index in every table you create? Example: CREATE TABLE user_competition_entry user_id (INT), competition_id(INT);
The table is only used to record a user_id and competition_id, nothing more. Would doing a SELECT competition_id, COUNT(user_id) AS user_count FROM user_competition_entry GROUP BY competition_id; be slower without an index?

A: I would define (competition_id, user_id) as the PRIMARY KEY for such a table. It will also help the query you’re mentioning by allowing the GROUP BY to be performed without a temporary table or external sort.
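
A short sketch of what that looks like; with this primary key the GROUP BY from the question can be resolved by scanning the clustered index in order (assuming InnoDB):

CREATE TABLE user_competition_entry (
  competition_id INT UNSIGNED NOT NULL,
  user_id INT UNSIGNED NOT NULL,
  PRIMARY KEY (competition_id, user_id)
) ENGINE=InnoDB;

-- reads rows in (competition_id, user_id) order,
-- so no temporary table or filesort is needed
SELECT competition_id, COUNT(user_id) AS user_count
FROM user_competition_entry
GROUP BY competition_id;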

Q: How can we manage indexes on servers from a DBA point of view? Is there any management required, or does the server do everything itself? Especially when using a CMS where the DB structure is predefined.

A: MySQL Server will not automatically define any indexes for you. Hopefully your CMS already comes with a reasonable set of indexes; if not, you will need to add indexes manually.

Q: What are some methods to overcome vastly differing cardinality on a primary key? After running an ANALYZE on a table with 11M rows I’ve seen cardinality range from 19 to over 19,000.

A: Cardinality is a property of the data, so you usually deal with it rather than overcome it. The best thing to start with is looking at the queries around the “outliers” (the keys which have a lot of values) to see if you can make them work well. You might need to redesign the schema and queries to make that happen.

Q: How is an index used when there is an index on one column and an ORDER BY on another column? Do I need to add an index on the column used in the ORDER BY clause?

A: If an index is used for ORDER BY, the same index (not a different one) must also be used for the row selection on that table, and the search condition can only contain equality comparisons. WHERE A=5 ORDER BY B will use index (A,B) for the sorting optimization; for more complicated conditions you will need something like the “Unionizing ORDER BY” trick described in the presentation.
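
A small sketch of the two cases described above, assuming a table t with an index on (A, B):

ALTER TABLE t ADD KEY a_b (A, B);

-- equality on A plus ORDER BY B can use index (A, B) for both filtering and sorting
SELECT * FROM t WHERE A = 5 ORDER BY B;

-- a range on A prevents the same index from also satisfying the ORDER BY
SELECT * FROM t WHERE A > 5 ORDER BY B;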

Q: What is the impact on indexing of using a wider UUID such as VARCHAR(36) instead of auto-increment?

A: If you’re using UUIDs it is at least good to convert them to binary form and store them as VARBINARY(16) for performance reasons. In any case you are likely to get a table which is larger than if you used auto-increment. Having said that, there are many people using UUIDs rather successfully in applications which do not need to be optimized for peak performance, or in cases where this does not become the bottleneck. Also check out my old article on the topic, which goes into a lot more detail.
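
A minimal sketch of storing UUIDs in binary form (the table here is made up); on the MySQL versions current at the time the conversion has to be done with UNHEX and REPLACE, since there is no built-in UUID-to-binary function:

CREATE TABLE uuid_demo (
  id VARBINARY(16) NOT NULL PRIMARY KEY,
  payload VARCHAR(255)
) ENGINE=InnoDB;

-- strip the dashes and pack the 32 hex digits into 16 bytes
INSERT INTO uuid_demo (id, payload)
VALUES (UNHEX(REPLACE(UUID(), '-', '')), 'example row');

-- convert back to the familiar hex form for display
SELECT HEX(id), payload FROM uuid_demo;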

Q: How does MySQL use an index for GROUP BY?

A: If you have an index on the column, MySQL can avoid a temporary table or filesort for a GROUP BY on that column. This works because, by scanning the data in index order, MySQL gets the data already sorted and looks at “one group at a time”, computing aggregate functions as needed.

Q: Are there any special concerns or tricks for selecting using date ranges? Between dates, or after a date?

A: Date comparisons work very similarly to other comparisons and the same tricks apply; for example, in some cases you may benefit from converting a BETWEEN into an IN list of dates for better index usage.
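
As a rough example of the BETWEEN-to-IN rewrite for dates, assuming a hypothetical orders table with KEY (order_date, customer_id):

-- with a range on order_date, the index cannot also be narrowed by customer_id
SELECT * FROM orders
WHERE order_date BETWEEN '2012-08-01' AND '2012-08-03'
  AND customer_id = 10;

-- rewriting whole days as an IN list turns this into three narrow
-- (order_date, customer_id) index lookups
SELECT * FROM orders
WHERE order_date IN ('2012-08-01', '2012-08-02', '2012-08-03')
  AND customer_id = 10;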

Q: Is the InnoDB B+tree index a singly or doubly linked list at the leaf nodes? Your slide showed singly linked, but the fact that you can use an index for “ORDER BY … DESC” indicates a doubly linked list.

A: InnoDB has a doubly linked list: each leaf page contains pointers to both the previous and the next page in index order. Note, however, that this is not really a requirement for the ORDER BY DESC optimization; you can still traverse a B-tree in any direction even if there are no leaf page pointers, it just becomes relatively more expensive.

Everyone, thank you for attending and for your questions!

Check out more MySQL Webinars from Percona!

Aug
14
2012
--

Announcing Percona Server 5.6 Alpha

We are very happy to announce our first alpha of Percona Server 5.6. Based on MySQL 5.6.5 and all the improvements contained within, this is the first step towards a full Percona Server 5.6 release.

Binaries are available to download from our downloads site here: http://www.percona.com/downloads/Percona-Server-5.6/Percona-Server-5.6.5-alpha60.0/

We will post binaries to our EXPERIMENTAL repositories later; we’re undergoing final testing to ensure that it won’t cause problems for those running Percona Server < 5.6 from EXPERIMENTAL.

Percona Server 5.6.5-alpha60.0 does not contain all the features of Percona Server 5.5. We are going to “release early, release often” as we add features from Percona Server 5.5. As such, our documentation will not be complete for a little while yet and these release notes are currently the best source of information – please bear with us.

Features included:

  • Bug fix for Bug #933969
  • TIME_MS column in INFORMATION_SCHEMA.PROCESSLIST
  • ignore-create-error option to mysqldump
  • control online alter index
  • SHOW TEMP patch

We have decided to not port the following feature:

  • optimizer_fix variable

We welcome feedback on this release, and encourage people to try it out. It should go without saying that this ALPHA is NOT intended for production use.

 

Aug
14
2012
--

Announcing Percona XtraBackup 2.0.2

Percona is glad to announce the release of Percona XtraBackup 2.0.2 on August 13th, 2012. Downloads are available from our download site here and Percona Software Repositories.

This release is the current GA (Generally Available) stable release in the 2.0 series.

Bugs Fixed:

  • Fixed false positive test suite failures with grep 2.10. Bug fixed #996483 (Alexey Kopytov).
  • Incremental backup would fail if a tablespace was created between full and incremental backups. Bug fixed #1022562 (Laurynas Biveinis).
  • Assertion error in creating a compressed tablespace at delta apply time has been fixed. Bug fixed #1028949 (Laurynas Biveinis).
  • If the table was renamed after the full backup, but before the incremental backup has been taken, incremental backups would fail when being prepared. Bug fixed #932623 (Sergei Glushchenko).
  • When the variable innodb_log_block_size was set to 4096, backups would fail in the prepare stage. Bug fixed #976945 (Sergei Glushchenko).
  • Additional incremental backup tests have been added for the incremental backup data page copy. Bug fixed #1021249 (Laurynas Biveinis).

Aug
13
2012
--

European and North American Training Schedule Announced

To ensure that everyone has a chance to attend Percona’s MySQL training, we are announcing the following dates for Europe. You can always find the current schedule at Percona Training.

The dates for Europe are:

  • Begins Monday, August 27, 2012, 9am – 11am, Greenwich Mean Time
  • Begins Monday, September 3, 2012 (the class will be conducted in Spanish)
  • Begins Monday, September 24, 2012
  • Begins Monday, October 8, 2012
  • Begins Monday, November 5, 2012
  • Begins Monday, November 26, 2012
  • Begins Monday, December 10, 2012
  • Begins Monday, December 17, 2012 (the class will be conducted in Spanish)

If you are looking for dates in North America, they are:

  • Begins Monday, August 20, 2012
  • Begins Monday, August 20, 2012 , 9am – 11am, Pacific Time
  • Begins Monday, September 10, 2012
  • Begins Monday, September 24, 2012
  • Begins Monday, October 1, 2012
  • Begins Monday, October 22, 2012
  • Begins Monday, November 12, 2012
  • Begins Monday, December 3, 2012
  • Begins Monday, December 17, 2012

Aug
12
2012
--

Second Anniversary

“Use The Index, Luke!” celebrates its second anniversary today. Unfortunately, I was not able to prepare anything for this anniversary like I did last year, when I presented the 3-minute test. The test is still very popular, by the way; it is taken about 20 times a day. However, all I have for this year’s anniversary is a promise.

Do you know http://sqlfiddle.com/? You should ;) It’s an online tool to test SQL queries against five different databases (SQL Server, MySQL, Oracle, PostgreSQL and SQLite). It is not a cloud database that you could use for building applications. SQL Fiddle is a tool for collaborative online work on SQL problems. It was built to improve asking and solving SQL related questions on Q/A sites like Stack Overflow.

For me, SQL Fiddle is interesting because of the "View Execution Plan" button. As said before, it is not a cloud DB that we could use for performance tests, but we can have a look at execution plans. I believe it would be very convenient to have prepared fiddles for the examples from „Use The Index, Luke!“ so that you can easily test them online without needing your own database.

And this is exactly what will come, although I don’t have a timeline yet. Stay tuned.

Original title and author: “Second Anniversary” by Markus Winand.

Aug
12
2012
--

Customs and, well, errata…

I’ve finally managed to put all the errors that readers have reported so far online. I’m not exactly proud of this, but there are already more than 30 known errata for the German edition. There are still some German copies of "SQL Performance Explained" in stock, but it seems like I will have to arrange a second printing soon. I will surely incorporate all these corrections into the second printing. The English edition has just two errata at the moment, but I doubt it will stay that low. You are more than welcome to report any mistakes you find: I do follow them up :)

And now, something different: customs. I’ve received some inquiries asking whether I can indicate how much customs duty you might have to pay if you order the book from outside the EU. Well, I cannot tell you the exact amount, but I can help you Google it ;) The important information you need to find the rate in your country is the so-called HS number and the origin of the goods. I have to put this information on the CN22 customs declaration on the outside of the package, so authorities know what’s inside. The HS number for books is 490199 and this particular book is printed in Austria. That’s what I write on the customs declaration form. I’ve also been told that some countries might charge other fees besides customs (e.g., VAT). Sorry, I don’t know any more details.

However, I’m asking all international buyers to report whether they had to pay any fees and, if so, how much. Until now I’ve not received any feedback, but I’ve done only one international shipment so far. The next one is due on Tuesday.

Original title and author: “Customs and, well, errata…” by Markus Winand.
