Jun
30
2012
--

Find unused indexes

I wrote one week ago about how to find duplicate indexes. This time we’ll learn how to find unused indexes to continue improving our schema and the overall performance. There are different possibilites and we’ll explore the two most common here. User Statistics from Percona Server and pt-index-usage.

User Statistics

User Statistics is an improvement on Percona Server that adds some tables to Information Schema with useful information to understand the server activity and identify the source of the load. Those tables store statistics from our users, tables, indexes and so on, information very helpful to have a clear picture of our users are doing in the database.

To accomplish our task of finding unused indexes we’re going to use INFORMATION_SCHEMA.INDEX_STATISTICS but first we should enable User Statistics:

mysql> SET GLOBAL userstat=on;

After all this hard work we have the statistics plugin gathering information. I usually enable it during load peaks to get realistic information about what is happening in the server. To generate some load on the database I’m going to use the benchmark software tpcc-mysql.

After some runs we can start to analyze the information on INDEX_STATISTICS table.

mysql> SELECT * FROM INDEX_STATISTICS;
+--------------+------------+--------------+-----------+
| TABLE_SCHEMA | TABLE_NAME | INDEX_NAME   | ROWS_READ |
+--------------+------------+--------------+-----------+
| tpcc         | orders     | PRIMARY      |     59472 |
| tpcc         | customer   | PRIMARY      |    128652 |
| tpcc         | stock      | PRIMARY      |   1128357 |
| tpcc         | district   | PRIMARY      |    116107 |
| tpcc         | item       | PRIMARY      |    282870 |
| tpcc         | warehouse  | PRIMARY      |     84959 |
| tpcc         | orders     | idx_orders   |      2832 |
| tpcc         | customer   | idx_customer |    175240 |
| tpcc         | new_orders | PRIMARY      |     56202 |
| tpcc         | order_line | PRIMARY      |   1164682 |
+--------------+------------+--------------+-----------+

These are the index used during the benchmark. Let’s compare it with the data on INNODB_INDEX_STATS to get the list of unused indexes:

mysql> SELECT INNODB_INDEX_STATS.TABLE_NAME, INNODB_INDEX_STATS.INDEX_NAME from INNODB_INDEX_STATS WHERE CONCAT(INNODB_INDEX_STATS.index_name, INNODB_INDEX_STATS.table_name)NOT IN(SELECT CONCAT(index_statistics.index_name, index_statistics.table_name) FROM index_statistics) AND INNODB_INDEX_STATS.TABLE_SCHEMA='tpcc' AND INNODB_INDEX_STATS.INDEX_NAME<>'GEN_CLUST_INDEX';
+------------+-------------------+
| TABLE_NAME | INDEX_NAME        |
+------------+-------------------+
| stock      | fkey_stock_2      |
| order_line | fkey_order_line_2 |
| history    | fkey_history_1    |
| history    | fkey_history_2    |
+------------+-------------------+

Here we see one of the limitations on User Statistics, it can only track direct lookups on index keys but not lookups done during constraint checks. That’s the reason FK are marked as unused so that means all the other indexes has been used during the benchmark.

The second limitation is with partitioned tables, you can’t get index statistics usage from those tables.

pt-index-usage

The previous example only works with Percona Server so with stock MySQL we should try another approach.

Percona Toolkit has a tool to check the usage of our index from the slow query log. The concept is easy to understand. pt-index-usage reads the slow query log and execute every query with EXPLAIN to ask MySQL which indexes would it use. At the end of the process you’ll have an output with a list of not used indexes. This is an example:

root@debian:/var/log# pt-index-usage slow.log
slow.log:  11% 03:58 remain
slow.log:  21% 03:43 remain
slow.log:  32% 03:09 remain
[...]
ALTER TABLE `tpcc`.`order_line` DROP KEY `fkey_order_line_2`; -- type:non-unique
ALTER TABLE `tpcc`.`orders` DROP KEY `idx_orders`; -- type:non-unique
ALTER TABLE `tpcc`.`stock` DROP KEY `fkey_stock_2`; -- type:non-unique

If you compare these results with the previous example, there are some differences and these are some of the reasons:

    • pt-index-usage uses EXPLAIN and User Statistics number of rows read for every index. So we’re comparing an estimation with real data.
    • EXPLAIN, as I said before, is an estimation from the optimizer and sometimes the real query can use a different execution plan.
    • pt-index-usage tries to convert non-SELECT queries to SELECT queries and is not always possible to get a perfect conversion so there can be some differences.
  • Be cautious with Unique indexes, because maybe they are not used for index lookups but your application may need them to avoid duplicates on the column.

    Take in account that depending on the size of the log the server would need lot of process time and cpu power so I recommend you to not use in the production server. Run it on a slave or a testing environment with an similar data estructure and size.

    Conclusion

    We’ve learnt how to find unused indexes using different approaches and we’ve also learnt that we should’t follow tool’s suggestion literally. These tools are here to help us and after that is our job to check and benchmark the changes on our testing environment to measure the impact on performance. You won’t need to drop all such indexes but it gives you an idea of which of them you should evaluate. Good tools to help us on that evaluation are pt-log-player and pt-upgrade.

    It’s worth to mention the importance of taking a long enough period of logs or statistics to analyze all relevant workload. In some cases there are processes that run once a day or a week and we should have that information in our analysis report.

    Jun
    28
    2012
    --

    Percona Server 5.5.25-27.0 release delayed

    Due to a bug in upstream MySQL ( http://bugs.mysql.com/bug.php?id=65745 ), we’ve decided to delay the release of the next Percona Server release until it is resolved. We had originally expected to release Percona Server 5.5.25-27.0 today and instead I’m informing you that we are not.

    Quoting Hartmut, who reported the bug:

    An UPDATE that should only modify a single row runs forever and blows up table / tablespace size until there is no more disk space available.

    This isn’t something we want to expose our users to.

    Oracle now lists the latest MySQL 5.5 version on their download site ( http://dev.mysql.com/downloads/mysql/5.5.html ) as 5.5.24, which is the previous release.

    We will keep our 5.5.25 binaries available in our EXPERIMENTAL repository and download site, but will delay our final 5.5.25 release until a resolution to this bug is available.

    Jun
    28
    2012
    --

    MySQL Upgrade Webinar Questions Followup

    I did a Webinar about MySQL Upgrade – Best Practices Yesterday and there were some questions we could not answer during Webinar, following Jay’s Lead I decided to post them as a Blog Post.

    Q: Can you go directly MySQL 5.0 to 5.5 for MyISAM tables?
    MyISAM have not been getting any significant development since MySQL 4.1, so in-place upgrade of MySQL 5.0 to MySQL 5.5 should be rather safe from data standpoint. There are still possibilities for sorting order and data type related changes, so you still need to run mysql_upgrade to check if any tables need to be rebuilt for new versions. Note this only applies to the data part of upgrade, you still need to keep into account possible query changes, replication etc.

    Q: Is it possible to move MyISAM to Innodb through plain ALTER TABLE statements?
    Yes. This is exactly how it is done in many cases. There is really no gain in dumping table via mysqldump and loading it back. If performance is of importance I would recommend using Percona Server with enabled expand_fast_index_creation feature, as I just wrote this option can help creating tables a lot faster. You can also use pt-online-schema-change tool for online migration.

    Q: Should we change ROW replication to MIXED during Upgrade?
    I prefer simple solutions when possible, which is either using STATEMENT or ROW replication. MIXED mode means you will have to deal with STATEMENT and ROW replication issues at the same time. I would only recommend MIXED replication mode if you have something substantial to gain.

    Q: How would Upgrade work if Old Server has Antelope format and New one is Barracuda?
    It should not be the problem. Innodb internal format is rather low level and it rarely impacts application functionality wise. As long a you go over normal upgrade checks you should be fine.

    Q: What are the key concerns upgrading MySQL 4.1 to MySQL 5.5?
    This is jumping over 2 Major versions and many years of development so it is the road less traveled and something not really tested well. One of specific issues might be making MySQL Replication to work. You might need to inject MySQL 5.0 in between so it will convert binary log events, as described here yet it does not guaranty replication will work for any application. There is just too many events. In the worse case you might need to do staged upgrades for your replicated setup or plan some downtime for upgrade.

    Q: Upgrading to MySQL 5.5 do we really need all tables changed to Innodb and character set set to utf8?
    No. MySQL 5.5 continues to support MyISAM tables, so you can continue use them as needed. I would evaluate changing though as MyISAM tables are on track to be depreciated in future MySQL versions also MyISAM tables are essentially at the state they were in MySQL 4.1 while Innodb storage engine got dramatic performance improvements and is getting better in each new release. Also note the default storage engine changed to Innodb in MySQL 5.5, meaning you might need to set storage_engine=MYISAM if you’re relaying on tables to be created as MyISAM by default. Regarding setting character set to utf8 – this is not needed as MySQL 5.5 continues to support various character sets same as previous MySQL versions. Moreover utf8 character set continues to cause significant performance overhead so you should only use it in case you’re storing multi language data.

    Q: What to do if Query Execution Plan changes for the worst during Upgrade?
    This is one of the big challenges with upgrade which can’t be resolved in all cases – sometimes queries just need to be changed or supplemented with hints to get to old execution plans. In many cases though you can get to former execution plan either by updating statistics (ANALYZE TABLE) or disabling some of the optimization/statistics related features in new version. Innodb statistics can especially be the problem as they can be very volatile and sometimes it is hard to get them to be same as in old version.

    Q: What is about running 2 instances of MySQL with different versions to test upgrade?
    This may or may not be good idea. In small environments when we often care more about functionality than performance this can be great way to validate the application is working without significant extra expenses. It also can be used for testing if hardware if there is little extra hardware available. If you’re doing performance testing though you need to ensure only one
    instance is running at the same time so other has full resources (including memory) available. Another thing to mention having multiple instances of different MySQL versions on the same box typically means you will install them from binary distribution instead of packages (most typically used in production) which adds the difference compared to your production use. If resources allow I prefer to have upgrade testing done in the environment which is maximally close to production.

    Q: How safe is it to upgrade from MySQL 5.5 to Percona Server 5.5
    Compatibility is our highest priority with Percona Server, so it should be good as drop in replacement with no data conversion needed. You can also rollback easily unless you have enabled any special Percona Server features which are marked as making database binary incompatible with MySQL. Our focus on compatibility is indeed so high there are number of great features which are disabled by default to make sure you do not start relying on them by accident, so make sure to check out our Documentation to get most our of your Percona Server install.

    Thank you for your questions and for attending. See you at our other MySQL Webinars!

    Jun
    27
    2012
    --

    ALTER TABLE: Creating Index by Sort and Buffer Pool Size

    Today I was looking at the ALTER TABLE performance with fast index creation and without it with different buffer pool sizes. Results are pretty interesting. I used modified Sysbench table for these tests because original table as initially created only has index on column K which initially contains only zeros, which means index is very small and also very fast to build by insertion as insertions happen in the “end” of index tree. I’ve updated column to have bunch of long random strings update sbtest set c=concat(sha1(rand()),’-’,sha1(rand()),’-’,sha1(rand()),’-’,sha1(rand()),’-’,sha1(rand())); and added key on column C: alter table sbtest add key c(c); The box I’m using for test is rather old box with 8GB of RAM and 4*SATA disks in RAID10. I used 10mil row table which would look as following in terms of data and index size:

    mysql> show table status like "sbtest" \G
    *************************** 1. row ***************************
               Name: sbtest
             Engine: InnoDB
            Version: 10
         Row_format: Compact
               Rows: 10000060
     Avg_row_length: 224
        Data_length: 2247098368
    Max_data_length: 0
       Index_length: 1460322304
          Data_free: 7340032
     Auto_increment: 10000001
        Create_time: 2012-06-27 13:04:56
        Update_time: NULL
         Check_time: NULL
          Collation: latin1_swedish_ci
           Checksum: NULL
     Create_options:
            Comment:
    1 row in set (0.00 sec)
    

    So we have about 3.7GB total size out of which 1.4GB is the Index.

    First Lets look at results of Adding and Dropping column for 128M buffer pool, sized to illustrate the case of index being significantly larger than amount of memory.

    Blue Columns in this case is the default configuration, second is with expand_fast_index_creation=1 which builds table with primary key only when creating a table and when builds all indexes by sort. The speedup we can observe is over 20 times. Note this benefit only applies to the case when you’re adding/dropping/changing columns as if you just add index it will be done by sort even without this additional Percona Server feature.

    Lets now see with 6GB buffer pool which is large enough to fit complete table with index (but is a bit too small to keep 2 copies) In this case we’ll see the following picture:

    Comparing these two graphs we can see few different observations:

    Buffer Pool Size Matters There is 10x difference between the case when index fits in memory to when it does not. The difference can be even larger in the real world. Working with indexing, such as inserting data in random order can become very slow when index does not fit to memory any more.

    Creating Index by sort is great Creating index by sort can provide over 20x performance improvement to the speed of ALTER TABLE, as this is not only step which ALTER TABLE does the improvement to the index creation speed itself has to be even larger. It is also offers better performance for in memory case, even though in this case the gain is just about 2x for this workload. Still substantial.

    Sorting needs OS Cache It is not clearly seen from the graphs but with index built by sort enabled the performance of ALTER TABLE with 128MB was better than with 6GB. It looks like this does not make sense but in fact it does if you think about building index stage. On this box with 8GB of memory allocating 6GB to buffer pool made very little memory available for OS cache, so index build process required more IO than if only 128MB were used for buffer pool. The difference was rather small – less than 10% though, which is because disks are not that bad with sequential IO, which is what file merge process uses. This is important illustration as in many cases when people see performance reduction with large buffer pool they forget what bottleneck might be somewhere else where memory resources might be just needed more.

    Take Away: If you’re having large tables and need to run ALTER TABLE which rebuilds the table or OPTIMIZE TABLE do not forget to enable expand_fast_index_creation it can make a whole process a lot faster.

    P.S The tests were done with Percona Server 5.5.24-26.0

    Jun
    27
    2012
    --

    Announcing Percona Live MySQL London

    Looking for a great opportunity to share your MySQL knowledge with the MySQL community? Here is your chance.

    Percona is happy to announce that the dates of the Percona Live MySQL London Conference have been set for Dec. 3-4, 2012. Call for speakers is underway and will be open until August 17th, 2012. The conference committee will review proposals that fit into the likely conference tracks of:

    • Developing Applications
    • Database Administration
    • Trends in Architecture and Design
    • Tools
    • Utilizing Hardware
    • New Features
    • Best Practices for Businesses
    • Replication
    • High Availability Strategies

    Each session is 50 minutes in length, including a Q&A period.

    The conference will also include tutorial sessions. Tutorials should present immediate and practical applications of in-depth MySQL knowledge targeted at a level between a training class and a conference breakout session. Tutorials are three or six hours in length including a Q&A period.

    Submit a speaker proposal by visiting the Percona Live London 2012 website:
    http://www.percona.com/live/london-2012/user/register

    When I design talks for conferences like Percona Live, my goal is to give the audience some new and practical technical knowledge.  I want them to leave after my talk saying, “I can’t wait to get back to my office and try that.”

    Other types of talks are great, too.  A talk can show the results of some deep benchmarking, or discuss internals that give insight into how to optimize or troubleshoot a given scenario.  A success story can present how a particular project approached a problem, and how they used a particular combination of technology to good effect.  Another can demo some new solution in development, giving attendees a preview of things to come.

    All of these types of talks are welcome at Percona Live, and that’s what makes these events the type where attendees have a hard time deciding which room to be in, and how they can get more of their colleagues to attend.  So join up now, and submit a speaker proposal!

    Jun
    25
    2012
    --

    Announcing Percona XtraBackup 2.0.1

    Percona is glad to announce the release of Percona XtraBackup 2.0.1 on June 25th, 2012. Downloads are available from our download site and Percona Software Repositories.

    This release is the current GA (Generally Available) stable release in the 2.0 series.

    This release contains over 20 fixed bugs, some of the more significant bugs fixed are:

    • After creating a full compressed backup, performing a compressed/uncompressed incremental backup would fail because xtrabackup_checkpoints was compressed. This has been fixed by omitting xtrabackup_checkpoints from compression, so that a full backup could be used for incremental backups without decompression. Bug fixed #977652 (Alexey Kopytov).
    • Streaming backups with –stream=tar would fail if the file size was bigger than 8GB. Fixed by changing the libarchive format from USTAR to restricted PAX which supports bigger file sizes. Bug fixed #977998 (Alexey Kopytov).
    • innobackupex was calling the tar utility unconditionally when streaming ib_lru_dump and xtrabackup_galera_info. Which led to a broken stream when the xbstream format was used. Bug fixed #983720 (Sergei Glushchenko).
    • XtraBackup binary could fail with the “log block checksum mismatch” error when reading the partially written log block. Bug fixed #1015416 (Alexey Kopytov).

    Release notes with all the bugfixes for Percona XtraBackup 2.0.1 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

    Jun
    25
    2012
    --

    Symfony Live San Francisco 2012: The Schedule

    The schedule for the
    Symfony Live conference in San Franscico has just been published.

    For the first time in the US, Symfony developers will be able to take the
    Symfony certification exam on September 29th. You can already href="http://trainings.sensiolabs.com/fr/training/symfony-certification/sessions">register.

    We are still looking for
    sponsors to help us cover
    the cost of the conference; read the
    guide
    and consider becoming a sponsor. I’d like to thank the early sponsors for the
    San Francisco conference: Microsoft, ServerGrove, Scalable Path, and KnpLabs.

    Follow us on Twitter to get the latest
    news about the conference. And if you tweet about the conference, consider
    using the official hashtag
    #symfony_live.

    It’s time to buy
    your ticket.


    Be trained by Symfony experts
    2012-07-12 Köln
    2012-07-16 Sheffield
    2012-07-16 Sheffield

    Written by in: Zend Developer |
    Jun
    24
    2012
    --

    A week of symfony #286 (18->24 June 2012)

    This week, the long awaited Symfony 2.1 first beta was released. Meanwhile, the official repository achieved a very remarkable milestone: more than 5,000 watchers. In addition, the first details of the upcoming Symfony Live London 2012 conference were unveiled.

    Symfony2 development highlights

    Master branch:

    • 024cb91:
      [Security] added failing testcases for EncoderFactory
    • 0be11f4:
      [Security] fix retrieval of encoder when configured for concrete classes
    • 24eb396:
      [Filesystem] added chown() and chgrp() methods, added a recursive option for chmod() and added access to the second and third arguments of touch() function
    • 61a9345, faccd25:
      [Security] fixed some unit tests for PHP 5.3.3
    • bb87a71:
      [Console] use mode command to detect terminal size on Windows
    • d982bac:
      fixed built-in server for PHP > 5.4.1
    • 8eca661:
      [FileSystem] explained possible failure of symlink creation in windows
    • 45df2e6:
      [Config] updated resources API to be more explicit
    • 6b39688:
      [Config] moved DirectoryResource childs retrieving to the special getFilteredChilds method
    • 1f9ba38:
      [Config] getFilteredChildResources() method added to DirectoryResource
    • d7c24eb:
      [Config] added new methods and their tests to File and Directory resources
    • ff9c132:
      [Config] added type prefixes to resource ids
    • 241aa92:
      [Config] added existence check to some resource methods
    • 0f9be2f:
      updated for version 2.1 Beta1

    2.0.x branch:

    • 2c12ed3:
      added a missing provide for the swiftmailer bridge
    • 680b83c:
      [Security] allowed “0″ as a password

    Repository summary: 5,035 watchers (#1 in PHP, #30 overall) and 1342 forks (#1 in PHP, #12 overall).

    They talked about us


    Be trained by Symfony experts
    2012-07-12 Köln
    2012-07-16 Sheffield
    2012-07-16 Sheffield

    Written by in: Zend Developer |
    Jun
    24
    2012
    --

    Where Percona software goes before it’s released

    Where does software go while it’s waiting for release? The answer is: our experimental repository!

    For anywhere from a few days to a week before we release them, we stage our software releases in the TESTING download area and in separate software repositories for Debian/Ubuntu and CentOS/RHEL. This is so that we can co-ordinate things such as release notes on the web site, blog posts and to ensure that we have a safe testing ground before we unleash them to the world.

    You can always get Percona Server and Percona XtraBackup TESTING binaries from:

    For example, if you went there currently you’d find the upcoming Percona XtraBackup 2.0.1 release due later this week.

     

    Jun
    22
    2012
    --

    A week of symfony #285 (11->17 June 2012)

    This week, Symfony2 development focused on Form and Filesystem components. In addition, the Security component was updated to allow class names to be passed as an argument to getEncoder() method.

    Development mailing list

    Symfony2 development highlights

    Master branch:

    • 0995b1f:
      moved the UserPassword validator from the security bundle to the security component to make it reusable outside the full-stack framework
    • 31843cf:
      [FrameworkBundle] added info to config
    • 66ff060:
      fixed an issue with session mocking in functional tests that do not start with a fresh session instance for each request (Silex for instance)
    • 1e83206:
      [Form] FormBuilderInterface->getParent() doesn’t take any arguments
    • b5cf337:
      [Form] enhanced the form type mismatch error message
    • 0b58828:
      [FileSystem] removed symlinks under windows
    • abab929:
      [Filesystem] prevent empty value in isAbsolutePath and use rtrim in mirror
    • 3ab9a6e:
      [Yaml] fixed string parsing
    • 0f67ca8:
      [HttpFoundation] fixed StreamedResponse with HEAD as a method
    • fc3ebb8:
      [FileSystem] added if-windows check
    • f881d28:
      [Validator] avoid to get information from the cache when we already have them locally (also fixes usage of this class when not using a cache)
    • 76b2ed4:
      moved validator translation files to the Form and Validator components to make them reusable
    • 38cad9d:
      [Filesystem] added exists method
    • f65ade2:
      fixed order of translation files registration
    • 0b8b76b:
      [Security] allowed class names to be passed as an argument to EncoderFactoryInterface::getEncoder()

    2.0.x branch:

    • 5d88255:
      Authorization header should only be rebuild when Basic Auth scheme is used
    • ba16a51:
      changed getName() to name on all Reflection* object calls
    • 3ce8227, cd0aa37:
      [Security] only redirect to urls called with http method GET

    Repository summary: 4,972 watchers (#1 in PHP, #31 overall) and 1320 forks (#1 in PHP, #12 overall).

    Updated plugins

    • sfDoctrineActAsTaggable:

      • refactored typeahead to use jQuery.ajax. This makes debugging easier.
    • apostrophe:

      • new extraPageSettings partial allows you to easily render more fields in the page settings form without overriding the settingsSuccess template in its entirety
      • If the request has a class parameter, apply it to the admin container and the edit form
      • removed the 10px inset on the crop tool when no aspect ratio is set

    They talked about us


    Be trained by Symfony experts
    2012-07-12 Köln
    2012-07-16 Sheffield
    2012-07-16 Sheffield

    Written by in: Zend Developer |

    Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com