Jul
14
2017
--

A Little Trick Upgrading to MySQL 5.7

Upgrading to MySQL 5.7

Upgrading to MySQL 5.7In this blog post, I’ll look at a trick we use at Percona when upgrading to MySQL 5.7.

I’ll be covering this subject (and others) in my webinar Learning MySQL 5.7 on Wednesday, July 19, 2017.

We’ve been doing upgrades for quite a while here are Percona, and we try to optimize, standardize and improve this process to save time. When upgrading to MySQL 5.7, more often than not you need to run REPAIR or ALTER via mysql_upgrade to a number of MySQL tables. Sometimes a few hundred, sometimes hundreds of thousands.

One way to cut some time from testing or executing mysql_upgrade is to combine it with mysqlcheck. This identifies tables that need to be rebuilt or repaired. The first step is to capture the output of this process:

revin@acme:~$ mysqlcheck --check-upgrade --all-databases > mysql-check.log

This provides a lengthy output of what needs to be done to successfully upgrade our tables. On my test data, I get error reports like the ones below. I’ll need to take the specified action against them:

ads.agency
error    : Table upgrade required. Please do "REPAIR TABLE `agency`" or dump/reload to fix it!
store.categories
error    : Table rebuild required. Please do "ALTER TABLE `categories` FORCE" or dump/reload to fix it!

Before we run through this upgrade, let’s get an idea of how long it would take for a regular mysql_upgrade to complete on this dataset:

revin@acme:~$ time mysql_upgrade
Enter password:
Checking if update is needed.
Checking server version.
Running queries to upgrade MySQL server.
Checking system database.
mysql.columns_priv                                 OK
mysql.db                                           OK
...
mysql.user                                         OK
Upgrading the sys schema.
Checking databases.
ads.account_preference_assoc         OK
...
Repairing tables
...
ads.agency
Note     : TIME/TIMESTAMP/DATETIME columns of old format have been upgraded to the new format.
status   : OK
...
`store`.`categories`
Running  : ALTER TABLE `store`.`categories` FORCE
status   : OK
...
Upgrade process completed successfully.
Checking if update is needed.
real	25m57.482s
user	0m0.024s
sys	0m0.072s

On a cold server, my baseline above took about 25 minutes.

The second step on our time-saving process is to identify the tables that need some action (in this case, REPAIR and ALTER … FORCE). Generate the SQL statements to run them and put them into a single SQL file:

revin@acme:~$ for t in $(cat mysql-check.log |grep -B1 REPAIR | egrep -v 'REPAIR|--');
	do echo "mysql -e 'REPAIR TABLE $t;'" >> upgrade.sql; done
revin@acme:~$ for t in $(cat mysql-check.log |grep -B1 ALTER | egrep -v 'ALTER|--');
	do echo "mysql -e 'ALTER TABLE $t FORCE;'" >> upgrade.sql; done

My upgrade.sql file will have something like this:

mysql -e 'ALTER TABLE store.categories FORCE;'
mysql -e 'REPAIR TABLE ads.agency;'

Now we should be ready to run these commands in parallel as the third step in the process:

revin@acme:~$ time parallel -j 4 -- < upgrade.sql
...
real	17m31.448s
user	0m1.388s
sys	0m0.616s

Getting some parallelization is not bad, and the process improved by about 38%. If we are talking about multi-terabyte data sets, then it is already a big gain.

On the other hand, my dataset has a few tables that are bigger than the rest. Since mysqlcheck processes them in a specific order, one of the threads was processing most of them instead of spreading them out evenly to each thread by size. To fix this, we need to have an idea of the sizes of each table we will be processing. We can use a query from the INFORMATION_SCHEMA.TABLES for this purpose:

revin@acme:~$ for t in $(cat mysql-check.log |grep -B1 ALTER | egrep -v 'ALTER|--');
	do d=$(echo $t|cut -d'.' -f1); tbl=$(echo $t|cut -d'.' -f2);
	s=$(mysql -BNe "select sum(index_length+data_length) from information_schema.tables where table_schema='$d' and table_name='$tbl';");
	echo "$s |mysql -e 'ALTER TABLE $t FORCE;'" >> table-sizes.sql; done
revin@acme:~$ for t in $(cat mysql-check.log |grep -B1 REPAIR | egrep -v 'REPAIR|--');
	do d=$(echo $t|cut -d'.' -f1); tbl=$(echo $t|cut -d'.' -f2);
	s=$(mysql -BNe "select sum(index_length+data_length) from information_schema.tables where table_schema='$d' and table_name='$tbl';");
	echo "$s |mysql -e 'REPAIR TABLE $t;'" >> table-sizes.sql; done

Now my table-sizes.sql file will have contents like below, which I can sort and pass to the parallel command again and cut even more time!

32768 |mysql -e 'REPAIR TABLE ads.agency;'
81920 |mysql -e 'ALTER TABLE store.categories FORCE;'

revin@acme:~$ cat table-sizes.sql |sort -rn|cut -d'|' -f2 > upgrade.sql
revin@acme:~$ time parallel -j 4 -- < upgrade.sql
...
real	8m1.116s
user	0m1.260s
sys	0m0.624s

This go-around, my total execution time is 8 minutes – a good 65% improvement. To wrap it up, we will need to run mysql_upgrade one last time so that the system tables are also upgraded, the tables are checked again and then restart the MySQL server as instructed by the manual:

revin@acme:~$ time mysql_upgrade --force

The whole process should be easy to automate and script, depending on your preference. Lastly: YMMV. If you have one table that is more than half the size of your total data set, there might not be big gains.

If you want to learn more about upgrading to MySQL 5.7, come to my webinar on Wednesday, July 19: Learning MySQL 5.7. This process is only one of the phases in a multi-step upgrade process when moving to 5.7. I will discuss them in more detail next week. Register now from the link below, and I’ll talk to you soon!

Jul
11
2017
--

Webinar Wednesday July 12, 2017: MongoDB Index Types – How, When and Where Should They Be Used?

MongoDB Index Types

MongoDB Index TypesJoin Percona’s Senior Technical Services Engineer, Adamo Tonete as he presents MongoDB Index Types: How, When and Where Should They Be Used? on Wednesday, July 12, 2017 at 11:00 am PDT / 2:00 pm EDT (UTC-7).

MongoDB has 12 index types. Do you know how each works, or when you should use each of them? This talk will arm you with this knowledge, and help you understand how indexes impact performance, storage and even the sharding of your data. We will also discuss some solid index operational practices, as well as some settings for things like TTL you might not know exist. The contents of this webinar will make you a Rock Star!

Register for the webinar here.

MongoDB Index TypesAdamo Tonete, Senior Technical Services Engineer

Adamo joined Percona in 2015, after working as a MongoDB/MySQL database administrator for three years. As the main database admin of a startup, he was responsible for suggesting the best architecture and data flows for a worldwide company in a 24/7 environment. Before that, he worked as a Microsoft SQL Server DBA for a large e-commerce company, working mainly on performance tuning and automation. Adamo has almost eight years of experience working as a DBA, and in the past three years he has moved to NoSQL technologies (without giving up relational databases).

Jul
10
2017
--

Webinar Tuesday July 11, 2017: Securing Your MySQL/MariaDB Data

Securing Your MySQL/MariaDB Data

Securing Your MySQL/MariaDB DataJoin Percona’s Chief Evangelist, Colin Charles as he presents Securing Your MySQL/MariaDB Data on Tuesday, July 11, 2017 at 7:00 am PDT / 10:00 am EDT (UTC-7).

This webinar will discuss the features of MySQL/MariaDB that when enabled and used improve the default usage of MySQL. Many cloud-based applications fail to:

  • Use appropriate filesystem permissions
  • Employ TLS/SSL for connections
  • Require TLS/SSL with MySQL replication
  • Use external authentication plugins (LDAP, PAM, Kerberos)
  • Encrypt all your data at rest
  • Monitor your database with the audit plugin
  • Review and rejecting SQL injections
  • Design application access using traditional firewall technology
  • Employ other MySQL/MariaDB security features

This webinar will demonstrate and advise on how to correctly implement the features above. We will end the presentation with some simple steps on how to hack a MySQL installation.

You can register for the webinar here.

Securing Your MySQLColin Charles, Percona Chief Evangelist

Colin Charles is the Chief Evangelist at Percona. He was previously on the founding team of MariaDB Server in 2009, worked at MySQL since 2005 and been a MySQL user since 2000. Before joining MySQL, he worked actively on the Fedora and OpenOffice.org projects. He’s well known within open source communities in APAC, and has spoken at many conferences.

Jul
08
2017
--

MongoDB Indexing Types: How, When and Where Should They Be Used?

MongoDB Index Types

MongoDB IndexingIn this blog post, we will talk about MongoDB indexing, and the different types of indexes that are available in MongoDB.

Note: We are hosting a webinar on July 12, 2017, where I will talk about MongoDB indexes and how to choose a good indexing plan.

MongoDB is a NoSQL database that is document-oriented. NoSQL databases share many features with relational databases, and one of them is indexes. The question is how are such documents indexed in the database?

Remember that because MongoDB is a database that writes JSONs, there is no predefined schema in the document. The same field can be a string or an integer – depending on the document.

MongoDB, as well as other databases, use B-trees to index. With some exceptions, the algorithm is the same as a relational database.

The B-tree can use integers and strings together to organize data. The most important thing to know is that an index-less database must read all the documents to filter what you want, while an indexed database can – through indexes – find the documents quickly. Imagine you are looking for a book in a disorganized library. This is how the query optimizer feels when we are looking for something that is not indexed.

There are several different types of indexes available: single field, compound indexes, hashed indexes, geoIndexes, unique, sparse, partial, text… and so on. All of them help the database in some way, although they obviously also get in the way of write performance if used too much.

  • Single fields. Single fields are simple indexes that index only one field in a collection. MongoDB can usually only use one index per query, but in some cases the database can take advantage of more than one index to reply to a query (this is called index intersection). Also, $or actually executes more than one query at a time. Therefore $or and $in can also use more than one index.
  • Compound indexes. Compound indexes are indexes that have more than one indexed field, so ideally the most restrictive field should be to the left of the B-tree. If you want to index by sex and birth, for instance, the index should begin by birth as it is much more restrictive than sex.
  • Hashed indexes. Shards use hashed indexes, and create a hash according to the field value to spread the writes across the sharded instances.
  • GeoIndexes. GeoIndexes are a special index type that allows a search based on location, distance from a point and many other different features.
  • Unique indexes. Unique indexes work as in relational databases. They guarantee that the value doesn’t repeat and raise an error when we try to insert a duplicated value. Unique doesn’t work across shards.
  • Text indexes. Text indexes can work better with indexes than a single indexed field. There are different flags we can use, like giving weights to control the results or using different collections.
  • Sparse/Partial indexes. Sparse and partial indexes seem very similar. However, sparse indexes will index only an existing field and not check its value, while partial indexes will apply a filter (like greater than) to a field to index. This means the partial index doesn’t index all the documents with the existing field, but only documents that match the create index filter.

We will discuss and talk more about indexes and how they work in my webinar MongoDB® Index Types: How, When and Where Should They Be Used? If you have questions, feel free to ask them in the comments below and I will try to answer all of them in the webinar (or in a complementary post).

I hope this blog post was useful, please feel free to reach out me on twitter @AdamoTonete or @percona.

Jul
03
2017
--

Webinar Wednesday July 5, 2017: Indexes – What You Need to Know to Get the Most Out of Them

Indexes

IndexesJoin Percona’s Senior Architect, Matthew Boehm, as he presents Indexes – What You Need to Know to Get the Most Out of Them on Wednesday, July 5, 2017 at 8:00 am PDT / 11:00 am EDT (UTC-7).

Proper indexing is key to database performance. Find out how MySQL uses indexes for query execution, and then how to come up with an optimal index strategy. In this session, you’ll also learn how to know when you need an index, and also how to get rid of indexes that you don’t need to speed up queries.

Register for the webinar here.

MatthewMatthew Boehm, Architect

Matthew joined Percona in the fall of 2012 as a MySQL consultant. He quickly rose to Architect and became one of Percona’s training instructors. His areas of knowledge include the traditional Linux/Apache/MySQL/PHP stack, memcached, MySQL Cluster (NDB), massive sharding topologies, PHP development and a bit of MySQL-C-API development.

Jun
26
2017
--

Webinar Tuesday June 27, 2017: MariaDB® Server 10.2 – The Complete Guide

MariaDB Server 10.2

MariaDB Server 10.2Join Percona’s Chief Evangelist, Colin Charles as he presents MariaDB Server 10.2: The Complete Guide on Tuesday, June 27, 2017, at 7:00 am PDT / 10:00 am EDT (UTC-7).

The new MariaDB Server 10.2 release is out. It has some interesting new features, but beyond just a list of features we need to understand how to use them. This talk will go over everything new that MariaDB 10.2 has to offer.

In this webinar, we’ll learn about Window functions, common table expressions, finer-grained CREATE USER statements, and more – including getting mysqlbinlog up to parity with MySQL.

There are also unique MariaDB features that don’t exist in MySQL like encryption at rest, integrated Galera Cluster, threadpool, InnoDB defragmentation, roles, extended REGEXP, etc.

The webinar describes all the new features, both MySQL compatible and MariaDB-only ones, and show usage examples and practical use cases. We’ll also review the feature roadmap for MariaDB Server 10.3.

Register for the webinar here.

MariaDB Server 10.2Colin Charles, Chief Evangelist

Colin Charles is the Chief Evangelist at Percona. He was previously part of the founding team of MariaDB Server (2009), worked at MySQL since 2005, and has been a MySQL user since 2000. Before joining MySQL, he actively worked on the Fedora and OpenOffice.org projects. He’s well known within open source communities in APAC and has spoken at many conferences.

Jun
20
2017
--

Webinar Thursday June 22, 2017: Deploying MySQL in Production

Deploying MySQL

Join Percona’s Senior Operations Engineer, Daniel Kowalewski as he presents Deploying MySQL in Production on Thursday, June 22, 2017 at 11:00 am PDT / 2:00 pm EDT (UTC-7).

 MySQL is famous for being something you can install and get going in less than five minutes in terms of development. But normally you want to run MySQL in production, and at scale. This requires some planning and knowledge. So why not learn the best practices around installation, configuration, deployment and backup?

This webinar is a soup-to-nuts talk that will have you going from zero to hero in no time. It includes discussion of the best practices for installation, configuration, taking backups, monitoring, etc.

Register for the webinar here.

Deploying MySQLDaniel Kowalewski, Senior Technical Operations Engineer

Daniel has been designing and deploying solutions around MySQL for over ten years. He lives for those magic moments where response time drops by 90%, and loves adding more “nines” to everything.

Jun
19
2017
--

Upcoming HA Webinar Wed 6/21: Percona XtraDB Cluster, Galera Cluster, MySQL Group Replication

High Availability

High AvailabilityJoin Percona’s MySQL Practice Manager Kenny Gryp and QA Engineer, Ramesh Sivaraman as they present a high availability webinar around Percona XtraDB Cluster, Galera Cluster, MySQL Group Replication on Wednesday, June 21, 2017 at 10:00 am PDT / 1:00 pm EDT (UTC-7).

What are the implementation differences between Percona XtraDB Cluster 5.7, Galera Cluster 5.7 and MySQL Group Replication?

  • How do they work?
  • How do they behave differently?
  • Do these methods have any major issues?

This webinar will describe the differences and shed some light on how QA is done for each of the different technologies.

Register for the webinar here.

High AvailabilityRamesh Sivaraman, QA Engineer

Ramesh joined the Percona QA Team in March 2014. He has almost six years of experience in database administration and, before joining Percona, was giving MySQL database support to various service and product based internet companies. Ramesh’s professional interests include writing shell/Perl script to automate routine tasks and new technology. Ramesh lives in Kerala, the southern part of India, close to his family.

High AvailabilityKenny Gryp, MySQL Practice Manager

Kenny is currently MySQL Practice Manager at Percona.

Jun
13
2017
--

Webinar Thursday, June 15, 2017: Demystifying Postgres Logical Replication

Postgres Logical Replication

Postgres Logical ReplicationJoin Percona’s Senior Technical Services Engineer Emanuel Calvo as he presents Demystifying Postgres Logical Replication on Thursday, June 15, 2017 at 7 am PDT / 10 am EDT (UTC-7).

The Postgres logical decoding feature was added in version 9.4, and thankfully it is continuously improving due to the vibrant open source community. In this webinar, we are going to walk through its concepts, usage and some of the new things coming up in future releases.

Logical decoding is one of the features under the BDR implementation, allowing bidirectional streams of data between Postgres instances. It also allows you to stream data outside Postgres into many other data systems.

Register for the webinar here.

Emanuel CalvoEmanuel Calvo, Percona Sr. Technical Services

Emanuel has worked with MySQL for more than eight years. He is originally from Argentina, but also lived and worked in Spain and other Latin American countries. He lectures and presents at universities and local events. Emanuel currently works as a Sr. Technical Services at Percona, focusing primarily on MySQL. His professional background includes experience at telecommunication companies, educational institutions and data warehousing solutions. In his career, he has worked as a developer, SysAdmin and DBA in companies like Pythian, Blackbird.io/PalominoDB, Siemens IT Solutions, Correo Argentino (Argentinian Postal Services), Globant-EA, SIU – Government Educational Institution and Aedgency among others. As a community member he has lectured and given talks in Argentina, Brazil, United States, Paraguay, Spain and Belgium as well as written several technical papers.

Jun
06
2017
--

Upcoming Webinar Thursday June 8, 2017: MongoDB Shell – A Primer

MongoDB Shell

MongoDB ShellJoin Percona’s Solutions Engineer, Rick Golba as he presents MongoDB Shell: A Primer on Thursday, June 8, 2017, at 11 am PDT / 2 pm EDT (UTC-7).

Every good DBA should be a master of the database shell. In this webinar, we will help you understand how to structure shell commands and discuss all the advanced functions and ways to chain commands in the mongo shell.

This webinar will teach you how to:

  • Limit the number of documents, or skip documents, when running a query
  • Work with the MongoDB aggregation pipeline
  • View an explain plan for a MongoDB query
  • Understand the MongoDB write concerns
  • Validate the contents of a database on various nodes in a replica set
  • Understand the MongoDB read preference

We will touch on CRUD functions, but a great deal more time will be spent on the areas above. We will have a dedicated webinar for mastering CRUD operations in MongoDB in the future.

Register for the webinar here.

MongoDB ShellRick Golba, Solutions Engineer

Rick Golba is a Solutions Engineer at Percona. Rick has over 20 years of experience working with databases. Prior to Percona, he worked as a Technical Trainer for HP/Vertica.

 

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com