Webinar Thursday July 27, 2017: Database Backup and Recovery Best Practices (with a Focus on MySQL)


Join Percona’s Architect, Manjot Singh, as he presents Database Backup and Recovery Best Practices (with a Focus on MySQL) on Thursday, July 27, 2017 at 11:00 am PDT / 2:00 pm EDT (UTC-7).

In the case of a failure, do you know how long it will take to restore your database? Do you know how old the backup will be? In this presentation, we will cover the basics of best practices for backup, restoration and business continuity. Don’t put your company on the line due to bad data retention and backup policies.

Register for the webinar here.

Manjot Singh, Architect

Manjot Singh is an Architect with Percona in California. He loves to learn about new technologies and apply them to real-world problems. Manjot is a veteran of startup and Fortune 500 enterprise companies alike, with a few years spent in government, education and hospital IT. Now he consults for Percona with companies around the world on many interesting problems.

Kinetica scores $50 million Series A for super-charged in-memory database solution

Kinetica’s roots as a company go back to a 2009 consulting project for US intelligence services. When they couldn’t find a solution on the market to meet the strict demands of the army and NSA to track terrorists in real time, they decided to build it. Today, it’s an in-memory database solution that relies on commodity hardware running Nvidia GPUs to supercharge the processing.…


With version 2.0, Crate.io’s database tools put an emphasis on IoT

Crate.io, the winner of our Disrupt Europe 2014 Battlefield, is launching version 2.0 of its CrateDB database today. The tool, which is available in both an open source and enterprise version, started out as a general-purpose but highly scalable SQL database. Over time, though, the team found that many of its customers were using the service for managing their machine data.


Dropping the Foreign Key Constraint Using pt-online-schema-change


In this blog post, we’ll look at how to get rid of an unused Foreign Key (FK) constraint and/or related columns/keys with the help of pt-online-schema-change and the power of its plugins.

Before we proceed, here is a useful blog post written by Peter Zaitsev on Hijacking Innodb Foreign Keys.

If you are trying to get rid of an unused foreign key (FK) constraint and related columns on versions older than MySQL 5.6, or on tables that cannot be altered with ALTER TABLE ... ALGORITHM=INPLACE because of limitations mentioned here (specifically, tables with 5.5 TIMESTAMP formats), you can use pt-online-schema-change.

With pt-online-schema-change, DROP FOREIGN KEY constraint_name requires specifying _constraint_name rather than the real constraint_name. This is due to a limitation in MySQL: pt-online-schema-change adds a leading underscore to foreign key constraint names when creating the new table. Here is a simple example of one such case:

CREATE TABLE `test3` (
  `Id` int(11) NOT NULL DEFAULT '0',
  `Firstname` varchar(32) DEFAULT NULL,
  `City` varchar(32) DEFAULT NULL,

To drop the constraint, we are supposed to add an underscore prior to the constraint name, FKID:

[root@siddhant ~]# pt-online-schema-change --user=root --execute --set-vars=foreign_key_checks=0  --alter-foreign-keys-method=rebuild_constraints --alter="DROP FOREIGN KEY _FKID" D=apps02,t=test3 --socket=/tmp/mysql-master5520.sock
Operation, tries, wait:
analyze_table, 10, 1
copy_rows, 10, 0.25
……...Altering `apps02`.`test3`...
Creating new table...
Created new table apps02._test3_new OK.
Altering new table….... …….
2017-02-11T12:45:12 Dropped old table `apps02`.`_test3_old` OK.
2017-02-11T12:45:12 Dropping triggers...
2017-02-11T12:45:12 Dropped triggers OK.
Successfully altered `apps02`.`test3`.

Below is a case where this method fails. If, for some reason, the FK constraint name already starts with an underscore, adding another underscore in front of it fails with an error when dropping it:

Error altering new table `apps02`.`_test3_new`: DBD::mysql::db do failed: Error on rename of './apps02/_test3_new' to './apps02/#sql2-697-19' (errno: 152) [for Statement "ALTER TABLE `apps02`.`_test3_new` DROP FOREIGN KEY ___FKID"] at /usr/bin/pt-online-schema-change line 9069.

In such cases, we have to use the --plugin option along with a file that defines the pt_online_schema_change_plugin class and an after_alter_new_table hook to drop the FK constraint. For example, a table with an FK constraint whose name starts with an underscore is:

  `Id` int(11) NOT NULL DEFAULT '0',
  `Firstname` varchar(32) DEFAULT NULL,
  `City` varchar(32) DEFAULT NULL,
  CONSTRAINT `___fkId` FOREIGN KEY (`Id`) REFERENCES `test2` (`Id`)

Here we have a table with the foreign key ___fkId, which uses three underscores. Our plugin for dropping the constraint should be as follows:

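[root@siddhant ~]# cat ptosc_plugin_drop_fk.pl
package pt_online_schema_change_plugin;
use strict;
# Constructor: keeps the arguments passed by pt-online-schema-change (including cxn).
sub new {
   my ($class, %args) = @_;
   my $self = { %args };
   return bless $self, $class;
}
# Hook that runs after the ALTER has been applied to the new table.
sub after_alter_new_table {
   my ($self, %args) = @_;
   my $new_tbl = $args{new_tbl};
   my $dbh     = $self->{cxn}->dbh;
   # One underscore fewer than the original constraint name (___fkId).
   my $sth = $dbh->prepare("ALTER TABLE $new_tbl->{name} DROP FOREIGN KEY __fkId");
   $sth->execute();
}
# A Perl module file must end with a true value.
1;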

NOTE: The DROP FOREIGN KEY statement in the plugin has one underscore less than the original foreign key constraint name (__fkId rather than ___fkId). Also, the alter statement is a no-op alter (i.e., --alter="ENGINE=INNODB").


Here is the pt-online-schema-change execution example with the plugin.

[root@siddhant ~]#  pt-online-schema-change --user=root --execute  --set-vars=foreign_key_checks=0  --alter-foreign-keys-method=rebuild_constraints --alter="ENGINE=INNODB" --plugin=/root/ptosc_plugin_drop_fk.pl  D=apps01,t=test --socket=/tmp/mysql-master5520.sock
Created plugin from /root/ptosc_plugin_drop_fk.pl.
Operation, tries, wait:
  analyze_table, 10, 1
  copy_rows, 10, 0.25
  create_triggers, 10, 1
  drop_triggers, 10, 1
  swap_tables, 10, 1
  update_foreign_keys, 10, 1
Altering `apps01`.`test`...
Creating new table...
Created new table apps01._test_new OK.
Altering new table...
Altered `apps01`.`_test_new` OK.
2017-02-11T11:26:14 Creating triggers...
2017-02-11T11:26:14 Created triggers OK.
2017-02-11T11:26:14 Copied rows OK.
2017-02-11T11:26:14 Swapping tables...
2017-02-11T11:26:14 Swapped original and new tables OK.
2017-02-11T11:26:14 Dropping old table...
2017-02-11T11:26:14 Dropped old table `apps01`.`_test_old` OK.
2017-02-11T11:26:14 Dropping triggers...
2017-02-11T11:26:14 Dropped triggers OK.
Successfully altered `apps01`.`test`.
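The underscore rule used in the two cases above can be summarized in a few lines of Python. This is only a sketch of the behavior as inferred from this post (FKID is dropped as _FKID, while ___fkId is dropped as __fkId); the function name is hypothetical and is not part of pt-online-schema-change, so verify the behavior against your version of the tool.

```python
def fk_name_in_new_table(original_name: str) -> str:
    """Return the FK constraint name to use in DROP FOREIGN KEY.

    Assumption (inferred from the examples in this post): a name without a
    leading underscore gains one in the new table, while a name that already
    starts with an underscore loses one.
    """
    if original_name.startswith("_"):
        return original_name[1:]
    return "_" + original_name


if __name__ == "__main__":
    for name in ("FKID", "___fkId"):
        print(name, "->", fk_name_in_new_table(name))
```

For a name such as FKID the result can go straight into --alter; for a name that already starts with an underscore, the post uses the plugin hook instead.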


Percona Blog Poll Results: What Programming Languages Are You Using for Backend Development?


In this blog we’ll look at the results from Percona’s blog poll on what programming languages you’re using for backend development.

Late last year we started a poll on what backend programming languages are being used by the open source community. The three components of the backend – server, application, and database – are what makes a website or application work. Below are the results of Percona’s poll on backend programming languages in use by the community:

Note: There is a poll embedded within this post, please visit the site to participate in this post’s poll.

One of the best-known and earliest web service stacks is the LAMP stack, which stands for Linux, Apache, MySQL and PHP/Perl/Python. We can see that this early model is still popular when it comes to the backend.

PHP remains a very common choice for a backend programming language, with Python moving up the list as well. Perl seems to be fading in popularity, despite being used a lot in the MySQL world.

Java is also showing signs of strength, demonstrating the strides MySQL is making in enterprise applications. We can also see that JavaScript is increasingly used not only as a front-end programming language, but also as a back-end language with Node.js.

Finally, Go is a language to look out for. Go is an open source programming language created by Google. It first appeared in 2009, and is already more popular than Perl or Ruby according to this poll.

Thanks to the community for participating in our poll. You can take our latest poll on which database engine you are using to store time series data here.


Vote Percona in LinuxQuestions.org Members Choice Awards 2016

Percona is calling on you! Vote for Percona for Database of the Year in LinuxQuestions.org Members Choice Awards 2016. Help Percona get recognized as one of the best database options for data performance. Percona provides free, fully compatible, enhanced, open source drop-in replacement database software with superior performance, scalability and instrumentation.

LinuxQuestions.org, or LQ for short, is a community-driven, self-help website for Linux users. Each year, LinuxQuestions.org holds a competition to recognize the year’s best-in-breed technologies. The online Linux community determines the winners of each category!

You can vote now for your favorite database of 2016 (Percona, of course!). This is your chance to be heard!

Voting ends on February 7, 2017. You must be a registered member of LinuxQuestions.org with at least one post on their forums to vote.


Don’t Let a Leap Second Leap on Your Database!


This blog discusses how to prepare your database for the new leap second coming in the new year.

At the end of this year, on December 31, 2016, a new leap second gets added. Many of us remember the huge problems this caused back in 2012. Some of our customers asked how they should prepare for this year’s event to avoid any unexpected problems.

It’s a little late, but I thought discussing the issue might still be useful.

The first thing is to make sure your systems avoid the issue with abnormally high CPU usage. This was a problem in 2012 due to a Linux kernel bug. After the leap second was added, CPU utilization skyrocketed on many systems, taking down many popular sites. This issue was addressed back in 2012, and similar global problems did not occur in 2015 thanks to those fixes. So it is important to make sure you have an up-to-date Linux kernel version.

It’s worth knowing that in the case of any unpredicted system misbehavior from the leap second problem, the quick remedy for the high CPU usage was restarting services or, in the worst case, rebooting servers.

(Please do not reboot the server without being absolutely sure that your serious problems started exactly when the leap second was added.)

The following are examples of bug records:

The second thing is to add proper support for the upcoming event. Leap seconds are announced only some time before they take effect, since it isn’t known for sure in advance when the next one will occur.

Therefore, you should upgrade your OS tzdata package to prepare your system for the upcoming leap second. This document shows how to check if your OS is already “leap second aware”:

zdump -v right/America/Los_Angeles | grep Sat.Dec.31.*2016

A non-updated system returns an empty output. On an updated OS, you should receive something like this:

right/America/Los_Angeles  Sat Dec 31 23:59:60 2016 UTC = Sat Dec 31 15:59:60 2016 PST isdst=0 gmtoff=-28800
right/America/Los_Angeles  Sun Jan  1 00:00:00 2017 UTC = Sat Dec 31 16:00:00 2016 PST isdst=0 gmtoff=-28800
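If you need to run this check across many hosts, the rule above (empty output means not updated, a 23:59:60 entry means updated) can be scripted. The following Python sketch works on the text that the zdump command prints; the helper name is hypothetical, and how you collect the output from each host is left to you.

```python
def is_leap_second_aware(zdump_output: str) -> bool:
    """Return True if the zdump output lists the 23:59:60 UTC leap second.

    An empty string (what a non-updated system returns after the grep)
    yields False.
    """
    return any(
        "Dec 31 23:59:60 2016 UTC" in line
        for line in zdump_output.splitlines()
    )


if __name__ == "__main__":
    # Sample taken from the updated-OS output shown above.
    sample = (
        "right/America/Los_Angeles  Sat Dec 31 23:59:60 2016 UTC = "
        "Sat Dec 31 15:59:60 2016 PST isdst=0 gmtoff=-28800\n"
    )
    print(is_leap_second_aware(sample))
    print(is_leap_second_aware(""))
```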

If your systems use the NTP service though, the above is not necessary (as stated in https://access.redhat.com/solutions/2441291). Still, you should make sure that the NTP services you use are also up-to-date.

With regard to leap second support in MySQL, there is nothing to do, regardless of the version. MySQL doesn’t allow a value of 60 in the seconds part of the timestamp datatype, so you should expect rows with 59 instead of 60 seconds when the additional second is added, as described here: https://dev.mysql.com/doc/refman/5.7/en/time-zone-leap-seconds.html

Similarly, MongoDB expects no serious problems either.

Let’s “smear” the second

Many big Internet properties, however, introduced a technique to adapt to the leap second change more gracefully and smoothly, called Leap Smear or Slew. Instead of introducing the additional leap second immediately, the clock slows down a bit, allowing it to gradually get in sync with the new time. This way there is no issue with extra abnormal second notation, etc.

This solution is used by Google, Amazon, Microsoft, and others. You can find a comprehensive document about Google’s use here: https://developers.google.com/time/smear

You can easily introduce this technique with the ntpd -x or chronyd slew options, which are nicely explained in this document: https://developers.redhat.com/blog/2015/06/01/five-different-ways-handle-leap-seconds-ntp/
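To make the smear idea concrete, here is a small Python sketch of a linear smear: the extra second is absorbed evenly over a window instead of being inserted all at once. The 24-hour window and the function names are assumptions made for illustration; each provider and each ntpd or chronyd configuration chooses its own window and slew rate.

```python
LEAP_SECOND = 1.0  # seconds to absorb


def smeared_offset(elapsed: float, window: float = 86_400.0) -> float:
    """Seconds of the leap second already absorbed `elapsed` seconds into the window.

    Assumes a linear smear over `window` seconds (24 hours by default, which
    is an illustrative choice rather than a universal setting).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    fraction = min(max(elapsed / window, 0.0), 1.0)
    return LEAP_SECOND * fraction


def slowdown_ppm(window: float = 86_400.0) -> float:
    """Clock slowdown during the smear, in parts per million."""
    return LEAP_SECOND / window * 1e6


if __name__ == "__main__":
    print(smeared_offset(43_200.0))   # halfway through the window
    print(round(slowdown_ppm(), 2))   # slowdown applied to the clock
```

With a one-day window the clock runs only about 11.6 parts per million slow, which is why applications never see a seconds value of 60.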


Make sure your kernel is up-to-date and your NTP service is properly configured, and consider using the Slew/Smear technique to make the change easier. After the kernel patches in 2012, no major problems happened in 2015. We expect none this year either (especially if you take time to properly prepare).


Percona Blog Poll: What Programming Languages are You Using for Backend Development?


Take Percona’s blog poll on what programming languages you’re using for backend development.

While customers and users focus on and interact with applications and websites, these are really just the tip of the iceberg for the whole end-to-end system that allows applications to run. The backend is what makes a website or application work. The backend has three parts to it: server, application, and database. A backend operation can be a web application communicating with the server to make a change in a database stored on a server. Technologies like PHP, Ruby, Python, and others are the ones backend programmers use to make this communication work smoothly, allowing the customer to purchase his or her ticket with ease.

Backend programmers might not get a lot of credit, but they are the ones who design, maintain and repair the machinery that powers a system.

Please take a few seconds and answer the following poll on backend programming languages. Which are you using? Help the community learn what languages help solve critical database issues. Please select from one to six languages as they apply to your environment.

If you’re using other languages, or have specific issues, feel free to comment below. We’ll post a follow-up blog with the results!

Note: There is a poll embedded within this post, please visit the site to participate in this post’s poll.

Row Store and Column Store Databases


In this blog post, we’ll discuss the differences between row store and column store databases.

Clients often ask us if they should or could be using columnar databases. For some applications, a columnar database is a great choice; for others, you should stick with the tried and true row-based option.

At a basic level, row stores are great for transaction processing. Column stores are great for highly analytical query models. Row stores have the ability to write data very quickly, whereas a column store is awesome at aggregating large volumes of data for a subset of columns.

One of the benefits of a columnar database is its crazy fast query speeds. In some cases, queries that took minutes or hours are completed in seconds. This makes columnar databases a good choice in a query-heavy environment. But you must make sure that the queries you run are really suited to a columnar database.

Data Storage

Let’s think about a basic database, like a stockbroker’s transaction records. In a row store, each client would have a record with their basic information – name, address, phone number, etc. – in a single table. It’s likely that each record would have a unique identifier. In our case, it would probably be an account identifier.

There is another table that stores stock transactions. Again, each transaction is uniquely identified by something like a transaction ID. Each transaction is associated with one account, but each account is associated with multiple transactions. This provides us with a one-to-many relationship, and is a classic example of a transactional database.

We store all these tables on a disk and, when we run a query, the system might access lots of data before it determines what information is relevant to the specific query. If we want to know a handful of account and transaction fields for a given time period, the system needs to access all of the information for the two tables, including fields that may not be relevant to the query. It then performs a join to relate the two tables’ data, and then it can return the information. This can be inefficient at scale, and this is just one example of a query that would probably run faster on a columnar database.

With a columnar database, each field from each table is stored in its own file or set of files. In our example database, all of the data for one field is stored in one file, all of the data for another field is stored in another file, and so on. This provides some efficiencies when running queries against wide tables, since it is unlikely that a query needs to return all of the fields in a single table. In the query example above, we’d only need to access the files that contained data from the requested fields. You can ignore all other fields that exist in the table. This ability to minimize I/O is one of the key reasons columnar databases can perform much faster.

Normalization Versus Denormalization

Additionally, many columnar databases prefer a denormalized data structure. In the example above, we have two separate tables: one for account information and one for transaction information. In many columnar databases, a single table could represent this information. With this denormalized design, when a query like the one presented is run, no joins would need to be processed in the columnar database, so the query will likely run much faster.

The reason for normalizing data is that it allows data to be written to the database in a highly efficient manner. In our row store example, we need to record just the relevant transaction details whenever an existing customer makes a transaction. The account information does not need to be written along with the transaction data. Instead, we reference the account identifier to gain access to all of the fields in the accounts table.

The place where a columnar database really shines is when we want to run a query that would, for example, determine the average price for a specific stock over a range of time. In the case of the columnar database, we only need a few fields (those holding the stock, the price, and the transaction date) in order to complete the query. With a row store, we would gather additional data that was not needed for the query but was still part of the table structure.

Normalization of data also makes updates to some information much more efficient in a row store. If you change an account holder’s address, you simply update the one record in the accounts table. The updated information is available to all transactions completed by that account owner. In the columnar database, since we might store the account information with the transactions of that user, many records might need updating in order to update the available address information.
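The update cost described here can be illustrated with a short Python sketch that counts how many records an address change touches in each design. The table contents, field names, and helper functions are hypothetical and exist only to show the one-versus-many effect.

```python
# Normalized design: account details live in one place.
accounts = {
    10: {"name": "Ann", "address": "1 Old Street"},
    11: {"name": "Bob", "address": "9 Elm Road"},
}
transactions = [
    {"txn_id": 1, "account_id": 10},
    {"txn_id": 2, "account_id": 10},
    {"txn_id": 3, "account_id": 11},
    {"txn_id": 4, "account_id": 10},
]

# Denormalized design: account details are copied into every transaction row.
denormalized = [{**t, **accounts[t["account_id"]]} for t in transactions]


def update_address_normalized(accts, account_id, new_address):
    """Updates the single account record; returns the number of records written."""
    accts[account_id]["address"] = new_address
    return 1


def update_address_denormalized(rows, account_id, new_address):
    """Updates every row that carries a copy of the address; returns the count."""
    touched = 0
    for row in rows:
        if row["account_id"] == account_id:
            row["address"] = new_address
            touched += 1
    return touched


if __name__ == "__main__":
    print(update_address_normalized(accounts, 10, "2 New Street"))
    print(update_address_denormalized(denormalized, 10, "2 New Street"))
```

The normalized update writes one record no matter how many transactions the account has, while the denormalized update grows with the account’s transaction history.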


So, which one is right for you? As with so many things, it depends. You can still perform data analysis with a row-based database, but the queries may run slower than they would on a column store. You can record transactions in a column-based model, but the writes may take longer to complete. In an ideal world, you would have both options available to you, and this is what many companies are doing.

In most cases, the initial write is to a row-based system. We know them, we love them, we’ve worked with them forever. They’re kind of like that odd relative who has some real quirks. We’ve learned the best ways to deal with them.

Then, we write the data (or the relevant parts of the data) to a column-based database to allow for fast analytic queries.

Both databases incur write transactions, and both also likely incur read transactions. Because a column-based database has each column’s data in a separate file, it is less than ideal for a “SELECT * FROM…” query, since the request must access numerous files to process the request. Similarly, any query that selects a single record or a small subset of records will probably perform better in a row store. The column store is awesome for performing aggregation over large volumes of data, or when you have queries that only need a few fields from a wide table.

It can be tough to decide between the two if you only have one database. But it is more the norm that companies support multiple database platforms for multiple uses. Also, your needs might change over time. The sports car you had when you were single is less than optimal for your current family of five. But, if you could, wouldn’t you want both the sports car and the minivan? This is why we often see both database models in use within a single company.


AWS goes after Oracle with new PostgreSQL support in Aurora

There were a lot of jokes and comments at Oracle’s expense today at the AWS re:Invent conference, but perhaps the boldest statement came when AWS announced it was adding PostgreSQL support to the AWS Aurora database, making it easier to move an Oracle database to the AWS cloud. Even as Oracle makes its own move to the cloud, it is still held up as the prime example of the…
