Sep
15
2022
--

Percona XtraDB Cluster on Amazon EC2 and Two Interesting Changes in PXC 8.0

This article outlines the basic configurations for setting up and deploying Percona XtraDB Cluster 8.0 (PXC) on Amazon EC2, as well as what is new in the setup compared to Percona XtraDB Cluster 5.7.

What is Percona XtraDB Cluster an ideal fit for?

Percona XtraDB Cluster is a cost-effective, high-performance clustering solution for mission-critical data. It combines all the improvements and functionality found in MySQL 8 with Percona Server for MySQL's enterprise features and Percona's upgraded Galera library.

A Percona XtraDB Cluster environment is an ideal fit for applications requiring five-nines (99.999%) uptime with high read workloads, and for industries such as finance or healthcare that require in-house or externally dedicated database resources.

How is a three-node cluster configured in an EC2 environment?

To walk through the setup procedure, I'll be using Amazon EC2 instances to build the environment; depending on your business requirements, you can use alternative infrastructure to build the cluster.

Amazon EC2 and its settings are designed to provide uptime and high availability. When designing an architecture in an EC2 environment, it is preferable to have at least one node in a different Availability Zone (AZ), so that the loss of an entire AZ does not mean the loss of the cluster and its data.

If a node is placed in a different region, we can also survive the loss of an entire region and its data. In that case the regions need appropriate network connectivity, because latency between them adds directly to synchronous replication write latency. Alternatively, an asynchronous replica in a different region is an option.

I’m not going into too much depth on Amazon EC2 to keep this blog brief and readable. 

To build the three-node Percona XtraDB Cluster 8.0 environment, we first spin up the following three nodes in EC2. I'm using Amazon Linux, but you can also use Ubuntu or any of the Percona-supported operating systems.

It is advised that a cluster’s nodes all have the same configuration. 

PXCNode1   IP Address: 172.31.18.119

PXCNode2   IP Address: 172.31.86.114

PXCNode3   IP Address: 172.31.26.152

To install the Percona repository on all three nodes, use the following command.

$ sudo yum install -y https://repo.percona.com/yum/percona-release-latest.noarch.rpm

Enable the Percona XtraDB Cluster 8.0 repository on all three nodes by running the following command.

$ sudo percona-release setup pxc-80
* Disabling all Percona Repositories
* Enabling the Percona XtraDB Cluster 8.0 repository
* Enabling the Percona Tools repository
<*> All done!
$

Using the following command, install the Percona XtraDB Cluster packages and software on all three nodes.

$ sudo yum install percona-xtradb-cluster

Before starting the nodes, update the basic variables listed below on each node.

The following default variables must be modified as part of the initial installation. Those coming from PXC 5.7 might wonder why wsrep_sst_auth is missing. The wsrep_sst_auth variable was removed in PXC 8 because it raised security concerns: the user and password were saved in the config file and easily visible to OS users.

In PXC 8, a temporary user is created when a new node joins the existing cluster. For additional details on this security enhancement, check this article.

$ vi /etc/my.cnf
..
######## wsrep ###############
wsrep_provider=/usr/lib64/galera4/libgalera_smm.so
wsrep_cluster_address=gcomm://172.31.18.119,172.31.86.114,172.31.26.152
wsrep_slave_threads=8
wsrep_log_conflicts
wsrep_node_address=172.31.18.119
wsrep_cluster_name=pxc-cluster
# If wsrep_node_name is not specified, the system hostname will be used
wsrep_node_name=pxc-cluster-node-1
wsrep_sst_method=xtrabackup-v2
..
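The same file goes on each node, with only the node-specific values changed. As a minimal sketch (assuming the cluster name and member IPs above), node 2's wsrep section would differ only in these two lines:

wsrep_node_address=172.31.86.114
wsrep_node_name=pxc-cluster-node-2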

Percona XtraDB Cluster nodes utilize the following ports by default, so we must open them and ensure that the nodes can communicate with one another.

  • 3306 is used for MySQL client connections and SST (State Snapshot Transfer) via mysqldump.

  • 4444 is used for SST via Percona XtraBackup.

  • 4567 is used for write-set replication traffic (over TCP) and multicast replication (over TCP and UDP).

  • 4568 is used for IST (Incremental State Transfer).
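On EC2, these ports are typically opened with security group rules scoped to the cluster's subnet. As a rough sketch (the security group ID and CIDR below are placeholder values, not part of this setup):

$ aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 3306 --cidr 172.31.0.0/16
$ aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 4444 --cidr 172.31.0.0/16
$ aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 4567-4568 --cidr 172.31.0.0/16
$ aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol udp --port 4567 --cidr 172.31.0.0/16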

For example, to test access between the nodes:

Node 1
#  socat - TCP-LISTEN:4444
hello

Node 2
# echo "hello" | socat - TCP:172.31.18.119:4444
#

How is the first node bootstrapped?

After configuring each PXC node, you must bootstrap the cluster starting with the first node. All of the data that you wish to replicate to additional nodes must be present on the first node. 

Run the command to bootstrap the first node.

# systemctl start mysql@bootstrap.service
#
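Before moving on, it is worth confirming that the bootstrap came up cleanly; something along these lines should do it:

# systemctl status mysql@bootstrap.service
# grep -i wsrep /var/log/mysqld.log | tail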

Since this is a brand-new install, a temporary password is generated for the 'root' MySQL user, which we can find in mysqld.log:

# grep -i "A temporary password is generated " /var/log/mysqld.log
2022-09-13T06:52:37.700818Z 6 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: PmqQiGf*#6iy
#

Reset the temporary password using the following statement.

$ mysql -uroot -p
Enter password:
mysql> SET PASSWORD = 'GdKG*12#ULmE';
Query OK, 0 rows affected (0.03 sec)

How can the cluster’s remaining nodes be joined?

Before starting node2, you must copy the SSL certificates from node1 to node2 (and to node3). PXC 8 by default encrypts all replication communication, so this is a critical step that most users miss, causing cluster startup failures.

PXCNode1# scp /var/lib/mysql/*.pem 172.31.86.114:/var/lib/mysql/
PXCNode2# chown -R mysql.mysql /var/lib/mysql/*.pem

Start the node.

PXCNode2# systemctl start mysql
#

Verify the following status variables and make sure they appear as below once the node has been added to the cluster.

PXCNode2$ mysql -uroot -p -e " show global status where variable_name IN ('wsrep_local_state','wsrep_local_state_comment','wsrep_cluster_size','wsrep_cluster_status','wsrep_connected','wsrep_ready');"
Enter password:
+---------------------------+---------+
| Variable_name             | Value   |
+---------------------------+---------+
| wsrep_cluster_size        | 2       |
| wsrep_cluster_status      | Primary |
| wsrep_connected           | ON      |
| wsrep_local_state         | 4       |
| wsrep_local_state_comment | Synced  |
| wsrep_ready               | ON      |
+---------------------------+---------+

The third node may be added to the cluster using the same procedures, and its status will then look as follows.

PXCNode3$ mysql -uroot -p -e " show global status where variable_name IN ('wsrep_local_state','wsrep_local_state_comment','wsrep_cluster_size','wsrep_cluster_status','wsrep_connected','wsrep_ready');"
Enter password:
+---------------------------+---------+
| Variable_name             | Value   |
+---------------------------+---------+
| wsrep_cluster_size        | 3       |
| wsrep_cluster_status      | Primary |
| wsrep_connected           | ON      |
| wsrep_local_state         | 4       |
| wsrep_local_state_comment | Synced  |
| wsrep_ready               | ON      |
+---------------------------+---------+

Additional supporting factors

Additionally, a proxy/failover layer such as ProxySQL is advised; it helps remove failing nodes from the active read pool and, in some cases, shift the primary.

It is advised to have backups in place; the open-source tool Percona XtraBackup takes physical copies of the dataset, which are significantly faster to restore than logical dumps. It is also strongly advised to back up binary logs using mysqlbinlog so that point-in-time recovery is possible. Backups should be encrypted, compressed, and transferred to S3 as soon as possible.
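As a rough sketch of such a backup flow (the user, paths, and bucket name below are placeholders), a full physical backup with Percona XtraBackup pushed to S3 could look like this:

$ xtrabackup --backup --user=backup_user --password='***' --target-dir=/backups/full/
$ tar -czf /backups/full.tar.gz -C /backups full
$ aws s3 cp /backups/full.tar.gz s3://backup-bucket/full/

At restore time, the copy is downloaded and run through xtrabackup --prepare before being moved into the data directory.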

To copy binlogs to an S3 bucket, the command looks like this:

aws s3 sync /backups/binlogs/ s3://backup-bucket/
upload: ../../../../backups/binlogs/binlog.000001.gz.gpg to s3://backup-bucket/binlog.000001.gz.gpg

For query analytics and time-based database performance insights, the open-source tool Percona Monitoring and Management (PMM) is highly recommended. It needs to be deployed on a separate host, for example via the Amazon Marketplace. PMM monitors operating system and MySQL metrics and provides sophisticated query analytics.
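Once a PMM server is up, registering a cluster node with the pmm2 client is typically a couple of commands; as a sketch (the server address and credentials below are placeholders):

# pmm-admin config --server-insecure-tls --server-url=https://admin:admin@pmm-server.example.com:443
# pmm-admin add mysql --username=pmm --password='***' --query-source=perfschema pxc-node-1 127.0.0.1:3306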

Jul
17
2018
--

When Should I Use Amazon Aurora and When Should I use RDS MySQL?

Now that Database-as-a-Service (DBaaS) is in high demand, there is one question regarding AWS services that cannot always be answered easily: when should I use Aurora, and when RDS MySQL?

DBaaS cloud services allow users to use databases without configuring physical hardware and infrastructure, and without installing software. I’m not sure if there is a straightforward answer, but when trying to find out which solution best fits an organization there are multiple factors that should be taken into consideration. These may be performance, high availability, operational cost, management, capacity planning, scalability, security, monitoring, etc.

There are also cases where although the workload and operational needs seem to best fit to one solution, there are other limiting factors which may be blockers (or at least need special handling).

In this blog post, I will try to provide some general rules of thumb but let’s first try to give a short description of these products.

What we should really compare is the MySQL and Aurora database engines provided by Amazon RDS.

An introduction to Amazon RDS

Amazon Relational Database Service (Amazon RDS) is a hosted database service which provides multiple database products to choose from, including Aurora, PostgreSQL, MySQL, MariaDB, Oracle, and Microsoft SQL Server. We will focus on MySQL and Aurora.

With regard to systems administration, both solutions are time-saving. You get an environment ready to deploy your application, and if there are no dedicated DBAs, RDS gives you great flexibility for operations like upgrades or backups. For both products, Amazon applies required updates and the latest patches without any downtime. You can define maintenance windows, and automated patching (if enabled) will occur within them. Data is continuously backed up to S3 in real time, with no performance impact. This eliminates the need for backup windows and other scripted procedures, complex or not. Although this sounds great, the risk of vendor lock-in and the challenges of enforced updates and client-side optimizations are still there.

So, Aurora or RDS MySQL?

Amazon Aurora is a relational, proprietary, closed-source database engine, with all that that implies.

RDS MySQL is 5.5, 5.6 and 5.7 compatible and offers the option to select among minor releases. While RDS MySQL supports multiple storage engines with varying capabilities, not all of them are optimized for crash recovery and data durability. Until recently, Aurora's compatibility was limited to MySQL 5.6, but it is now compatible with 5.7 as well.

So, in most cases, no significant application changes are required for either product. Keep in mind that certain MySQL features like the MyISAM storage engine are not available with Amazon Aurora. Migration to RDS can be performed using Percona XtraBackup.

For RDS products, shell access to the underlying operating system is disabled, and access to MySQL user accounts with the SUPER privilege isn't allowed. To configure MySQL variables or manage users, Amazon RDS provides parameter groups, APIs, and special system procedures that can be used instead. If you need to enable remote access, this article will help you do so: https://www.percona.com/blog/2018/05/08/how-to-enable-amazon-rds-remote-access/
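For example, a setting you would normally change with SET GLOBAL has to be changed through a parameter group instead. A sketch with the AWS CLI (the group name and value below are placeholders):

$ aws rds modify-db-parameter-group \
    --db-parameter-group-name my-mysql-params \
    --parameters "ParameterName=max_connections,ParameterValue=500,ApplyMethod=immediate"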

Performance considerations

Although Amazon RDS uses SSDs to achieve better I/O throughput for all its database services, Amazon claims that Aurora is able to achieve a 5x performance boost over standard MySQL and provides reliability out of the box. In general, Aurora seems to be faster, but not always.

For example, Aurora has to disable the InnoDB change buffer (this is one of the keys to its distributed storage engine), which means updates to secondary indexes must be written through immediately. Standard MySQL relies on the change buffer to defer and merge secondary index updates, so workloads with heavy writes against tables with secondary indexes pay a big performance penalty on Aurora. If your application performs a high rate of updates against tables with secondary indexes, Aurora performance may be poor. In any case, you should always keep in mind that performance depends on schema design. Before deciding to migrate, performance should be evaluated against an application-specific workload. Extensive benchmarks will be the subject of a future blog post.

Capacity Planning

Talking about underlying storage, another important thing to take into consideration is that with Aurora there is no need for capacity planning. Aurora storage will automatically grow, from the minimum of 10 GB up to 64 TiB, in 10 GB increments, with no impact on database performance. The table size limit is only constrained by the size of the Aurora cluster volume, which has a maximum of 64 tebibytes (TiB). As a result, the maximum table size for a table in an Aurora database is 64 TiB. For RDS MySQL, the maximum provisioned storage limit constrains the size of a table to a maximum size of 16 TB when using InnoDB file-per-table tablespaces.

Replication

Replication is a really powerful feature of MySQL(-like) products. With Aurora, you can provision up to fifteen replicas, compared to just five in RDS MySQL. All Aurora replicas share the same underlying volume with the primary instance, which means replication can be performed in milliseconds, as updates made by the primary instance are instantly available to all Aurora replicas. Failover is automatic with no data loss on Amazon Aurora, and the failover priority of each replica can be set.

An explanatory description of Amazon Aurora’s architecture can be found in Vadim’s post written a couple of years ago https://www.percona.com/blog/2015/11/16/amazon-aurora-looking-deeper/

The architecture used and the way replication works on the two products show a really significant difference between them. Aurora is a High Availability (HA) solution where you only need to attach a reader, and it automatically becomes Multi-AZ available. Aurora replicates data to six storage nodes across multiple AZs to withstand the loss of an entire AZ (Availability Zone) or two storage nodes without any availability impact to the client's applications.

On the other hand, RDS MySQL allows only up to five replicas, and the replication process is slower than Aurora's. Failover to a replica is a manual process and may result in loss of the most recent transactions. RDS MySQL on its own is not an HA solution, so you have to mark the master as Multi-AZ and attach the endpoints.

Monitoring

Both products can be monitored with a variety of monitoring tools. You can enable automated monitoring and you can define the log types to publish to Amazon CloudWatch. Percona Monitoring and Management (PMM) can also be used to gather metrics.

Be aware that for Aurora there is a limitation on T2 instances: enabling the Performance Schema can cause the host to run out of memory.

Costs

Aurora instances will cost you ~20% more than RDS MySQL. If you create Aurora read replicas, the cost of your Aurora cluster will double. Aurora is only available on certain RDS instance sizes. Instance pricing details can be found here and here.

Storage pricing may be a bit tricky. Keep in mind that pricing for Aurora differs from that for RDS MySQL. For RDS MySQL, you have to select the type and size of the EBS volume, and you have to be sure that the provisioned EBS IOPS can be supported by your instance type, as EBS IOPS are restricted by the instance type's capabilities. Unless you watch for this, you may end up with EBS IOPS that cannot really be used by your instance.

For Aurora, IOPS are limited only by the instance type. This means that if you want to increase IOPS performance on Aurora, you should upgrade the instance type. In any case, Amazon will charge you based on the dataset size and the requests per second.

That said, although for Aurora you pay only for the data you actually use (in 10 GB increments), if you want high performance you have to select the correct instance. For Aurora, regardless of the instance type, you get billed $0.10 per GB-month and $0.20 per 1 million requests, so if you need high performance the cost may end up even higher than RDS MySQL. For RDS MySQL, storage costs are based on the EBS type and size.
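As a rough worked example using those list prices: a 500 GB Aurora volume serving 100 million requests a month adds about 500 × $0.10 = $50 for storage plus 100 × $0.20 = $20 for I/O, roughly $70 per month on top of the instance cost. The same arithmetic at billions of requests per month is where Aurora can end up costing more than an equivalently sized RDS MySQL deployment.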

Percona provides support for RDS services, and you might be interested in these case studies:

When a more fully customized solution is required, most of our customers usually prefer the use of AWS EC2 instances supported by our managed services offering.

TL;DR
  • If you are looking for a native HA solution then you should use Aurora
  • For a read-intensive workload within an HA environment, Aurora is a perfect match. Combined with ProxySQL for RDS, you can get great flexibility
  • Aurora performance is great, but not as high as expected for write-intensive workloads when secondary indexes exist. In any case, you should benchmark both RDS MySQL and Aurora before deciding to migrate. Performance depends heavily on workload and schema design
  • By choosing Amazon Aurora you are fully dependent on Amazon for bug fixes or upgrades
  • If you need to use MySQL plugins you should use RDS MySQL
  • Aurora only supports InnoDB. If you need other engines, e.g., MyISAM, RDS MySQL is the only option
  • With RDS MySQL you can use specific MySQL releases
  • Aurora is not included in the AWS free tier and costs a bit more than RDS MySQL. If you only need a managed solution to deploy services in a less expensive way and out-of-the-box high availability is not your main concern, RDS MySQL is what you need
  • If for any reason Performance Schema must be ON, you should not enable this on Amazon Aurora MySQL T2 instances. With the Performance Schema enabled, the T2 instance may run out of memory
  • For both products, you should carefully examine the known issues and limitations listed here https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/MySQL.KnownIssuesAndLimitations.html and here https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora.AuroraMySQL.html


Dec
01
2017
--

This Week in Data with Colin Charles 17: AWS Re:Invent, a New Book on MySQL Cluster and Another Call Out for Percona Live 2018

Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers, and provides musings on the open source database community.

The CFP for Percona Live Santa Clara 2018 closes December 22, 2017: please consider submitting as soon as possible. We want to make an early announcement of talks, so we’ll definitely do a first pass even before the CFP date closes. Keep in mind the expanded view of what we are after: it’s more than just MySQL and MongoDB. And don’t forget that with one day less, there will be intense competition to fit all the content in.

A new book on MySQL Cluster is out: Pro MySQL NDB Cluster by Jesper Wisborg Krogh and Mikiya Okuno. At 690 pages, it is a weighty tome, and something I fully plan on reading, considering I haven’t played with NDBCLUSTER for quite some time.

Did you know that since MySQL 5.7.17, connection control plugins are included? They help DBAs introduce an increasing delay in server response to clients after a certain number of consecutive failed connection attempts. Read more in the connection control plugins documentation.
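As a quick sketch of what enabling them looks like (the thresholds below are arbitrary example values):

mysql> INSTALL PLUGIN CONNECTION_CONTROL SONAME 'connection_control.so';
mysql> INSTALL PLUGIN CONNECTION_CONTROL_FAILED_LOGIN_ATTEMPTS SONAME 'connection_control.so';
mysql> SET GLOBAL connection_control_failed_connections_threshold = 3;
mysql> SET GLOBAL connection_control_min_connection_delay = 1000;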

While there are a tonne of announcements coming out of the Amazon re:Invent 2017 event, I highly recommend also reading Some data of interest as AWS reinvent 2017 ramps up by James Governor. Telemetry data from Sumo Logic's 1,500 largest customers suggests that NoSQL database usage has overtaken relational database workloads! Read The State of Modern Applications in the Cloud. Page 8 tells us that MySQL is the #1 database on AWS (I don't see MariaDB Server being mentioned, which is odd; did they lump it in together?), and MySQL, Redis & MongoDB account for 40% of database adoption on AWS. In other news, Andy Jassy also mentions that less than 1.5 months after hitting 40,000 database migrations, they've gone past 45,000 over the Thanksgiving holiday last week. Have you started using AWS Database Migration Service?

Releases

Link List

Upcoming appearances

  • ACMUG 2017 gathering – Beijing, China, December 9-10 2017 – it was very exciting being there in 2016; I can only imagine it's going to be bigger and better in 2017, since it is now two days long!

Feedback

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

Aug
23
2017
--

Migrating Data from an Encrypted Amazon MySQL RDS Instance to an Encrypted Amazon Aurora Instance

In this blog post, we'll discuss migrating data from an encrypted Amazon MySQL RDS instance to an encrypted Amazon Aurora instance.

One of my customers wanted to migrate from an encrypted MySQL RDS instance to an encrypted Aurora instance. They have a pretty large database, so using mysqldump or a similar logical tool was not suitable for them. They also wanted to set up replication between the old MySQL RDS and the new Aurora instance.

Spoiler: this is possible without any logical dump.

At first, I checked Amazon's documentation on encryption and found nothing about this type of migration. What's more, if the documentation is to be trusted, it looks like they don't support replication or migration between encrypted MySQL RDS and encrypted Aurora: all instructions are for either "MySQL RDS to MySQL RDS" or "Aurora to Aurora" setups. For example, the documentation says:

You can create Read Replicas of both encrypted and unencrypted DB clusters. The Read Replica must be encrypted if the source DB cluster is encrypted.

When I tried to create an Aurora read replica of my encrypted MySQL RDS instance, however, the “Enable Encryption” select control was grayed out and I could not change “No” to “Yes”.

I had to find a workaround.

Another idea was creating an encrypted MySQL RDS replica and migrating it to Aurora. While creating an encrypted MySQL replica is certainly possible (actually, all replicas of encrypted instances must be encrypted), it was not possible to migrate it to any other instance using the standard "Migrate Latest Snapshot" option.

However, the documentation specified that Aurora and MySQL RDS use the same AWS KMS key. As a result, both kinds of encryption should be compatible (if not practically the same). Amazon also has the “AWS Database Migration Service“, which has this promising section in its FAQ:

Q. Can I replicate data from encrypted data sources?

Yes, AWS Database Migration Service can read and write from and to encrypted databases. AWS Database Migration Service connects to your database endpoints on the SQL interface layer. If you use the Transparent Data Encryption features of Oracle or SQL Server, AWS Database Migration Service will be able to extract decrypted data from such sources and replicate it to the target. The same applies to storage-level encryption. As long as AWS Database Migration Service has the correct credentials to the database source, it will be able to connect to the source and propagate data (in decrypted form) to the target. We recommend using encryption-at-rest on the target to maintain the confidentiality of your information. If you use application-level encryption, the data will be transmitted through AWS Database Migration Service as is, in encrypted format, and then inserted into the target database.

I decided to give it a try. And it worked!

The next step was to make this newly migrated, encrypted Aurora instance a read replica of the original MySQL RDS instance. This is made easier thanks to a great how-to on migration by Adrian Cantrill. As suggested, you only need to find the master's binary log file and current position, and supply them to the stored procedure mysql.rds_set_external_master. Then start replication using the stored procedure mysql.rds_start_replication.
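As a sketch (the endpoint, credentials, and binlog coordinates below are placeholders), the calls on the Aurora side look roughly like this:

mysql> CALL mysql.rds_set_external_master ('source-rds-instance.example.us-east-1.rds.amazonaws.com', 3306, 'repl_user', 'repl_password', 'mysql-bin-changelog.000123', 456, 0);
mysql> CALL mysql.rds_start_replication;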

Conclusion: While AWS Database Migration Service has limitations for both source and target databases, this solution allows you to migrate encrypted instances easily and securely.


Apr
06
2015
--

The cost of not properly managing your databases

Every day hundreds of millions of dollars are wasted by allowing improperly tuned or misconfigured systems, misunderstood infrastructure, and inefficient IT operations to live and thrive in data centers around the globe. There are both direct and indirect costs associated with allowing these unhealthy systems to continue to exist. Let’s look at some.

The setup:

Let us start with a small example: a single master-slave database setup with a database size of, let's say, 500GB. Traffic is steady, and let's say this translates into 500 IOPS on the master. You have chosen to host this on Amazon's AWS. A common way of ensuring backups occur in AWS is to set up EBS snapshots of the slave. In terms of usage, let us assume your CPU is about 50% used and you have about 20GB of hot data that needs to stay in memory for the database.

If we look at what this would take to support in EC2 you are looking roughly at this:

  • 2 c3.4xlarge servers (16 vCPU, 30GB of memory)
  • Master-Slave set
  • with 1TB of Provisioned IOPS SSD, over 2 volumes
  • with 500 IOPS on the master, 125 IOPS on the slave
  • estimated 7TB of storage for snapshots

This calculator gives us an estimated cost of $3,144.28 per month, or roughly $38,000 a year in hosting fees. Note that you can choose other service tiers, or reserved or spot instances, to get different pricing.

Regular, steady growth:

Now let's assume your database is growing, along with its traffic, at about 5% per month (these are rough numbers, I know). After a year your database server would be out of steam: 86% CPU, 34GB of hot data (so relying more heavily on disk), and just about 850GB of storage space consumed. Moving up to the next tier of servers, and with additional IOPS, your spend jumps to around $4,771.32 per month ($57,000 per year).

When tuning and auditing an environment like the above, we have been able to give some customers a 50% or greater improvement in performance, and we often see a 20-25% reduction in space. Let's be conservative and say we can get a 25% boost in your performance, reduce your 5% monthly growth to 4%, and shrink your database by 10%. Based on that, you can stave off upgrading your servers an additional 9 months, saving you almost $15,000 in that first year alone. Over 4 years, this customer would end up saving an estimated $75,000 in total AWS spend just from smaller data and performance enhancements.

[Chart: DB cost savings, part 1]

In this case, performance enhancements are not the only place to save costs. Moving from EBS snapshots to regular MySQL backups with Percona XtraBackup, keeping one copy on disk and sending those backups to S3, the cost of the environment drops to $2,043.87 per month (from $3,144.28). This means a simple switch of backup methodology can net you about $1,100 a month, or $13,200 a year, off your hosting bill.


These numbers are based on only two servers; the savings over dozens or even hundreds of servers can be huge. Take a look at this 10-server environment:

[Chart: estimated savings for a 10-server environment]

Often we are not only reducing the resources needed, but through tuning we can also reduce the number of servers needed to run your application. We had a recent client who saw a 90% reduction in their read-heavy workload and was actually able to turn off servers that had been used to serve their application. Here is what their savings over the next couple of years would look like:

[Chart: DB cost savings with a 90% workload reduction]

Here we helped cut this customers direct costs by two thirds.

Handling spikes:

The one thing to keep in mind is that this assumes linear growth in application and database usage, which would mean you can predict when you will need hardware. If your user base is growing steadily and feature additions are controlled, that is possible; however, in most environments you will not see linear growth. You will probably see something like this:

 

[Chart: database workload over time, showing usage spikes]

Understanding this pattern and these spikes is vital to keeping your costs down. See that giant spike up to 2,500? The first reaction for many is to upgrade their hardware, then tune. Inevitably, any tuning benefit is offset by the already-sunk cost of a hardware upgrade that, after the tuning, they may not have needed. Getting in front of that spike and preventing it could have saved tens or even hundreds of thousands of dollars.

Spikes kill performance and cost real dollars, and those spikes may not be easy to find. A few years ago I was working with a Fortune 500 client who had one of these spikes. They had been running perfectly fine with steady but controllable growth for 7 or 8 months; then, in the 9th month, things went very wrong very quickly.

A critical component of their company was certifying professionals through a testing application. During their peak season, these certifications had stopped completely, delaying certifications for thousands of employees and clients for two weeks. I was flown out to help control the bleeding and, hopefully, fix the problem.

The number of users using the application was the same and the number of page views on the web was steady, but the number of queries to the database skyrocketed. None of the queries had hit the 1-second threshold that would flag them as problem queries. It turned out that one query taking 250ms to execute was causing this company to grind to a halt. That one query ended up being executed 25,000 times per page when certain conditions were met, and those conditions were not met until the 9th month after the application was re-released.
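To put that in perspective, 25,000 executions at 250ms each works out to roughly 6,250 seconds of cumulative query time triggered by a single page view, which is why a query that looked harmless on its own was enough to bring the application down.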

This query lay like a trojan horse waiting to destroy this company’s ability to deliver to its customers.

Two lessons can be learned from this. The first is that even a seemingly well-tuned system may not be. Second, small things matter. In this case, fixing the code was the correct solution; however, proper indexing of the tables dropped the query time from 250ms down to 50ms. This was enough of a relief to allow the certification process to start up again until the code could be fixed. Even a query with a seemingly small impact should still be optimized.

Another source of these performance spikes is a company's application release cycle. Applications are very much living entities in today's world: they grow, expand, and change on a regular basis. To stay ahead of any problems, you need to have a process and resources in place to proactively monitor and tune. Every release of new code should go through a rigorous performance review to catch Trojan-horse problems that may cause outages and extra costs down the road.

Indirect and hard to calculate costs:

All of this discussion so far has been around direct hosting costs. There is also a cost to your reputation and your ability to deliver services that meet a customer’s expectations. Customers who come to your site or are using your application can leave in droves due to poor performance. We have seen several customers who lost 50% or more of their user base due to performance problems with the application.

Lost revenue and profits are often much more difficult to quantify, and vary greatly from company to company. This cost, however, is very real. Silicon Valley is littered with the remnants of companies that did not plan to address scale or simply missed important problems in their IT infrastructure. Unfortunately, I have worked directly with numerous companies that learned this lesson the hard way. These hidden costs can kill a company quicker than any competitor or market shift.

 One of the biggest hidden costs companies needlessly pay is the cost of downtime.

The cost of downtime:

I was reading a Gartner study that estimated the cost per minute of downtime at $5,600; other studies, like this one, have pegged the cost per minute of downtime at $7,900.

Any way you slice it, being down for even a minute costs you money. If we are conservative in our estimates, the cost of an hour of downtime can easily top $100,000. It's amazing how many well-established companies don't have a solid plan for dealing with downtime.

Let’s look at some common disaster recovery policies:

Restore from backup:

How quickly can your DBAs get alerted to an outage, log in to look at it, and finally make the call whether or not to restore? I submit that most people are going to take a few minutes to get an alert (let's say 2). They will then take a few minutes to get to the computer and into the system (let's say 5 minutes). Then they will take at least 10 minutes to try to figure out what's going on. Fast-forward: a minimum of 17 minutes has gone by with nothing to show for it.

Restoring the backup itself could take a few minutes or several hours. Let’s just say 40 minutes total. If we use that $7,900 number, you could have just lost $316,000. That’s a huge amount that could have easily been avoided. Maybe you know that you’re not losing $7,900 a minute, maybe it’s only $1,000. That’s still $40,000!

Manual failover to a slave:

The time to get alerted, react, and take action does not change in this equation. The original 17 minutes (minimum) to react and start fixing just potentially cost you $134,300.

Automated failover:

Not all automated failovers are created equal. Some solutions can take several minutes or even hours to restore proper service (for example, the warm-up time of passive, cold slaves). Just because you think you are protected does not mean you are. Having the right automated solution can mean minimizing your downtime risk to $10K or less; having the wrong one can be worse than having none at all.

It’s important to understand the cost of downtime and pick the proper solution to mitigate it.

The cost of being wrong is high:

These are just a few of the costs that companies can incur by having an improper database and infrastructure setup. Mitigating these costs requires a solid process, a high level of expertise, and the right resources in place.

