Dec
21
2017
--

Three P’s of a Successful Black Friday: Percona, Pepper Media Holding, and PMM


As we close out the holiday season, let’s look at some data that tells us how to guarantee a successful Black Friday (from a database perspective).

There are certain peak times of the year where companies worldwide hold their breath in the hope that their databases do not become overloaded or unresponsive. A large percentage of yearly profits are achieved in a matter of hours during peak events. It is critical that the database environment remains online and responsive. According to a recent survey, users will not wait more than 2.5 seconds for a site to load before navigating elsewhere. Percona has partnered with many clients over the years to ensure success during these critical events. Our goal is always to provide our clients with the most responsive, stable open-source database environments in order to meet their business needs.

First Stop: Germany

In this blog post, we are going to take a closer look at what happened during Black Friday for a high-demand, high-traffic, business-critical application. Pepper Media Holding runs global deals sites where users post and vote on top deals on products in real time. To give you a better idea of what the user sees, below is a screenshot from mydealz.de, Pepper Media Holding's Germany branch.

As you can imagine, Black Friday results in a huge spike in traffic and user contribution. In order to ensure success during these crucial times, Pepper Media Holding utilizes Percona’s fully managed service offering. Percona’s Managed Services team has become an extension of Pepper Media Holding’s team by helping plan, prepare, and implement MySQL best-practices across their entire database environment.

Pepper Media Holding and Percona thought it would be interesting to reflect on Black Friday 2017 and how we worked together to flourish under huge spikes in query volume and user connections.

Below is a graph of MySQL query volume for Germany servers supporting the mydealz.de front-end. This graph is taken from Percona’s Managed Service Team’s installation of Percona Monitoring and Management (PMM), which they use to monitor Pepper Media’s environment.

As expected, MySQL query volume peaked shortly before and during midnight local time. It spiked again early in the morning as users woke up, and the traffic waned throughout the day. The most interesting data point is the spike from 5 AM to 9 AM, which saw an 800% increase from the post-midnight dip. The sustained two-day traffic surge was on average a 200% increase compared to the normal day-to-day query traffic hitting the database.

For more on how mydealz.de fared from a front-end and user perspective, visit Pepper Media Holding's newsroom, where Pepper Media has published a breakdown of website traffic statistics from Black Friday.

Next Stop: United Kingdom

Another popular Pepper Media Holding branch is in the United Kingdom – better known as HotUKDeals. HotUKDeals hosts user-aggregated and voted-on deals for UK users. This is the busiest Pepper Media Holding database environment on average. Below is a screenshot of the user interface.

The graphs below are from our Managed Service Team's Percona Monitoring and Management installation and are representative of the UK servers supporting HotUKDeals website traffic.

The first graph we are looking at is MySQL Replication Delay. As you can see, the initial midnight wave of Black Friday deals caused negligible replica delay. The Percona Monitoring and Management MySQL Replication Delay graph is based on seconds_behind_master, which is an integer value only. This means the delay was somewhere between 0 and 1 most of the time. Only once did it go between 1 and 2 over the entire course of Black Friday traffic.
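
For readers who want to spot-check the same metric by hand, it comes from SHOW SLAVE STATUS on the replica. A minimal sketch from the shell (the host and credentials are placeholders, not Pepper Media's setup):

# Seconds_Behind_Master is the integer value PMM samples for its
# MySQL Replication Delay graph; check it directly on a replica.
mysql -h replica1.example.com -u monitor -p -e "SHOW SLAVE STATUS\G" | grep Seconds_Behind_Master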

The below graphs highlight the MySQL Traffic seen on the UK servers during the Black Friday traffic spike. One interesting note with this graph is the gradual lead-up to the midnight Black Friday spike. It looks like Black Friday is overstepping its boundaries into Gray Thursday. The traffic spikes here mimic the ones we saw in Germany. There’s an initial spike at midnight on Black Friday and then another spike as shoppers are waking up for their day. The UK servers saw a 361% spike in traffic the morning of Black Friday.

MySQL connections also saw an expected and significant spike during this time. Neglecting to consider the max_connections system parameter during an event rush might result in "ERROR 1040 (00000): Too many connections." However, our CEO, Peter Zaitsev, cautions against absent-mindedly setting this parameter at an unreachable level just to avoid this error. In a blog post, he explained best practices for this scenario.
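
As a quick pre-event sanity check, you can compare the connection high-water mark against the configured ceiling. A minimal sketch (credentials are placeholders):

# Max_used_connections is the high-water mark since the last restart;
# compare it with the configured max_connections ceiling.
mysql -u admin -p -e "SHOW GLOBAL STATUS LIKE 'Max_used_connections'; SHOW GLOBAL VARIABLES LIKE 'max_connections';"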

The MySQL query graph below shows a 400% spike in MySQL queries during the peak Black Friday morning traffic rush. The average number of queries hitting the database over this two-day period was significantly higher than normal: approximately 183%.

Conclusion

Percona reported no emergencies during the Black Friday period for its Managed Service customers – including Pepper Media Holding. We saw similarly high traffic spikes among our customers during this 2017 Black Friday season. I hope that this run-down of a few PMM graphs taken during Pepper Media Holding’s Black Friday traffic period was informative and interesting. Special thanks to Pepper Media Holding for working with us to create this blog post.

Note: Check out our Pepper Media case study on how Percona helps them manage their database environment.

If you would like to further explore the graphs and statistics that Percona Monitoring and Management has to offer, we have a live demo available at https://pmmdemo.percona.com. To discuss how Percona Managed Services can help your database thrive during event-based traffic spikes (and all year round), please call us at +1-888-316-9775 (USA), +44 203 608 6727 (Europe), or have us contact you.

Jun
29
2017
--

How Does Percona Software Compare: the Value of Percona Software and Services


In this blog post, I'll discuss my experience as a Solutions Engineer in explaining the value of Percona software and services to customers.

The inspiration for this blog post was a recent conversation with a prospective client. They were exploring open source software solutions and professional services for the first time. This is a fairly common and recurring conversation with individuals and enterprises exploring open source for the first time. They generally find Percona through recommendations from colleagues, or organic Google searches. One of the most common questions Percona gets when engaging at this level is, “How does Percona’s software compare with << Insert Closed-source Enterprise DB Here >>?”

We’re glad you asked. Percona’s position on this question is to explain the value of Percona in the open source space. Percona does not push a particular flavor of open source software. We recommend an appropriate solution given each of our client’s unique business and application requirements. A client coming from this space is likely to have a lengthy list of compliance, performance and scalability requirements. We love these conversations. Our focus is always on providing our customers with the best services, support, and open-source technology solutions possible.

If there were a silver bullet for every database, Percona would not have been as successful over the past 10 (almost 11!) years. We know that no single option (MongoDB, MariaDB, MySQL Enterprise, MySQL Community, Percona Server for MySQL, and so on) is the right answer 100% of the time. Our competitors in this space will most likely disagree with this assessment, because they have an incentive to sell their own software. The quality of all of these offerings is extremely high, and our customers use each of them to great success. Using the wrong database software for a particular use case is like using a butter knife to cut a steak. You can probably make it work, but it will be a frustrating and unrewarding experience.

Usually, there are more efficient alternatives.

There is a better question to ask than software vs. software, which quickly turns into an apples vs. oranges conversation. It's: "Why should we partner with Percona as our business adopts more open-source software?"

Percona is invested in ensuring you are making the right decisions based on your needs. Choosing the wrong database solution or architecture can be catastrophic. A recent Ponemon study (https://planetaklimata.com.ua/instr/Liebert_Hiross/Cost_of_Data_Center_Outages_2016_Eng.pdf) estimates the average cost of downtime can be up to $8,800 per minute. Not only is downtime costly, but re-architecting your environment due to an errant choice in database solutions can be costly as well. Uber experienced these pains when using Postgres as their database solution and wrote an interesting blog post about their decision to switch back to MySQL (https://eng.uber.com/mysql-migration/). Postgres has, in turn, responded to Uber recently (http://thebuild.com/presentations/uber-perconalive-2017.pdf). The point of bringing up this high-profile switch is not to take sides, but to highlight the breadth of knowledge necessary to navigate the open source database environment. Given its complexity, the necessity of a trusted business partner with unbiased recommendations should be apparent.

This is where Percona's experience and expertise in the open source database space shine. Percona's only investors are its customers, and our only customers are those using our services. Because of this, our core values are intertwined with customer success and satisfaction. This strategy trickles down through support, to professional services, and all the way to the engineering team, where efforts are focused on developing software and features that provide the most value to customers as a whole.

Choosing the right business partner in the open source space matters more than choosing a particular software provider. Percona has hands-on, practical experience with a wide array of open-source technologies in the MySQL, MariaDB, and MongoDB ecosystems. We have tracked the efficiency gains we help customers achieve in their databases: a 30% improvement on average, with some cases seeing a 10x boost in performance (https://www.percona.com/blog/2016/07/20/value-database-support/). Open source software has proven itself time and again to be the choice of billion-dollar industries. Again, there is no single universal choice among companies; success stories run the gamut of open source solutions.

To summarize, there's a reason we redirect the software vs. software question. It's not because we don't like talking about how great our software is (it is great, and holds its own with other open source offerings). It's because the question does not highlight Percona's value. Percona doesn't profit from its software. Supporting the open-source ecosystem is our goal. We are heavily invested in providing our customers with the right open source software solution for their situation, regardless of whether it is ours or not. Our experience and expertise in this space make us the ideal partner for businesses adopting more open source technology in place of enterprise-licensed models.

Dec
02
2014
--

Tips from the trenches for over-extended MySQL DBAs

This post is a follow-up to my November 19 webinar, "Tips from the Trenches: A Guide to Preventing Downtime for the Over-Extended DBA," during which I described some of the most common causes of avoidable downtime. The session was aimed at the "over-stretched DBA": the MySQL DBA who is short on time, or an engineer from another discipline without deep knowledge of MySQL internals. The over-stretched DBA may be prone to fundamental mistakes that cause downtime through poor response times, operations that block important data, or administrative mishaps stemming from a lack of best-practice monitoring and alerting. (You can download my slides and view the recorded webinar here.)

Monitor the things
One of the keys to keeping a system up and running is ensuring that your finger is on the pulse of the environment. Here on the Percona Managed Services team, we leverage Percona Monitoring Plugins (open source plugins for Nagios, Cacti and Zabbix) to ensure we have visibility into our clients' operations. Having a handle on basics such as disk space, memory usage and MySQL operational metrics ensures that we avoid trivial downtime that would affect the client's uptime or, worse, their bottom line.

Road Blocks
One of the most common reasons that an application is unable to serve data to its end user is that access to a table is being blocked due to another ongoing operation. This can be blamed on a variety of sources: backups, schema changes, poor configuration and long running transactions can all lend themselves to costly blocking. Understanding the impact of actions on a MySQL server can be the difference between a happy end user and a frustrated one.

During the webinar I referenced some resources and techniques that can help the over-extended DBA avoid downtime. Here are some highlights.

Monitoring and Alerting
It's important that you have some indication that something is reaching its capacity. It might be disk space, connections to MySQL, or the auto_increment limit on a heavily used table. There is quite a landscape to cover, but here are a handful of helpful tools (a quick capacity-check sketch follows this list):
* Percona Monitoring Plugins
* Monyog
* New Relic
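
If none of these tools is in place yet, even ad-hoc checks are better than nothing. A minimal sketch covering two of the capacity items above (paths and credentials are placeholders):

# Disk space on the MySQL data directory
df -h /var/lib/mysql

# Tables with the highest AUTO_INCREMENT counters; compare each against
# the maximum value the column's data type can hold.
mysql -u admin -p -e "SELECT TABLE_SCHEMA, TABLE_NAME, AUTO_INCREMENT FROM information_schema.TABLES WHERE AUTO_INCREMENT IS NOT NULL ORDER BY AUTO_INCREMENT DESC LIMIT 10;"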

Query Tuning
Poorly performing SQL can indicate that the configuration is incorrect, that there's a missing index, or that your development team needs a quick lesson on MySQL anti-patterns. Arm yourself with proof that the SQL statements are substandard using these resources, and work with the source to make things more efficient (a starting-point sketch follows this list):
* Percona Cloud Tools
* pt-query-digest, explain, indexes
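
A typical starting point is digesting the slow query log and then examining the plans of the worst offenders. A minimal sketch (the log path is illustrative):

# Summarize the slow query log, worst offenders first
pt-query-digest /var/log/mysql/mysql-slow.log > /tmp/slow-digest.txt

From there, EXPLAIN and SHOW CREATE TABLE on the top statements usually point at the missing index or the anti-pattern that needs fixing.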

High Availability
If you need to ensure that your application survives hiccups such as hardware failure or network impairment, a well deployed HA solution will give you the peace of mind that you can quickly mitigate bumps in the road.
* MHA
* Percona XtraDB Cluster, Galera
* Percona Replication Manager
* LinuxHA/Corosync/DRBD

Backups
A wise man once said, "A backup today saves you tomorrow." Covering all bases can be the difference between recovering from a catastrophic failure and job hunting. Mixing logical, physical and incremental backups, and adding some offsite copies, gives you a safety net when a small mistake like a dropped table happens or, worse, all working copies of data and backups are lost in a SAN failure. It happens, so be prepared (a rough example follows this list).
* Percona XtraBackup
* mydumper
* mysqldump
* mysqlbinlog (5.6)
* mylvmbackup
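
As a rough illustration of mixing physical and logical backups, the sketch below pairs two of the tools above (users, passwords, paths and the schedule are placeholders, not a recommended policy):

# Physical (binary) backup with Percona XtraBackup
innobackupex --user=backup --password=secret /backups/xtrabackup/

# Logical backup with mysqldump; --single-transaction keeps InnoDB reads consistent
mysqldump --user=backup --password=secret --single-transaction --routines --triggers --all-databases | gzip > /backups/logical/all-$(date +%F).sql.gz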

We had some great questions from the attendees and regrettably were unable to answer them all, so here are some of them with my response.

Q: I use MySQL on Amazon RDS. Aren't most of the operations automated, or do these tips still apply?
A: It's not completely automated. There are still challenges to address and configuration opportunities, but understanding the limitations of RDS is key. For example, the location and size of the tmpdir is something you are unable to customise on RDS. You would typically review this config in a production environment if your workload required it. Any costly queries that perform operations requiring tmp area to sort (think OLAP) might not be a good fit on RDS due to this limitation. Getting to know the limitations of hosted or DBaaS services is time well spent, and saves you from having to explain what keeps taking the application down during peak hours.

Q: What other parts of Percona Toolkit do you recommend for MySQL operations?
A: Percona Toolkit is a well-evolved suite of tools that all MySQL DBAs should familiarize themselves with. In particular I will fit many tools into my weekly workflow:

Operations

  • pt-online-schema-change
  • pt-table-checksum
  • pt-table-sync

Troubleshooting

  • pt-stalk
  • pt-pmp
  • pt-config-diff

Knowledge Gathering

  • pt-summary
  • pt-mysql-summary
  • pt-duplicate-key-checker

The key with Percona Toolkit is that many common tasks and problems that could otherwise have you reinventing the wheel are covered by mature, production-ready tools. As with any tool, you should always read the label (or, in this case, the documentation) so you're aware of what the tools can do, the risks involved, and the features you can make use of.

Q: HA – are there any solutions that you would stay away from?
A: Choosing any particular HA solution is going to be an R&D exercise in itself. You will need to understand the tradeoffs and configuration options, and compare products: some might have a higher TCO or lack functionality. Once the chosen solution is implemented, it's important that the engineers understand the technology well enough to troubleshoot it and to use its functionality when a failover needs to be instigated. I like HA solutions that are fast to fail over to, and some entail starting MySQL from cold.

Q: You mentioned having tested backups. How do you perform this?
A: Percona's method uses a dedicated host with access to the backup files. Then, with a combination of mysqlsandbox and pt-table-checksum, we can determine whether we trust the files we capture for disaster recovery. Many people underestimate the importance of this task.
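
The exact steps depend on the backup type, but a stripped-down verification of a physical backup might look like the sketch below (paths, port and table name are placeholders; this is a generic illustration, not the full Percona procedure):

# Apply the redo log so the copied datadir is consistent
innobackupex --apply-log /backups/2014-11-19_full/

# Start a throwaway instance on the restored datadir
mysqld --datadir=/backups/2014-11-19_full --port=3307 --socket=/tmp/restore.sock &

# Spot-check that the restored data is readable and matches expectations
mysql -S /tmp/restore.sock -e "CHECKSUM TABLE mydb.orders;"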

Q: Percona Cloud Tools – how much does it cost?
A: Right now it's a free service. Visit cloud.percona.com for more information, but in a nutshell Percona Cloud Tools is a hosted service providing access to query performance insights for all MySQL users.

Q: Is there API access to Percona Cloud Tools for application integration?
A: There is currently not a public API available. It is on the roadmap, though. We’d be interested to hear more about your use case so please sign up for the service and try it out. After signing in, all pages include a Feedback link to share your thoughts and ideas such as how you’d like to use a public API.

Q: Can you use MHA with Percona XtraDB Cluster?
A: MHA is not something that can be used with Percona XtraDB Cluster (PXC). It’s common to partner PXC with HAProxy for making sure your writes are going to the appropriate node.

Q: Can MHA make automatic failover? If MHA has automatic failover, what do you recommend? Configure it for automatic failover?
A: MHA can perform automatic failover. Personally, I prefer managed failover. When working with automated failover it's important that failback is manual, to avoid "flapping." "Split-brain" is another ailment you don't want to suffer from, and auto-failover removes human judgment from the decision to relocate operations from a failed node onto a standby. If you do opt for automatic failover, it is advisable to test all potential failure scenarios and to employ a STONITH method to ensure the unresponsive node really is not serving read/write traffic.

Q: What is the best way to detect database blocking from DML statements? Is there a tool that will show blocking after the fact so you don’t have to catch it real-time?
A: Once again, Percona has a tool for this: pt-deadlock-logger can detect and log deadlocks. Detecting locking can be achieved using "SHOW ENGINE INNODB STATUS" or the information_schema.innodb_locks table. Some engineering might be required to record this continuously, but the building blocks are there.
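
For the real-time case on MySQL 5.5/5.6 with InnoDB, the information_schema lock tables can be joined to show who is blocking whom at the moment you look. A hedged sketch (credentials are placeholders):

# Show current blockers and waiters via the InnoDB information_schema tables
mysql -u admin -p -e "
SELECT r.trx_mysql_thread_id AS waiting_thread,
       r.trx_query           AS waiting_query,
       b.trx_mysql_thread_id AS blocking_thread,
       b.trx_query           AS blocking_query
FROM information_schema.innodb_lock_waits w
JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id
JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id;"

Logging this output on a schedule gives you an after-the-fact record of blocking without having to catch it live.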

Q: Since you mentioned tinkering with ELK I was wondering if you had any tips on good Kibana dashboards to build to monitor MySQL databases/clusters?
A: ELK is something that I’m looking to publish some information on soon so watch this space!

Thanks again everyone for the great questions! And as a reminder, you can download my slides and view the recorded webinar here.


Nov
18
2014
--

Avoiding MySQL ALTER table downtime

MySQL table alterations can interrupt production traffic, causing a bad customer experience or, in the worst cases, loss of revenue. Not all DBAs, developers, and sysadmins know MySQL well enough to avoid this pitfall. DBAs usually encounter these kinds of production interruptions when working with upgrade scripts that touch both application and database, or when an inexperienced admin or developer performs a schema change without knowing how MySQL operates internally.

Truths
* A direct MySQL ALTER TABLE locks the table for the duration of the change (pre-5.6)
* Online DDL in MySQL 5.6 is not always online and may incur locks
* Even with Percona Toolkit's pt-online-schema-change there are several workloads that can experience blocking

Here on the Percona MySQL Managed Services team we encourage our clients to work with us when planning and performing schema migrations, so that we use the best method available in their given circumstances. By avoiding blocking when performing DDL on large tables, we ensure that business can continue as usual whilst we strive to improve response times or add application functionality. The bottom line is that a business relying on access to its data cannot afford to be down during core trading hours.

Many of the installations we manage are still below MySQL 5.6, which requires us to seek workarounds to minimize the amount of disruption a migration can cause. This may entail slave promotion or changing the schema with an 'online schema change' tool. MySQL 5.6 looks to address this issue by reducing the number of scenarios where a table is rebuilt and locked, but it doesn't yet cover all eventualities; for example, changing the data type of a column still requires a full table rebuild. The topic of 5.6 Online Schema Change was discussed in great detail last year in the post, "Schema changes – what's new in MySQL 5.6?" by Przemysław Malkowski.

With new functionality arriving in MySQL 5.7, we look forward to non-blocking DDL operations such as OPTIMIZE TABLE and RENAME INDEX. (More info)

The best advice for MySQL 5.6 users is to review the online DDL matrix and familiarize yourself with the situations where it might be best to look outside of MySQL to perform schema changes. The good news is that we're on the right path to solving this natively.
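
One practical safeguard on 5.6 is to state your expectation in the ALTER itself: if the requested operation cannot be performed in place without locking, MySQL refuses immediately with an error rather than silently blocking. A hedged example (database, table and index names are illustrative):

# Fails fast with an error instead of locking if INPLACE/LOCK=NONE is not supported for this change (MySQL 5.6+)
mysql -u admin -p mydb -e "ALTER TABLE orders ADD INDEX idx_created_at (created_at), ALGORITHM=INPLACE, LOCK=NONE;"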

Truth be told, a blocking ALTER usually goes unnoticed on a 30MB table, and we tend to use a direct ALTER in that situation; on a 30GB or 300GB table, however, we have some planning to do. If there is a low-activity period that permits locking the table, it is sometimes better to execute within that window. Frequently, though, we are reacting to new SQL statements or a new performance issue, and an emergency index is required to reduce load on the master and improve response time.

To pt-osc or not to pt-osc?

As mentioned, pt-online-schema-change is a fixture in our workflow. It's usually the right way to go, but there are still occasions where pt-online-schema-change cannot be used, for example when a table already uses triggers. It's important to remind ourselves of the steps that pt-online-schema-change goes through to complete its job. Let's look at the source code to identify them:

[moore@localhost]$ egrep 'Step' pt-online-schema-change
# Step 1: Create the new table.
# Step 2: Alter the new, empty table. This should be very quick,
# Step 3: Create the triggers to capture changes on the original table and <--(metadata lock)
# Step 4: Copy rows.
# Step 5: Rename tables: orig -> old, new -> orig <--(metadata lock)
# Step 6: Update foreign key constraints if there are child tables.
# Step 7: Drop the old table.

I pick out steps 3 and 5 above to highlight a source of potential downtime due to locks, but step 6 is also an area for concern, since foreign keys can have nested actions and should be considered when planning the change to avoid related tables being implicitly rebuilt with a direct ALTER. There are several ways to approach a table with referential integrity constraints, and they are detailed in the pt-osc documentation. A good preparation step is to review the structure of your table, including its constraints, and how the ripples of the change can affect the tables around it.

Recently we were alerted to an incident after a client with a highly concurrent, highly transactional workload ran a standard pt-online-schema-change script over a large table. This appeared normal to them, but a few hours later our on-call engineer was paged because the client had hit its max_connections limit. So what was going on? When pt-online-schema-change reached step 5, it tried to acquire a metadata lock to rename the original and the shadow table. The lock wasn't granted immediately, due to open transactions, so threads began to queue behind the RENAME command. The actual effect on the client's application was downtime: no new connections could be made, and all existing threads were waiting behind the RENAME command.

Metadata locks
Metadata locks were introduced in MySQL 5.5.3 at the server level. When a transaction starts, it acquires a metadata lock (independent of storage engine) on every table it uses and releases the locks when it has finished its work. This ensures that nothing can alter a table's definition whilst a transaction that uses it is open.

With some foresight and planning, we can avoid these situations with non-default pt-osc options, namely --no-drop-new-table and --no-swap-tables. This combination leaves both the shadow table and the triggers in place so that we can instigate an atomic RENAME when load permits.

EDIT: as of percona-toolkit version 2.2 there is a new option, --tries, which in conjunction with --set-vars covers this scenario where various pt-osc operations could block waiting for a metadata lock. The default behaviour of pt-osc (--set-vars) is to set the following session variables when it connects to the server:

wait_timeout=10000
innodb_lock_wait_timeout=1
lock_wait_timeout=60

When using --tries we can granularly specify the operation, the try count, and the wait interval between tries. This combination ensures that pt-osc will kill its own waiting session in good time to avoid the thread pileup, and gives us a retry loop for acquiring the metadata lock for trigger, rename, and foreign key management:

--tries swap_tables:5:0.5,drop_triggers:5:0.5

The documentation is here: http://www.percona.com/doc/percona-toolkit/2.2/pt-online-schema-change.html#cmdoption-pt-online-schema-change--tries
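
Putting those options together, an invocation might look like the hedged sketch below (the database, table and alter statement are placeholders):

# Retry the metadata-lock-sensitive steps instead of queuing behind open transactions
pt-online-schema-change \
  --alter "ADD INDEX idx_created_at (created_at)" \
  --set-vars lock_wait_timeout=60 \
  --tries swap_tables:5:0.5,drop_triggers:5:0.5 \
  --execute D=mydb,t=orders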

This illustrates that even with a tool like pt-online-schema-change, it is important to understand the caveats of the solution you think is most suitable. To help decide which direction to take, use the flow chart below to make sure you're accounting for the main caveats of a MySQL schema change. Be sure to read up on the recommended outcome, though, as there are considerations such as disk space and IO load that are not featured on the diagram.

DDL Decision chart

Choosing the right DDL option

Ensure you know what effect ALTER TABLE will have on your platform and pick the right method to suit your uptime requirements. Sometimes that means delaying the change until a period of lighter use, or utilising a tool that avoids holding the table locked for the duration of the operation. A direct ALTER is sometimes the answer, such as when you have triggers installed on a table.

– In most cases pt-osc is exactly what we need
– In many cases pt-osc is needed, but the way in which it's used needs tweaking
– In a few cases pt-osc isn't the right tool/method, and we need to consider a native blocking ALTER or use failovers to juggle the change into place on all hosts in the replica cluster.

If you want to learn more about avoiding avoidable downtime please tune into my webinar Wednesday, November 19 at 10 a.m. PST. It’s titled “Tips from the Trenches: A Guide to Preventing Downtime for the Over-Extended DBA.” Register now! (If you miss it, don’t worry: Catch the recording and download the slides from that same registration page.)


Jul
18
2014
--

Systemtap solves phantom MySQLd SIGTERM / SIGKILL issue

The Percona Managed Services team recently faced a somewhat peculiar client issue. We'd receive pages about their MySQL service being unreachable. However, studying the logs showed nothing out of the ordinary: for the most part it appeared to be a normal shutdown, and there was nothing suspicious in anyone's command history, nor any cron task to speak of.

This is one of those obscure and peculiar (read: unique) issues that triggered an old memory; I’d seen this behavior before and I had just the tool to catch the culprit in the act.

Systemtap made diagnosing this issue possible, and I can't state strongly enough what a powerful and often under-utilized toolset systemtap really is.

cat > signals.stp << EOF
probe signal.send {
  if (sig_name == "SIGKILL" || sig_name == "SIGTERM")
    printf("[%s] %s was sent to %s (pid:%d) by %s uid:%d\n",
           ctime(gettimeofday_s()), sig_name, pid_name, sig_pid, execname(), uid())
}
EOF

sudo stap ./signals.stp > signals.log 2>signals.err

grep mysqld signals.log
[Wed Jun 11 19:03:23 2014] SIGKILL was sent to mysqld (pid:8707) by cfagent uid:0
[Fri Jun 13 21:37:27 2014] SIGKILL was sent to mysqld (pid:6583) by cfagent uid:0
[Sun Jun 15 05:05:34 2014] SIGKILL was sent to mysqld (pid:19818) by cfagent uid:0
[Wed Jul 9 07:03:47 2014] SIGKILL was sent to mysqld (pid:4802) by cfagent uid:0

Addendum: It had been so long since I had used this tooling that I could not remember the original source from which I derived the module above; some cursory searching to rectify this issue for this blog post found this original source by Eugene Teo of Red Hat made available under GPLv2.

From this we were able to show that cfagent was killing the mysqld process, presumably via a misconfigured job. This information was returned to the client, and the probe has continued to run in production for two months now at the client's request, with no issues to speak of.

This is by no means the limit of what systemtap can achieve: you can also hook into individual functions, though you may need to install the debug packages first. To find what functions are available, run for example:

sudo stap -L 'process("/usr/sbin/mysqld").function("*")' > /tmp/mysql_stapfunc
...
head /tmp/mysql_stapfunc
process("/usr/sbin/mysqld").function("TIME_from_longlong_date_packed")
process("/usr/sbin/mysqld").function("TIME_from_longlong_datetime_packed")
process("/usr/sbin/mysqld").function("TIME_from_longlong_time_packed")
process("/usr/sbin/mysqld").function("TIME_set_hhmmss")
process("/usr/sbin/mysqld").function("TIME_set_yymmdd")
process("/usr/sbin/mysqld").function("TIME_to_longlong_date_packed")
process("/usr/sbin/mysqld").function("TIME_to_longlong_datetime_packed")
process("/usr/sbin/mysqld").function("TIME_to_longlong_packed")
process("/usr/sbin/mysqld").function("TIME_to_longlong_time_packed")
process("/usr/sbin/mysqld").function("TIME_to_ulonglong")
...

The same is true of the kernel, using sudo stap -L 'kernel.function("*")' > /tmp/kernel_stapfunc, however you must be booted into a debug kernel for this to work.

Systemtap is more than a worthy tool to have at your disposal with plenty of examples available.

Finally I invite you to join me July 23 at 10 a.m. Pacific time for my webinar, “What Every DBA Needs to Know About MySQL Security.” This detailed technical webinar provides insight into best security practices for either setting up a new MySQL environment or upgrading the security of an existing one. I hope to see you there!


Dec
24
2013
--

Renaming database schema in MySQL

One of the routine tasks for a DBA is renaming database schemas, and for that purpose MySQL added a command called "RENAME DATABASE <database_name>". However, this command only made it through a few minor releases before being discontinued (it existed from MySQL 5.1.7 to 5.1.23). Here's a link to the reference manual entry for the command: http://dev.mysql.com/doc/refman/5.1/en/rename-database.html. Vadim wrote a MySQL Performance Blog post about this a few years ago in which he mentioned the dangerous nature of the command – that post was appropriately headlined, "Dangerous Command." Today we will look at the ways in which a database schema can be renamed and which of them is the quickest.

Method 1: A well-known method for renaming a database schema is to dump the schema using mysqldump, restore it into another schema, and then drop the old schema (if needed).

[root@percona ~]# mysqldump emp > emp.out
[root@percona ~]# mysql -e "CREATE DATABASE employees;"
[root@percona ~]# mysql employees < emp.out
[root@percona ~]# mysql -e "DROP DATABASE emp;"

Although the above method is easy, it is time- and space-consuming. What if the schema is more than 100GB? You can pipe the above commands together to save on space, but that will not save time.

To remedy such situations, there is another quick method to rename schemas, however, some care must be taken while doing it.

Method 2: MySQL has a very good feature for renaming tables that even works across different schemas. The rename operation is atomic, and no one else can access the table while it is being renamed. It takes a short time to complete, since changing a table's name or its schema is only a metadata change. Here is a procedural approach to doing the rename:

  • a) Create the new database schema with the desired name.
  • b) Rename the tables from old schema to new schema, using MySQL’s “RENAME TABLE” command.
  • c) Drop the old database schema.

If there are views, triggers, functions, or stored procedures in the schema, those will need to be recreated too. MySQL's "RENAME TABLE" fails if triggers exist on the tables. To remedy this, we can do the following:

1) Dump the triggers, events and stored routines to a separate file. This is done using the -E and -R flags (in addition to -t and -d, which suppress the table definitions and data so that effectively only the triggers, routines and events are dumped) on the mysqldump command. Once the triggers are dumped, we need to drop them from the schema for the RENAME TABLE command to work.

$ mysqldump <old_schema_name> -d -t -R -E > stored_routines_triggers_events.out

2) Generate a list of only "BASE" tables. These can be found with a query against the information_schema.TABLES table.

mysql> select TABLE_NAME from information_schema.tables where table_schema='<old_schema_name>' and TABLE_TYPE='BASE TABLE';

3) Dump the views to a separate file. Views can be found with a query against the same information_schema.TABLES table.

mysql> select TABLE_NAME from information_schema.tables where table_schema='<old_schema_name>' and TABLE_TYPE='VIEW';
$ mysqldump <database> <view1> <view2> … > views.out

4) Drop the triggers on the current tables in the old_schema.

mysql> DROP TRIGGER <trigger_name>;
...

5) Restore the above dump files once all the “Base” tables found in step #2 are renamed.

mysql> RENAME TABLE <old_schema>.table_name TO <new_schema>.table_name;
...
$ mysql <new_schema> < views.out
$ mysql <new_schema> < stored_routines_triggers_events.out

 

Intricacies with the above methods:

  • We may need to update the GRANTS for users so that they reference the correct schema name. These can be fixed with simple UPDATEs on the mysql.columns_priv, mysql.procs_priv, mysql.tables_priv and mysql.db tables, changing the old_schema name to new_schema, followed by "FLUSH PRIVILEGES;".

Although "method 2" seems a bit more complicated than "method 1", it is entirely scriptable. A simple bash script that carries out the above steps in the proper sequence can help you save space and time the next time you rename a database schema.

We on the Percona Remote DBA team have written a script called "rename_db" that works in the following way:

[root@percona ~]# /tmp/rename_db
rename_db <server> <database> <new_database>

To demonstrate the use of this script, we used a sample schema "emp" and created test triggers and stored routines in it. We will rename the database schema using the script, which takes less than a second to complete, as opposed to the time-consuming dump/restore method.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| emp                |
| mysql              |
| performance_schema |
| test               |
+--------------------+

[root@percona ~]# time /tmp/rename_db localhost emp emp_test
create database emp_test DEFAULT CHARACTER SET latin1
drop trigger salary_trigger
rename table emp.__emp_new to emp_test.__emp_new
rename table emp._emp_new to emp_test._emp_new
rename table emp.departments to emp_test.departments
rename table emp.dept to emp_test.dept
rename table emp.dept_emp to emp_test.dept_emp
rename table emp.dept_manager to emp_test.dept_manager
rename table emp.emp to emp_test.emp
rename table emp.employees to emp_test.employees
rename table emp.salaries_temp to emp_test.salaries_temp
rename table emp.titles to emp_test.titles
loading views
loading triggers, routines and events
Dropping database emp
real    0m0.643s
user    0m0.053s
sys     0m0.131s

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| emp_test           |
| mysql              |
| performance_schema |
| test               |
+--------------------+

As you can see in the above output the database schema “emp” was renamed to “emp_test” in less than a second.

Lastly, we are happy to share the script we used above for "method 2".

#!/bin/bash
# Copyright 2013 Percona LLC and/or its affiliates
set -e
if [ -z "$3" ]; then
    echo "rename_db <server> <database> <new_database>"
    exit 1
fi
db_exists=`mysql -h $1 -e "show databases like '$3'" -sss`
if [ -n "$db_exists" ]; then
    echo "ERROR: New database already exists $3"
    exit 1
fi
TIMESTAMP=`date +%s`
character_set=`mysql -h $1 -e "show create database $2\G" -sss | grep ^Create | awk -F'CHARACTER SET ' '{print $2}' | awk '{print $1}'`
TABLES=`mysql -h $1 -e "select TABLE_NAME from information_schema.tables where table_schema='$2' and TABLE_TYPE='BASE TABLE'" -sss`
STATUS=$?
if [ "$STATUS" != 0 ] || [ -z "$TABLES" ]; then
    echo "Error retrieving tables from $2"
    exit 1
fi
echo "create database $3 DEFAULT CHARACTER SET $character_set"
mysql -h $1 -e "create database $3 DEFAULT CHARACTER SET $character_set"
TRIGGERS=`mysql -h $1 $2 -e "show triggers\G" | grep Trigger: | awk '{print $2}'`
VIEWS=`mysql -h $1 -e "select TABLE_NAME from information_schema.tables where table_schema='$2' and TABLE_TYPE='VIEW'" -sss`
if [ -n "$VIEWS" ]; then
    mysqldump -h $1 $2 $VIEWS > /tmp/${2}_views${TIMESTAMP}.dump
fi
mysqldump -h $1 $2 -d -t -R -E > /tmp/${2}_triggers${TIMESTAMP}.dump
for TRIGGER in $TRIGGERS; do
    echo "drop trigger $TRIGGER"
    mysql -h $1 $2 -e "drop trigger $TRIGGER"
done
for TABLE in $TABLES; do
    echo "rename table $2.$TABLE to $3.$TABLE"
    mysql -h $1 $2 -e "SET FOREIGN_KEY_CHECKS=0; rename table $2.$TABLE to $3.$TABLE"
done
if [ -n "$VIEWS" ]; then
    echo "loading views"
    mysql -h $1 $3 < /tmp/${2}_views${TIMESTAMP}.dump
fi
echo "loading triggers, routines and events"
mysql -h $1 $3 < /tmp/${2}_triggers${TIMESTAMP}.dump
TABLES=`mysql -h $1 -e "select TABLE_NAME from information_schema.tables where table_schema='$2' and TABLE_TYPE='BASE TABLE'" -sss`
if [ -z "$TABLES" ]; then
    echo "Dropping database $2"
    mysql -h $1 $2 -e "drop database $2"
fi
if [ `mysql -h $1 -e "select count(*) from mysql.columns_priv where db='$2'" -sss` -gt 0 ]; then
    COLUMNS_PRIV="    UPDATE mysql.columns_priv set db='$3' WHERE db='$2';"
fi
if [ `mysql -h $1 -e "select count(*) from mysql.procs_priv where db='$2'" -sss` -gt 0 ]; then
    PROCS_PRIV="    UPDATE mysql.procs_priv set db='$3' WHERE db='$2';"
fi
if [ `mysql -h $1 -e "select count(*) from mysql.tables_priv where db='$2'" -sss` -gt 0 ]; then
    TABLES_PRIV="    UPDATE mysql.tables_priv set db='$3' WHERE db='$2';"
fi
if [ `mysql -h $1 -e "select count(*) from mysql.db where db='$2'" -sss` -gt 0 ]; then
    DB_PRIV="    UPDATE mysql.db set db='$3' WHERE db='$2';"
fi
if [ -n "$COLUMNS_PRIV" ] || [ -n "$PROCS_PRIV" ] || [ -n "$TABLES_PRIV" ] || [ -n "$DB_PRIV" ]; then
    echo "IF YOU WANT TO RENAME the GRANTS YOU NEED TO RUN ALL OUTPUT BELOW:"
    if [ -n "$COLUMNS_PRIV" ]; then echo "$COLUMNS_PRIV"; fi
    if [ -n "$PROCS_PRIV" ]; then echo "$PROCS_PRIV"; fi
    if [ -n "$TABLES_PRIV" ]; then echo "$TABLES_PRIV"; fi
    if [ -n "$DB_PRIV" ]; then echo "$DB_PRIV"; fi
    echo "    flush privileges;"
fi

 


Nov
27
2013
--

Percona XtraBackup – A workaround to the failed assertion bug

I recently conducted a test backup of my "master-slave" setup in VirtualBox while migrating from Percona Server 5.6.12 to version 5.6.13-rel61.0 with Percona XtraBackup v2.2.0 rev. 4885. However, while performing the backup on my slave, I encountered this problem:

[04] Compressing and streaming ./test/checksum.ibd
[01] Compressing and streaming ./mysql/slave_master_info.ibd
Assertion "to_read % cursor->page_size == 0" failed at fil_cur.cc:293
innobackupex: Error: The xtrabackup child process has died at /usr/bin/innobackupex line 2641.

This is related to a bug posted by my colleague George: https://bugs.launchpad.net/percona-xtrabackup/+bug/1177201. While tracing the code at fil_cur.cc line 293, I found that the failure is caused by the assertion below, which expects the modulus to be zero:

xb_a(to_read > 0 && to_read <= 0xFFFFFFFFLL);
xb_a(to_read % cursor->page_size == 0);
npages = (ulint) (to_read >> cursor->page_size_shift);

where xb_a() is defined as:

#define xb_a(expr)                                                      \
        do {                                                            \
                if (!(expr)) {                                          \
                        msg("Assertion \"%s\" failed at %s:%lu\n",      \
                            #expr, __FILE__, (ulong) __LINE__);         \
                        abort();                                        \
                }                                                       \
        } while (0);

at line 29 of common.h. The problem lies in this part:

xb_a(to_read % cursor->page_size == 0);

I modified the code to add some verbosity for the values of to_read and page_size. Here, "to_read" corresponds to the file size of your *.ibd file, and "cursor->page_size" is the page size XtraBackup determines based on the version of MySQL you're backing up. As of MySQL 5.6.4 and Percona Server 5.5, innodb_page_size was introduced and can be set to 16k, 8k or 4k. The code I added was just a simple extra call to the "msg()" function, which shows as:

msg("\n\n<<<<<< to_read value is: \"%lld\", value of page size is: \"%llu\", and modulus is: %d\n\n", (long long) to_read, cursor->page_size, to_read % cursor->page_size);

This shows as:

<<<<<< to_read value is: "98318", value of page size is: "16384", and modulus is: 14
Assertion "to_read % cursor->page_size == 0" failed at fil_cur.cc:296
innobackupex: Error: The xtrabackup child process has died at /usr/bin/innobackupex line 2641.

which matches the file size of my *.ibd file, as shown below:

-rw-rw---- 1 mysql mysql 98318 Nov  6 12:03 slave_master_info.ibd

Still, I had no clue how the extra 14 bytes were added to the tablespace or what caused it (my mistake was not backing up or copying the file for further investigation). I searched for a possibly related bug, but could only find bug #67963 posted by Jeremy Cole, which might be relevant to the root cause of those 14 bytes. Over the past few days I made a few attempts to corrupt the tablespaces of other tables in the same way, but had no luck (any comments regarding this would be very welcome!). My clues point to some random garbage of 14 bytes being inserted into the tablespace, and on that basis I couldn't point to Percona XtraBackup as the culprit. To fix the problem, I ran an "ALTER TABLE" statement:

mysql> alter table mysql.slave_master_info engine=InnoDB;
Query OK, 1 row affected (0.36 sec)
Records: 1  Duplicates: 0  Warnings: 0

Which then shows as…

-rw-rw---- 1 mysql mysql 98304 Nov  6 12:17 slave_master_info.ibd

where…

#> echo "scale=5;98304/16384"|bc
6.00000

…is divisible by 16KiB, which shows that the tablespace is now aligned to the desired (default) page size. I discussed this with our developers, in particular George and Aleks, who agreed that this could serve as a workaround when such a case is encountered. Basically, the ALTER TABLE statement here (I presume OPTIMIZE TABLE might work as well) defragments the tablespace of the affected *.ibd file, which removes those unwanted bytes. When defragmenting the tablespace, the ALTER TABLE statement creates a copy of the original slave_master_info table; the garbage is destroyed by deleting the old copy, and the new copy is renamed to the table's original name, slave_master_info. In any case, if your innodb_page_size value is set to 4KiB or 8KiB, XtraBackup will be able to identify this via your my.cnf file. You can verify this by changing it from 4KiB to 8KiB and running:

# xtrabackup_56 --help

As a supplementary pre-check to see whether your *.ibd files are aligned with your page size, you can use awk:

find . -name "*.ibd" -exec ls -alt {} \; | awk '{print $9 ": " $5 " mod of 16384 is: " $5 % 16384}'

or you can check thru INFORMATION_SCHEMA:

SELECT NAME, PAGE_SIZE, (PAGE_SIZE%16384) MOD_RES FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES;

However, there's one thing I haven't tried yet. Since XtraBackup has the ability to read from your *.cnf file, I am not sure whether the bug might also occur with prior versions where innodb_page_size was changed via the codebase, i.e. in innobase/include/univ.i. If the backup fails, it could be that an *.ibd file is not within space bounds, and by running the pre-check above you can get a list of the tablespace files that are not aligned according to your InnoDB page size.

If you find other workarounds that alleviate this failed-assertion bug, please post them in the comments below. I will investigate this issue further in the future, and also try older versions of MySQL. Thank you!


Jun
12
2013
--

Percona MySQL University @Portland next Monday!

We're less than a week away from Percona MySQL University at Portland, Oregon next Monday, June 17. This is the latest in a series of FREE one-day educational events, and we are pleased to feature 10 technical talks by members of Team Percona as well as local members of the MySQL Community:

The daylong event will be held at Portland State University’s Smith Memorial Student Union, located at 1825 SW Broadway, Suite 327/8/9 Portland, Oregon 97201. Afterward, we’ll have a networking reception at the famed Paddy’s Bar and Grill sponsored by Tag1 Consulting featuring great networking possibilities and free drinks for event attendees.

If you’re in the Portland area and work with MySQL, then this is an event you can’t afford to miss… :)   So register now!

Please also join the Portland MySQL Meetup group for more MySQL-focused events in Portland.

If you love the idea of Percona MySQL University and would like us to bring the event to your city, please let us know!


Mar
13
2013
--

MySQL Backup tools used by Percona Remote DBA for MySQL

As part of the Percona Remote DBA for MySQL service, we recognize that reliable backups are one of the most important things we can bring to the table. In my experience handling emergencies, the single worst thing that can happen is finding out you don't have backups available when some sort of data loss or catastrophic event occurs.

With our Remote DBA service we can take care of backups for you; what follows are some of the internals of our implementation.

What kind of outages can happen?

  • Someone runs UPDATE or DELETE and forgets the where clause or filters weren’t quite right
  • The application had a bug causing data to be removed or overwritten
  • A table (or entire schema) was dropped accidentally
  • Your InnoDB table was corrupt and MySQL shut down
  • Your server or RAID controller crashes and all data is lost on that server
  • A disk failed, and RAID array does not recover
  • You run into an InnoDB corruption bug that propagates via replication (not common, but it does happen)
  • You lose your entire SAN and all your DB servers were located there. Let’s hope your backups are somewhere else!
  • You lose a PSU or network switch in your datacenter and some or all of your servers go down in that location
  • Your entire datacenter loses power and the generators do not start, which happens more often than you might think

What tools do we use in Remote DBA?

We have these major components:
  1. Percona XtraBackup for MySQL for binary backups
  2. mydumper for logical backups
  3. mysqlbinlog 5.6
  4. Amazon S3
  5. monitoring for all the above

Philosophy on backups

  •  It is a good idea to schedule both logical and binary backups. They each have their use cases and add redundancy to your backups. If there is an issue with your backup, it’s likely not to affect the other tool.
  • Store your backups on more than one server.
  • In addition to local copies, store backups offsite. Look at the cost of S3 or S3+Glacier, it’s worth the peace of mind!
  • Test your backups, and if you have a test environment, load them there periodically. You can also spin up an EC2 instance to load your backups onto. In addition, rolling forward 24 hours of binlogs is a good test.
  • Store your binlogs off your primary server so you can perform point in time recovery.
  • Store your binlogs offsite for disaster recovery scenarios.
  • Run pt-table-checksum periodically (e.g. once a month) and make sure your servers' data stays consistent. Checksumming is important, as backups are typically pulled off a slave, and it's vital that the slave has the same data as the master.

How do we use these components to give our customers reliable backups?

Think about the 10 example outages listed above. Each tool has its strong points given the conditions.

Percona XtraBackup for MySQL for binary backups.

Strong Points:
  • It can restore an entire server very fast. Often the limiting factor in how quickly a backup can be restored to another server is how fast you can transfer data over your network. If you have a 1GB network and 1TB of data, it could take a while.
  • It can compress the DB on the fly
  • It can back up a server at approximately the maximum rate the server allows, given its IO system
  • It can typically execute a backup with little to no major impact on the server. For example, in xtrabackup 2.0.5+, the time taken for "FLUSH TABLES WITH READ LOCK" is normally under 1 second.
Tips/Tricks:
  • If you have a lot of non-transactional tables (i.e. MyISAM), use the --rsync option. This rsyncs a copy of all the .frm files and all the .MYD/.MYI files, then does a second rsync while under a global lock. Where you may previously have been locked for hours with many non-transactional tables, you can now be locked for under a second. Even with InnoDB only, this can greatly cut down on the lock time by syncing the .frm files.
  • Enable --slave-info when backing up from a slave so you know the corresponding position in the master's binary logs.
  • The --compress option compresses on the fly, using qpress under the hood. (An example invocation appears at the end of this section.)
When do we typically use xtrabackup restores:
  • Setting up new slaves
  • When we lose an entire server due to hardware failure, corruption, etc.
  • When the majority of data on the server was lost, e.g. there is one primary schema and that schema was dropped; basically, whenever restoring a binary backup may take less time than loading a logical backup.
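
A hedged example of the kind of invocation described above (user, password and backup path are placeholders):

# Compressed backup taken from a slave, recording the master's binlog position
innobackupex --user=backup --password=secret --slave-info --compress /backups/xtrabackup/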

Restoring your data from backup is another topic. Piecing together data after accidental data loss is one of Percona’s specialties, and there are many different techniques depending on the scenario. I will go through some of these in detail in a future blog post.

Mydumper for logical backups

Strong Points:
  • Very fast for logical backups – compared to mysqldump
  • Consistent backups between myisam and innodb tables. Global read lock only held until myisam tables are dumped.
    • We are researching how we could further improve lock times here when non-transactional tables are used
  • Almost no locking, if not using myisam tables
  • Built in compression
  • Each table is dumped to a separate file. This is very important to make restoring single tables easy. You can quickly restore a single table, instead of restoring your entire backup just to find a tiny table you need. This is actually the most common type of restore needed, so it’s important to make this operation as painless as possible.
  • Compressed mydumper backups are typically 3x-5x smaller than compressed xtrabackup backups
  • We typically upload mydumper backups to S3 rather than xtrabackup backups, given the time needed to upload/download. That said, it depends on the available bandwidth and should be factored into your restore time.

Problems:

  • You can't rely on mydumper to dump schemas. It does not handle views/triggers/procedures, etc. Run it with --no-schemas, use mysqldump for the schemas instead, and rely on mydumper for data only.
  • You will have to compile it yourself as binary packages aren’t distributed
  • Be careful with importing a dump from a server running in a different timezone. We have a fix here.
Details on how we dump schemas:
  • loop through each DB
    • write out ALTER DATABASE DEFAULT CHARACTER SET <charset> to the schema file, putting in the current charset
    • mysqldump … -d -R --skip-triggers, out to the schema file
    • create a schema-post file that has the triggers: mysqldump … -d -t
How to restore mydumper data:
  • Load the schema file
  • Run myloader --threads=x (a combined dump/restore sketch appears at the end of this section)
  • Load the schema-post file
I will get into specifics on the tips/tricks to restore data in a future blog post.
Tips/Tricks:
  • run with --kill-long-queries to avoid nasty problems with "FLUSH TABLES WITH READ LOCK"
  • --compress compresses each table file and should typically be enabled by default. The time needed to uncompress is not a limiting factor on restore time when done inline.
When do we typically use mydumper restores:
  • Restoring a single file
  • Restoring a single schema or rolling forward a single schema to a point in time
  • Restoring data while automatically replicating out to all slaves
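
Pulling the pieces above together, a hedged dump/restore sketch might look like this (database name, paths, credentials and thread counts are placeholders; verify the option names against your mydumper build, since you compile it yourself):

# Data-only dump with mydumper; schema and triggers/routines/events via mysqldump
mydumper --user=backup --password=secret --database=mydb --no-schemas --compress --kill-long-queries --threads=4 --outputdir=/backups/mydumper/
mysqldump -d -R -E --skip-triggers mydb > /backups/mydb_schema.sql
mysqldump -d -t mydb > /backups/mydb_schema_post.sql

# Restore: schema first, then data in parallel, then triggers/routines/events
mysql mydb < /backups/mydb_schema.sql
myloader --threads=4 --directory=/backups/mydumper/
mysql mydb < /backups/mydb_schema_post.sql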

mysqlbinlog 5.6

Last year Percona IT director Tamas Kozak had a great blog post that showed how mysqlbinlog in 5.6 could be used. With mysqlbinlog 5.6, you can now pull binary logs in real time to another server using "mysqlbinlog … --read-from-remote-server --raw --stop-never" (a fuller example appears after the lists below).

  • Useful to mirror the binlogs on the master to a second server.
  • Allows you to roll forward backups even after losing the master
  • Very useful for disaster recovery.
  • You can have your backups in S3 and mysqlbinlog --stop-never running on a small EC2 instance. This allows for a very low-cost disaster recovery plan that ensures you will not lose data even in the worst-case scenarios.
  • Takes very little resources to run, can run about anywhere with disk space and writes out binlog files sequentially.
Tips/Tricks (how we run this):
  • Ensure it stays running, restart it if it appears to be hanging
  • Verify the file is the same on master and slave
  • Re-transfer files that are partially transferred
  • Compress the files after successful transfer
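
A hedged example of the streaming command described above (host, user, password and the starting binlog file name are placeholders):

# Mirror the master's binlogs to local disk, starting from a known file
mysqlbinlog --read-from-remote-server --host=master1.example.com --user=repl --password=secret \
  --raw --stop-never mysql-bin.000123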

Amazon S3 for MySQL

I discuss S3 here but other cloud based storage can be used as well. S3 is just the most popular in this category and is in wide use.
Details:
  • s3cmd – we have been using the version from GitHub, mostly for multi-part upload support. This prevents us from having to split files up before uploading to S3.
  • There is a released alpha version of it here
  • You can now set bucket lifecycle properties so that data over X days old is archived to Glacier and data over Y days old is removed. This is a very convenient feature and allows you to cost-effectively store long-term backups with little additional work
Tips/Tricks:
  • Use --add-header=x-amz-server-side-encryption:AES256 to enable the server-side encryption feature, which helps with some types of compliance. We also have the capability to encrypt all files with gpg prior to upload via a separate script. (An example upload appears after this list.)
  • Set use_https = True, especially if your data is not encrypted before transfer
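
A hedged example of such an upload (the file path and bucket name are placeholders):

# Upload a backup with server-side encryption enabled
s3cmd put --add-header=x-amz-server-side-encryption:AES256 \
  /backups/mydumper/export-20130313.tar.gz s3://example-backup-bucket/mysql/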

Monitoring

  • Monitoring is the most important piece, tying all of these processes together. We employ NSCA Nagios alerts for all of the backup processes.
  • freshness_threshold should be set so that if your NSCA check hasn't reported in within a certain period, it will alert you. For example, if you back up once a day, a good threshold could be 36 hours.
  • For our mysqlbinlog processes, we have them sending NSCA alerts every 30 seconds, and we alert when nothing has been received for 15 minutes to 1 hour
  • If backups throw an error and are aborted, we send a critical alert immediately to be investigated
  • The number one cause of backup alerts is problems with "FLUSH TABLES WITH READ LOCK", namely when a SELECT is blocking the flush from completing and queuing all requests behind it. Our current solution is a guardian process that runs during a backup and kills any query that stalls the flush. We are also researching other ways to improve this in the future.

Other details on Percona Remote DBA for MySQL backup systems for future posts

  • Detailed strategies for different types of restores
  • Strategies on retention: dailies, weeklies, and long-term backups
  • Decompressing Percona XtraBackup for MySQL in parallel using all your available resources
  • Downloading from s3 in parallel
  • Parallel encryption/decryption
  • Hardlinking of backups. Given that both our mydumper and xtrabackup backups are separated by file, files that don't change can easily be hardlinked, typically saving 20-80% of space locally

