Feb 18, 2015

Percona submits 7 talks for Vancouver OpenStack Summit (voting ends Feb. 23)

Percona has submitted seven talks for the next OpenStack Summit in Vancouver this May. And as with all OpenStack Summit events, the community decides the content. Voting ends February 23, and if you aren’t already an OpenStack Foundation member (required to vote), you can join now for free here.

Percona’s Vancouver OpenStack Summit proposals

Photo: Vancouver OpenStack Summit, from https://www.flickr.com/photos/73230975@N03

Percona’s proposals include collaborations with top contributors across a variety of OpenStack services including Trove and Swift. You can vote for our talks by clicking the titles below that interest you.

MySQL and OpenStack Deep Dive
Speakers: Peter Boros, Jay Pipes (Mirantis)

Deep Dive into MySQL Replication with OpenStack Trove, and Kilo
Speakers: George Lorch, Amrith Kumar (Tesora)

MySQL on Ceph Storage: Exploring Design, Challenges and Benefits
Speakers: Yves Trudeau, Kyle Bader (Red Hat)

Core Services MySQL Database Backup & Recovery to Swift
Speakers: Kenny Gryp, Chris Nelson (SwiftStack)

Smart MySQL Log Management with Swift
Speakers: Matt Griffin, Chris Nelson (SwiftStack)

Discovering Better Database Architectures For Core Services In OpenStack
Speakers: Kenny Gryp, Matt Griffin

Upgrading your OpenStack Core Services On The Database Side: Nova, Neutron, Cinder…
Speakers: Kenny Gryp, Matt Griffin

Other interesting proposals

Here are a few proposals from other organizations that look particularly interesting. Please consider them as well.

Exploration into OpenStack Trove, customer use cases, and the future of Trove for the community
Speakers: Amrith Kumar (Tesora), Brad Topol (IBM), Mariam John (IBM)

The Entrepreneur’s Challenge: The Realities of Starting an OpenStack Company
Speakers: Simon Anderson (Dreamhost, Inktank), Ken Rugg (Tesora), Josh McKenty (Pivotal, Piston Cloud), Jesse Proudman (BlueBox), Joe Arnold (SwiftStack)

Making a Case for Your OpenStack Deployment: How Vendor’s Can Help
Speaker: Ryan Floyd (Storm Ventures)

Real World Experiences with Upgrading OpenStack at Time Warner Cable
Speakers: Clayton O’Neill (Time Warner Cable), Matt Fischer (Time Warner Cable)

Percona & OpenStack

According to the most recent OpenStack User Survey in November 2014, Percona’s database software is a popular choice for OpenStack operators needing high availability.

OpenStack November 2014 User Survey – Database Options

OpenStack User Survey results from November 2014 show Percona XtraDB Cluster as the top Galera-based choice for production clouds.

Percona XtraDB Cluster, the top Galera-based MySQL cluster solution for production OpenStack deployments, incorporates the latest version of MySQL 5.6, Percona Server 5.6, Percona XtraBackup, and Galera. This combination delivers top performance, high availability, and critical security coverage with the latest features and fixes. Additionally, Percona Server is a popular guest database option with unique features designed for cloud operators offering DBaaS.

In addition to sharing our open source software with the OpenStack community, Percona is sharing our expertise in services like Trove, projects like the HA Guide update, extensive benchmark testing activities, and upcoming events like OpenStack Live 2015. The inaugural OpenStack Live Conference, April 13-14 in Santa Clara, California, will be a user-focused event. The program will cover database-related topics such as Trove as well as other OpenStack services, and will feature multiple 3-hour hands-on tutorials.

Percona is a proud supporter of OpenStack and we hope to see you in both Santa Clara in April and Vancouver in May. And in the meantime, don’t forget to vote!


Jan 30, 2015

OpenStack Live 2015: FAQs on the who, what, where, when, why & how

This April 13-14, Percona is introducing an annual conference called OpenStack Live. I’ve seen a few questions about the new event, so I decided to help clarify what this show is about and who should attend.

Unlike OpenStack Summits, which are held twice a year and dedicated primarily to developers, OpenStack Live is an opportunity for OpenStack evaluators and users of all levels to learn from experts on topics such as how to deploy, optimize, and manage OpenStack, and on the role of MySQL as a crucial technology in this free and open-source cloud computing software platform. A full day of hands-on tutorials will also focus on making OpenStack users more productive and confident in this emerging technology.

Still confused about OpenStack Live 2015? Fear not! Here are the answers to commonly asked questions.

Q: Who should attend?
A: You should attend…

  • if you are currently using OpenStack and want to improve your skills and knowledge
  • if you are evaluating or considering using it
  • if you are a solutions provider – this is your opportunity to show the world your contributions and services

Q: Percona Live has a conference committee. Does OpenStack Live have one, too?
A: Yes and it’s a completely different committee comprised of:

  • Mark Atwood, Director of Open Source Engagement at HP (Conference Chairman)
  • Rich Bowen, OpenStack Community Liaison at Red Hat
  • Jason Rouault, Senior Director OpenStack Cloud at Time Warner Cable
  • Peter Boros, Principal Architect at Percona

Q: Are the tutorials really “hands-on”?
A: Yes, and most are at least 3 hours long, so you’ll need your laptop and power cord. Here’s a look at all of the OpenStack tutorials.

Q: How meaty are the sessions?
A: Very meaty indeed! Here’s a sample:

Q: I am going to attend the Percona Live MySQL Conference and Expo. Will my pass also include OpenStack Live 2015?
A: Yes, your Percona Live pass will be honored at the OpenStack Live conference. OpenStack Live attendees will also have access to the Percona Live/OpenStack Live Exhibit hall, keynotes, receptions and FUN activities April 13 and 16, allowing them to dive deeper into MySQL topics such as high availability, security, performance optimization, and much more. However, the OpenStack Live pass does not allow access to Percona Live breakout sessions or tutorials.

Q: Where can I register?
A: You can register here and take advantage of Early Bird discounts, but those end Feb. 1 at 11:30 p.m. PST, so hurry!


Jan 26, 2015

MySQL benchmarks on eXFlash DIMMs

In this blog post, we will discuss MySQL performance on eXFlash DIMMs. Earlier we measured the IO performance of these storage devices with sysbench fileio.

Environment

The benchmarking environment was the same one we used for the sysbench fileio measurements.

CPU: 2x Intel Xeon E5-2690 (hyper threading enabled)
FusionIO driver version: 3.2.6 build 1212
Operating system: CentOS 6.5
Kernel version: 2.6.32-431.el6.x86_64

In this case, sysbench was executed on a separate client machine that had a 10G Ethernet connection to the database server; the client was not the bottleneck. The environment is described in greater detail at the end of the blog post.

Sysbench OLTP write workload


The graph shows throughput for sysbench OLTP. We will examine only the dark areas of the graph, which represent the read/write case at high concurrency.

Each table in the following sections has the following columns:

column   explanation
storage  The device that was used for the measurement.
threads  The number of sysbench client threads used in the benchmark.
ro_rw    Read-only or read-write. The whitepaper also contains detailed read-only results.
sd       The standard deviation of the metric in question.
mean     The mean of the metric in question.
95thpct  The 95th percentile of the metric in question (the maximum after discarding the highest 5 percent of samples).
max      The maximum of the metric in question.

Sysbench OLTP throughput

storage          threads  ro_rw  sd          mean        95thpct    max
eXFlash DIMM_4   128      rw     714.09605   5996.5105   7172.0725  7674.87
eXFlash DIMM_4   256      rw     470.95410   6162.4271   6673.0205  7467.99
eXFlash DIMM_8   128      rw     195.57857   7140.5038   7493.4780  7723.13
eXFlash DIMM_8   256      rw     173.51373   6498.1460   6736.1710  7490.95
fio              128      rw     588.14282   1855.4304   2280.2780  7179.95
fio              256      rw     599.88510   2187.5271   2584.1995  7467.13

Going from 4 to 8 eXFlash DIMMs mostly means more consistent throughput. The mean throughput is significantly higher with 8 DIMMs, but the 95th percentile and the maximum values are not much different (the difference in standard deviation also shows this). The reason they are not much different is that these benchmarks are CPU bound (see the CPU idle time table later in this post or the graphs in the whitepaper). The PCI-E flash drive, on the other hand, can do less than half of the throughput of the eXFlash DIMMs (the most relevant comparison is the 95th percentile value).

Sysbench OLTP response time

storage          threads  ro_rw  sd          mean        95thpct   max
eXFlash DIMM_4   128      rw     4.4187784   37.931489   44.2600   64.54
eXFlash DIMM_4   256      rw     9.6642741   90.789317   109.0450  176.45
eXFlash DIMM_8   128      rw     2.1004085   28.796017   32.1600   67.10
eXFlash DIMM_8   256      rw     5.5932572   94.060628   101.6300  121.92
fio              128      rw     51.2343587  138.052150  203.1160  766.11
fio              256      rw     72.9901355  304.851844  392.7660  862.00

The 95th percentile response times in the eXFlash DIMM cases are less than a quarter of those of the PCI-E flash device.

CPU idle percentage

storage          threads  ro_rw  sd          mean        95thpct  max
eXFlash DIMM_4   128      rw     1.62846674  3.3683857   6.2600   22.18
eXFlash DIMM_4   256      rw     1.06980095  2.2930634   3.9170   26.37
eXFlash DIMM_8   128      rw     0.42987637  0.8553543   1.2900   15.28
eXFlash DIMM_8   256      rw     1.32328435  4.4861795   6.7100   9.40
fio              128      rw     4.21156996  26.1278994  31.5020  55.49
fio              256      rw     5.49489852  19.3123639  27.6715  47.34

The percentage of idle CPU shows that the performance bottleneck in this benchmark was the CPU in the case of the eXFlash DIMMs (both with 4 and 8 DIMMs, which is why we didn’t see a substantial throughput difference between the 4 and 8 DIMM setups). For the PCI-E flash, however, the storage device itself was the bottleneck.

If you are interested in more details, download the free white paper which contains the full analysis of sysbench OLTP and linkbench benchmarks.


Dec 18, 2014

Making HAProxy 1.5 replication lag aware in MySQL

HAProxy is frequently used as a software load balancer in the MySQL world. Peter Boros, in a past post, explained how to set it up with Percona XtraDB Cluster (PXC) so that it only sends queries to available nodes. The same approach can be used in a regular master-slaves setup to spread the read load across multiple slaves. However, with MySQL replication, another factor comes into play: replication lag. In this case the approach mentioned for Percona XtraDB Cluster does not work that well, as the check we presented only returns ‘up’ or ‘down’. We would like to be able to tune the weight of a replica inside HAProxy depending on its replication lag. This is what we will do in this post using HAProxy 1.5.

Agent checks in HAProxy

HAProxy 1.5 allows us to run an agent check, which is a check that can be added to a regular health check. The benefit of agent checks is that the return value can be ‘up’ or ‘down’, but also a weight.

What is an agent? It is simply a program that can be accessed from a TCP connection on a given port. So if we want to run an agent on a MySQL server that will:

  • Mark the server as down in HAProxy if replication is not working
  • Set the weight to 100% if the replication lag is < 10s
  • Set the weight to 50% if the replication lag is >= 10s and < 60s
  • Set the weight to 5% in all other situations

We can use a script like this:

$ less agent.php
<?php
// Reconstructed header (the original snippet was truncated here): the argument
// handling, credentials and thresholds below are inferred from the rest of this post.
$port = $argv[1];            // port the agent listens on
$mysql_port = $argv[2];      // port of the local MySQL instance
$mysql = "mysql";            // path to the mysql client binary (assumption)
$user = "haproxy";
$password = "haproxy_pwd";
$query = "SHOW SLAVE STATUS";

function set_weight($lag) {
	// Replication broken: Seconds_Behind_Master is NULL
	if ($lag === "NULL" || $lag === "") {
		return "down";
	}
	if ($lag < 10) {
		return "up 100%";
	}
	if ($lag >= 10 && $lag < 60) {
		return "up 50%";
	}
	return "up 5%";
}

set_time_limit(0);
$socket = stream_socket_server("tcp://127.0.0.1:$port", $errno, $errstr);
if (!$socket) {
	echo "$errstr ($errno)\n";
} else {
	while ($conn = stream_socket_accept($socket, 9999999999999)) {
		$cmd = "$mysql -h127.0.0.1 -u$user -p$password -P$mysql_port -Ee \"$query\" | grep Seconds_Behind_Master | cut -d ':' -f2 | tr -d ' '";
		exec("$cmd", $lag);
		$weight = set_weight($lag[0]);
		unset($lag);
		fputs($conn, $weight);
		fclose($conn);
	}
	fclose($socket);
}
?>

If you want the script to be accessible from port 6789 and connect to a MySQL instance running on port 3306, run:

$ php agent.php 6789 3306

You will also need a dedicated MySQL user:

mysql> GRANT REPLICATION CLIENT ON *.* TO 'haproxy'@'127.0.0.1' IDENTIFIED BY 'haproxy_pwd';

When the agent is started, you can check that it is working properly:

# telnet 127.0.0.1 6789
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
up 100%
Connection closed by foreign host.

Assuming it is run locally on the app server, that 2 replicas are available (192.168.10.2 and 192.168.10.3) and that the application will send all reads on port 3307, you will define a frontend and a backend in your HAProxy configuration like this:

frontend read_only-front
    bind *:3307
    mode tcp
    option tcplog
    log global
    default_backend read_only-back

backend read_only-back
    mode tcp
    balance leastconn
    server slave1 192.168.10.2 weight 100 check agent-check agent-port 6789 inter 1000 rise 1 fall 1 on-marked-down shutdown-sessions
    server slave2 192.168.10.3 weight 100 check agent-check agent-port 6789 inter 1000 rise 1 fall 1 on-marked-down shutdown-sessions

Demo

Now that everything is set up, let’s see how HAProxy can dynamically change the weight of the servers depending on the replication lag.

No lag

# Slave1
$ mysql -Ee "show slave status" | grep Seconds_Behind_Master
        Seconds_Behind_Master: 0
# Slave2
$ mysql -Ee "show slave status" | grep Seconds_Behind_Master
        Seconds_Behind_Master: 0
# HAProxy
$ echo "show stat" | socat stdio /run/haproxy/admin.sock | cut -d ',' -f1,2,18,19
# pxname,svname,status,weight
read_only-front,FRONTEND,OPEN,
read_only-back,slave1,UP,100
read_only-back,slave2,UP,100
read_only-back,BACKEND,UP,200

Slave1 lagging

# Slave1
$ mysql -Ee "show slave status" | grep Seconds_Behind_Master
        Seconds_Behind_Master: 25
# Slave2
$ mysql -Ee "show slave status" | grep Seconds_Behind_Master
        Seconds_Behind_Master: 0
# echo "show stat" | socat stdio /run/haproxy/admin.sock | cut -d ',' -f1,2,18,19
# pxname,svname,status,weight
read_only-front,FRONTEND,OPEN,
read_only-back,slave1,UP,50
read_only-back,slave2,UP,100
read_only-back,BACKEND,UP,150

Slave2 down

# Slave1
$ mysql -Ee "show slave status" | grep Seconds_Behind_Master
        Seconds_Behind_Master: 0
# Slave2
$ mysql -Ee "show slave status" | grep Seconds_Behind_Master
        Seconds_Behind_Master: NULL
# echo "show stat" | socat stdio /run/haproxy/admin.sock | cut -d ',' -f1,2,18,19
# pxname,svname,status,weight
read_only-front,FRONTEND,OPEN,
read_only-back,slave1,UP,100
read_only-back,slave2,DOWN (agent),100
read_only-back,BACKEND,UP,100

Conclusion

Agent checks are a nice addition in HAProxy 1.5. The setup presented above is a bit simplistic though: for instance, if HAProxy fails to connect to the agent, it will not mark the corresponding server as down. It is therefore recommended to keep a regular health check along with the agent check.

Astute readers will also notice that in this configuration, if replication is broken on all nodes, HAProxy will stop sending reads. This may not be the best solution. Possible options are: stop the agent and mark the servers as UP using the stats socket or add the master as a backup server.

And as a final note, you can edit the code of the agent so that replication lag is measured with Percona Toolkit’s pt-heartbeat instead of Seconds_Behind_Master.
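
For example, the agent’s SHOW SLAVE STATUS command could be swapped for a query against the table pt-heartbeat keeps updated, which gives a more direct measurement of lag. A minimal sketch, assuming pt-heartbeat writes into a percona.heartbeat table (the schema and table names are configurable, so treat them as assumptions here):

-- Lag in seconds, computed from the most recent heartbeat written by the master
SELECT ROUND(UNIX_TIMESTAMP(NOW()) - UNIX_TIMESTAMP(MAX(ts))) AS lag_seconds
FROM percona.heartbeat;

The agent would then apply the same weight thresholds to this value instead of Seconds_Behind_Master.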


Dec 09, 2014

OpenStack Live 2015: Sneak peek of the April conference

On behalf of the OpenStack Live Conference Committee, I am excited to announce the sneak peek schedule for the inaugural OpenStack Live 2015 Conference! This new annual conference, running in parallel with the already established Percona Live MySQL Conference and Expo, will feature one day of tutorials followed by a full day of breakout sessions April 13-14, in Santa Clara, Calif.

Though the entire conference schedule won’t be finalized until early January, this initial list of talks is sure to spark interest! So without further ado, here is the OpenStack Live 2015 SNEAK PEEK SCHEDULE!


Deploying an OpenStack Cloud at Scale at Time Warner Cable
-Matthew Fischer, Principal Software Engineer for OpenStack DevOps at Time Warner Cable, and Clayton O’Neill, Principal Software Engineer for OpenStack Cloud at Time Warner Cable


An Introduction to Database as a Service with an Emphasis on OpenStack Using Trove
-Amrith Kumar, Founder and CTO of Tesora, Inc., and Tushar Katarki, Director of Product Management at Percona


Lightweight OpenStack Benchmarking Service with Rally and Docker
-Swapnil Kulkarni, Senior Software Engineer at Red Hat


MySQL and OpenStack Deep Dive
-Peter Boros, Principal Architect at Percona


This is just a small taste of what will be presented at the OpenStack Live 2015 conference this spring. Take advantage of this unique opportunity to hear from leading experts in the field about top cloud strategies, improving overall cloud performance, and operational best practices for managing and optimizing OpenStack and its MySQL database core.

As a special bonus, OpenStack Live attendees will also have access to the Percona Live MySQL Conference & Expo keynotes, receptions, exhibition hall, and Birds of a Feather sessions on April 13 and 14, allowing them to dive deeper into MySQL topics such as high availability, security, performance optimization, and much more.

Registration for OpenStack Live 2015 is now open… register now with Early Bird pricing! Hope to see you there!


Dec 04, 2014

MySQL and OpenStack deep dive: Dec. 10 webinar

Fact: MySQL is the most commonly used database in OpenStack deployments. Of course that includes a number of MySQL variants – standard MySQL by Oracle, MariaDB, Percona Server, MySQL Galera, Percona XtraDB Cluster, etc.

However, there are many misconceptions and myths around the pros and cons of these MySQL flavors. Join me and my friend Jay Pipes of Mirantis next Wednesday (Dec. 10) at 10 a.m. Pacific and we’ll dispel some of these myths and provide a clearer picture of the strengths and weaknesses of each of these flavors.

This free Percona webinar, titled “MySQL and OpenStack Deep Dive,” will also illuminate the pitfalls to avoid when migrating between MySQL flavors – and what architectural information to take into account when planning your OpenStack MySQL database deployments.

We’ll also discuss replication topologies and techniques, and explain how the Galera Cluster variants differ from standard MySQL replication.

Finally, in the latter part of the session, we’ll take a deep dive into MySQL database performance analysis, diving into the results of a Rally run showing a typical Nova workload. In addition, we’ll use Percona Toolkit’s famed pt-query-digest tool to determine if a synchronously replicated database cluster like the free Percona XtraDB Cluster is a good fit for certain OpenStack projects.

The webinar is free, but I encourage you to register now to reserve your spot. See you Dec. 10! In the meantime, learn more about the new annual OpenStack Live Conference and Expo, which debuts April 13-14 in the heart of Silicon Valley. If you register now you’ll save with Early Bird pricing. However, one lucky webinar attendee will win a full pass, so be sure to register for next week’s webinar for your chance to win! The winner will be announced at the end of the webinar.


Nov 11, 2014

OpenStack Live Call for Proposals closes November 16

The OpenStack Live conference in Silicon Valley (April 13-14, 2015) will focus on the essential elements of making OpenStack perform better, with emphasis on the critical role of MySQL and Trove. If you use OpenStack and have a story to share or a skill to teach, we encourage you to submit a speaking proposal for a breakout or tutorial session. The OpenStack Live call for proposals is your chance to put your ideas, case studies, best practices and technical knowledge in front of an intelligent, engaged audience of OpenStack users. If you are selected as a speaker, you will receive one complimentary full conference pass. November 16 is the last day to submit.

We are seeking submissions for both breakout and tutorial sessions on the following topics:

  • Performance Optimization of OpenStack
  • OpenStack Operations
  • OpenStack Trove
  • Replication and Backup for OpenStack
  • High Availability for OpenStack
  • OpenStack User Stories
  • Monitoring and Tools for OpenStack

All submissions will be reviewed by our highly qualified Conference Committee:

  • Mark Atwood from HP
  • Rich Bowen from Red Hat
  • Andrew Mitty from Comcast
  • Jason Rouault from Time Warner
  • Peter Boros from Percona

If you don’t plan to submit a speaking proposal, now is a great time to purchase your ticket at the low Super Saver rates. Visit the OpenStack Live 2015 conference website for full details.



Oct 29, 2014

MySQL and OpenStack deep dive talk at OpenStack Paris Summit (and more!)

I will present a benchmarking talk next week (Nov. 4) at the OpenStack Paris Summit with Jay Pipes from Mirantis. In order to be able to talk about benchmarking, we had to be able to set up and tear down OpenStack environments really quickly. For the benchmarks, we are using a deployment on AWS (ironically) where the instances aren’t actually started and the tenant network is not reachable but all the backend operations still happen.

The first performance bottleneck we hit wasn’t at the MySQL level. We used Rally to benchmark the environment and, as a first pass, started 1,000 fake instances with it.

The first bottleneck we saw was neutron-server eating up a single CPU core: by default, neutron does everything in a single process. After configuring the API workers and the RPC workers, performance became significantly better.

api_workers = 64
rpc_workers = 32

Before adding the options:

u'runner': {u'concurrency': 24, u'times': 1000, u'type': u'constant'}}
+------------------+-----------+-----------+-----------+---------------+---------------+---------+-------+
| action           | min (sec) | avg (sec) | max (sec) | 90 percentile | 95 percentile | success | count |
+------------------+-----------+-----------+-----------+---------------+---------------+---------+-------+
| nova.boot_server | 4.125     | 9.336     | 15.547    | 11.795        | 12.362        | 100.0%  | 1000  |
| total            | 4.126     | 9.336     | 15.547    | 11.795        | 12.362        | 100.0%  | 1000  |
+------------------+-----------+-----------+-----------+---------------+---------------+---------+-------+
Whole scenario time without context preparation:  391.359671831

After adding the options:

u'runner': {u'concurrency': 24, u'times': 1000, u'type': u'constant'}}
+------------------+-----------+-----------+-----------+---------------+---------------+---------+-------+
| action           | min (sec) | avg (sec) | max (sec) | 90 percentile | 95 percentile | success | count |
+------------------+-----------+-----------+-----------+---------------+---------------+---------+-------+
| nova.boot_server | 2.821     | 6.958     | 36.826    | 8.165         | 10.49         | 100.0%  | 1000  |
| total            | 2.821     | 6.958     | 36.826    | 8.165         | 10.49         | 100.0%  | 1000  |
+------------------+-----------+-----------+-----------+---------------+---------------+---------+-------+
Whole scenario time without context preparation:  292.163493156

Stop by our talk at the OpenStack Paris Summit for more details!

In addition to our talk, Percona has two additional speakers at the OpenStack Paris Summit. George Lorch, Percona software engineer, will speak with Vipul Sabhaya of the HP Cloud Platform Services team on “Percona Server Features for OpenStack and Trove Ops.” Tushar Katarki, Percona director of product management, will present a vBrownBag Tech Talk entitled “MySQL High Availability Options for OpenStack.” Percona is exhibiting at the OpenStack Paris Summit conference, as well – stop by booth E20 and say hello!

At Percona, we’re pleased to see the adoption of our open source software by the OpenStack community and we are working actively to develop more solutions for OpenStack users. We also provide Consulting assistance to organizations that are adopting OpenStack internally or are creating commercial services on top of OpenStack.

We are also pleased to introduce the first annual OpenStack Live, a conference focused on OpenStack and Trove, which is April 13 & 14, 2015 in Santa Clara, California. The call for speaking proposals is now open for submissions which will be reviewed by our OpenStack Live Conference Committee (including me!).


Sep 11, 2014

OpenStack users shed light on Percona XtraDB Cluster deadlock issues

I was fortunate to attend an Ops discussion about databases at the OpenStack Summit Atlanta this past May as one of the panelists. The discussion was about deadlock issues OpenStack operators see with Percona XtraDB Cluster (of course this is applicable to any Galera-based solution). I asked them to describe what they were seeing, and as it turned out, nova and neutron use the SELECT … FOR UPDATE SQL construct quite heavily. This is a topic I thought was worth writing about.

Write set replication in a nutshell (with oversimplification)

Any node is writable, and replication happens in write sets. A write set is practically a row-based binary log event or events plus “some additional stuff.” The “some additional stuff” is good for two things.

  • Two write sets can be compared and told if they are conflicting or not.
  • A write set can be checked against a database if it’s applicable.

Before committing on the originating node, the write set is transferred to all other nodes in the cluster. The originating node checks that the transaction is not conflicting with any of the transactions in the receive queue and checks if it’s applicable to the database. This process is called certification. After the write set is certified the transaction is committed. The remote nodes will do certification asynchronously compared to the local node. Since the certification is deterministic, they will get the same result. Also the write set on the remote nodes can be applied later because of this reason. This kind of replication is called virtually synchronous, which means that the data transfer is synchronous, but the actual apply is not.

We have a nice flowchart about this.

Since the write set is only transferred before commit, InnoDB row level locks, which are held locally, are not held on remote nodes (if these were escalated, each row lock would take a network round trip to acquire). This also means that by default if multiple nodes are used, the ability to read your own writes is not guaranteed. In that case, a certified transaction, which is already committed on the originating node can still sit in the receive queue of the node the application is reading from, waiting to be applied.
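
As an aside (not covered in the original discussion), both the size of that receive queue and the read-your-writes behavior can be observed and adjusted from SQL on a Galera-based server such as Percona XtraDB Cluster; a minimal sketch:

-- How many certified write sets are queued on this node, waiting to be applied?
SHOW GLOBAL STATUS LIKE 'wsrep_local_recv_queue';

-- Ask this session to wait for the queue to drain before reading (at the cost of latency);
-- the variable name depends on the version: wsrep_causal_reads on older releases,
-- wsrep_sync_wait on newer ones.
SET SESSION wsrep_causal_reads = ON;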

SELECT … FOR UPDATE

The SELECT … FOR UPDATE construct reads the given records in InnoDB, and locks the rows that are read from the index the query used, not only the rows that it returns. Given how write set replication works, the row locks of SELECT … FOR UPDATE are not replicated.

Putting it together

Let’s create a test table.

CREATE TABLE `t` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

And some records we can lock.

pxc1> insert into t values();
Query OK, 1 row affected (0.01 sec)
pxc1> insert into t values();
Query OK, 1 row affected (0.01 sec)
pxc1> insert into t values();
Query OK, 1 row affected (0.01 sec)
pxc1> insert into t values();
Query OK, 1 row affected (0.00 sec)
pxc1> insert into t values();
Query OK, 1 row affected (0.01 sec)

pxc1> select * from t;
+----+---------------------+
| id | ts                  |
+----+---------------------+
|  1 | 2014-06-26 21:37:01 |
|  4 | 2014-06-26 21:37:02 |
|  7 | 2014-06-26 21:37:02 |
| 10 | 2014-06-26 21:37:03 |
| 13 | 2014-06-26 21:37:03 |
+----+---------------------+
5 rows in set (0.00 sec)

On the first node, lock the record.

pxc1> start transaction;
Query OK, 0 rows affected (0.00 sec)
pxc1> select * from t where id=1 for update;
+----+---------------------+
| id | ts                  |
+----+---------------------+
|  1 | 2014-06-26 21:37:01 |
+----+---------------------+
1 row in set (0.00 sec)

On the second, update it with an autocommit transaction.

pxc2> update t set ts=now() where id=1;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0
pxc1> select * from t;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

Let’s examine what happened here. The local record lock held by the started transaction on pxc1 didn’t play any part in replication or certification (replication happens at commit time, and there was no commit there yet). Once the node received the write set from pxc2, that write set had a conflict with a transaction still in-flight locally. In this case, our transaction on pxc1 has to be rolled back. This is a type of conflict as well, but here the conflict is not caught at certification time. This is called a brute force abort. It happens when a transaction done by a slave thread conflicts with a transaction that’s in-flight on the node. In this case the first commit wins (which is the already replicated one) and the original transaction is aborted. Jay Janssen discusses multi-node writing conflicts in detail in this post.
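
A quick way to see how often such brute force aborts happen on a node is Galera’s status counter for them (an operational aside, not part of the original example):

pxc1> SHOW GLOBAL STATUS LIKE 'wsrep_local_bf_aborts';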

The same thing happens when two of the nodes are holding record locks via SELECT … FOR UPDATE. Whichever node commits first wins; the other transaction will hit the deadlock error and be rolled back. The behavior is correct.

Here is the same SELECT … FOR UPDATE transaction overlapping on the 2 nodes.

pxc1> start transaction;
Query OK, 0 rows affected (0.00 sec)
pxc2> start transaction;
Query OK, 0 rows affected (0.00 sec)

pxc1> select * from t where id=1 for update;
+----+---------------------+
| id | ts                  |
+----+---------------------+
|  1 | 2014-06-26 21:37:48 |
+----+---------------------+
1 row in set (0.00 sec)
pxc2> select * from t where id=1 for update;
+----+---------------------+
| id | ts                  |
+----+---------------------+
|  1 | 2014-06-26 21:37:48 |
+----+---------------------+
1 row in set (0.00 sec)

pxc1> update t set ts=now() where id=1;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0
pxc2> update t set ts=now() where id=1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

pxc1> commit;
Query OK, 0 rows affected (0.00 sec)
pxc2> commit;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

Where does this happen in OpenStack?

For example in OpenStack Nova (the compute project in OpenStack), tracking the quota usage uses the SELECT…FOR UPDATE construct.

# User@Host: nova[nova] @  [10.10.10.11]  Id:   147
# Schema: nova  Last_errno: 0  Killed: 0
# Query_time: 0.001712  Lock_time: 0.000000  Rows_sent: 4  Rows_examined: 4  Rows_affected: 0
# Bytes_sent: 1461  Tmp_tables: 0  Tmp_disk_tables: 0  Tmp_table_sizes: 0
# InnoDB_trx_id: C698
# QC_Hit: No  Full_scan: Yes  Full_join: No  Tmp_table: No  Tmp_table_on_disk: No
# Filesort: No  Filesort_on_disk: No  Merge_passes: 0
#   InnoDB_IO_r_ops: 0  InnoDB_IO_r_bytes: 0  InnoDB_IO_r_wait: 0.000000
#   InnoDB_rec_lock_wait: 0.000000  InnoDB_queue_wait: 0.000000
#   InnoDB_pages_distinct: 2
SET timestamp=1409074305;
SELECT quota_usages.created_at AS quota_usages_created_at, quota_usages.updated_at AS quota_usages_updated_at, quota_usages.deleted_at AS quota_usages_deleted_at, quota_usages.deleted AS quota_usages_deleted, quota_usages.id AS quota_usages_id, quota_usages.project_id AS quota_usages_project_id, quota_usages.user_id AS quota_usages_user_id, quota_usages.resource AS quota_usages_resource, quota_usages.in_use AS quota_usages_in_use, quota_usages.reserved AS quota_usages_reserved, quota_usages.until_refresh AS quota_usages_until_refresh
FROM quota_usages
WHERE quota_usages.deleted = 0 AND quota_usages.project_id = '12ce401aa7e14446a9f0c996240fd8cb' FOR UPDATE;

So where does it come from?

These constructs are generated by SQLAlchemy using with_lockmode(‘update’). Even nova’s pydoc recommends avoiding with_lockmode(‘update’) whenever possible. Galera replication is not mentioned among the reasons to avoid this construct, but knowing how many OpenStack deployments use Galera for high availability (either Percona XtraDB Cluster, MariaDB Galera Cluster, or Codership’s own mysql-wsrep), it can be a very good reason to avoid it. The solution proposed in the linked pydoc is also a good one: an INSERT INTO … ON DUPLICATE KEY UPDATE is a single atomic write that will be replicated as expected, and it will also keep correct track of quota usage.
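
To make the pattern concrete, here is a minimal sketch with a hypothetical counters table – nova’s real quota_usages schema and the statements SQLAlchemy generates are more involved, so treat the names below as illustrative only:

CREATE TABLE quota_counters (
  project_id VARCHAR(36)  NOT NULL,
  resource   VARCHAR(255) NOT NULL,
  in_use     INT          NOT NULL DEFAULT 0,
  PRIMARY KEY (project_id, resource)
) ENGINE=InnoDB;

-- One atomic statement: no SELECT ... FOR UPDATE and no row lock held across statements,
-- so it certifies and replicates like any other single write set.
INSERT INTO quota_counters (project_id, resource, in_use)
VALUES ('12ce401aa7e14446a9f0c996240fd8cb', 'instances', 1)
ON DUPLICATE KEY UPDATE in_use = in_use + 1;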

The simplest way to overcome this issue from the operator’s point of view is to use only one writer node for these types of transactions. This usually involves a configuration change at the load-balancer level. See this post for possible load-balancer configurations.


Apr 29, 2014

ScaleArc: Real-world application testing with WordPress (benchmark test)

ScaleArc recently hired Percona to perform various tests on its database traffic management product. This post is the outcome of the benchmarks carried out by me and ScaleArc co-founder and chief architect, Uday Sawant.

The goal of this benchmark was to identify ScaleArc’s overhead using a real-world application – the world’s most popular (according to Wikipedia) content management system and blog engine: WordPress.

The tests also sought to identify the benefit of caching for this type of workload. The caching parameters represent more real-life circumstances than we applied in the sysbench performance tests – the goal here was not just to saturate the cache. For this reason, we created an artificial WordPress blog with generated data.

The size of the database was roughly 4G. For this particular test, we saw that using ScaleArc introduces very little overhead and caching increased the throughput 3.5 times at peak capacity. In terms of response times, response times on queries for which we had a cache hit decreased substantially. For example, a 5-second main page load became less than 1 second when we had cache hits on certain queries. It’s a bit hard to talk about response time here in general, because WordPress itself has different requests that are associated with different costs (computationally) and which have different response times.

Test description

The pre-generated test database contained the following:

  • 100 users
  • 25 categories
  • 100,000 posts (stories)
  • 300,000 comments (3 per post)

One iteration of the load contained the following:

  • Homepage retrieval
  • 10 story (post) page retrieval
  • 3 category page retrieval
  • Log in as a random user
  • That random user posted a new story and commented on an existing post

We think that the usage pattern is close to reality – most people just visit blogs, but some write posts and comments. For the test, we used WordPress version 3.8.1. We wrote a simple shell script that could do these iterations using multiple processes. Some of this testing pattern, however, is not realistic. Some posts will always have many more comments than others, and some posts won’t have any comments at all. This test doesn’t take that nuance into account, but that doesn’t change the big picture. Choosing a random post to comment on will give us a uniform comment distribution.

We measured 3 scenarios:

  • Direct connection to the database (direct_wp).
  • Connection through ScaleArc without caching.
  • Connection through ScaleArc with caching enabled.

When caching is enabled, queries belonging to comments were cached for 5 minutes, queries belonging to the home page were cached for 15 minutes, and queries belonging to stories (posts) were cached for 30 minutes.

We varied the number of parallel iterations. Each test ran for an hour.

Results for direct database connection

Threads: 1, Iterations: 180, Time[sec]: 3605
Threads: 2, Iterations: 356, Time[sec]: 3616
Threads: 4, Iterations: 780, Time[sec]: 3618
Threads: 8, Iterations: 1408, Time[sec]: 3614
Threads: 16, Iterations: 2144, Time[sec]: 3619
Threads: 32, Iterations: 2432, Time[sec]: 3646
Threads: 64, Iterations: 2368, Time[sec]: 3635
Threads: 128, Iterations: 2432, Time[sec]: 3722

The result above is the summary output of the script we used. The data shows we reach peak capacity at 32 concurrent threads.

Results for connecting through ScaleArc

Threads: 1, Iterations: 171, Time[sec]: 3604
Threads: 2, Iterations: 342, Time[sec]: 3606
Threads: 4, Iterations: 740, Time[sec]: 3619
Threads: 8, Iterations: 1304, Time[sec]: 3609
Threads: 16, Iterations: 2048, Time[sec]: 3625
Threads: 32, Iterations: 2336, Time[sec]: 3638
Threads: 64, Iterations: 2304, Time[sec]: 3678
Threads: 128, Iterations: 2304, Time[sec]: 3675

The results are almost identical. Because a typical query in this example is quite expensive, the overhead of ScaleArc here is barely measurable.

Results for connecting through ScaleArc with caching enabled

Threads: 1, Iterations: 437, Time[sec]: 3601
Threads: 2, Iterations: 886, Time[sec]: 3604
Threads: 4, Iterations: 1788, Time[sec]: 3605
Threads: 8, Iterations: 3336, Time[sec]: 3600
Threads: 16, Iterations: 6880, Time[sec]: 3606
Threads: 32, Iterations: 8832, Time[sec]: 3600
Threads: 64, Iterations: 9024, Time[sec]: 3614
Threads: 128, Iterations: 8576, Time[sec]: 3630

Caching improved response time even for a single thread. At 32 threads, we see more than 3.5x improvement in throughput. Caching is a great help here for the same reason the overhead is barely measurable: the queries are more expensive in general, so more resources are spared when they are not run.

Throughput

From the web server’s access log, we created a per-second throughput graph. We are talking about requests per second here. Please note that the variance is relatively high, because the requests are not identical – retrieving the main page is a different request and has a different cost than retrieving a story page.


The red and blue dots are literally plotted on top of each other – the green bar is always on top of them. The green ones have a greater variance because even though we had caching enabled during the test, we used more realistic TTLs in this cache, so cached items did actually expire during the test. When the cache was expired, requests took longer, so the throughput was lower. When the cache was populated, requests took a shorter amount of time, so the throughput was higher.

CPU utilization


CPU utilization characteristics are pretty much the same on the left and right sides (direct connection on the left and ScaleArc without caching on the right). In the middle, we can see that the web server’s CPU gets completely utilized sooner with caching. Because data comes faster from the cache, it serves more requests, which costs more resources computationally. On the other hand, the database server’s CPU utilization is significantly lower when caching is used. The bar is on the top on the left and right sides – in the middle, we have bars both at the top and at the bottom. The test is utilizing the database server’s CPU completely only when we hit cache misses.

Because ScaleArc serves the cache hits, and these requests are not hitting the database, the database is not used at all when requests are served from the cache. In the case of tests with caching on, the bottleneck became the web server, which is a component that is a lot easier to scale than the database.

There are two more key points to take away here. First, regardless of whether caching is turned on or off, this workload is not too much for ScaleArc. Second, the client we ran the measurement scripts on was not the bottleneck.

Conclusion

The goal of these benchmarks was to show that ScaleArc has very little overhead and that caching can be beneficial for a real-world application, which has a “read mostly” workload with relatively expensive reads (expensive meaning that the network round trip is not a significant contributor to the read’s response time). A blog is exactly that type – typically, more people are visiting than commenting. The test showed that ScaleArc is capable of supporting this scenario well, delivering 3.5x throughput at peak capacity. It’s worth mentioning that if this system needs to be scaled, more web servers could be added, as well as more read slaves. Those read slaves can take read queries either through a WordPress plugin that supports this or through ScaleArc’s read-write splitting facility (it treats autocommit SELECTs as reads); in the latter case, the caching benefit is present for the slaves as well.

