Feb 28, 2011

Is VoltDB really as scalable as they claim?

Before I begin, a disclaimer. VoltDB is not a customer, and did not pay Percona or me to investigate VoltDB’s scalability or publish this blog post. More disclaimers at the end. Short version: VoltDB is very scalable; it should scale to 120 partitions, 39 servers, and 1.6 million complex transactions per second across more than 300 CPU cores, on the benchmarked configuration, with the recommended level of redundancy for HA.

First, if you’re new to VoltDB, I’ll summarize: it’s an open-source OLTP database that is designed to run on a cluster, not just a single machine, and doesn’t sacrifice consistency during a network partition. It is an in-memory shared-nothing system, and tables are partitioned across multiple servers in the cluster; high availability is ensured by keeping multiple copies of each partition. You query VoltDB with stored procedures, not with arbitrary SQL queries. It is designed to be very fast (hundreds of thousands of TPS) even on low-end machines, by doing away with the usual buffer pools, logs, latching, and so on.

The benchmark is VoltDB’s “voter” benchmark, which is explained briefly in this blog post. VoltDB’s Tim Callaghan ran the benchmark three times for each node count from 1 to 12, for k-factors of 0, 1, and 2. The k-factor is the number of redundant copies of each partition that the system maintains. An update to a partition on one server is synchronously replicated to all other copies in the cluster before the transaction completes.

Running all these benchmarks is a lot of work, which is why it is useful to benchmark up to a dozen machines and then model the behavior at larger cluster sizes. I used Neil J. Gunther’s Universal Scalability Law (USL) to model the cluster’s scalability. If you are not familiar with this model, probably the most succinct write-up is in a white paper I published some time ago. Let’s go right to the results, and then I will discuss the details of the modeling.

First, let’s look at results for k-factors of 0 (no redundancy), 1 (recommended), and 2 (extra safety):

Results for k-factor 0

Results for k-factor 1

Results for k-factor 2

Those thumbnails are small and hard to read, but that’s OK, because there is something interesting and important that is easy to miss when you look at the images separately. The k-factor of 0 achieves the highest throughput, which I expected because of the lack of cross-node communication. What’s odd is that the k-factor of 0 reaches its peak throughput at 35 nodes, but k-factor 1 scales to 39 nodes and k-factor 2 doesn’t top out until 46 nodes. If we plot these on the same graph, it’s easier to see:

Actual and modeled results for k-factors 0, 1, and 2

This result was unexpected for me. I expected that a cluster with more inter-node communication would peak at fewer nodes. I asked Tim if he could explain, and he responded that at higher k-factors, there are fewer distinct partitions of data in the cluster. In all configurations, each node had 6 partitions of data, so when we keep more copies of the data, we have fewer unique partitions. In other words, the “unit of scaling” on the x-axis really shouldn’t be the server count, but rather the number of unique partitions in the system. I re-ran my models and generated the following graph:

Actual and modeled results with partitions for k-factors 0, 1, and 2

When approached from this angle, the results make sense. (Individual graphs by partition for k-factors: 0, 1, 2.) Now, for the recommended degree of safety, we can see that this cluster is predicted to scale to 120 partitions, at a throughput of more than 1.6 million transactions per second. This is on commodity 8-core boxes, and with 6 partitions per server and 2 copies of each partition, that should be a 40-node cluster.
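To spell out that arithmetic, here is a tiny Python sketch. It simply encodes the layout described above, where each unique partition is kept in k+1 copies, so the redundant copies don’t add unique partitions:

def unique_partitions(servers, partitions_per_server=6, k_factor=1):
    # Each unique partition is stored in k_factor + 1 copies, so redundant
    # copies of the same data don't increase the number of unique partitions.
    total_sites = servers * partitions_per_server
    return total_sites // (k_factor + 1)

# 40 servers x 6 partitions per server, k-factor 1 (two copies of each partition):
print(unique_partitions(40, 6, 1))   # -> 120 unique partitions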

In case you don’t know what to think of that, I’ll tell you: scaling a synchronously replicated, active-active master, fully ACID, always-consistent database to a 40-server cluster is impressive. Yes, it comes with some limitations (there’s a decent write-up on highscalability.com that explains more), but that is still nothing to sneeze at.

Here are some more disclaimers and details, and I’ll try to anticipate some questions:

  • Percona has no plans to provide services for VoltDB. We’re focused on MySQL software and services.
  • I am not a VoltDB expert. I have a general understanding of VoltDB and of distributed systems.
  • This post grew out of a series of conversations and email exchanges with Tim Callaghan over the course of many months. I asked Tim lots of questions, and he ran lots of benchmarks to satisfy my curiosity.
  • In the opening of this post, I say “should scale” because I don’t have access to the raw performance results from benchmarks at that scale — this is a mathematical model based on a smaller benchmark. Also note that these are not high-end servers, and VoltDB should provide even higher performance on faster machines.
  • I did not audit or repeat Tim’s benchmarks in the level of detail that I would do if this were a paid engagement. However, the data fits the model very well (r-squared of 99.8% or better in all cases), and Tim didn’t know in advance that I’d be modeling the data this way, so the benchmark numbers aren’t manipulated to fit the model. The fact that they fit so well gives me a lot of confidence in them.
  • I had to do a few things to model the benchmarks. First, I had to ignore the results from one and two nodes, because inter-node communication doesn’t exist at all in a one-node cluster and doesn’t behave the same way in a two-node cluster as it does at higher node counts, so those results don’t fit the scalability curve at all. Second, because the USL requires the one-node throughput as a multiplier, I had to regress from the higher node counts back down to one node to estimate it. Finally, I had to adjust the computed one-node performance slightly (by half a percent to 8 percent) to avoid unphysical parameters when regressing against the full dataset. These are rather standard steps in applying the USL model; a minimal sketch of the fitting procedure follows this list.
  • Note that the per-node and per-partition models don’t quite agree. Per-node k-factor-1 says we should scale to 39 servers, while per-partition says 120 partitions which is 40 servers. Similarly, the peak throughput numbers differ slightly. That’s because there’s some rounding, and performing a regression and USL modeling against this kind of data isn’t an exact science anyway — there is some human judgment involved (see the previous bullet point).
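For the curious, here is a minimal sketch of what fitting the USL looks like in Python with SciPy. The data points are synthetic, generated from made-up parameters purely to keep the example self-contained; they are not the benchmark results discussed in this post.

import numpy as np
from scipy.optimize import curve_fit

def usl(n, lam, sigma, kappa):
    # Gunther's Universal Scalability Law.
    # lam:   throughput of a single node (the multiplier)
    # sigma: contention (serialization) coefficient
    # kappa: coherency (crosstalk) coefficient
    return lam * n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1))

# Synthetic measurements at 3..12 nodes, generated from made-up parameters
# purely for illustration -- NOT the benchmark numbers from this post.
nodes = np.arange(3, 13)
tps = usl(nodes, 52_000, 0.02, 0.0006)

# Fit the model to the measurements.
(lam, sigma, kappa), _ = curve_fit(usl, nodes, tps, p0=[50_000, 0.01, 0.001])

# Throughput peaks where dX/dN = 0, i.e. at N* = sqrt((1 - sigma) / kappa).
n_star = np.sqrt((1.0 - sigma) / kappa)
print(f"sigma={sigma:.4f}, kappa={kappa:.5f}, predicted peak at ~{n_star:.0f} nodes")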

Let me close by answering my own question: from what I know of VoltDB, it does indeed match their claim, with good scalability to dozens of servers. For more information, or to download it and try it yourself, you can visit the VoltDB website.

Feb 28, 2011

Percona Server Scales Vertically with Virident tachIOn Drives

We’ve published a new white paper that explains how to stop sharding and start scaling vertically with PCI-E flash drives, specifically the Virident tachIOn drive, which offers consistent, low-latency IO performance. I’ve been beating this drum for a while, so it’s a great feeling to have an explicitly recommended reference architecture: buy flash storage first, shard as a last resort. From the summary: “The sharding approach that has been advocated for the last five years or so is becoming increasingly questionable advice in some environments. Today’s solid-state PCIe hardware offers extremely high-bandwidth, low-latency I/O performance, exemplified by the Virident tachIOn drive. ‘Scaling up’ is once again a viable and economical strategy for MySQL, and ‘scaling out’ need no longer be the default database architecture.”

Feb 27, 2011

Percona welcomes Alexey Kopytov

Percona is pleased to extend a belated welcome to Alexey Kopytov. He actually joined us back in November 2010, but we somehow overlooked blogging about his switch from Oracle to Percona at the time. Alexey has over six years of experience deep in the MySQL source code, at MySQL AB, then Sun, and later Oracle. His first MySQL job was in the original High Performance group within the MySQL Support Team, working under Peter and Vadim. Alexey was the original implementer and maintainer of Sysbench, the tool we use every day for our performance benchmarks. Now Alexey will focus on Percona Server and Percona XtraBackup; you may have seen his name in our recent Release Notes.

Alexey, a big welcome!

Feb 27, 2011

Percona Server and XtraBackup weekly news, February 26th

Percona Server has a new logo:

Percona Server Logo

Other news for Percona Server:

  • Many users noticed that our repositories don’t work for Debian Squeeze, the new version of Debian. Our repositories will support Squeeze as of our next release, which will be based on MySQL 5.1.55.
  • We merged a fix for InnoDB’s slow DROP TABLE performance. A new option, innodb_lazy_drop_table, controls this behavior.
  • Vadim asked Yasufumi to look into porting the innodb_sync_checkpoint_limit feature from the Facebook patches into Percona Server.
  • Peter requested a high priority for logging queries to the slow query log from a replica using row-based replication.
  • Yasufumi reported a feature request for XtraDB to store statistics for the ::records_in_range() handler function, instead of accessing the data. This could improve performance for complex queries.
  • A user on our mailing list asked us to merge Google’s KILL IF_IDLE functionality into Percona Server, and proposed the merge on Launchpad.

In XtraBackup news,

  • Alexey finished moving the streaming functionality out of innobackupex: InnoDB files are now streamed by the xtrabackup binary, and tar4ibd has been removed (this has not been merged into the trunk yet).
  • Alexey completed implementing compression with quicklz in xtrabackup, though a tricky problem remains in combination with streaming.
Feb 26, 2011

Pathfinder – 2011-02-26

The Group

Fighter – sword & board teamwork fighter
Ranger
Witch
Paladin/Sorcerer
Cleric
Rogue
Fighter/Cleric

== The Beginning ==

We started in the small town near the dam. The townspeople were repairing the damage done earlier. We debated what we should do next now that the situation was stabilized and the original mission was completed. We decided to go check out the sunken barge.

(more…)

Feb 24, 2011

Friends of Percona Get 20% Off at the MySQL Conference!

We have a special Friends of Percona discount code that you can use to get 20% off registration at the MySQL conference in April: mys11pkb. If you click the conference image to the left, or this special link, it will pre-fill the code for you when you check out. Read on to see the list of sessions we’re presenting at the conference.

I really hope you are able to make it to this event. The value of attending conferences, for your company and your career, is hard to overstate. For me personally, attending the MySQL conference was a big part of what has shaped my career so far. I hope to see you there — you can flag me (and any of us) down in the hallways, eat lunch with us, come see us at our expo hall booth, and above all come listen and participate in our presentations! We’re giving 3 great tutorials on Monday this year; these always sell out early, so don’t delay registering!

Here are the sessions we’re currently scheduled to present — if I miss any, say something in the comments:

Feb 21, 2011

Death match! EBS versus SSD price, performance, and QoS

Is it a good idea to deploy your database into the cloud? It depends. I have seen it work well many times, and cause trouble at other times. In this blog post I want to examine cloud-based I/O. I/O matters a lot when a) the database’s working set is bigger than the server’s memory, or b) the workload is write-heavy. If this is the case, how expensive is it to get good performance, relative to what you get with physical hardware? Specifically, how does it compare to commodity solid-state drives? Let’s put them in the ring and let them duke it out.

I could do benchmarks, but that would not be interesting — we already know that benchmarks are unrealistic, and we know that SSDs would win. I’d rather look at real systems and see how they behave. Are the theoretical advantages of SSDs really a big advantage in practice? I will show the performance of two real customer systems running web applications.

Let’s begin with a system running in a popular hosting provider’s datacenter. This application is a popular blogging service, running on a generic midrange Dell-class server. The disks are six OCZ-VERTEX2 200-GB drives in a RAID10 array, with an LSI MegaRAID controller with a BBU. These disks currently cost about $400 each, and are roughly half full. (That actually matters — the fuller they get, the slower they are.) So let’s call this a $2500 disk array, and you can plug that into your favorite server and hosting provider costs to see what the CapEx and OpEx are for you. Assuming you will depreciate this over 3 years, let’s call this a $1000 per year storage array, just to make it a round number. These aren’t the most reliable disks in my experience, and you are likely to need to replace one, for example. If you rent this array instead of buying it, the cost is likely to be quite a bit higher.

Now, let’s look at the performance this server is getting from its disks. I’m using the Aspersa diskstats tool to pull this data straight from /proc/diskstats. I’ll aggregate over long periods of time in my sample file so we can see performance throughout the day. If you’re not familiar with the diskstats tool, the columns are the number of seconds in the sample, the device name, and the following statistics for reads: MB/s, average concurrency, and average response time in milliseconds. The same statistics are repeated for writes, and then we have the percent of time the device was busy, and the average number of requests in progress at the time the samples were taken.
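If you are curious how numbers like these can be computed, here is a rough Python sketch of deriving similar statistics from two snapshots of /proc/diskstats, using the standard kernel field layout. It only approximates what the diskstats tool reports; it is not the tool’s actual code, so treat it as an illustration.

import time

FIELDS = ("rd_ios", "rd_merges", "rd_sectors", "rd_ms",
          "wr_ios", "wr_merges", "wr_sectors", "wr_ms",
          "in_prog", "io_ms", "weighted_ms")

def snapshot(device):
    # Parse one device's line from /proc/diskstats (major, minor, name,
    # then the counters named in FIELDS).
    with open("/proc/diskstats") as f:
        for line in f:
            parts = line.split()
            if parts[2] == device:
                return dict(zip(FIELDS, map(int, parts[3:14])))
    raise ValueError(f"device {device!r} not found")

def delta_stats(device, interval=1.0):
    a = snapshot(device)
    time.sleep(interval)
    b = snapshot(device)
    d = {k: b[k] - a[k] for k in FIELDS}
    return {
        "rd_mb_s": d["rd_sectors"] * 512 / 2**20 / interval,
        "rd_cnc":  d["rd_ms"] / (interval * 1000.0),                  # avg reads in flight
        "rd_rt":   d["rd_ms"] / d["rd_ios"] if d["rd_ios"] else 0.0,  # ms per read
        "wr_mb_s": d["wr_sectors"] * 512 / 2**20 / interval,
        "wr_cnc":  d["wr_ms"] / (interval * 1000.0),
        "wr_rt":   d["wr_ms"] / d["wr_ios"] if d["wr_ios"] else 0.0,
        "busy":    100.0 * d["io_ms"] / (interval * 1000.0),          # % of time with I/O in progress
        "in_prg":  b["in_prog"],                                      # a gauge, not a counter
    }

print(delta_stats("sda"))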

[baron@ginger logstats]$ diskstats -g sample -i 6000 sda3-stats.txt
    #ts device rd_mb_s rd_cnc   rd_rt wr_mb_s wr_cnc   wr_rt busy in_prg
...snip...
54009.9 sda3      32.6    1.1     0.5     2.7    0.0     0.4  54%      0
60032.9 sda3      30.0    1.0     0.5     3.0    0.0     0.3  50%      1
66034.0 sda3      23.6    0.8     0.5     3.1    0.0     0.4  43%      1
72040.6 sda3      25.5    2.7     1.6     3.8    0.2     1.5  47%      1
78041.7 sda3      25.5    1.2     0.7     4.5    0.1     0.4  46%      0
84042.8 sda3      24.4    0.7     0.5     4.7    0.0     0.3  44%      0
90043.9 sda3      21.7    0.9     0.6     4.7    0.1     0.6  41%      0
...snip...

So we’re reading 20-30 MB/s from these disks, with average latencies generally under a millisecond, and we’re writing a few MB/s with latencies about the same. The device is active about 40% to 50% of the time, but given that we know there are 3 pairs of drives behind it, the device wouldn’t be 100% utilized until average read concurrency reached 6 and write concurrency reached 3. One of the samples shows the performance during a period of pretty high utilization. (Note: A read concurrency of 2.7 for a 6-device array, plus a few writes that have to be sent to both devices in a mirrored pair, is roughly 50% read utilized. This is one of the reasons I wrote the diskstats tool — you can understand busy-ness and utilization correctly. The %util that is displayed by iostat is confusing, and you have to do some tedious math to get something approximating the real device utilization, but reads and writes aren’t broken out, so you can’t actually understand performance and utilization in this level of detail from the output of iostat.)
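Here is that back-of-the-envelope utilization estimate written out as a small sketch. It assumes reads can be spread across all six drives while each logical write has to land on both drives of one mirrored pair, and it ignores controller caching and other effects, so it is only an approximation:

def raid10_utilization(rd_cnc, wr_cnc, n_drives=6):
    # Reads can be serviced by any of the n_drives; each logical write goes to
    # both drives of a mirrored pair, so only n_drives/2 writes can proceed at once.
    return rd_cnc / n_drives + wr_cnc / (n_drives / 2)

# The busiest sample above: read concurrency 2.7, write concurrency 0.2
print(f"{raid10_utilization(2.7, 0.2):.0%}")   # roughly 50%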

I would characterize this as very good performance. Sub-millisecond latencies to disk, pretty much consistently, for reads and writes, is very good. I’m averaging across large periods of time here, so you can’t really see it, but there are significant spikes of load on the server during these times, and the disks keep responding in less than 2ms. I could zoom in to the 30-second level or 1-second level and show you that it remains the same from second to second. I could slice and dice this data all different ways, but it would be boring, because it looks the same from every angle.

Now let’s switch to a different customer’s system. This one is a popular online retail store, running in the Amazon cloud. It’s a quadruple extra large EC2 server (currently priced at $2/hour for a non-reserved instance) with a software RAID10 array of EBS volumes. As time has passed, they’ve added more EBS volumes to keep up with load. Between the time I sampled statistics last week and now, they went from a 10-volume array to a 20-volume array, for example. But the samples I’ll show are from an array of 10 x 100GB EBS volumes. EBS currently lists at $0.10 per GB-month of provisioned storage, which works out to $100 per month for this array. I also grabbed the counters from /proc/diskstats and computed the cost of the I/O operations done so far on this array, at the list price of $0.10 per 1 million I/O requests, to be $126. This is over a long period of time, so the counters could have wrapped, but let’s assume not. So this disk array might cost $1500 or so per year. What do we get for that? Let’s look at some second-by-second output of the diskstats tool during a 30-second period, aggregated over all of the 10 EBS volumes:

  #ts device rd_mb_s rd_cnc   rd_rt wr_mb_s wr_cnc   wr_rt busy in_prg
  1.0 {10}       1.4    0.7   118.1     0.3    0.0     9.7   3%      0
  2.0 {10}       4.9    0.4    24.2     0.5    0.0     3.9   8%      0
  3.0 {10}       5.8    0.8    42.0     2.1    0.1     6.5  16%      0
...snip...
 20.2 {10}       2.6    0.1    12.1     0.3    0.0     4.0   3%      0
 21.2 {10}       2.3    0.2    27.7     0.3    0.0     2.3   4%      0
 22.2 {10}       6.6    0.6    28.8     0.1    0.0     5.8  11%     19
 23.2 {10}       0.6    0.4   132.5     0.3    0.0     9.7   1%      0
 24.2 {10}       2.5    0.1    13.4     0.3    0.0     3.8   3%      0
 25.2 {10}       2.6    0.1    12.8     0.3    0.0     4.2   3%      0

In terms of throughput, we’re getting a couple of megabytes per second of reads, and generally less than a megabyte per second of writes, but the latency is both high and variable even though the devices are idle most of the time. In some time periods, the write latency gets into the 30s of milliseconds, and the read latency goes above 130 milliseconds. And that is an average over a one-second sample, across all 10 devices aggregated together in each line.

I can switch the view of the same data to look at it disk-by-disk, aggregated over the entire 30 seconds during which I captured samples of /proc/diskstats. Here are the statistics for the EBS volumes over that time period:

  #ts device rd_mb_s rd_cnc   rd_rt wr_mb_s wr_cnc   wr_rt busy in_prg
 {29} sdi5       0.4    0.3    24.5     0.1    0.1    11.1   6%      0
 {29} sdi2       0.4    0.5    38.3     0.1    0.1    15.5   8%      0
 {29} sdj5       0.4    0.3    22.6     0.2    0.1    13.5   7%      0
 {29} sdi4       0.4    0.2    14.5     0.1    0.1    11.7   6%      0
 {29} sdi3       0.4    0.4    27.9     0.1    0.1    11.3   7%      0
 {29} sdj3       0.4    0.4    35.9     0.1    0.1    11.4   8%      0
 {29} sdi1       0.4    0.3    20.3     0.1    0.1    16.4   6%      0
 {29} sdj4       0.4    0.4    30.9     0.2    0.1    13.4   8%      0
 {29} sdj2       0.4    0.3    20.6     0.1    0.1    15.0   7%      0
 {29} sdj1       0.4    0.9    71.8     0.1    0.1    12.2   9%      0

So over the 30-second period (shown as {29} in the output because the first sample is merely used as a baseline for subtracting from other samples), we read and wrote about half a megabyte per second from each volume, and got read latencies varying from the teens to the seventies of milliseconds, and write latencies in the teens. Note how variable the quality of service from these EBS volumes is — some are fast, some are slow, even though we are asking the same thing from all of them (I wrote on my own blog about the reasons for this and the wrench it throws into capacity planning). If we zoom in on a particular sample — say the sample taken at a 23.2 second delta since the beginning of the period — we can see non-aggregated statistics:

  #ts device rd_mb_s rd_cnc   rd_rt wr_mb_s wr_cnc   wr_rt busy in_prg
 23.2 sdi5       0.0    0.0     0.0     0.0    0.0     2.5   0%      0
 23.2 sdi2       0.0    0.0     0.0     0.0    0.0     2.0   0%      0
 23.2 sdj5       0.0    0.0     0.0     0.0    0.0     1.0   0%      0
 23.2 sdi4       0.0    0.0     0.0     0.1    0.0     2.2   1%      0
 23.2 sdi3       0.0    0.0     0.0     0.1    0.0     2.5   1%      0
 23.2 sdj3       0.0    0.0     0.0     0.1    0.0     3.0   1%      0
 23.2 sdi1       0.0    0.0     0.0     0.0    0.0     3.0   1%      0
 23.2 sdj4       0.3    1.9   161.4     0.0    0.1    48.7   5%      0
 23.2 sdj2       0.0    0.0     0.8     0.1    0.0     2.3   1%      0
 23.2 sdj1       0.3    1.8   176.8     0.0    0.1    34.0   1%      0

During that time period, two of these devices were responding in worse than 160 and 170 milliseconds, respectively. One final zoom-in on a specific device, and I’ll stop belaboring the point. Let’s look at the performance of sdj1 over the entire sample period:

  #ts device rd_mb_s rd_cnc   rd_rt wr_mb_s wr_cnc   wr_rt busy in_prg
  1.0 sdj1       0.6    4.1   228.3     0.0    0.1    45.3  11%      0
  2.0 sdj1       0.5    1.4    90.8     0.0    0.0     4.0  25%      0
  3.0 sdj1       0.6    0.2    12.3     0.3    0.1     4.8   9%      0
  4.0 sdj1       0.2    0.1     8.7     0.0    0.0     1.4   2%      0
  5.0 sdj1       0.7    4.2   192.6     0.0    0.0     2.2  23%      0
  6.1 sdj1       0.7    4.3   196.9     0.0    0.1    34.0  23%      0
  7.1 sdj1       0.4    1.7   123.4     0.0    0.2   204.0  22%      0
  8.1 sdj1       0.4    0.1     8.3     0.0    0.0     2.3   3%      0
  9.1 sdj1       0.2    0.0     7.7     0.1    0.0     3.0   2%      2
 10.1 sdj1       0.8    0.5    18.7     0.0    0.0     1.8   8%      0
 11.1 sdj1       0.5    0.1     4.5     0.1    0.0     4.6   4%      0
 12.1 sdj1       0.5    0.1     4.4     0.0    0.0     1.8   3%      0
 13.1 sdj1       0.7    5.1   235.2     0.0    0.0     1.7  28%      0
 14.1 sdj1       0.5    0.1     5.8     0.0    0.0     1.0   4%      2
 15.1 sdj1       0.0    0.0     0.0     1.6    1.2    12.7  15%      0
 16.1 sdj1       0.3    0.2    16.2     0.0    0.0     5.5   3%      0
 17.2 sdj1       0.2    0.1    12.2     0.0    0.0     3.5   2%      0
 18.2 sdj1       0.2    0.1    10.6     0.0    0.0     3.2   2%      0
 19.2 sdj1       0.2    0.1    13.4     0.0    0.0     4.2   3%      0
 20.2 sdj1       0.2    0.1    14.1     0.0    0.0     3.7   2%      0
 21.2 sdj1       0.2    0.1    10.2     0.0    0.0     2.5   2%      0
 22.2 sdj1       0.4    0.3    19.1     0.0    0.0     1.0  23%     10
 23.2 sdj1       0.3    1.8   176.8     0.0    0.1    34.0   1%      0
 24.2 sdj1       0.2    0.1    16.0     0.0    0.0     3.5   3%      0
 25.2 sdj1       0.2    0.1     9.4     0.1    0.0     2.8   2%      0
 26.2 sdj1       0.2    0.1    13.1     1.5    1.4    17.4  15%      0
 27.2 sdj1       0.0    0.0     0.0     0.0    0.0     1.2   0%      0
 28.3 sdj1       0.8    0.5    19.3     0.1    0.0     3.2   8%      0
 29.3 sdj1       0.0    0.0     0.0     0.1    0.0     9.5   4%      0

Yes, that’s right, average latency during some of those samples was over 230 milliseconds per operation. At that rate you could expect to get about 4 reads per second from that device. You can probably guess that this database server is in pretty severe trouble, waiting 230 milliseconds for the disk to respond. Indeed it is — and as you can see, it’s not like we’re really asking all that much of the disks. We’re trying to read a total of 3.7 MB/s and write 1.4 MB/s, and we’re being stalled for a quarter of a second sometimes. This is causing acute performance problems for the database, manifested as epic stalls and server lockups which show up as sky-high latency spikes in New Relic.

Suddenly EC2 doesn’t seem like such a good deal for this database after all (I emphasize, for this database). To summarize:

  • Server one in the datacenter is maybe a $10k machine with a $3000 disk array (say $4000 total per year plus colo costs, if you buy the server and rent a rack), responding to the database in generally sub-millisecond latencies, at a throughput of 30-40MB/s with quite a bit of headroom for more throughput.
  • Server two in the cloud costs about $17k to run per year, plus about $1500 per year in disk cost (up to $3000 per year now that they’ve added 10 more volumes), and is responding to the database in the tens and hundreds of milliseconds — highly variable from second to second and device to device — and causing horrible database pile-ups.
  • We’re comparing apples and oranges no matter what, but put simply: the prices are in the same order of magnitude, while the performance differs by two to three orders of magnitude. The rough arithmetic below makes the price side of that comparison concrete.
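For reference, here is the back-of-the-envelope arithmetic behind the cloud numbers, using the list prices quoted earlier in the post. The yearly I/O request volume is a placeholder assumption, included just to show how that line item is computed, not a measured figure:

# Rough yearly cost of the EC2/EBS setup at the list prices quoted above.
HOURS_PER_YEAR = 24 * 365

instance_cost = 2.00 * HOURS_PER_YEAR        # $2/hour, non-reserved quadruple XL
ebs_storage   = 10 * 100 * 0.10 * 12         # 10 x 100 GB at $0.10 per GB-month
io_requests   = 3_000_000_000                # placeholder assumption, not measured
ebs_io        = io_requests / 1_000_000 * 0.10   # $0.10 per million I/O requests

total = instance_cost + ebs_storage + ebs_io
print(f"instance ${instance_cost:,.0f} + storage ${ebs_storage:,.0f} "
      f"+ I/O ${ebs_io:,.0f} = ${total:,.0f} per year")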

I don’t want to be seen as bashing cloud computing or any cloud platform. It is my intention to show that under these circumstances, EBS doesn’t deliver good QoS or a good price-to-performance ratio for the database as compared to a handful of SSDs (or traditional disks, for that matter, which would show writes in the sub-ms range and reads in the 3-5ms range). We have a lot of customers running databases in various cloud services with great results. In particular, it can be made to work really well if the database fits in memory and isn’t write-intensive. But the high and unpredictable latency of EBS storage means that if the active set of data stops fitting in memory, or if a lot of transactional writes are required, then performance takes a severe hit.

In conclusion, in a knock-down-drag-out fight between the EBS gang and the SSD thugs, a small number of SSDs mops the floor with the competition, and walks away with the prize.

(For my future reference, the EBS sample file here was named 2011_02_10_12_29_12-diskstats).

Feb 21, 2011

Percona Server and XtraBackup weekly news, February 19th

This week’s announcement is short: last week was pretty much all-hands-on-deck for our first 5.5.8 release. Still, it wasn’t ALL about the new release:

  • The beta release of Percona Server 5.5.8 is out! More details in the release notes.
  • There are ongoing discussions about parallel replication, as well as customer requests for enhancements to row-based replication (for example, making it work with more types of schema changes on a replica), but there is nothing concrete to announce.
  • Yasufumi is working on solving the problem with slow DROP TABLE performance in InnoDB.
  • There is a lot of work on packaging and build improvements, such as handling various dependencies better in Ubuntu.
  • A lot of new test cases! Our QA engineer Valentine Gostev is very busy thinking of ways to try to break things, and writing test cases for them.
Feb 17, 2011

How to syntax-check your my.cnf file

For a long time I’ve used a little trick to check whether there are syntax errors in a server’s my.cnf file. I do this when I need to shut down and restart the server, and I’ve either made changes to the file, or I’m worried that someone else has done so. I don’t want to have extra downtime because of a syntax error.

The trick is to examine the currently running MySQL server’s command-line from ps -eaf | grep mysqld, and then copy those options into something like the following:

/usr/sbin/mysqld <options> --help --verbose

However, this requires care. First, it should be run as a user who doesn’t have write privileges to the database directory, so it can’t actually mess with the server’s data if something goes wrong. Second, you need to specify a non-default socket and pid-file location. If you run the command as a privileged user, it will actually remove the pid file from the running server, and that can break init scripts.

Because of the above risks, I am extremely careful with this technique, and I have always wanted a better way. In fact, I only recently discovered the gotcha with the pid file. Perhaps readers can suggest something safer but still effective in the comments.
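For illustration, here is a minimal Python sketch of a wrapper that applies the precautions above. It is a variation on the technique: instead of copying options from the ps output, it points mysqld at the config file with --defaults-file, and overrides the socket and pid-file so the running server can’t be touched. Run it as an unprivileged user, and verify the option names against your server version; this is a sketch of the idea, not a hardened tool.

import subprocess
import sys
import tempfile

def check_mycnf(cnf_path, mysqld="/usr/sbin/mysqld"):
    scratch = tempfile.mkdtemp(prefix="mycnf-check-")
    cmd = [
        mysqld,
        f"--defaults-file={cnf_path}",      # must come first
        f"--socket={scratch}/check.sock",   # don't collide with the running server
        f"--pid-file={scratch}/check.pid",  # don't remove the running server's pid file
        "--help", "--verbose",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    # A non-zero exit status (or complaints on stderr) usually means a bad option.
    return result.returncode == 0, result.stderr

if __name__ == "__main__":
    ok, errors = check_mycnf(sys.argv[1] if len(sys.argv) > 1 else "/etc/my.cnf")
    print("config looks OK" if ok else "problems found:\n" + errors)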

Feb 16, 2011

Percona Server 5.5.8 Beta Release

It’s finally here! Percona Server 5.5.8-20.0 is now available for download. This is a beta release of Percona’s enhancements to the MySQL 5.5.8 server. Here are some highlights:

  • Performance and scalability improvements throughout the server and storage engine
  • Optimizations for flash storage such as SSD, Virident, and FusionIO
  • Optimizations for cloud computing
  • The HandlerSocket plugin for NoSQL access
  • There’s an Amazon OS repository, as well as Yum and Apt repositories
  • Improvements to replication, partitioning, stored procedures
  • More diagnostics and tunability
  • More pluggability, including pluggable authentication


In addition to building on MySQL 5.5, here are the changes we’ve made from previous Percona Server releases:

New Features

Variable Changes

Other Changes

  • Additional information was added to the LOG section of the SHOW STATUS command. Bug fixed: #693269. (Yasufumi Kinoshita)
  • The SHOW PATCHES command was removed. (Vadim Tkachenko)
  • The INFORMATION_SCHEMA table XTRADB_ENHANCEMENTS was removed. (Yasufumi Kinoshita)
  • Several fields in the INFORMATION_SCHEMA table INNODB_INDEX_STATS were renamed. Bug fixed: #691777. (Yasufumi Kinoshita)
  • The XtraDB version was set to 20.0. (Aleksandr Kuzminsky)
  • Many InnoDB compilation warnings were fixed. Bug fixed: #695273. (Yasufumi Kinoshita)
  • An Amazon OS repository was created. Bug fixed: #691996. (Aleksandr Kuzminsky)

For more information, please see the following links:
