Jul
31
2012
--

Innodb Table Locks

Innodb uses row-level locks, right? So if you see locked tables reported in SHOW ENGINE INNODB STATUS you might be confused, and rightfully so, as Innodb table locking is a bit more complicated than traditional MyISAM table locks.

Let me start with some examples. First, let's run a SELECT query:


---TRANSACTION 12303, ACTIVE 26 sec
mysql tables in use 2, locked 0
MySQL thread id 53038, OS thread handle 0x7ff759b22700, query id 3918786 localhost root Sending data
select count(*) from sbtest,sbtest x
Trx read view will not see trx with id >= 12304, sees < 12301

As you can see, in this case the query self-joins the table, so we observe 2 table instances in use (note: the same table gets counted twice) but zero tables locked. Innodb does not need any row locks for conventional selects; it just uses MVCC to handle updates if they happen concurrently.

Let's now try the same select but add LOCK IN SHARE MODE so it performs locking reads, to validate our theory:

---TRANSACTION 12305, ACTIVE 9 sec
mysql tables in use 2, locked 2
8316 lock struct(s), heap size 1948088, 10008317 row lock(s)
MySQL thread id 53173, OS thread handle 0x7ff75963b700, query id 3936362 localhost root Sending data
select count(*) from sbtest,sbtest x lock in share mode
TABLE LOCK table `sbtest`.`sbtest` trx id 12305 lock mode IS
RECORD LOCKS space id 84 page no 38 n bits 1272 index `k` of table `sbtest`.`sbtest` trx id 12305 lock mode S
RECORD LOCKS space id 84 page no 39 n bits 1272 index `k` of table `sbtest`.`sbtest` trx id 12305 lock mode S
RECORD LOCKS space id 84 page no 55 n bits 1272 index `k` of table `sbtest`.`sbtest` trx id 12305 lock mode S
RECORD LOCKS space id 84 page no 73 n bits 1272 index `k` of table `sbtest`.`sbtest` trx id 12305 lock mode S

Aha! Now we have 2 tables in use and 2 tables locked reported. If we go down to the details about the locks held (a feature available in Percona Server) we can see the table is locked in "IS" mode and there are a number of row-level locks in "S" mode. What does it mean? Well, we asked Innodb to do locking reads, so it has to lock all the rows which are being touched. However, with Innodb's lock hierarchy this also means the table needs to be locked in "IS" mode. "IS" means Intention-Shared: locking the table with the intent to lock some of its rows in Shared mode. Intention locks are very loose; an IS lock on the table does not conflict with any lock other than an X lock on the whole table, which would only be set if you're doing table-level operations such as dropping the table.
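
The intention-lock rules can be sketched as a tiny compatibility matrix. This is an illustrative toy in Python, not Innodb source code; the mode names are the ones you see in the status output above.

```python
# Toy sketch of InnoDB's table-level lock compatibility rules
# (IS/IX are intention locks, S/X are full table locks).
COMPATIBLE = {
    "IS": {"IS", "IX", "S"},  # IS conflicts only with a full-table X lock
    "IX": {"IS", "IX"},       # IX conflicts with table-level S and X
    "S":  {"IS", "S"},
    "X":  set(),              # X conflicts with everything
}

def conflicts(held, requested):
    """True if the requested table lock must wait for the held one."""
    return requested not in COMPATIBLE[held]

print(conflicts("IS", "IX"))  # False: locking reads and updates coexist
print(conflicts("IS", "X"))   # True: DROP TABLE needs X, which IS blocks
```

This is why many transactions can hold IS and IX on the same table at once, and only whole-table operations ever wait on them.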

If you're attentive you will also note the locks are set on index "k" - this is because Innodb decided to do an index scan to resolve this query, so it is locking the entries in this index rather than the primary key.

Let's now look at writes:

---TRANSACTION 12304, ACTIVE 3 sec fetching rows
mysql tables in use 1, locked 1
9417 lock struct(s), heap size 915896, 696679 row lock(s)
MySQL thread id 53173, OS thread handle 0x7ff75963b700, query id 3929840 localhost root Updating
update sbtest set c=concat(c,'c')
TABLE LOCK table `sbtest`.`sbtest` trx id 12304 lock mode IX
RECORD LOCKS space id 84 page no 6 n bits 144 index `PRIMARY` of table `sbtest`.`sbtest` trx id 12304 lock_mode X
RECORD LOCKS space id 84 page no 7 n bits 144 index `PRIMARY` of table `sbtest`.`sbtest` trx id 12304 lock_mode X
RECORD LOCKS space id 84 page no 8 n bits 144 index `PRIMARY` of table `sbtest`.`sbtest` trx id 12304 lock_mode X
RECORD LOCKS space id 84 page no 11 n bits 144 index `PRIMARY` of table `sbtest`.`sbtest` trx id 12304 lock_mode X
RECORD LOCKS space id 84 page no 14 n bits 144 index `PRIMARY` of table `sbtest`.`sbtest` trx id 12304 lock_mode X
RECORD LOCKS space id 84 page no 15 n bits 144 index `PRIMARY` of table `sbtest`.`sbtest` trx id 12304 lock_mode X
RECORD LOCKS space id 84 page no 16 n bits 144 index `PRIMARY` of table `sbtest`.`sbtest` trx id 12304 lock_mode X

When updating the table it also gets "locked", now with an IX lock... this is because the update needs to lock the rows in exclusive mode. Similar to IS, the IX lock is rather loose: I can have multiple update queries running on the table, each locking it in IX mode, and they will not conflict unless they touch the same rows.

Now, finally, let's illustrate how MySQL-level and Innodb-level locks interact. To do this we can issue LOCK TABLE sbtest WRITE and repeat our update query. We will still see the table reported as locked in "IX" mode in SHOW ENGINE INNODB STATUS, while concurrent updates to this table will be prevented until it is unlocked. What does this illustrate? Very simple: SHOW ENGINE INNODB STATUS does not know anything about MySQL-level locks, so a table locked on the MySQL level with LOCK TABLES will not show up there.

Now you may have spotted an important difference between MyISAM and Innodb when it comes to table-level locks. For MyISAM tables, running an UPDATE query is essentially equivalent to locking the table for write (on the MySQL level) before the operation and unlocking it straight after. Not so for Innodb. Unless the table is locked explicitly, Innodb "converts" the table lock to "no lock", hence eliminating conflicts on MySQL-level table locks for most queries.

Summary: MySQL table-level locks and Innodb table-level locks are two separate beings. You will almost never run into problems with Innodb table-level locks because Innodb only sets intention locks for everything but DDL operations. If you're having locking issues with Innodb, chances are it is row-level locks or the auto-increment table-level lock (mostly with MySQL 5.0 and older versions). MySQL-level locks are an entirely different story. Explicitly locking tables on the MySQL level will prevent the tables from being accessed and will not show up in SHOW ENGINE INNODB STATUS. It is good practice not to use LOCK TABLES when you're using Innodb tables.

Jul
30
2012
--

Percona’s “Developer Training for MySQL” is now available via Live Virtual Training

Today, Percona announces that Live Virtual Training is now an option for busy professionals.

We set out to develop a training solution that balances the time to deliver the content with the amount of time an attendee can participate in a live virtual training (LVT) session. The goal was to ensure there was still time in the day to allow the attendee to complete his or her work. Percona’s LVT offering extends the days of delivery while reducing the time the attendee needs to be in class each day. Our LVT sessions are a combination of live virtual sessions and remote labs,  blended with online learning. How does this work?

We ask that attendees set aside 2 hours each day for 5 days for the live virtual sessions. This is the time when the Percona instructor delivers the lecture, shows examples, and answers questions. After the live session is over, attendees have a remote lab that will take, on average, 30 minutes to complete. The remote lab is hosted in the cloud and the attendee has 22 hours to finish it. All we ask is that it is done before the live lecture starts the next day. Outside of the live virtual session, the attendee has access to online forums where they can ask questions of the instructor and their fellow students.

For the month of August, you can enroll in one of the LVT sessions for Developer Training for MySQL and save over 35%. The sessions we have scheduled are:

  • August 6, Monday – Friday, 9am – 11am Eastern timezone.
  • August 20, Monday – Friday, 9am – 11am Pacific timezone.
  • August 27, Monday – Friday, 9am – 11am GMT timezone.

To learn more and take advantage of our introductory offer, go to our online store and purchase a seat today.

We will announce availability of other LVT offerings as they become available.

Jul
30
2012
--

Percona Live NY 2012 Super Saver discounts end tomorrow. Book Now!

If you’re looking for the best prices for Percona Live NY 2012, taking place Oct 1-2 in New York, do not delay and book now. Super Saver discounts end tomorrow, which can save you hundreds of dollars off conference list prices. These are the best prices you can get by booking early; there will be no other discounts available that will get you to this price.

Percona Live NY 2012 builds on the success of our 2011 NY event, which attracted over 300 attendees in May 2011. We have expanded the event to 2 days, adding a Tutorial Day with many in-depth and hands-on training sessions on the most popular MySQL topics. We have also moved to a new, larger venue which can accommodate more people and more tracks. We also have more expo space for sponsors to present their solutions and technologies.

Our conference day features over 30 fast-paced conference sessions and keynotes, presented by experts from Percona, EffectiveMySQL, PalominoDB, Monty Program AB, Akiban, Tokutek, Continuent, Codeship as well as representatives from some of the world’s most demanding MySQL installations at Facebook, Youtube, Tumblr, Etsy, Paypal, HP, and many others.

This is really going to be the MySQL event of the year on the East Coast!

Jul
26
2012
--

SQL Injection Questions Followup

I presented a webinar today about SQL Injection, to try to clear up some of the misconceptions that many other blogs and articles have about this security risk.  You can register for the webinar even now that I’ve presented it, and you’ll be emailed a link to the recording, which will be available soon.

During my webinar, a number of attendees asked some good questions, and I wasn’t able to answer them all before the hour was up.  Here are the questions and my answers.

Tobin C. asked:
Q: Does the use of Parameters (particularly the OdbcParameter class in .NET) qualify as an appropriate security mechanism for normal WHERE interpolation? Or should the input be validated before creating a parameter?

Yes, the OdbcParameter class should be safe.  The OdbcParameter abstracts query parameters in order to make sure you can use ODBC in a database-agnostic way.  Vendor differences in parameter naming are solved by ODBC interpolating parameter values before it sends the query string to be prepared.

This is different from a true parameterized query, which separates the prepare step from the parameter binding step as I showed in my presentation.  However, it’s still safe because the interpolation is done in ODBC with escaping code that is mature and thoroughly tested.

PHP’s PDO library does something similar, interpolating parameter values into a query string before prepare.  But I’m not so sure the PDO code is as thoroughly tested.

Jonathan C. asked:
Q: Is your book available through any of the digital library subscription services?

You can purchase my book SQL Antipatterns: Avoiding the Pitfalls of Database Programming in digital formats for Kindle, iOS, and PDF at the publisher site: http://pragprog.com/book/bksqla/sql-antipatterns

Jonathan C. also commented:
Q: My preferred variable parameter format:

bug_id IN (?" . str_repeat(', ?', count($params) - 1) . ")

Good suggestion, Jonathan. Since you can use parameters for only one value at a time, it’s tricky to create IN predicates.  But you can add a variable number of parameter placeholders based on the length of the array of values in your application. Of course there are multiple ways of doing that.  Here’s an example of how I do it using PHP:

$placeholders = join(",", array_fill(0, count($params), "?"));
$sql = "SELECT * FROM Bugs WHERE bug_id IN ($placeholders)";
$stmt = $pdo->prepare($sql);
$stmt->execute($params);
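
For comparison, here is the same placeholder-building idea sketched in Python (the table and column names mirror the hypothetical PHP example above):

```python
# Build one "?" placeholder per value, then bind the whole list at execute time.
params = [1, 2, 3]
placeholders = ",".join(["?"] * len(params))
sql = "SELECT * FROM Bugs WHERE bug_id IN (%s)" % placeholders
print(sql)  # SELECT * FROM Bugs WHERE bug_id IN (?,?,?)
# A driver would then run: cursor.execute(sql, params)
```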

Radu M. asked:
Q: What about INSERT-ing binary data? Do I have to protect myself in this situation? Why and how? 

Any interpolation of unsafe content into an SQL string is potentially an injection risk.  If you have binary data, you could interpolate the binary bytes, so it might not be clear how to escape input that contains non-printing characters.

The standard API function mysql_real_escape_string() is aware of binary bytes, and will escape appropriately.  SQL also has a builtin QUOTE() function so you can do this in SQL expressions instead of in application code.  See http://dev.mysql.com/doc/refman/5.5/en/string-literals.html for more information on string literals.
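
To illustrate the point, here is a sketch using Python's sqlite3 module (any MySQL driver with parameter support works the same way): binding the bytes as a parameter means you never have to escape non-printing characters by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blobs (id INTEGER PRIMARY KEY, data BLOB)")

payload = b"\x00\x01'\"\xff"  # bytes that would break naive string escaping
conn.execute("INSERT INTO blobs (data) VALUES (?)", (payload,))

(stored,) = conn.execute("SELECT data FROM blobs").fetchone()
print(stored == payload)  # True: the value round-trips unchanged
```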

Allan F. asked:
Q: Is it the DBA’s job to review the code that might have SQL injection?

Good question!  It’s primarily the responsibility of developers to do self-review and peer review of application code for security handling, but the DBA can definitely help and offer insights into SQL and how the injection risks could affect the database.  The DBA should be included in the review process, but it’s not her responsibility alone.

Brian P. commented:
Q: If you’re writing Dynamic SQL in your Stored Procedures you have other issues :)  

I consulted for a customer who uses dynamic SQL in stored procedures extensively.  They found that it was awkward because of the limits of the procedure language MySQL offers, and also it moves too much computation load onto their database server.  Their MySQL server became CPU-bound, while their multiple PHP application servers were often idle, waiting for the database server.  It would make sense to scale out as much work as possible to the app servers.

Having said that, it’s hard to generalize.  There are always edge cases where it’s appropriate to use a stored procedure, even one that writes dynamic SQL.  And it only takes one SQL injection vulnerability to cause a big problem.  Your site is only as secure as its weakest link, which makes coding and testing for security very challenging.  Whereas with performance, your site is working well if you’ve optimized the majority of the most frequently-accessed code, even if not every bit of code.

Andrey M. asked:
Q: Could you specify input data strings I should use to find a security bug in my code?

This is a big topic.  One simple answer is that you can try embedding a string-termination quote into an input data string, followed by invalid SQL syntax:

http://example.com/page.php?param=string' TEST TEST TEST

Hopefully, you have written your code to raise errors so you will then see during testing where you have problems with input being interpolated in an unsafe manner.  You don’t need to perform a successful breakin to test for this type of security flaw.
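
The technique can be sketched with Python's sqlite3 (an illustration of the idea, not a real penetration test): the stray quote breaks a naively concatenated query with a syntax error, while the parameterized version treats the same input as plain data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (param TEXT)")

evil = "string' TEST TEST TEST"  # the probe input from the URL above
try:
    # Naive interpolation: the embedded quote terminates the string literal
    conn.execute("SELECT * FROM pages WHERE param = '%s'" % evil)
    naive_failed = False
except sqlite3.OperationalError:
    naive_failed = True  # syntax error raised: the injection point is found

# Parameterized version: the same input is just data, no error at all
rows = conn.execute("SELECT * FROM pages WHERE param = ?", (evil,)).fetchall()
print(naive_failed, rows)  # True []
```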

The more difficult part is making sure to test all such cases in your code where SQL statements interpolate external content. Unfortunately, doing code review to find all these cases can be laborious and time-consuming.

Andrey M. also asked:
Q: Should I delete all unsafe sentences from user input before I store the data in DB?

I assume you mean to remove content that looks like it might contain SQL reserved words?  If we did that, how would we store the legitimate name of a Mr. Order, who lives at 123 Union Street?

If you are careful to escape or parameterize to ensure you insert data values safely, and you also treat data already in the database as content that must be protected again when you use query results in a subsequent SQL query, then you should be able to store such strings safely.
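
A sketch of this using Python's sqlite3 (hypothetical table and column names): values that happen to contain SQL reserved words are stored verbatim, and reused safely later by parameterizing again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (name TEXT, street TEXT)")

# Legitimate values that happen to contain SQL reserved words
conn.execute("INSERT INTO Customers VALUES (?, ?)",
             ("Mr. Order", "123 Union Street"))

# Reusing stored data in a later query: parameterize again rather than
# concatenating, even though the value came from our own database.
name = conn.execute("SELECT name FROM Customers").fetchone()[0]
row = conn.execute("SELECT street FROM Customers WHERE name = ?",
                   (name,)).fetchone()
print(row[0])  # 123 Union Street
```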

Another risk of storing code-like strings in the database is that they may contain illicit JavaScript.  This is not an SQL injection risk, but it can cause trouble when you retrieve the string from the database and output it in an HTML presentation.  This type of security issue is as common as SQL injection, and you can read more about it here:  https://www.owasp.org/index.php/Cross_Site_Scripting_Flaw

I’d like to see some of the folks who attended my SQL Injection webinar when I present the popular Percona Training in Salt Lake City, September 24-27.  See http://www.percona.com/training/ for details on our training offerings and the schedule for upcoming events.

Jul
25
2012
--

Percona XtraDB Cluster: Failure Scenarios with only 2 nodes

During the design period of a new cluster, it is always advised to have at least 3 nodes (this is the case with PXC, but it’s also the same with PRM). But why, and what are the risks?

The goal of having more than 2 nodes (in fact, an odd number is recommended in this kind of cluster) is to avoid a split-brain situation. This can occur when quorum (which can be simplified as “majority vote”) is not honoured. A split-brain is a state in which the nodes lose contact with one another and then each tries to take control of shared resources or provide the cluster service simultaneously.
On PRM the problem is obvious: both nodes will try to run the master and slave IPs and will accept writes. But what could happen with Galera replication on PXC?

OK, first let’s have a look at a standard PXC setup (no special galera options), 2 nodes:

two running nodes (percona1 and percona2), communication between nodes is ok

[root@percona1 ~]# clustercheck 
HTTP/1.1 200 OK

Content-Type: Content-Type: text/plain



Node is running.

Same output on percona2

Now let’s check the status variables:


| wsrep_local_state_comment  | Synced (6)                           |
| wsrep_cert_index_size      | 2                                    |
| wsrep_cluster_conf_id      | 4                                    |
| wsrep_cluster_size         | 2                                    |
| wsrep_cluster_state_uuid   | 8dccca9f-d4b8-11e1-0800-344f6b618448 |
| wsrep_cluster_status       | Primary                              |
| wsrep_connected            | ON                                   |
| wsrep_local_index          | 0                                    |
| wsrep_ready                | ON                                   |

on percona2:


| wsrep_local_state_comment  | Synced (6)                           |
| wsrep_cert_index_size      | 2                                    |
| wsrep_cluster_conf_id      | 4                                    |
| wsrep_cluster_size         | 2                                    |
| wsrep_cluster_state_uuid   | 8dccca9f-d4b8-11e1-0800-344f6b618448 |
| wsrep_cluster_status       | Primary                              |
| wsrep_connected            | ON                                   |
| wsrep_local_index          | 1                                    |
| wsrep_ready                | ON                                   |

Only wsrep_local_index differs, as expected.

Now let’s stop the communication between both nodes (using firewall rules):

iptables -A INPUT -d 192.168.70.3 -s 192.168.70.2 -j REJECT

This rule simulates a network outage that makes connections between the two nodes impossible (switch/router failure).

[root@percona1 ~]# clustercheck 
HTTP/1.1 503 Service Unavailable

Content-Type: Content-Type: text/plain



Node is *down*.

We can see that the node appears down, but we can still run some statements on it:

on node1:

| wsrep_local_state_comment  | Initialized (0)                      |
| wsrep_cert_index_size      | 2                                    |
| wsrep_cluster_conf_id      | 18446744073709551615                 |
| wsrep_cluster_size         | 1                                    |
| wsrep_cluster_state_uuid   | 8dccca9f-d4b8-11e1-0800-344f6b618448 |
| wsrep_cluster_status       | non-Primary                          |
| wsrep_connected            | ON                                   |
| wsrep_local_index          | 0                                    |
| wsrep_provider_name        | Galera                               |
| wsrep_provider_vendor      | Codership Oy info@codership.com      |
| wsrep_provider_version     | 2.1(r113)                            |
| wsrep_ready                | OFF                                  |

on node2:

| wsrep_local_state_comment  | Initialized (0)                      |
| wsrep_cert_index_size      | 2                                    |
| wsrep_cluster_conf_id      | 18446744073709551615                 |
| wsrep_cluster_size         | 1                                    |
| wsrep_cluster_state_uuid   | 8dccca9f-d4b8-11e1-0800-344f6b618448 |
| wsrep_cluster_status       | non-Primary                          |
| wsrep_connected            | ON                                   |
| wsrep_local_index          | 0                                    |
| wsrep_provider_name        | Galera                               |
| wsrep_provider_vendor      | Codership Oy info@codership.com      |
| wsrep_provider_version     | 2.1(r113)                            |
| wsrep_ready                | OFF                                  |

And if you test to use the mysql server:

[root@percona1 ~]# mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 868
Server version: 5.5.24

Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

percona1 mysql> use test
ERROR 1047 (08S01): Unknown command

If you try to insert data just while the communication problem occurs, here is what you will have:

percona1 mysql> insert into percona values (0,'percona1','peter');
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
percona1 mysql> insert into percona values (0,'percona1','peter');
ERROR 1047 (08S01): Unknown command

Is it possible anyway to have a two-node cluster?

So, by default, Percona XtraDB Cluster does the right thing; this is how it needs to work, and you don’t suffer critical problems when you have enough nodes.
But how can we deal with that and keep the service from stopping? If we check the list of parameters on Galera’s wiki, we can see that there are two options relating to this:

pc.ignore_quorum: Completely ignore quorum calculations. E.g. in case master splits from several slaves it still remains operational. Use with extreme caution even in master-slave setups, because slaves won’t automatically reconnect to master in this case.
pc.ignore_sb : Should we allow nodes to process updates even in the case of split brain? This is a dangerous setting in multi-master setup, but should simplify things in master-slave cluster (especially if only 2 nodes are used).

Let’s try first with ignoring quorum:

By ignoring the quorum, we ask the cluster not to perform the majority calculation to define the Primary Component (PC). A component is a set of nodes which are connected to each other, and when everything is ok, the whole cluster is one component. For example if you have 3 nodes, and 1 node gets isolated (2 nodes can see each other and 1 node can see only itself), we then have 2 components and the quorum calculation will be 2/3 (66%) on the 2 nodes communicating with each other and 1/3 (33%) on the single one. In this case the service will be stopped on the nodes where the majority is not reached. The quorum algorithm helps to select a PC and guarantees that there is no more than one primary component in the cluster.
In our 2-node setup, when the communication between the 2 nodes is broken, the quorum will be 1/2 (50%) on both nodes, which is not a majority... therefore the service is stopped on both nodes. In this case, service means accepting queries.
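
The majority rule above can be sketched in a few lines of Python (an illustration of the arithmetic only, not Galera's actual implementation):

```python
def has_quorum(partition_size, cluster_size):
    """A partition stays Primary only with a strict majority of the nodes."""
    return 2 * partition_size > cluster_size

print(has_quorum(2, 3))  # True:  the 2/3 partition keeps serving
print(has_quorum(1, 3))  # False: the isolated node stops accepting queries
print(has_quorum(1, 2))  # False: a 2-node split leaves no majority anywhere
```

The 50/50 case in the last line is exactly why a two-node cluster stops on both sides when the link breaks.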

Back to our test, we check that the data is the same on both nodes:

percona1 mysql> select * from percona;
+----+---------------+--------+
| id | inserted_from | name   |
+----+---------------+--------+
|  2 | percona1      | lefred |
|  3 | percona2      | kenny  |
+----+---------------+--------+
2 rows in set (0.00 sec)

percona2 mysql> select * from percona;
+----+---------------+--------+
| id | inserted_from | name   |
+----+---------------+--------+
|  2 | percona1      | lefred |
|  3 | percona2      | kenny  |
+----+---------------+--------+
2 rows in set (0.00 sec)

Add ignore_quorum to the configuration and restart both nodes:

set global wsrep_provider_options="pc.ignore_quorum=true"; (this currently does not seem to work properly; I needed to change it in my.cnf and restart the nodes)

wsrep_provider_options = "pc.ignore_quorum=true"

Break the connection between both nodes again:

iptables -A INPUT -d 192.168.70.3 -s 192.168.70.2 -j REJECT

and perform an insert; this first insert will take longer:

percona1 mysql> insert into percona values (0,'percona1','vadim');
Query OK, 1 row affected (7.77 sec)
percona1 mysql> insert into percona values (0,'percona1','miguel');
Query OK, 1 row affected (0.01 sec)

and on node2 you can also add records:

percona2 mysql> insert into percona values (0,'percona2','jay');
Query OK, 1 row affected (0.02 sec)

The wsrep status variables are like this now:

| wsrep_local_state_uuid     | e20f9da7-d509-11e1-0800-013f68429ec1 |
| wsrep_local_state_comment  | Synced (6)                           |
| wsrep_cluster_size         | 1                                    |
| wsrep_cluster_state_uuid   | e20f9da7-d509-11e1-0800-013f68429ec1 |
| wsrep_cluster_status       | Primary                              |
| wsrep_connected            | ON                                   |
| wsrep_ready                | ON                                   |

| wsrep_local_state_uuid     | e20f9da7-d509-11e1-0800-013f68429ec1 |

| wsrep_local_state_comment  | Synced (6)                           |
| wsrep_cluster_size         | 1                                    |
| wsrep_cluster_state_uuid   | e20f9da7-d509-11e1-0800-013f68429ec1 |
| wsrep_cluster_status       | Primary                              |
| wsrep_connected            | ON                                   |
| wsrep_ready                | ON                                   |

Then we fix the connection problem:

iptables -D INPUT -d 192.168.70.3 -s 192.168.70.2 -j REJECT

Nothing changes; the data stays different:

percona1 mysql> select * from percona;
+----+---------------+--------+
| id | inserted_from | name   |
+----+---------------+--------+
|  2 | percona1      | lefred |
|  3 | percona2      | kenny  |
|  5 | percona1      | liz    |
|  9 | percona1      | ewen   |
| 11 | percona1      | vadim  |
| 13 | percona1      | miguel |
+----+---------------+--------+
6 rows in set (0.00 sec)

percona2 mysql> select * from percona;
+----+---------------+--------+
| id | inserted_from | name   |
+----+---------------+--------+
|  2 | percona1      | lefred |
|  3 | percona2      | kenny  |
|  5 | percona1      | liz    |
|  9 | percona1      | ewen   |
| 10 | percona2      | jay    |
+----+---------------+--------+
5 rows in set (0.01 sec)

You can keep inserting data; it won’t be replicated and you will have two different versions of your data: it is inconsistent!

Also, when we restart a node, it will request an SST, or in certain cases fail to start like this:

120723 23:45:30 [ERROR] WSREP: Local state seqno (6) is greater than group seqno (5): states diverged. Aborting to avoid potential data loss. Remove '/var/lib/mysql//grastate.dat' file and restart if you wish to continue. (FATAL)
         at galera/src/replicator_str.cpp:state_transfer_required():34
120723 23:45:30 [Note] WSREP: applier thread exiting (code:7)
120723 23:45:30 [ERROR] Aborting

Of course all the data that was written on the node we just restarted is lost after the SST.

Now let’s try with pc.ignore_sb=true:

When the quorum algorithm fails to select a Primary Component, we have a split-brain condition. In our 2-node setup, when a node loses connection to its only peer, the default is to stop accepting queries to avoid database inconsistency. We can bypass this behaviour and ignore the split-brain by adding

wsrep_provider_options = "pc.ignore_sb=true" in my.cnf

Then we can insert into both nodes without any problem while the connection between the nodes is gone:

percona1 mysql> insert into percona values (0,'percona1','jaime');
Query OK, 1 row affected (0.02 sec)

percona1 mysql> select * from percona;
+----+---------------+--------+
| id | inserted_from | name   |
+----+---------------+--------+
|  2 | percona1      | lefred |
|  3 | percona2      | kenny  |
|  5 | percona1      | liz    |
|  9 | percona1      | ewen   |
| 11 | percona1      | vadim  |
| 13 | percona1      | miguel |
| 14 | percona1      | marcos |
| 15 | percona1      | baron  |
| 16 | percona1      | brian  |
| 17 | percona1      | jaime  |
+----+---------------+--------+
10 rows in set (0.00 sec)

percona2 mysql> insert into percona values (0,'percona2','daniel');
Query OK, 1 row affected (0.02 sec)

percona2 mysql> select * from percona;
+----+---------------+--------+
| id | inserted_from | name   |
+----+---------------+--------+
|  2 | percona1      | lefred |
|  3 | percona2      | kenny  |
|  5 | percona1      | liz    |
|  9 | percona1      | ewen   |
| 11 | percona1      | vadim  |
| 13 | percona1      | miguel |
| 14 | percona1      | marcos |
| 15 | percona1      | baron  |
| 16 | percona1      | brian  |
| 17 | percona2      | daniel |
+----+---------------+--------+
10 rows in set (0.00 sec)

When the connection is back, the two servers remain independent; they are now two single-node clusters.

We can see it in the log file:

120724 10:58:09 [Note] WSREP: evs::proto(86928728-d56d-11e1-0800-f7c4916d8330, GATHER, view_id(REG,7e6d285b-d56d-11e1-0800-2491595e99bb,2)) detected inactive node: 7e6d285b-d56d-11e1-0800-2491595e99bb
120724 10:58:09 [Warning] WSREP: Ignoring possible split-brain (allowed by configuration) from view:
view(view_id(REG,7e6d285b-d56d-11e1-0800-2491595e99bb,2) memb {
	7e6d285b-d56d-11e1-0800-2491595e99bb,
	86928728-d56d-11e1-0800-f7c4916d8330,
} joined {
	7e6d285b-d56d-11e1-0800-2491595e99bb,
} left {
} partitioned {
})
to view:
view(view_id(TRANS,7e6d285b-d56d-11e1-0800-2491595e99bb,2) memb {
	86928728-d56d-11e1-0800-f7c4916d8330,
} joined {
} left {
} partitioned {
	7e6d285b-d56d-11e1-0800-2491595e99bb,
})
120724 10:58:09 [Note] WSREP: view(view_id(PRIM,86928728-d56d-11e1-0800-f7c4916d8330,3) memb {
	86928728-d56d-11e1-0800-f7c4916d8330,
} joined {
} left {
} partitioned {
	7e6d285b-d56d-11e1-0800-2491595e99bb,
})
120724 10:58:09 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
120724 10:58:09 [Note] WSREP: forgetting 7e6d285b-d56d-11e1-0800-2491595e99bb (tcp://192.168.70.2:4567)
120724 10:58:09 [Note] WSREP: deleting entry tcp://192.168.70.2:4567
120724 10:58:09 [Note] WSREP: (86928728-d56d-11e1-0800-f7c4916d8330, 'tcp://0.0.0.0:4567') turning message relay requesting off
120724 10:58:09 [Note] WSREP: STATE_EXCHANGE: sent state UUID: b0dddbb3-d56d-11e1-0800-b62dc3759660
120724 10:58:09 [Note] WSREP: STATE EXCHANGE: sent state msg: b0dddbb3-d56d-11e1-0800-b62dc3759660
120724 10:58:09 [Note] WSREP: STATE EXCHANGE: got state msg: b0dddbb3-d56d-11e1-0800-b62dc3759660 from 0 (percona2)
120724 10:58:09 [Note] WSREP: Quorum results:
	version    = 2,
	component  = PRIMARY,
	conf_id    = 2,
	members    = 1/1 (joined/total),
	act_id     = 16,
	last_appl. = 0,
	protocols  = 0/4/2 (gcs/repl/appl),
	group UUID = e20f9da7-d509-11e1-0800-013f68429ec1
120724 10:58:09 [Note] WSREP: Flow-control interval: [8, 16]
120724 10:58:09 [Note] WSREP: New cluster view: global state: e20f9da7-d509-11e1-0800-013f68429ec1:16, view# 3: Primary, number of nodes: 1, my index: 0, protocol version 2

Now, when we set both options to true with a two-node cluster, it acts exactly as when only ignore_sb is enabled.
And if the local state seqno is greater than the group seqno, the node fails to restart. You need again to delete the grastate.dat file to request a full SST, and you lose some data again.

This is why two-node clusters are not recommended at all. If you only have storage for 2 nodes, using the Galera arbitrator is a very good alternative.

On a third node, instead of running Percona XtraDB Cluster (mysqld), just run garbd:

Currently there is no init script for garbd, but one is easy to write, as garbd can run in daemon mode using -d:

[root@percona3 ~]# garbd -a gcomm://192.168.70.2:4567 -g trimethylxanthine
2012-07-24 11:42:50.237  INFO: Read config: 
	daemon:  0
	address: gcomm://192.168.70.2:4567
	group:   trimethylxanthine
	sst:     trivial
	donor:   
	options: gcs.fc_limit=9999999; gcs.fc_factor=1.0; gcs.fc_master_slave=yes
	cfg:     
	log:     

2012-07-24 11:42:50.248  INFO: protonet asio version 0
2012-07-24 11:42:50.252  INFO: backend: asio
2012-07-24 11:42:50.254  INFO: GMCast version 0
2012-07-24 11:42:50.255  INFO: (eed227a2-d573-11e1-0800-b8b68845d409, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2012-07-24 11:42:50.255  INFO: (eed227a2-d573-11e1-0800-b8b68845d409, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2012-07-24 11:42:50.257  INFO: EVS version 0
2012-07-24 11:42:50.258  INFO: PC version 0
2012-07-24 11:42:50.258  INFO: gcomm: connecting to group 'trimethylxanthine', peer '192.168.70.2:4567'
2012-07-24 11:42:50.270  INFO: (eed227a2-d573-11e1-0800-b8b68845d409, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://192.168.70.3:4567 
2012-07-24 11:42:50.533  INFO: (eed227a2-d573-11e1-0800-b8b68845d409, 'tcp://0.0.0.0:4567') turning message relay requesting off
2012-07-24 11:42:51.080  INFO: declaring 577c0ad0-d570-11e1-0800-198f7634d8ec stable
2012-07-24 11:42:51.080  INFO: declaring 593e68c8-d570-11e1-0800-74c74791749b stable
2012-07-24 11:42:51.088  INFO: view(view_id(PRIM,577c0ad0-d570-11e1-0800-198f7634d8ec,8) memb {
	577c0ad0-d570-11e1-0800-198f7634d8ec,
	593e68c8-d570-11e1-0800-74c74791749b,
	eed227a2-d573-11e1-0800-b8b68845d409,
} joined {
} left {
} partitioned {
})
2012-07-24 11:42:51.268  INFO: gcomm: connected
2012-07-24 11:42:51.268  INFO: Changing maximum packet size to 64500, resulting msg size: 32636
2012-07-24 11:42:51.269  INFO: Shifting CLOSED -> OPEN (TO: 0)
2012-07-24 11:42:51.271  INFO: Opened channel 'trimethylxanthine'
2012-07-24 11:42:51.272  INFO: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 3
2012-07-24 11:42:51.273  INFO: STATE EXCHANGE: Waiting for state UUID.
2012-07-24 11:42:51.276  INFO: STATE EXCHANGE: sent state msg: f00ff3f4-d573-11e1-0800-db814c3ecb8f
2012-07-24 11:42:51.276  INFO: STATE EXCHANGE: got state msg: f00ff3f4-d573-11e1-0800-db814c3ecb8f from 0 (percona1)
2012-07-24 11:42:51.276  INFO: STATE EXCHANGE: got state msg: f00ff3f4-d573-11e1-0800-db814c3ecb8f from 1 (percona2)
2012-07-24 11:42:51.292  INFO: STATE EXCHANGE: got state msg: f00ff3f4-d573-11e1-0800-db814c3ecb8f from 2 (garb)
2012-07-24 11:42:51.292  INFO: Quorum results:
	version    = 2,
	component  = PRIMARY,
	conf_id    = 6,
	members    = 2/3 (joined/total),
	act_id     = 19,
	last_appl. = -1,
	protocols  = 0/4/2 (gcs/repl/appl),
	group UUID = e20f9da7-d509-11e1-0800-013f68429ec1
2012-07-24 11:42:51.292  INFO: Flow-control interval: [9999999, 9999999]
2012-07-24 11:42:51.292  INFO: Shifting OPEN -> PRIMARY (TO: 19)
2012-07-24 11:42:51.292  INFO: Sending state transfer request: 'trivial', size: 7
2012-07-24 11:42:51.297  INFO: Node 2 (garb) requested state transfer from '*any*'. Selected 0 (percona1)(SYNCED) as donor.
2012-07-24 11:42:51.297  INFO: Shifting PRIMARY -> JOINER (TO: 19)
2012-07-24 11:42:51.303  INFO: 2 (garb): State transfer from 0 (percona1) complete.
2012-07-24 11:42:51.308  INFO: Shifting JOINER -> JOINED (TO: 19)
2012-07-24 11:42:51.311  INFO: Member 2 (garb) synced with group.
2012-07-24 11:42:51.311  INFO: Shifting JOINED -> SYNCED (TO: 19)
2012-07-24 11:42:51.325  WARN: 0 (percona1): State transfer to 2 (garb) failed: -1 (Operation not permitted)
2012-07-24 11:42:51.328  INFO: Member 0 (percona1) synced with group.

If communication fails between node1 and node2, they will still communicate, and eventually send their changes, through the node running garbd (node3). If one node dies, the other keeps operating without any problem, and when the dead node comes back it performs an IST or SST.

In conclusion: a two-node cluster is possible with Percona XtraDB Cluster, but it is not advised at all because it generates a lot of problems when one of the nodes has an issue. It is much safer to add a third node, even a fake one running garbd.

If you plan to run a cluster with only two nodes anyway, don’t forget that:
– by default, if one peer dies or the communication between the two nodes is unstable, neither node will accept queries
– if you choose to ignore split-brain or quorum, you can very easily end up with inconsistent data
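For reference, ignoring split-brain and quorum is done through the Galera provider options. A my.cnf sketch (shown only to illustrate the settings discussed in this series; do NOT use this on a production two-node cluster):

```ini
[mysqld]
# DANGEROUS on a two-node cluster: each node keeps accepting writes
# even when it cannot see its peer, which easily diverges the data.
wsrep_provider_options="pc.ignore_sb=true;pc.ignore_quorum=true"
```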

Jul
24
2012
--

New variable slave_max_allowed_packet for slave servers

One month ago I wrote about how a big read_buffer_size could break replication. The bug is not solved yet, but there is now an official workaround to ease the problem, using a new configuration variable:

slave_max_allowed_packet

This new variable will be available in 5.1.64, 5.5.26, and 5.6.6, and establishes a separate limit on the maximum packet size for slave servers (the I/O thread). On a slave server, the maximum size of a packet is now checked against this variable instead of max_allowed_packet.
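A quick way to check and adjust it on a slave (a sketch; the value shown is illustrative, and the variable is assumed to be a dynamic global in the versions listed above):

```sql
-- Check the current limit on the slave
SHOW GLOBAL VARIABLES LIKE 'slave_max_allowed_packet';

-- Lower it from the 1GB default if desired (value in bytes),
-- then restart the slave threads so the I/O thread picks it up
SET GLOBAL slave_max_allowed_packet = 268435456;  -- 256MB
STOP SLAVE;
START SLAVE;
```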

The default value is 1GB, which means that if we don’t tune the variable, the I/O thread can read up to that amount of data. This matters if we use the binary logs for PITR. The variable solves only one part of the problem: the slave no longer stops working, but binary log events can still be bigger than max_allowed_packet. During a recovery process, slave_max_allowed_packet is not going to help us, and the recovery could fail.

Conclusion

Now we have two workarounds for the same problem: either use a small value for read_buffer_size (less than max_allowed_packet), or upgrade to a version that includes slave_max_allowed_packet.
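In my.cnf terms, the two workarounds look roughly like this (the values are illustrative, not recommendations):

```ini
[mysqld]
# Workaround 1: keep read_buffer_size below max_allowed_packet
max_allowed_packet   = 32M
read_buffer_size     = 16M

# Workaround 2 (5.1.64 / 5.5.26 / 5.6.6 and later): rely on the
# separate slave-side limit instead
slave_max_allowed_packet = 1G
```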

Training

In September I’m going to deliver MySQL Training (Sep 3-6) in Madrid. This is the first time we are going to deliver the training in Spanish :) If you’re interested just click on the previous link to get more information.


Jul
21
2012
--

Announcing Percona Server 5.5.25a-27.1

Percona is glad to announce the release of Percona Server 5.5.25a-27.1 on July 21st, 2012 (Downloads are available here and from the Percona Software Repositories).

Based on MySQL 5.5.25a and including all the bug fixes in it, Percona Server 5.5.25a-27.1 is now the current stable release in the 5.5 series. All of Percona‘s software is open source and free; all the details of the release can be found in the 5.5.25a-27.1 milestone at Launchpad.

New Features:

  • Percona Server has extended the standard behavior of the secure-file-priv variable. When it is used with no argument, the LOAD_FILE() function will always return NULL, and the LOAD DATA INFILE and SELECT INTO OUTFILE statements will fail with the following error: “The MySQL server is running with the –secure-file-priv option so it cannot execute this statement”. LOAD DATA LOCAL INFILE is not affected by the –secure-file-priv option and will still work when the option is used without an argument.
  • Percona Server now uses thread based profiling by default, instead of process based profiling. This was implemented because with process based profiling, threads on the server, other than the one being profiled, can affect the profiling information.

Bugs Fixed:

  • Percona Server would crash if userstats were enabled with any replication configured. This was a regression introduced with the SSL connection count in statistics tables in Percona Server 5.5.24-26.0. Bug fixed #1008278 (Vladislav Lesin).
  • The PAM authentication plugin was located in different directories in the 32-bit and 64-bit binary tarballs. Bug fixed #1007271 (Ignacio Nin).
  • Querying I_S.GLOBAL_TEMPORARY_TABLES or TEMPORARY_TABLES would crash threads working with temporary tables. Bug fixed #951588 (Laurynas Biveinis).
  • The mysqld crash message wasn’t pointing to the Percona Server bug tracker. Bug fixed #1007254 (Vadim Tkachenko).
  • If a tablespace had been created with MySQL 5.0 or older, importing that table could crash Percona Server in some cases. Bug fixed #1000221 (Alexey Kopytov).
  • A server started with skip-innodb would crash on SELECT * FROM INNODB_TABLE_STATS or INNODB_INDEX_STATS. Bug fixed #896439 (Stewart Smith).
  • Fixed a typo for log_slow_verbosity in the code. Bug fixed #987737 (Stewart Smith).
  • Removed some patch-based source code management leftovers from the bzr branch migration. Bug fixed #988383 (Stewart Smith).
  • Fixed upstream MySQL bug #60743, a typo in cmake/dtrace.cmake that made DTrace unusable. Bug fixed #1013455 (Stewart Smith).

Release notes for Percona Server 5.5.25a-27.1 are available in our online documentation.

Jul
18
2012
--

News Flash: SQL Injection Still a Problem

The threat of SQL injection has appeared prominently in the news recently:

SQL injection was documented as a security threat in 1998, but new incidents still occur every month. Through honest mistakes, developers fail to defend against this means of attack, and the security of everyone’s online data is at risk because of it.

Most computer professionals have heard of SQL injection, but advice about how to prevent this issue is generally incomplete and oversimplified.

On July 25, 2012, I will present SQL Injection Myths and Fallacies as a Percona webinar, to help shed light on the nature of the problem and on effective defenses against it. Whether you’re a software developer who uses databases or a DBA responsible for secure operations, I hope you register for this webinar and join me as an evangelist for secure web programming.
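To make the problem concrete, here is a minimal sketch in Python using SQLite (the same principle applies to MySQL or any other SQL database): building a query by string formatting lets user input become SQL, while a parameterized query keeps it as data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

evil = "nobody' OR '1'='1"

# Vulnerable: the input closes the string literal and injects SQL,
# so the WHERE clause matches every row in the table.
query = "SELECT COUNT(*) FROM users WHERE name = '%s'" % evil
print(conn.execute(query).fetchone()[0])  # 1 -- injection matched all rows

# Safe: the placeholder sends the input as data, never as SQL.
print(conn.execute(
    "SELECT COUNT(*) FROM users WHERE name = ?", (evil,)
).fetchone()[0])  # 0 -- no user is literally named the injected string
```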

Your online information is at risk too!

Jul
11
2012
--

Meet Percona Team at OSCON

Yes, we will be at OSCON next week. I will be talking about Optimizing MySQL Configuration and hosting a BOF on MySQL Sharding, Replication and Clustering. If you’re interested in any of these technologies, please come by and share your story. I would love to see both users and technology vendors working in this field.

Jay Janssen is going to talk about Writing non-blocking code for interaction with data systems and web services in Node.js and Perl, and about Running a high performance LAMP stack on a $20 virtual server.

We will also be in the Expo Hall. Please come by and say hi; our booth number is 524.

Jul
10
2012
--

Percona Playback 0.3 development release

I’m glad to announce the third Percona Playback release – another alpha release of a new software package designed to replay database server load. The first two versions were released in April, just in time for my talk at the Percona Live MySQL Conference and Expo: Replaying Database Load with Percona Playback.

This is still very much under development, so there are likely to be bugs. Please feel free to report them here: https://bugs.launchpad.net/percona-playback

Percona Playback is designed to replay database load captured either in a MySQL slow query log or a tcpdump capture of the MySQL protocol exchange between client and server.

It can replay the load either as fast as possible or in accurate mode, where it tries to replay the load over the same wall-clock time as the capture.

Current Notable Limitations:

  • tcpdump replay: IPv4 only
  • tcpdump replay: no support for server side prepared statements

Build requirements:

  • libtbb-dev (Intel Threading Building blocks)
  • boost (including boost program_options)
  • intltool
  • gettext
  • libpcap-dev
  • libcloog-ppl (if using gcc 4.6)
  • libmysqlclient-dev
  • libdrizzle-dev
  • pkg-config

Source Tarball: percona-playback-0.3.tar.gz (md5sig)

We’ve tested building on CentOS/RHEL 6, Ubuntu 10.04LTS (lucid), Ubuntu 12.04 (precise) and Debian 6.

You can build using the standard: ./configure && make && make install

Run the test suite: make check

We are planning binary packages for the next development milestone.
