May
09
2018
--

Does the Version Number Matter?

ProxySQL

ProxySQLYes, it does! In this blog post, I am going to share my recent experiences with ProxySQL and how important the database software version number can be.

Migration

I was working on a migration to Percona XtraDB Cluster (PXC) with ProxySQL, fortunately on a staging environment first so we could catch any issues (like this one).

We installed Percona XtraDB Cluster and ProxySQL on the staging environment and repointed the staging application to ProxySQL. At first, everything looked great. We were able to do some application tests and everything looked good. I advised the customer to do more testing to make sure everything works well.

Something is wrong, but what?

A few days later the customer noticed that their application was not working properly.

We started investigating. Everything seemed well-configured, and the only thing we could see in the application log was the following:

2018-04-20 11:28:31,169 [ default-threads - 42] ERROR Error in lifecycle management : org.hibernate.StaleStateException : Batch update returned unexpected row count from update [0]; actual row count: 0; expected: 1 {it.tasgroup.monetica.gt.lifecycle.LifeCycle:line 103} (method: error)
org.hibernate.StaleStateException: Batch update returned unexpected row count from update [0]; actual row count: 0; expected: 1
at org.hibernate.jdbc.Expectations$BasicExpectation.checkBatched(Expectations.java:85)
at org.hibernate.jdbc.Expectations$BasicExpectation.verifyOutcome(Expectations.java:70)

Based on this error I still did not know what is wrong. Were some of the queries failing because of PXC, ProxySQL or some other settings?

We redirected the application to one of the nodes from PXC, and everything worked fine. We tried HAproxy as well, and everything worked again. We knew something was happening around ProxySQL which is causing the problem. But we still could not find the problem. Every query went through ProxySQL without any issue.

Debug log is our savior

The customer finally enabled the application debug logging so we could see which query was failing:

delete from TABLENAME where ID='11' and Timestamp ='2018-04-20 16:15:03';

I was confused at first: this is a kind of simple query, what could be wrong? Let’s investigate it on the cluster. When I tried to select the data on the cluster, it gave me back zero results. That’s OK, maybe the row was already deleted?

For this investigation, the slow query logging was enabled and long_query_time set to 0 to log all the queries. I checked the slow query log looking for queries like this. What I found helped me realize what the problem was:

delete from TABLENAME where ID=10 and Timestamp ='2018-04-20 11:17:22.35';
delete from TABLENAME where ID=24 and Timestamp ='2018-04-20 11:17:31.602';
delete from TABLENAME where ID=43 and Timestamp ='2018-04-20 11:18:13.2';
delete from TABLENAME where ID=22 and Timestamp ='2018-04-20 11:11:02.854';
delete from TABLENAME where ID=11 and Timestamp ='2018-04-20 11:21:57';
delete from TABLENAME where ID=64 and Timestamp ='2018-04-20 11:18:34';
delete from TABLENAME where ID=47 and Timestamp ='2018-04-20 10:38:35';
delete from TABLENAME where ID=23 and Timestamp ='2018-04-20 11:30:03';

I hope you see the difference! The first four lines have fractional seconds! At that time, the application was pointed to the cluster directly. So ProxySQL cut off the fractional seconds? That would be a nasty bug.

I checked the application log again with the debug information, and I could see the application does not even use the fractional seconds in the queries when it points to ProxySQL. This is why the query was failing (does not delete any rows), because in the table all the rows had fractional seconds but the queries were not using them.

So why does the application not use fractional seconds with ProxySQL?

First of all, fractional seconds were introduced in MySQL 5.6.4. The application is a Java-based application with Jboss and Hibernate. I knew ProxySQL reports MySQL 5.5. Maybe the application/connector reads the version number and makes decisions based on that?

It was quite easy to test this theory by just changing the version number in ProxySQL like this:

update global_variables set variable_value="5.7.21" where variable_name='mysql-server_version';
load mysql variables to run;save mysql variables to disk;

The application had to be restarted (probably it was caching the previous settings) but after that everything was working as expected.

But be careful, now it will report 5.7.21 for all the hostgroups. What if you have multiple hostgroups with different MySQL versions? It would be nice if you could define this for every hostgroup.

Conclusion

The solution was very easy, but finding the source of the problem took a long time. If you are planning to use ProxySQL, I would always recommend changing the mysql-server_version to match to the underlying MySQL server version number because who knows which connector or application checks the version and makes a decision based on that.

There is another example here where Marco Tusa had a very similar problem with a Java connector.

The post Does the Version Number Matter? appeared first on Percona Database Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com