May
13
2019
--

PostgreSQL Minor Versions Released May 9, 2019

PostgreSQL Logo

Usually, the PostgreSQL Community releases minor patches on the Thursday of the second week of the second month of each quarter. This can vary depending on the nature of the fixes. For example, if the fixes include critical fixes as a postgres community response to security vulnerabilities or bugs that might affect data integrity. The previous minor release happened on February 14, 2019. In this case, the fysnc failure issue was corrected, along with some other enhancements and fixes.

The latest minor release, published on May 9, 2019, has some security fixes and bug fixes for these PostgreSQL Major versions.

PostgreSQL 11 (11.3)
PostgreSQL 10 (10.8)
PostgreSQL 9.6 (9.6.13)
PostgreSQL 9.5 (9.5.17)
PostgreSQL 9.4 (9.4.22)

Let’s take a look at some of the security fixes in this release.

Security Fixes

CVE-2019-10130 (Applicable to PostgreSQL 9.5, 9.6, 10 and 11 versions)

When you

ANALYZE

a table in PostgreSQL, statistics of all the database objects are stored in

pg_statistic

. The query planner uses this statistical data is and it may contain some sensitive data, for example, min and max values of a column. Some of the planner’s selectivity estimators apply user-defined operators to values found in

pg_statistic

. There was a security fix :

CVE-2017-7484

in the past, that restricted a leaky user-defined operator from disclosing some of the entries of a data column.

Starting from PostgreSQL 9.5, tables in PostgreSQL not only have SQL-standard privilege system but also row security policies. To keep it short and simple, you can restrict a user so that they can only access specific rows of a table. We call this RLS (Row-Level Security).

CVE-2019-10130

is about restricting a user who has SQL permissions to read a column but is forbidden to read some rows due to RLS policy from discovering the restricted rows through a leaky operator. Through this patch, leaky operators to statistical data are only allowed when there is no relevant

RLS

policy. We shall see more about

RLS

in our future blog posts.

CVE-2019-10129 (Applicable to PostgreSQL 11 only)

This fixes the situation where a user could access arbitrary back end memory with correctly crafted DDLs. This is only applicable to PostgreSQL 11 version and no other major release has been impacted.

Fixes roundup

There have been a good number of fixes around partitioning in both PostgreSQL 10 and 11 versions that included a fix to behaviour of an UPDATE or an INSERT on certain scenarios, failure while ALTER INDEX .. ATTACH PARTITION, tuple routing in multi-level partitioned tables that have dropped attributes. etc. .

We should also mention a couple of interesting bugs addressed in this new release.

Bug # 15631 is about the catalog corruption when a temporary table on commit drop is created with an identity column. Due to this issue, PostgreSQL 11.1 would not allow the creation of temporary tables, and you might have observed this error message prior to this release:

# create temporary table foo ( bar int generated by default as identity ) on
# commit drop;
# CREATE TABLE
# \q
# psql postgres postgres
# create temporary table a (b varchar);
ERROR: could not open relation with OID 16388
LINE 1: create temporary table a (b varchar);

Another bug that got fixed is Bug # 15734 which is about a

Walsender

process crashing when executing

SHOW ALL

using replication protocol.

Some more fixes include:

1. Fix

CREATE VIEW

to allow zero-column views.
2. Add missing support for the

CREATE TABLE IF NOT EXISTS .. AS EXECUTE

  .. statement.
3. Fix incompatibility of GIN-index WAL records that were introduced in 11.2, 10.7, 9.6.12, 9.5.16, and 9.4.21 that affected replica servers running these versions reading in changes to GIN indexes from primary servers of older versions.
4. Make

pg_verify_checksums

verify that the data directory it’s pointed at is of the right PostgreSQL version.
5. Fixes for

postgres_fdw

where an

UPDATE

could lead to incorrect results or a crash.
6. Several memory leak fixes as well as fixes to management of dynamic shared memory.

The postgres community have fixed more than just the bug fixes and enhancements we’ve highlighted. We would recommend you to go through the release notes, which you can access by clicking the appropriate release in the beginning of this blog post. Let us know if you have any specific questions you’d like us to address.

Applying updates

We always recommend that you keep your PostgreSQL databases updated to the latest minor versions. Be aware that applying a minor release might need a restart after updating the new binaries.

Here is the sequence of steps you should follow to upgrade to the latest minor versions after thorough testing :

1. Shutdown the PostgreSQL database server
2. Install the updated binaries
3. Restart your PostgreSQL database server

Most of the time, you can choose to update the minor versions in a rolling fashion in a master-slave (replication) setup, because it avoids downtime for both reads and writes simultaneously. PostgreSQL logoFor a rolling style update, you could perform the update on one server after another. However, the best method that we’d almost always recommend is – shutdown, update and restart all instances at once.

If you are currently running your databases on PostgreSQL 9.3.x or earlier, we recommend that you to prepare a plan to upgrade your PostgreSQL databases to a supported version ASAP. We’ve previously published some blog posts about the various options for upgrading your PostgreSQL databases to a supported major version, and you can read our post archive here. Please subscribe to our blog posts so that you can see more interesting topics around PostgreSQL.

And don’t forget that at Percona Live in Austin, May 28-30 2019, we’ll have two days of PostgreSQL content in a postgres dedicated track. Please see all our PostgreSQL talks here.

Sep
20
2018
--

MyRocks Disk Full Edge Case

problem in MyRocks

MyRocks disk full bugRocksDB engine—and it’s MySQL implementation MyRocks—is a very good alternative engine for MySQL. It has proven to be very efficient and stable for many workloads, including those of large scale. However, it is still a relative newborn in the MySQL ecosystem, and has only a small fraction of the adoption rate of InnoDB. That means it is not so well tested at all possible edge cases, and may have many unreported bugs. One known bug is discussed here: if you are a MyRocks user, it’s important that you are aware of the possibility of lost data in the specific circumstances described below.

In writing this article, I want to broadcast a wider warning, as the problem I found is pretty serious and could lead to a very unpleasant situation.

The problem is related to not a very edge case after all – disk full. The result could be extremely bad, though. After printing some errors, the RocksDB engine continues to message clients that consecutive writes are accepted, while they are, in fact, lost! In addition, depending on the workload and at which moment the disk ran out of space, in a worst case scenario we may lose access to tables that use the RocksDB engine completely.

Let me show how bad the situation could be, using a basic sandbox and sysbench test example. I have a sandbox with MyRocks installed:

mysql [localhost] {msandbox} ((none)) > show engines\G
*************************** 1. row ***************************
Engine: ROCKSDB
Support: YES
Comment: RocksDB storage engine
Transactions: YES
XA: YES
Savepoints: YES
mysql [localhost] {msandbox} ((none)) > select @@version,@@version_comment;
+-----------+----------------------------------------------------+
| @@version | @@version_comment |
+-----------+----------------------------------------------------+
| 5.7.22-22 | Percona Server (GPL), Release 22, Revision f62d93c |
+-----------+----------------------------------------------------+
1 row in set (0.00 sec)

Once my test system had not so much free disk space remaining, I ran this simple sysbench prepare command:

$ sysbench /usr/share/sysbench/oltp_insert.lua --mysql_storage_engine=rocksdb --table-size=1000000 --tables=4 --mysql-db=db2 --mysql-user=root --mysql-password=msandbox --mysql-socket=/tmp/mysql_sandbox5723.sock --threads=4 --time=200 --report-interval=1 --events=0 --db-driver=mysql prepare
sysbench 1.0.15 (using bundled LuaJIT 2.1.0-beta2)
Initializing worker threads...
Creating table 'sbtest2'...
Creating table 'sbtest3'...
Creating table 'sbtest1'...
Creating table 'sbtest4'...
Inserting 1000000 records into 'sbtest2'
Inserting 1000000 records into 'sbtest1'
Inserting 1000000 records into 'sbtest3'
Inserting 1000000 records into 'sbtest4'
Creating a secondary index on 'sbtest4'...
FATAL: mysql_drv_query() returned error 1105 ([./.rocksdb/db2.sbtest4_k_4_4_0.bulk_load.tmp] bulk load error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000101.sst: No space left on device) for query 'CREATE INDEX k_4 ON sbtest4(k)'
FATAL: `sysbench.cmdline.call_command' function failed: /usr/share/sysbench/oltp_common.lua:238: SQL error, errno = 1105, state = 'HY000': [./.rocksdb/db2.sbtest4_k_4_4_0.bulk_load.tmp] bulk load error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000101.sst: No space left on device
Creating a secondary index on 'sbtest2'...
Creating a secondary index on 'sbtest1'...
Creating a secondary index on 'sbtest3'...
FATAL: mysql_drv_query() returned error 1105 ([./.rocksdb/db2.sbtest2_k_2_5_0.bulk_load.tmp] bulk load error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000101.sst: No space left on device) for query 'CREATE INDEX k_2 ON sbtest2(k)'
FATAL: `sysbench.cmdline.call_command' function failed: /usr/share/sysbench/oltp_common.lua:238: SQL error, errno = 1105, state = 'HY000': [./.rocksdb/db2.sbtest2_k_2_5_0.bulk_load.tmp] bulk load error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000101.sst: No space left on device
FATAL: mysql_drv_query() returned error 1105 ([./.rocksdb/db2.sbtest3_k_3_6_0.bulk_load.tmp] bulk load error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000101.sst: No space left on device) for query 'CREATE INDEX k_3 ON sbtest3(k)'
FATAL: `sysbench.cmdline.call_command' function failed: /usr/share/sysbench/oltp_common.lua:238: SQL error, errno = 1105, state = 'HY000': [./.rocksdb/db2.sbtest3_k_3_6_0.bulk_load.tmp] bulk load error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000101.sst: No space left on device
FATAL: mysql_drv_query() returned error 1105 ([./.rocksdb/db2.sbtest1_k_1_7_0.bulk_load.tmp] bulk load error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000101.sst: No space left on device) for query 'CREATE INDEX k_1 ON sbtest1(k)'
FATAL: `sysbench.cmdline.call_command' function failed: /usr/share/sysbench/oltp_common.lua:238: SQL error, errno = 1105, state = 'HY000': [./.rocksdb/db2.sbtest1_k_1_7_0.bulk_load.tmp] bulk load error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000101.sst: No space left on device

This resulted in the following errors being printed to the error log:

2018-08-10T11:01:35.379611Z 5 [Note] RocksDB: Begin index creation (0,260)
2018-08-10T11:01:40.105596Z 3 [Note] RocksDB: Begin index creation (0,261)
2018-08-10T11:01:40.838902Z 4 [Note] RocksDB: Begin index creation (0,262)
2018-08-10T11:01:40.924093Z 6 [Note] RocksDB: Begin index creation (0,263)
2018-08-10T11:03:14.868958Z 0 [ERROR] RocksDB: Error detected in background, Status Code: 5, Status: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000101.sst: No space left on device
2018-08-10T11:03:14.868986Z 0 [ERROR] RocksDB: BackgroundErrorReason: 0
2018-08-10T11:03:14.869003Z 0 [ERROR] LibRocksDB:[/mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/min-stretch-x64/percona-server-5.7.22-22/storage/rocksdb/rocksdb/db/db_impl_compaction_flush.cc:1293] Waiting after background flush error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000101.sst: No space left on deviceAccumulated background error counts: 1
2018-08-10T11:03:15.430064Z 0 [ERROR] LibRocksDB:[/mnt/workspace/percona-server-5.7-binaries-release-rocks/label_exp/min-stretch-x64/percona-server-5.7.22-22/storage/rocksdb/rocksdb/db/db_impl_compaction_flush.cc:1373] Waiting after background compaction error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000102.sst: No space left on device, Accumulated background error counts: 2
2018-08-10T11:03:18.538866Z 9 [Note] RocksDB: Begin index creation (0,268)
2018-08-10T11:03:19.897797Z 9 [ERROR] Error finishing bulk load.
2018-08-10T11:03:23.306957Z 10 [Note] RocksDB: Begin index creation (0,269)
2018-08-10T11:03:23.509448Z 8 [Note] RocksDB: Begin index creation (0,270)
2018-08-10T11:03:23.539705Z 7 [Note] RocksDB: Begin index creation (0,271)
2018-08-10T11:03:24.536092Z 10 [ERROR] Error finishing bulk load.
2018-08-10T11:03:24.659233Z 7 [ERROR] Error finishing bulk load.
2018-08-10T11:03:24.736380Z 8 [ERROR] Error finishing bulk load.

OK, so I resized my data partition online (no MySQL service restart) and gave it more free disk space. And here is what happened next. I tried to see if MySQL/MyRocks works now:

mysql [localhost] {msandbox} (db1) > select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
1 row in set (0.33 sec)
mysql [localhost] {msandbox} (db1) > insert into sbtest1 set k=1;
Query OK, 1 row affected (0.01 sec)
mysql [localhost] {msandbox} (db1) > select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
1 row in set (0.32 sec)
mysql [localhost] {msandbox} (db1) > delete from db1.sbtest2 where id<100;
Query OK, 99 rows affected (0.01 sec)
mysql [localhost] {msandbox} (db1) > select count(*) from db1.sbtest2;
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
1 row in set (0.32 sec)

That’s the worst part – there’s no error returned to the client, and yet my new writes are basically gone!

The good thing is we can still read from the tables, so mysqldump or mydumper is the way to go now, BEFORE you attempt to restart the service:

mysql [localhost] {msandbox} (db1) > select id from db1.sbtest2 where id<10;
+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
+----+
9 rows in set (0.00 sec)

When I later try to restart the service, it crashes on shutdown. In some cases, the database is able to recover and offer access to the old RocksDB data (though the new data, written above, are still lost).

In other cases, though, your RocksDB won’t load after restart:

mysql [localhost] {msandbox} (db1) > show create table sbtest1\G
ERROR 1286 (42000): Unknown storage engine 'ROCKSDB'

and using this option:

rocksdb_wal_recovery_mode = 3

…helps to recover the engine to usable state.

Ultimately, the worst case scenario is that you end up with the MyRocks engine dead and unable to recover, no matter what options you try. We can see this in the error log when I try to start the service:

2018-08-10T09:33:18.756868Z 0 [Note] RocksDB: 2 column families found
2018-08-10T09:33:18.756993Z 0 [Note] RocksDB: Column Families at start:
2018-08-10T09:33:18.757016Z 0 [Note] cf=default
2018-08-10T09:33:18.757022Z 0 [Note] write_buffer_size=67108864
2018-08-10T09:33:18.757027Z 0 [Note] target_file_size_base=67108864
2018-08-10T09:33:18.757043Z 0 [Note] cf=__system__
2018-08-10T09:33:18.757048Z 0 [Note] write_buffer_size=67108864
2018-08-10T09:33:18.757052Z 0 [Note] target_file_size_base=67108864
2018-08-10T09:33:18.811671Z 0 [ERROR] RocksDB: Could not get index information for Index Number (0,282), table db3.sbtest4
2018-08-10T09:33:18.811741Z 0 [ERROR] RocksDB: Failed to initialize DDL manager.
2018-08-10T09:33:18.811795Z 0 [ERROR] Plugin 'ROCKSDB' init function returned error.
2018-08-10T09:33:18.811812Z 0 [ERROR] Plugin 'ROCKSDB' registration as a STORAGE ENGINE failed.

Unfortunately, we were not able to find any workaround for this ultimate worst case situation, and so recovering data—including all other RocksDB tables not touched during the incident—may be very hard.

The problem, which is obviously a bug, is reported here: https://jira.percona.com/browse/PS-4706.

Fortunately, the engineers at Facebook—the original RocksDB founder—are aware of the issue and have already released a simple patch. This allows the engine simply to crash instead of allowing lost writes. While this is not a perfect solution, it at least protects you from losing data. Related commit details can be found here: https://github.com/facebook/mysql-5.6/commit/4a7d3fb8f96c96c2be10d61a7796ccac7610a5d6.

At Percona, we accepted that fix, and the latest Percona Server for MySQL 5.7.23 version, already has it incorporated: https://github.com/percona/percona-server/commit/d0d5635573ebc4f157e36a989d14f7ee6f87c410

So, with the upgraded version, we instead will see this kind of error log event:

2018-09-19T08:12:14.496412Z 0 [ERROR] RocksDB: Error detected in background, Status Code: 5, Status: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000104.sst: No space left on device
2018-09-19T08:12:14.496435Z 0 [ERROR] RocksDB: BackgroundErrorReason: 1
2018-09-19T08:12:14.496451Z 0 [ERROR] LibRocksDB:[/mnt/workspace/percona-server-5.7-binaries-release-rocks-new/label_exp/min-stretch-x64/test/percona-server-5.7.23-23/storage/rocksdb/rocksdb/db/db_impl_compactio
n_flush.cc:1472] Waiting after background compaction error: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000104.sst: No space left on device, Accumulated background error counts: 1
2018-09-19T08:12:14.511910Z 12 [ERROR] RocksDB: failed to write to WAL, Status Code: 5, Status: IO error: No space left on deviceWhile appending to file: ./.rocksdb/000104.sst: No space left on device
2018-09-19T08:12:14.511935Z 12 [ERROR] MyRocks: aborting on WAL write error.
08:12:14 UTC - mysqld got signal 6 ;

Therefore, all MyRocks users are advised to upgrade ASAP and if that’s not possible, you should at least double check the disk space monitoring and alerting.

The post MyRocks Disk Full Edge Case appeared first on Percona Database Performance Blog.

May
11
2015
--

How Percona Support handles bugs

How Percona Support handles bugsOne of the great values of a Percona Support contract is that we provide bug fixes for covered software, and not just support in terms of advice on how to use it. This is the skill which is most likely missing from in-house for most customers, as it requires a team with code knowledge to build and test infrastructure – something only a few companies can afford to invest in.

There is a lot of misunderstanding about bugs. What is a bug? What is a feature? What is a repeatable bug? How will Percona troubleshoot the bug? In this post I will answer some of the questions about this.

Bugs vs. Features ? One thing a lot of people have a hard time understanding is the difference between a bug and a feature, or when software was designed to work a certain way which might be unwelcome. There is a gray line here, but you need to expect that some of the things you consider to be bugs will be seen as behavior-change features and will be considered as such.

Unfixable Bugs ? There are some behaviors that any sane person would call a bug, but which arise from design limitations or oversight that are impossible to fix in the current GA version without introducing changes that would destabilize it. Such bugs will need to be fixed in the next major GA release or sometimes even further in the future. Some bugs are not bugs at all but rather design tradeoffs made. These can’t be “fixed” unless different design tradeoffs are chosen.

Workaround ? There are going to be unexpected behaviors, unfixable bugs and bugs that take awhile to fix, so your first practical response to running into the bug is often finding a workaround which does not expose it. The Percona Support team will help find a workaround that causes minimal impact to your business, but be prepared: changes to the application, deployed version, schema or configuration will often be required.

Emergencies ? When you have an emergency, our focus is to restore the system to working order. In a complex system a bug fix can often not be delivered in a short period of time, which typically means finding a workaround.

Bug Turnaround ? It is not possible to guarantee the turnaround on a bug fix, as all bugs are different. Some bugs are rather trivial and we might be able to provide a hotfix 24 hours after we have a repeatable test case. In other cases the bug might be complicated and take weeks of engineering to fix or even might be impossible to fix in the current GA version.

Verified Bug Fixes ? When you submit the bug we have to verify if it is actually being a bug. In many cases it might be intended behavior; in others, a user mistake. It is also possible that the behavior has happened once and can’t be repeated. Having a repeatable test case that reveals the bug is the best way to have a bug fixed quickly. You might be able to create a repeatable test case, or our support team might be able to help you create the test case.

Sporadic Bugs ? These are very hard bug types that happen sporadically over a period of time. For example, you might have a system crash once every 3 months with no way to repeat it. The cause of such bugs can be very complicated; for example, a buffer overrun in one piece of code can cause corruption and crash in another place hours later. There are a number of diagnostic tools that exist for such bugs, but generally they take quite awhile to resolve. In addition, without a repeatable test case, it is often impossible to verify that the proposed fix actually resolves the bug.

Environmental Bugs ? Some bugs are caused by what can be called your environment. It could be some hardware bugs or incompatibilities, a build not quite compatible with your version of  operating system, operating system bugs, etc. In some cases we can very clearly point to the environment problems. In others we can suspect the environment is an issue and we may ask you to see if the bug also happens in another environment, such as different hardware or OS installation.

Hot Fixes ? As our default policy we fix bugs in the next release of our software so it can go through the full QA cycle, be properly documented, etc. If you have implemented a workaround and you can wait until the next release, this is the best choice. If not, with the Percona Platinum Support contract, we can provide you with a hotfix that is a special build containing the version of the software you’re running, and with only the bug fix of interest applied. Hotfixes are especially helpful if you’re not looking to do a full software upgrade – requiring several revisions – but want to validate the fix with the minimum possible changes. Hotfixes might also be different from the final bug fix that goes into the GA release. With hotfixes, our goal is to provide a working solution for you faster. Afterward we may optimize or re-architect the code, come up with better option names, etc.

Bug Diagnostics ? Depending on the nature of the bug there are multiple tools that our support team will use for diagnostics – finding a way to fix the bug. To set expectations correctly, it can be a very involved process, where you might need to provide a lot of information or try things on your system, such as:

  • Test case. If you have a test case that can be repeated by the Percona team to trigger the bug, the diagnostic problem is solved from the customer side. Internal debugging starts at this point. It might not be easy to get to that.
  • If we have a crash that we can’t repeat on our system we often will ask you to enable “core” file, or run the program under a debugger so we can get more information when the crash happens.
  • If the problem is related to performance, you should be ready to gather both MySQL information such as EXPLAIN, status counters, information from performance schema, etc., along with system level information such as pt-pmp output,  pt-stalk,  oprofile, perf, etc.
  • If the problem is a “deadlock,” we often need information from gdb about the full state of the system. Information from processlist, performance_schema, SHOW ENGINE INNODB STATUS can also be helpful.
  • It can be very helpful when you have a test system on which you can repeat the problem in your environment and where you can experiment without impacting production. It is not possible in all cases, but is very helpful.
  • Sometimes, for hard-to-repeat bugs, we will need to run a special diagnostics build that provides us with additional debug information. In others, we might need to run a debug build or do a run under valgrind or other software designed to catch bugs. It often has a large performance impact, so it is good to see how your workload can be scaled down in order for this to be feasible.
  • Depending on your environment we might need to login to troubleshoot your bug or might request that you upload the data needed to repeat the bug in our lab (assuming it is not too sensitive). In cases where direct login is not possible, we can help you to come to a repeatable test case via phone, chat, or email. Using screen sharing can also be very helpful.

Bugs and Non-Percona Software ? Percona Support covers some software not produced by Percona. For open source software, if it is not exempt from bug fix support, we will provide the custom build with a bug fix as well as provide the suggested fix to the software maintainer for its possible inclusion in the next release. For example, if we find a bug in the MySQL Community Edition, we will pass our suggested fix to the MySQL Engineering team at Oracle. For other software that is not open source, such as Amazon RDS, we can help to facilitate creation and submission of a repeatable test case and workaround, but we can’t provide a fix as we do not have access to the source code.

In Conclusion ? When I think about software bugs, I find some good parallels with human “bugs” (diseases).  Some issues are trivial to diagnose and the fix is obvious. Others might be very hard to diagnose ? I guess many of us have been in a situation where you visit doctor after doctor and tell them your symptoms, and they run tests but still can’t figure out what’s wrong. Once a diagnosis is done, though, it is not always given the “fix” available or feasible, and while a complete solution is preferred, sometimes we have to settle for “managing” the disease, which is our parallel to implementing changes and settling for a workaround. So in the same way as human doctors, we can’t guarantee we will get to the root of every problem, or if we do, that we will be able to fix every one of them. However, as with having good doctors – having us on your team will maximize the chance of successful resolution.

The post How Percona Support handles bugs appeared first on MySQL Performance Blog.

Apr
02
2015
--

The value of MySQL Support

The value of MySQL SupportYears ago when I worked for the MySQL Support organization at the original MySQL AB, we spoke about MySQL Support as insurance and focused on a value proposition similar to that of car insurance. For your car to be fully covered, you must purchase car insurance before the incident happens – in fact most places around the world require automobile insurance. Similarly with many organizations, any production-use technology might be mandated to have its own insurance in the way of 24/7 support.

I think however this is a very one-sided view that does not capture the full value (and ROI) that a MySQL Support contract with Percona provides. Let’s look at the different dimensions of value it provides based on the different support cases we have received throughout the years.

Reduce and Prevent Downtime
If your database goes down, the time to recover will be significantly shorter with a support agreement than without it. Cost of downtime varies widely between organizations. Gartner estimates the average cost of downtime is $5,000 per minute.

With most of our clients, we have found that the cost of preventing or rapidly reducing even one significant downtime event a year more than pays for the cost of support. Even when the client’s in-house team is very experienced, our help is often invaluable as we are exposed to a great variety of incidents from hundreds of companies, so it is much more likely we have encountered the same incident before and have a solution ready. Helping to recover from downtime quickly is a reactive part of support – you can realize even more value by proactively working with support to get advice on your HA options as well as ensure that you’re following the best database backup and security practices.

Better Security
Having a MySQL Support contract by itself is not enough to prevent all security incidents. MySQL will be only one of the components for a possible attack vector and it takes a lot of everyday work to stay secure. There is nothing that can guarantee complete security. MySQL Support, however, can be an invaluable resource for your security team to learn how to apply security and compliance practices to your MySQL environment and how to avoid typical mistakes.

The cost of data breaches can be phenomenal and also impact business reputations much more than downtime or performance issues. Depending on the company size and market, costs will vary. Different studies estimate costs ranging in average from $640K  in direct costs to $3.5M. What everyone seems to agree upon is that security risks and security costs are on the rise and you can’t afford to leave this areas unchecked.

Fix Database Software Bugs
While you might have great DBAs on your team who are comfortable with best practices and downtime recovery, most likely you do not have a development team comfortable with fixing bugs in the database kernel or supporting tools. Being able get software fixes contributes to downtime reduction as well as all kinds of other things, such as ensuring efficient development and operation teams, avoiding using complex workarounds, etc.

Reduce Resources
A large number of questions we get are performance-related which, when addressed, provide a better experience for users, saves costs, and minimizes environmental impact by using less resources.

Savings vary depending on your application scale and how much it is already optimized. In the best cases, our support team has helped customers make applications more than 10x more efficient. In most cases though, we are able to help make things at least 30% more efficient. If you’re spending $100K or more on your database environment, this benefit alone will make a support agreement well worth it.

Efficient Developers
This is important one. Way too often customers do not even give their developers access to support, even though these developers are critical in realizing the the full value of their application. Developers working with databases make many decisions about schema design, query writing, and the use of MySQL features such as stored procedures, triggers or foreign keys. Without a MySQL Support contract, developers often have resort to “Google” to find an answer – and often end up with inapplicable, outdated or simply wrong information. Combined with this, they often apply or resort to time-consuming trial and error.

With help of the Percona Support team, developers can learn the proven practices that apply to their specific situation, save a lot of time and get a better application to the market faster. Even with a single US-based developer intensively working with MySQL, a support agreement might be well worth the cost based on increased developer efficiency alone. Larger development teams simply cannot afford to not have support.

Efficient Operations
Your operations staff (DBAs, DevOps, Sysadmins) are in the same boat – if your database environment is significant, chances are you are always looking for ways to save time, make operations more efficient and reduce mistakes. Our Support team can provide you with specific actionable advice for the challenges you’re experiencing.

Chances are we have seen environments similar to yours and know which software, approaches and practices work well and which do not. This all of course contributes to downtime prevention and reduction, but also helps with team efficiency. With the Percona Support team’s help, you will be able to handle operations with a smaller team or be able to have stuff done with less experienced staff members.

Better Applications
Percona Support access helps developers not only be more productive, but also results in better application quality because best practices in application database interface design, schema, queries, etc. are followed. The Percona team has supported many applications for many years, so we often will think about problems before you might think about them, such as:

  • “How will this design play with replication or sharding?”
  • “Will it scale with large amounts of users or data?”
  • “How flexible is such a design when the  application will inevitably be evolving over years?”

While a better application is hard to quantify, it really is quite important.

Faster Time to Market
Yet another benefit that comes from developers having access to a MySQL Support team is faster time to market. For many agile applications, being able to launch new features faster is even more important than cost savings – this is how businesses can succeed against the competition. At Percona, we love helping businesses succeed.

As you see, there are a lot of ways Percona Support can contribute to the success of your business. Support is much more than “insurance” that you should consider purchasing for compliance reasons. If you’re using MySQL for your applications, Percona Support will provide a great return on investment, allowing you to minimize risks and costs while delivering the highest quality of application or service possible.

The post The value of MySQL Support appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com