Apr 20, 2018

The Final Countdown: Are You Ready for Percona Live 2018?

It’s hard to believe Percona Live 2018 starts on Monday! We’re looking forward to seeing everyone in Santa Clara next week! Here are some quick highlights to remember:

  • In addition to all the amazing sessions and keynotes we’ve announced, we’ll be hosting the MySQL Community Awards and the Lightning Talks on Monday during the Opening Reception.
  • We’ve also got a great lineup of demos in the exhibit hall all day Tuesday and Wednesday – be sure to stop by and learn more about open source database products and tools.
  • On Monday, we have a special China Track now available from Alibaba Cloud, PingCAP and Shannon Systems. We’ve just put a $20.00 ticket on sale for that track, and if you have already purchased any of our other tickets, you are also welcome to attend those four sessions.
  • Don’t forget to make your reservation at the Community Dinner. It’s a great opportunity to socialize with everyone and Pythian is always a wonderful host!

Thanks to everyone who is sponsoring, presenting and attending! The community is what makes this event successful and so much fun to be a part of!

Apr 20, 2018

Kubernetes and Cloud Foundry grow closer

Containers are eating the software world — and Kubernetes is the king of containers. So if you are working on any major software project, especially in the enterprise, you will run into it sooner or later. Cloud Foundry, which hosted its semi-annual developer conference in Boston this week, is an interesting example of this.

Outside of the world of enterprise developers, Cloud Foundry remains a bit of an unknown entity, despite having users in at least half of the Fortune 500 companies (though in the startup world, it has almost no traction). If you are unfamiliar with Cloud Foundry, you can think of it as somewhat similar to Heroku, but as an open-source project with a large commercial ecosystem and the ability to run it at scale on any cloud or on-premises installation. Developers write their code (following the twelve-factor methodology), define what it needs to run and Cloud Foundry handles all of the underlying infrastructure and — if necessary — scaling. Ideally, that frees up the developer from having to think about where their applications will run and lets them work more efficiently.

To enable all of this, the Cloud Foundry Foundation made a very early bet on containers, even before Docker was a thing. Since Kubernetes wasn’t around at the time, the various companies involved in Cloud Foundry came together to build their own container orchestration system, which still underpins much of the service today. As it took off, though, the pressure to bring support for Kubernetes grew inside of the Cloud Foundry ecosystem. Last year, the Foundation announced its first major move in this direction by launching its Kubernetes-based Container Runtime for managing containers, which sits next to the existing Application Runtime. With this, developers can use Cloud Foundry to run and manage their new (and existing) monolithic apps and run them in parallel with the new services they develop.

But remember how Cloud Foundry also still uses its own container service for the Application Runtime? There is really no reason to do that now that Kubernetes (and the various other projects in its ecosystem) have become the default for handling containers. It’s maybe no surprise then that there is now a Cloud Foundry project that aims to rip out the old container management systems and replace them with Kubernetes. The container management piece isn’t what differentiates Cloud Foundry, after all. Instead, it’s the developer experience — and at the end of the day, the whole point of Cloud Foundry is that developers shouldn’t have to care about the internal plumbing of the infrastructure.

There is another aspect to how the Cloud Foundry ecosystem is embracing Kubernetes, too. Since Cloud Foundry is also just software, there’s nothing stopping you from running it on top of Kubernetes, too. And with that, it’s no surprise that some of the largest Cloud Foundry vendors, including SUSE and IBM, are doing exactly that.

The SUSE Cloud Application Platform, which is a certified Cloud Foundry distribution, can run on any public cloud Kubernetes infrastructure, including the Microsoft Azure Container service. As the SUSE team told me, that means it’s not just easier to deploy, but also far less resource-intensive to run.

Similarly, IBM is now offering Cloud Foundry on top of Kubernetes for its customers, though it’s only calling this an experimental product for now. IBM’s GM of Cloud Developer Services Don Boulia stressed that IBM’s customers were mostly looking for ways to run their workloads in an isolated environment that isn’t shared with other IBM customers.

Boulia also stressed that for most customers, it’s not about Kubernetes versus Cloud Foundry. For most of his customers, using Kubernetes by itself is very much about moving their existing applications to the cloud. And for new applications, those customers are then opting to run Cloud Foundry.

That’s something the SUSE team also stressed. One pattern SUSE has seen is that potential customers come to it with the idea of setting up a container environment and then, over the course of the conversation, decide to implement Cloud Foundry as well.

Indeed, the message of this week’s event was very much that Kubernetes and Cloud Foundry are complementary technologies. That’s something Chen Goldberg, Google’s Director of Engineering for Container Engine and Kubernetes, also stressed during a panel discussion at the event.

Both the Cloud Foundry Foundation and the Cloud Native Computing Foundation (CNCF), the home of Kubernetes, are under the umbrella of the Linux Foundation. They take somewhat different approaches to their communities, with Cloud Foundry stressing enterprise users far more than the CNCF. There are probably some politics at play here, but for the most part, the two organizations seem friendly enough — and they do share a number of members. “We are part of CNCF and part of Cloud Foundry foundation,” Pivotal CEO Rob Mee told our own Ron Miller. “Those communities are increasingly sharing tech back and forth and evolving together. Not entirely independent and not competitive either. Lot of complexity and subtlety. CNCF and Cloud Foundry are part of a larger ecosystem with complimentary and converging tech.”

We’ll likely see more of this technology sharing — and maybe collaboration — between the CNCF and Cloud Foundry going forward. The CNCF is, after all, the home of a number of very interesting projects for building cloud-native applications that do have their fair share of use cases in Cloud Foundry, too.

Apr 20, 2018

Percona Toolkit 3.0.9 Is Now Available

Percona announces the release of Percona Toolkit 3.0.9 on April 20, 2018.

Percona Toolkit is a collection of advanced open source command-line tools, developed and used by the Percona technical staff, that are engineered to perform a variety of MySQL®, MongoDB® and system tasks that are too difficult or complex to perform manually. With over 1,000,000 downloads, Percona Toolkit supports Percona Server for MySQL, MySQL, MariaDB®, Percona Server for MongoDB and MongoDB.

Percona Toolkit, like all Percona software, is free and open source. You can download packages from the website or install from official repositories.

This release includes the following changes:

New Tools:

  • PT-1501: pt-secure-collect – new tool to collect and sanitize pt-tools outputs

New Features:

  • PT-1530: Add support for encryption status to pt-mysql-summary
  • PT-1526: Add ndb status to pt-mysql-summary (Thanks Fernando Ipar)
  • PT-1525: Add support for MySQL 8 roles into pt-mysql-summary
  • PT-1509: Make pt-table-sync only set binlog_format when necessary (Thanks Moritz Lenz)
  • PT-1508: Add --read-only-interval and --fail-successive-errors flags to pt-heartbeat (Thanks Shlomi Noach)
  • PT-243: Add --max-hostname-length and --max-line-length flags to pt-query-digest

Bug Fixes:

  • PT-1527: Fixed a bug where pt-table-checksum ignored --nocheck-binlog-format

Improvements:

  • PT-1507: pt-summary does not reliably read in the transparent huge pages setting (Thanks Nick Veenhof)
  • PT-1488: pt-show-grants support for MySQL 8.0

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system.

Apr 20, 2018

Percona Monitoring and Management (PMM) 1.10.0 Is Now Available

Percona Monitoring and Management (PMM) is a free and open-source platform for managing and monitoring MySQL® and MongoDB® performance. You can run PMM in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL® and MongoDB® servers to ensure that your data works as efficiently as possible.

We focused mainly on two features in 1.10.0, but there are also several notable improvements worth highlighting:

  • Annotations – Record and display Application Events as Annotations using pmm-admin annotate
  • Grafana 5.0 – Improved visualization effects
  • Switching between Dashboards – Restored functionality to preserve host when switching dashboards
  • New Percona XtraDB Cluster Overview graphs – Added Galera Replication Latency graphs on Percona XtraDB Cluster Overview dashboard with consistent colors

This release includes four new features and improvements, and eight bug fixes.

Annotations

Application events are one of the contributors to changes in database performance characteristics, and in this release PMM now supports receiving events and displaying them as Annotations using the new command pmm-admin annotate. A recent Percona survey reveals that Database and DevOps Engineers highly value visibility into the Application layer.  By displaying Application Events on top of your PMM graphs, Engineers can now correlate Application Events (common cases: Application Deploys, Outages, and Upgrades) against Database and System level metric changes.

Usage

For example, you have just completed an Application deployment to version 1.2, which is relevant to UI only, so you want to set tags for the version and interface impacted:

pmm-admin annotate "Application deploy v1.2" --tags "UI, v1.2"

Using the optional --tags allows you to filter which Annotations are displayed on the dashboard via a toggle option.  Read more about Annotations utilization in the Documentation.

Grafana 5.0

We’re extremely pleased to see Grafana ship 5.0 and we were fortunate enough to be at Grafanacon, including Percona’s very own Dimitri Vanoverbeke (Dim0) who presented What we Learned Integrating Grafana and Prometheus!

Grafana 5.0 includes a number of dramatic improvements, and we plan to extend our usage of these features in future Percona Monitoring and Management releases. The one we like best is the virtually unlimited way you can size and shape graphs: you are no longer bound by panel constraints that keep all objects at the same fixed height! This improvement indirectly addresses a visualization issue in PMM Server where some graphs would appear to be on two lines and ended up wasting screen space.

Switching between Dashboards

PMM now allows you to navigate between dashboards while maintaining the same host under observation, so that for example you can start on MySQL Overview looking at host serverA, switch to MySQL InnoDB Advanced dashboard and continue looking at serverA, thus saving you a few clicks in the interface.

New Percona XtraDB Cluster Galera Replication Latency Graphs

We have added new Percona XtraDB Cluster Replication Latency graphs on our Percona XtraDB Cluster Galera Cluster Overview dashboard so that you can compare latency across all members in a cluster in one view.

Issues in this release

New Features & Improvements

  • PMM-2330: Application Annotations documentation update
  • PMM-2332: Grafana 5 documentation update
  • PMM-2293: Add Galera Replication Latency graph to the PXC/Galera Cluster Overview dashboard
  • PMM-2295: Improve color selection on the PXC/Galera Cluster Overview dashboard

Bugs fixed

  • PMM-2311: Fix misalignment in the Query Analytics Metrics table
  • PMM-2341: Typo in text on the password page of the OVF
  • PMM-2359: Trim leading and trailing whitespace for all fields in the AWS/OVF installation wizard
  • PMM-2360: Include a “What’s new?” link in the Update widget
  • PMM-2346: Arithmetic on InnoDB AHI graphs is invalid
  • PMM-2364: QPS is wrong in QAN
  • PMM-2388: Query Analytics does not render the fingerprint section in some cases
  • PMM-2371: Pass host when switching between dashboards

How to get PMM

PMM is available for installation using three methods: as a Docker image, as a virtual appliance (OVA/OVF), or as an Amazon Machine Image on the AWS Marketplace.

Help us improve our software quality by reporting any Percona Monitoring and Management bugs you encounter using our bug tracking system.

Apr 19, 2018

Sysbench-tpcc Supports PostgreSQL (No, Really This Time)

This time, we really mean it when we say sysbench-tpcc supports PostgreSQL.

When I initially announced sysbench-tpcc, I mentioned it potentially could run against PostgreSQL, but it was more like wishful thinking than reality. The reality was that even though both databases speak SQL, the difference in dialects was too big and the queries written for MySQL could not run without modification on PostgreSQL.

Well, we introduced the needed changes, and now you can use sysbench-tpcc with PostgreSQL. Just try the latest commit to https://github.com/Percona-Lab/sysbench-tpcc.
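
For the impatient, here is a rough sketch of what a PostgreSQL run can look like. The connection settings, table count and scale factor below are placeholders, and option names may change over time, so please check the repository README before copying this:

git clone https://github.com/Percona-Lab/sysbench-tpcc.git
cd sysbench-tpcc
# create the schema and load the data (placeholder connection settings)
./tpcc.lua --db-driver=pgsql --pgsql-host=127.0.0.1 --pgsql-user=sbtest --pgsql-password=sbtest --pgsql-db=sbtest --tables=10 --scale=100 --threads=16 prepare
# run the benchmark for 5 minutes, reporting every 10 seconds
./tpcc.lua --db-driver=pgsql --pgsql-host=127.0.0.1 --pgsql-user=sbtest --pgsql-password=sbtest --pgsql-db=sbtest --tables=10 --scale=100 --threads=16 --time=300 --report-interval=10 run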

If you’re interested, here is a quick overview of what changes we had to make:

  1. It appears that PostgreSQL does not support the tinyint and datetime data types. We had to use smallint and timestamp fields, even if using smallint makes the database size bigger.

  2. PostgreSQL does not have a simple equivalent for MySQL’s SHOW TABLES. The best replacement we found is:

    select * from pg_catalog.pg_tables where schemaname != 'information_schema' and schemaname != 'pg_catalog'

  3. PostgreSQL does not have a way to disable Foreign Key checks like MySQL’s SET FOREIGN_KEY_CHECKS=0. With PostgreSQL, we needed to create and load tables in a very specific order to avoid Foreign Key violations.

  4. PostgreSQL requires index names to be unique across the whole database, while MySQL requires them to be unique only per table. So instead of using:
    CREATE INDEX idx_customer ON customer1 (c_w_id,c_d_id,c_last,c_first)
    CREATE INDEX idx_customer ON customer2 (c_w_id,c_d_id,c_last,c_first)

    We need to use:

    CREATE INDEX idx_customer1 ON customer1 (c_w_id,c_d_id,c_last,c_first)
    CREATE INDEX idx_customer2 ON customer2 (c_w_id,c_d_id,c_last,c_first)
  5. PostgreSQL does not have a STRAIGHT_JOIN hint, so we had to remove this from queries. But it is worth mentioning we use STRAIGHT_JOIN mostly as a hack to force MySQL to use a correct execution plan for one of the queries.

  6. PostgreSQL is very strict about GROUP BY queries: all fields that are not in the GROUP BY clause must use an aggregation function. So PostgreSQL complained about queries like

    SELECT d_w_id,sum(d_ytd)-w_ytd diff FROM district,warehouse WHERE d_w_id=w_id AND w_id=1 GROUP BY d_w_id

    even when we know that only a single value for w_ytd is possible. We had to rewrite this query as:

    SELECT d_w_id,SUM(d_ytd)-MAX(w_ytd) diff FROM district,warehouse WHERE d_w_id=w_id AND w_id=1 GROUP BY d_w_id

So you can see there was some work involved when trying to migrate even a simple application from MySQL to PostgreSQL.

Hopefully, now that sysbench-tpcc supports PostgreSQL, it will be a useful tool for evaluating PostgreSQL performance. If you find that we did not execute some transaction optimally, please let us know!

Apr 19, 2018

Congratulations to Our Friends at Oracle with the MySQL 8.0 GA Release!

It is a great day for the whole MySQL community: MySQL 8.0 was just released as GA!

Geir Høydalsvik has a great summary in his “What’s New in MySQL 8.0” blog post. You can find additional information about MySQL 8.0 Replication and MySQL 8.0 Document Store that is also worth reading.

If you can’t wait to upgrade to MySQL 8.0, please make sure to read the Upgrading to MySQL 8.0 section in the manual, and pay particular attention to changes to Connection Authentication. It requires special handling for most applications.
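
For example, the change most applications notice is the new caching_sha2_password default authentication plugin. If your connector does not support it yet, one common stopgap is to switch the affected account back to the old plugin. This is only a sketch with a placeholder account name; upgrading the connector is the better long-term fix:

# check which plugin an account currently uses (placeholder account name)
mysql -e "SELECT user, host, plugin FROM mysql.user WHERE user='app_user'"
# switch that account back to the pre-8.0 plugin if your connector requires it
mysql -e "ALTER USER 'app_user'@'%' IDENTIFIED WITH mysql_native_password BY 'app_password'"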

Also keep in mind that while MySQL 8.0 passed through an extensive QA process, this is the first GA release. It is not yet as mature and polished as MySQL 5.7. If you’re just now starting application development, however, you should definitely start with MySQL 8.0 — by the time you launch your application, 8.0 will be good. 

All of us at Percona – and me personally – are very excited about this release. You can learn more details about what we expect from it in our Why We’re Excited about MySQL 8.0 webinar recording.    

We also wrote extensively about MySQL 8.0 on our blog. Below are some posts on various features, as well as thoughts on the various RCs, that you might want to review:

The best way to learn about MySQL 8.0, though, is to attend the Percona Live Open Source Database Conference 2018, taking place in Santa Clara, CA next week. We have an outstanding selection of MySQL 8.0 focused talks both from the MySQL Engineering team and the community at large (myself included):

You can still get tickets to the conference. Come by and learn about MySQL 8.0. If you can’t make it, please check back later for slides.

Done reading? Go ahead and download MySQL 8.0 and check it out!

Apr 18, 2018

Why Analyze Raw MySQL Query Logs?

In this blog post, I’ll examine when looking at raw MySQL query logs can be more useful than working with tools that only have summary data.

In my previous blog post, I wrote about analyzing MySQL Slow Query Logs with ClickHouse and ClickTail. One of the follow-up questions I got was: when would you want to do that rather than just use tools like Percona Monitoring and Management or VividCortex, which provide a beautiful interface for detailed analysis (rather than a spartan SQL interface)?

MySQL Logs

A lot of folks are confused about what query logs MySQL has, and what you can use them for. First, MySQL has a “General Query Log”. As the name implies, this is a general-purpose query log. You would think this is the first log you should use, but it is, in fact, pretty useless:

2018-03-31T15:38:44.521650Z      2356 Query SELECT c FROM sbtest1 WHERE id=164802
2018-03-31T15:38:44.521790Z      2356 Query SELECT c FROM sbtest1 WHERE id BETWEEN 95241 AND 95340
2018-03-31T15:38:44.522168Z      2356 Query SELECT SUM(k) FROM sbtest1 WHERE id BETWEEN 1 AND 100
2018-03-31T15:38:44.522500Z      2356 Query SELECT c FROM sbtest1 WHERE id BETWEEN 304556 AND 304655 ORDER BY c
2018-03-31T15:38:44.522941Z      2356 Query SELECT DISTINCT c FROM sbtest1 WHERE id BETWEEN 924 AND 1023 ORDER BY c
2018-03-31T15:38:44.523525Z      2356 Query UPDATE sbtest1 SET k=k+1 WHERE id=514

As you can see, it only has very limited information about queries: no query execution time or which user is running the query. This type of log is helpful if you want to see very clean, basic information on what queries your application is really running. It can also help debug MySQL crashes because, unlike other log formats, the query is written to this log file before MySQL attempts to execute the query.
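
If you want to capture output like the above yourself, the general log can be toggled at runtime without a restart. A minimal sketch follows; the file path is just an example, and since this log grows very quickly, keep the capture window short on a busy server:

mysql -e "SET GLOBAL general_log_file='/var/log/mysql/general.log'"
mysql -e "SET GLOBAL general_log='ON'"
# ... let it capture some traffic, then switch it back off
mysql -e "SET GLOBAL general_log='OFF'"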

The MySQL Slow Log is, in my opinion, much more useful (especially with Percona Server Slow Query Log Extensions). Again as the name implies, you would think it is only used for slow queries (and by default, it is). However, you can set long_query_time to 0 (with a few other options) to get all queries here with lots of rich information about query execution:

# Time: 2018-03-31T15:48:55.795145Z
# User@Host: sbtest[sbtest] @ localhost []  Id: 2332
# Schema: sbtest  Last_errno: 0 Killed: 0
# Query_time: 0.000143  Lock_time: 0.000047 Rows_sent: 1  Rows_examined: 1 Rows_affected: 0
# Bytes_sent: 188  Tmp_tables: 0 Tmp_disk_tables: 0  Tmp_table_sizes: 0
# QC_Hit: No  Full_scan: No Full_join: No  Tmp_table: No Tmp_table_on_disk: No
# Filesort: No  Filesort_on_disk: No  Merge_passes: 0
#   InnoDB_IO_r_ops: 0  InnoDB_IO_r_bytes: 0  InnoDB_IO_r_wait: 0.000000
#   InnoDB_rec_lock_wait: 0.000000  InnoDB_queue_wait: 0.000000
#   InnoDB_pages_distinct: 0
# Log_slow_rate_type: query  Log_slow_rate_limit: 10
SET timestamp=1522511335;
SELECT c FROM sbtest1 WHERE id=2428336;

Finally, there is the MySQL Audit Log, which is part of the MySQL Enterprise offering, and the format-compatible Percona Server for MySQL Audit Log Plugin. This is designed for auditing access to the server, and as such it carries matching details in the log. Unlike the first two log formats, it is designed first and foremost to be machine-readable and supports JSON, XML and CSV output formats:

{"audit_record":{"name":"Query","record":"743017006_2018-03-31T01:03:12","timestamp":"2018-03-31T15:53:42 UTC","command_class":"select","connection_id":"2394","status":0,"sqltext":"SELECT SUM(k) FROM sbtest1 WHERE id BETWEEN 3 AND 102","user":"sbtest[sbtest] @ localhost []","host":"localhost","os_user":"","ip":"","db":"sbtest"}}
{"audit_record":{"name":"Query","record":"743017007_2018-03-31T01:03:12","timestamp":"2018-03-31T15:53:42 UTC","command_class":"select","connection_id":"2394","status":0,"sqltext":"SELECT c FROM sbtest1 WHERE id BETWEEN 2812021 AND 2812120 ORDER BY c","user":"sbtest[sbtest] @ localhost []","host":"localhost","os_user":"","ip":"","db":"sbtest"}}
{"audit_record":{"name":"Query","record":"743017008_2018-03-31T01:03:12","timestamp":"2018-03-31T15:53:42 UTC","command_class":"select","connection_id":"2394","status":0,"sqltext":"SELECT DISTINCT c FROM sbtest1 WHERE id BETWEEN 1 AND 100 ORDER BY c","user":"sbtest[sbtest] @ localhost []","host":"localhost","os_user":"","ip":"","db":"sbtest"}}

As you can see, there are substantial differences in the purposes of the different MySQL log formats, along with the information they provide.

Why analyze raw MySQL query logs

In my opinion, there are two main reasons to look directly at raw log files without aggregation (you might find others):

  • Auditing, where the Audit Log is useful (Vadim recently blogged about it)
  • Advanced MySQL/application debugging, where an aggregated summary might not allow you to drill down to the fullest level of detail

When you’re debugging using MySQL logs, the Slow Query Log, set to log all queries with no sampling, is the most useful. Of course, this can cause significant additional overhead in many workloads, so it is best to do it in a development environment (if you can repeat the situation you’re looking to analyze). At the very least, don’t do it during peak time.

For Percona Server for MySQL, these options ensure it logs all queries to the query log with no sampling:

log_output=file
slow_query_log=ON
long_query_time=0
log_slow_rate_limit=1
log_slow_verbosity=full
log_slow_admin_statements=ON
log_slow_slave_statements=ON
slow_query_log_always_write_time=1

Now that we have full queries, we can easily use Linux command-line tools like grep and others to look into what is going on. However, this isn’t always convenient. This is where loading the logs into storage that you can conveniently query is a good solution.
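
That said, for a quick one-off check the command line can still get you surprisingly far. A small sketch, assuming the slow log lives at a typical path (adjust the path and patterns to your setup):

# count logged statements that finished with a deadlock error (1213)
grep -c "Last_errno: 1213" /var/log/mysql/mysql-slow.log
# show killed queries together with their preceding header lines for context
grep -B 12 -E "Killed: [1-9]" /var/log/mysql/mysql-slow.log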

Let’s look into some specific and interesting cases.

Were any queries killed?

SELECT
   _time,
   query,
   query_time
FROM mysql_slow_log
WHERE killed > 0
┌───────────────_time─┬─query───────────────────────────────┬─query_time─┐
│ 2018-04-02 19:02:56 │ select benchmark(10000000000,"1+1") │  10.640794 │
└─────────────────────┴─────────────────────────────────────┴────────────┘
1 rows in set. Elapsed: 0.242 sec. Processed 929.14 million rows, 1.86 GB (3.84 billion rows/s., 7.67 GB/s.)

Yes. A query got killed after running for 10 seconds.

Did any query fail? With what error codes?

SELECT
   error_num,
   min(_time),
   max(_time),
   count(*)
FROM mysql_slow_log
GROUP BY error_num
┌─error_num─┬──────────min(_time)─┬──────────max(_time)─┬───count()─┐
│         0 │ 2018-04-02 18:59:49 │ 2018-04-07 19:39:27 │ 925428375 │
│      1160 │ 2018-04-02 19:02:56 │ 2018-04-02 19:02:56 │         1 │
│      1213 │ 2018-04-02 19:00:00 │ 2018-04-07 19:18:14 │   3709520 │
│      1054 │ 2018-04-07 19:38:14 │ 2018-04-07 19:38:14 │         1 │
└───────────┴─────────────────────┴─────────────────────┴───────────┘
4 rows in set. Elapsed: 2.391 sec. Processed 929.14 million rows, 7.43 GB (388.64 million rows/s., 3.11 GB/s.)

You can resolve error codes with the 

perror

 command:

root@rocky:~# perror 1054
MySQL error code 1054 (ER_BAD_FIELD_ERROR): Unknown column '%-.192s' in '%-.192s'

This kind of analysis has many uses. You can use it to hunt down application issues (like in this example of a missing column — likely due to bad or old code). It can also help you to spot SQL injection attempts that often cause queries with bad syntax, and troubleshoot deadlocks or foreign key violations.

Are there any nasty, long transactions?

SELECT
   transaction_id,
   max(_time) - min(_time) AS run_time,
   count(*) AS num_queries,
   sum(rows_affected) AS rows_changed
FROM mysql_slow_log
WHERE transaction_id != ''
GROUP BY transaction_id
ORDER BY rows_changed DESC
LIMIT 10
┌─transaction_id─┬─run_time─┬─num_queries─┬─rows_changed─┐
│ 17E070082      │        0 │           1 │         9999 │
│ 17934C73C      │        2 │           6 │            4 │
│ 178B6D346      │        0 │           6 │            4 │
│ 17C909086      │        2 │           6 │            4 │
│ 17B45EFAD      │        5 │           6 │            4 │
│ 17ABAB840      │        0 │           6 │            4 │
│ 17A36AD3F      │        3 │           6 │            4 │
│ 178E037A5      │        1 │           6 │            4 │
│ 17D1549C9      │        0 │           6 │            4 │
│ 1799639F2      │        1 │           6 │            4 │
└────────────────┴──────────┴─────────────┴──────────────┘
10 rows in set. Elapsed: 15.574 sec. Processed 930.58 million rows, 18.23 GB (59.75 million rows/s., 1.17 GB/s.)

Finding transactions that modify a lot of rows, like transaction 17E070082 above, can be very helpful to ensure you control MySQL replication slave lag. It is also critical if you’re looking to migrate to MySQL Group Replication or Percona XtraDB Cluster.

What statements were executed in a long transaction?

SELECT
   _time,
   _ms,
   query
FROM mysql_slow_log
WHERE transaction_id = '17E070082'
ORDER BY
   _time ASC,
   _ms ASC
LIMIT 10
┌───────────────_time─┬────_ms─┬─query─────────────────────────────────┐
│ 2018-04-07 20:08:43 │ 890693 │ update sbtest1 set k=0 where id<10000 │
└─────────────────────┴────────┴───────────────────────────────────────┘
1 rows in set. Elapsed: 2.361 sec. Processed 931.04 million rows, 10.79 GB (394.27 million rows/s., 4.57 GB/s.)

I used transaction 17E070082 from the previous query above (which modified 9999 rows). Note that this schema improves compression by storing the seconds and microseconds parts of the timestamp in different columns.

Were any queries dumping large numbers of rows from the database?

SELECT
   _time,
   query,
   rows_sent,
   bytes_sent
FROM mysql_slow_log
WHERE rows_sent > 10000
┌───────────────_time─┬─query────────────────────────────────────────────┬─rows_sent─┬─bytes_sent─┐
│ 2018-04-07 20:21:08 │ SELECT /*!40001 SQL_NO_CACHE */ * FROM `sbtest1`  │  10000000 │ 1976260712 │
└─────────────────────┴───────────────────────────────────────────────────┴───────────┴────────────┘
1 rows in set. Elapsed: 0.294 sec. Processed 932.19 million rows, 3.73 GB (3.18 billion rows/s., 12.71 GB/s.)

Did someone Update a record?

SELECT
   _time,
   query
FROM mysql_slow_log
WHERE (rows_affected > 0) AND (query LIKE '%id=3301689%')
LIMIT 1
┌───────────────_time─┬─query─────────────────────────────────────┐
│ 2018-04-02 19:04:48 │ UPDATE sbtest1 SET k=k+1 WHERE id=3301689 │
└─────────────────────┴───────────────────────────────────────────┘
1 rows in set. Elapsed: 0.046 sec. Processed 2.29 million rows, 161.60 MB (49.57 million rows/s., 3.49 GB/s.)

Note that I’m cheating here by assuming we know an update used a primary key, but it is practically helpful in a lot of cases.

These are just some of the examples of what you can find out by querying raw slow query logs. They contain a ton of information about query execution (especially in Percona Server for MySQL) that allows you to use them both for performance analysis and some security and auditing purposes.

Apr 18, 2018

Restore a MongoDB Logical Backup

In this article, we will explain how to restore a MongoDB logical backup performed via ‘mongodump’ to a mongod instance.

Restoring a MongoDB logical backup requires the ‘mongorestore‘ tool. This article focuses on this tool and process.

Note: Percona develops a backup tool named Percona-Lab/mongodb-consistent-backup, which is a wrapper for ‘mongodump‘, adding cluster-wide backup consistency. The backups created by mongodb_consistent_backup (in Dump/Mongodump mode) can be restored using the exact same steps as a regular ‘mongodump’ backup – no special steps!

Mongorestore Command Flags

--host/--port (and --user/--password)

Required, even if you’re using the default host/port (localhost:27017). If authorization is enabled, add the --user/--password flags also.

--drop

This is almost always required. This causes ‘mongorestore‘ to drop the collection that is being restored before restoring it. Without this flag, the documents from the backup are inserted one at a time, and if they already exist, the restore fails.

--oplogReplay

This is almost always required. Replays the oplog that was dumped by mongodump. It is best to include this flag on replset-based backups unless there is a specific reason not to. You can tell if the backup was from a replset by looking for the file ‘oplog.bson‘ at the base of the dump directory.

--dir

Required. The path to the mongodump data.

--gzip

Optional. For mongodump >= 3.2, enables inline compression on the restore. This is required if ‘mongodump‘ used the --gzip flag (look for *.bson.gz files if you’re not sure; if the collection files have no .gz suffix, don’t use --gzip).

--numParallelCollections=<number>

Optional. For mongodump >= 3.2 only, sets the number of collections to insert in parallel. By default four threads are used, and if you have a large server and you want to restore faster (more resource usage though), you could increase this number. Note that each thread uncompresses bson if the --gzip flag is used, so consider this when raising this number.

Steps

  1. (Optional) If the backup is archived (mongodb_consistent_backup defaults to creating tar archives), un-archive the backup so that ‘mongorestore‘ can access the .bson/.bson.gz files:
    $ tar -C /opt/mongodb/backup/testbackup/20160809_1306 -xvf /opt/mongodb/backup/testbackup/20160809_1306/test1.tar
    test1/
    test1/dump/
    test1/dump/wikipedia/
    test1/dump/wikipedia/pages.metadata.json.gz
    test1/dump/wikipedia/pages.bson.gz
    test1/dump/oplog.bson

    ** This command un-tars the backup to ‘/opt/mongodb/backup/testbackup/20160809_1306/test1/dump’ **

  2. Check (and then check again!) that you’re restoring the right backup to the right host. When in doubt, it is safer to ask the customer or others.
    1. The Percona ‘mongodb_consistent_backup‘ tool names backup subdirectories by replica set name, so you can ensure you’re restoring the right backup by checking the replica set name of the node you’re restoring to, if it exists.
    2. If you’re restoring to a replica set you will need to restore to the PRIMARY member and there needs to be a majority (so writes are accepted – some exceptions if you override write-concern, but not advised).
  3. Use ‘mongorestore‘ to restore the data by dropping/restoring each collection (--drop flag) and replaying the oplog changes (--oplogReplay flag), specifying the restore dir explicitly (--dir flag) to the ‘mongorestore‘ command. In this example I also used authorization (--user/--password flags) and un-compression (--gzip flag):
    $ mongorestore --drop --host localhost --port 27017 --user secret --password secret --oplogReplay --gzip --dir /opt/mongodb/backup/testbackup/20160809_1306/test1/dump
    2016-08-09T14:23:04.057+0200    building a list of dbs and collections to restore from /opt/mongodb/backup/testbackup/20160809_1306/test1/dump dir
    2016-08-09T14:23:04.065+0200    reading metadata for wikipedia.pages from /opt/mongodb/backup/testbackup/20160809_1306/test1/dump/wikipedia/pages.metadata.json.gz
    2016-08-09T14:23:04.067+0200    restoring wikipedia.pages from /opt/mongodb/backup/testbackup/20160809_1306/test1/dump/wikipedia/pages.bson.gz
    2016-08-09T14:23:07.058+0200    [#######.................]  wikipedia.pages  63.9 MB/199.0 MB  (32.1%)
    2016-08-09T14:23:10.058+0200    [###############.........]  wikipedia.pages  127.7 MB/199.0 MB  (64.1%)
    2016-08-09T14:23:13.060+0200    [###################.....]  wikipedia.pages  160.4 MB/199.0 MB  (80.6%)
    2016-08-09T14:23:16.059+0200    [#######################.]  wikipedia.pages  191.5 MB/199.0 MB  (96.2%)
    2016-08-09T14:23:19.071+0200    [########################]  wikipedia.pages  223.5 MB/199.0 MB  (112.3%)
    2016-08-09T14:23:22.062+0200    [########################]  wikipedia.pages  255.6 MB/199.0 MB  (128.4%)
    2016-08-09T14:23:25.067+0200    [########################]  wikipedia.pages  271.4 MB/199.0 MB  (136.4%)
    ...
    ...
    2016-08-09T14:24:19.058+0200    [########################]  wikipedia.pages  526.9 MB/199.0 MB  (264.7%)
    2016-08-09T14:24:22.058+0200    [########################]  wikipedia.pages  558.9 MB/199.0 MB  (280.8%)
    2016-08-09T14:24:23.521+0200    [########################]  wikipedia.pages  560.6 MB/199.0 MB  (281.6%)
    2016-08-09T14:24:23.522+0200    restoring indexes for collection wikipedia.pages from metadata
    2016-08-09T14:24:23.528+0200    finished restoring wikipedia.pages (32725 documents)
    2016-08-09T14:24:23.528+0200    replaying oplog
    2016-08-09T14:24:23.597+0200    done
    1. If you encounter problems with ‘mongorestore‘, carefully read the error message or rerun with several ‘-v‘ flags, e.g.: ‘-vvv‘. Once you have an error, attempt to troubleshoot the cause.
  4. Check to see that you saw “replaying oplog” and “done” after the restore (last two lines in the example). If you don’t see this, there is a problem.

As you can see, using this tool for MongoDB logical backups is very simple. However, when using sharding, please note that --oplog is not available and mongodump hits the primary of each shard. As that is typically not advised in production, you might consider Percona-Lab/mongodb-consistent-backup, which keeps the backup consistent and reads from secondary nodes, the way mongodump does with replica sets.

If MongoDB and topics like this interest you, please see the details below: we are hiring!

{
  hiring: true,
  role: "Consultant",
  tech: "MongoDB",
  location: "USA",
  moreInfo: "https://www.percona.com/about-percona/careers/mongodb-consultant-usa-based"
}

Apr 18, 2018

Stripe debuts Radar anti-fraud AI tools for big businesses, says it has halted $4B in fraud to date

Cybersecurity continues to be a growing focus and problem in the digital world, and now Stripe is launching a new paid product that it hopes will help its customers better battle one of the bigger side-effects of data breaches: online payment fraud. Today, Stripe is announcing Radar for Fraud Teams, an expansion of its free AI-based Radar service that runs alongside Stripe’s core payments API to help identify and block fraudulent transactions.

And there are further efforts that Stripe is planning in coming months. Michael Manapat, Stripe’s engineering manager for Radar and machine learning, said the company is going to soon launch a private beta of a “dynamic authentication” that will bring in two-factor authentication. This is on top of Stripe’s first forays into using biometric factors in payments, made via partners like Apple and Google. With these and others, fingerprints and other physical attributes have become increasingly popular ways to identify mobile and other users.

The initial iteration of Radar launched in October 2016, and since then, Manapat tells me that it has prevented $4 billion in fraud for its “hundreds of thousands” of customers.

Considering the wider scope of how much e-commerce is affected by fraud — one study estimates $57.8 billion in e-commerce fraud across eight major verticals in a one-year period between 2016 and 2017 — this is a decent dent, but there is a lot more work to be done. And Stripe’s knowledge of four out of every five payment card numbers globally (on account of the ubiquity of its payments API) puts it in a strong position to tackle it.

The new paid product comes alongside an update to the core, free product that Stripe is dubbing Radar 2.0, which Stripe claims will have more advanced machine learning built into it and can therefore up its fraud detection by some 25 percent over the previous version.

New features for the whole product (free and paid) will include being able to detect when a proxy VPN is being used (which fraudsters might use to appear like they are in one country when they are actually in another) and ingesting billions of data points to train its model, which is now being updated on a daily basis automatically — itself an improvement on the slower and more manual system that Manapat said Stripe has been using for the past couple of years.

Meanwhile, the paid product is an interesting development.

At the time of the original launch, Stripe co-founder John Collison hinted that the company would be considering a paid product down the line. Stripe has said multiple times that it’s in no rush to go public — a statement that a spokesperson reiterated this week — but it’s notable that a paid tier is a sign of how Stripe is slowly building up more monetization and revenue generation.

Stripe is valued at around $9.2 billion as of its last big round in 2016. Most recently, it raised $150 million back in that November 2016 round. A $44 million round from March of this year, noted in Pitchbook, was actually related to stock issued for its quiet acquisition of point-of-sale payments startup Index that month — incidentally another interesting move for Stripe to expand its position and placement in the payments ecosystem. Stripe has raised around $450 million in total.

The Teams product, aimed at businesses that are big enough to have dedicated fraud detection staff, will be priced at an additional $0.02 per transaction, on top of Stripe’s basic transaction fees of a 2.9 percent commission plus 30 cents per successful card charge in the U.S. (fees vary in other markets).
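
To put that in concrete terms, a single successful $100 card charge in the U.S. reviewed through Radar for Fraud Teams would cost roughly $2.90 + $0.30 + $0.02 = $3.22 in combined fees.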

The chief advantage of taking the paid product will be that teams will be able to customise how Radar works with their own transactions.

This will include a more complete set of data for teams that review transactions, and a more granular set of tools to determine where and when sales are reviewed, for example based on usage patterns or the size of the transaction. There is already a set of flags that work to note when a card is used in frequent succession across disparate geographies; but Manapat said that newer details, such as analysing the speed at which payment details are entered and purchases are made, will now also factor into how it flags transactions for review.

Similarly, teams will be able to determine the value at which a transaction needs to be flagged. This is the online equivalent of how certain in-person purchases either require or waive a PIN or a signature to seal the deal. (And it’s interesting to see that some e-commerce operations are potentially allowing some dodgy sales to happen simply to keep up the user experience for the majority of legitimate transactions.)

Users of the paid product will also be able to use Radar to help manage how they handle fraud overall. This will include being able to keep lists of attributes, names and numbers to scrutinise, to check against them with analytics also created by Stripe to help identify trending issues, and to plan anti-fraud activities going forward.

Updated with further detail about Stripe’s funding.

Apr 18, 2018

Webinar Thursday, April 19, 2018: Running MongoDB in Production, Part 1

Please join Percona’s Senior Technical Operations Architect, Tim Vaillancourt, as he presents Running MongoDB in Production, Part 1 on Thursday, April 19, 2018, at 10:00 am PDT (UTC-7) / 1:00 pm EDT (UTC-4).

Are you a seasoned MySQL DBA that needs to add MongoDB to your skills? Are you used to managing a small environment that runs well, but want to know what you might not know yet? This webinar helps you with running MongoDB in production environments.

MongoDB works well, but when it has issues, the number one question is “where should I go to solve a problem?”

This tutorial will cover:

Backups
– Logical vs Binary-level backups
– Sharding and Replica-Set Backup strategies
Security
– Filesystem and Network Security
– Operational Security
– External Authentication features of Percona Server for MongoDB
– Securing connections with SSL and MongoDB Authorization
– Encryption at Rest
– New Security features in 3.6
Monitoring
– Monitoring Strategy
– Important metrics to monitor in MongoDB and Linux
– Percona Monitoring and Management

Register for the webinar now.

Part 2 of this series will take place on Thursday, April 26, 2018, at 10:00 am PDT (UTC-7) / 1:00 pm EDT (UTC-4). Register for the second part of this series here.

Timothy Vaillancourt, Senior Technical Operations Architect

Tim joined Percona in 2016 as Sr. Technical Operations Architect for MongoDB, with the goal to make the operations of MongoDB as smooth as possible. With experience operating infrastructures in industries such as government, online marketing/publishing, SaaS and gaming combined with experience tuning systems from the hard disk all the way up to the end-user, Tim has spent time in nearly every area of the modern IT stack with many lessons learned. Tim is based in Amsterdam, NL and enjoys traveling, coding and music.

Prior to Percona, Tim was the Lead MySQL DBA of Electronic Arts’ DICE studio, helping some of the largest games in the world (“Battlefield” series, “Mirror’s Edge” series, “Star Wars: Battlefront”) launch and operate smoothly while also leading the automation of MongoDB deployments for EA systems. Before the role of DBA at EA’s DICE studio, Tim served as a subject matter expert in NoSQL databases, queues and search on the Online Operations team at EA SPORTS. Before moving to the gaming industry, Tim served as a Database/Systems Admin operating a large MySQL-based SaaS infrastructure at AbeBooks/Amazon Inc.
