Dec
30
2015
--

Database Performance Webinar: Tired of MySQL Making You Wait?

Performance

database performance

Too often, developers and DBAs struggle to pinpoint the root cause of MySQL database performance issues, and then spend too much time trying to fix them. Wouldn’t it be great to bypass the wasted guesswork and get right to the issue?

In our upcoming webinar, Tired of MySQL Making You Wait?, we’re going to help you discover how to significantly increase the performance of your applications and reduce database response time.

In this webinar, Principal Architect Alexander Rubin and Database Evangelist Janis Griffin will provide the key steps needed to identify, prioritize, and improve query performance.

They will discuss the following topics:

  • Wait time analytics using Performance / Information schemas
  • Monitoring for performance using DPA
  • Explaining plan operations, focusing on temporary tables and filesort (see the brief sketch after this list)
  • Using indexes to optimize your queries
  • Using loose and tight index scans in MySQL
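
For instance, here is a minimal sketch (not part of the webinar materials) using a hypothetical orders table. Without a suitable index, EXPLAIN typically reports “Using temporary; Using filesort” in the Extra column for a query like the one below, and adding an index on the grouped/ordered column usually removes both operations:

EXPLAIN SELECT customer_id, COUNT(*)
FROM orders
GROUP BY customer_id
ORDER BY customer_id;

-- Hypothetical index; after adding it, the same EXPLAIN usually no longer
-- shows "Using temporary; Using filesort", because rows can be read in
-- index order.
ALTER TABLE orders ADD INDEX idx_customer (customer_id);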

WHEN:

Thursday, January 7, 2016, at 10:00 a.m. Pacific Standard Time (UTC-8)

PRESENTERS:

Alexander Rubin, Principal Consultant, Percona

Janis Griffin, Database Evangelist, SolarWinds

Register now!

Percona is the only company that delivers enterprise-class software, support, consulting and managed services solutions for both MySQL and MongoDB® across traditional and cloud-based platforms that maximize application performance while streamlining database efficiencies.

Percona’s industry-recognized performance experts can maximize your database, server and application performance, lower infrastructure costs, and provide capacity and scalability planning for future growth.

Dec
30
2015
--

Slack’s New TV Commercial Is Adorable And Effective

If you’re in the tech world, then you know exactly what Slack is and does. As you eke outside of our little techosphere, however, you might get some puzzled faces when you talk about your favorite work communication tool. Read More

Dec
29
2015
--

2016 Percona Live Tutorials Schedule is UP!


We are excited to announce that the tutorial schedule for the Percona Live Data Performance Conference 2016 is up!

The schedule shows all the details for each of our informative and enlightening Percona Live tutorial sessions, including insights into InnoDB, MySQL 5.7, MongoDB 3.2 and RocksDB. These tutorials are a must for any data performance professional!

The Percona Live Data Performance Conference is the premier open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, NoSQL, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The sneak peek schedule for Percona Live 2016 has also been posted! The Conference will feature a variety of formal tracks and sessions related to MySQL, NoSQL and Data in the Cloud. With over 150 slots to fill, there will be no shortage of great content this year.

The Percona Live Data Performance Conference will be April 18-21 at the Hyatt Regency Santa Clara & The Santa Clara Convention Center.

Just a reminder to everyone out there: our Super Saver discount rate for the Percona Live Data Performance Conference and Expo 2016 is only available until December 31 at 11:30pm PST! This rate gets you all the excellent and amazing opportunities that Percona Live offers, at the lowest price possible!

Become a conference sponsor! We have sponsorship opportunities available for this annual MySQL, NoSQL and Data in the Cloud event. Sponsors become a part of a dynamic and growing ecosystem and interact with more than 1,000 DBAs, sysadmins, developers, CTOs, CEOs, business managers, technology evangelists, solutions vendors, and entrepreneurs who attend the event.

Click through to the tutorial schedule right now, look the sessions over, and pick which ones you want to attend!

Dec
29
2015
--

EXPLAIN FORMAT=JSON: order_by_subqueries, group_by_subqueries details on subqueries in ORDER BY and GROUP BY

EXPLAIN FORMAT=JSON

Another post in the EXPLAIN FORMAT=JSON is Cool! series! In this post, we’ll discuss how EXPLAIN FORMAT=JSON provides optimization details for ORDER BY and GROUP BY operations in conjunction with the order_by_subqueries and group_by_subqueries members.

EXPLAIN FORMAT=JSON can print details on how a subquery in ORDER BY is optimized:

mysql> explain format=json select emp_no, concat(first_name, ' ', last_name) f2 from employees order by (select emp_no limit 1)\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "60833.60"
    },
    "ordering_operation": {
      "using_filesort": true,
      "table": {
        "table_name": "employees",
        "access_type": "ALL",
        "rows_examined_per_scan": 299843,
        "rows_produced_per_join": 299843,
        "filtered": "100.00",
        "cost_info": {
          "read_cost": "865.00",
          "eval_cost": "59968.60",
          "prefix_cost": "60833.60",
          "data_read_per_join": "13M"
        },
        "used_columns": [
          "emp_no",
          "first_name",
          "last_name"
        ]
      },
      "order_by_subqueries": [
        {
          "dependent": true,
          "cacheable": false,
          "query_block": {
            "select_id": 2,
            "message": "No tables used"
          }
        }
      ]
    }
  }
}
1 row in set, 2 warnings (0.00 sec)
Note (Code 1276): Field or reference 'employees.employees.emp_no' of SELECT #2 was resolved in SELECT #1
Note (Code 1003): /* select#1 */ select `employees`.`employees`.`emp_no` AS `emp_no`,concat(`employees`.`employees`.`first_name`,' ',`employees`.`employees`.`last_name`) AS `f2` from `employees`.`employees` order by (/* select#2 */ select `employees`.`employees`.`emp_no` limit 1)

The above output shows the ordering_operation member of query_block (which includes the order_by_subqueries array), with information on how the subquery in ORDER BY was optimized.

This is a simple example. In real life, you can have larger subqueries in the ORDER BY clause. For example, take this more complicated and slightly crazy query:

select emp_no, concat(first_name, ' ', last_name) f2 from employees order by (select dept_no as c from salaries join dept_emp using (emp_no) group by dept_no)

Run a regular EXPLAIN on it. From this output alone we can’t tell whether the subquery can be cached or whether it will be executed for each sorted row.

mysql> explain  select emp_no, concat(first_name, ' ', last_name) f2 from employees order by (select dept_no as c from salaries join dept_emp using (emp_no) group by dept_no)\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: employees
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 299843
     filtered: 100.00
        Extra: NULL
*************************** 2. row ***************************
           id: 2
  select_type: SUBQUERY
        table: dept_emp
   partitions: NULL
         type: index
possible_keys: PRIMARY,emp_no,dept_no
          key: dept_no
      key_len: 4
          ref: NULL
         rows: 331215
     filtered: 100.00
        Extra: Using index
*************************** 3. row ***************************
           id: 2
  select_type: SUBQUERY
        table: salaries
   partitions: NULL
         type: ref
possible_keys: PRIMARY,emp_no
          key: emp_no
      key_len: 4
          ref: employees.dept_emp.emp_no
         rows: 10
     filtered: 100.00
        Extra: Using index
3 rows in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select `employees`.`employees`.`emp_no` AS `emp_no`,concat(`employees`.`employees`.`first_name`,' ',`employees`.`employees`.`last_name`) AS `f2` from `employees`.`employees` order by (/* select#2 */ select `employees`.`dept_emp`.`dept_no` AS `c` from `employees`.`salaries` join `employees`.`dept_emp` where (`employees`.`salaries`.`emp_no` = `employees`.`dept_emp`.`emp_no`) group by `employees`.`dept_emp`.`dept_no`)

EXPLAIN FORMAT=JSON provides a completely different picture:

mysql> explain format=json select emp_no, concat(first_name, ' ', last_name) f2 from employees order by (select dept_no as c from salaries join dept_emp using (emp_no) group by dept_no)\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "60833.60"
    },
    "ordering_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "employees",
        "access_type": "ALL",
        "rows_examined_per_scan": 299843,
        "rows_produced_per_join": 299843,
        "filtered": "100.00",
        "cost_info": {
          "read_cost": "865.00",
          "eval_cost": "59968.60",
          "prefix_cost": "60833.60",
          "data_read_per_join": "13M"
        },
        "used_columns": [
          "emp_no",
          "first_name",
          "last_name"
        ]
      },
      "optimized_away_subqueries": [
        {
          "dependent": false,
          "cacheable": true,
          "query_block": {
            "select_id": 2,
            "cost_info": {
              "query_cost": "1082124.21"
            },
            "grouping_operation": {
              "using_filesort": false,
              "nested_loop": [
                {
                  "table": {
                    "table_name": "dept_emp",
                    "access_type": "index",
                    "possible_keys": [
                      "PRIMARY",
                      "emp_no",
                      "dept_no"
                    ],
                    "key": "dept_no",
                    "used_key_parts": [
                      "dept_no"
                    ],
                    "key_length": "4",
                    "rows_examined_per_scan": 331215,
                    "rows_produced_per_join": 331215,
                    "filtered": "100.00",
                    "using_index": true,
                    "cost_info": {
                      "read_cost": "673.00",
                      "eval_cost": "66243.00",
                      "prefix_cost": "66916.00",
                      "data_read_per_join": "5M"
                    },
                    "used_columns": [
                      "emp_no",
                      "dept_no"
                    ]
                  }
                },
                {
                  "table": {
                    "table_name": "salaries",
                    "access_type": "ref",
                    "possible_keys": [
                      "PRIMARY",
                      "emp_no"
                    ],
                    "key": "emp_no",
                    "used_key_parts": [
                      "emp_no"
                    ],
                    "key_length": "4",
                    "ref": [
                      "employees.dept_emp.emp_no"
                    ],
                    "rows_examined_per_scan": 10,
                    "rows_produced_per_join": 3399374,
                    "filtered": "100.00",
                    "using_index": true,
                    "cost_info": {
                      "read_cost": "335333.33",
                      "eval_cost": "679874.87",
                      "prefix_cost": "1082124.21",
                      "data_read_per_join": "51M"
                    },
                    "used_columns": [
                      "emp_no",
                      "from_date"
                    ]
                  }
                }
              ]
            }
          }
        }
      ]
    }
  }
}
1 row in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select `employees`.`employees`.`emp_no` AS `emp_no`,concat(`employees`.`employees`.`first_name`,' ',`employees`.`employees`.`last_name`) AS `f2` from `employees`.`employees` order by (/* select#2 */ select `employees`.`dept_emp`.`dept_no` AS `c` from `employees`.`salaries` join `employees`.`dept_emp` where (`employees`.`salaries`.`emp_no` = `employees`.`dept_emp`.`emp_no`) group by `employees`.`dept_emp`.`dept_no`)

We see that the subquery was optimized away: the optimized_away_subqueries member exists, but there is no order_by_subqueries array in the ordering_operation object. We can also see that the subquery result can be cached: "cacheable": true.
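
For contrast, here is a minimal sketch (not from the original post) of a correlated variant: because the ORDER BY subquery below references the outer row, the optimizer should not be able to move it to optimized_away_subqueries, and it should instead show up under order_by_subqueries with "dependent": true and "cacheable": false, much like the first example in this post:

mysql> explain format=json select emp_no, concat(first_name, ' ', last_name) f2 from employees order by (select max(from_date) from dept_emp where dept_emp.emp_no = employees.emp_no)\G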

EXPLAIN FORMAT=JSON also provides information about subqueries in the GROUP BY clause. It uses the group_by_subqueries array in the grouping_operation member for this purpose.

mysql> explain format=json select count(emp_no) from salaries group by salary > ALL (select s/c as avg_salary from (select dept_no, sum(salary) as s, count(emp_no) as c from salaries join dept_emp using (emp_no) group by dept_no) t)\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "3412037.60"
    },
    "grouping_operation": {
      "using_temporary_table": true,
      "using_filesort": true,
      "cost_info": {
        "sort_cost": "2838638.00"
      },
      "table": {
        "table_name": "salaries",
        "access_type": "ALL",
        "rows_examined_per_scan": 2838638,
        "rows_produced_per_join": 2838638,
        "filtered": "100.00",
        "cost_info": {
          "read_cost": "5672.00",
          "eval_cost": "567727.60",
          "prefix_cost": "573399.60",
          "data_read_per_join": "43M"
        },
        "used_columns": [
          "emp_no",
          "salary",
          "from_date"
        ]
      },
      "group_by_subqueries": [
        {
          "dependent": true,
          "cacheable": false,
          "query_block": {
            "select_id": 2,
            "cost_info": {
              "query_cost": "881731.00"
            },
            "table": {
              "table_name": "t",
              "access_type": "ALL",
              "rows_examined_per_scan": 3526884,
              "rows_produced_per_join": 3526884,
              "filtered": "100.00",
              "cost_info": {
                "read_cost": "176354.20",
                "eval_cost": "705376.80",
                "prefix_cost": "881731.00",
                "data_read_per_join": "134M"
              },
              "used_columns": [
                "dept_no",
                "s",
                "c"
              ],
              "attached_condition": "((<cache>(`employees`.`salaries`.`salary`) <= (`t`.`s` / `t`.`c`)) or isnull((`t`.`s` / `t`.`c`)))",
              "materialized_from_subquery": {
                "using_temporary_table": true,
                "dependent": false,
                "cacheable": true,
                "query_block": {
                  "select_id": 3,
                  "cost_info": {
                    "query_cost": "1106758.94"
                  },
                  "grouping_operation": {
                    "using_filesort": false,
                    "nested_loop": [
                      {
                        "table": {
                          "table_name": "dept_emp",
                          "access_type": "index",
                          "possible_keys": [
                            "PRIMARY",
                            "emp_no",
                            "dept_no"
                          ],
                          "key": "dept_no",
                          "used_key_parts": [
                            "dept_no"
                          ],
                          "key_length": "4",
                          "rows_examined_per_scan": 331215,
                          "rows_produced_per_join": 331215,
                          "filtered": "100.00",
                          "using_index": true,
                          "cost_info": {
                            "read_cost": "673.00",
                            "eval_cost": "66243.00",
                            "prefix_cost": "66916.00",
                            "data_read_per_join": "5M"
                          },
                          "used_columns": [
                            "emp_no",
                            "dept_no"
                          ]
                        }
                      },
                      {
                        "table": {
                          "table_name": "salaries",
                          "access_type": "ref",
                          "possible_keys": [
                            "PRIMARY",
                            "emp_no"
                          ],
                          "key": "PRIMARY",
                          "used_key_parts": [
                            "emp_no"
                          ],
                          "key_length": "4",
                          "ref": [
                            "employees.dept_emp.emp_no"
                          ],
                          "rows_examined_per_scan": 10,
                          "rows_produced_per_join": 3526884,
                          "filtered": "100.00",
                          "cost_info": {
                            "read_cost": "334466.14",
                            "eval_cost": "705376.80",
                            "prefix_cost": "1106758.95",
                            "data_read_per_join": "53M"
                          },
                          "used_columns": [
                            "emp_no",
                            "salary",
                            "from_date"
                          ]
                        }
                      }
                    ]
                  }
                }
              }
            }
          }
        }
      ]
    }
  }
}
1 row in set, 1 warning (0.01 sec)
Note (Code 1003): /* select#1 */ select count(`employees`.`salaries`.`emp_no`) AS `count(emp_no)` from `employees`.`salaries` group by <not>(<in_optimizer>(`employees`.`salaries`.`salary`,<exists>(/* select#2 */ select 1 from (/* select#3 */ select `employees`.`dept_emp`.`dept_no` AS `dept_no`,sum(`employees`.`salaries`.`salary`) AS `s`,count(`employees`.`salaries`.`emp_no`) AS `c` from `employees`.`salaries` join `employees`.`dept_emp` where (`employees`.`salaries`.`emp_no` = `employees`.`dept_emp`.`emp_no`) group by `employees`.`dept_emp`.`dept_no`) `t` where ((<cache>(`employees`.`salaries`.`salary`) <= (`t`.`s` / `t`.`c`)) or isnull((`t`.`s` / `t`.`c`))) having <is_not_null_test>((`t`.`s` / `t`.`c`)))))

Again, this output gives a clear view of the query optimization: the subquery in GROUP BY itself cannot be optimized, cached or converted into a temporary table, but the subquery inside the subquery (select dept_no, sum(salary) as s, count(emp_no) as c from salaries join dept_emp using (emp_no) group by dept_no) could be materialized into a temporary table and cached.
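
If that inner aggregate ever becomes a bottleneck, one possible workaround (a sketch, not from the original post; the dept_avg name is made up) is to materialize it yourself, so it is computed exactly once and can be inspected or indexed:

create temporary table dept_avg as
  select dept_no, sum(salary) as s, count(emp_no) as c
  from salaries join dept_emp using (emp_no)
  group by dept_no;

select count(emp_no) from salaries
  group by salary > ALL (select s/c from dept_avg);

This mirrors what the optimizer does with select#3 above, but makes the materialization explicit.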

A regular EXPLAIN command does not provide such details:

mysql> explain select count(emp_no) from salaries group by salary > ALL (select s/c as avg_salary from (select dept_no, sum(salary) as s, count(emp_no) as c from salaries join dept_emp using (emp_no) group by dept_no) t)\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2838638
     filtered: 100.00
        Extra: Using temporary; Using filesort
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: <derived3>
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 3526884
     filtered: 100.00
        Extra: Using where
*************************** 3. row ***************************
           id: 3
  select_type: DERIVED
        table: dept_emp
   partitions: NULL
         type: index
possible_keys: PRIMARY,emp_no,dept_no
          key: dept_no
      key_len: 4
          ref: NULL
         rows: 331215
     filtered: 100.00
        Extra: Using index
*************************** 4. row ***************************
           id: 3
  select_type: DERIVED
        table: salaries
   partitions: NULL
         type: ref
possible_keys: PRIMARY,emp_no
          key: PRIMARY
      key_len: 4
          ref: employees.dept_emp.emp_no
         rows: 10
     filtered: 100.00
        Extra: NULL
4 rows in set, 1 warning (0.01 sec)
Note (Code 1003): /* select#1 */ select count(`employees`.`salaries`.`emp_no`) AS `count(emp_no)` from `employees`.`salaries` group by <not>(<in_optimizer>(`employees`.`salaries`.`salary`,<exists>(/* select#2 */ select 1 from (/* select#3 */ select `employees`.`dept_emp`.`dept_no` AS `dept_no`,sum(`employees`.`salaries`.`salary`) AS `s`,count(`employees`.`salaries`.`emp_no`) AS `c` from `employees`.`salaries` join `employees`.`dept_emp` where (`employees`.`salaries`.`emp_no` = `employees`.`dept_emp`.`emp_no`) group by `employees`.`dept_emp`.`dept_no`) `t` where ((<cache>(`employees`.`salaries`.`salary`) <= (`t`.`s` / `t`.`c`)) or isnull((`t`.`s` / `t`.`c`))) having <is_not_null_test>((`t`.`s` / `t`.`c`)))))

Most importantly, we cannot guess from this output whether the DERIVED subquery can be cached.

Conclusion: EXPLAIN FORMAT=JSON provides details on how subqueries in the ORDER BY and GROUP BY clauses are optimized.

Dec
28
2015
--

EXPLAIN FORMAT=JSON provides insights on optimizer_switch effectiveness

optimizer_switch

The previous post in the EXPLAIN FORMAT=JSON is Cool! series showed an example of the query select dept_name from departments where dept_no in (select dept_no from dept_manager where to_date is not null), where the subquery was materialized into a temporary table and then joined with the outer query. This is known as a semi-join optimization. But what happens if we turn off this optimization? EXPLAIN FORMAT=JSON can help us with this investigation too.

First, let’s look at the original output again:

mysql> explain format=json select dept_name from departments where dept_no in (select dept_no from dept_manager where to_date is not null)\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "16.72"
    },
    "nested_loop": [
      {
        "table": {
          "table_name": "departments",
          <skipped>
      },
      {
        "table": {
          "table_name": "<subquery2>",
          "access_type": "eq_ref",
          "key": "<auto_key>",
          "key_length": "4",
          "ref": [
            "employees.departments.dept_no"
          ],
          "rows_examined_per_scan": 1,
          "materialized_from_subquery": {
            "using_temporary_table": true,
            "query_block": {
              "table": {
                "table_name": "dept_manager",
                "access_type": "ALL",
                "possible_keys": [
                  "dept_no"
                ],
                "rows_examined_per_scan": 24,
                "rows_produced_per_join": 21,
                "filtered": "90.00",
                "cost_info": {
                  "read_cost": "1.48",
                  "eval_cost": "4.32",
                  "prefix_cost": "5.80",
                  "data_read_per_join": "345"
                },
                "used_columns": [
                  "dept_no",
                  "to_date"
                ],
                "attached_condition": "(`employees`.`dept_manager`.`to_date` is not null)"
              }
            }
          }
        }
      }
    ]
  }
}
1 row in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select `employees`.`departments`.`dept_name` AS `dept_name` from `employees`.`departments` semi join (`employees`.`dept_manager`) where ((`<subquery2>`.`dept_no` = `employees`.`departments`.`dept_no`) and (`employees`.`dept_manager`.`to_date` is not null))

To repeat what happened here: the subquery was materialized into a temporary table, then joined with the departments table. Semi-join optimization is ON by default, so this is the plan you would most likely get without any intervention.
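
To confirm which optimizations are currently enabled before experimenting, a quick check (not shown in the original post) is to inspect the optimizer_switch variable; the list it returns should contain semijoin=on:

mysql> select @@optimizer_switch\G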

What happens if we temporarily turn semi-join optimization OFF?

mysql> set optimizer_switch="semijoin=off";
Query OK, 0 rows affected (0.00 sec)

And then execute EXPLAIN one more time:

mysql> explain format=json select dept_name from departments where dept_no in (select dept_no from dept_manager where to_date is not null)\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "2.80"
    },
    "table": {
      "table_name": "departments",
      "access_type": "index",
      "key": "dept_name",
      "used_key_parts": [
        "dept_name"
      ],
      "key_length": "42",
      "rows_examined_per_scan": 9,
      "rows_produced_per_join": 9,
      "filtered": "100.00",
      "using_index": true,
      "cost_info": {
        "read_cost": "1.00",
        "eval_cost": "1.80",
        "prefix_cost": "2.80",
        "data_read_per_join": "432"
      },
      "used_columns": [
        "dept_no",
        "dept_name"
      ],
      "attached_condition": "<in_optimizer>(`employees`.`departments`.`dept_no`,`employees`.`departments`.`dept_no` in ( <materialize> (/* select#2 */ select `employees`.`dept_manager`.`dept_no` from `employees`.`dept_manager` where (`employees`.`dept_manager`.`to_date` is not null) ), <primary_index_lookup>(`employees`.`departments`.`dept_no` in <temporary table> on <auto_key> where ((`employees`.`departments`.`dept_no` = `materialized-subquery`.`dept_no`)))))",
      "attached_subqueries": [
        {
          "table": {
            "table_name": "<materialized_subquery>",
            "access_type": "eq_ref",
            "key": "<auto_key>",
            "key_length": "4",
            "rows_examined_per_scan": 1,
            "materialized_from_subquery": {
              "using_temporary_table": true,
              "dependent": true,
              "cacheable": false,
              "query_block": {
                "select_id": 2,
                "cost_info": {
                  "query_cost": "5.80"
                },
                "table": {
                  "table_name": "dept_manager",
                  "access_type": "ALL",
                  "possible_keys": [
                    "dept_no"
                  ],
                  "rows_examined_per_scan": 24,
                  "rows_produced_per_join": 21,
                  "filtered": "90.00",
                  "cost_info": {
                    "read_cost": "1.48",
                    "eval_cost": "4.32",
                    "prefix_cost": "5.80",
                    "data_read_per_join": "345"
                  },
                  "used_columns": [
                    "dept_no",
                    "to_date"
                  ],
                  "attached_condition": "(`employees`.`dept_manager`.`to_date` is not null)"
                }
              }
            }
          }
        }
      ]
    }
  }
}
1 row in set, 1 warning (0.00 sec)
Note (Code 1003): /* select#1 */ select `employees`.`departments`.`dept_name` AS `dept_name` from `employees`.`departments` where <in_optimizer>(`employees`.`departments`.`dept_no`,`employees`.`departments`.`dept_no` in ( <materialize> (/* select#2 */ select `employees`.`dept_manager`.`dept_no` from `employees`.`dept_manager` where (`employees`.`dept_manager`.`to_date` is not null) ), <primary_index_lookup>(`employees`.`departments`.`dept_no` in <temporary table> on <auto_key> where ((`employees`.`departments`.`dept_no` = `materialized-subquery`.`dept_no`)))))

Now the picture is completely different. There is no nested_loop member; instead, there is an attached_subqueries array containing a single member: the temporary table materialized from the subquery select dept_no from dept_manager where to_date is not null (including all the details of this materialization).

Conclusion: We can experiment with the value of optimizer_switch and use EXPLAIN FORMAT=JSON to examine how a particular optimization affects our queries.
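
Note that SET optimizer_switch without the GLOBAL keyword changes the value only for the current session, so other connections are unaffected. Once the experiment is over, you can re-enable semi-join or reset every optimizer_switch flag to its default value (a sketch, not part of the original post):

mysql> set optimizer_switch="semijoin=on";
mysql> set optimizer_switch="default";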

Dec
27
2015
--

The Freelancer Generation: Why Startups And Enterprises Need To Pay Attention

Gone are the days of the 40-hour work week that kept us at work eight hours a day. I find that most startup founders or business owners in Silicon Valley work 50-60 hours a week. Today, freelancing is becoming the accepted norm of the startup world. As more startups are starting to use the millions of freelancers, it’s driving more and more people to join the freelancer generation. Read More

Dec
23
2015
--

Salesforce Grabs Quote To Cash Vendor SteelBrick for $360 Million

They say there’s not supposed to be much M&A action the week of Christmas, but Salesforce apparently didn’t get the memo, picking up quote-to-cash vendor SteelBrick today for $360 million. According to the 8-K form Salesforce filed with the SEC, the deal is for $360 million in stock, less the $60 million SteelBrick has in cash, but not including Salesforce’s investment… Read More

Dec
23
2015
--

After 500,000 Apps Built, Bizness Apps Launches Apex, A New White-Label App Builder

Founded in 2010 as a DIY app and website development platform, Bizness Apps has quietly turned its attention to resellers and digital agencies, becoming one of the largest providers of white-label app development software on the market. In fact, over 500,000 total mobile apps have now been created using the company’s software, and they account for five percent of all apps in the… Read More

Dec
23
2015
--

Percona Server for MongoDB storage engines in iiBench insert workload

storage engine

We recently released the GA version of Percona Server for MongoDB, which comes with a variety of storage engines: RocksDB, PerconaFT and WiredTiger.

Both RocksDB and PerconaFT are write-optimized engines, so I wanted to compare all engines in a workload oriented toward data ingestion.

For the benchmark I used iiBench-mongo (https://github.com/mdcallag/iibench-mongodb), inserting one billion (bln) rows into a collection with three indexes. Inserts were done in ten parallel threads.

For memory limits, I used 10GB as the cache size, with a total of 20GB available for the mongod process, limited with cgroups (so the extra 10GB of memory was available for engine memory allocation and the OS cache).

For storage, I used a single Crucial M500 960GB SSD. This is a consumer-grade SATA SSD; it does not provide the best performance, but it is a great option in terms of price/performance.

Every time I mention WiredTiger, someone in the comments asks about the LSM option for WiredTiger. Even though LSM is still not an official mode in MongoDB 3.2, I added WiredTiger-LSM from MongoDB 3.2 into the mix. It won’t have the optimal settings, as there is no documentation on how to use LSM in WiredTiger.

First, let me show a combined graph for all engines:
[Graph: insert throughput over time for all engines]

And now, let’s zoom in on the individual engines.

WiredTiger:

[Graph: WiredTiger insert throughput over time]

RocksDB + PerconaFT:

[Graph: RocksDB and PerconaFT insert throughput over time]

UPDATE on 12/30/15:
With input from the RocksDB developers at Facebook, after extra tuning of RocksDB (adding delayed_write_rate=12582912;soft_rate_limit=0;hard_rate_limit=0; to the config), I was able to get a much better result for RocksDB:
[Graph: RocksDB insert throughput over time after extra tuning]

What conclusions can we make?

  1. While the data fit in memory (about the first one million rows), WiredTiger performed extremely well, achieving over 100,000 inserts/sec. As the data grew and exceeded the memory size, WiredTiger behaved as a traditional B-Tree engine (which is no surprise).
  2. PerconaFT and RocksDB showed close to constant throughput, with RocksDB being better overall. However, with data growth both engines started to experience challenges. For PerconaFT, the throughput varied more as the data grew, and RocksDB showed more stalls (which I think are related to the compaction process).
  3. WiredTiger LSM didn’t show as much variance as the B-Tree, but it still had a decline related to data size, which in general should not be there (as we can see with RocksDB, which is also LSM-based).

Inserting data is only one part of the equation. Now we also need to retrieve data from the database (which we’ll cover in another blog post).

Configuration for PerconaFT:

numactl --interleave=all ./mongod --dbpath=/mnt/m500/perconaft --storageEngine=PerconaFT --PerconaFTEngineCacheSize=$(( 10*1024*1024*1024 )) --syncdelay=900 --PerconaFTIndexFanout=128 --PerconaFTCollectionFanout=128 --PerconaFTIndexCompression=quicklz --PerconaFTCollectionCompression=quicklz --PerconaFTIndexReadPageSize=16384 --PerconaFTCollectionReadPageSize=16384

Configuration for RocksDB:

storage.rocksdb.configString:
 "bytes_per_sync=16m;max_background_flushes=3;max_background_compactions=12;max_write_buffer_number=4;max_bytes_for_level_base=1500m;target_file_size_base=200m;level0_slowdown_writes_trigger=12;write_buffer_size=400m;compression_per_level=kSnappyCompression:kSnappyCompression:kSnappyCompression:kSnappyCompression:kSnappyCompression:kSnappyCompression:kSnappyCompression;optimize_filters_for_hits=true"

Configuration for WiredTiger-3.2 LSM:

storage.wiredTiger.collectionConfig.configString:
 "type=lsm"
 storage.wiredTiger.indexConfig.configString:
 "type=lsm"

Load parameters for iibench:

TEST_RUN_ARGS_LOAD="1000000000 6000 1000 999999 10 256 3 0"

Dec
22
2015
--

Oracle StackEngine Acquisition Part Of Expanding Cloud Strategy

After years of ridiculing the cloud, Oracle has been taking it a lot more seriously recently, and it quietly purchased StackEngine last Friday as part of an effort to boost its Platform as a Service offerings.
StackEngine, which is based in Austin, Texas, had a very brief announcement on its website linking to Oracle.com. It simply stated that the database giant had purchased the… Read More
