Feb
21
2018
--

Google’s Cloud IoT Core is now generally available

 Cloud IoT Core, Google’s fully managed service for connecting, managing and ingesting data from IoT devices, is now out of beta and generally available. Google envisions the service, which launched in public beta last September, as the first entry point for IoT data into its cloud. Once the data has been ingested, users can use Cloud IoT Core to push data to Google’s cloud… Read More

Feb
21
2018
--

Outsourcing management startup 4me announces $1.65 million seed investment led by Storm Ventures

 4me, a startup that helps companies organize and track their IT outsourcing projects, announced a $1.65 million seed investment led by Storm Ventures. The company, which launched in 2010, would seem to be a bit long in the tooth to warrant a seed investment round, but the founders had successful exits from previous startups and were able to self-fund the company all these years. Read More

Feb
16
2018
--

Oracle grabs Zenedge as it continues to beef up its cloud security play

 Oracle announced yesterday that it intends to acquire Zenedge, a 4-year-old hybrid security startup. They didn’t reveal a purchase price. With Zenedge, Oracle gets a security service to add to its growing cloud play. In this case, the company has products to protect customers whether in the cloud, on-prem or across hybrid environments. The company offers a range of services from… Read More

Feb
12
2018
--

Oracle to expand automation capabilities across developer cloud services

 Last fall at Oracle OpenWorld, chairman Larry Ellison showed he was a man of the people by comparing the company’s new autonomous database service to auto-pilot on his private plane. Regardless, those autonomous capabilities were pretty advanced, providing customers with a self-provisioning, self-tuning and self-repairing database. Today, Oracle announced it was expanding that… Read More

Feb
07
2018
--

Akamai has laid off 400 workers or 5 percent of global workforce

 Akamai, the Cambridge, Massachusetts content delivery network and network services provider, announced in its earnings call with analysts yesterday that it had laid off 400 people.
On the call, Akamai CEO Tom Leighton indicated that the 400 people represented 5 percent of the company’s worldwide workforce of 8,000. “As part of our effort to improve operational efficiency, we reduced… Read More

Feb
07
2018
--

Intel’s latest chip is designed for computing at the edge

 As we develop increasingly sophisticated technologies like self-driving cars and industrial internet of things sensors, it’s going to require that we move computing to the edge. Essentially this means that instead of sending data to the cloud for processing, it needs to be done right on the device itself because even a little bit of latency is too much. Intel announced a new chip… Read More

Feb
06
2018
--

Slack names Allen Shim as company’s first CFO

 In a blog post this morning, Slack CEO Stewart Butterfield announced the company is naming long-time employee Allen Shim as the company’s first CFO. “Today, I’m excited to announce another milestone: Allen Shim has been appointed Chief Financial Officer for Slack,” Butterfield wrote in the blog post. He went on to describe Shim as his right-hand man, who has been with… Read More

Feb
06
2018
--

Microsoft will buy out existing cloud storage contracts for customers switching to OneDrive for Business

 Microsoft is targeting its cloud storage rivals including Dropbox, Box, and Google today by offering to essentially buy out customers’ existing contracts if they make the switch to OneDrive for Business. The company says that customers currently paying for one of these competing solutions can instead opt to use OneDrive for free for the remainder of their contract’s term. The… Read More

Feb
01
2018
--

Google’s Diane Greene says billion-dollar cloud revenue already puts them in elite company

 It has long been believed that the big three in the cloud consisted of AWS, Microsoft and Google, with IBM not doing too badly either. But in its earnings call with analysts today, the company revealed it’s pulling in a billion dollars a quarter in combined cloud revenue. That’s a figure that Google’s Diane Greene says already puts her company on elite footing. Read More

Jan
31
2018
--

Aurora Hash Join Optimization (with a Gentle Reminder on Lab Features)

The Aurora hash join feature for relational databases has been around for a while now. But unlike MySQL’s Block Nested Loop algorithm, an Aurora hash join only caters to a specific number of use cases. When the optimizer applies them properly, hash joins can provide great benefits with certain workloads. Below we’ll see a brief example of a quick win.

This new feature is available in Aurora lab mode version 1.16. Because this is a lab feature, make sure to test your queries before upgrading so that you don’t hit the same problem I discuss below, especially if you are looking to scale up to the new R4 instances before the Super Bowl.

When lab mode is enabled and hash_join is ON, you can verify the optimizer feature from the optimizer_switch variable:

mysql> SELECT @@aurora_version, @@aurora_lab_mode, @@optimizer_switch\G
*************************** 1. row ***************************
  @@aurora_version: 1.16
 @@aurora_lab_mode: 1
@@optimizer_switch: index_merge=on,...,hash_join=on,hash_join_cost_based=on
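If you want to run this check from a script rather than by eye, the optimizer_switch value is just a comma-separated list of name=value flags. The following is a minimal Python sketch of parsing it; the function names parse_optimizer_switch and hash_join_enabled are made up for this illustration, and the input string is the (elided) value shown above rather than a full server response.

```python
def parse_optimizer_switch(value):
    """Turn 'a=on,b=off' into {'a': True, 'b': False}."""
    flags = {}
    for item in value.split(","):
        name, sep, state = item.strip().partition("=")
        if not sep:
            # Skip anything that is not name=value (for example an elided "...").
            continue
        flags[name] = state.lower() == "on"
    return flags


def hash_join_enabled(optimizer_switch, lab_mode):
    """True only when lab mode is on and the hash_join flag is on."""
    flags = parse_optimizer_switch(optimizer_switch)
    return bool(lab_mode) and flags.get("hash_join", False)


# Values copied from the session output above (the "..." is elided there too).
switch = "index_merge=on,...,hash_join=on,hash_join_cost_based=on"
print(hash_join_enabled(switch, lab_mode=1))
```

In a real check you would feed in the values returned by the SELECT above instead of the hard-coded string.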

Hash joins work well when joining large result sets because – unlike block nested loop in the same query – the optimizer scans the larger table and matches it against the hashed smaller table instead of the other way around. Consider the tables and query below:

+----------+----------+
| tbl      | rows     |
+----------+----------+
| branches |    55143 |
| users    |   103949 |
| history  | 27168887 |
+----------+----------+

EXPLAIN
SELECT SQL_NO_CACHE COUNT(*)
FROM branches b
   INNER JOIN users u ON (b.u_id = u.u_id)
   INNER JOIN history h ON (u.u_id = h.u_id);

With hash joins enabled, we can see from the Extra column in the EXPLAIN output how it builds the join conditions:

mysql> EXPLAIN
    -> SELECT SQL_NO_CACHE COUNT(*)
    -> FROM branches b
    ->    INNER JOIN users u ON (b.u_id = u.u_id)
    ->    INNER JOIN history h ON (u.u_id = h.u_id);
+----+-------------+-------+-------+---------------+---------+---------+------+----------+----------------------------------------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows     | Extra                                                    |
+----+-------------+-------+-------+---------------+---------+---------+------+----------+----------------------------------------------------------+
|  1 | SIMPLE      | u     | index | PRIMARY       | PRIMARY | 4       | NULL |   103342 | Using index                                              |
|  1 | SIMPLE      | h     | ALL   | NULL          | NULL    | NULL    | NULL | 24619023 | Using join buffer (Hash Join Outer table h)              |
|  1 | SIMPLE      | b     | index | user_id       | user_id | 4       | NULL |    54129 | Using index; Using join buffer (Hash Join Inner table b) |
+----+-------------+-------+-------+---------------+---------+---------+------+----------+----------------------------------------------------------+
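To make the "hash the smaller table, scan the larger one" idea concrete, here is a toy Python sketch of an in-memory equi-join done both ways. It is only an illustration of the general technique, not Aurora's implementation: the function names and the sample data are invented, and the nested loop shown is the naive version with no index lookups.

```python
from collections import defaultdict


def nested_loop_join_count(outer, inner, key):
    """Count matches by comparing every outer row with every inner row."""
    return sum(1 for o in outer for i in inner if o[key] == i[key])


def hash_join_count(small, large, key):
    """Count matches by hashing the smaller input and probing with the larger."""
    # Build phase: hash the smaller table on the join key.
    buckets = defaultdict(int)
    for row in small:
        buckets[row[key]] += 1
    # Probe phase: scan the larger table once, looking up each key.
    return sum(buckets.get(row[key], 0) for row in large)


# Toy data standing in for a small table and a large one (hypothetical).
users = [{"u_id": n} for n in range(1, 6)]
history = [{"u_id": n % 7} for n in range(100)]

# Both strategies must agree on the result; only the amount of work differs.
assert hash_join_count(users, history, "u_id") == nested_loop_join_count(users, history, "u_id")
print(hash_join_count(users, history, "u_id"))
```

The nested loop touches every pair of rows, while the hash version reads each input once, which is why the approach pays off when one side is very large.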

Without hash joins, it’s a straightforward nested loop join, an (almost) Cartesian product of all three tables:

mysql> SET optimizer_switch='hash_join=off';
Query OK, 0 rows affected (0.02 sec)
mysql> EXPLAIN
    -> SELECT SQL_NO_CACHE COUNT(*)
    -> FROM branches b
    ->    INNER JOIN users u ON (b.u_id = u.u_id)
    ->    INNER JOIN history h ON (u.u_id = h.u_id);
+----+-------------+-------+--------+---------------+---------+---------+----------------+----------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref            | rows     | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+----------------+----------+-------------+
|  1 | SIMPLE      | h     | ALL    | NULL          | NULL    | NULL    | NULL           | 24619023 | NULL        |
|  1 | SIMPLE      | u     | eq_ref | PRIMARY       | PRIMARY | 4       | percona.h.u_id |        1 | Using index |
|  1 | SIMPLE      | b     | ref    | user_id       | user_id | 4       | percona.h.u_id |        7 | Using index |
+----+-------------+-------+--------+---------------+---------+---------+----------------+----------+-------------+

Now, the execution times with hash joins enabled, and then without:

mysql> SELECT SQL_NO_CACHE COUNT(*)
    -> FROM branches b
    ->    INNER JOIN users u ON (b.u_id = u.u_id)
    ->    INNER JOIN history h ON (u.u_id = h.u_id);
+-----------+
| COUNT(*)  |
+-----------+
| 128815553 |
+-----------+
1 row in set (1 min 6.95 sec)
mysql> SET optimizer_switch='hash_join=off';
Query OK, 0 rows affected (0.01 sec)
mysql> SELECT SQL_NO_CACHE COUNT(*)
    -> FROM branches b
    ->    INNER JOIN users u ON (b.u_id = u.u_id)
    ->    INNER JOIN history h ON (u.u_id = h.u_id);
+-----------+
| COUNT(*)  |
+-----------+
| 128815553 |
+-----------+
1 row in set (2 min 28.27 sec)

Clearly, with this optimization enabled the example query runs in less than half the time (about 67 seconds versus 148 seconds), a gain of more than 50%.
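The gain can be worked out directly from the two timings reported above. A small Python sketch of the arithmetic (the helper name to_seconds is just for this example):

```python
def to_seconds(minutes, seconds):
    """Convert a 'M min S sec' timing from the mysql client into seconds."""
    return minutes * 60 + seconds


with_hash = to_seconds(1, 6.95)      # 1 min 6.95 sec, hash_join=on
without_hash = to_seconds(2, 28.27)  # 2 min 28.27 sec, hash_join=off

# Fraction of the runtime saved, and how many times faster the query ran.
reduction = (without_hash - with_hash) / without_hash
speedup = without_hash / with_hash
print(f"{reduction:.1%} less time, {speedup:.2f}x faster")
```

Keep in mind these are single runs of one query, so treat the figures as indicative rather than as a benchmark.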

This type of query might be rare, and most of us know to avoid really large JOINs because they do not scale. But sooner or later we run into queries where the optimizer picks up the feature, and not always for the better. Here is an excerpt from an actual production query I recently worked on. It shows the good execution plan versus the one using hash joins.

The two EXPLAIN outputs differ only in this row. Without a hash join, the optimizer uses the PRIMARY key (eq_ref, 1 row) and the query executes normally. With hash joins enabled, the optimizer decided a hash join was the better choice, even though it means scanning an index with an estimated 715 million rows:

...
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
         type: eq_ref
possible_keys: PRIMARY,r_type_id_ix,r_id_r_type_id_dt_ix
          key: PRIMARY
      key_len: 4
          ref: db.x.p_id
         rows: 1
        Extra: Using where
...
...
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
         type: index
possible_keys: PRIMARY,r_type_id_ix,r_id_r_type_id_dt_ix
          key: r_id_r_type_id_dt_ix
      key_len: 18
          ref: NULL
         rows: 715568233
        Extra: Using where; Using index; Using join buffer (Hash Join Inner table t)
...

Needless to say, it caused problems. Unfortunately, a bug exists in Aurora 1.16 where hash joins (which are enabled by default) cannot be turned off selectively from the parameter group. If you try this, you get the error “Error saving: Invalid parameter value: hash_join=off for: optimizer_switch”. The only way to disable the feature is to turn off lab_mode, which requires an instance restart. An alternative is to simply issue SET optimizer_switch='hash_join=off'; from the application, especially if you rely on some of the other lab mode features in Aurora.
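As a rough idea of what the application-side workaround could look like, here is a Python sketch that issues the SET statement right after a connection is opened. It assumes a DB-API 2.0 style connection object; the function disable_hash_join and the recording stand-in classes are hypothetical and exist only so the example runs without a database.

```python
def disable_hash_join(connection):
    """Turn off hash joins for the current session only."""
    cursor = connection.cursor()
    try:
        cursor.execute("SET optimizer_switch='hash_join=off'")
    finally:
        cursor.close()


class RecordingCursor:
    """Stand-in cursor that records statements instead of talking to a server."""

    def __init__(self, log):
        self.log = log

    def execute(self, sql):
        self.log.append(sql)

    def close(self):
        pass


class RecordingConnection:
    """Stand-in for a real driver connection (hypothetical, for the demo)."""

    def __init__(self):
        self.statements = []

    def cursor(self):
        return RecordingCursor(self.statements)


conn = RecordingConnection()
disable_hash_join(conn)
print(conn.statements)
```

Because SET without GLOBAL only affects the current session, the call has to run on every new connection, for example in a connection pool's on-connect hook.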

To summarize, the new hash join feature is a great addition. But as it’s a lab feature, be careful when upgrading!
