Aug
31
2016
--

Salesforce shares face plant despite beating analyst expectations

salesforce Salesforce earnings came out today! They’re not great, either, and it looks like a weak outlook for the company’s third quarter is doing some damage to its shares, which were down as much as 8 percent. For a company that literally defined the phrase “software as a service” — basically, running your business online — and one that’s had a decent year… Read More

Aug
31
2016
--

Glint raises $27 million to stop solid employees from bailing

Screen Shot 2016-08-31 at 2.44.04 PM Redwood City-based Glint has raised $27 million in a Series C round of venture funding for software that helps companies figure out how their teams are experiencing problems, or why and when solid employees may leave, undesirably. Glint replaces the traditional 360-review tools used by corporate HR teams and executives with short, anonymous surveys sent out once in a while to specific groups… Read More

Aug
31
2016
--

Facebook open sources Zstandard compression algorithm and MyRocks storage engine

Illustration and Painting Today, Facebook is releasing its Zstandard compression algorithm into the wild as open source. The lossless compression technology is aimed at replacing existing libraries like zlib that are powered by the outdated Deflate compression algorithm. In addition to Zstandard, Facebook is also dropping its MyRocks storage engine as open source. MyRocks is currently being used by Facebook to… Read More

Aug
30
2016
--

Webinar Wednesday, August 31 – MongoDB Security: A Practical Approach

Percona MySQL and MongoDB Webinars

MongoDB SecurityPlease join David Murphy as he presents a webinar Wednesday, August 31 at 10 am PDT (UTC-7) on MongoDB Security: A Practical Approach.

Register
This webinar will discuss the many features and options available in the MongoDB community to help secure your database environment. First, we will cover how these features work and how to fill in known gaps. Next, we will look at the enterprise-type features shared by Percona Server for MongoDB and MongoDB Enterprise. Finally, we will examine some disk and network designs that can ensure your security out of the gate – you don’t want to read about how your MongoDB database leaked hundreds of gigs of data on someone’s security blog!

We will cover the following topics:

  • Using SSL for all things
  • Limiting network access
  • Custom roles
  • The missing true SUPER user
  • Wildcarding databases and collections
  • Adding specific actions to a normal role
  • LDAP
  • Auditing
  • Disk encryption
  • Network designs

Register for the webinar here.

MongoDB SecurityDavid Murphy, MongoDB Practice Manager
David joined Percona in October 2015 as Practice Manager for MongoDB. Prior to that, David was on the ObjectRocket by Rackspace team as the Lead DBA. With the growth involved with any recently acquired startup, David’s role covered a wide range from evangelism, research, run book development, knowledgebase design, consulting, technical account management, mentoring and much more. Prior to the world of MongoDB had been a MySQL and NoSQL architect at Electronic Arts working with some of the largest titles in the world like FIFA, SimCity, and Battle Field providing tuning, design, and technology choice responsibilities. David maintains an active interest in database speaking and exploring new technologies.
Aug
30
2016
--

MySQL Sharding with ProxySQL

MySQL Sharding with ProxySQL

MySQL Sharding with ProxySQLThis article demonstrates how MySQL sharding with ProxySQL works.

Recently a colleague of mine asked me to provide a simple example on how ProxySQL performs sharding.

In response, I’m writing this short tutorial in the hope it will illustrate ProxySQL’s sharding functionalities, and help people out there better understand how to use it.

ProxySQL is a very powerful platform that allows us to manipulate and manage our connections and queries in a simple but effective way. This article shows you how.

Before starting let’s clarify some basic concepts.

  • ProxySQL organizes its internal set of servers in Host Groups (HG), and each HG can be associated with users and Query Rules (QR)
  • Each QR can be final (apply = 1) or = let ProxySQL continue to parse other QRs
  • A QR can be a rewrite action, be a simple match, have a specific target HG, or be generic
  • QRs are defined using regex

You can see QRs as a sequence of filters and transformations that you can arrange as you like.

These simple basic rules give us enormous flexibility. They allow us to create very simple actions like a simple query re-write, or very complex chains that could see dozens of QR concatenated. Documentation can be found here.

The information related to HGs or QRs is easily accessible using the ProxySQL administrator interface, in the tables mysql_servers, mysql_query_rules and stats.stats_mysql_query_rules. The last one allows us to evaluate if and how the rule(s) is used.

With regards to sharding, what can ProxySQL do to help us achieve what we need (in a relatively easy way)? Some people/companies include sharding logic in the application, use multiple connections to reach the different targets, or have some logic to split the load across several schemas/tables. ProxySQL allows us to simplify the way connectivity and query distribution is supposed to work reading data in the query or accepting HINTS.

No matter what the requirements, the sharding exercise can be summarized in a few different categories.

  • By splitting the data inside the same container (like having a shard by State where each State is a schema)
  • By physical data location (this can have multiple MySQL servers in the same room, as well as having them geographically distributed)
  • A combination of the two, where I do split by State using a dedicated server, and again split by schema/table by whatever (say by gender)

In the following examples, I show how to use ProxySQL to cover the three different scenarios defined above (and a bit more).

The example below will report text from the Admin ProxySQL interface and the MySQL console. I will mark each one as follows:

  • Mc for MySQL console
  • Pa for ProxySQL Admin

Please note that the MySQL console MUST use the -c flag to pass the comments in the query. This is because the default behavior in the MySQL console is to remove the comments.

I am going to illustrate procedures that you can replicate on your laptop, and when possible I will mention a real implementation. This because I want you to directly test the ProxySQL functionalities.

For the example described below I have a PrxySQL v1.2.2 that is going to become the master in few days. You can download it from:

git clone https://github.com/sysown/proxysql.git
git checkout v1.2.2

Then to compile:

cd <path to proxy source code>
make
make install

If you need full instructions on how to install and configure ProxySQL, read here and here.

Finally, you need to have the WORLD test DB loaded. WORLD test DB can be found here.

Shard inside the same MySQL Server using three different schemas split by continent

Obviously, you can have any number of shards and relative schemas. What is relevant here is demonstrating how traffic gets redirected to different targets (schemas), maintaining the same structure (tables), by discriminating the target based on some relevant information in the data or pass by the application.

OK, let us roll the ball.

[Mc]
+---------------+-------------+
| Continent     | count(Code) |
+---------------+-------------+
| Asia          |          51 | <--
| Europe        |          46 | <--
| North America |          37 |
| Africa        |          58 | <--
| Oceania       |          28 |
| Antarctica    |           5 |
| South America |          14 |
+---------------+-------------+

For this exercise, I will use three hosts in replica.

To summarize, I will need:

  • Three hosts: 192.168.1.[5-6-7]
  • Three schemas: Continent X + world schema
  • One user : user_shardRW
  • Three hostgroups: 10, 20, 30 (for future use)

First, we will create the schemas Asia, Africa, Europe:

[Mc]
Create schema [Asia|Europe|North_America|Africa];
create table Asia.City as select a.* from  world.City a join Country on a.CountryCode = Country.code where Continent='Asia' ;
create table Europe.City as select a.* from  world.City a join Country on a.CountryCode = Country.code where Continent='Europe' ;
create table Africa.City as select a.* from  world.City a join Country on a.CountryCode = Country.code where Continent='Africa' ;
create table North_America.City as select a.* from  world.City a join Country on a.CountryCode = Country.code where Continent='North America' ;
create table Asia.Country as select * from  world.Country where Continent='Asia' ;
create table Europe.Country as select * from  world.Country where Continent='Europe' ;
create table Africa.Country as select * from  world.Country  where Continent='Africa' ;
create table North_America.Country as select * from  world.Country where Continent='North America' ;

Now, create the user

grant all on *.* to user_shardRW@'%' identified by 'test';

Now let us start to configure the ProxySQL:

[Pa]
insert into mysql_users (username,password,active,default_hostgroup,default_schema) values ('user_shardRW','test',1,10,'test_shard1');
LOAD MYSQL USERS TO RUNTIME;SAVE MYSQL USERS TO DISK;
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.5',10,3306,100);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.6',20,3306,100);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.7',30,3306,100);
LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;

With this we have defined the user, the servers and the host groups.

Let us start to define the logic with the query rules:

[Pa]
delete from mysql_query_rules where rule_id > 30;
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,replace_pattern,apply) VALUES (31,1,'user_shardRW',"^SELECT\s*(.*)\s*from\s*world.(\S*)\s(.*).*Continent='(\S*)'\s*(\s*.*)$","SELECT \1 from \4.\2 WHERE 1=1 \5",1);
LOAD MYSQL QUERY RULES TO RUNTIME;SAVE MYSQL QUERY RULES TO DISK;

I am now going to query the master (or a single node), but I am expecting ProxySQL to redirect the query to the right shard, catching the value of the continent:

[Mc]
 SELECT name,population from world.City  WHERE Continent='Europe' and CountryCode='ITA' order by population desc limit 1;
+------+------------+
| name | population |
+------+------------+
| Roma |    2643581 |
+------+------------+

You can say: “Hey! You are querying the schema World, of course you get back the correct data.”

This is not what really happened. ProxySQL did not query the schema World, but the schema Europe.

Let’s look at the details:

[Pa]
select * from stats_mysql_query_digest;
Original    :SELECT name,population from world.City  WHERE Continent='Europe' and CountryCode='ITA' order by population desc limit 1;
Transformed :SELECT name,population from Europe.City WHERE ?=? and CountryCode=? order by population desc limit ?

Let me explain what happened.

Rule 31 in ProxySQL will take all the FIELDS we will pass in the query. It will catch the CONTINENT in the WHERE clause, it will take any condition after WHERE and it will reorganize the queries all using the RegEx.

Does this work for any table in the sharded schemas? Of course it does.

A query like:

SELECT name,population from world.Country WHERE Continent='Asia' ;

Will be transformed into:

SELECT name,population from Asia.Country WHERE ?=?

[Mc]
+----------------------+------------+
| name                 | population |
+----------------------+------------+
| Afghanistan          |   22720000 |
| United Arab Emirates |    2441000 |
| Armenia              |    3520000 |
<snip ...>
| Vietnam              |   79832000 |
| Yemen                |   18112000 |
+----------------------+------------+

Another possible approach is to instruct ProxySQL to shard is to pass a hint inside a comment. Let see how.

First let me disable the rule I just inserted. This is not really needed, but we’ll do it so you can see how. ?

[Pa]
mysql> update mysql_query_rules set active=0 where rule_id=31;
Query OK, 1 row affected (0.00 sec)
mysql> LOAD MYSQL QUERY RULES TO RUNTIME;SAVE MYSQL QUERY RULES TO DISK;
Query OK, 0 rows affected (0.00 sec)

Done.

Now what I want is for *ANY* query that contains the comment /* continent=X */ to go to the continent X schema, same server.

To do so, I instruct ProxySQL to replace any reference to the world schema inside the query I am going to submit.

[Pa]
delete from mysql_query_rules where rule_id in (31,33,34,35,36);
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,replace_pattern,apply,FlagOUT,FlagIN) VALUES (31,1,'user_shardRW',"\S*\s*\/\*\s*continent=.*Asia\s*\*.*",null,0,23,0);
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,replace_pattern,apply,FlagIN,FlagOUT) VALUES (32,1,'user_shardRW','world.','Asia.',0,23,23);
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,replace_pattern,apply,FlagOUT,FlagIN) VALUES (33,1,'user_shardRW',"\S*\s*\/\*\s*continent=.*Europe\s*\*.*",null,0,25,0);
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,replace_pattern,apply,FlagIN,FlagOUT) VALUES (34,1,'user_shardRW','world.','Europe.',0,25,25);
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,replace_pattern,apply,FlagOUT,FlagIN) VALUES (35,1,'user_shardRW',"\S*\s*\/\*\s*continent=.*Africa\s*\*.*",null,0,24,0);
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,replace_pattern,apply,FlagIN,FlagOUT) VALUES (36,1,'user_shardRW','world.','Africa.',0,24,24);
LOAD MYSQL QUERY RULES TO RUNTIME;SAVE MYSQL QUERY RULES TO DISK;

How does this work?

I have defined two concatenated rules. The first captures the incoming query containing the desired value (like continent = Asia). If the match is there, ProxySQL exits that action, but while doing so it will read the Apply field. If Apply is 0, it will read the FlagOUT value. At this point it will go to the first rule (in sequence) that has the value of FlagIN equal to the FlagOUT.

The second rule gets the request and will replace the value of world with the one I have defined. In short, it replaces whatever is in the match_pattern with the value that is in the replace_pattern.

Now ProxySQL implements the Re2 Google library for RegEx. Re2 is very fast but has some limitations, like it does NOT support (at the time of the writing) the flag option g. In other words, if I have a select with many tables, and as such several “world”, Re2 will replace ONLY the first instance.

As such, a query like:

Select /* continent=Europe */ * from world.Country join world.City on world.City.CountryCode=world.Country.Code where Country.code='ITA' ;

Will be transformed into:

Select /* continent=Europe */ * from Europe.Country join world.City on world.City.CountryCode=world.Country.Code where Country.code='ITA' ;

And fail.

The other day, Rene and I were discussing how to solve this given the lack of implementation in Re2. Finally, we opted for recursive actions.

What does this mean? It means that ProxySQL from v1.2.2 now has a new functionality that allows recursive calls to a Query Rule. The maximum number of iterations that ProxySQL can run is managed by the option (global variable) mysql-query_processor_iterations. Mysql-query_processor_iterations define how many operations a query process can execute as whole (from start to end).

This new implementation allows us to reference a Query Rule to itself to be executed multiple times.

If you go back you will notice that QR 34 has FlagIN and FlagOUT pointing to the same value of 25 and Apply =0. This brings ProxySQL to recursively call rule 34 until it changes ALL the values of the word world.

The result is the following:

[Mc]
Select /* continent=Europe */ Code, City.Name, City.population  from world.Country join world.City on world.City.CountryCode=world.Country.Code where City.population > 10000 group by Name order by City.Population desc limit 5;
+------+---------------+------------+
| Code | Name          | population |
+------+---------------+------------+
| RUS  | Moscow        |    8389200 |
| GBR  | London        |    7285000 |
| RUS  | St Petersburg |    4694000 |
| DEU  | Berlin        |    3386667 |
| ESP  | Madrid        |    2879052 |
+------+---------------+------------+

You can see ProxySQL internal information using the following queries:

[Pa]
 select active,hits, mysql_query_rules.rule_id, match_digest, match_pattern, replace_pattern, cache_ttl, apply,flagIn,flagOUT FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules ORDER BY mysql_query_rules.rule_id;
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+
| active | hits | rule_id | match_digest        | match_pattern                          | replace_pattern | cache_ttl | apply | flagIN | flagOUT |
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+
| 1      | 1    | 33      | NULL                | \S*\s*\/\*\s*continent=.*Europe\s*\*.* | NULL            | NULL      | 0     | 0      | 25      |
| 1      | 4    | 34      | NULL                | world.                                 | Europe.         | NULL      | 0     | 25     | 25      |
| 1      | 0    | 35      | NULL                | \S*\s*\/\*\s*continent=.*Africa\s*\*.* | NULL            | NULL      | 0     | 0      | 24      |
| 1      | 0    | 36      | NULL                | world.                                 | Africa.         | NULL      | 0     | 24     | 24      |
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+

And:

[Pa]
select * from stats_mysql_query_digest;
<snip and taking only digest_text>
Select Code, City.Name, City.population from Europe.Country join Europe.City on Europe.City.CountryCode=Europe.Country.Code where City.population > ? group by Name order by City.Population desc limit ?

As you can see ProxySQL has nicely replaced the word world in the query to Europe, and it ran Query Rule 34 four times (hits).

This is obviously working for Insert/Update/Delete as well.

Queries like:

insert into  /* continent=Europe */  world.City values(999999,'AAAAAAA','ITA','ROMA',0) ;

Will be transformed into:

[Pa]
select digest_text from stats_mysql_query_digest;
+-------------------------------------------+
| digest_text                               |
+-------------------------------------------+
| insert into Europe.City values(?,?,?,?,?) |
+-------------------------------------------+

And executed only on the desired schema.

Sharding by host

Using hint

How can I shard and redirect the queries to a host (instead of a schema)? This is even easier! ?

The main point is that whatever matches the rule should go to a defined HG. No rewrite imply, which means less work.

So how this is done? As before, I have three NODES: 192.168.1.[5-6-7]. For this example, I will use world DB (no continent schema), distributed in each node, and I will retrieve the node bound to the IP to be sure I am going to the right place.

I instruct ProxySQL to send my query by using a HINT to a specific host. I choose the hint “shard_host_HG”, and I am going to inject it in the query as a comment.

As such the Query Rules will be:

[Pa]
delete from mysql_query_rules where rule_id in (40,41,42, 10,11,12);
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,destination_hostgroup,apply) VALUES (10,1,'user_shardRW',"\/\*\s*shard_host_HG=.*Europe\s*\*.",10,0);
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,destination_hostgroup,apply) VALUES (11,1,'user_shardRW',"\/\*\s*shard_host_HG=.*Asia\s*\*.",20,0);
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,destination_hostgroup,apply) VALUES (12,1,'user_shardRW',"\/\*\s*shard_host_HG=.*Africa\s*\*.",30,0);
LOAD MYSQL QUERY RULES TO RUNTIME;SAVE MYSQL QUERY RULES TO DISK;

While the queries I am going to test are:

[Mc]
Select /* shard_host_HG=Europe */ City.Name, City.Population from world.Country join world.City on world.City.CountryCode=world.Country.Code where Country.code='ITA' limit 5; SELECT * /* shard_host_HG=Europe */ from information_schema.GLOBAL_VARIABLES where variable_name like 'bind%';
Select /* shard_host_HG=Asia */ City.Name, City.Population from world.Country join world.City on world.City.CountryCode=world.Country.Code where Country.code='IND' limit 5; SELECT * /* shard_host_HG=Asia */ from information_schema.GLOBAL_VARIABLES where variable_name like 'bind%';
Select /* shard_host_HG=Africa */ City.Name, City.Population from world.Country join world.City on world.City.CountryCode=world.Country.Code where Country.code='ETH' limit 5; SELECT * /* shard_host_HG=Africa */ from information_schema.GLOBAL_VARIABLES where variable_name like 'bind%';

Running the query for Africa, I will get:

[Mc]
Select /* shard_host_HG=Africa */ City.Name, City.Population from world.Country join world.City on world.City.CountryCode=world.Country.Code where Country.code='ETH' limit 5; SELECT * /* shard_host_HG=Africa */ from information_schema.GLOBAL_VARIABLES where variable_name like 'bind%';
+-------------+------------+
| Name        | Population |
+-------------+------------+
| Addis Abeba |    2495000 |
| Dire Dawa   |     164851 |
| Nazret      |     127842 |
| Gonder      |     112249 |
| Dese        |      97314 |
+-------------+------------+
+---------------+----------------+
| VARIABLE_NAME | VARIABLE_VALUE |
+---------------+----------------+
| BIND_ADDRESS  | 192.168.1.7    |
+---------------+----------------+

That will give me:

[Pa]
select active,hits, mysql_query_rules.rule_id, match_digest, match_pattern, replace_pattern, cache_ttl, apply,flagIn,flagOUT FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules ORDER BY mysql_query_rules.rule_id;
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+
| active | hits | rule_id | match_digest        | match_pattern                          | replace_pattern | cache_ttl | apply | flagIN | flagOUT |
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+
| 1      | 0    | 40      | NULL                | \/\*\s*shard_host_HG=.*Europe\s*\*.    | NULL            | NULL      | 0     | 0      | 0       |
| 1      | 0    | 41      | NULL                | \/\*\s*shard_host_HG=.*Asia\s*\*.      | NULL            | NULL      | 0     | 0      | 0       |
| 1      | 2    | 42      | NULL                | \/\*\s*shard_host_HG=.*Africa\s*\*.    | NULL            | NULL      | 0     | 0      | 0       | <-- Note the HITS (2 as the run queries)
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+

In this example, we have NO replace_patter. This is only a matching and redirecting Rule, where the destination HG is defined in the value of the destination_hostgroup attribute while inserting. In the case for Africa, it is HG 30.

The server in HG 30 is:

[Pa]
select hostgroup_id,hostname,port,status from mysql_servers ;
+--------------+-------------+------+--------+
| hostgroup_id | hostname    | port | status |
+--------------+-------------+------+--------+
| 10           | 192.168.1.5 | 3306 | ONLINE |
| 20           | 192.168.1.6 | 3306 | ONLINE |
| 30           | 192.168.1.7 | 3306 | ONLINE | <---
+--------------+-------------+------+--------+

This perfectly matches our returned value.

You can try by your own the other two continents.

Using destination_hostgroup

Another way to assign a query’s final host is to use the the destination_hostgroup, set the Schema_name attribute and use the use schema syntax in the query.

For example:

[Pa]
INSERT INTO mysql_query_rules (active,schemaname,destination_hostgroup,apply) VALUES
(1, 'shard00', 1, 1), (1, 'shard01', 1, 1), (1, 'shard03', 1, 1),
(1, 'shard04', 2, 1), (1, 'shard06', 2, 1), (1, 'shard06', 2, 1),
(1, 'shard07', 3, 1), (1, 'shard08', 3, 1), (1, 'shard09', 3, 1);

And then in the query do something like :

use shard02; Select * from tablex;

I mention this method because it is currently one of the most common in large companies using SHARDING.

But it is not safe, because it relays on the fact the query will be executed in the desired HG. The risk of error is high.

Just think if a query doing join against a specified SHARD:

use shard01; Select * from tablex join shard03 on tablex.id = shard03.tabley.id;

This will probably generate an error because shard03 is probably NOT present on the host containing shard01.

As such this approach can be used ONLY when you are 100% sure of what you are doing, and when you are sure NO query will have an explicit schema declaration.

Shard by host and by schema

Finally, is obviously possible to combine the two approaches and shard by host with only a subset of schemas.

To do so, let’s use all three nodes and have the schema distribution as follow:

  • Europe on Server 192.168.1.5 -> HG 10
  • Asia on Server 192.168.1.6 -> HG 20
  • Africa on Server 192.168.1.7 -> HG 30

I have already set the query rules using HINT, so what I have to do is to use them BOTH to combine the operations:

[Mc]
Select /* shard_host_HG=Asia */ /* continent=Asia */  City.Name, City.Population from world.Country join world.City on world.City.CountryCode=world.Country.Code where Country.code='IND' limit 5; SELECT * /* shard_host_HG=Asia */ from information_schema.GLOBAL_VARIABLES where variable_name like 'bind%';
+--------------------+------------+
| Name               | Population |
+--------------------+------------+
| Mumbai (Bombay)    |   10500000 |
| Delhi              |    7206704 |
| Calcutta [Kolkata] |    4399819 |
| Chennai (Madras)   |    3841396 |
| Hyderabad          |    2964638 |
+--------------------+------------+
5 rows in set (0.00 sec)
+---------------+----------------+
| VARIABLE_NAME | VARIABLE_VALUE |
+---------------+----------------+
| BIND_ADDRESS  | 192.168.1.6    |
+---------------+----------------+
1 row in set (0.01 sec)

[Pa]
mysql> select digest_text from stats_mysql_query_digest;
+--------------------------------------------------------------------------------------------------------------------------------------------+
| digest_text                                                                                                                                |
+--------------------------------------------------------------------------------------------------------------------------------------------+
| SELECT * from information_schema.GLOBAL_VARIABLES where variable_name like ?                                                               |
| Select City.Name, City.Population from Asia.Country join Asia.City on Asia.City.CountryCode=Asia.Country.Code where Country.code=? limit ? |
+--------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
mysql> select active,hits, mysql_query_rules.rule_id, match_digest, match_pattern, replace_pattern, cache_ttl, apply,flagIn,flagOUT FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules ORDER BY mysql_query_rules.rule_id;
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+
| active | hits | rule_id | match_digest        | match_pattern                          | replace_pattern | cache_ttl | apply | flagIN | flagOUT |
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+
| 1      | 0    | 10      | NULL                | \/\*\s*shard_host_HG=.*Europe\s*\*.    | NULL            | NULL      | 0     | 0      | NULL    |
| 1      | 2    | 11      | NULL                | \/\*\s*shard_host_HG=.*Asia\s*\*.      | NULL            | NULL      | 0     | 0      | NULL    |
| 1      | 0    | 12      | NULL                | \/\*\s*shard_host_HG=.*Africa\s*\*.    | NULL            | NULL      | 0     | 0      | NULL    |
| 1      | 0    | 13      | NULL                | NULL                                   | NULL            | NULL      | 0     | 0      | 0       |
| 1      | 1    | 31      | NULL                | \S*\s*\/\*\s*continent=.*Asia\s*\*.*   | NULL            | NULL      | 0     | 0      | 23      |
| 1      | 4    | 32      | NULL                | world.                                 | Asia.           | NULL      | 0     | 23     | 23      |
| 1      | 0    | 33      | NULL                | \S*\s*\/\*\s*continent=.*Europe\s*\*.* | NULL            | NULL      | 0     | 0      | 25      |
| 1      | 0    | 34      | NULL                | world.                                 | Europe.         | NULL      | 0     | 25     | 25      |
| 1      | 0    | 35      | NULL                | \S*\s*\/\*\s*continent=.*Africa\s*\*.* | NULL            | NULL      | 0     | 0      | 24      |
| 1      | 0    | 36      | NULL                | world.                                 | Africa.         | NULL      | 0     | 24     | 24      |
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+

As you can see rule 11 has two HITS, which means my queries will go to the associated HG. But given that the Apply for rule 11 is =0, ProxySQL will first continue to process the Query Rules.

As such it will also transform the queries for rules 31 and 32, each one having the expected number of hits (one the first and the second four because of the loop).

That was all we had to do to perform a perfect two layer sharding in ProxySQL.

Conclusion

ProxySQL allows the user to access data distributed by shards in a very simple way. The query rules that follow the consolidate RegEx pattern, in conjunction with the possibility to concatenate rules and the Host Group approach definition give us huge flexibility with relative simplicity.

References:

https://github.com/sysown/proxysql/tree/v1.2.2/doc
https://github.com/google/re2/wiki/Syntax
http://www.proxysql.com/2015/09/proxysql-tutorial-setup-in-mysql.html
https://github.com/sysown/proxysql/blob/v1.2.2/doc/configuration_howto.md
https://github.com/sysown/proxysql/blob/v1.2.2/INSTALL.md
https://dev.mysql.com/doc/index-other.html

Credits

It is obvious that I need to acknowledge and kudos the work Rene’ Cannao is doing to make ProxySQL a solid, fast and flexible product. I also should mention that I work with him very often (more often than he likes), asking him for fixes and discussing optimization strategies. Requests that he satisfies with surprising speed and efficiency.

Aug
30
2016
--

Percona Live Europe Discounted Pricing and Community Dinner!

Get your Percona Live Europe discounted tickets now, and sign up for the community dinner.

percona live europeThe countdown is on for the annual Percona Live Europe Open Source Database Conference! This year the conference will be taking place in the great city of Amsterdam October 3-5. This three-day conference will focus on the latest trends, news and best practices in the MySQL, MongoDB, PostgreSQL and other open source databases, while tackling subjects such as analytics, architecture and design, security, operations, scalability and performance. Percona Live provides in-depth discussions for your high-availability, IoT, cloud, big data and other changing business needs.

With breakout sessions, tutorial sessions and keynote speakers, there will certainly be no lack of content.
Advanced Rate Registration ENDS September 5, so make sure to register now to secure the best price possible.

As it is a Percona Live Europe conference, there will certainly be no lack of FUN either!!!!

percona live europeAs tradition holds, there will be a Community Dinner. Tuesday night, October 4, Percona Live Diamond Sponsor Booking.com hosts the Community Dinner at their very own headquarters located in historic Rembrandt Square in the heart of the city. After breakout sessions conclude, attendees are picked up right outside of the venue and taken to booking.com’s headquarters by canal boats! This gives all attendees the opportunity to play “tourist” while viewing the beauty of Amsterdam from the water. Attendees are dropped off right next to Booking.com’s office (return trip isn’t included)! The Lightning Talks for this year’s conference will be featured at the dinner.

Come and show your support for the community while enjoying dinner and drinks! The first 50 people registered for the dinner get in the doors for €10 (after that the price goes to €15 euro). Space is limited so make sure to sign up ASAP!

So don’t forget, register for the conference and sign up for the community dinner before space is gone! See you in Amsterdam!

Aug
30
2016
--

Okta extends identity management to APIs

Man and woman back to back with smart phones. Behind them on chalk board is a collection symbols representing different apps. Okta announced it was bringing identity management to APIs today at its Oktane customer conference in Las Vegas.
For a long time, Okta was about connecting people with cloud applications such as ServiceNow, Salesforce or Office 365. A couple of years ago, the company extended that capability to enable customers to control the devices where employees could access those cloud… Read More

Aug
30
2016
--

VMware update strives to place virtual desktops in modern context

PCs on a grid in front of a city skyline. VMware announced a series of updates to its virtual desktop (VDI) line today at VMworld in Las Vegas. The announcement attempts to set the virtual desktop in a more modern context, by offering customers a range of options in the cloud, on-prem or hybrid. What’s more, it’s trying to reduce the cost by partnering with a variety of hardware vendors from HP to Dell and even the… Read More

Aug
29
2016
--

Alphabet’s David Drummond leaves Uber’s board amid mounting competition

uber-self-driving-car David Drummond, who joined Uber’s board of directors in August 2013, stepped down several weeks ago, Uber has confirmed to TechCrunch. The Wall Street Journal first reported the news. At first blush: no surprise whatsoever. Earlier this month, Uber CEO Travis Kalanick said the company’s first fleet of self-driving cars would begin running in Pittsburgh soon (as soon as this… Read More

Aug
29
2016
--

Percona Live Europe featured talk with Alexander Krasheninnikov — Processing 11 billion events a day with Spark in Badoo

Percona Live Europe Featured Talk

Percona Live Europe Featured TalkWelcome to a new Percona Live Europe featured talk with Percona Live Europe 2016: Amsterdam speakers! In this series of blogs, we’ll highlight some of the speakers that will be at this year’s conference. We’ll also discuss the technologies and outlooks of the speakers themselves. Make sure to read to the end to get a special Percona Live Europe registration bonus!

In this Percona Live Europe featured talk, we’ll meet Alexander Krasheninnikov, Head of Data Team at Badoo. His talk will be on Processing 11 billions events a day with Spark in Badoo. Badoo is one of the world’s largest and fastest growing social networks for meeting new people. I had a chance to speak with Alexander and learn a bit more about the database environment at Badoo:

Percona: Give me a brief history of yourself: how you got into database development, where you work, what you love about it?

Alexander: Currently, I work at Badoo as Head of Data Team. Our team is responsible for providing internal API’s for statistics data collecting and processing.

I started as a developer at Badoo, but the project I am going to cover in my talk lead to creating a separate department.

Percona: Your talk is called “Processing 11 billion events a day with Spark in Badoo.” What were the issues with your environment that led you to Spark? How did Spark solve these needs?

Alexander: When we designed the Unified Data Stream system in Badoo, we’ve extracted several requirements: scalability, fault tolerance and reliability. Altogether, these requirements moved us towards using Hadoop as deep data storage and data processing framework. Our initial implementation was built on top of Scribe + WebHDFS + Hive. But we’ve realized that processing speed and any lag of data delivery is unacceptable (we need near-realtime data processing). One of our BI team mentioned Spark as being significantly faster than Hive in some cases, (especially ones similar to ours). When investigated Spark’s API, we found the Streaming submodule — ideal for our needs. Additionally, this framework allowed us to use some third-party libraries, and write code. We’ve actually created an aggregation framework that follows “divide and conquer” principle. Without Spark, we definitely went way re-inventing lot of things from it.

Percona: Why is tracking the event stream important for your business model? How are you using the data Spark is providing you to reach business goals?

Alexander: The event stream always represents some important business/technical metrics — votes, messages, likes and so on. All this, brought together, forms the “health” of our product. The primary goal of our Spark-based system is to process a heterogeneous event stream one way, and draw charts automatically. We acheived this goal, and now we have hundreds of charts and dozens of developers/analysts/product team members using them. The system also evolved, and now we perform automatic anomaly detection over the event stream. We report strange data behavior to all the interested people.

Percona: What is changing in data use in your businesses model that keeps you awake at night? What tools or features are you looking for to address these issues?

Alexander: As I’ve mentioned before, we have an anomaly detection process for our metrics. If some of our metrics are out of expected bounds, it is treated as being an anomaly, and notification are sent. Also, we have a self-monitoring functionality for the whole system — a small event rate of heartbeats is generated, and processed with two different systems. If those show a significant difference — that defintely keeps me awake at night! ?

Percona: What are looking forward to the most at Percona Live Europe this year?

Alexander: My main interest is distributed open source databases. At Percona Live Europe, I expect to gain a lot of new information from the appropriate conference sections. Particularly, I want to get some knowledge about Yandex ClickHouse, as it looks very promising.

You can read more about how Alexander and Badoo use Spark here: techblog.badoo.com.

Want to find out more about Alexander, Spark and Badoo? Register for Percona Live Europe 2016, and come see his talk Processing 11 billions events a day with Spark in Badoo.

Use the code FeaturedTalk and receive €25 off the current registration price!

Percona Live Europe 2016: Amsterdam is the premier event for the diverse and active open source database community. The conferences have a technical focus with an emphasis on the core topics of MySQL, MongoDB, and other open source databases. Percona live tackles subjects such as analytics, architecture and design, security, operations, scalability and performance. It also provides in-depth discussions for your high-availability, IoT, cloud, big data and other changing business needs. This conference is an opportunity to network with peers and technology professionals by bringing together accomplished DBA’s, system architects and developers from around the world to share their knowledge and experience. All of these people help you learn how to tackle your open source database challenges in a whole new way.

This conference has something for everyone!

Percona Live Europe 2016: Amsterdam is October 3-5 at the Mövenpick Hotel Amsterdam City Centre.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com