Mar
16
2020
--

To make locks touchless, Proxy bluetooth ID raises $42M

We need to go hands-off in the age of coronavirus. That means touching fewer doors, elevators, and sign-in iPads. But once a building is using phone-based identity for security, there’s opportunities to speed up access to WIFI networks and printers, or personalize conference rooms and video call set-ups. Keyless office entry startup Proxy wants to deliver all of this while keeping your phone in your pocket.

The door is just a starting point” Proxy co-founder and CEO Denis Mars tells me. “We’re . . . empowering a movement to take back control of our privacy, our sense of self, our humanity, our individuality.”

With the contagion concerns and security risks of people rubbing dirty, cloneable, stealable key cards against their office doors, investors see big potential in Proxy. Today it’s announcing here a $42 million Series B led by Scale Venture Partners with participation from former funders Kleiner Perkins and Y Combinator plus new additions Silicon Valley Bank and West Ventures.

The raise brings Proxy to $58.8 million in funding so it can staff up at offices across the world and speed up deployments of its door sensor hardware and access control software. “We’re spread thin” says Mars. “Part of this funding is to try to grow up as quickly as possible and not grow for growth sake. We’re making sure we’re secure, meeting all the privacy requirements.”

How does Proxy work? Employers get their staff to install an app that knows their identity within the company, including when and where they’re allowed entry. Buildings install Proxy’s signal readers, which can either integrate with existing access control software or the startup’s own management dashboard.

Employees can then open doors, elevators, turnstiles, and garages with a Bluetooth low-energy signal without having to even take their phone out. Bosses can also opt to require a facial scan or fingerprint or a wave of the phone near the sensor. Existing keycards and fobs still work with Proxy’s Pro readers. Proxy costs about $300 to $350 per reader, plus installation and a $30 per month per reader subscription to its management software.

Now the company is expanding access to devices once you’re already in the building thanks to its SDK and APIs. Wifi router-makers are starting to pre-provision their hardware to automatically connect the phones of employees or temporarily allow registered guests with Proxy installed — no need for passwords written on whiteboards. Its new Nano sensors can also be hooked up to printers and vending machines to verify access or charge expense accounts. And food delivery companies can add the Proxy SDK so couriers can be granted the momentary ability to open doors when they arrive with lunch.

Rather than just indiscriminately beaming your identity out into the world, Proxy uses tokenized credentials so only its sensors know who you are. Users have to approve of new networks’ ability to read their tokens, Proxy has SOC-2 security audit certification, and complies with GDPR. “We feel very strongly about where the biometrics are stored . . . they should stay on your phone” says Mars.

Yet despite integrating with the technology for two-factor entry unlocks, Mars says “We’re not big fans of facial recognition. You don’t want every random company having your face in their database. The face becomes the password you were supposed to change every 30 days.”

Keeping your data and identity safe as we see an explosion of Internet Of Things devices was actually the impetus for starting Proxy. Mars had sold his teleconferencing startup Bitplay to Jive Software where he met his eventually co-founder Simon Ratner, who’d joined after his video annotation startup  Omnisio was acquired by YouTube. Mars was frustrated about every IoT lightbulb and appliance wanting him to download an app, set up a profile, and give it his data.

The duo founded Proxy in 2016 as a universal identity signal. Today it has over 60 customers. While other apps want you to constantly open them, Proxy’s purpose is to work silently in the background and make people more productive. “We believe the most important technologies in the world don’t seek your attention. They work for you, they empower you, and they get out of the way so you can focus your attention on what matters most — living your life.”

Now Proxy could actually help save lives. “The nature of our product is contactless interactions in commercial buildings and workplaces so there’s a bit of an unintended benefit that helps prevent the spread of the virus” Mars explains. “We have seen an uptick in customers starting to set doors and other experiences in longer-range hands-free mode so that users can walk up to an automated door and not have to touch the handles or badge/reader every time.”

The big challenge facing Proxy is maintaining security and dependability since it’s a mission-critical business. A bug or outage could potentially lock employees out of their workplace (when they eventually return from quarantine). It will have to keep hackers out of employee files. Proxy needs to stay ahead of access control incumbents like ADT and HID as well as smaller direct competitors like $10 million-funded Nexkey and $28 million-funded Openpath.

Luckily, Proxy has found a powerful growth flywheel. First an office in a big building gets set up, then they convince the real estate manager to equip the lobby’s turnstiles and elevators with Proxy. Other tenants in the building start to use it, so they buy Proxy for their office. Then they get their offices in other cities on board…starting the flywheel again. That’s why Proxy is doubling down on sales to commercial real estate owners.

The question is when Proxy will start knocking on consumers’ doors. While leveling up into the enterprise access control software business might be tough for home smartlock companies like August, Proxy could go down market if it built more physical lock hardware. Perhaps we’ll start to get smart homes that know who’s home, and stop having to carry pointy metal sticks in our pockets.

Mar
27
2019
--

Proxy raises $13.6M to unlock anything with Bluetooth identity

You know how kings used to have trumpeters heralding their arrival wherever they went? Proxy wants to do that with Bluetooth. The startup lets you instantly unlock office doors and reserve meeting rooms using Bluetooth Low Energy signal. You never even have to pull out your phone or open an app. But Proxy is gearing up to build an entire Bluetooth identity layer for the world that could invisibly hover around its users. That could allow devices around the workplace and beyond to instantly recognize your credentials and preferences to sign you into teleconferences, pay for public transit or ask the barista for your usual.

Today, Proxy emerges from stealth after piloting its keyless, badgeless office entry tech with 50 companies. It’s raised a $13.6 million Series A round led by Kleiner Perkins to turn your phone into your skeleton key. “The door is a forcing function to solve all the hard problems — everything from safety to reliability to the experience to privacy,” says Proxy co-founder and CEO Denis Mars. “If you’re gonna do this, it’s gonna have to work right, and especially if you’re going to do this in the workplace with enterprises where there’s no room to fix it.”

But rather than creepily trying to capitalize on your data, Proxy believes you should own and control it. Each interaction is powered by an encrypted one-time token so you’re not just beaming your unprotected information out into the universe. “I’ve been really worried about how the internet world spills over to the physical world. Cookies are everywhere with no control. What’s the future going to be like? Are we going to be tracked everywhere or is there a better way?” He figured the best path to the destiny he wanted was to build it himself.

Mars and his co-founder Simon Ratner, both Australian, have been best buddies for 10 years. Ratner co-founded a video annotation startup called Omnisio that was acquired by YouTube, while Mars co-founded teleconferencing company Bitplay, which was bought by Jive Software. Ratner ended up joining Jive where the pair began plotting a new startup. “We asked ourselves what we wanted to do with the next 10 or 20 years of our lives. We both had kids and it changed our perspective. What’s meaningful that’s worth working on for a long time?”

They decided to fix a real problem while also addressing their privacy concerns. As he experimented with Internet of Things devices, Mars found every fridge and light bulb wanted you to download an app, set up a profile, enter your password and then hit a button to make something happen. He became convinced this couldn’t scale and we’d need a hands-free way to tell computers who we are. The idea for Proxy emerged. Mars wanted to know, “Can we create this universal signal that anything can pick up?”

Most offices already have infrastructure for badge-based RFID entry. The problem is that employees often forget their badges, waste time fumbling to scan them and don’t get additional value from the system elsewhere.

So rather than re-invent the wheel, Proxy integrates with existing access control systems at offices. It just replaces your cards with an app authorized to constantly emit a Bluetooth Low Energy signal with an encrypted identifier of your identity. The signal is picked up by readers that fit onto the existing fixtures. Employees can then just walk up to a door with their phone within about six feet of the sensor and the door pops open. Meanwhile, their bosses can define who can go where using the same software as before, but the user still owns their credentials.

“Data is valuable, but how does the end user benefit? How do we change all that value being stuck with these big tech companies and instead give it to the user?” Mars asks. “We need to make privacy a thing that’s not exploited.”

Mars believes now’s the time for Proxy because phone battery life is finally getting good enough that people aren’t constantly worried about running out of juice. Proxy’s Bluetooth Low Energy signal doesn’t suck up much, and geofencing can wake up the app in case it shuts down while on a long stint away from the office. Proxy has even considered putting inductive charging into its sensors so you could top up until your phone turns back on and you can unlock the door.

Opening office doors isn’t super exciting, though. What comes next is. Proxy is polishing its features that auto-reserve conference rooms when you walk inside, that sign you into your teleconferencing system when you approach the screen and that personalize workstations when you arrive. It’s also working on better office guest check-in to eliminate the annoying iPad sign-in process in the lobby. Next, Mars is eyeing “Your car, your home, all your devices. All these things are going to ask ‘can I sense you and do something useful for you?’ ”

After demoing at Y Combinator, thousands of companies reached out to Proxy, from hotel chains to corporate conglomerates to theme parks. Proxy charges for its hardware, plus a monthly subscription fee per reader. Employees are eager to ditch their keycards, so Proxy sees 90 percent adoption across all its deployments. Customers only churn if something breaks, and it hasn’t lost a customer in two years, Mars claims.

The status quo of keycards, competitors like Openpath and long-standing incumbents all typically only handle doors, while Proxy wants to build an omni-device identity system. Now Proxy has the cash to challenge them, thanks to the $13.6 million from Kleiner, Y Combinator, Coatue Management and strategic investor WeWork. In fact, Proxy now counts WeWork’s headquarters and Dropbox as clients. “With Proxywe can give our employees, contractors and visitors a seamless smartphone-enabled access experience they love, while actually bolstering security,” says Christopher Bauer, Dropbox’s physical security systems architect.

The cash will help answer the question of “How do we turn this into a protocol so we don’t have to build the other side for everyone?,” Mars explains. Proxy will build out SDKs that can be integrated into any device, like a smoke detector that could recognize which people are in the vicinity and report that to first responders. Mars thinks hotel rooms that learn your climate, wake-up call and housekeeping preferences would be a no-brainer. Amazon Go-style autonomous retail could also benefit from the tech.

When asked what keeps him up at night, Mars concludes that “the biggest thing that scares me is that this requires us to be the most trustworthy company on the planet. There is no ‘move fast, break things’ here. It’s ‘move fast, do it right, don’t screw it up.’ “

Feb
03
2017
--

Percona Live Featured Tutorial with Derek Downey, David Turner and René Cannaò — ProxySQL Tutorial

Percona Live Featured Tutorial

Welcome to another post in the series of Percona Live featured tutorial speakers blogs! In these blogs, we’ll highlight some of the tutorial speakers that will be at this year’s Percona Live conference. We’ll also discuss how these tutorials can help you improve your database environment. Make sure to read to the end to get a special Percona Live 2017 registration bonus!

In this Percona Live featured tutorial, we’ll meet Derek Downey (OSDB Practice Advocate, Pythian), David Turner (Storage SRE, Uber) and René Cannaò (MySQL SRE, Dropbox / ProxySQL). Their session is ProxySQL Tutorial. There is a stigma attached to database proxies when it comes to MySQL. This tutorial hopes to blow away that stigma by showing you what can be done with a proxy designed from the ground up to perform. I had a chance to speak with Derek, David and René and learn a bit more about ProxySQL:

Percona: How did you get into database technology? What do you love about it?

Percona Live Featured Tutorial
Derek Downey

Derek: I took a relational database course in college based on Oracle. Set theory and the relational model made a lot of sense to me. After a few years as a web developer at a small company, I transitioned to a hybrid SysAdmin/DBA role and got my first taste of the potential of “the cloud” (and some of the drawbacks).

I really came to understand that data is the lifeblood of any organization, and making sure it is always available through any disaster – from human error to hurricanes – is a unique and exciting challenge.

You should never notice the DBA if they’re doing their job right. There’s not much praise for a DBA on a job well done. But it’s a vital position to keep a company running. And that suits me just fine.

David: I started working for the Advanced Projects Group at the University of Missouri, now known as MOREnet. They were largely responsible for connecting all of the libraries and schools in the state to the Internet. I was initially helping them with their Internet presence as a webmaster. Later they needed help with their databases. I got very excited about working with Oracle at the time, and decided to join that team.

My relationship with MySQL started primarily because the cost of sharding Oracle was so high. Additionally, MySQL’s replication allowed us to use slaves. Oracle’s Dataguard/standby options wouldn’t allow reads from the slaves at that time. Lastly, MySQL was sort of “wild west” fun, since it lacked so many other features that Oracle had long ago. You had to get creative. It has been humbling to see how much innovation has come from the community and how far MySQL has come. And this is only the beginning!

Percona Live Featured Tutorial
René Cannaò

René: My career followed the classic path of a system administrator that ends up becoming a DBA. I used to work for a few companies as webmaster,  and finally as SysAdmin for a web hosting company. I always saw a similar pattern: “the bottleneck is in the database.” Nobody ever knew why the database was the bottleneck. I volunteered to improve the performance of this “unknown system.” Learning was a fun experience, and the result was extremely rewarding. I love understanding how databases operate and their internals. This is the only way to be able to get the maximum performance: “scientia potentia est”!

Percona: Your tutorial is called “ProxySQL Tutorial.” What exactly is ProxySQL, and what does it do?

Derek: I’ll leave it to René, the creator of ProxySQL, to give more detail on exactly what it is. But for a DBA trying to ensure their data is always available, it is a new and exciting tool in our toolbox.

René: ProxySQL is the MySQL data gateway. It’s the Stargate that can inspect, control, transform, manage, and route all traffic between clients and database servers. It builds reliable and fault-tolerant networks. It is a software bridge that empowers MySQL DBAs, built by DBAs for DBAs, allowing them to control all MySQL traffic where previously such traffic could not be controlled either on the client side (normally the developers’ realm) or server side (where there are not enough tools).

Percona Live Featured Tutorial
David Turner

David: Architecturally, ProxySQL is a separate process between the client and the database. Because traffic passes through it, ProxySQL can become many things (three of which got my attention). It can be a multiplexer, a filter, and a replicator.

Multiplexers reduce many signals down to a few. Web servers often open many static connections to MySQL. Since MySQL can only support a limited number of connections before performance suffers, ProxySQL’s ability to transparently manage tens of thousands of connections while only opening a few to the database is a great feature.

Administrators can update ProxySQL to filter and even rewrite queries based on patterns they decide on. As someone that has worked in operations and seen how long it can take to disable misbehaving applications, this is a very compelling feature. With ProxySQL in place, I can completely block a query from the database in no time.

ProxySQL’s replication or mirroring capability means that all of the queries sent to one database can now be sent to N databases. As someone that has to roll out new versions of MySQL, test index changes, and benchmark hardware this is also a compelling feature.

Percona: What are the advantages of using ProxySQL in a database environment?

René: ProxySQL is the bridge between the clients and the servers. It creates two layers, and controls all the communication between the two. Sitting in the middle, ProxySQL provides a lot of advantages normally impossible to achieve in a standard database environment, such as throttling or blocking queries, rewriting queries, implementing sharding, read/write splitting, caching, duplicating traffic, handling backend failures, failovers, integration with HA solutions, generating real-time statistics, etc. All this, without any application change, and completely transparent to the application.

Derek: For me, ProxySQL decouples the application from the data layer. This provides more control over the backend database environment to the DBA in regards to queries, maintenance and failovers, without impact to the application.

David: In addition to the roles noted above, ProxySQL can be part of a failover solution, routing queries to a new master when the current master fails. Other advantages are splitting queries over multiple databases to distribute the load, provide additional metrics, etc.

Percona: What do you want attendees to take away from your tutorial session? Why should they attend?

Derek: This tutorial highlights what ProxySQL is trying to achieve, and discusses how to add ProxySQL to common architectures environments. Attendees will get hands-on experience with the technology, and learn how to install and configure ProxySQL to achieve query rewriting, seamless failover, and query mirroring.

David: Not only hands-on experience with ProxySQL, but an understanding of how much they can leverage with it. The more I use ProxySQL, the more advantages I see to it. For example, I did not realize that by clustering ProxySQL processes I can distribute the query matching and rewrites over many hosts, as well as use them as a caching layer.

René: ProxySQL is built upon very innovative technologies. Certain architectural concepts like hostgroups, chaining of query rules, granular routing and sharding, query retries and the very powerful Admin interface are concepts not always intuitive for DBAs with experience using other proxies. This tutorial helps understand these concepts, and attendees get hand-on experience in the configuration of ProxySQL in various scenarios.

Percona: What are you most looking forward to at Percona Live?

David: First, the people. Next, everything everyone is working on. We’re really lucky to work in such an innovative and collaborative industry. As databases evolve, we are on the ground floor of their evolution. What an exciting place to be.

Derek: I am mostly looking forward to reconnecting with my peers in the MySQL community. Both ones I’ve formerly worked with or previously met at Percona Live, as well as meeting new open source database professionals and hearing how they are providing solutions for their companies.

René: I am looking forward to attending sessions regarding new features in MySQL 8 and other new technologies. But moreover, I am excited to interact with MySQL users and get more input on how to improve ProxySQL so that it can become an indispensable tool in any MySQL environment.

Register for Percona Live Data Performance Conference 2017, and see Derek, David and René present their ProxySQL Tutorial. Use the code FeaturedTutorial and receive $30 off the current registration price!

Percona Live Data Performance Conference 2017 is the premier open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, NoSQL, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Data Performance Conference will be April 24-27, 2017 at the Hyatt Regency Santa Clara & The Santa Clara Convention Center.

Sep
16
2016
--

Consul, ProxySQL and MySQL HA

ProxySQL

When it comes to “decision time” about which type of MySQL HA (high-availability) solution to implement, and how to architect the solution, many questions come to mind. The most important questions are:

  • “What are the best tools to provide HA and Load Balancing?”
  • “Should I be deploying this proxy tool on my application servers or on a standalone server?”.

Ultimately, the best tool really depends on the needs of your application and your environment. You might already be using specific tools such as Consul or MHA, or you might be looking to implement tools that provide richer features. The dilemma of deploying a proxy instance per application host versus a standalone proxy instance is usually a trade-off between “a less effective load balancing algorithm” or “a single point of failure.” Neither are desirable, but there are ways to implement a solution that balances all aspects.

In this article, we’ll go through a solution that is suitable for an application that has not been coded to split reads and writes over separate MySQL instances. An application like this would rely on a proxy or 3rd party tool to split reads/writes, and preferably a solution that has high-availability at the proxy layer. The solution described here is comprised of ProxySQLConsul and Master High Availability (MHA). Within this article, we’ll focus on the configuration required for ProxySQL and Consul since there are many articles that cover MHA configuration (such as Miguel’s recent MHA Quick Start Guide blog post).

When deploying Consul in production, a minimum of 3x instances are recommended – in this example, the Consul agents run on the Application Server (appserver) as well as on the two “ProxySQL servers” mysql1 and mysql2 (which act as the HA proxy pair). This is not a hard requirement, and these instances can easily run on another host or docker container. MySQL is deployed locally on mysql1 and mysql2, however this could just as well be 1..n separate standalone DB server instances:

Consul ProxySQL

So let’s move on to the actual configuration of this HA solution, starting with Consul.

Installation of Consul:

Firstly, we’ll need to install the required packages, download the Consul archive and perform the initial configuration. We’ll need to perform the same installation on each of the nodes (i.e., appserver, mysql1 and mysql2).

### Install pre-requisite packages:
sudo yum -y install wget unzip bind-utils dnsmasq
### Install Consul:
sudo useradd consul
sudo mkdir -p /opt/consul /etc/consul.d
sudo touch /var/log/consul.log /etc/consul.d/proxysql.json
cd /opt/consul
sudo wget https://releases.hashicorp.com/consul/0.6.4/consul_0.6.4_linux_amd64.zip
sudo unzip consul_0.6.4_linux_amd64.zip
sudo ln -s /opt/consul/consul /usr/bin/consul
sudo chown consul:consul -R /etc/consul* /opt/consul* /var/log/consul.log

Configuration of Consul on Application Server (used as ‘bootstrap’ node):

Now, that we’re done with the installation on each of the hosts, let’s continue with the configuration. In this example we’ll bootstrap the Consul cluster using “appserver”:

### Edit configuration files
$ sudo vi /etc/consul.conf
{
  "datacenter": "dc1",
  "data_dir": "/opt/consul/",
  "log_level": "INFO",
  "node_name": "agent1",
  "server": true,
  "ui": true,
  "bootstrap": true,
  "client_addr": "0.0.0.0",
  "advertise_addr": "192.168.1.119"  ## Add server IP here
}
######
$ sudo vi /etc/consul.d/proxysql.json
{"services": [
  {
   "id": "proxy1",
   "name": "proxysql",
   "address": "192.168.1.120",
   "tags": ["mysql"],
   "port": 6033,
   "check": {
     "script": "mysqladmin ping --host=192.168.1.120 --port=6033 --user=root --password=123",
     "interval": "3s"}
   },
  {
   "id": "proxy2",
   "name": "proxysql",
   "address": "192.168.1.121",
   "tags": ["mysql"],
   "port": 6033,
   "check": {
     "script": "mysqladmin ping --host=192.168.1.121 --port=6033 --user=root --password=123",
     "interval": "3s"}
   }
 ]
}
######
### Start Consul agent
$ sudo su - consul -c 'consul agent -config-file=/etc/consul.conf -config-dir=/etc/consul.d > /var/log/consul.log &'
### Setup DNSMASQ (as root)
echo "server=/consul/127.0.0.1#8600" > /etc/dnsmasq.d/10-consul
service dnsmasq restart
### Remember to add the localhost as a DNS server (this step can vary
### depending on how your DNS servers are managed... here I'm just
### adding the following line to resolve.conf:
sudo vi /etc/resolve.conf
#... snippet ...#
nameserver 127.0.0.1
#... snippet ...#
### Restart dnsmasq
sudo service dnsmasq restart

The service should now be started, and you can verify this in the logs in “/var/log/consul.log”.

Configuration of Consul on Proxy Servers:

The next item is to configure each of the proxy Consul agents. Note that the “agent name” and the “IP address” need to be updated for each host (values for both must be unique):

### Edit configuration files
$ sudo vi /etc/consul.conf
{
  "datacenter": "dc1",
  "data_dir": "/opt/consul/",
  "log_level": "INFO",
  "node_name": "agent2",  ### Agent node name must be unique
  "server": true,
  "ui": true,
  "bootstrap": false,   ### Disable bootstrap on joiner nodes
  "client_addr": "0.0.0.0",
  "advertise_addr": "192.168.1.xxx",  ### Set to local instance IP
  "dns_config": {
    "only_passing": true
  }
}
######
$ sudo vi /etc/consul.d/proxysql.json
{"services": [
  {
   "id": "proxy1",
   "name": "proxysql",
   "address": "192.168.1.120",
   "tags": ["mysql"],
   "port": 6033,
   "check": {
     "script": "mysqladmin ping --host=192.168.1.120 --port=6033 --user=root --password=123",
     "interval": "3s"}
   },
  {
   "id": "proxy2",
   "name": "proxysql",
   "address": "192.168.1.121",
   "tags": ["mysql"],
   "port": 6033,
   "check": {
     "script": "mysqladmin ping --host=192.168.1.121 --port=6033 --user=root password=123",
     "interval": "3s"}
   }
 ]
}
######
### Start Consul agent:
$ sudo su - consul -c 'consul agent -config-file=/etc/consul.conf -config-dir=/etc/consul.d > /var/log/consul.log &'
### Join Consul cluster specifying 1st node IP e.g.
$ consul join 192.168.1.119
### Verify logs and look out for the following messages:
$ cat /var/log/consul.log
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
         Node name: 'agent2'
        Datacenter: 'dc1'
            Server: true (bootstrap: false)
       Client Addr: 0.0.0.0 (HTTP: 8500, HTTPS: -1, DNS: 8600, RPC: 8400)
      Cluster Addr: 192.168.1.120 (LAN: 8301, WAN: 8302)
    Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
             Atlas:
==> Log data will now stream in as it occurs:
# ... snippet ...
    2016/09/05 19:48:04 [INFO] agent: Synced service 'consul'
    2016/09/05 19:48:04 [INFO] agent: Synced check 'service:proxysql1'
    2016/09/05 19:48:04 [INFO] agent: Synced check 'service:proxysql2'
# ... snippet ...

At this point, we have Consul installed, configured and running on each of our hosts appserver (mysql1 and mysql2). Now it’s time to install and configure ProxySQL on mysql1 and mysql2.

Installation & Configuration of ProxySQL:

The same procedure should be run on both mysql1 and mysql2 hosts:

### Install ProxySQL packages and initialise ProxySQL DB
sudo yum -y install https://github.com/sysown/proxysql/releases/download/v1.2.2/proxysql-1.2.2-1-centos7.x86_64.rpm
sudo service proxysql initial
sudo service proxysql stop
### Edit the ProxySQL configuration file to update username / password
vi /etc/proxysql.cnf
###
admin_variables=
{
    admin_credentials="admin:admin"
    mysql_ifaces="127.0.0.1:6032;/tmp/proxysql_admin.sock"
}
###
### Start ProxySQL
sudo service proxysql start
### Connect to ProxySQL and configure
mysql -P6032 -h127.0.0.1 -uadmin -padmin
### First we create a replication hostgroup:
mysql> INSERT INTO mysql_replication_hostgroups VALUES (10,11,'Standard Replication Groups');
### Add both nodes to the hostgroup 11 (ProxySQL will automatically put the writer node in hostgroup 10)
mysql> INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.120',11,3306,1000);
mysql> INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.121',11,3306,1000);
### Save server configuration
mysql> LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;
### Add query rules for RW split
mysql> INSERT INTO mysql_query_rules (active, match_pattern, destination_hostgroup, cache_ttl, apply) VALUES (1, '^SELECT .* FOR UPDATE', 10, NULL, 1);
mysql> INSERT INTO mysql_query_rules (active, match_pattern, destination_hostgroup, cache_ttl, apply) VALUES (1, '^SELECT .*', 11, NULL, 1);
mysql> LOAD MYSQL QUERY RULES TO RUNTIME; SAVE MYSQL QUERY RULES TO DISK;
### Finally configure ProxySQL user and save configuration
mysql> INSERT INTO mysql_users (username,password,active,default_hostgroup,default_schema) VALUES ('root','123',1,10,'test');
mysql> LOAD MYSQL USERS TO RUNTIME; SAVE MYSQL USERS TO DISK;
mysql> EXIT;

MySQL Configuration:

We also need to perform one configuration step on the MySQL servers in order to create a user for ProxySQL to monitor the instances:

### ProxySQL's monitor user on the master MySQL server (default username and password is monitor/monitor)
mysql -h192.168.1.120 -P3306 -uroot -p123 -e"GRANT USAGE ON *.* TO monitor@'%' IDENTIFIED BY 'monitor';"

We can view the configuration of the monitor user on the ProxySQL host by checking the global variables on the admin interface:

mysql> SHOW VARIABLES LIKE 'mysql-monitor%';
+----------------------------------------+---------+
| Variable_name                          | Value   |
+----------------------------------------+---------+
| mysql-monitor_enabled                  | true    |
| mysql-monitor_connect_timeout          | 200     |
| mysql-monitor_ping_max_failures        | 3       |
| mysql-monitor_ping_timeout             | 100     |
| mysql-monitor_replication_lag_interval | 10000   |
| mysql-monitor_replication_lag_timeout  | 1000    |
| mysql-monitor_username                 | monitor |
| mysql-monitor_password                 | monitor |
| mysql-monitor_query_interval           | 60000   |
| mysql-monitor_query_timeout            | 100     |
| mysql-monitor_slave_lag_when_null      | 60      |
| mysql-monitor_writer_is_also_reader    | true    |
| mysql-monitor_history                  | 600000  |
| mysql-monitor_connect_interval         | 60000   |
| mysql-monitor_ping_interval            | 10000   |
| mysql-monitor_read_only_interval       | 1500    |
| mysql-monitor_read_only_timeout        | 500     |
+----------------------------------------+---------+

Testing Consul:

Now that Consul and ProxySQL are configured we can do some tests from the “appserver”. First, we’ll verify that the hosts we’ve added are both reporting [OK] on our DNS requests:

$ dig @127.0.0.1 -p 53 proxysql.service.consul
; <<>> DiG 9.9.4-RedHat-9.9.4-29.el7_2.3 <<>> @127.0.0.1 -p 53 proxysql.service.consul
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9975
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;proxysql.service.consul.	IN	A
;; ANSWER SECTION:
proxysql.service.consul. 0	IN	A	192.168.1.121
proxysql.service.consul. 0	IN	A	192.168.1.120
;; Query time: 1 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Sep 05 19:32:12 UTC 2016
;; MSG SIZE  rcvd: 158

As you can see from the output above, DNS is reporting both 192.168.120 and 192.168.1.121 as available for the ProxySQL service. As soon as the ProxySQL check fails, the nodes will no longer report in the output above.

We can also view the status of our cluster and agents through the Consul Web GUI which runs on port 8500 of all the Consul servers in this configuration (e.g. http://192.168.1.120:8500/):

Consul GUI

Testing ProxySQL:

So now that we have this configured we can also do some basic tests to see that ProxySQL is load balancing our connections:

[percona@appserver consul.d]$ mysql -hproxysql.service.consul -e"select @@hostname"
+--------------------+
| @@hostname         |
+--------------------+
| mysql1.localdomain |
+--------------------+
[percona@appserver consul.d]$ mysql -hproxysql.service.consul -e"select @@hostname"
+--------------------+
| @@hostname         |
+--------------------+
| mysql2.localdomain |
+--------------------+

Perfect! We’re ready to use the hostname “proxysql.service.consul” to connect to our MySQL instances using a round-robin load balancing and HA proxy solution. If one of the two ProxySQL instances fails, we’ll continue communicating with the database through the other. Of course, this configuration is not limited to just two hosts, so feel free to add as many as you need. Be aware that in this example the two hosts’ replication hierarchy is managed by MHA in order to allow for master/slave promotion. By performing an automatic or manual failover using MHA, ProxySQL automatically detects the change in replication topology and redirect writes to the newly promoted master instance.

To make this configuration more durable, it is encouraged to create a more intelligent Consul check – i.e., a check that checks more than just the availability of the MySQL service (an example would be to select some data from a table). It is also recommended to fine tune the interval of the check to suit the requirements of your application.

Aug
30
2016
--

MySQL Sharding with ProxySQL

MySQL Sharding with ProxySQL

MySQL Sharding with ProxySQLThis article demonstrates how MySQL sharding with ProxySQL works.

Recently a colleague of mine asked me to provide a simple example on how ProxySQL performs sharding.

In response, I’m writing this short tutorial in the hope it will illustrate ProxySQL’s sharding functionalities, and help people out there better understand how to use it.

ProxySQL is a very powerful platform that allows us to manipulate and manage our connections and queries in a simple but effective way. This article shows you how.

Before starting let’s clarify some basic concepts.

  • ProxySQL organizes its internal set of servers in Host Groups (HG), and each HG can be associated with users and Query Rules (QR)
  • Each QR can be final (apply = 1) or = let ProxySQL continue to parse other QRs
  • A QR can be a rewrite action, be a simple match, have a specific target HG, or be generic
  • QRs are defined using regex

You can see QRs as a sequence of filters and transformations that you can arrange as you like.

These simple basic rules give us enormous flexibility. They allow us to create very simple actions like a simple query re-write, or very complex chains that could see dozens of QR concatenated. Documentation can be found here.

The information related to HGs or QRs is easily accessible using the ProxySQL administrator interface, in the tables mysql_servers, mysql_query_rules and stats.stats_mysql_query_rules. The last one allows us to evaluate if and how the rule(s) is used.

With regards to sharding, what can ProxySQL do to help us achieve what we need (in a relatively easy way)? Some people/companies include sharding logic in the application, use multiple connections to reach the different targets, or have some logic to split the load across several schemas/tables. ProxySQL allows us to simplify the way connectivity and query distribution is supposed to work reading data in the query or accepting HINTS.

No matter what the requirements, the sharding exercise can be summarized in a few different categories.

  • By splitting the data inside the same container (like having a shard by State where each State is a schema)
  • By physical data location (this can have multiple MySQL servers in the same room, as well as having them geographically distributed)
  • A combination of the two, where I do split by State using a dedicated server, and again split by schema/table by whatever (say by gender)

In the following examples, I show how to use ProxySQL to cover the three different scenarios defined above (and a bit more).

The example below will report text from the Admin ProxySQL interface and the MySQL console. I will mark each one as follows:

  • Mc for MySQL console
  • Pa for ProxySQL Admin

Please note that the MySQL console MUST use the -c flag to pass the comments in the query. This is because the default behavior in the MySQL console is to remove the comments.

I am going to illustrate procedures that you can replicate on your laptop, and when possible I will mention a real implementation. This because I want you to directly test the ProxySQL functionalities.

For the example described below I have a PrxySQL v1.2.2 that is going to become the master in few days. You can download it from:

git clone https://github.com/sysown/proxysql.git
git checkout v1.2.2

Then to compile:

cd <path to proxy source code>
make
make install

If you need full instructions on how to install and configure ProxySQL, read here and here.

Finally, you need to have the WORLD test DB loaded. WORLD test DB can be found here.

Shard inside the same MySQL Server using three different schemas split by continent

Obviously, you can have any number of shards and relative schemas. What is relevant here is demonstrating how traffic gets redirected to different targets (schemas), maintaining the same structure (tables), by discriminating the target based on some relevant information in the data or pass by the application.

OK, let us roll the ball.

[Mc]
+---------------+-------------+
| Continent     | count(Code) |
+---------------+-------------+
| Asia          |          51 | <--
| Europe        |          46 | <--
| North America |          37 |
| Africa        |          58 | <--
| Oceania       |          28 |
| Antarctica    |           5 |
| South America |          14 |
+---------------+-------------+

For this exercise, I will use three hosts in replica.

To summarize, I will need:

  • Three hosts: 192.168.1.[5-6-7]
  • Three schemas: Continent X + world schema
  • One user : user_shardRW
  • Three hostgroups: 10, 20, 30 (for future use)

First, we will create the schemas Asia, Africa, Europe:

[Mc]
Create schema [Asia|Europe|North_America|Africa];
create table Asia.City as select a.* from  world.City a join Country on a.CountryCode = Country.code where Continent='Asia' ;
create table Europe.City as select a.* from  world.City a join Country on a.CountryCode = Country.code where Continent='Europe' ;
create table Africa.City as select a.* from  world.City a join Country on a.CountryCode = Country.code where Continent='Africa' ;
create table North_America.City as select a.* from  world.City a join Country on a.CountryCode = Country.code where Continent='North America' ;
create table Asia.Country as select * from  world.Country where Continent='Asia' ;
create table Europe.Country as select * from  world.Country where Continent='Europe' ;
create table Africa.Country as select * from  world.Country  where Continent='Africa' ;
create table North_America.Country as select * from  world.Country where Continent='North America' ;

Now, create the user

grant all on *.* to user_shardRW@'%' identified by 'test';

Now let us start to configure the ProxySQL:

[Pa]
insert into mysql_users (username,password,active,default_hostgroup,default_schema) values ('user_shardRW','test',1,10,'test_shard1');
LOAD MYSQL USERS TO RUNTIME;SAVE MYSQL USERS TO DISK;
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.5',10,3306,100);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.6',20,3306,100);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.7',30,3306,100);
LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;

With this we have defined the user, the servers and the host groups.

Let us start to define the logic with the query rules:

[Pa]
delete from mysql_query_rules where rule_id > 30;
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,replace_pattern,apply) VALUES (31,1,'user_shardRW',"^SELECT\s*(.*)\s*from\s*world.(\S*)\s(.*).*Continent='(\S*)'\s*(\s*.*)$","SELECT \1 from \4.\2 WHERE 1=1 \5",1);
LOAD MYSQL QUERY RULES TO RUNTIME;SAVE MYSQL QUERY RULES TO DISK;

I am now going to query the master (or a single node), but I am expecting ProxySQL to redirect the query to the right shard, catching the value of the continent:

[Mc]
 SELECT name,population from world.City  WHERE Continent='Europe' and CountryCode='ITA' order by population desc limit 1;
+------+------------+
| name | population |
+------+------------+
| Roma |    2643581 |
+------+------------+

You can say: “Hey! You are querying the schema World, of course you get back the correct data.”

This is not what really happened. ProxySQL did not query the schema World, but the schema Europe.

Let’s look at the details:

[Pa]
select * from stats_mysql_query_digest;
Original    :SELECT name,population from world.City  WHERE Continent='Europe' and CountryCode='ITA' order by population desc limit 1;
Transformed :SELECT name,population from Europe.City WHERE ?=? and CountryCode=? order by population desc limit ?

Let me explain what happened.

Rule 31 in ProxySQL will take all the FIELDS we will pass in the query. It will catch the CONTINENT in the WHERE clause, it will take any condition after WHERE and it will reorganize the queries all using the RegEx.

Does this work for any table in the sharded schemas? Of course it does.

A query like:

SELECT name,population from world.Country WHERE Continent='Asia' ;

Will be transformed into:

SELECT name,population from Asia.Country WHERE ?=?

[Mc]
+----------------------+------------+
| name                 | population |
+----------------------+------------+
| Afghanistan          |   22720000 |
| United Arab Emirates |    2441000 |
| Armenia              |    3520000 |
<snip ...>
| Vietnam              |   79832000 |
| Yemen                |   18112000 |
+----------------------+------------+

Another possible approach is to instruct ProxySQL to shard is to pass a hint inside a comment. Let see how.

First let me disable the rule I just inserted. This is not really needed, but we’ll do it so you can see how. ?

[Pa]
mysql> update mysql_query_rules set active=0 where rule_id=31;
Query OK, 1 row affected (0.00 sec)
mysql> LOAD MYSQL QUERY RULES TO RUNTIME;SAVE MYSQL QUERY RULES TO DISK;
Query OK, 0 rows affected (0.00 sec)

Done.

Now what I want is for *ANY* query that contains the comment /* continent=X */ to go to the continent X schema, same server.

To do so, I instruct ProxySQL to replace any reference to the world schema inside the query I am going to submit.

[Pa]
delete from mysql_query_rules where rule_id in (31,33,34,35,36);
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,replace_pattern,apply,FlagOUT,FlagIN) VALUES (31,1,'user_shardRW',"\S*\s*\/\*\s*continent=.*Asia\s*\*.*",null,0,23,0);
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,replace_pattern,apply,FlagIN,FlagOUT) VALUES (32,1,'user_shardRW','world.','Asia.',0,23,23);
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,replace_pattern,apply,FlagOUT,FlagIN) VALUES (33,1,'user_shardRW',"\S*\s*\/\*\s*continent=.*Europe\s*\*.*",null,0,25,0);
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,replace_pattern,apply,FlagIN,FlagOUT) VALUES (34,1,'user_shardRW','world.','Europe.',0,25,25);
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,replace_pattern,apply,FlagOUT,FlagIN) VALUES (35,1,'user_shardRW',"\S*\s*\/\*\s*continent=.*Africa\s*\*.*",null,0,24,0);
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,replace_pattern,apply,FlagIN,FlagOUT) VALUES (36,1,'user_shardRW','world.','Africa.',0,24,24);
LOAD MYSQL QUERY RULES TO RUNTIME;SAVE MYSQL QUERY RULES TO DISK;

How does this work?

I have defined two concatenated rules. The first captures the incoming query containing the desired value (like continent = Asia). If the match is there, ProxySQL exits that action, but while doing so it will read the Apply field. If Apply is 0, it will read the FlagOUT value. At this point it will go to the first rule (in sequence) that has the value of FlagIN equal to the FlagOUT.

The second rule gets the request and will replace the value of world with the one I have defined. In short, it replaces whatever is in the match_pattern with the value that is in the replace_pattern.

Now ProxySQL implements the Re2 Google library for RegEx. Re2 is very fast but has some limitations, like it does NOT support (at the time of the writing) the flag option g. In other words, if I have a select with many tables, and as such several “world”, Re2 will replace ONLY the first instance.

As such, a query like:

Select /* continent=Europe */ * from world.Country join world.City on world.City.CountryCode=world.Country.Code where Country.code='ITA' ;

Will be transformed into:

Select /* continent=Europe */ * from Europe.Country join world.City on world.City.CountryCode=world.Country.Code where Country.code='ITA' ;

And fail.

The other day, Rene and I were discussing how to solve this given the lack of implementation in Re2. Finally, we opted for recursive actions.

What does this mean? It means that ProxySQL from v1.2.2 now has a new functionality that allows recursive calls to a Query Rule. The maximum number of iterations that ProxySQL can run is managed by the option (global variable) mysql-query_processor_iterations. Mysql-query_processor_iterations define how many operations a query process can execute as whole (from start to end).

This new implementation allows us to reference a Query Rule to itself to be executed multiple times.

If you go back you will notice that QR 34 has FlagIN and FlagOUT pointing to the same value of 25 and Apply =0. This brings ProxySQL to recursively call rule 34 until it changes ALL the values of the word world.

The result is the following:

[Mc]
Select /* continent=Europe */ Code, City.Name, City.population  from world.Country join world.City on world.City.CountryCode=world.Country.Code where City.population > 10000 group by Name order by City.Population desc limit 5;
+------+---------------+------------+
| Code | Name          | population |
+------+---------------+------------+
| RUS  | Moscow        |    8389200 |
| GBR  | London        |    7285000 |
| RUS  | St Petersburg |    4694000 |
| DEU  | Berlin        |    3386667 |
| ESP  | Madrid        |    2879052 |
+------+---------------+------------+

You can see ProxySQL internal information using the following queries:

[Pa]
 select active,hits, mysql_query_rules.rule_id, match_digest, match_pattern, replace_pattern, cache_ttl, apply,flagIn,flagOUT FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules ORDER BY mysql_query_rules.rule_id;
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+
| active | hits | rule_id | match_digest        | match_pattern                          | replace_pattern | cache_ttl | apply | flagIN | flagOUT |
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+
| 1      | 1    | 33      | NULL                | \S*\s*\/\*\s*continent=.*Europe\s*\*.* | NULL            | NULL      | 0     | 0      | 25      |
| 1      | 4    | 34      | NULL                | world.                                 | Europe.         | NULL      | 0     | 25     | 25      |
| 1      | 0    | 35      | NULL                | \S*\s*\/\*\s*continent=.*Africa\s*\*.* | NULL            | NULL      | 0     | 0      | 24      |
| 1      | 0    | 36      | NULL                | world.                                 | Africa.         | NULL      | 0     | 24     | 24      |
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+

And:

[Pa]
select * from stats_mysql_query_digest;
<snip and taking only digest_text>
Select Code, City.Name, City.population from Europe.Country join Europe.City on Europe.City.CountryCode=Europe.Country.Code where City.population > ? group by Name order by City.Population desc limit ?

As you can see ProxySQL has nicely replaced the word world in the query to Europe, and it ran Query Rule 34 four times (hits).

This is obviously working for Insert/Update/Delete as well.

Queries like:

insert into  /* continent=Europe */  world.City values(999999,'AAAAAAA','ITA','ROMA',0) ;

Will be transformed into:

[Pa]
select digest_text from stats_mysql_query_digest;
+-------------------------------------------+
| digest_text                               |
+-------------------------------------------+
| insert into Europe.City values(?,?,?,?,?) |
+-------------------------------------------+

And executed only on the desired schema.

Sharding by host

Using hint

How can I shard and redirect the queries to a host (instead of a schema)? This is even easier! ?

The main point is that whatever matches the rule should go to a defined HG. No rewrite imply, which means less work.

So how this is done? As before, I have three NODES: 192.168.1.[5-6-7]. For this example, I will use world DB (no continent schema), distributed in each node, and I will retrieve the node bound to the IP to be sure I am going to the right place.

I instruct ProxySQL to send my query by using a HINT to a specific host. I choose the hint “shard_host_HG”, and I am going to inject it in the query as a comment.

As such the Query Rules will be:

[Pa]
delete from mysql_query_rules where rule_id in (40,41,42, 10,11,12);
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,destination_hostgroup,apply) VALUES (10,1,'user_shardRW',"\/\*\s*shard_host_HG=.*Europe\s*\*.",10,0);
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,destination_hostgroup,apply) VALUES (11,1,'user_shardRW',"\/\*\s*shard_host_HG=.*Asia\s*\*.",20,0);
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,destination_hostgroup,apply) VALUES (12,1,'user_shardRW',"\/\*\s*shard_host_HG=.*Africa\s*\*.",30,0);
LOAD MYSQL QUERY RULES TO RUNTIME;SAVE MYSQL QUERY RULES TO DISK;

While the queries I am going to test are:

[Mc]
Select /* shard_host_HG=Europe */ City.Name, City.Population from world.Country join world.City on world.City.CountryCode=world.Country.Code where Country.code='ITA' limit 5; SELECT * /* shard_host_HG=Europe */ from information_schema.GLOBAL_VARIABLES where variable_name like 'bind%';
Select /* shard_host_HG=Asia */ City.Name, City.Population from world.Country join world.City on world.City.CountryCode=world.Country.Code where Country.code='IND' limit 5; SELECT * /* shard_host_HG=Asia */ from information_schema.GLOBAL_VARIABLES where variable_name like 'bind%';
Select /* shard_host_HG=Africa */ City.Name, City.Population from world.Country join world.City on world.City.CountryCode=world.Country.Code where Country.code='ETH' limit 5; SELECT * /* shard_host_HG=Africa */ from information_schema.GLOBAL_VARIABLES where variable_name like 'bind%';

Running the query for Africa, I will get:

[Mc]
Select /* shard_host_HG=Africa */ City.Name, City.Population from world.Country join world.City on world.City.CountryCode=world.Country.Code where Country.code='ETH' limit 5; SELECT * /* shard_host_HG=Africa */ from information_schema.GLOBAL_VARIABLES where variable_name like 'bind%';
+-------------+------------+
| Name        | Population |
+-------------+------------+
| Addis Abeba |    2495000 |
| Dire Dawa   |     164851 |
| Nazret      |     127842 |
| Gonder      |     112249 |
| Dese        |      97314 |
+-------------+------------+
+---------------+----------------+
| VARIABLE_NAME | VARIABLE_VALUE |
+---------------+----------------+
| BIND_ADDRESS  | 192.168.1.7    |
+---------------+----------------+

That will give me:

[Pa]
select active,hits, mysql_query_rules.rule_id, match_digest, match_pattern, replace_pattern, cache_ttl, apply,flagIn,flagOUT FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules ORDER BY mysql_query_rules.rule_id;
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+
| active | hits | rule_id | match_digest        | match_pattern                          | replace_pattern | cache_ttl | apply | flagIN | flagOUT |
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+
| 1      | 0    | 40      | NULL                | \/\*\s*shard_host_HG=.*Europe\s*\*.    | NULL            | NULL      | 0     | 0      | 0       |
| 1      | 0    | 41      | NULL                | \/\*\s*shard_host_HG=.*Asia\s*\*.      | NULL            | NULL      | 0     | 0      | 0       |
| 1      | 2    | 42      | NULL                | \/\*\s*shard_host_HG=.*Africa\s*\*.    | NULL            | NULL      | 0     | 0      | 0       | <-- Note the HITS (2 as the run queries)
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+

In this example, we have NO replace_patter. This is only a matching and redirecting Rule, where the destination HG is defined in the value of the destination_hostgroup attribute while inserting. In the case for Africa, it is HG 30.

The server in HG 30 is:

[Pa]
select hostgroup_id,hostname,port,status from mysql_servers ;
+--------------+-------------+------+--------+
| hostgroup_id | hostname    | port | status |
+--------------+-------------+------+--------+
| 10           | 192.168.1.5 | 3306 | ONLINE |
| 20           | 192.168.1.6 | 3306 | ONLINE |
| 30           | 192.168.1.7 | 3306 | ONLINE | <---
+--------------+-------------+------+--------+

This perfectly matches our returned value.

You can try by your own the other two continents.

Using destination_hostgroup

Another way to assign a query’s final host is to use the the destination_hostgroup, set the Schema_name attribute and use the use schema syntax in the query.

For example:

[Pa]
INSERT INTO mysql_query_rules (active,schemaname,destination_hostgroup,apply) VALUES
(1, 'shard00', 1, 1), (1, 'shard01', 1, 1), (1, 'shard03', 1, 1),
(1, 'shard04', 2, 1), (1, 'shard06', 2, 1), (1, 'shard06', 2, 1),
(1, 'shard07', 3, 1), (1, 'shard08', 3, 1), (1, 'shard09', 3, 1);

And then in the query do something like :

use shard02; Select * from tablex;

I mention this method because it is currently one of the most common in large companies using SHARDING.

But it is not safe, because it relays on the fact the query will be executed in the desired HG. The risk of error is high.

Just think if a query doing join against a specified SHARD:

use shard01; Select * from tablex join shard03 on tablex.id = shard03.tabley.id;

This will probably generate an error because shard03 is probably NOT present on the host containing shard01.

As such this approach can be used ONLY when you are 100% sure of what you are doing, and when you are sure NO query will have an explicit schema declaration.

Shard by host and by schema

Finally, is obviously possible to combine the two approaches and shard by host with only a subset of schemas.

To do so, let’s use all three nodes and have the schema distribution as follow:

  • Europe on Server 192.168.1.5 -> HG 10
  • Asia on Server 192.168.1.6 -> HG 20
  • Africa on Server 192.168.1.7 -> HG 30

I have already set the query rules using HINT, so what I have to do is to use them BOTH to combine the operations:

[Mc]
Select /* shard_host_HG=Asia */ /* continent=Asia */  City.Name, City.Population from world.Country join world.City on world.City.CountryCode=world.Country.Code where Country.code='IND' limit 5; SELECT * /* shard_host_HG=Asia */ from information_schema.GLOBAL_VARIABLES where variable_name like 'bind%';
+--------------------+------------+
| Name               | Population |
+--------------------+------------+
| Mumbai (Bombay)    |   10500000 |
| Delhi              |    7206704 |
| Calcutta [Kolkata] |    4399819 |
| Chennai (Madras)   |    3841396 |
| Hyderabad          |    2964638 |
+--------------------+------------+
5 rows in set (0.00 sec)
+---------------+----------------+
| VARIABLE_NAME | VARIABLE_VALUE |
+---------------+----------------+
| BIND_ADDRESS  | 192.168.1.6    |
+---------------+----------------+
1 row in set (0.01 sec)

[Pa]
mysql> select digest_text from stats_mysql_query_digest;
+--------------------------------------------------------------------------------------------------------------------------------------------+
| digest_text                                                                                                                                |
+--------------------------------------------------------------------------------------------------------------------------------------------+
| SELECT * from information_schema.GLOBAL_VARIABLES where variable_name like ?                                                               |
| Select City.Name, City.Population from Asia.Country join Asia.City on Asia.City.CountryCode=Asia.Country.Code where Country.code=? limit ? |
+--------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
mysql> select active,hits, mysql_query_rules.rule_id, match_digest, match_pattern, replace_pattern, cache_ttl, apply,flagIn,flagOUT FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules ORDER BY mysql_query_rules.rule_id;
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+
| active | hits | rule_id | match_digest        | match_pattern                          | replace_pattern | cache_ttl | apply | flagIN | flagOUT |
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+
| 1      | 0    | 10      | NULL                | \/\*\s*shard_host_HG=.*Europe\s*\*.    | NULL            | NULL      | 0     | 0      | NULL    |
| 1      | 2    | 11      | NULL                | \/\*\s*shard_host_HG=.*Asia\s*\*.      | NULL            | NULL      | 0     | 0      | NULL    |
| 1      | 0    | 12      | NULL                | \/\*\s*shard_host_HG=.*Africa\s*\*.    | NULL            | NULL      | 0     | 0      | NULL    |
| 1      | 0    | 13      | NULL                | NULL                                   | NULL            | NULL      | 0     | 0      | 0       |
| 1      | 1    | 31      | NULL                | \S*\s*\/\*\s*continent=.*Asia\s*\*.*   | NULL            | NULL      | 0     | 0      | 23      |
| 1      | 4    | 32      | NULL                | world.                                 | Asia.           | NULL      | 0     | 23     | 23      |
| 1      | 0    | 33      | NULL                | \S*\s*\/\*\s*continent=.*Europe\s*\*.* | NULL            | NULL      | 0     | 0      | 25      |
| 1      | 0    | 34      | NULL                | world.                                 | Europe.         | NULL      | 0     | 25     | 25      |
| 1      | 0    | 35      | NULL                | \S*\s*\/\*\s*continent=.*Africa\s*\*.* | NULL            | NULL      | 0     | 0      | 24      |
| 1      | 0    | 36      | NULL                | world.                                 | Africa.         | NULL      | 0     | 24     | 24      |
+--------+------+---------+---------------------+----------------------------------------+-----------------+-----------+-------+--------+---------+

As you can see rule 11 has two HITS, which means my queries will go to the associated HG. But given that the Apply for rule 11 is =0, ProxySQL will first continue to process the Query Rules.

As such it will also transform the queries for rules 31 and 32, each one having the expected number of hits (one the first and the second four because of the loop).

That was all we had to do to perform a perfect two layer sharding in ProxySQL.

Conclusion

ProxySQL allows the user to access data distributed by shards in a very simple way. The query rules that follow the consolidate RegEx pattern, in conjunction with the possibility to concatenate rules and the Host Group approach definition give us huge flexibility with relative simplicity.

References:

https://github.com/sysown/proxysql/tree/v1.2.2/doc
https://github.com/google/re2/wiki/Syntax
http://www.proxysql.com/2015/09/proxysql-tutorial-setup-in-mysql.html
https://github.com/sysown/proxysql/blob/v1.2.2/doc/configuration_howto.md
https://github.com/sysown/proxysql/blob/v1.2.2/INSTALL.md
https://dev.mysql.com/doc/index-other.html

Credits

It is obvious that I need to acknowledge and kudos the work Rene’ Cannao is doing to make ProxySQL a solid, fast and flexible product. I also should mention that I work with him very often (more often than he likes), asking him for fixes and discussing optimization strategies. Requests that he satisfies with surprising speed and efficiency.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com