Sep
19
2017
--

Percona Live Europe Featured Talks: Automatic Database Management System Tuning Through Large-Scale Machine Learning with Dana Van Aken


Welcome to another post in our series of interview blogs for the upcoming Percona Live Europe 2017 in Dublin. This series highlights a number of talks that will be at the conference and gives a short preview of what attendees can expect to learn from the presenter.

This blog post is with Dana Van Aken, a Ph.D. student in Computer Science at Carnegie Mellon University. Her talk is titled Automatic Database Management System Tuning Through Large-Scale Machine Learning. DBMSs are difficult to manage because they have hundreds of configuration “knobs” that control factors such as the amount of memory to use for caches and how often to write data to storage. Organizations often hire experts to help with tuning activities, but experts are prohibitively expensive for many. In this talk, Dana will present OtterTune, a new tool that can automatically find good settings for a DBMS’s configuration knobs. In our conversation, we discussed how machine learning helps DBAs manage DBMSs:

Percona: How did you get into database technology? What do you love about it?

Dana: I got involved with research as an undergrad and ended up working on a systems project with a few Ph.D. students. It turned out to be a fantastic experience and is what convinced me to go for my Ph.D. I visited potential universities and chatted with many faculty members. I met with my current advisor at Carnegie Mellon University, Andy Pavlo, for a half hour and left his office excited about databases and the research problems he was interested in. Three years later, I’m even more excited about databases and the progress we’ve made in developing smarter auto-tuning techniques.

Percona: You’re presenting a session called “Automatic Database Management System Tuning Through Large-Scale Machine Learning”. How does automation make DBAs’ lives easier in a DBMS production environment?

Dana: The role of the DBA is becoming more challenging due to the advent of new technologies and increasing scalability requirements of data-intensive applications. Many DBAs are constantly having to adjust their responsibilities to manage more database servers or support new platforms to meet an organization’s needs as they change over time. Automation is critical for reducing the DBA’s workload to a manageable size so that they can focus on higher-value tasks. Many organizations now automate at least some of the repetitive tasks that were once DBA responsibilities: several have adopted public/private cloud-based services whereas others have built their own automated solutions internally.

The problem is that the tasks that have now become the biggest time sinks for DBAs are much harder to automate. For example, DBMSs have dozens of configuration options. Tuning them is an essential but tedious task for DBAs, because it’s a trial and error approach even for experts. What makes this task even more time-consuming is that the best configuration for one DBMS may not be the best for another. It depends on the application’s workload and the server’s hardware. Given this, successfully automating DBMS tuning is a big win for DBAs since it would streamline common configuration tasks and give DBAs more time to deal with other issues. This is why we’re working hard to develop smarter tuning techniques that are mature and practical enough to be used in a production environment.

Percona: What do you want attendees to take away from your session? Why should they attend?

Dana: I’ll be presenting OtterTune, a new tool that we’re developing at Carnegie Mellon University that can automatically find good settings for a DBMS’s configuration knobs. I’ll first discuss the practical aspects and limitations of the tool. Then I’ll move on to our machine learning (ML) pipeline. All of the ML algorithms that we use are popular techniques that have both practical and theoretical work backing their effectiveness. I’ll discuss each algorithm in our pipeline using concrete examples from MySQL to give better intuition about what we are doing. I will also go over the outputs from each stage (e.g., the configuration parameters that the algorithms find to be the most impactful on performance). I will then talk about lessons I learned along the way, and finally wrap up with some exciting performance results that show how OtterTune’s configurations compared to those created by top-notch DBAs!

My talk will be accessible to a general audience. You do not need a machine learning background to understand our research.

Percona: What are you most looking forward to at Percona Live Europe 2017?

Dana: This is my first Percona Live conference, and I’m excited about attending. I’m looking forward to talking with other developers and DBAs about the projects they’re working on and the challenges they’re facing, and to getting feedback on OtterTune and our ideas.

Want to find out more about Dana and machine learning for DBMS management? Register for Percona Live Europe 2017, and see her talk Automatic Database Management System Tuning Through Large-Scale Machine Learning. Register now to get the best price! Use discount code SeeMeSpeakPLE17 to get 10% off your registration.

Percona Live Open Source Database Conference Europe 2017 in Dublin is the premier European open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, MariaDB, MongoDB, time series database, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Open Source Database Conference Europe will be September 25-27, 2017 at the Radisson Blu Royal Hotel, Dublin.

Sep
19
2017
--

ProxySQL Improves MySQL SSL Connections

In this blog post, we’ll look at how ProxySQL improves MySQL SSL connection performance.

When deploying MySQL with SSL, the main concern is that the initial handshake causes significant overhead if you are not using connection pools (e.g., mysqlnd-mux with PHP or mysql.connector.pooling in Python). Closing and making new connections over and over can greatly impact your total query response time. A customer and a colleague recently pointed out to me that although you can improve SSL encryption/decryption performance with the AES-NI hardware extension on modern Intel processors, the actual overhead when creating SSL connections comes from the handshake, which needs multiple roundtrips between the server and client.
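To make the pooling argument concrete, here is a minimal sketch in Python of why reusing connections removes most of the handshake cost. It is a toy model only: FakeConnection and SimplePool are hypothetical names invented for illustration, no real MySQL client or SSL handshake is involved, and this is not the mysql.connector.pooling API. The constructor simply counts how many times a "handshake" would have happened.

```python
import queue


class FakeConnection:
    """Hypothetical stand-in for a database connection; counts simulated handshakes."""

    handshakes = 0

    def __init__(self):
        # In a real client, the TCP and SSL handshakes would happen here.
        FakeConnection.handshakes += 1

    def query(self, sql):
        return f"ok: {sql}"


class SimplePool:
    """Minimal fixed-size pool: connections are opened once and then reused."""

    def __init__(self, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(FakeConnection())

    def run(self, sql):
        conn = self._idle.get()      # borrow an idle connection
        try:
            return conn.query(sql)
        finally:
            self._idle.put(conn)     # hand it back instead of closing it


def handshakes_without_pool(requests):
    """Connect, query, and discard for every request."""
    FakeConnection.handshakes = 0
    for _ in range(requests):
        FakeConnection().query("SELECT 1")
    return FakeConnection.handshakes


def handshakes_with_pool(requests, pool_size):
    """Serve every request from a pool of pre-opened connections."""
    FakeConnection.handshakes = 0
    pool = SimplePool(pool_size)
    for _ in range(requests):
        pool.run("SELECT 1")
    return FakeConnection.handshakes


if __name__ == "__main__":
    print("handshakes without pool:", handshakes_without_pool(10_000))
    print("handshakes with pool:   ", handshakes_with_pool(10_000, 4))
```

In this model the number of handshakes without a pool grows with the number of requests, while with a pool it stays at the pool size, which is the effect a connection pool (or a pooling proxy) is meant to provide.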

With ProxySQL’s support for SSL on its backend connections and connection pooling, we can have it sit in front of any application, on the same server (illustrated below):

ProxySQL

With this setup, ProxySQL is running on the same server as the application and is connected to MySQL through a local socket. MySQL data does not need to go through the TCP stream unsecured.

To quickly verify how this performs, I used a PHP script that simply creates 10k connections in a single thread as fast as it can:

<?php
// Open and close 10,000 connections in a single thread, one after another.
$i = 10000;
$user = 'percona';
$pass = 'percona';
while ($i > 0) {
	$mysqli = mysqli_init();
	// Exactly one of the connect calls below should be uncommented per test run.
	// Use SSL
	//$link = mysqli_real_connect($mysqli, "192.168.56.110", $user, $pass, "", 3306, "", MYSQLI_CLIENT_SSL)
	// No SSL
	//$link = mysqli_real_connect($mysqli, "192.168.56.110", $user, $pass, "", 3306 )
	// OpenVPN
	//$link = mysqli_real_connect($mysqli, "10.8.99.1",      $user, $pass, "", 3306 )
	// ProxySQL
	$link = mysqli_real_connect($mysqli, "localhost",      $user, $pass, "", 6033, "/tmp/proxysql.sock")
		or die(mysqli_connect_error());
	$info = mysqli_get_host_info($mysqli);
	$i--;
	// Close the connection so that every iteration pays the full connect cost.
	mysqli_close($mysqli);
	unset($mysqli);
}
?>

Direct connection to MySQL, no SSL:

[root@ad ~]# time php php-test.php
real 0m20.417s
user 0m0.201s
sys 0m3.396s

Direct connection to MySQL with SSL:

[root@ad ~]# time php php-test.php
real	1m19.922s
user	0m29.933s
sys	0m9.550s

Direct connection to MySQL, no SSL, with OpenVPN tunnel:

[root@ad ~]# time php php-test.php
real 0m15.161s
user 0m0.493s
sys 0m0.803s

Now, using ProxySQL via the local socket file:

[root@ad ~]# time php php-test.php
real	0m2.791s
user	0m0.402s
sys	0m0.436s

Below is a graph of these numbers:

ProxySQL

As you can see, direct connections with SSL took about four times as long as direct connections without SSL (roughly 80 seconds versus 20 seconds for the same test), which is pretty bad for some workloads.
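The comparison can be worked through from the "real" times reported above. The following Python sketch only does arithmetic on those published numbers; it assumes each run made 10,000 connections, as the script description states, and the labels and helper names are chosen here for illustration.

```python
# Wall-clock ("real") times from the four test runs, in seconds.
# Assumption: each run opened 10,000 connections, per the script description.
CONNECTIONS = 10_000

real_seconds = {
    "direct, no SSL": 20.417,
    "direct, SSL": 60 + 19.922,   # 1m19.922s
    "OpenVPN, no SSL": 15.161,
    "ProxySQL socket": 2.791,
}

BASELINE = real_seconds["direct, no SSL"]


def per_connection_ms(total_seconds, connections=CONNECTIONS):
    """Average wall-clock cost of one connect/close cycle, in milliseconds."""
    return total_seconds / connections * 1000.0


def relative_to_baseline(total_seconds, baseline_seconds=BASELINE):
    """How many times longer (or shorter) a run took than direct, no SSL."""
    return total_seconds / baseline_seconds


if __name__ == "__main__":
    for label, seconds in real_seconds.items():
        print(
            f"{label:<16} {per_connection_ms(seconds):6.2f} ms/conn "
            f"{relative_to_baseline(seconds):5.2f}x baseline"
        )
```

Under that assumption, the direct SSL run works out to roughly 8 ms per connection against roughly 2 ms without SSL, while the ProxySQL socket run comes in well under 1 ms per connection.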

Connections through OpenVPN are also better than MySQL without SSL. While this is interesting, the OpenVPN server needs to be deployed on another server, separate from the MySQL server and application. This approach allows the application servers and MySQL servers (including replica/cluster nodes) to communicate on the same secured network, but creates a single point of failure. Alternatively, you can deploy OpenVPN on the MySQL server itself, but if you have an additional high availability layer in place, it gets quite complicated when a new master is promoted. In short, OpenVPN adds many additional moving parts.

The beauty of ProxySQL is that you can just run it on all application servers, and it works fine if you simply point it to a VIP that directs it to the correct MySQL server (master), or use the replication group feature to identify the authoritative master.

Lastly, it is important to note that these tests were done on CentOS 7.3 with OpenSSL 1.0.1e, Percona Server for MySQL 5.7.19, ProxySQL 1.4.1, PHP 5.4 and OpenVPN 2.4.3.

Happy ProxySQLing!

Sep
19
2017
--

Agent AI aims to turbocharge its AI tools by offering free CRM

 Agent AI is looking to automate more of the customer service process. To do that, it’s built its own customer relationship management product, as well as AI tools that sit on top — and now it’s making the CRM part available for free. While giant software businesses have been built around CRM, CEO Fred Hsu said the market has changed, with the software becoming less… Read More

Sep
19
2017
--

Google Cloud’s Natural Language API gets content classification and more granular sentiment analysis

 Google Cloud announced two updates this morning to its Natural Language API. Specifically, users will now have access to content classification and entity sentiment analysis. These features are particularly valuable for brands and media companies. For starters, GCP users will now be able to tag content as corresponding with common topics like health, entertainment and law (cc: Henry).… Read More

Sep
19
2017
--

TalkIQ raises $14 million Series A to give enterprises AI insights into voice communication

 There’s no shortage of startups building their brands around AI for enterprise. And within the enterprise, few spaces are as competitive as AI-powered voice analytics. TalkIQ is the latest company in the space to carry home a large round of financing with promise. With $14 million in Series A funding, the TalkIQ team is hoping its proprietary tech stack and engineering-heavy team will… Read More

Sep
19
2017
--

Minio scores $20 million Series A to build a neutral object storage layer

 Minio has a plan to become the neutral object storage layer, while still maintaining Amazon S3 object storage compatibility. That may seem like an odd strategy, but as Anand Babu Periasamy, co-founder and CEO of Minio, points out, there is a clear market need.
By building a solution that enables customers to store data across a variety of solutions including S3, he believes he is giving… Read More

Sep
19
2017
--

Real-time data analytics startup Incorta raises $15M Series B led by Kleiner Perkins

 Incorta, the startup that wants to speed up big data analytics by eliminating the need for data warehouses, has raised a $15 million Series B led by new investor Kleiner Perkins. Existing investors GV and Ron Wohl, former executive vice president of applications development at Oracle, also participated. Read More

Sep
19
2017
--

Salesforce Einstein celebrates its first birthday with several new features

 Salesforce launched Einstein, its artificial intelligence platform, just one year ago this week. As it celebrates its first birthday, it’s worth taking a look back at the first year and looking at a couple of enhancements they’re adding as a birthday surprise. It’s easy to lose sight of the fact that Einstein isn’t actually a product at all, even though Salesforce markets… Read More

Sep
19
2017
--

Threat Stack snares $45 million investment as spotlight shines brightly on security

 Threat Stack, the Boston-based security startup that helps companies stay protected in the cloud, reeled in a $45 million investment today. It seems that they are in the right place at the right time as news of the Equifax breach swirls on mainstream media. The round includes a big institutional backer, as fellow Boston firm Fidelity Investments participated through their investment arm,… Read More

Sep
18
2017
--

Goodbye, photo studios. Hello, colormass virtual photoshoots

 Berlin-based colormass, one of the startups presenting today at TechCrunch Disrupt as part of the Battlefield, has developed a platform that lets you recreate an IKEA-style experience for your own merchandise: highly realistic, but digitally manipulated 3D facsimiles. Read More
